At SP Energy Networks, we are dedicated to sharing high-quality data with our stakeholders and being transparent about its quality. This is why we openly share the results of our Data Quality Assessments. We collaborate closely with Data Owners to address any identified issues and enhance our overall data quality. To demonstrate progress, we will assess our data quality at least annually.
We welcome feedback and questions from our stakeholders regarding our approach to data quality. Our Open Data team is available to answer enquiries and receive feedback on the assessments. You can contact the team via our Open Data mailbox at opendata@spenergynetworks.co.uk.
To access our data, please visit our Open Data Portal.
Data Quality Dimensions
As part of our comprehensive Data Quality Assessment, we measure the quality of our datasets across six dimensions. Each dimension is described below, with an example of how it may be applied.
| Dimension | Description |
|---|---|
| Validity | Validity measures whether the values in a dataset are within the correct range or format. This dimension ensures that the data adheres to predefined criteria set by the Data Owner, such as acceptable value ranges, formats and types. An example of this may be ‘the value must be greater than 1’ or ‘the value must be alphanumeric.’ |
| Completeness | Completeness checks whether the cells in a dataset are filled or empty. The score is based on a simple ‘Yes/No’: if the cell is filled, it counts as complete. This check does not consider whether the value in the cell is correct or valid, and a completeness rule does not specify what the cell should contain or how the value should be structured. An example of this may be ‘the "Postcode" column must be populated’ or ‘the "Address" column cannot be blank.’ |
| Uniqueness | Uniqueness measures how many values in a dataset are unique. Any duplicate values will lower this score. This measure is important for data that must be unique to be correct, such as Customer ID or Project Reference ID. An example of this may be ‘every value in the "Customer ID" column must be different’ or ‘values in the "Project Reference ID" column must not be duplicated.’ |
| Timeliness | Timeliness measures whether a dataset is available within the agreed timeframe or Service-Level Agreement (SLA). The score reflects this availability: a score of 100% means the data is always available within the SLA. An example of this may be ‘the refreshed dataset must be available every 24 hours to ensure it remains accurate and is in line with the SLA.’ |
| Consistency | Consistency measures whether data remains the same across different instances. This can be checked by comparing values in two different datasets to see if they match. It can also involve comparing the number of records to see whether it stays the same or whether there are any significant increases or decreases. An example of this may be ‘the number of customer names must be the same in Dataset 1 and Dataset 2’ or ‘the values in the "Postcode" column must match in Dataset 1 and Dataset 2.’ |
| Accuracy | Accuracy measures whether the contents of the dataset are correct and represent the truth. One way to measure data accuracy is to compare values to a standard or to reference values provided by a reliable source. An example of this may be ‘the "Customer Postcode" values are compared to Postcodes published by Ordnance Survey, alongside positive confirmation that the customer resides at this address.’ |
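
As a rough illustration of the validity, completeness and uniqueness dimensions described above, each could be expressed as the percentage of cells that pass a check. The Python sketch below is illustrative only and does not describe the actual assessment tooling. The function names, the sample "Customer ID" values and the percentage-based scoring are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence

Cell = Optional[str]


def is_filled(value: Cell) -> bool:
    """A cell counts as complete if it holds any non-blank value."""
    return value is not None and value.strip() != ""


def completeness_score(column: Sequence[Cell]) -> float:
    """Percentage of cells that are filled, regardless of correctness."""
    if not column:
        return 100.0
    return 100.0 * sum(is_filled(v) for v in column) / len(column)


def validity_score(column: Sequence[Cell], rule: Callable[[str], bool]) -> float:
    """Percentage of filled cells that satisfy a rule set by the Data Owner."""
    filled = [v for v in column if is_filled(v)]
    if not filled:
        return 100.0
    return 100.0 * sum(rule(v) for v in filled) / len(filled)


def uniqueness_score(column: Sequence[Cell]) -> float:
    """Percentage of filled cells that are distinct; duplicates lower the score."""
    filled = [v for v in column if is_filled(v)]
    if not filled:
        return 100.0
    return 100.0 * len(set(filled)) / len(filled)


# Hypothetical "Customer ID" column: one blank, one duplicate, one non-alphanumeric.
customer_ids = ["C001", "C002", "C002", "", "C-4"]

print(completeness_score(customer_ids))            # 80.0 (4 of 5 cells filled)
print(validity_score(customer_ids, str.isalnum))   # 75.0 (3 of 4 filled cells alphanumeric)
print(uniqueness_score(customer_ids))              # 75.0 (3 distinct values among 4 filled)
```

In this sketch, validity and uniqueness are computed over filled cells only, so a blank cell lowers the completeness score without also being counted as invalid or duplicated. That separation is a design assumption, not a stated rule.
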
Data Quality Assessments
We have now completed the first phase of our Quality Assessments, measuring our datasets against the three dimensions of validity, completeness and uniqueness. We are now expanding our Quality Assessments to include the dimensions of timeliness, consistency and accuracy to provide a comprehensive evaluation, and will publish the results in line with our roadmap, which will be delivered across the year.
The data quality scores associated with each dataset are published on our Open Data Portal and can be accessed via the ‘Supporting Information’ theme on the homepage. Please note that you must be a registered user and logged in to the Portal to access our data quality checks.
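
For the timeliness and consistency dimensions, a similar percentage-based score is one possible approach. The Python sketch below is a hedged illustration rather than the method used in the assessments. It assumes that timeliness is judged from the gaps between successive refresh timestamps and that consistency is judged by comparing two columns row by row. All names, timestamps and postcodes are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Sequence


def timeliness_score(refresh_times: Sequence[datetime], sla: timedelta) -> float:
    """Percentage of gaps between successive refreshes that fall within the SLA."""
    gaps = [later - earlier for earlier, later in zip(refresh_times, refresh_times[1:])]
    if not gaps:
        return 100.0
    return 100.0 * sum(gap <= sla for gap in gaps) / len(gaps)


def consistency_score(first: Sequence[str], second: Sequence[str]) -> float:
    """Percentage of row positions where both datasets hold the same value.

    Rows present in only one dataset count as mismatches.
    """
    longest = max(len(first), len(second))
    if longest == 0:
        return 100.0
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / longest


# Illustrative refresh timestamps: the second refresh arrives 27 hours after the first.
refreshes = [
    datetime(2024, 1, 1, 6),
    datetime(2024, 1, 2, 6),
    datetime(2024, 1, 3, 9),
    datetime(2024, 1, 4, 9),
    datetime(2024, 1, 5, 9),
]
print(timeliness_score(refreshes, timedelta(hours=24)))  # 75.0 (3 of 4 gaps within 24 hours)

# Illustrative "Postcode" columns from two datasets; the third row differs.
postcodes_1 = ["AA1 1AA", "BB2 2BB", "CC3 3CC", "DD4 4DD"]
postcodes_2 = ["AA1 1AA", "BB2 2BB", "CC3 3CD", "DD4 4DD"]
print(consistency_score(postcodes_1, postcodes_2))       # 75.0 (3 of 4 rows match)
```

Accuracy is omitted from the sketch because it depends on an external reference source and on positive confirmation, neither of which can be reproduced in a self-contained example.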