Unrivalled data integrity
SustainabilityDisclosures builds data quality into the service itself: every datapoint is fully traceable to its source, immutable, and validated for logical and contextual integrity.
Traceability
Every datapoint from SustainabilityDisclosures is traceable to its source. This is achieved by embedding a web link in the metadata of each datapoint. These links are always included in API outputs, Excel files (as clickable cell hyperlinks), and the online platform. The links lead to an online PDF viewer at pdf.SustainabilityDisclosures.com, which highlights the specific area of the company report from which the datapoint was sourced.
Company reports are stored in the internal database of SustainabilityDisclosures. Each link includes a coordinate string containing all the information needed to identify the document and the highlighted area. This allows clients to recreate a similar PDF viewer on-premises, even without access to the SustainabilityDisclosures database. For more details on implementing this functionality, contact support@SustainabilityDisclosures.com.
Immutability
Immutability, a concept popularized by the cryptocurrency industry, ensures that data cannot be altered without leaving a transparent and verifiable record.
A major aspect of immutability is supported by the traceability feature, but traceability alone isn’t sufficient when datapoints are corrected or normalized to ensure standardization. In cases where data collection remains predominantly manual (as is common among many data vendors today), establishing a transparent record of how data has been modified is challenging. Often, there is no reliable way to determine if the data has been altered at all.
SustainabilityDisclosures clearly differentiates between a datapoint’s original value, as reported in the original publication, and its amended value. This distinction is reflected across all outputs:
- API Outputs: Each datapoint includes two properties: .value (original) and .amendedValue (adjusted).
- Excel Outputs: Amended values are italicized, and the respective cells include comments showing the original values.
- Web Pages: Amended values are similarly highlighted in italics.
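A minimal sketch of how a client might consume the two properties. It assumes each datapoint arrives as a JSON-like object with `value` and `amendedValue` keys, and that `amendedValue` is null when no amendment was made. The payload shape and the helper name `effective_value` are illustrative assumptions, not the documented API.

```python
from typing import Any, Optional


def effective_value(datapoint: dict[str, Any]) -> tuple[Optional[float], bool]:
    """Return (value to use, whether it was amended).

    Assumption: a missing or null "amendedValue" means the original
    reported value stands unchanged.
    """
    amended = datapoint.get("amendedValue")
    if amended is not None:
        return amended, True
    return datapoint.get("value"), False


# Hypothetical datapoints, for illustration only.
reported = {"value": 12.5, "amendedValue": None}
corrected = {"value": 125.0, "amendedValue": 12.5}

print(effective_value(reported))
print(effective_value(corrected))
```

Returning the amended flag alongside the number lets downstream code mirror the italics used in the Excel and web outputs.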
SustainabilityDisclosures automates most calculations and corrections, allowing these to be transparently displayed in the output. In Excel files, the formulas used in the calculations are embedded directly in the cells where the values are inserted. In such cases, the traceability links reference all datapoints involved in the calculation.
There are three primary scenarios where a datapoint may need to be amended:
Normalisation
The most common reason for amendment is normalisation, where we ensure that data is standardized and comparable across the entire database. This often involves straightforward adjustments, such as converting measurement units.
In corporate EU Taxonomy disclosures, normalisation is particularly crucial in cases where companies disclose contributions of aligned activities to specific environmental goals. Some companies report these as shares of total aligned activities (shown on the left side of the regulatory template table), while others report them as proportions of grand total revenues, opex, or capex at the bottom of the table.
SustainabilityDisclosures standardizes all such disclosures according to the first approach, where contributions are presented as shares of total aligned activities. This adjustment is necessary for approximately half of the companies in the database.
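The conversion between the two presentations can be sketched as follows. This is an illustration of the arithmetic only, assuming both inputs are percentages of the same grand total (revenue, opex, or capex). The function and parameter names are hypothetical.

```python
from typing import Optional


def share_of_aligned(contribution_pct_of_total: float,
                     aligned_pct_of_total: float) -> Optional[float]:
    """Convert a contribution expressed as a percentage of the grand total
    into a percentage of total aligned activities.

    Returns None when nothing is aligned, because the share is undefined.
    """
    if aligned_pct_of_total == 0:
        return None
    return contribution_pct_of_total / aligned_pct_of_total * 100


# Hypothetical company: 8% of revenue is aligned, and 6% of revenue
# contributes to one environmental goal.
print(share_of_aligned(6.0, 8.0))  # 75.0
```

In this example the goal accounts for 75% of aligned activities, which is the form the database stores.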
Filling Omitted Fields
The second most common reason for amending data involves completing fields omitted in company disclosures. Regulatory reporting templates specify required fields, yet many companies fail to report even critical metrics, such as the share of aligned activities. This issue is particularly prevalent among companies reporting zero alignment. While they may explicitly disclose zero absolute amounts, they often leave key KPI fields, such as the share of aligned activities, empty. In these instances, SustainabilityDisclosures automatically calculates and populates the missing fields based on the available reported data, ensuring completeness and consistency.
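A minimal sketch of filling an omitted share from reported absolute amounts. It assumes the calculated figure goes into `amendedValue` while `value` stays empty because the company reported nothing. That mapping, and the function name, are assumptions for illustration rather than a description of the production logic.

```python
from typing import Optional


def fill_aligned_share(aligned_amount: Optional[float],
                       total_amount: Optional[float],
                       reported_share: Optional[float]) -> dict:
    """Populate a missing share of aligned activities when possible."""
    if reported_share is not None:
        # The company reported the field, so nothing needs filling.
        return {"value": reported_share, "amendedValue": None}
    if aligned_amount is None or not total_amount:
        # Not enough reported data to calculate the share.
        return {"value": None, "amendedValue": None}
    return {"value": None, "amendedValue": aligned_amount / total_amount}


# Hypothetical zero-alignment company that left the KPI field empty.
print(fill_aligned_share(0.0, 1000.0, None))
```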
Correcting Reporting Errors
The final scenario involves correcting mistakes in company disclosures. Often, broader disclosure details provide enough evidence to justify corrections. For example, a company might report all aligned and eligible activities, along with totals, absolute amounts, and shares. While most figures may align correctly, a single typo can disrupt the mathematical relationships within the disclosure, making the error easily identifiable. In such cases, corrections are applied.
However, if insufficient evidence exists to identify the error, the .amendedValue field remains blank, and the original (suspected incorrect) value is retained in the .value field. This approach ensures transparency and avoids propagating untrustworthy data.
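The idea of using redundant figures to locate a single typo can be sketched as below. This is one possible heuristic under stated assumptions, not the actual correction logic: it assumes the shares and the total are trustworthy, flags rows whose amount disagrees with its share, and proposes a correction only when exactly one row is flagged and the corrected amounts sum to the total. All names are hypothetical.

```python
from typing import Optional


def locate_single_typo(amounts: list[float], shares: list[float],
                       total: float, tol: float = 1e-6
                       ) -> Optional[tuple[int, float]]:
    """Return (row index, corrected amount), or None when there is no
    error or the evidence is insufficient to pinpoint one."""
    if total <= 0:
        return None
    suspects = [i for i, (a, s) in enumerate(zip(amounts, shares))
                if abs(a / total - s) > tol]
    if len(suspects) != 1:
        return None  # zero suspects: consistent; several: ambiguous
    i = suspects[0]
    implied = shares[i] * total
    others = sum(a for j, a in enumerate(amounts) if j != i)
    if abs(others + implied - total) > tol * total:
        return None  # the implied amount does not restore the total
    return i, implied


# Hypothetical disclosure where 300 was mistyped as 30.
result = locate_single_typo([120.0, 30.0, 80.0], [0.24, 0.60, 0.16], 500.0)
print(result)
```

When the function returns None for an inconsistent disclosure, the case corresponds to leaving `.amendedValue` blank.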
The outputs do not currently classify amendment types, but this classification can be provided upon request. For more details, please contact support@SustainabilityDisclosures.com.
Logic and Context Validation
Logic and Context Validation ensures that every datapoint makes sense from the analyst’s perspective. This involves evaluating data from multiple angles to determine its coherence. While most data vendors limit such checks to identifying statistical anomalies, outliers, or values that appear significantly "off", SustainabilityDisclosures goes further by aligning data validation with the way analysts themselves approach it.
We verify whether each datapoint aligns with the overall presentation pattern, ensuring that mathematical relationships hold true, the data adheres to the standards (which are particularly strict for regulatory disclosures), and the logic of the presentation is consistent and coherent.
For example, in EU Taxonomy disclosures, all activities must add up to the total at the bottom. Proportions should match the respective absolute amount to the left, divided by the total at the bottom. Subtotals must align with grand totals. Textual fields must follow the regulatory standard and remain mathematically connected to the reported proportions, among other things.
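Two of these rules can be sketched as simple checks. The example below is a simplified illustration, assuming each row carries an absolute amount and a proportion expressed as a fraction of the grand total. The `ActivityRow` type, the function name, and the tolerance are hypothetical, and the real rule set is considerably larger.

```python
from dataclasses import dataclass


@dataclass
class ActivityRow:
    name: str
    amount: float      # absolute amount reported for the activity
    proportion: float  # reported fraction of the grand total (0..1)


def validate_disclosure(rows: list[ActivityRow], reported_total: float,
                        tol: float = 1e-6) -> list[str]:
    """Return a list of human-readable rule violations (empty if none)."""
    if reported_total <= 0:
        return ["reported total must be positive"]
    violations = []
    # Rule 1: all activities must add up to the total at the bottom.
    row_sum = sum(r.amount for r in rows)
    if abs(row_sum - reported_total) > tol * reported_total:
        violations.append("amounts do not add up to the total")
    # Rule 2: each proportion must equal its amount divided by the total.
    for r in rows:
        if abs(r.amount / reported_total - r.proportion) > tol:
            violations.append(f"proportion mismatch: {r.name}")
    return violations


# Hypothetical disclosure with one inconsistent proportion.
rows = [
    ActivityRow("Aligned", 200.0, 0.20),
    ActivityRow("Eligible, not aligned", 300.0, 0.35),
    ActivityRow("Non-eligible", 500.0, 0.50),
]
print(validate_disclosure(rows, 1000.0))
```

Collecting all violations rather than stopping at the first one reflects the idea of judging the disclosure as a whole.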
As a result, the entire disclosure is governed by dozens of logical rules that interconnect all reported fields into a unified framework. Rather than assessing each datapoint in isolation, we evaluate the entire disclosure as a cohesive whole. This approach mirrors the reasoning of a human analyst: in effect, it is a rule-based form of artificial intelligence designed to replicate an analyst's way of thinking.
Our Excel file output offers a glimpse into how the Logic and Context Validation process works. Some (though not all) presentation patterns are reflected in Excel files through conditional formatting rules. If a pattern is violated, the rule triggers, highlighting the respective cells and indicating likely errors.