Single Source of Truth
In information systems design and theory, as instantiated at the enterprise level, Single Source of Truth (SSOT) refers to the practice of structuring information models and associated schemata such that every data element is stored exactly once (e.g. in no more than a single row of a single table). Any linkages to this data element (possibly in other areas of the relational schema or even in distant federated databases) are by reference only. Thus, when such a data element is updated, the update propagates to the enterprise at large, and no duplicate value elsewhere in the enterprise can be left stale, because no duplicates exist.
Deployment of this architecture is becoming increasingly important in enterprise settings where incorrectly linked duplicate or denormalized data elements (a direct consequence of intentional or unintentional denormalization of an explicit data model) pose a risk of retrieving outdated, and therefore incorrect, information. A common example is the electronic health record, where it is imperative to accurately validate patient identity against a single referential repository, which serves as the SSOT. Where data must appear in more than one place within the enterprise, this is implemented with pointers rather than duplicate database tables, rows or cells. This ensures that updates to elements in the authoritative location reach all federated database constituencies in the larger enterprise architecture.
SSOT systems provide data that is authentic, relevant and referable.
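The "stored once, linked by reference" idea can be illustrated with a minimal Python sketch. The `Patient`, `Appointment` and `Invoice` classes, the `patients` store and all sample values are hypothetical names chosen for illustration; a real system would use a database and foreign keys rather than an in-memory dictionary.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    name: str
    address: str


# The single authoritative store: each patient exists exactly once.
patients: dict[str, Patient] = {
    "p-001": Patient("p-001", "Ann Example", "1 Old Road"),
}


@dataclass
class Appointment:
    # Holds a reference (the key), not a copy of the patient's details.
    patient_id: str
    slot: str

    def patient(self) -> Patient:
        return patients[self.patient_id]


@dataclass
class Invoice:
    patient_id: str
    amount: float

    def patient(self) -> Patient:
        return patients[self.patient_id]


appointment = Appointment("p-001", "2024-05-01T09:00")
invoice = Invoice("p-001", 120.0)

# One update in the authoritative location ...
patients["p-001"].address = "2 New Street"

# ... is visible through every reference, because no copy exists.
print(appointment.patient().address)
print(invoice.patient().address)
```

Because `Appointment` and `Invoice` hold only the key, there is no second address field that could fall out of date.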
SSOT categories
- Party SSOT
- Product SSOT
- Partner SSOT
- Geo SSOT
- Campaign SSOT
- Services SSOT
Implementing a Single Source of Truth
The "ideal" implementation of SSOT as described above is rarely possible in practice. Most organisations have multiple information systems, each of which needs access to data relating to the same entities (e.g. customer). These systems are usually purchased "off the shelf" from vendors and cannot be modified in non-trivial ways, so each system must retain its own copy of common records, which immediately violates the SSOT approach defined above. For example, an ERP (Enterprise Resource Planning) system (such as SAP or Oracle e-Business Suite) may store a customer record; the CRM (Customer Relationship Management) system also needs a copy of the customer record (or part of it); and the warehouse despatch system may need a copy of some or all of the customer data (e.g. shipping address). It may not be possible to replace these records with pointers to the SSOT, as the vendor(s) may not support such modifications.
For organisations with more than one information system that wish to implement a Single Source of Truth (without modifying all but one master system to store pointers to other systems for all entities), three supporting technologies are commonly used: an Enterprise Service Bus (ESB), Master Data Management (MDM), and a Data Warehouse.
Enterprise Service Bus
An Enterprise Service Bus (ESB) allows any number of systems in an organisation to receive updates of data that has changed in another system. To implement a Single Source of Truth, a single source system of correct data must be identified for each entity; this master source is sometimes called the Golden Record. Changes to the entity (creates, updates and deletes) are then published via the ESB; other systems that need to retain a copy of the data subscribe to these changes and update their own records accordingly. Any given system may publish (be the source of truth for) information on one entity (e.g. customer) while subscribing to updates from another system for information on other entities (e.g. product).
An alternative approach is point-to-point data updates. However, the number of possible pairwise interfaces grows roughly with the square of the number of systems, so this approach becomes disproportionately expensive to maintain as systems are added, and it is increasingly out of favour as an IT architecture.
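The publish/subscribe flow described above might be sketched as follows. This is a toy, in-process stand-in rather than a real ESB product: `ServiceBus`, `LocalCopy`, the `"customer"` topic name and the choice of ERP as master source are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str, dict], None]


class ServiceBus:
    """Toy stand-in for an ESB: entity names map to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, entity: str, handler: Handler) -> None:
        self._subscribers[entity].append(handler)

    def publish(self, entity: str, action: str, record: dict) -> None:
        for handler in self._subscribers[entity]:
            handler(action, dict(record))  # each subscriber gets its own copy


class LocalCopy:
    """A subscribing system's local copy of one entity."""

    def __init__(self) -> None:
        self.records: dict[str, dict] = {}

    def on_change(self, action: str, record: dict) -> None:
        if action == "delete":
            self.records.pop(record["id"], None)
        else:  # "create" and "update" both upsert the record
            self.records[record["id"]] = record


bus = ServiceBus()
crm_customers = LocalCopy()
despatch_customers = LocalCopy()
bus.subscribe("customer", crm_customers.on_change)
bus.subscribe("customer", despatch_customers.on_change)

# The ERP system is assumed here to be the master source for "customer".
bus.publish("customer", "create", {"id": "c-1", "address": "1 Old Road"})
bus.publish("customer", "update", {"id": "c-1", "address": "2 New Street"})
```

Each system needs only one connection to the bus, whereas a fully connected point-to-point design with n systems could need up to n(n-1)/2 pairwise interfaces (45 for ten systems).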
Master Data Management
An MDM system can act as the source of truth for any given entity, whether or not that entity has an alternative "source of truth" in another system. Typically the MDM acts as a hub for multiple systems, many of which may accept updates to (be the source of truth for) different aspects of information on a given entity. For example, the CRM system may be the "source of truth" for most aspects of the customer, and is updated by a call centre operator. However, a customer may also update their address via a customer service web site that has a different back-end database from the CRM system. The MDM application receives updates from multiple sources, acts as a broker to determine which updates are to be regarded as authoritative (the Golden Record), and then syndicates this updated data to all subscribing systems. The MDM application normally requires an ESB to syndicate its data to multiple subscribing systems.
Customer Data Integration (CDI) is a common application of Master Data Management, and is sometimes abbreviated CDI-MDM.
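The brokering role of an MDM hub could look roughly like the sketch below. The survivorship policy used here (a per-attribute list of trusted sources, with the latest timestamp winning) is only one possible rule and is an assumption, as are the names `MdmHub`, `AUTHORITATIVE_SOURCES` and the sample data; real MDM products offer their own configurable rules.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical policy: which source systems may change which attribute.
AUTHORITATIVE_SOURCES = {
    "name": {"crm"},
    "phone": {"crm"},
    "address": {"crm", "web"},
}


@dataclass
class MdmHub:
    golden: dict[str, dict] = field(default_factory=dict)
    # Timestamp of the last accepted change per (entity id, attribute).
    _changed_at: dict[tuple[str, str], int] = field(default_factory=dict)
    subscribers: list[Callable[[str, dict], None]] = field(default_factory=list)

    def receive(self, source: str, entity_id: str, changes: dict, timestamp: int) -> None:
        record = self.golden.setdefault(entity_id, {})
        accepted = False
        for attribute, value in changes.items():
            if source not in AUTHORITATIVE_SOURCES.get(attribute, set()):
                continue  # this source is not trusted for this attribute
            if timestamp < self._changed_at.get((entity_id, attribute), -1):
                continue  # older than what the golden record already holds
            record[attribute] = value
            self._changed_at[(entity_id, attribute)] = timestamp
            accepted = True
        if accepted:
            # Syndicate the updated golden record to every subscribing system.
            for notify in self.subscribers:
                notify(entity_id, dict(record))


syndicated = []
hub = MdmHub()
hub.subscribers.append(lambda entity_id, record: syndicated.append((entity_id, record)))

hub.receive("crm", "c-1", {"name": "Ann Example", "address": "1 Old Road"}, timestamp=1)
hub.receive("web", "c-1", {"address": "2 New Street", "name": "Typo Name"}, timestamp=2)
hub.receive("crm", "c-1", {"address": "Stale Lane"}, timestamp=1)
```

In this run the web site's address change is accepted, its name change is rejected because only the CRM is trusted for names, and the late-arriving stale CRM address is ignored, so subscribers are notified twice.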
Data Warehouse
While the primary purpose of a data warehouse is to support reporting and analysis of data that has been combined from multiple sources, the fact that the data has been combined (according to business logic embedded in the extract, transform and load processes) means that the data warehouse is often used as a de facto SSOT. Generally, however, the data available from the data warehouse is not used to update other systems; rather, the DW becomes the "single source of truth" for reporting to multiple stakeholders. In this context the Data Warehouse is more correctly referred to as a "Single Version of the Truth", since other versions of the truth may exist in its data sources.
See also
- Don't Repeat Yourself
- SOLID (object-oriented design)
- Database normalization
- Single Version of the Truth