In information science and information technology, single source of truth (SSOT) architecture, or single point of truth (SPOT) architecture, for information systems is the practice of structuring information models and associated data schemas such that every data element is mastered (or edited) in only one place, providing data normalization to a canonical form (for example, in database normalization or content transclusion). Any possible linkages to this data element (possibly in other areas of the relational schema or even in distant federated databases) are by reference only. Because all other locations of the data just refer back to the primary "source of truth" location, updates to the data element in the primary location propagate to the entire system, providing multiple advantages simultaneously: greater efficiency/productivity, easy prevention of mistaken inconsistencies (such as a duplicate value/copy somewhere being forgotten), and greatly simplified version control. Without SSOT architecture, rampant forking impairs clarity and productivity, imposing laborious maintenance needs.
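The "by reference only" idea can be illustrated with a minimal sketch. The `Customer` and `Order` classes and the in-memory `customers` dictionary are hypothetical names chosen for illustration, not part of any particular system. The order stores only the customer's identifier, so an edit made in the one mastered location is visible wherever the data is read.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    shipping_address: str


# Hypothetical in-memory store: the only place where customer data is mastered.
customers = {1: Customer(1, "12 Old Road")}


@dataclass
class Order:
    order_id: int
    customer_id: int  # a reference to the master record, not a copy of its data

    def shipping_address(self) -> str:
        # Resolve the reference at read time, so the value is always current.
        return customers[self.customer_id].shipping_address


order = Order(order_id=100, customer_id=1)
customers[1].shipping_address = "7 New Street"  # edit in the mastered location only
print(order.shipping_address())  # prints "7 New Street"
```

Had the order stored its own copy of the address, the edit would have had to be repeated there, which is the kind of forgotten duplicate that SSOT is meant to prevent.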

Deployment of an SSOT architecture is becoming increasingly important in enterprise settings where incorrectly linked duplicate or de-normalized data elements (a direct consequence of intentional or unintentional denormalization of any explicit data model) pose a risk for retrieval of outdated, and therefore incorrect, information. Common examples (i.e., example classes of implementation) are as follows:

Ideally, SSOT systems provide data that are authentic (and authenticatable), relevant, and referable.[1]

Implementation

Ontologic interactions

The notion that any given single source of truth can exist depends on an acknowledged prerequisite: the ontologic condition that no more than a single truth (about any particular fact or idea) exists, an assertion that is ontologic in both the IT sense and the general sense of that word. In many instances, this presents no problem (for example, within particular namespaces, or even across them, as long as naming collisions or broader name conflicts are adequately handled). The broadest contexts (and thus thorniest, regarding ontologic discrepancies) require adequate epistemic regime comparison and reconciliation (or at least negotiation or transactional exchanges). An archetypal example of this class of reconciliation is that two theological seminary libraries, from two different religions (X and Y), could exchange information with an SSOT architecture, but the unification of truth would reside on the level of the statement that "religion X asserts that God is purple whereas religion Y asserts that God is green", rather than on the level of "God is purple" or "God is green". This platform-agnostic concept has civil applications and foreign relations applications as well, regarding jurisdictional differences in legal definitions: for example, whether any particular economic lens is good, bad, or syncretizable (e.g., capitalism, socialism, mixed economy), legal definitions of marriage (EHR example: is patient X married or not, according to which state law or its interstate reciprocity), whether sex assignment equals legal gender (EHR example: what is the gender or gender identity of patient X, according to the patient themselves (self-report) or according to someone else), and so on.

Architectures or architectural features

An ideal implementation of SSOT is rarely possible in most enterprises. This is because many organisations have multiple information systems, each of which needs access to data relating to the same entities (e.g., customer). Often these systems are purchased as commercial off-the-shelf products from vendors and cannot be modified in trivial ways. Each of these various systems therefore needs to store its own version of common data or entities, and therefore each system must retain its own copy of a record (hence immediately violating the SSOT approach defined above). For example, an enterprise resource planning (ERP) system (such as SAP or Oracle e-Business Suite) may store a customer record; the customer relationship management (CRM) system also needs a copy of the customer record (or part of it) and the warehouse dispatch system might also need a copy of some or all of the customer data (e.g., shipping address). In cases where vendors do not support such modifications, it is not always possible to replace these records with pointers to the SSOT.

For organisations with more than one information system that wish to implement a single source of truth without modifying every system except the master to store pointers to other systems for all entities, the following supporting architectures are commonly used:[citation needed]

Enterprise service bus (ESB)

An enterprise service bus (ESB) allows any number of systems in an organisation to receive updates of data that has changed in another system. To implement a Single Source of Truth, a single source system of correct data for any entity must be identified. Changes to this entity (creates, updates, and deletes) are then published via the ESB; other systems which need to retain a copy of that data subscribe to this update, and update their own records accordingly. For any given entity, the master source must be identified (sometimes called the golden record). Any given system could publish (be the source of truth for) information on a particular entity (e.g., customer) and also subscribe to updates from another system for information on some other entity (e.g., product).[citation needed]
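The publish and subscribe flow described above can be sketched in a few lines. This is a minimal in-process illustration, not a real ESB product: the `Bus` class, the `"customer"` topic name, and the CRM and warehouse dictionaries are all assumptions made for the example. The CRM is treated as the master for customer data and the warehouse system keeps a local copy that it updates only in response to published changes.

```python
from collections import defaultdict


class Bus:
    """Toy in-process stand-in for an enterprise service bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the change to every system subscribed to this entity.
        for handler in self._subscribers[topic]:
            handler(message)


bus = Bus()
crm_customers = {}        # master source for the customer entity (assumed)
warehouse_customers = {}  # subscriber's local copy


def warehouse_on_customer_change(message):
    # The subscriber never edits customer data itself; it only mirrors changes.
    if message["op"] == "delete":
        warehouse_customers.pop(message["id"], None)
    else:
        warehouse_customers[message["id"]] = {"address": message["data"]["address"]}


bus.subscribe("customer", warehouse_on_customer_change)


def crm_upsert_customer(customer_id, data):
    crm_customers[customer_id] = data
    bus.publish("customer", {"op": "upsert", "id": customer_id, "data": data})


def crm_delete_customer(customer_id):
    crm_customers.pop(customer_id, None)
    bus.publish("customer", {"op": "delete", "id": customer_id})


crm_upsert_customer(1, {"name": "Alice Smith", "address": "7 New Street"})
print(warehouse_customers)
```

The same system could subscribe to a different topic (for example, product) for which another system is the master.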

An alternative approach is point-to-point data updates, but these become excessively expensive to maintain as the number of systems increases, and this approach is increasingly out of favour as an IT architecture.[citation needed]

Master data management (MDM)

An MDM system can act as the source of truth for any given entity that might not necessarily have an alternative "source of truth" in another system. Typically the MDM acts as a hub for multiple systems, many of which could allow (be the source of truth for) updates to different aspects of information on a given entity. For example, the CRM system may be the "source of truth" for most aspects of the customer, and is updated by a call centre operator. However, a customer may (for example) also update their address via a customer service web site, with a different back-end database from the CRM system. The MDM application receives updates from multiple sources, acts as a broker to determine which updates are to be regarded as authoritative (the golden record) and then syndicates this updated data to all subscribing systems. The MDM application normally requires an ESB to syndicate its data to multiple subscribing systems.[3]
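The brokering step can be sketched as follows. The rule used here (each field accepts updates only from the systems listed as authoritative for it, and the latest accepted update wins) is an assumption chosen for illustration; real MDM products apply their own, usually configurable, survivorship rules. The names `AUTHORITATIVE_SOURCES` and `merge_golden_record` are hypothetical.

```python
# Hypothetical policy: which source systems may update which fields.
AUTHORITATIVE_SOURCES = {
    "name": {"crm"},
    "address": {"crm", "web_portal"},
}


def merge_golden_record(golden, updates):
    """Return a new golden record after applying the authoritative updates.

    Updates are applied in timestamp order; a field is accepted only when
    the update's source is authoritative for that field.
    """
    result = dict(golden)
    for update in sorted(updates, key=lambda u: u["timestamp"]):
        for field, value in update["fields"].items():
            if update["source"] in AUTHORITATIVE_SOURCES.get(field, set()):
                result[field] = value
    return result


golden = {"name": "A. Smith", "address": "12 Old Road"}
updates = [
    # The web site may change the address but not the name.
    {"source": "web_portal", "timestamp": 2,
     "fields": {"address": "7 New Street", "name": "Al"}},
    {"source": "crm", "timestamp": 1, "fields": {"name": "Alice Smith"}},
]
golden = merge_golden_record(golden, updates)
print(golden)  # {'name': 'Alice Smith', 'address': '7 New Street'}
```

The resulting record is what the MDM application would then syndicate to subscribing systems.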

Data warehouse (DW)

While the primary purpose of a data warehouse is to support reporting and analysis of data that has been combined from multiple sources, the fact that such data has been combined (according to business logic embedded in the data transformation and integration processes) means that the data warehouse is often used as a de facto SSOT. Generally, however, the data available from the data warehouse are not used to update other systems; rather the DW becomes the "single source of truth" for reporting to multiple stakeholders. In this context, the Data Warehouse is more correctly referred to as a "single version of the truth" since other versions of the truth exist in its operational data sources (no data originates in the DW; it is simply a reporting mechanism for data loaded from operational systems).[citation needed]

Event store and event sourcing (ES)

In event-oriented architectures, it has become increasingly common to implement the Event Sourcing pattern, which stores the system state as an ordered sequence of state changes.[4] This requires an event store, a type of database designed to hold all the events that change the state of the system. In an architecture that combines Event Sourcing, Command Query Responsibility Segregation (CQRS), Domain-Driven Design, and messaging, the event store is in effect a "single source of truth". It has the additional advantage that it can also act as an enterprise service bus, because other components can listen directly to the event store for state changes as they occur. In addition, because it saves all the events, it can also play the role of a data warehouse. Finally, the Shared Database pattern, another technique for obtaining a single source of truth that is not otherwise covered here, can be implemented through an event store.
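A minimal sketch of the pattern, assuming a toy bank-account domain: the `Event` and `EventStore` classes and the `balance` function are hypothetical names, and the store is an in-memory list rather than a real database. The current state is never stored directly. It is derived by replaying the ordered events, and listeners are notified of each new event as it is appended.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" in this toy domain
    amount: int


class EventStore:
    """Append-only, ordered log of events with simple listener support."""

    def __init__(self):
        self._events = []
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def append(self, event):
        self._events.append(event)
        # Notify other components of the state change as it happens.
        for listener in self._listeners:
            listener(event)

    def replay(self):
        # Return a copy so callers cannot rewrite history.
        return list(self._events)


def balance(events):
    """Derive the current state by folding over the event history."""
    total = 0
    for event in events:
        if event.kind == "deposited":
            total += event.amount
        elif event.kind == "withdrawn":
            total -= event.amount
    return total


store = EventStore()
seen = []                    # stands in for a subscribing system
store.subscribe(seen.append)
store.append(Event("deposited", 100))
store.append(Event("withdrawn", 30))
print(balance(store.replay()))  # prints 70
```

Because every event is retained, the same log can be replayed later to answer historical questions, which is the sense in which the store can double as a reporting source.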

Solid and source code

In software design, the same schema, business logic and other components are often repeated in multiple different contexts, while each version refers to itself as "Source Code". To address this problem, the concepts of SSOT can also be applied to software development principles using processes like recursive transcompiling to iteratively turn a single source of truth into many different kinds of source code, which will match each other structurally because they are all derived from the same SSOT.[5]
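The general idea of deriving several kinds of source code from one definition can be sketched with plain code generation. This is a simplified illustration and not the recursive transcompiling process cited above; the `SCHEMA` structure and the `to_sql` and `to_python_class` functions are hypothetical.

```python
# Hypothetical single definition: (field name, SQL type, Python type).
SCHEMA = {
    "table": "customer",
    "fields": [("id", "INTEGER", "int"), ("name", "TEXT", "str")],
}


def to_sql(schema):
    """Generate a CREATE TABLE statement from the schema."""
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type, _ in schema["fields"])
    return f"CREATE TABLE {schema['table']} ({columns});"


def to_python_class(schema):
    """Generate the source text of a matching Python dataclass."""
    lines = [
        "from dataclasses import dataclass",
        "",
        "@dataclass",
        f"class {schema['table'].capitalize()}:",
    ]
    lines += [f"    {name}: {py_type}" for name, _, py_type in schema["fields"]]
    return "\n".join(lines)


print(to_sql(SCHEMA))
print(to_python_class(SCHEMA))
```

Adding a field to `SCHEMA` changes both generated artifacts together, so the table definition and the class cannot drift apart.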

Distributed SaaS data (DSD)

In cases where storing data centrally and managing it in reference locations is impractical, such as in B2B software data ecosystems where there are multiple sources of truth, companies use a DSD system. This system plays air traffic controller to provide a veneer of central data management and control by pushing updates to and enforcing data accuracy in the locations where it is stored.

Data access and field productivity

Adoption of a single source of truth execution model is on the rise in the energy sector, where the technological advancements brought about by Industry 4.0 have enabled operators to improve field productivity. With an accessible SSOT for an industrial asset, owners are able to maximize worker efficiency by providing wireless mobility that enables on-demand access to verifiable field data, engineering drawings and inventory and communications with centralized operations experts.[6]

See also

  • Blockchain, distributed data store for digital transactions
  • Circular reporting, problem where a source gets information from somewhere that in turn cites that source as its reference
  • Database normalization, technique for designing tables in relational databases such that duplication of information is minimised
  • Don't repeat yourself (DRY), a principle in software development that aims to reduce repetitive software patterns, replacing them with abstractions that use data normalization to avoid redundancy
  • Keep it simple, stupid (KISS)
  • Single version of the truth, ideal where all the data of an organisation is stored in a consistent and non-redundant form
  • Single version of facts, concept in data vault
  • SOLID (object-oriented design), five design principles for making object-oriented designs understandable, flexible, and maintainable (Single-responsibility; Open–closed; Liskov substitution; Interface segregation; Dependency inversion)
  • System of record, the authoritative data source for a given data element
  • Unix philosophy, a collection of cultural norms and philosophical approaches

References