Alveo Blog Data Management

Towards a Data Bill of Rights

The solutions and tooling available for data management have developed rapidly. With the advent of public cloud ecosystems and the continuously increasing data intensity of the financial services industry, every employee of a financial services firm has had to work on their data literacy.

The self-service data access and analytics firepower available to staff today come with increased responsibility. Some firms have learned the hard way that data, like any other asset, needs to be maintained and cared for. Not properly looking after data can lead to severe backlash from legislators, regulators, customers and investors. The concept of data citizens, employees who use data in an informed and responsible manner, has rightly found fertile ground.

However, with increasing automation, business process integration and a move in prescriptive analytics from decision support to decision automation, the human role is in some cases shifting to overall process design, supervision and control. The advent of generative AI is driving further automation. It may therefore be helpful to look at rights and responsibilities from the perspective of the data itself: data that enters organizations, flows through business processes, is created from scratch, or forms part of a firm’s interface to its customers, investors and regulators.

To what extent does the data look after itself? What would the fundamental principles of data management be from the point of view of a data element rather than a human running a business process? What could a bit of data reasonably expect from a data management function to maximize its value to a firm? What are its rights?

If I were to sum up the fundamental principles of sound data management through the lens of a Bill of Rights for data, I would arrive at something like the following:

  1. The right to be easily accessible. Data needs to be seen and to be discoverable to play its part. Within regulatory and content licensing constraints, anyone who benefits from knowing this data exists should be able to access it via different methods. Needless to say, data has freedom of movement, within the constraints mentioned above.
  2. The right to lineage. Data needs to know where it came from, including its ancestry. Its source can be an internal or external data provider, a human who entered it or an analytical model that produced it.
  3. The right to be properly administered. Data needs to be looked after and cared for. This includes keeping track of its date of birth and of its demise, when its use is discontinued or it is archived. Caring for data should include tracking any changes it has undergone in the form of an audit trail. This should also include any changes in ownership and access permissions. Metadata is data too.
  4. The right to shelter. Data needs to be housed properly in an environment where the principles above can be guaranteed and where its value to the organization can be maximized. The data management equivalent of habeas corpus means that data should not be lying around somewhere where the principles of easy access and clarity on metadata are compromised.
  5. The right to assemble. Data often becomes more valuable when linked and combined with other data sets coming from other external sources or other parts of an organization. This way it can contribute to new insights and produce new, often highly proprietary, information. Cross-referencing symbologies and identifiers is a necessary precondition for this.
  6. The right to proper care. Data should not be abused or used in ways that violate duties of care, content licensing, permission or access constraints, or indeed legal principles. The accuracy of data values should be checked via proper data cleansing procedures, and values should be kept up to date and periodically verified.
  7. The right to not be overlooked. Data has the right to be seen and to play its part in adding value to a firm’s operations. It has the right to be used to produce new information, again within the constraints mentioned above and provided it does not infringe the rights of other data.
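
For readers who think in code, rights 2, 3 and 5 can be pictured as metadata that travels with every value. The following is only a minimal sketch under my own assumptions, not a reference design: the class names, fields and sample values (`DataElement`, `AuditEntry`, `BOND-001`, `vendor_feed_a` and so on) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, List, Optional


@dataclass
class AuditEntry:
    """One recorded change: who changed which attribute, from what to what, and when."""
    changed_at: datetime
    changed_by: str
    attribute: str
    old_value: Any
    new_value: Any


@dataclass
class DataElement:
    """A value plus the metadata the rights above ask for (hypothetical model)."""
    identifier: str   # right 5: a key other data sets can cross-reference
    value: Any
    source: str       # right 2: provider, person or model that produced the value
    owner: str        # right 3: current owner
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    retired_at: Optional[datetime] = None  # right 3: set when use is discontinued
    audit_trail: List[AuditEntry] = field(default_factory=list)

    def update(self, attribute: str, new_value: Any, changed_by: str) -> None:
        """Change 'value' or 'owner' and append the change to the audit trail."""
        if attribute not in ("value", "owner"):
            raise ValueError(f"unsupported attribute: {attribute}")
        old_value = getattr(self, attribute)
        setattr(self, attribute, new_value)
        self.audit_trail.append(
            AuditEntry(datetime.now(timezone.utc), changed_by, attribute, old_value, new_value)
        )


# Illustrative usage with made-up values.
price = DataElement(identifier="BOND-001", value=101.25,
                    source="vendor_feed_a", owner="market_data_team")
price.update("value", 101.40, changed_by="cleansing_rule")
price.update("owner", "risk_team", changed_by="data_steward")

for entry in price.audit_trail:
    print(entry.attribute, entry.old_value, "->", entry.new_value, "by", entry.changed_by)
```

The point of the sketch is simply that lineage, ownership and change history sit next to the value rather than in someone's head. A real data management platform would also have to cover access permissions, licensing constraints and storage, which are deliberately left out here.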

 

To sum it up, keep the points above in mind to make the most of your data assets. If you are kind to your data, the data will reciprocate.