How to resolve the data control versus data access conundrum
Data management is often said to consist of sourcing, mastering and distribution. This view casts data management as a control function: getting the required source data in; cross-referencing, validating and approving it; and then releasing it to stakeholders in a controlled fashion, often through one or more batches during the day. Data management teams responsible for delivery that supports the daily cycles of settlement, valuation and reporting prefer to work in this staged, controlled way to minimize the risk of disruption to operations.
However, downstream users increasingly demand ad-hoc access to cleansed data for use cases that require data exploration and where it is hard or impossible to define parameters a priori. Data exploration can entail executing large queries and running advanced analytics, including machine learning algorithms, on massive sets of financial time series. To serve these users, separate data stores are typically constructed for, for example, quantitative research, risk modelling or scenario management. Because these data stores are not closely linked to the data acquisition and mastering platform, issues arise in timeliness, quality and mapping. In larger organizations this can happen multiple times over. All of this leads to high maintenance costs on top of the operational risk of using various unconnected or only loosely synced local data copies, a growing concern as firms increasingly need to be able to explain the data values they used in their modelling.
Existing financial-services-focused Enterprise Data Management solutions fall short when it comes to dynamically acquiring, preparing and provisioning data sets for data exploration, while generic NoSQL and analytics solutions lack financial services data domain experience. This has left firms between a rock and a hard place: stay with focused but rigid solutions, or spend significant resources implementing and customizing generic ones.
Alveo has taken a new approach. Long known for its highly scalable and comprehensive solution for the automated acquisition and mastering of pricing and reference data, it has expanded its offering to cater to quants and data scientists in risk, product control and investment decision support.
Data operations are complicated by the proliferation of identifiers, taxonomies, data formats and standards. Alveo’s Prime data mastering solution addresses this through service integration with all main data providers, cross-referencing their content to a common industry model. Alveo’s data exploration solution, Alpha, takes mastered data in real time from Prime and provides a dedicated environment for modelers and data scientists. Alpha uses the same business domain model, bringing structure to the underlying NoSQL technology and offering easy access to a range of data providers. Data lineage provides end-to-end transparency into the data supply chain, ensuring the explainability of model results and of the data values used. State-of-the-art user interfaces and APIs offer easy access and integration with customer libraries in Python, R or other languages.
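To make the cross-referencing idea concrete, here is a minimal, hypothetical sketch in Python (the class and field names are invented for illustration and are not Alveo's actual API): vendor-specific identifiers such as an ISIN, CUSIP or vendor ticker are indexed against a single mastered "golden" record in a common domain model, so a lookup under any known scheme resolves to the same record.

```python
from dataclasses import dataclass, field

@dataclass
class GoldenRecord:
    """One mastered record in the common domain model (illustrative only)."""
    internal_id: str
    identifiers: dict          # e.g. {"ISIN": "...", "CUSIP": "..."}
    attributes: dict = field(default_factory=dict)

class CrossReference:
    """Maps any (scheme, value) identifier pair to its mastered record."""
    def __init__(self):
        self._index = {}       # (scheme, value) -> GoldenRecord
        self._records = {}     # internal_id -> GoldenRecord

    def master(self, record: GoldenRecord) -> None:
        # Register the record and index every identifier it carries.
        self._records[record.internal_id] = record
        for scheme, value in record.identifiers.items():
            self._index[(scheme, value)] = record

    def resolve(self, scheme: str, value: str):
        # Returns the mastered record, or None if the identifier is unknown.
        return self._index.get((scheme, value))

# Usage: two vendor feeds refer to the same instrument by different
# identifiers; both resolve to the single mastered record.
xref = CrossReference()
xref.master(GoldenRecord(
    internal_id="EQ-000001",
    identifiers={"ISIN": "US0378331005", "CUSIP": "037833100", "TICKER": "AAPL"},
    attributes={"name": "Apple Inc."},
))
```

The point of the single index is that downstream consumers never need to know which vendor's identifier scheme a record arrived under.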
With our new approach, which addresses the need for data management for insight, we make sure users make the most of the data sets the firm has and can be assured of their quality, with data lineage to drill down into the origins, transformations and business rules the data has undergone. This should help users in sourcing, mastering and distribution too.
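As a rough illustration of the lineage concept (the names below are invented for this sketch and not taken from Alveo's products), each transformation applied to a value can append a step to an audit trail, so the origin and every business rule applied remain inspectable afterwards:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageStep:
    """One recorded transformation or business rule (illustrative only)."""
    rule: str      # name of the rule applied
    detail: str    # human-readable description of what happened
    at: str        # UTC timestamp of the step

@dataclass
class TracedValue:
    """A data value that carries its own lineage trail."""
    value: float
    source: str
    lineage: list = field(default_factory=list)

    def apply(self, rule: str, detail: str, new_value: float) -> "TracedValue":
        # Return a new value with one more lineage step appended;
        # the original object is left untouched.
        stamped = datetime.now(timezone.utc).isoformat()
        return TracedValue(
            value=new_value,
            source=self.source,
            lineage=self.lineage + [LineageStep(rule, detail, stamped)],
        )

# Usage: a raw vendor price passes a quality check and is then normalized.
raw = TracedValue(value=101.5, source="VendorFeedA")
cleaned = raw.apply("stale-price-check", "passed: last update < 1 day", 101.5)
adjusted = cleaned.apply("currency-normalization", "GBp -> GBP / 100", 1.015)
```

Because every step is recorded rather than applied in place, a modeler can answer "where did this number come from?" by walking `adjusted.lineage` back to the source feed.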