Dr. David R. Hardoon is Visiting Faculty at the Sim Kee Boon Institute for Financial Economics, Singapore Management University, and was previously the Chief Data Officer at the Monetary Authority of Singapore.
Shubhangi Agarwal is a Research Assistant at the Sim Kee Boon Institute for Financial Economics, Singapore Management University.
Sri Siva Shankar R, CFA, is a Research Assistant at the Sim Kee Boon Institute for Financial Economics, Singapore Management University.
— – – –

Ever wondered how much your data assets are worth? Rest assured, you are not alone. Today, organizations appreciate that data is a key asset for realizing their ambition of unlocking new value, new revenue streams, and new opportunities in the unfolding digital reality. Currently, it may only be possible to estimate a return on investment a-posteriori, that is, after the data has been acquired and the investment made. There is value in this approach. However, whether we are a firm’s management trying to maximize shareholder value, external investors assessing whether to buy, build or invest in an asset, or even a nation’s government operating amidst international data flows, we may benefit from having an a-priori estimate of our assets’ value.
Presently, data assets are valued using the traditional methodologies used to value other intangible assets. These methodologies, by virtue of their design, pose certain challenges when applied to data assets. Specifically, they yield an incomplete estimate of the value of data assets because they ignore the fact that no two data assets are the same and fail to account for the possibility that a data asset may have features not found in other intangible assets.
Furthermore, the importance of data has been amplified by Covid, as dependence on digital channels has grown rapidly out of necessity. New users are coming online, online transactions are happening at a faster rate, and working from home is the new standard. Organizations are undergoing an accelerated digital transformation to ramp up their teleworking, cloud computing and e-commerce capacities. Consequently, not only is more data available, but more data can also be collected. Therefore, the question "How much are my data assets worth?" is more pertinent now than ever, and it is imperative that we find a way to answer it.
As with most contemporary economic models, one way to approach this is to build upon existing methodologies and frameworks. The existing cost and income methods of estimating the value of intangible assets provide a good starting point, as these methods compute an observable or realized value of an asset. The challenge with these methods is that they do not give a complete a-priori estimate of how much your data assets are worth: in addition to the realized value, data assets also have a future potential value that needs to be accounted for.
To draw a simple analogy, imagine you enroll in a 21-day fitness program. This involves a $200 program fee, 2 hours of exercise a day, and the mental effort of restraining yourself from eating unhealthy food. At the end of the program, you have lost 10 pounds, your waist has reduced by 2 inches, and you have received plenty of compliments. All of these are ‘costs’ and ‘benefits’ which are observable and therefore constitute the realized value of the fitness program. However, is that the entirety of the fitness program’s value? No. There would be additional value in being less likely to gain weight, less likely to contract chronic diseases and so forth. These additional costs and/or benefits occur in the future and are unobservable at the current point in time. These constitute the fitness program’s potential value.
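The split between realized and potential value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the probability-weighted, discounted sum used for potential value is one plausible formalization rather than a method proposed in this article, and the names `FutureBenefit`, `realized_value`, `potential_value` and `total_value` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FutureBenefit:
    """A benefit that may materialize later (all fields are hypothetical inputs)."""
    amount: float       # monetary size of the benefit if it occurs
    probability: float  # chance that it occurs, between 0 and 1
    years_ahead: int    # how far in the future it would occur


def realized_value(benefits: float, costs: float) -> float:
    """Observable net value: benefits already obtained minus costs already paid."""
    return benefits - costs


def potential_value(future: List[FutureBenefit], discount_rate: float) -> float:
    """Assumed form: probability-weighted future benefits, discounted to today."""
    return sum(
        b.probability * b.amount / (1 + discount_rate) ** b.years_ahead
        for b in future
    )


def total_value(benefits: float, costs: float,
                future: List[FutureBenefit], discount_rate: float) -> float:
    """A-priori value as the sum of the realized and potential components."""
    return realized_value(benefits, costs) + potential_value(future, discount_rate)
```

In practice, the hard part is not the arithmetic but estimating the inputs: the future benefits, their probabilities and the discount rate are precisely the unobservable quantities.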

As with the fitness program, the complete value of data assets, when viewed through an a-priori lens, has two components: realized (observable) and potential (unobservable). How can we determine the value of a data asset with certainty if the potential value is unobservable? Moreover, in doing so, how do we account for the fact that data assets are extremely heterogeneous, i.e., no two data assets are the same and their value changes dynamically across time, users, geographies, etc.? One possible approach is to view the value of data assets as an interplay between the set of characteristics of the data asset (e.g., quality), the capacity of the organization employing the asset (e.g., technological capability) and the external environment (e.g., laws and regulations).
To give an example, imagine a data asset that is of high quality and from which insights are easily extracted, but the firm employing the data asset does not possess the technical knowledge needed to work with the data. In this scenario, the data asset may be worth nothing to the firm employing it but may be worth a significant amount to a firm that knows how to work with the data. Alternatively, imagine a scenario where the data asset is of high quality and the firm has the knowledge required to extract insights, but a new law is passed which prohibits firms from using the data. The value of the data asset is reduced to nothing for all firms.
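The two scenarios above can be captured in a toy model. This is a minimal sketch, assuming a multiplicative form in which each factor is scored between 0 and 1; the multiplicative form, the scoring scale and the names `ValuationContext` and `contextual_value` are illustrative assumptions, not a framework defined in this article. The multiplicative form is chosen only because it reproduces the behavior described: if any one factor is zero, the value to that firm is zero.

```python
from dataclasses import dataclass


@dataclass
class ValuationContext:
    """Hypothetical 0-to-1 scores for the three sides of the interplay."""
    asset_quality: float     # characteristics of the data asset itself
    org_capability: float    # capacity of the organization employing the asset
    legal_permission: float  # external environment, e.g. laws and regulations


def contextual_value(base_value: float, ctx: ValuationContext) -> float:
    """Scale a base value by each factor; any zero factor yields zero value."""
    for name, score in vars(ctx).items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {score}")
    return base_value * ctx.asset_quality * ctx.org_capability * ctx.legal_permission


# Same high-quality asset, two firms: one without the know-how, one with it.
firm_without_know_how = ValuationContext(asset_quality=1.0, org_capability=0.0, legal_permission=1.0)
firm_with_know_how = ValuationContext(asset_quality=1.0, org_capability=1.0, legal_permission=1.0)

# A new law prohibiting use of the data removes the value for every firm.
after_prohibition = ValuationContext(asset_quality=1.0, org_capability=1.0, legal_permission=0.0)
```

Calling `contextual_value` with the same base value and each of these contexts shows the asset being worthless to the first firm, fully valuable to the second, and worthless to everyone once the prohibition applies.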
The heterogeneity of data can also be understood through the set of characteristics that jointly make up a data asset. The extent to which these characteristics are relevant for a particular data asset, and how they interrelate, is what makes one data asset different from any other, and this subtlety is what makes the valuation of data assets complex. A framework that is able to capture these nuances would enable an a-priori valuation of data assets.
The only questions that remain are: how do we go about creating an exhaustive list of the characteristics that play a part in this interplay, and how do we assess the quantitative worth and relevance of each such characteristic? Perhaps machine learning can provide answers to these. For instance, it may be useful to incorporate a structural break (at the point where the data assets are implemented) and analyze the change in organizational performance before and after the break. Machine learning techniques can then be used to attribute performance to the different characteristics, and a relative worth can be deduced from there. However, this is an area that demands continued and focused research.
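The first step of that idea, comparing performance before and after the break, can be sketched as follows. This is a deliberately simple illustration, assuming a single performance metric observed at regular intervals and a known break point; the function name `break_effect` is hypothetical, and the attribution of the change to individual characteristics is not covered here.

```python
from statistics import mean
from typing import Sequence


def break_effect(performance: Sequence[float], break_index: int) -> float:
    """Mean performance after the break minus mean performance before it.

    `break_index` is the position of the first observation recorded after
    the data asset was implemented (an assumed, known break point).
    """
    if not 0 < break_index < len(performance):
        raise ValueError("break_index must leave observations on both sides")
    pre = performance[:break_index]
    post = performance[break_index:]
    return mean(post) - mean(pre)
```

A raw difference in means does not control for trends, seasonality or other events that coincide with the break, so it should be read as a starting point for analysis rather than an estimate of the data asset's worth.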
Nevertheless, building upon the existing methodologies will provide a robust conceptual tool and a starting point for deriving a quantitative estimate of the value of data assets, in order to answer the question, "How much are my data assets worth?"