Data 'half-life': Perils of outdated and incomplete data

Data points possess an inherent ‘best before’ date, after which any insight or information derived from them may be sub-optimal

Published: 29 May 2018

Manish Sinha is a Managing Director at Dun & Bradstreet.

Image: Shutterstock

The importance of data cannot be overstated: it fuels business growth, creates competitive advantage and even spawns new products and services. Digital data is extracted, refined, valued, bought and sold in ways unlike any previous resource. It changes the rules for markets and demands new approaches from regulators. In the coming years, many battles will be fought over who should possess, and benefit from, data.

Corporate data is growing rapidly, at roughly 40 percent each year; 20 percent of it is duplicate, incorrect or both. Meanwhile, the concept of data ‘shelf-life’ has been superseded by a new reality. Out-of-date data is not merely inefficient. Today, it represents a business liability.

Data half-life
Data points possess an inherent ‘best before’ date, after which any insight or information derived from them may be sub-optimal. Assuming data to be a component of knowledge (complemented by experience, observation, and human instinct), the frequency with which an organisation refreshes its data can have a direct impact on business performance.

Out-of-date data is not merely a question of business inefficiency. It can represent a tangible and growing liability on the balance sheet. Thus, for an asset of such importance, the concepts of ‘shelf-life’ and refreshed data remain relevant, though the specifics will be peculiar to each organisation and the market in which it operates.

This is the essence of the principle of ‘Data Half-Life’: the value (relevance and usefulness) of data declines exponentially as a function of time. The longer data is left, the less valuable it becomes. At a certain threshold, that data transforms from an asset into a liability. Irrelevant, out-of-date data not only leads to sub-optimal decision-making; the cost of maintaining it securely can also exceed the value it generates. In fact, reports suggest redundant data and its associated problems will cost organisations around the world a staggering $3.3 trillion by 2020.
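The half-life analogy above can be sketched as a simple decay calculation. This is an illustrative model only: the function names, the 90-day half-life and the upkeep cost are assumptions, not figures from the article.

```python
# Hypothetical sketch of the 'Data Half-Life' principle: value decays
# exponentially with age, and past a threshold a record costs more to
# keep than it is worth. All numbers are illustrative assumptions.

def data_value(initial_value, age_days, half_life_days):
    """Value remaining after age_days, given an assumed half-life."""
    return initial_value * 0.5 ** (age_days / half_life_days)

def is_liability(initial_value, age_days, half_life_days, upkeep_cost):
    """A record becomes a liability once upkeep exceeds remaining value."""
    return data_value(initial_value, age_days, half_life_days) < upkeep_cost

# A record worth 100 units with an assumed 90-day half-life:
print(data_value(100, 90, 90))        # 50.0 -- half the value after one half-life
print(is_liability(100, 360, 90, 10)) # True -- after four half-lives only 6.25 remains
```

The threshold the article describes is simply the age at which the decayed value drops below the cost of secure retention; where that crossover sits depends entirely on the half-life assumed for a given market.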

Data storage versus breaches
First, data storage represents an overhead; the fall in storage costs is being outpaced by the sheer volume of data generated. According to the Center for Democracy & Technology (CDT), organisations are currently spending approximately $5 million per petabyte to retain ‘old’ information.
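Combining the $5 million-per-petabyte figure with the earlier estimate that roughly 20 percent of corporate data is duplicate or incorrect gives a rough sense of the waste. The arithmetic below is purely illustrative; the function and the 10-petabyte example are assumptions.

```python
# Illustrative arithmetic only, using the figures quoted in the article:
# ~$5M per petabyte of retained 'old' data, ~20% of it duplicate/incorrect.

COST_PER_PETABYTE = 5_000_000  # USD, per the CDT estimate cited above
REDUNDANT_SHARE = 0.20         # share of data that is duplicate or incorrect

def wasted_spend(petabytes_retained):
    """Spend attributable to redundant data, under the assumptions above."""
    return petabytes_retained * COST_PER_PETABYTE * REDUNDANT_SHARE

print(wasted_spend(10))  # 10000000.0 -- $10M spent retaining redundant data
```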

Secondly, data breaches and data loss represent a liability, regardless of whether the data’s ‘shelf-life’ is over or not. The global cyber liability insurance market is expected to generate $14 billion in gross premiums by 2022, up from just $3 billion in 2015, according to estimates. And that is just the expense of provisioning for a data loss; the supplementary costs of recovering from and mitigating breaches, legal fees and reputational damage come on top. A case in point is the recent Facebook and Cambridge Analytica data controversy. In short, with respect to data privacy and breaches, expired data represents exactly the same liability on the balance sheet as current data.

Garbage in, garbage out
One of the major issues facing businesses operating across multiple and legacy data sources is simply the mechanics of cleaning and refreshing data in a timely manner. The risk associated with business strategies formulated on outdated data can be ruinous.

This has propelled the adoption of Master Data Management (MDM) solutions, which ensure the quality and accessibility of business information from across an organisation. MDM operates at the intersection of finance (risk, credit management), marketing (identifying and addressing opportunities), and information technology. It thus serves as a catalyst, providing the most up-to-date, insightful and holistic view of data collected from the organisation’s various functions.

By limiting the idea of data quality to ‘shelf-life’ (the ‘freshness’ and relevance of the data being used to inform business decisions), businesses risk neglecting the full consequences of poor data occupying their servers and informing their decisions. The key insight behind the Data Half-Life principle is that poor data won’t just compromise your decision-making; it could represent a liability for your entire business.

Data is no longer a passive entity; it behaves more like a living, breathing organism, and business data is no exception. Organisations today face multiple customer data challenges around volume, accuracy, relevance and redundancy, and most do not even realise the impact these can have on the bottom line. Perhaps, if CEOs themselves were aware of the full implications of the shift beyond ‘shelf-life’ to ‘half-life’, this issue would receive the priority it merits.

