A water-level meter, inside a water tank atop a large apartment building, monitors the water level in the tank. If the level falls below a threshold, a pump is started and the water is replenished, or the administration is alerted and orders water tankers to be brought in. The meter measures the water level every few minutes and sends this information via the internet to software that starts a motor or sends out a text message. This is an example of an internet-of-things (IoT) device that collects data at a steady rate and processes it for action. IoT devices are spreading to many kinds of uses across industries, homes, hospitals, educational institutions and governments. One of their biggest by-products is data. Even though the water meter simply measures the water level in the tank, it also records how much water is consumed and when, and when the water is replenished. Directly and indirectly it collects and processes data about the residents of the building, their habits and patterns of behaviour. The water meter was not designed for this purpose and yet that is how its use is playing out.
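The meter's logic, and the data it produces as a by-product, can be sketched roughly as follows. This is only an illustration under stated assumptions: the threshold value, the `Reading` type and the function names are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical threshold; a real installation would set its own value.
LOW_LEVEL_THRESHOLD_CM = 50.0


@dataclass
class Reading:
    taken_at: datetime
    level_cm: float


def decide_action(reading, pump_available, threshold_cm=LOW_LEVEL_THRESHOLD_CM):
    """Return the action the software takes for a single reading."""
    if reading.level_cm >= threshold_cm:
        return "none"
    # Below the threshold: start the pump, or alert the administration
    # so that water tankers can be ordered.
    return "start_pump" if pump_available else "alert_administration"


def level_changes(readings):
    """The by-product: change in level between consecutive readings.

    A negative change means water was consumed in that interval;
    a positive change means the tank was replenished.
    """
    return [
        (later.taken_at, later.level_cm - earlier.level_cm)
        for earlier, later in zip(readings, readings[1:])
    ]


readings = [
    Reading(datetime(2024, 1, 1, 8, 0), 120.0),
    Reading(datetime(2024, 1, 1, 8, 5), 110.0),
    Reading(datetime(2024, 1, 1, 8, 10), 45.0),
    Reading(datetime(2024, 1, 1, 8, 15), 130.0),
]
```

Nothing in `decide_action` needs the history of readings, yet `level_changes` shows that the same stream already reveals when and how much water the residents use.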
The challenge for governments and regulators is to understand the ways in which data is created and the many ways in which it may be used. With the widespread use of IoT, social media, mobile phones and other devices, data is being produced at a massive rate. A widely quoted estimate is that 2.5 billion billion bytes of data (2.5 quintillion bytes, or about 2.5 exabytes) are produced in the world every day. Consider a few issues that arise with regard to the use and storage of such a large volume of data.
- The firm that installs and runs the water meter collects the data. Does it have the right to sell this data, perhaps to firms that have complementary products? Though the answer seems to be a simple matter of having appropriate contracts in place, the issue becomes complex when analysis is involved. If not the raw data, can the firm sell the analysis of the data?
- Can the firm analyse the data for purposes other than water management? Can it estimate, for example, living and consumption patterns from the data? Will this, in any way, violate the residents' right to privacy, which the Supreme Court of India has now recognised as a fundamental right?
- Once the data has been collected and used, and has served the purpose for which it was collected, for how long should it be stored? Should the firm store the data for as long as it is cost effective to do so? Or should governments mandate that all data be automatically purged from servers after a certain period of time?
- If the data is stolen from the firm, possibly by rival firms or by hackers, then what recourse do the residents of the building have to protect themselves? Is the firm obligated to report to its clients that their data has been stolen?
- Should the firm inform its clients about the manner in which the data is used, how it is analysed, whether it will be sold to others, and how well protected it is? In other words, should the firm inform its customers about all the ways in which data is collected, analysed and stored? If yes, then how should it do so? Will everyone be in a position to understand the myriad consequences of data collection, use and abuse?
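To make the retention question raised above concrete, a mandated purge could look something like the sketch below. The 90-day period, the record layout and the function name are assumptions chosen purely for illustration; no regulation mentioned here specifies them.

```python
from datetime import datetime, timedelta

# Hypothetical retention period; the actual duration would be set by regulation.
RETENTION_PERIOD = timedelta(days=90)


def purge_expired(records, now, retention=RETENTION_PERIOD):
    """Keep only the records that are still within the retention period."""
    cutoff = now - retention
    return [record for record in records if record["taken_at"] >= cutoff]


records = [
    {"taken_at": datetime(2024, 1, 1), "level_cm": 120.0},
    {"taken_at": datetime(2024, 5, 1), "level_cm": 95.0},
]
kept = purge_expired(records, now=datetime(2024, 6, 1))
```

Here only the May reading survives, because the January reading is older than the assumed 90 days.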
Many governments around the world are grappling with regulations that address some of the issues highlighted above, as are firms that are in the business of collecting and analysing data. As data is now considered to be the new 'oil' of the digital world, a resource with boundless potential for revenue generation, regulations that address these issues become all the more relevant. A few principles that help guide such regulations are outlined below. As these regulations are still being formulated by the government of India, they remain an active area of discussion within academia, civil society and commercial firms.
First, any policy on data should have the person or citizen as its core entity. Data about the individual is the most important and should be safeguarded and used for her/his benefit.
Second, any regulation should address the entire life cycle of data, from its inception and capture, through its use and storage, to its destruction. The focus has to be on more than just capture and use.
Third, citizens and users should be informed in simple language, not in technical or legal jargon that is incomprehensible to most. Where possible, citizens should be made aware of when data is being collected and how it is being used. The information should also specify the potential liabilities in the event of loss or theft of data.
Fourth, data can and should help with innovations in products and services. It should help governments and private firms to provide benefits to citizens at large and not to a select few.
As we move towards a data-rich future, these principles should help in understanding how to regulate and manage the considerable challenges data poses. For a water-level meter is not just a simple device for starting a pump; it is the source of a resource that poses many challenges for its management and use.
Rahul De’, Hewlett-Packard Chair Professor, IIM Bangalore