The digital shift is fundamentally altering the way we do business, in effect taking power away from big business and handing more of it to the consumer, who is now better able to call the shots on what they want and how they want it served.
By Gary Allemann, MD of Master Data Management
Businesses are evolving to meet these demands, using data and data-driven insights to understand their market and adapt accordingly. Those that don’t, holding firm to traditional business practices, are finding themselves drifting in the wake of their competitors, whether new fully digital businesses or the big names that are making the effort to transform themselves.
However, data analytics and the insight it provides are only as good as the quality of the data used. This is why data curation is so critical to the management of the data lifecycle.
What is data curation?
Data curation is not a new concept, and is defined by TechTarget as the management of data throughout its lifecycle, from creation and initial storage to the time when it is archived for posterity or becomes obsolete and is deleted. Tied tightly to data governance, data curation differs in that governance defines the set of rules with which to manage data, while curation – a stewardship function – is the physical act of managing data by those rules.
Data curation makes data useful, enabling success not only in finding and analysing data, but in doing so with the right purpose and goal in mind. Curators gather data from multiple sources, both within and external to a business, integrating it into repositories that, collectively, offer more value than each piece of data on its own. The process includes discovery, authentication, archiving, preservation, retrieval, and representation.
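As a rough illustration of what integrating data from multiple sources can look like in practice, the sketch below merges customer records from two sources into a single repository and keeps a note of where each record came from. The source names, field names and the "most recent record wins" rule are assumptions made for this example, not a prescribed method.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CuratedRecord:
    customer_id: str
    email: str
    source: str          # provenance: which system supplied the record
    collected_on: date   # when the record was collected


def curate(*sources):
    """Merge (source_name, rows) pairs, keeping the newest record per customer."""
    merged = {}
    for source_name, rows in sources:
        for row in rows:
            record = CuratedRecord(
                customer_id=row["customer_id"],
                email=row["email"].strip().lower(),  # normalise before storing
                source=source_name,
                collected_on=row["collected_on"],
            )
            current = merged.get(record.customer_id)
            # Hypothetical rule for this sketch: the most recent record wins.
            if current is None or record.collected_on > current.collected_on:
                merged[record.customer_id] = record
    return merged


# Made-up sample data from two hypothetical sources.
crm = ("crm", [
    {"customer_id": "C1", "email": " Ann@Example.com ", "collected_on": date(2023, 1, 5)},
])
web = ("web_signup", [
    {"customer_id": "C1", "email": "ann@example.com", "collected_on": date(2023, 6, 1)},
    {"customer_id": "C2", "email": "bob@example.com", "collected_on": date(2023, 3, 9)},
])

repository = curate(crm, web)
```

Because every curated record carries its source and collection date, anyone using the repository can see where a value came from and how fresh it is.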
Why do we need it?
Integral to managing the data lifecycle in IT, data curation enables organisations to find, understand and, most importantly, trust their data. Without that trust, organisations cannot use analytics to answer accurately the questions that need to be answered. Essentially, data analytics will be unsuccessful without data curation.
Beyond data analytics, other digital functions also benefit from proper data curation. Machine learning becomes more effective when proper curation directs algorithmic behaviour to establish suitable learning patterns. Raw data and data lakes can be better managed and accessed.
Perhaps most importantly, data quality can be assured. Businesses that can trust their data are better able to trust the results that their data delivers. For example, if customer data is analysed with the goal of developing a new product tailored to market demands, but the data used is collected from the wrong source or the incorrect analysis parameters are used, the results will not paint an accurate picture.
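A simple quality gate shows how such problems might be caught before analysis begins. This is a minimal sketch only: the list of trusted sources, the field names and the validation rules are all hypothetical, and a real business would define them through its own governance rules.

```python
def quality_issues(record, trusted_sources):
    """Return a list of reasons why a record should not be analysed."""
    issues = []
    if record.get("source") not in trusted_sources:
        issues.append("untrusted source")
    email = record.get("email")
    if not email or "@" not in email:
        issues.append("invalid email")
    age = record.get("age")
    if age is None or not 0 < age < 120:
        issues.append("implausible age")
    return issues


# Hypothetical governance rule: only these sources may feed the analysis.
trusted = {"crm", "web_signup"}

# Made-up sample records.
records = [
    {"source": "crm", "email": "ann@example.com", "age": 34},
    {"source": "scraped_list", "email": "bob[at]example.com", "age": 34},
]

# Only records with no quality issues go forward to analysis.
usable = [r for r in records if not quality_issues(r, trusted)]
```

Records that fail the checks are held back rather than silently included, so the analysis rests only on data the business has reason to trust.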
Efficiencies created by curating data enable businesses to innovate faster than their competitors: when the right data is found and used more quickly to answer the right questions, businesses can leverage a first-to-market advantage.
Businesses cannot properly govern, manage or use their data without data curation. It’s impossible to know what data a business has, where it came from, its validity, or how it can be used to answer business-critical requirements unless curation is carried out at every point of the data lifecycle.
Data curation is demanded by law
Legislation also demands data curation. Acts such as the Protection of Personal Information (PoPI) Act in South Africa and the General Data Protection Regulation (GDPR) in the European Union are basically data curatorship bills that prescribe how certain types of data should be used, managed and stored. While PoPI and GDPR fall under the umbrella of ‘governance’, they also define guidelines for the physical parameters of data curation, as required by their regulations.
What are the risks?
In a digital economy where every industry is being redefined by a “game changer”, and organisations are innovating faster than we’ve ever seen, businesses simply cannot afford static or poor-quality data. Data is arguably a business’s most valuable asset when it comes to ensuring ongoing sustainability and market leadership. Improper – or a complete lack of – data curation means that this indispensable asset will remain buried, burying the business along with it and at a far quicker rate than seems possible.