South African enterprises are actively looking to next-generation application and analytics platforms, but they need to be aware that the platforms alone cannot guarantee quality data and effective data governance, warns Johann van der Walt, director of operations at Knowledge Integration Dynamics (KID).
SAP HANA, as a platform for next-generation applications and analytics, is without doubt a major buzzword among South African enterprises today. They know that if they want to stay relevant and competitive, it is an enabler that will allow them to process more data, faster.
However, HANA cannot address the problem of poor-quality or obsolete data – all it can do is allow enterprises to get to their poor-quality data faster. Unless enterprises address their data quality and data management practices before they migrate, they will dilute the value they get from HANA.
With modern enterprises grappling with massive databases, often containing a significant amount of poor-quality and obsolete data, simply migrating all of this data to a next-generation platform would be a mistake. For effective governance and compliance, enterprises need to be very conscious of where the data going into HANA comes from and how it gets there.
Once the data has been migrated, they also need to ensure that they are able to maintain their data quality over time. In our consulting engagements with local enterprises, we have discovered that most CIOs are well aware of their data quality flaws, and most are anxious to address data quality and governance. But they are often challenged in actually doing so.
A typical challenge they face is the ability to come up with a unique dataset and to address inconsistencies. Most CIOs know they are sitting with more data than they need, with as much as 50% to 90% of it actually obsolete, but many battle to identify this obsolete data. Simply identifying inconsistencies in datasets – like a Johannesburg dialling code for a Cape Town-based customer – could require months of manual work by a large team of employees, and securing the budget to do so can prove tricky.
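As a hedged illustration of how a check like the dialling-code example might be automated, the sketch below applies a simple rule over a customer table. The column names and the city-to-code mapping are assumptions for illustration, not a reference to any specific KID or SAP tooling.

import pandas as pd

# Hypothetical mapping of cities to their expected dialling codes (assumption).
CITY_DIALLING_CODES = {
    "Johannesburg": "011",
    "Cape Town": "021",
    "Durban": "031",
}

def flag_dialling_code_mismatches(customers: pd.DataFrame) -> pd.DataFrame:
    """Return the rows whose phone number does not carry the dialling code
    expected for the customer's city."""
    digits = customers["phone"].str.replace(r"\D", "", regex=True)  # keep digits only
    expected = customers["city"].map(CITY_DIALLING_CODES)
    mismatched = expected.notna() & (digits.str[:3] != expected)
    return customers[mismatched]

# Toy example: the first customer is Cape Town-based but has a Johannesburg code.
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "city": ["Cape Town", "Johannesburg"],
    "phone": ["011 555 0100", "011 555 0200"],
})
print(flag_dialling_code_mismatches(customers))

Rules of this kind turn months of manual inspection into a repeatable check that can be run across the full customer base.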
In our experience, an effective way to secure budget and launch a data governance project is to piggyback on a larger enterprise project – such as a SAP HANA migration. A move such as this gives the enterprise an ideal opportunity to run a data cleansing process as the first step towards a comprehensive data governance project. An effective place to begin is to identify what data is active and what is obsolete, and then focus data quality improvement efforts on only the active data before migrating it to HANA. In this way, you are moving only the data that is needed, and it is accurate and cleansed.
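A minimal sketch of how the active/obsolete split might look in practice, assuming a hypothetical last_activity column and an assumed three-year activity cut-off (the actual rule will differ from one enterprise to the next):

from datetime import datetime, timedelta
import pandas as pd

# Assumed rule for illustration: anything untouched for three years is treated as obsolete.
ACTIVITY_CUTOFF = datetime.now() - timedelta(days=3 * 365)

def split_active_and_obsolete(records: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Split records into (active, obsolete) on the last_activity timestamp."""
    is_active = pd.to_datetime(records["last_activity"]) >= ACTIVITY_CUTOFF
    return records[is_active], records[~is_active]

# active, obsolete = split_active_and_obsolete(customer_records)
# Only the 'active' slice goes through cleansing and on to the HANA migration;
# the obsolete slice can be archived or reviewed separately.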
This is just the start of the governance journey. In the world of data quality, it is easier to get clean than to stay clean. Typically, when data quality is improved through a project such as a migration, decay sets in over time. This is where data governance comes in: after the cleansing and migration to HANA, enterprises need to put the toolsets and policies in place to keep data quality consistently high. In most companies, this entails a programme of passive data governance, in which the quality of data is monitored and addressed in a reactive way.
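Passive governance, in this sense, amounts to scheduled monitoring. The sketch below illustrates the idea with assumed rule names and columns; it is an illustration of the approach rather than of any particular governance product.

import logging
import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("data_quality_monitor")

# Assumed, illustrative quality rules; each returns a boolean mask of offending rows.
QUALITY_RULES = {
    "missing_phone": lambda df: df["phone"].isna(),
    "duplicate_customer": lambda df: df.duplicated(subset=["customer_id"], keep=False),
}

def run_quality_checks(customers: pd.DataFrame) -> None:
    """Run each rule over the current data and log violations for later remediation."""
    for name, rule in QUALITY_RULES.items():
        violations = int(rule(customers).sum())
        if violations:
            log.warning("%s: %d rows need attention", name, violations)

# Scheduled (for example, nightly) runs of run_quality_checks() surface decay
# after the fact, which is what makes this approach reactive rather than preventive.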
However, some may also move to active data governance – an arguably more invasive approach in which the data capturing process is more closely controlled to ensure that the data meets governance rules. Active data governance might also be supported by being highly selective about who is allowed to input data, and by moving to a more centralised organisation for data management – instead of a distributed, non-governed environment.
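By contrast, active governance pushes the rules to the point of capture. A minimal sketch, again with assumed field names and validation rules rather than a specific product's API:

import re

# Assumed rule for illustration: a South African geographic number is ten digits
# starting with 0 and a two-digit area code.
DIALLING_CODE_PATTERN = re.compile(r"^0\d{2}\d{7}$")

def validate_new_customer(record: dict) -> list[str]:
    """Return a list of governance-rule violations; an empty list means the record may be saved."""
    errors = []
    if not record.get("city"):
        errors.append("city is required")
    phone = re.sub(r"\D", "", record.get("phone", ""))
    if not DIALLING_CODE_PATTERN.match(phone):
        errors.append("phone must be a valid 10-digit number with a dialling code")
    return errors

# The capture screen or API only persists a record when validate_new_customer()
# returns no errors, so inconsistencies are prevented rather than cleaned up later.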