Data storage has quickly moved from gigabytes to terabytes, petabytes, exabytes and, in some industries, even zettabytes. The need to collect, cleanse, correlate and analyse large stores of data is greater than ever, says Johan Jurd, MD of InfoBuild, which represents Information Builders in SA and supplies iWay and WebFOCUS, its integration, data cleansing and BI tools.
For data collection, users need to assess whether the information will provide a return on the investment made in capturing and storing it.

Most companies don’t have a problem with collecting information. In fact, they already have much more than they actually use. Even after more than a decade of maturation of the business intelligence (BI), analytics and enterprise search markets, most users still spend hours a week looking for the information they need.

To take advantage of vast amounts of information, users need access to it in a form they can understand.

In terms of storage, users will need more than the data itself. Metadata should be included, such as the creator or creating system, the time of creation, the channel through which it was delivered and the sentiment it contains.
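As a rough illustration of storing metadata alongside the data, the sketch below wraps a captured item in a record that carries the fields mentioned above. The class name, field names and sample values are hypothetical and are not taken from iWay or any other product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class StoredRecord:
    """A captured item plus descriptive metadata (hypothetical layout)."""
    payload: str                     # the data itself
    creator: str                     # person or system that created it
    channel: str                     # channel through which it was delivered
    sentiment: Optional[str] = None  # sentiment label, if one was derived
    # Time of creation, recorded in UTC when the record is built.
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


record = StoredRecord(
    payload="Great service at the branch today!",
    creator="jdoe1968",
    channel="social_media",
    sentiment="positive",
)
```

Keeping the metadata in the same record as the payload means later searches and analyses can filter on creator, channel or time without going back to the source system.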

Products such as iWay help collect all types of information, whether needed in real time or for historical purposes. This could include unstructured data, such as blog posts and social media streams; cloud-based data from Web services or API queries; structured data from ERP, CRM and legacy systems; or sensor data, such as RFID or UPC scans and utility gauge readings.

Cleansing and correlation

It is inefficient to clean data once it has already been collected into giant data stores. The best approach is to clean data where it lives, as transactions flow.

The data integration software should therefore include a data quality solution that acts as a firewall to ensure the quality of data before it spreads into other parts of the enterprise.
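The firewall idea can be sketched as a check that every transaction passes through before it is handed on. This is a minimal example under assumed rules (a non-empty name and a plausibly formed e-mail address); the function name, field names and rules are hypothetical and do not describe how any particular data quality product works.

```python
import re
from typing import Optional

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_transaction(raw: dict) -> Optional[dict]:
    """Return a normalised copy of `raw`, or None if it fails the checks."""
    # Collapse repeated whitespace and apply consistent capitalisation.
    name = " ".join(str(raw.get("name", "")).split()).title()
    email = str(raw.get("email", "")).strip().lower()
    if not name or not EMAIL_PATTERN.match(email):
        return None  # rejected at the gate, never reaches downstream systems
    return {"name": name, "email": email}


incoming = [
    {"name": "  jonathan   doe ", "email": "JDoe@Example.com "},
    {"name": "", "email": "not-an-email"},
]

# Only records that pass the gate are forwarded.
accepted = [
    cleaned
    for cleaned in (clean_transaction(t) for t in incoming)
    if cleaned is not None
]
```

In practice a rejected record would more likely be routed to a review queue than silently dropped; the sketch only shows where the check sits in the flow.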

Information also needs to be correlated from multiple systems. For instance, you may improve your one-to-one marketing dramatically if you can tell that “jdoe1968” on your Web site is “Jonathan Doe” who used a credit card on the phone last month, as well as the person who identified himself as “Jon Doe” who has just entered your store.

The data integration software therefore should also provide master data management (MDM) technology that can correlate disparate information from different system types – and can be overseen by data stewards, who are able to manage data from their mobile devices.
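To make the “jdoe1968” example concrete, the sketch below groups records from three systems under one customer. It assumes, purely for illustration, that every system also captures an e-mail address that can serve as a match key; the record layout and function names are hypothetical, and real MDM tools typically rely on richer matching rules than a single exact key.

```python
from collections import defaultdict

# Hypothetical records as three different systems might hold them.
records = [
    {"system": "web", "name": "jdoe1968", "email": "JDoe@example.com"},
    {"system": "phone", "name": "Jonathan Doe", "email": "jdoe@example.com"},
    {"system": "store", "name": "Jon Doe", "email": " jdoe@example.com"},
    {"system": "web", "name": "asmith", "email": "asmith@example.com"},
]


def match_key(record: dict) -> str:
    """Normalise the assumed shared attribute so variants compare equal."""
    return record["email"].strip().lower()


def correlate(items: list) -> dict:
    """Group records that share a match key into one master entry."""
    groups = defaultdict(list)
    for item in items:
        groups[match_key(item)].append(item)
    return dict(groups)


master = correlate(records)
# All names by which the same customer is known across systems.
aliases = [r["name"] for r in master["jdoe@example.com"]]
```

Here `aliases` collects the three names from the example, which is the kind of consolidated view a data steward would review and approve.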

The result is better operational processes, better business intelligence, and – as the data moves into the realm of big data – better correlated and managed big data analytics.