A Data Affair

Today is Valentine’s Day, a day on which countless cards and gifts are sent each year. It’s the busiest day in the calendar for florists and restaurateurs, with millions of couples worldwide enjoying romantic candlelit dinners.

Valentine’s Day is also a very popular day for tying the knot or for proposing marriage. In 2013, six million people worldwide planned to ‘pop the question’ on Valentine’s Day. For others, though, today is a time for improving relationships and showing one’s partner or spouse that, although the relationship may be rocky, the ‘spark is still there’.

In the data world, organisations too need to invest in improving their relationship with data. In a blog post entitled ‘Don’t treat data like an asset’, Data Source describes data as the lifeblood of any organisation. This is a great analogy, as data keeps organisations alive.

An organisation's dealings with data should be governed by a set of processes, which ensure that this essential resource is formally managed throughout the enterprise. These processes can collectively be referred to as data governance. Data governance ensures trust in the data as well as accountability for any adverse event that stems from its quality or misuse. In building a data governance framework, one needs to look at three key attributes of data, namely security, privacy and quality:

Security
In 2015, the Identity Theft Resource Center (ITRC) reported 781 known data breaches in the US. Personal information is being targeted with ever greater frequency: 165 million exposed records contained Social Security numbers, compared with 800,000 that exposed credit and debit card information.

Data security should be an organisation’s primary consideration in setting up its processes. Authentication and encryption are fundamental to securing access to data, with role-based access control, audit records and anonymisation all playing a part.
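
To make these ideas concrete, here is a minimal Python sketch of a role-based access check that writes an audit record for every attempt, plus a keyed hash for masking identifiers. The role names, the permission table, the in-memory audit log and the function names are all assumptions made for illustration; they are not taken from any particular product. Note that a keyed hash is pseudonymisation rather than full anonymisation, since anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical role-to-permission table; a real system would load this from policy.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "admin": {"read", "write"},
}

audit_log = []  # in-memory stand-in for a durable audit store


def is_allowed(role, action):
    """Role-based check: does this role carry the requested permission?"""
    return action in ROLE_PERMISSIONS.get(role, set())


def access(user, role, action, resource):
    """Check access and record the attempt, whether or not it succeeds."""
    allowed = is_allowed(role, action)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    })
    return allowed


def pseudonymise(value, secret_key):
    """Replace an identifier with a keyed hash so records can still be joined."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
```

Denied attempts are logged as well as successful ones, because the failed attempts are often the ones worth reviewing.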

Privacy
The volume of data collected is exploding and is expected to increase from 4.4 zettabytes in 2013 to 44 zettabytes in 2020 and 180 zettabytes in 2025. 

A large portion of this data will comprise financial, behavioural and other personal identifiers of customers. While this data is essential to delivering an overall improved service or product, the information it contains is highly sensitive. The collection and use of this data must be subject to privacy laws and high ethical standards.

Quality
The concept “garbage in, garbage out” (GIGO) is well entrenched in science, maths and computer science, and it is certainly a truism when it comes to data. We made the point in a previous blog post, ‘My Big Dirty Data’, that all data is inevitably dirty. However, there are some important aspects of data one needs to consider to ensure that the data is of acceptable quality.

Data sources: To ensure the integrity and validity of data sources:

  • manually entered data must be validated on input, with strict checks in place to ensure it meets business rules.
  • the source itself needs to be validated when the data is received from other systems, especially if it is collected as part of an automated process.
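
As a rough illustration of validating manually entered data, the Python sketch below checks a single customer record against three made-up business rules. The field names (`name`, `email`, `date_of_birth`) and the rules themselves are assumptions for the example; a real organisation would substitute its own.

```python
import re
from datetime import date


def validate_customer(record):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    # Rule 1 (assumed): a non-blank name is required.
    if not str(record.get("name") or "").strip():
        errors.append("name is required")
    # Rule 2 (assumed): the email must look like local@domain.tld.
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", str(record.get("email") or "")):
        errors.append("email is not well formed")
    # Rule 3 (assumed): the date of birth must be a real date, not in the future.
    dob = record.get("date_of_birth")
    if not isinstance(dob, date) or dob > date.today():
        errors.append("date_of_birth must be a date that is not in the future")
    return errors
```

Returning every violation at once, rather than stopping at the first, lets the person entering the data fix the whole record in one pass.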

Data cleaning: Post-processing and cleaning up data according to business rules reduces errors and improves data quality.
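
A clean-up step might look something like the following Python sketch, which trims whitespace, normalises case and drops duplicate records. The choice of email as the de-duplication key and the field names are assumptions for the example, not a recommendation for every dataset.

```python
def clean_records(records):
    """Normalise fields and drop duplicates, keyed on the cleaned email."""
    seen = set()
    cleaned = []
    for rec in records:
        # Trim and lower-case the email so trivially different spellings match.
        email = (rec.get("email") or "").strip().lower()
        # Collapse runs of whitespace inside the name.
        name = " ".join((rec.get("name") or "").split())
        if not email or email in seen:
            continue  # skip records with no key or a key already seen
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned
```

The first occurrence of each email is kept, so the order in which records arrive decides which duplicate survives.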

So if you intend to build a strong, lasting and fruitful relationship with your data, you need to put in some hard work and investment. Just like Valentine’s Day.

About Author

Before joining Ixio Analytics, Stefan provided technology support to IBM clients and also designed and built software solutions. Now, Stefan leverages his extensive experience in infrastructure and software engineering to develop, build and deliver innovative analytical solutions that meet current and future demands. He understands the value of data and the valuable insights that carefully structured analysis can deliver to organisations. He's passionate about deep learning and its impact on the future.
