Lessons on Integration
  • 14th November, 2017

By Phil Massie

A line often attributed to Charles Darwin says: “In the long history of humankind (and animal kind, too) those who learned to collaborate and improvise most effectively have prevailed.” In the world of data science, this idea of success through collaboration rings equally true.


Through collaboration between different parties, it is possible to enrich your data and make better-informed choices. One great example is the WIFIRE system at the University of California San Diego. This project integrates heterogeneous satellite data with real-time weather sensor data into a tool for visualising, simulating and predicting wildfire spread.


Clever data integration can also be an important part of your business's data model. Increasingly, we see businesses integrating social media data streams and using them to enrich their decision-making processes. Integrating these streaming commentaries can give valuable insight into what people are talking about, as well as how they feel about it.


Integrating disparate data streams is not without inherent difficulties. Consider integrating real-time streaming weather sensor data with the high-pressure firehose that is Twitter. Both are continuous streams, but they come from fundamentally different sources and differ widely in data types, data rates and data intervals. Robust solutions for integrating these types of streams may require bringing on specific skillsets or even investing in appropriate infrastructure.
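To make the mismatch in rates and intervals concrete, here is a minimal sketch in Python of one possible approach: attach each irregularly timed message to the most recent fixed-interval sensor reading and count the messages per reading. The function name, the data layout and the sample values are all hypothetical, chosen purely for illustration. A production pipeline would more likely rely on a dedicated streaming framework.

```python
from bisect import bisect_right
from collections import Counter


def align_events_to_readings(readings, events, interval_s):
    """Count irregular events against fixed-interval sensor readings.

    readings: list of (timestamp_s, value), sorted by timestamp.
    events:   list of (timestamp_s, text), in any order.
    An event is matched to the latest reading at or before it, provided it
    falls within interval_s of that reading; otherwise it is dropped.
    Returns a list of (timestamp_s, value, event_count).
    """
    times = [t for t, _ in readings]
    counts = Counter()
    for ts, _text in events:
        i = bisect_right(times, ts) - 1  # index of latest reading <= ts
        if i < 0 or ts - times[i] >= interval_s:
            continue  # event precedes all readings or is too stale
        counts[i] += 1
    return [(t, value, counts[i]) for i, (t, value) in enumerate(readings)]


# Hypothetical data: a wind-speed reading every 60 s, posts at arbitrary times.
readings = [(0, 12.5), (60, 14.0), (120, 13.2)]
events = [
    (5, "smoke near ridge"),
    (61, "wind picking up"),
    (62, "fire visible"),
    (400, "late post"),
]
merged = align_events_to_readings(readings, events, interval_s=60)
```

Each row of `merged` pairs one sensor reading with the number of messages that arrived during its interval, giving the two streams a shared time axis. Dropping unmatched events is only one design choice; buffering them until the next reading arrives would be another.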


Integrating existing static data within institutions can also be challenging and frustrating. I have discussed consolidation of business-critical siloed data into data lakes in a previous blog post. Large institutions may collect vast amounts of data of different types (for instance, data associated with different departments), in different formats and repositories (linked, perhaps, to the long-gone contractors who first implemented the systems). A company may even acquire another business with a fundamentally different data strategy. These situations can result in a great deal of time spent navigating access permissions, missing or unintelligible codebooks and fragmented IP.


But it is worth the effort. The value of integrated data can far outweigh that of its unintegrated parts. Just think of how traffic data from sources like your smartphone's GPS has dramatically increased the value of navigation apps. Additionally, well-integrated data with unified access protocols can substantially increase the potential for further collaboration by making data accessible to players with different goals.
Integrated data sets can save large amounts of time otherwise spent trawling through long-forgotten archives and trying to reconstruct partial or ad hoc data structures. The simple reality is that an integrated view of your company underpins sensible, holistic decision-making and drives your bottom line.


Whether you’re interested in enriching your data with third-party data or just integrating your own disparate data sets, get in touch with Ixio and let’s collaborate.
 
