Big Data, Big Questions: IDM in Pictures 4/12

Data, Facts and Evidence

What makes a ‘fact’ a fact?  What makes evidence, evidence?

The fictional detective Joe Friday is famous for something that he never actually said – “Just the facts, Ma’am”. Many a CEO used to request the same. But is it possible to lift ‘facts’ out of context and for them still to be meaningful? Take Julius Caesar’s crossing of the Rubicon in 49 BC. Many others crossed the Rubicon both before and after, but their crossings are not recorded as ‘facts’. So why Caesar? Caesar’s crossing with his army became a historical ‘fact’ worthy of remembering only because it is considered to have led to the Roman Civil War, which ultimately resulted in Caesar becoming dictator for life and in the rise of the imperial era of Rome. In other words, it is the ‘context’ that makes this particular crossing of the Rubicon a ‘fact’.

Data is meaningless, until…

Data, by itself, is without meaning. It is mere blips – on paper or on screen. It only yields meaning when interrogated. Ask the wrong questions and we get the wrong answers. In his presentation to the IPWEA Congress in July, Jeff Roorda observed that when Copernicus proposed that the earth moved around the sun, the data that had been collected about the movement of the stars and the sun did not change. It was simply interrogated through a different question.

Not all data is created equal

While natural data may remain constant (although perhaps not well understood), the same is not true of the data that we use today in our management systems. That data has been selected and shaped by the questions that we have asked of it in the past. For example, if the focus of our questions has been on the current value of our asset portfolios, we may have calculated, and recorded as data, the depreciated value but failed to record the age distribution, without which we have no way of knowing what that value means in terms of renewal timing. If we recorded our assets in historical cost terms, we may be able to verify the figures by reference to the invoices of the time, but they yield no management information. If we recorded our maintenance expenditures by site but not by what was done on that site, we may know where the money has gone, but we won’t know what was done and so what we may have to do later.
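
The point about depreciated value and age distribution can be illustrated with a small sketch in Python. Everything in it is an assumption made for illustration: the asset costs, the 50-year useful life, the use of straight-line depreciation, and the names Asset, portfolio_value and renewal_profile. It simply shows two hypothetical portfolios that report the same depreciated value while facing very different renewal timing.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    cost: float        # replacement cost in dollars (hypothetical figure)
    age: int           # years in service
    useful_life: int   # assumed total service life in years

    def years_to_renewal(self) -> int:
        # Remaining life, never negative
        return max(self.useful_life - self.age, 0)

    def depreciated_value(self) -> float:
        # Straight-line depreciation (an assumption, not the only method)
        return self.cost * self.years_to_renewal() / self.useful_life


def portfolio_value(assets):
    """Total depreciated value: the single number often recorded as data."""
    return sum(a.depreciated_value() for a in assets)


def renewal_profile(assets):
    """Renewal cost falling due, keyed by years from now."""
    profile = {}
    for a in assets:
        years = a.years_to_renewal()
        profile[years] = profile.get(years, 0.0) + a.cost
    return dict(sorted(profile.items()))


# Portfolio A: two assets, both at mid-life
uniform = [Asset(100_000, 25, 50), Asset(100_000, 25, 50)]
# Portfolio B: one asset nearly new, one nearly worn out
mixed = [Asset(100_000, 1, 50), Asset(100_000, 49, 50)]

print(portfolio_value(uniform), portfolio_value(mixed))
print(renewal_profile(uniform))
print(renewal_profile(mixed))
```

Under these assumptions both portfolios report the same total depreciated value, yet the second has half of its renewal cost falling due next year while the first has nothing due for 25 years. The recorded value alone cannot distinguish between them.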

Today, our ability to access and analyse data enables us to answer old questions with more precision. This is exciting. The easy thing to do is to collect more of what we have collected in the past and to address the same questions. In other words, we can use new data to answer our old questions, questions that were themselves shaped by the data limitations of the time.

So, an even more important question might be – are we prepared to ask new questions that cannot be answered simply by inserting more data into our current data frameworks?

Are we prepared to rethink our data frameworks? Our thinking models? 

With new technology we can now create an abundance of data very cheaply, not only historical but also real-time data. In fact, we may well argue that we can now collect data on anything we can think of. To take advantage of this, we cannot let ourselves be hamstrung by the assumptions that have ruled in the past and the questions that these assumptions have generated. But having assembled, at great expense, the data we now possess, how do we ask questions that move beyond the models we have already developed?

This is one of the areas that we will be examining in our future podcast – “Talking Infrastructure: creating better questions”.