What do you do when the unexpected happens? You go to your knowledge base of past unexpected events… of course.
“Hang on!” I hear you say, “We don’t have one.”
I would ask, “Why not?”
Forecasting events
Is it because you have not experienced the pain of forecasting company outcomes and getting it horribly wrong? I have fond memories of walking out of performance results meetings with a number of virtual daggers in my back because my forecasters did not foresee a tumultuous event that prompted a collective “What the…”
This is where unstructured big data comes into its own. If you need to build a knowledge base of what on earth drove unforeseen events in the past (as we had to do), then your unstructured big data can help.
Performing forensics on a crash
As a hypothetical example, on the 29th of January the web traffic on a company site increased to the point that the website crashed. Your big data database contains the unformatted information needed to populate the event knowledge base and substantiate why this crash occurred.
The criteria for the event (the date and keywords like "sales campaign" or "special offers") are passed to the big data database, which extracts all the relevant knowledge. That knowledge is then stored in the event knowledge base for future reference. The next big decision you have to make is how to estimate the likelihood of a similar event occurring in the future, when it will occur, and whether it will have a similar impact.
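As a rough illustration of that extraction step, here is a minimal Python sketch. It assumes the "big data database" can be treated as a list of dated text records. The Record and Event classes, the function names, the year 2013 and the sample records are all invented for this example and are not taken from any real system.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class Record:
    """One piece of unstructured text and the day it was captured (hypothetical shape)."""
    day: date
    text: str


@dataclass
class Event:
    """One entry in the event knowledge base (hypothetical shape)."""
    name: str
    day: date
    keywords: List[str]
    evidence: List[str] = field(default_factory=list)


def extract_evidence(records, day, keywords):
    """Return the text of records from `day` that mention any keyword, ignoring case."""
    wanted = [k.lower() for k in keywords]
    return [r.text for r in records
            if r.day == day and any(k in r.text.lower() for k in wanted)]


def record_event(knowledge_base, records, name, day, keywords):
    """Extract the matching evidence and store it in the knowledge base under `name`."""
    event = Event(name, day, list(keywords), extract_evidence(records, day, keywords))
    knowledge_base[name] = event
    return event


# Invented sample data; the year is an assumption, the post only gives day and month.
records = [
    Record(date(2013, 1, 29), "Sales campaign email sent to all customers"),
    Record(date(2013, 1, 29), "Routine server patching completed"),
    Record(date(2013, 1, 28), "Special offers drafted for next week"),
]

knowledge_base: Dict[str, Event] = {}
crash = record_event(knowledge_base, records, "website crash",
                     date(2013, 1, 29), ["sales campaign", "special offers"])
print(crash.evidence)
```

A real store would more likely be queried through a search index than scanned in memory, but the inputs (a date and some keywords) and the output (evidence filed against a named event) stay the same.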
In this example, the event knowledge base is used to assist with forecasting future web traffic and with identifying similar occurrences in the future. On the 7th of February a similar outcome occurred and was much more sustained. The details of this second event are again extracted from the big data sources and added to the event knowledge base, which helps to determine the relevant drivers for forecasting future website traffic.
This is a very brief explanation of how to make this work for you. Post a comment if you want to discuss things in additional detail.
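One simple way to spot "similar occurrences" in the traffic itself is sketched below in Python. This is an assumed approach rather than the method described in this post: it flags any day whose visits exceed a multiple of the median day. The find_spikes function, the factor of 3, the year 2013 and all visit counts are invented for illustration.

```python
from datetime import date
from statistics import median


def find_spikes(daily_visits, factor=3.0):
    """Return the days, in date order, whose visits exceed `factor` times the median day."""
    baseline = median(daily_visits.values())
    return [day for day, visits in sorted(daily_visits.items())
            if visits > factor * baseline]


# Invented visit counts; only the two event dates come from the example above.
daily_visits = {
    date(2013, 1, 27): 1000,
    date(2013, 1, 28): 1100,
    date(2013, 1, 29): 5200,  # the crash
    date(2013, 1, 30): 1050,
    date(2013, 2, 6): 980,
    date(2013, 2, 7): 4800,   # the second, more sustained event
    date(2013, 2, 8): 4500,
}

spikes = find_spikes(daily_visits)
print(spikes)
```

Each flagged day can then be looked up in the event knowledge base to see whether a known driver, such as a sales campaign, explains it.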
Terry Simmonds,
User Rank: Blogger 3/4/2013 | 5:31:42 PM
Re: What the....! You are 100% correct Saul. People skills are essential for successfully navigating the stakeholder challenges a Data Scientist faces on a regular basis.
Saul Sherry,
User Rank: Blogger 3/4/2013 | 12:25:19 PM
Re: What the....! That's understandable and I am sure you do a great job on it Terry - but it could be a potential political timebomb in teams. I guess that's one reason to push 'people skills' as essential for Data Scientists, so they can manage these situations with kid gloves?
Saul Sherry,
User Rank: Blogger 2/26/2013 | 12:12:42 PM
Re: What the....! Thanks for the insight @Terry - I guess this kind of retroactive work can show up a lot of red faces among the decision makers in that case?
Saul Sherry,
User Rank: Blogger 2/26/2013 | 12:11:10 PM
Re: What the....! @Daniel - Ouch! This kind of stuff could push people's faith in NLP back years!
"We think of John Lennon as the most intellectual of the Beatles, but, in fact, Paul McCartney's lyrics had more flexible and diverse structures and George Harrison's were more cognitively complex. "
AlphaEdge,
User Rank: Exabyte Executive 2/26/2013 | 11:08:43 AM
Re: What the....! @mharden, I would add that it could be very useful for rare event prediction, which traditionally relies on Poisson regression to get it done.
AlphaEdge,
User Rank: Exabyte Executive 2/26/2013 | 10:58:46 AM
Re: What the....! Without historic data for the subject we study, we might get some benchmark data from other places as comparables. Otherwise, as you mentioned, it becomes pure guess work. It is a good suggestion to utilise this approach at the top predictor level or as inputs for determining, modelling and managing the drivers of your factor models.
Terry Simmonds,
User Rank: Blogger 2/26/2013 | 1:16:45 AM
Re: What the....! Hi and thanks for your comments. Yes you have pretty much got the idea.
The whole aim of building a knowledge event database is to reduce the amount of guess work. You cannot remove it completely however. You will be able to undertake quite a bit of scientific rigour around the analysis as the knowledge base grows.
In the past, as things grew, we also captured professional judgement adjustments to the forecasts in the knowledge database. This helped to develop a scorecard for internal users so they could assess who was making useful changes and who was not.