To lay some solid foundations for your future systems, you should start your big data conversation around the stack.
The water cooler discussion on big data for most people is about the data -- lots of it and of different types. But that's just the tip of the iceberg. Big data discussions in the IT department are vastly different from those taking place in the more customer-centric areas of marketing or brand management. Technology workers talk of stacks -- a metaphor for layering technologies one upon the other so that a device or user can perform an operation. For example, the ability to use an everyday application such as Microsoft Office relies on a stack that has a network layer, a hardware layer, an operating system layer, perhaps a virtualization layer, a database layer, and an application layer, which is the Office application suite itself.
Big data is dependent upon a stack that is similar to the Office stack. It has a hardware layer, an operating system layer, a data layer, and an application layer. The difficulty is that best practices are still being developed, so CIOs and their technical staff must establish their own. This is not an easy process, because the landscape of solution areas and technologies is wide and varied. Additionally, the chosen technologies usually have to tie into established systems and platforms and be managed by your current technical staff.
Here are some key jumping-off points for a conversation within the corporation about creating a big data stack and explaining it to the business side of the organization.
First, the big data stack buildout is complicated by the technology philosophy under which the IT organization operates. For example, if you leverage cloud-based as-a-service (platform, infrastructure, or data) provisioning, the basis of the stack buildout will differ from that of organizations that lean toward in-house solutions. As such, the move to big data may be an opportune time to determine whether your technology philosophy will provide a rich enough environment for supporting a big data stack, or if the entire corporate technology philosophy should be revisited.
No easy fit
The second complication of the big data stack is that there is no easy, one-size-fits-all stack in a box that CIOs can select and easily justify. At the hardware layer, the stack integrates technologies in the networking, processing, and storage arenas. There are operational and analytics infrastructure options to evaluate, create, and build upon. Each has its own cadre of vendors, from established names to startups. To assess these technologies, organizations will need to create an internal team capable of scanning the environment of emergent technologies. This team will need skills that reach up and down the stack, enabling them to consider issues that cross over layers and assess the overall impact of a specific decision. Should this talent not be available in-house, it will need to be acquired through hiring or consultants.
The third big data stack issue is the data and its hosting. The cloud option (public, private, or hybrid) is one possible route. This is compounded by the use of data-as-a-service and the acquisition of third-party datasets. Another dimension to the data is the hardware and the high-performance compute environments hosting it. Key issues that need to be resolved are when to move to an in-memory database (such as HANA) and what hardware solution to run. Many options are available: clusters, grids, and clouds such as Amazon's EC2.
Unfortunately for CIOs, new technologies are being introduced every day, and the key dilemma is deciding when to acquire the hardware capabilities and from whom. To simplify this, decision models from outsourcing can be used. A useful model to consider is based on competencies versus cost -- the market option is assessed against internal capabilities. If the market is superior and its costs are lower, then it makes sense to outsource; if the reverse is true, then develop and maintain those capabilities internally. If the market is better in terms of price but not capability, then a selective sourcing approach can be adopted, looking for low-cost providers for noncritical functions. If the market is better in terms of capability but not cost, then firms can look to create strategic partnerships for key functions or buy in capabilities.
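The competencies-versus-cost model above amounts to a two-by-two decision table, which can be sketched in a few lines of Python. This is only an illustration: the names `Sourcing` and `sourcing_decision` are hypothetical, and reducing each comparison to a single yes/no answer is a simplifying assumption. A real assessment would weigh many more factors.

```python
from enum import Enum


class Sourcing(Enum):
    """Hypothetical labels for the four outcomes described in the article."""
    OUTSOURCE = "outsource to the market"
    INTERNAL = "develop and maintain internally"
    SELECTIVE = "selective sourcing: low-cost providers for noncritical functions"
    PARTNER = "strategic partnerships for key functions, or buy in capabilities"


def sourcing_decision(market_more_capable: bool, market_cheaper: bool) -> Sourcing:
    """Map the two comparisons (market vs. internal) to a sourcing option.

    Assumption: each comparison has already been reduced to a boolean.
    """
    if market_more_capable and market_cheaper:
        return Sourcing.OUTSOURCE   # market wins on both capability and cost
    if market_cheaper:
        return Sourcing.SELECTIVE   # market wins on price only
    if market_more_capable:
        return Sourcing.PARTNER     # market wins on capability only
    return Sourcing.INTERNAL        # internal wins on both


if __name__ == "__main__":
    # Print the full decision table for all four combinations.
    for capable in (True, False):
        for cheaper in (True, False):
            choice = sourcing_decision(capable, cheaper)
            print(f"market more capable={capable}, cheaper={cheaper}: {choice.value}")
```

Writing the model out this way makes it easy to check that every combination of capability and cost has exactly one recommended route.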
Overall, for many organizations about to move to big data, this is an opportune time to start developing a plan for their stack and lay the foundations upon which future systems will be built. By starting with a conversation around the stack, firms will benefit from a fuller discussion than would potentially result from developing a set of point solutions or vertical apps. That set of solutions would need to be grafted together later into a Frankenstein stack, and starting again would be unpalatable due to sunk costs and an increasingly reliant user group.
Re: A short stack, at least for health care Forbes issued a caution to local governments (and gov in general) that "big data" doesn't become a "big quest", and it's a good point.
But I agree with you that local governments will start to see the value in aggregating and understanding large local data sets (an example: visualization of how zones changed over time compared to number of civic letters for or against those zone changes)
User Rank: Blogger 12/26/2012 | 2:40:59 PM
Re: A short stack, at least for health care @technetronics That's true. Both the government and the healthcare industry have good reason to wish to minimize risk. Still, both are now committing to finding big data solutions. I doubt we will see it in every local government or doctor's office right away, but we will see progress in that direction.
Re: A short stack, at least for health care I'll defend healthcare a little bit with an argument I use to defend government.
Because health and government -- these massive social structures -- are so critical, their evolutions need to be, in theory, much more calculated to best avoid failure. Basically, risk appetite is very low.
I think that explains why EMRs (electronic medical records) have taken so long to get in place. The sensitivity of the information and the importance of that information means that there needs to be very limited failure risk.
I'm optimistic about America's healthcare prospects because we're seeing a shift in the big insurance and medical players (including the Dept. of Health and Human Services) at the same time we're seeing individuals innovate around their health (esp. wearable devices)
This joint health effort is going to bring down treatment costs, shifting it more to preventative behavior.
User Rank: Exabyte Executive 12/26/2012 | 10:04:15 AM
Re: In house out house Moving IT infrastructure to the cloud helps avoid some capital investment, such as designing, building, and managing a data centre in-house. But a flawed contract with a cloud provider could lead to high operating expenses and barriers to effectively using its big data solutions. CIOs need to make sure the right agreement is in place, or at a minimum they should make sure that the cloud provider is capable of addressing the scalability, reliability, and performance needs of the rapidly changing big data stack solution.
User Rank: Exabyte Executive 12/12/2012 | 4:51:34 PM
Re: A short stack, at least for health care The right leadership must be in place. The area is quite complex. Numerous stakeholders need to be involved in defining the information required within the organization. Challenges still arise in management's decision making. As urgent as the need is for healthcare to adapt, there are many areas that definitely have their work cut out for improvement.
User Rank: Bit Player 12/5/2012 | 6:14:52 PM
Re: In house out house @Saul More excitement!
Does the configuration of the stack exclude you from newer releases?
No, not at all. Most don't require any changes to the code. If one did, it's just a case of provisioning extra cloud resources to test it before moving it into production.
Is that question behind all your decisions?
If we had a rigid plan for what our stack would look like, we'd really struggle. It's a case of learning every day. Sometimes what you have invested resources in learning might not matter anymore, which is sad, but if you see a 0.001% improvement in a process that runs 70 times per second, it's a no-brainer to implement it and move on.
User Rank: Exabyte Executive 12/5/2012 | 5:27:20 PM
Re: In house out house I see a hybrid model @saul. Develop in house but put your developers in the cloud. As for the evolution of data, a lot of us are tied to legacy systems, including Microsoft. They get to dictate the data format. The challenge of Big Data is to be inclusive of old, new, and evolving stacks and make sense out of them.
Re: A short stack, at least for health care Yes, it's amazing that healthcare is so far behind and has been for decades. It is interesting to think that this is an outcome of a free market economy and that we as users, taxpayers, and providers are happy with this. Even the big ERP companies have steered clear. Perhaps the giant UK NHS experience was a red flag and a premonition that this was an area to avoid. Perhaps by the end of this Presidential term all will be solved and healthcare costs will be in line with other countries, or then again we can just go on being the most expensive healthcare nation per capita on earth without significant life expectancy benefits.
A short stack, at least for health care It's so true that conversations about Big Data have drastically different alignments between business units and IT. Just last night I was skulking around a local "Health 2.0" event over in LA's Silicon Beach, trying to get a vibe on Big Data's acceptance in the health field. I wasn't really surprised to learn they're about 10 years behind in realizing any true value. One conversation I had was rather telling - I was chatting with a woman in population care management for a very large health care provider. She indicated that any sort of initiative relating to data science and better utilizing their massive stores of patient data in any meaningful way was still tied to the business unit level. Corporate IT had no overseeing effect. This means that Big Data won't be a factor for some time, as there's no global oversight into data aggregation. So any talk of a Big Data stack, at least at this particular health provider, is way premature.
Re: In house out house So that's the next battle ground @technetronic... the whole idea of believing in data as a science hinges on getting away from the 'highest paid person's opinion'. Nice idea, but if that HIPPO is influencing your tech loadout, and these management reluctancies are holding back progress, it will be an uphill battle.