Hadoop 2.0 & Beyond: Reinventing Hadoop

Christian Prokopp
comments
Christian Prokopp, User Rank: Blogger
3/7/2013 | 4:33:51 AM


Re: Real time a real option
Yes (http://www.forbes.com/sites/bwoo/2013/02/28/311/) indeed, it basically makes big data accessible to a lot of people with a common skill: SQL. That is what makes Hive so attractive. However, in big companies where people are trained on standards and vendors rule the land, things have to be fully standards-compatible (and it sometimes makes sense to be compatible with existing code/queries). So we are seeing a big push this year by the big boys towards SQL standards and fast/interactive querying on top of Hadoop.

Saul Sherry, User Rank: Blogger
3/7/2013 | 4:26:08 AM


Re: Real time a real option
Is the SQL support (or lack of it) an issue just because it is a language a lot of teams will already be familiar with?

Christian Prokopp, User Rank: Blogger
3/7/2013 | 3:03:52 AM


Re: Real time a real option
Certainly, full SQL support (on Hive) is not there yet. On the other hand, Hive gives you awesome power through the custom map/reduce scripts you can inject into a query.
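
To make the custom script point concrete, here is a minimal sketch of the kind of script Hive can stream rows through with its TRANSFORM clause. Hive passes each row to the script as a tab-separated line on stdin and reads tab-separated lines back from stdout. The file name `extract_host.py`, the table `page_views`, the columns `user_id` and `url`, and the `--stream` flag are all made up for illustration.

```python
import sys
from urllib.parse import urlparse


def transform_row(line):
    """Turn one 'user_id<TAB>url' input line into 'user_id<TAB>host'."""
    # Hypothetical two-column input; Hive separates columns with tabs.
    user_id, url = line.rstrip("\n").split("\t")
    host = urlparse(url).netloc.lower()
    return f"{user_id}\t{host}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Stream rows: one line in, one line out, skipping blank lines.
    for line in stdin:
        if line.strip():
            stdout.write(transform_row(line) + "\n")


# The --stream flag is an illustrative guard (an assumption of this sketch),
# so importing the file or running it without the flag does nothing.
if __name__ == "__main__" and "--stream" in sys.argv[1:]:
    main()
```

Under those assumptions, the Hive side would look roughly like `ADD FILE extract_host.py;` followed by `SELECT TRANSFORM(user_id, url) USING 'python extract_host.py --stream' AS (user_id, host) FROM page_views;`. The script knows nothing about Hadoop; Hive takes care of running it in parallel across the cluster.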

Larger companies sometimes use Teradata to join data from traditional stores such as an RDBMS with Hadoop in an SQL-like fashion, and add Tableau or similar to visualise and explore the data. Hadoop is nowhere near replacing such a comprehensive toolset. We can, however, see that as it matures it eats vertically and horizontally into other products' markets. For the data warehouse core this might soon become an alternative for some - maybe in combination with a lightweight RDBMS for fast access to the final aggregated data.

netcrawl, User Rank: Exabyte Executive
3/7/2013 | 12:13:16 AM


Re: Real time a real option
Can we see any limitations there? I'm sure Hadoop has some drawbacks in areas like analytics and SQL support.

Christian Prokopp, User Rank: Blogger
3/5/2013 | 8:46:35 AM


Re: Real time a real option
It is not an out-of-the-box solution and won't be for a while. A lot of internal know-how is required, and it competes (on the data warehouse side) with feature-rich, mature products. So nothing for the faint-hearted, but an opportunity for daring, well-run small companies? You can, however, throw money at the problem and get support and consulting services from distribution suppliers. Time will tell if it is a viable option. It is early days.

amrith, User Rank: Bit Player
3/5/2013 | 6:41:15 AM


Re: Real time a real option
Next they'll want a simple set of constructs to enable concurrent access by multiple clients.

Sound familiar?

Saul Sherry, User Rank: Blogger
3/5/2013 | 6:05:21 AM


Re: Real time a real option
Can you see any drawbacks to that approach @Christian?

Christian Prokopp, User Rank: Blogger
3/4/2013 | 4:40:06 PM


Re: Real time a real option
Real time might be a bit too ambitious with Pig and Hive. It certainly makes data warehousing with interactive querying and fast, scalable processing using Hive/Pig/Impala/Drill/Hue/Beeswax a realistic option. I think getting a 2-in-1 solution with Hadoop - MR and cluster resource management (YARN) as well as a viable data warehousing interface - makes it very interesting indeed. Not only are the solutions tightly integrated, it also means there is 'only' one platform, filesystem and HA/SLA challenge to worry about. So the investment in hardware can be utilised in two or three ways. Not bad.

Saul Sherry, User Rank: Blogger
3/4/2013 | 12:20:28 PM


Real time a real option
@Christian, do you think the speed upgrades through Pig/Hive will mean more companies see real-time insight through big data as a clearer option?
