Hadoop 2.0 & Beyond: Bypassing MapReduce

Christian Prokopp
comments
Christian Prokopp, User Rank: Blogger
3/5/2013 | 8:55:05 AM


Re: So long MapReduce?
I agree, the convergence is happening. With Hive/Drill/Impala, data warehouse and HPC solutions might see some competition. If we can explore data on Hadoop and use it as a generic cluster resource manager, then smaller data warehouses may be a result. HPC may also become an increasingly niche product.

amrith, User Rank: Bit Player
3/5/2013 | 6:51:32 AM


Re: So long MapReduce?
@Christian, great post. I think you hit the nail on the head; this is a natural evolution.

What MR did was highlight the inherent weakness of other data processing methodologies for processing large amounts of data in an environment where inexpensive scale-out was the only way to go because a scale-up solution was infeasible/impractical/impossible.

MR has done for conventional ETL the same thing that MPP did for databases. Interestingly there seems to be a convergence on the horizon.

Christian Prokopp, User Rank: Blogger
2/28/2013 | 12:16:06 PM


Re: So long MapReduce?
For developers, the more generic cluster resource management that YARN provides (see the 2nd part, coming up soon) is very exciting. It gives us a framework to deploy any kind of distributed workload, e.g. simple tasks like web crawlers or workers ingesting a queue and so on.

Saul Sherry, User Rank: Blogger
2/28/2013 | 12:05:18 PM


Re: So long MapReduce?
I guess that makes sense... MapReduce is a foundation, but not one that can't be switched out if a better method is found to serve its purpose.

Christian Prokopp, User Rank: Blogger
2/28/2013 | 11:22:01 AM


Re: So long MapReduce?
I am excited to see how business stakeholders use it for data-warehouse-like applications and BI. That should become an interesting new aspect of Hadoop.

mharden, User Rank: Exabyte Executive
2/28/2013 | 11:01:13 AM


Re: So long MapReduce?
Nice article Christian.  It's good to see Hadoop is maturing.  With the emergence of Impala and Drill we will most likely see new languages or extensions to existing languages that will make working with Hadoop easier for developers.

Christian Prokopp, User Rank: Blogger
2/28/2013 | 8:29:12 AM


Re: So long MapReduce?
I think it is a natural evolution now that old-school, batch-oriented MR has been well understood and utilised. Now people want to get to the data faster and in more ways. So breaking up Hadoop into a more flexible cluster resource control mechanism (see 2nd part) is a good step. And if you have all these nodes with (potentially) free memory and CPU cycles hanging around, the question is how we can use them to get to data faster. Lastly, the idea of streaming and online models and updates is a big push at the moment.

Saul Sherry, User Rank: Blogger
2/28/2013 | 6:47:24 AM


So long MapReduce?
Great article @Christian. The concept of bypassing MapReduce is sort of mind-blowing. It feels like the majority of big data querying technologies stem from this particular way of returning data results.
