Saul Sherry
Big Data Explained: What Is HDFS?

Part of a 9-part series   |   4/4/13   |   1:05


Big data is awash with acronyms at the moment, none more widely used than HDFS. Let's cut to the chase... it stands for Hadoop Distributed File System.

This is the system of distributing files that allows Hadoop to work on huge data sets at speed. It splits files into blocks of data, spreads those blocks across different servers, and stores duplicate copies of each block on separate servers.
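
To make the spreading and duplicating concrete, here is a minimal Python sketch. It is a toy model, not how Hadoop itself decides where blocks go: the server names, the tiny block size, and the simple round-robin placement are all assumptions made for illustration. The three copies per block match HDFS's usual default replication setting.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, servers: list[str], replicas: int) -> dict[int, list[str]]:
    """Assign every block to `replicas` different servers, round-robin (an assumed policy)."""
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    placement = {}
    for block_id in range(num_blocks):
        # Consecutive offsets guarantee the copies land on distinct servers.
        placement[block_id] = [
            servers[(block_id + offset) % len(servers)] for offset in range(replicas)
        ]
    return placement


# Hypothetical 10-byte "file" and a 4-byte block size, far smaller than real blocks.
blocks = split_into_blocks(b"0123456789", block_size=4)
servers = ["server-a", "server-b", "server-c", "server-d"]
placement = place_blocks(len(blocks), servers, replicas=3)

for block_id, holders in placement.items():
    print(block_id, blocks[block_id], holders)
```

Each block ends up on three different servers, so no single machine holds the only copy of anything.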

Let's see why with an example.

Sarianne works in the financial markets, and runs a lot of predictive models to make sure her investments carry minimal risk.

With HDFS, Sarianne's Hadoop queries can run quickly because the data blocks are stored on separate servers -- meaning the work on each block can happen at the same time, rather than each piece queuing up behind the last.
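
The idea of working on every block at once can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records and the `count_trades` function are hypothetical, and threads on one machine stand in for the separate servers a real Hadoop cluster would use.

```python
from concurrent.futures import ThreadPoolExecutor


def count_trades(block: list[str]) -> int:
    """Work done on one block: count the records marked as trades."""
    return sum(1 for record in block if record.startswith("TRADE"))


# Three hypothetical blocks, each standing in for data held on a different server.
blocks = [
    ["TRADE,ACME,100", "QUOTE,ACME,101"],
    ["TRADE,INIT,55", "TRADE,ACME,99"],
    ["QUOTE,INIT,54"],
]

# Each block gets its own worker, so no block waits for another to finish...
with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
    per_block = list(pool.map(count_trades, blocks))

# ...and the partial answers are combined at the end.
total_trades = sum(per_block)
print(per_block, total_trades)
```

The final answer is the same as counting everything in one long pass; the difference is that no block has to wait its turn.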

As an added benefit, if one server fails (as one is bound to, given the number of servers and disk drives needed to run big data projects) it won't stop Sarianne's models from pulling the data they need, because HDFS has stored duplicates of those blocks on other servers -- meaning Hadoop can still return Sarianne's results in double-quick time.
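
Here is a rough Python sketch of why a failed server need not stop a query. It assumes a hypothetical `placement` table recording which servers hold each block; the `pick_replica` helper is invented for illustration and is not part of Hadoop.

```python
def pick_replica(block_id: int, placement: dict[int, list[str]], live_servers: set[str]) -> str:
    """Return the first server that holds the block and is still running."""
    for server in placement[block_id]:
        if server in live_servers:
            return server
    raise RuntimeError(f"no live copy of block {block_id}")


# Hypothetical placement: each block has copies on three servers.
placement = {
    0: ["server-a", "server-b", "server-c"],
    1: ["server-b", "server-c", "server-d"],
}

# server-a has failed, so it is missing from the set of live servers.
live_servers = {"server-b", "server-c", "server-d"}

sources = {
    block_id: pick_replica(block_id, placement, live_servers) for block_id in placement
}
print(sources)
```

Block 0 would normally come from server-a, but the read simply moves on to the next copy, so Sarianne's models never notice the failure.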

Saul Sherry
Big Data Explained: What Is Velocity?

Part of a 9-part series   |   1/18/13   |   1:53


Today we're going to take a look at the V that allows big data to be immediate and reactive: Velocity.

As well as having to master the sheer volume and variety of information within big data, organizations also have to contend with the speed at which all of this data is generated. Real benefit can be gained by pouncing on this data in real time -- influencing outcomes while they are still forming.

What kind of benefit?

Well, as we've already established, data can take many different forms. How you benefit from working on this stream of real-time big data will depend on your industry. For this example I'll focus on the financial services sector.

Andy is in charge of online security for a big bank, trying to make sure his customers' money is safe. Detecting fraud after the event is of limited use to him, but spotting it as it happens can be priceless. If a malicious computerized attack is launched against Andy's bank, it will generate thousands of events every second -- but Andy has put the right system in place to detect these events by comparing them with the way actual, normal customers behave. And it all happens in real time, so alarms go off to let him know as the attack unfolds.
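
One simple way to picture spotting an attack as it happens is to count events in a short sliding window and raise an alarm when the count goes beyond what normal customers produce. The Python sketch below is a stand-in under stated assumptions, not the system Andy's bank would run: the `RateMonitor` class, the one-second window, and the limit of five events are all hypothetical.

```python
from collections import deque


class RateMonitor:
    """Track event timestamps and report when too many fall inside the window."""

    def __init__(self, window_seconds: float, max_events: int):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one event; return True if the recent rate exceeds the limit."""
        self._timestamps.append(timestamp)
        # Forget events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events


monitor = RateMonitor(window_seconds=1.0, max_events=5)

# A normal customer: one event every half second.
normal_alarms = [monitor.record(t) for t in (0.0, 0.5, 1.0, 1.5)]

# An automated attack: ten events within a tenth of a second.
burst_alarms = [monitor.record(2.0 + i * 0.01) for i in range(10)]

print(any(normal_alarms), any(burst_alarms))
```

The check runs on every event as it arrives, which is the point of velocity: the alarm goes off during the burst, not in a report the next morning.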

As Frank Bria told us in his Big Data Republic article, "Big Data Tackles Fraud":

"Many fraudsters will access online banking and go directly to the transfer section of a Website without first checking balances and transactions. That clickstream is foreign and unfamiliar to the complex event processing engine and thus gets flagged."

In this way the bank can clamp down on the illegal activity as it happens, rather than chasing it up after the event.
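
The pattern in that quote can be written as a very small rule. The Python sketch below is an illustration only, assuming each session arrives as an ordered list of page names; the names `balances`, `transactions`, and `transfer` are hypothetical, and a real complex event processing engine would learn what normal looks like rather than rely on one hand-written check.

```python
def is_suspicious(clickstream: list[str]) -> bool:
    """Flag a session that reaches the transfer page before viewing balances or transactions."""
    for page in clickstream:
        if page in ("balances", "transactions"):
            return False  # the customer checked the account first, as most do
        if page == "transfer":
            return True  # went straight to moving money
    return False  # never tried to transfer at all


sessions = {
    "typical customer": ["login", "balances", "transfer"],
    "straight to transfer": ["login", "transfer"],
    "just browsing": ["login", "transactions"],
}

flags = {name: is_suspicious(pages) for name, pages in sessions.items()}
print(flags)
```

Only the session that skips straight to the transfer page is flagged, which matches the unfamiliar clickstream described above.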