As data from a rapidly multiplying number of sources pours into organizations of all types and sizes, security concerns are mounting. Specifically, a growing number of enterprises are wondering if it's possible to maintain an iron grip over increasingly massive pools of data.
A number of cloud providers and data vendors are wondering the same thing. Earlier this year, the Cloud Security Alliance and Fujitsu Labs joined forces to create the Big Data Working Group (BDWG), an organization dedicated to improving big data security.
The BDWG says it's now working to identify scalable techniques for data-centric security and privacy problems. The organization's other long-term goals include developing best practices for security and privacy in big data, helping industry and government adopt those practices, establishing liaisons with other organizations to coordinate the development of big data security and privacy standards, and accelerating the adoption of research aimed at addressing security and privacy issues.
Challenges and suggestions
The BDWG has so far identified 10 big data-specific security and privacy challenges, including vulnerabilities associated with distributed programming frameworks, non-relational data stores, and data storage and transaction logs.
Distributed programming frameworks use parallelism in computation and storage to process massive amounts of data. One example is the MapReduce framework, which splits an input file into multiple segments and hands each segment to a mapper for processing. However, mappers can leak data, whether intentionally or unintentionally. The BDWG recommends two major attack-prevention measures: securing the mappers, and securing data in the presence of an untrusted mapper.
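To make the mapper's role concrete, here is a minimal, single-process sketch of the MapReduce pattern as a word count. It is an illustration only, not the BDWG's example or any real framework's API: the function names (split_input, mapper, reducer, word_count) are hypothetical, and a production framework would run the mappers on separate machines. What it shows is that each mapper receives raw records from its segment, which is why a faulty or malicious mapper is a leakage risk.

```python
from collections import Counter
from functools import reduce


def split_input(text, num_segments):
    """Split the input lines into roughly equal segments, one per mapper."""
    lines = text.splitlines()
    size = max(1, -(-len(lines) // num_segments))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def mapper(segment):
    """Count words in one segment. The mapper sees every raw record here,
    so a buggy or untrusted mapper could expose that data."""
    counts = Counter()
    for line in segment:
        counts.update(line.lower().split())
    return counts


def reducer(left, right):
    """Merge two partial word counts."""
    return left + right


def word_count(text, num_segments=3):
    """Run the split, map, and reduce steps in sequence (hypothetical driver)."""
    segments = split_input(text, num_segments)
    partials = [mapper(segment) for segment in segments]
    return reduce(reducer, partials, Counter())
```

Calling word_count on a three-line input with the default settings sends one line to each mapper and merges the three partial counts into a single result.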
Non-relational data stores, popularized by NoSQL databases, present another security challenge. The clustering aspect of NoSQL databases is of particular concern. To reduce security incidents, the BDWG recommends reviewing security policies for the middleware, and strengthening the NoSQL database itself to match its relational database (RDB) counterparts without compromising operational features.
Data and transaction logs are yet another security weak point. Such logs are stored in multi-tiered storage media. Manually moving data between tiers gives IT managers direct control over precisely what data is moved and when. Yet, as the size of the data set grows exponentially, scalability and availability require the use of auto-tiering for big data storage management.
The problem is that auto-tiering solutions don't automatically track where the data is stored. According to the BDWG, new mechanisms are imperative to thwart unauthorized access and maintain 24/7 availability. The organization also specifically warns that an auto-tiering solution that moves rarely used data to a lower tier as a cost-saving measure provides decreased security, and it says auto-tiering users should carefully study their tiering strategies.
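The tracking gap described above can be pictured with a small sketch. This is not a mechanism proposed by the BDWG or the behavior of any real storage product; the TieredStore class, its tier names ("hot" and "cold"), and the 30-day threshold are all assumptions made for illustration. It simply shows an auto-tiering policy that records every move, so that an administrator could still answer where a given log lives and when it was moved.

```python
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Hypothetical two-tier store that keeps a record of every move."""
    cold_after_days: int = 30                       # assumed policy threshold
    location: dict = field(default_factory=dict)    # object id -> current tier
    audit_log: list = field(default_factory=list)   # (object id, from, to)

    def put(self, object_id, tier="hot"):
        """Register a new object and log its initial placement."""
        self.location[object_id] = tier
        self.audit_log.append((object_id, None, tier))

    def auto_tier(self, days_since_access):
        """Move objects between tiers based on idle time, logging each move."""
        for object_id, idle_days in days_since_access.items():
            current = self.location.get(object_id)
            target = "cold" if idle_days >= self.cold_after_days else "hot"
            if current is not None and current != target:
                self.location[object_id] = target
                self.audit_log.append((object_id, current, target))
```

For example, after store = TieredStore(), store.put("log-1"), and store.auto_tier({"log-1": 45}), the object is reported in the cold tier and the audit log holds both its initial placement and the move.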
Other big data security challenges identified by the BDWG affect end-point input validation/filtering, real-time security/compliance monitoring, scalable and composable privacy-preserving data mining and analytics, cryptographically enforced access control and secure communication, granular access control, granular audits, and data provenance.
The BDWG has published a white paper describing all of the big data-related security and privacy threats it has identified, along with use case descriptions and recommended actions.
User Rank: Exabyte Executive 12/26/2012 | 4:47:58 PM
Re: noSQL brings more problems Yes, good point to emphasize. NoSQL databases are still evolving. Each of the various NoSQL databases is built to tackle a different challenge posed by analytics, and hence security was never part of the model. I think developers embed security in middleware, but NoSQL databases still lack security support.
User Rank: Blogger 12/18/2012 | 2:04:47 AM
Re: Does it mention Human Elements of Security? As usual I would say it's all about the intelligence which goes into the situation. With the advent of the internet of things, most human interactions should be measured. If you have best practices for privacy set up, analytics and sensor monitoring should let you see when that human is being the weakest link and set up defences... of course, the unexpected always happens.
User Rank: Exabyte Executive 12/17/2012 | 9:45:14 PM
Does it mention Human Elements of Security? I was wondering if the white paper puts any focus on the human elements of security -- I'm used to people being the weakest link in security situations, and I wondered if it was an issue with Big Data, or if the people involved with Big Data are relatively reliable from a security perspective?
User Rank: Blogger 12/17/2012 | 8:41:27 AM
noSQL brings more problems NoSQL is referenced as a specific issue here. The complexity of such a non-relational system makes it so and ramps up the need to make sure it's necessary. NoSQL is great, but it adds weight to the idea of making sure you can't address these issues through traditional SQL... less complex in running and security.