Big Data: Tiny Storage

Ariella Brown
Ariella, User Rank: Blogger
2/9/2013 | 10:33:18 PM


Re: Interesting but....
@Saul I wonder if it is well-suited to images. As paintings decay over time, getting a high resolution photograph stored in a way that keeps it fresh and accurate over 10,000 years may be useful as a check on what the original shades and hues were, particularly if restoration is needed and we wish to avoid a repeat of the botched fresco.

Saul Sherry, User Rank: Blogger
2/9/2013 | 7:07:50 PM


Re: Interesting but....
@legalcio... I can see that in monetary terms, the answer to your tweeting question would probably be a straight-up no. But in terms of interest, it would be fascinating to have access to this repository of human idea regurgitation, and map how it changes across generations (looking at whatever platform comes to be the 'next' Twitter). Interesting in an anthropological setting, and possibly for a broader brand significance study. But will it matter to a bank looking to retain customers, or a doctor looking to cure a patient... no. And therefore, no investment will be justified.

Ariella, User Rank: Blogger
2/8/2013 | 10:12:19 AM


Re: Interesting but....
@Susan At the rate we're going, it looks like data will continue to increase, as more and more information is tracked. Walmart's data likely includes what people searched for, as well as what they ultimately bought, when they shopped online. It also would track what types of devices they are using to access the site and how long and how often they visit. With smartphone sensors, even people's movements can be tracked just by virtue of Wi-Fi signals. So every step you take can contribute to the mounting piles of data. 

DNA data storage offers the dual advantages of being incredibly compact and robust -- with an estimated life span of one thousand times that of current long-term storage options. Though it is not a feasible solution for now, thinking outside the box of binary code may lead to another solution that could be applied before a decade passes.

Susan Fourtané, User Rank: Blogger
2/8/2013 | 4:08:51 AM


Re: Interesting but....
legalcio, 

"The other thing that crosses my mind: is all this data worth storing for the long term? Does the shelf life for tweets really need to go beyond a year?"

No tweet deserves long-life storage. It would be silly to use expensive storage, or something like the disappointing DNA data storage that will not happen, for storing tweets.

However, it would be fantastic if someone could come up with a real solution, one that could actually be applied and used to safely store important big data.

-Susan

Susan Fourtané, User Rank: Blogger
2/8/2013 | 3:32:20 AM


Disappointment
Ariella, 

And the answer to my question about how and when DNA data storage would be available came immediately as I continued reading. :( 

I was so excited thinking of all the possibilities and solutions this could bring that I couldn't wait to finish reading, and had to comment mid-thought.

Now I am so disappointed to learn that DNA data storage is simply not going to happen. Even if the cost comes down, if the files can't be updated it would be like having the same problem you have with paper storage: you simply have to create a new data entry. What's the point?

This was good for publicity, for them. Not a real contribution to anything. 10 years in today's agile world is the equivalent of 100 years. Who cares? We will all be dead by then. I am so upset that I can hardly choose my words here.
  1. "The files cannot be updated; they would have to be set up into a new sequence for any modification.
  2. Each file has to be decoded in its entirety, as it doesn't allow access to a single component."
Do they plan to continue working on this project? Do they expect to find a way to be able to modify the DNA sequence in order to allow updates? 

Do they plan to find a way to be able to decode single components? What's next in their research plan? Do they even have a plan?

I think I read some time ago about Shakespeare's sonnets being stored in a DNA sequence. Maybe that was when they started the project.

You say maybe this brings a shift in encoding data, at least. Maybe. 

-Susan 

 

Susan Fourtané, User Rank: Blogger
2/8/2013 | 2:52:20 AM


Long life data
Ariella,

"this form of data storage for its stability, estimating its lifespan to extend to at least 10,000 years."

All this is super interesting.

So this could be the solution to storage, and to the fears of losing data for one reason or another. If data can be safely stored for 10,000 years, all our worries should be over, at least in this department.

I wonder how and when DNA data storage could be available for everyone who needs to store data. 

-Susan

Susan Fourtané, User Rank: Blogger
2/8/2013 | 2:37:07 AM


Is there a limit?
Ariella, 

"If all that data were in paper, it would fill 'about 20 million filing cabinets.'"

I can hardly imagine a room filled with 20 million filing cabinets. Even less can I imagine how anyone could deal with such an amount of data in paper form.

Electronic formats can be handled more easily, but how much more is big data still going to grow?

-Susan

Ariella, User Rank: Blogger
2/7/2013 | 1:59:44 PM


Re: Interesting but....
@legalcio Maybe that's what the Library of Congress should do with its record of tweets -- store them in some DNA. But don't expect to see DNA around offices for data storage any time soon. Even the somewhat optimistic view of the researchers involved here is for it to happen in ten years.

For now the cost is prohibitive. But the expectation is that it will drop substantially over time. According to http://singularityhub.com/2012/09/17/new-software-makes-synthesizing-dna-as-easy-as-using-an-ipad/, the cost of DNA writing right now is about 25 cents per base pair. At that rate synthesizing an E. coli genome - a relatively small genome of 4.6 million base pairs - would still cost over $1M, too much for the average lab to pay, especially as they typically build up libraries of varied versions of the DNA. But the cost of DNA synthesis is dropping at a super-exponential rate, outpacing even Moore's Law. Amirav-Drory predicts it will hit 10 cents per base pair this year, and in the near future, when the cost of synthesis drops enough, gone will be the laborious days of manually splicing together pieces of DNA.
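
As a rough check on those numbers, here is a minimal Python sketch of the arithmetic. The genome size and both per-base-pair prices are taken from the passage above; the function name `synthesis_cost_dollars` is made up for illustration, and the sketch assumes cost scales linearly with sequence length, which real synthesis pricing may not.

```python
# Figures as quoted in the comment above; none are independently verified.
ECOLI_GENOME_BP = 4_600_000   # E. coli genome size in base pairs, as quoted
CURRENT_COST_CENTS = 25       # "about 25 cents per base pair", as quoted
PREDICTED_COST_CENTS = 10     # Amirav-Drory's predicted price, as quoted


def synthesis_cost_dollars(base_pairs: int, cents_per_bp: int) -> float:
    """Hypothetical helper: cost in dollars, assuming a flat per-base-pair price."""
    return base_pairs * cents_per_bp / 100


current = synthesis_cost_dollars(ECOLI_GENOME_BP, CURRENT_COST_CENTS)
predicted = synthesis_cost_dollars(ECOLI_GENOME_BP, PREDICTED_COST_CENTS)

print(f"At 25 cents/bp: ${current:,.0f}")    # At 25 cents/bp: $1,150,000
print(f"At 10 cents/bp: ${predicted:,.0f}")  # At 10 cents/bp: $460,000
```

So the "over $1M" figure works out to about $1.15M at today's quoted price, and the predicted price would bring the same genome down to roughly $460,000.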

legalcio, User Rank: Exabyte Executive
2/7/2013 | 1:27:40 PM


Interesting but....
The implications for Big Data, I assume, would be a whole different breed of data scientists. Would the storage venue be too complex to take advantage of Big Data? The other thing that crosses my mind: is all this data worth storing for the long term? Does the shelf life for tweets really need to go beyond a year?
