The best big data insight comes when an organisation looks at itself from the inside out. Taking on the challenge of making your own network as transparent and clean-running as possible is the best grounding for real business intelligence insights.
James M. Connolly, US Correspondent, 5/17/2013
College students get queasy when they think of their institution of higher learning as being a business with budgets and management mandates. After all, the classroom, the dorms, and the campus are at the root of the word collegial.
Ariella Brown, Technology Blogger, 4/12/2013
By using data to track preparation for college at the high school level and the experience of students at the postsecondary level, College Summit focuses on achieving measurable improvement for low-income students, not only in terms of acceptance to college, but in terms of progress at that level.
Ariella Brown, Technology Blogger, 4/3/2013
On April 18 and 19, the Digital Public Library of America (DPLA) will celebrate its launch at the Boston Public Library. In keeping with the ideals underlying the project, there is no charge to attend, though the registration forms indicate the event has filled up.
Francine Bennett, CEO, Mastodon C, 3/7/2013
Anyone who's looked at big data technologies can't help but notice that it's open-source projects and tools that are leading the charge -- not only Apache Hadoop, but also MongoDB, Cassandra, R, scikit-learn, and many others.
Daniel D. Gutierrez, Data Scientist, 2/26/2013
Let's say you've gone through all the resources in Free Big Data Education: A Data Science Perspective here on BDR and you're wondering what's next. For those of you who want to go further in the field of data science, here is a list of unstructured educational methods that you can use to expand your knowledge.
ETL, which stands for Extract, Transform, and Load, is central to a lot of big data work. But what does that mean? Let's explain it with an example:
Lauren is a data scientist working at a university, looking to bring together different datasets to make sure students are offered courses which best suit their profiles. To do this, she needs to pull data from lots of places into a centralized data warehouse.
First, she needs to extract data from the original sources, which can include existing university databases as well as social media information on students gathered by web crawling.
Next, Lauren has to transform this extracted data into a form the centralized data warehouse can use. For this, she can apply a series of rules or functions to get the data into shape -- for instance, converting dates of birth to ages, deriving aggregated values, deduplicating records, or joining data from multiple sources, depending on what the final data warehouse needs.
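To make the transform step concrete, here is a minimal Python sketch of the four rules mentioned above: deduplicating records, turning a date of birth into an age, aggregating, and joining two sources. The field names (student_id, name, dob, course), the sample records, and the helper functions are all hypothetical, chosen only for illustration; a real warehouse would dictate its own schema.

```python
from datetime import date


def age_on(dob: date, today: date) -> int:
    """Return the age in whole years on the given day."""
    years = today.year - dob.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


def transform(students, enrollments, today):
    """Deduplicate students, derive age, aggregate and join enrollments."""
    # Deduplicate: keep the first record seen for each student_id.
    unique = {}
    for s in students:
        unique.setdefault(s["student_id"], s)

    # Aggregate: count enrolled courses per student.
    course_counts = {}
    for e in enrollments:
        sid = e["student_id"]
        course_counts[sid] = course_counts.get(sid, 0) + 1

    # Join and derive: one output row per student.
    return [
        {
            "student_id": sid,
            "name": s["name"],
            "age": age_on(date.fromisoformat(s["dob"]), today),
            "course_count": course_counts.get(sid, 0),
        }
        for sid, s in unique.items()
    ]


# Hypothetical extracted records (assumed shapes, not real data).
students = [
    {"student_id": 1, "name": "Ana", "dob": "1994-06-15"},
    {"student_id": 1, "name": "Ana", "dob": "1994-06-15"},  # duplicate row
    {"student_id": 2, "name": "Ben", "dob": "1993-01-02"},
]
enrollments = [
    {"student_id": 1, "course": "STAT101"},
    {"student_id": 1, "course": "CS102"},
]

rows = transform(students, enrollments, today=date(2013, 5, 17))
for row in rows:
    print(row)
```

Passing today in explicitly, rather than calling date.today() inside the function, keeps the derived ages reproducible when the same extract is transformed again later.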
Finally, Lauren can load this data into the data warehouse, giving her a way to gain new insight on students by mining for patterns in this collected data.
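As a rough illustration of the load step, the sketch below writes already-transformed rows into a table and then runs a simple query of the pattern-mining kind. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the data warehouse; the table name student_profile, its columns, and the sample rows are assumptions made for this example.

```python
import sqlite3


def load(conn, rows):
    """Load transformed rows into the (hypothetical) student_profile table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS student_profile ("
        "student_id INTEGER PRIMARY KEY, "
        "age INTEGER, "
        "course_count INTEGER)"
    )
    # INSERT OR REPLACE keeps the load idempotent if the job is re-run.
    conn.executemany(
        "INSERT OR REPLACE INTO student_profile "
        "VALUES (:student_id, :age, :course_count)",
        rows,
    )
    conn.commit()


# An in-memory SQLite database stands in for the warehouse here.
conn = sqlite3.connect(":memory:")

# Hypothetical output of the transform step.
rows = [
    {"student_id": 1, "age": 19, "course_count": 4},
    {"student_id": 2, "age": 19, "course_count": 2},
    {"student_id": 3, "age": 22, "course_count": 5},
]
load(conn, rows)

# A simple pattern query: average course load per age.
pattern = conn.execute(
    "SELECT age, AVG(course_count) FROM student_profile "
    "GROUP BY age ORDER BY age"
).fetchall()
print(pattern)  # [(19, 3.0), (22, 5.0)]
```

Keying the table on student_id means running the same load twice replaces rows instead of duplicating them, which is one common way to make a load step safe to repeat.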