Suppose you’ve gone through all the resources in Free Big Data Education: A Data Science Perspective here on BDR, and you’re wondering what’s next. For those of you who want to go further in the field of data science, here is a list of informal, self-directed methods that you can use to expand your knowledge.
Data science blogs
Monitoring blogs, written by experts in the field, is an excellent way to learn what the professionals are doing. Here are some of my favorites:
Data science conferences take place all around the world. Although attending these events can be expensive, their proceedings and presentation materials are often made available for free afterward. Take, for example, Salford Systems, which offers a library of conference videos on a variety of data-science-related topics. I’ve watched many of them and they are top-rate.
Keeping up with academia
Keeping a close eye on academia is an important way to stay on top of emerging trends in data science. Researchers at universities and think tanks are constantly producing new techniques, so it is a good idea to see what’s coming.
The Association for Computing Machinery (ACM) is the computer industry’s venerable professional society, and it maintains a number of relevant special interest groups (SIGs). One is ACM SIGKDD (Knowledge Discovery and Data Mining), which publishes an excellent journal, “SIGKDD Explorations,” that I greatly enjoy. SIGEVO (Genetic and Evolutionary Computation) is another favorite of mine for its special focus on evolutionary algorithms.
ACM and its SIGs require membership fees, but many of the research papers on machine learning destined for ACM journals first appear on the arXiv.org pre-print server. This is a tremendous resource that I can’t recommend highly enough. You should monitor the recent list every few days to be sure nothing good slips past.
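If checking the recent list by hand every few days becomes a chore, the check can be scripted. The following is only a minimal sketch, not an official client: it assumes arXiv's public query API at export.arxiv.org and its Atom response format, and the function names, the cs.LG category, and the result count are illustrative choices rather than anything prescribed by arXiv or this article.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: arXiv's public query endpoint, which returns an Atom feed.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(category="cs.LG", max_results=10):
    """Build a URL asking for the newest submissions in one category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_feed(atom_xml):
    """Extract title, link, and publication date from an Atom feed."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        link = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        published = entry.findtext("atom:published", default="", namespaces=ATOM_NS)
        papers.append({
            # Titles can wrap across lines in the feed; collapse whitespace.
            "title": " ".join(title.split()),
            "link": link.strip(),
            "published": published.strip(),
        })
    return papers


def fetch_recent(category="cs.LG", max_results=10):
    """Download and parse the newest submissions (requires network access)."""
    url = build_query_url(category, max_results)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_feed(response.read())


if __name__ == "__main__":
    for paper in fetch_recent():
        print(paper["published"][:10], paper["title"])
        print("   ", paper["link"])
```

Run from a scheduler every few days, a script like this prints the newest machine learning submissions with their links. Changing the category argument (for example to stat.ML) follows a different arXiv listing.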
Microsoft Research is another superb academic resource. The company maintains machine learning groups around the world. They attract the best and brightest talent and offer a broad range of research results.
A great way to learn more about data science is to follow select data scientists on Twitter to get leads on the latest techniques and research. I follow a variety of people on Twitter and frequently see links to papers. To find people to follow, use the Twitter search feature with keywords like “data science,” “big data,” “machine learning,” etc. One of my favorites, and one I highly recommend, is @kdnuggets. Some data scientists tweet from conferences, so you can monitor their tweets to get the latest breaking information without actually attending.
A good way to continue your data science education is to examine data, all sorts of data, and then develop algorithms for classification, prediction, clustering, etc. Here is a list of free and open data repositories:
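As an illustration of the kind of exercise described above, here is a minimal k-means clustering sketch in plain Python. It is a teaching example under stated assumptions, not a production implementation: the data points are made up, the function name kmeans is my own, and the starting centroids are simply the first k points so that the result is reproducible.

```python
import math


def kmeans(points, k, iterations=20):
    """Group points into k clusters with a basic k-means loop.

    Assumption: the first k points serve as the initial centroids,
    which keeps the example deterministic.
    """
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    labels = [
        min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
    ]
    return centroids, labels


# Made-up two-dimensional data with two visible groups.
data = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (8.0, 8.1), (8.3, 7.9), (7.9, 8.2)]
centroids, labels = kmeans(data, k=2)
print(centroids)
print(labels)
```

Once a toy version like this works, swap in a real data set and compare your results against an established library.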
Finally, a great way to gain real-world experience and learn a lot along the way is to participate in a data science competition. Kaggle is arguably the best-known resource and has a number of competitions running at any given time. You’re given a complete description of the problem to be solved, training data sets, and an online forum to discuss the project with others. Maybe the best thing about the competitions is that many have prize money. For example, the Heritage Health Prize competition has a grand prize of $3 million. You can start with one of the practice competitions for valuable experience, and then try a real one later. Other competition sites are DataKind and TunedIT.
The field of data science is moving at such a frenetic pace that resources like those mentioned in this article will morph over time. Please be sure to share your favorite resources here.
User Rank: Petabyte Pathfinder 2/27/2013 | 10:23:55 AM
Re: I'd like to add... Good one, Saul. We all agree that BDR has been very informative and has been a top knowledge base for us. It keeps us updated on new and efficient technology and on the trends adopted by the majority of the IT world.
User Rank: Petabyte Pathfinder 2/27/2013 | 3:10:00 AM
Re: I'd like to add... Thanks for the share! Great topic. It would be nice if our BDR blog could be added to the list. I regularly receive great feedback and new followers on my Twitter account by tweeting. Big data is a very rich resource, and the useful information we extract from it can be used to boost our competitive position.
Re: I'd like to add... Wow, great resource, thanks for sharing. Important to note that I've shared this with a friend in finance...forward-thinking financial analysts are all over big data...the lessons are emerging in particular because they are embracing human behavior as the true driver of the economy, as irrational as people can be
I also offer Meetup as a great way to meet other big data and data aficionados (http://big-data.meetup.com)