Resources to get up to speed with data science
The following list of resources will let you join the data revolution by getting up to speed with data science. As you can tell from reading all the articles and posts here on Big Data Republic, this whole area is exploding in popularity and attention.
Now is the time to take advantage of free education for aspiring big data scientists.
Data science -- and the driving force behind it, machine learning -- is the process of deriving added value from data assets. Commerce and research are being transformed by data-driven discovery and prediction. The skills required for data analytics at massive scale span a variety of disciplines and are not easy to obtain through conventional curricula. These include algorithms for machine learning (e.g., neural networks and clustering), parallel algorithms, basic statistical modeling (logistic regression and linear/non-linear regression), and proficiency with a complex ecosystem of tools and platforms.
Meetup groups
A good place to start is with meetup groups. Two of my favorite data science groups deal with the primary ingredients of data science work: R, which is the programming environment of choice for building algorithms, and machine learning. The LA area R user group is excellent; try to find one near you. The LA Machine Learning group has regular meetings that are extremely useful.
For more insight into these groups, visit the Field Report category at my blog, Radical Data Science, where I write up intimate accounts of the various meetings I attend.
Open courseware
The Massive Open Online Course (MOOC) movement is very active in the data science space and constitutes a superb educational resource. These free courses, some of which offer certifications, provide an excellent path toward obtaining the requisite background for becoming a data scientist. I’ve put together a BDR Data Science “pseudo degree program” for you to follow.
As the interest in data science continues to grow, and as the shortage of talent becomes apparent, the timing is excellent to retool yourself and climb aboard the data science gravy train.
Free e-book on Data Science with R
Here is a great resource for learning about data science using the R environment: the e-book "An Introduction to Data Science" by Jeffrey Stanton of the Syracuse University School of Information Studies. The book was developed for Syracuse's Certificate for Data Science. Download the PDF and enjoy!
CyberH,
User Rank: Bit Player 1/26/2013 | 3:19:53 PM
Big Data Solution
Daniel, designed by data scientists, HPCC Systems is an open source data-intensive supercomputing platform to process and solve Big Data analytical problems. It is a mature platform and provides a data delivery engine together with a data transformation and linking system. Its built-in analytics libraries for Machine Learning and integration with Pentaho and R provide analysts with an end-to-end solution for ETL, data mining and reporting. The http://hpccsystems.com portal is full of helpful blogs and resources for getting started.
AlphaEdge,
User Rank: Exabyte Executive 1/24/2013 | 12:47:12 PM
Re: Free is my favourite price
I think it may take a little while for universities to work data science into their curriculums as a subject. Or are many universities already on top of it?
Saul Sherry,
User Rank: Blogger 1/24/2013 | 6:15:07 AM
Re: Free is my favourite price
@mharden, do you think students and established data analysts see the difference the way you envisioned it? Is data scientist the more daunting path to take?
Saul Sherry,
User Rank: Blogger 1/23/2013 | 5:57:02 AM
Re: Free is my favourite price
Hi @AlphaEdge, I've just visited all the free book links, and they all throw up either the correct PDF or a page which I can then use to navigate to the PDF. What device were you using to access the links?
AlphaEdge,
User Rank: Exabyte Executive 1/22/2013 | 5:21:57 PM
Re: Free is my favourite price
@Daniel, I was trying to access the list "Free data science books", and several of the links do not work. I tried to navigate the webpage, but was not able to find the PDF version. I had hoped to read them as PDFs. Thanks.
Re: Free is my favourite price
Data analyst vs. data scientist ... think of it this way: data analysis is a subset of data science, which implies that a data scientist must have the skills of a data analyst. In fact, much of the early work in a data science project is using data analysis to "get to know" the data set(s) intimately. Data scientists possess higher-level skills for moving forward from there. You'll see that the first few classes in the list I provided start the process by getting you familiar with data analysis techniques. Hope this helps.
mharden,
User Rank: Exabyte Executive 1/21/2013 | 7:31:45 PM
Re: Free is my favourite price
Thanks for the links, Daniel. These remind me of my freshman year in college when I took my first programming class. Would you say it's more about the mindset and approach to solving problems that distinguishes a data scientist from a data analyst?
Another option for DS101
Here is another free Intro to Stats class, which can be viewed as an alternative to the BDR Data Science 101 course. This one is from Edx.org and is based on a UC Berkeley course.