Resources to get up to speed with data science
The following list of resources will let you join the data revolution by getting up to speed with data science. As you can tell from reading all the articles and posts here on Big Data Republic, this whole area is exploding in popularity and attention.
Now is the time to take advantage of free data science education.
Data science -- and the driving force behind it, machine learning -- is the process of deriving added value from data assets. Commerce and research are being transformed by data-driven discovery and prediction. The skills required for data analytics at massive scale span a variety of disciplines and are not easy to obtain through conventional curricula. These include algorithms for machine learning (e.g., neural networks and clustering), parallel algorithms, basic statistical modeling (logistic regression and linear/non-linear regression), and proficiency with a complex ecosystem of tools and platforms.
A good place to start is with meetup groups. Two of my favorite data science groups deal with the primary ingredients of data science work: R, which is the programming environment of choice for building algorithms, and machine learning. The LA area R user group is excellent; try to find one near you. The LA Machine Learning group has regular meetings that are extremely useful.
For more insight into these groups, visit the Field Report category at my blog, Radical Data Science, where I write up intimate accounts of the various meetings I attend.
The Massive Open Online Course (MOOC) movement is very active in the data science space and constitutes a superb educational resource. These free courses (some offer certifications) offer an excellent path toward obtaining the requisite background for becoming a data scientist. I’ve put together a BDR Data Science “pseudo degree program” for you to follow.
Free e-book on Data Science with R
Here is a great resource for learning about data science using the R environment, an e-book "An Introduction to Data Science" by Jeffrey Stanton from Syracuse University School of Information Studies. The book was developed for Syracuse's Certificate for Data Science. Download the PDF and enjoy!
User Rank: Bit Player 1/26/2013 | 3:19:53 PM
Big Data Solution Daniel, designed by data scientists, HPCC Systems is an open source data-intensive supercomputing platform to process and solve Big Data analytical problems. It is a mature platform and provides a data delivery engine together with a data transformation and linking system. Its built-in analytics libraries for Machine Learning and integration with Pentaho and R provide analysts with an end-to-end solution for ETL, data mining and reporting. The http://hpccsystems.com portal is full of helpful blogs and resources for getting started.
User Rank: Exabyte Executive 1/24/2013 | 12:47:12 PM
Re: Free is my favourite price I think it may take a little while for universities to work data science into their curricula. Or are many universities already on top of it?
Re: Free is my favourite price Hi @AlphaEdge, I've just visited all the free book links, and they all throw up either the correct PDF or a page which I can then use to navigate to the PDF. What device were you using to access the links?
User Rank: Exabyte Executive 1/22/2013 | 5:21:57 PM
Re: Free is my favourite price @Daniel, I was trying to access the list "Free data science books", and several of the links do not work. I tried to navigate the webpage, but was not able to find the PDF versions. I had hoped to read the books as PDFs. Thanks.
Re: Free is my favourite price Data analyst vs. data scientist ... think of it this way: data analysis is a subset of data science, which implies that a data scientist must have the skills of a data analyst. In fact, much of the early work in a data science project is using data analysis to "get to know" the data set(s) intimately. Data scientists possess higher-level skills that build on this foundation. You'll see that the first few classes in the list I provided start the process by familiarizing you with data analysis techniques. Hope this helps.
User Rank: Exabyte Executive 1/21/2013 | 7:31:45 PM
Re: Free is my favourite price Thanks for the links, Daniel. These remind me of my freshman year in college, when I took my first programming class. Would you say it's more the mindset and approach to solving problems that distinguishes a data scientist from a data analyst?