Our one-to-one conversations with data scientists continue as we get to know Josh Wills,
senior director of data science at Cloudera.
Cloudera is among a handful of organisations essentially synonymous with big data -- so this chat presented us with a great opportunity to get under the skin of one of the guys driving innovation in big data tool creation.
Big Data Republic: What does your day-to-day work involve?
Josh Wills: I work on a few open-source projects: Apache Crunch (data pipelines), Oryx (machine learning infrastructure), and Gertrude (multivariate testing for machine learning and search ranking). I also advise my customers on anything they might need at their current phase of Hadoop adoption, from hiring their first data scientist through use case evaluation and technology selection through building and deploying data products.
BDR: What attributes are important for a data scientist at Cloudera?
JW: We put a higher priority on tool building and software engineering abilities than most data science teams, because our main job is to make our customers more productive in their own data science work. Although we provide advice over the phone and during in-person whiteboarding sessions, the open-source tools we create allow us to collaborate on problems no matter where we are.
BDR: What's your workspace like?
JW: I travel a lot, so I had to develop the ability to work just about anywhere by tuning out the world around me. I work at home on my couch, in airport lounges, in hotel rooms, in coffee shops, and occasionally at Cloudera's office in San Francisco.
BDR: How did you get started in data science?
JW: I think that I've been a data scientist my entire professional life; there just wasn't a term for it until a few years ago. My sequence of job titles prior to joining Cloudera was software engineer, pricing analyst, software engineer, member of technical staff, statistician, quantitative analyst, senior software engineer, and staff software engineer. I've always worked in data analysis roles that required a lot of coding or software engineering roles that involved building tools for working with data.
BDR: What do you listen to while you work?
JW: When I'm analyzing data, Fleet Foxes or Ray LaMontagne. When I'm coding, Skrillex or Aphex Twin.
BDR: What software tools couldn't you live without?
JW: For my Java development work, I'm a recent convert to IntelliJ. For R, I am a huge fan of RStudio, and for Python, I still use vim. You can take vim away from me when you pry it from my cold, dead hands.
BDR: What advice would you give to companies looking to recruit people like you?
JW: Good data scientists gravitate towards good data, and good data is like money: It goes where it is wanted and stays where it is well treated.
BDR: What's the best thing about being a data scientist?
JW: I love losing myself in some data set that I'm analyzing. Thinking up some question, manipulating the data to get some insight into it, thinking of something new and trying that, over and over and over again. I don't even feel my ass in the chair.
Transition to Data Science Very impressive profile. There is no doubt that Cloudera is already well ahead of its competitors in big data implementation projects. Josh's transition from software engineering toward statistics and data analytics is quite impressive, but the reverse journey is not that easy: for a person from the statistics and quant field, adapting to software engineering and coding and excelling in the data science profession is very difficult.
"For their coders they had a system of velcro patches on retractable cables which could be pulled down from the ceiling... when you had that patch velcroed to your monitor you were NOT to be disturbed."
I know an open workspace that is used by different startups. When a person (or team) has a little rubber duck on the desk, it is a sign for "do not disturb".
Re: Different music for different scenarios SharCo,
"Songs with lyrics (especially the ones where the singer seems to be singing five words a second) really don't go well with writing. You just end up with work that sounds like a five year old wrote it."
I would say that depends on the individual. I have never listened to anything with five words per second while writing, simply because it gets on my nerves. Mostly, I write in silence, and if I am in an environment that is too noisy I can't get much done. When I choose silence it has to be absolute silence, which I enjoy very much.
However, sometimes I write with classical music in the background, not too loud. It gives me a certain tempo, and a nice speed that makes the words flow equally nicely. It puts me in a good mood, and I am always happy with the writing result.
Some other times (less often than the two mentioned above) I write listening to music with lyrics. I have to be in a certain mood, though. And the singer has to have a special voice with a calming or pleasant effect on me. I have written listening to Frank Sinatra, or selected David Bowie songs. I have been happy with the result as well. You might have read some of those writings. :) I wonder if you could tell. :D
I may listen to some of that music for 15-20 minutes while thinking, standing in front of the window, and stop it when going back to my desk to put those thoughts into words.
Sometimes I listen to Sigur Rós, which is quite meditative to my ear, and makes me produce some more philosophical writing.
So, it all depends not only on the individual but also on the particular moods of that individual. There is no rule to it.
User Rank: Petabyte Pathfinder 12/10/2013 | 7:51:33 AM
Re: Different music for different scenarios I don't think music suppresses creativity at all. In fact, I think it encourages it. But it depends on the person. It really depends. If they work well in silence, then silence it is.
User Rank: Petabyte Pathfinder 12/10/2013 | 7:50:06 AM
Re: Different music for different scenarios I find that music helps me feel creative if I "let" it. I think it has to do with your attitude, too. I find that instrumental music inspires me, while music with lyrics makes me want to just reflect and not do anything.
User Rank: Petabyte Pathfinder 12/10/2013 | 7:47:47 AM
Re: Different music for different scenarios Songs with lyrics (especially the ones where the singer seems to be singing five words a second) really don't go well with writing. You just end up with work that sounds like a five year old wrote it.
User Rank: Petabyte Pathfinder 12/10/2013 | 7:46:23 AM
Re: Different music for different scenarios Hmm.. When I was in high school, I worked best with a little background noise. I didn't mind music playing in the distance, for example. Then when I was in college, I couldn't concentrate anymore. I needed complete silence in order to focus. So I would say, yes, that's a generalization, because I don't think gender is the only determinant.
User Rank: Blogger 12/10/2013 | 5:31:19 AM
Re: Different music for different scenarios For me, I find that music can distract the part of me that seems to want to be distracted, while letting me get on with some work. It has to have no lyrics, though; writing and lyrics don't go well together.
User Rank: Blogger 12/10/2013 | 5:01:29 AM
Re: Different music for different scenarios There are all sorts of solutions out there @Mike. A few years ago I went to a talk by a guy who launched a startup (he'd come from one of the big guys, Facebook or Google, can't remember which). For their coders they had a system of velcro patches on retractable cables which could be pulled down from the ceiling... when you had that patch velcroed to your monitor you were NOT to be disturbed.