[ml] Seeking a Data Scientist

Jared Dunne jareddunne at gmail.com
Mon Oct 24 21:29:03 UTC 2011


My employer, a Thomson Reuters business, is hiring a Data Scientist
for our Architecture Team.  We are seeking a full-time hire, but would
also consider a contract or contract-to-hire candidate.  Our offices
are located near downtown Sunnyvale, about a 10-minute walk from
Caltrain.  There is definitely some flexibility around work hours and
telecommuting, but the majority of work would be performed onsite in
Sunnyvale.

We recently began a new project that involves pulling large amounts of
legal industry data from various internal and external sources (mostly
semi-structured data).  Once we have the data, we will need to
automate the matching of entity records across data sources and the
merging of them together into a more comprehensive, unified record.
Additionally, we will be interested in doing more analysis on the
resulting merged data to discover new insights.  We anticipate needing
to select and implement a variety of machine learning, information
retrieval, and data mining algorithms throughout the course of this
project.

We have people on staff with a vague understanding of some good
approaches to these problems, but we need an experienced data
scientist to confidently steer our approach and implementation.
Ideally, candidates would have a good depth and breadth of knowledge
of the relevant algorithms that we might consider.  While it's not
required that the candidate be an elite software engineer, some
programming experience and ability to prototype your chosen
approach(es) will be required.  Similarly, while the candidate need
not know all the gory details and have implementation experience with
every conceivable algorithm under the sun, they should be able to
propose a variety of competing approaches and explain their benefits
and drawbacks.

MS/MA or PhD in Computer Science, Statistics, Probability,
Mathematics, or similar field is preferred, though only a BS/BA is
required.  Being able to find, read and understand white papers to
research and learn new algorithms is definitely a required skill.
Experience with R, SciPy, Mahout (or similar) is a big plus.  Some
experience with Java is preferred, though only some form of prior
programming experience is required.  Experience with Hadoop (or
MapReduce) would be amazingly helpful, but isn't required.  Agile
development experience is a plus.

If you are interested, or have questions, please contact me directly
by email.  Feel free to attach a resume or CV.  If you aren't sure
whether you are a good match, write me anyway and we can feel it out.

Jared-

PS: I'll have limited email access starting Wednesday until Nov 3rd,
so if I'm delayed in getting back to you, please don't read into it!


