Work

Play with minds, bits, habits and words

I’m interested in how automation encodes work and thought, and how we move through these together. There is tremendous power in moving mindless work — even, and perhaps especially, work with natural language — to machinery, and we have a corresponding responsibility to understand our application of automation to the lives we change.

I am usually focused on the “Bayesian” and “Deep” poles of the Eisner simplex, but I know how to find tractable solutions in the “Classical” sense as well.

I love working across discipline boundaries: being the linguist among engineers, the data analyst working with UX teams, or the machine-learning person among lawyers or humanists.

Empiricism

My 2013 resume (available below) leads with a bit of a thematic-roles structural joke:

Empiricist. Software engineer. Computational linguist.

Specialization

  • purpose: Finding & answering interesting questions in rich, complex, noisy data, with special attention to natural language.
  • instrument: Scientific understanding, curiosity, and top-notch software craftsmanship.
  • theme: Building and applying maintainable, understandable, extensible scientific software.
  • manner: Marrying engineering-based and science-based ideas of empiricism, information, replicability, modularity, and explanatory power.

In the kind of science and engineering that I like to do, I try to have a clear picture of how a project can be measured (what would a successful project look like? what would be a sign of failure? how do we know if it’s working?) and drive both scientific and engineering development from those metrics. Engineers, especially software engineers, are familiar with “test-driven development”; the research scientific community, even in computational linguistics, sometimes sees testing and measurement as an afterthought. I prefer to keep measurement right in sight from the beginning of the project.

I like the word “empiricist” — it reflects the kind of orientation to data and measurement that I prefer to adopt in my work with natural language and with scientifically-aware engineering in general.

The trendy name for my profession is “data science”, which is undoubtedly data-oriented, but not entirely science. My work combines aspects of software engineering, database (and parallel-systems) architecture and design, curiosity and expertise in the domain of the current problem, and applications of (usually statistical) machine learning models.

History

I live in Seattle, Washington.

My graduate-school experience — and much of my domain expertise — is in natural language processing, speech processing, and other aspects of computational linguistics. I have industry and academic experience in speech synthesis, speech recognition, text processing, large-corpus data analysis, and other areas of NLP.

I have a 2010 PhD in Linguistics from the University of Washington, where I was a member of the Electrical Engineering SSLI lab. I finished my PhD as a visitor at SRI’s STAR lab, living in San Francisco. Since completing my PhD, I have worked as Director of Research at Wordnik, as a Research Linguist at STAR, and as a Senior Computational Linguist at Quid, a business intelligence platform. In 2011 I moved back to Seattle, my home of the heart.

I was a Senior Scientist and Software Architect at inome, where I designed and implemented systems to apply machine-learning and NLP techniques to data-mine people records into a graph of All The Things.  In my Software Architect role, I was a go-to guy on system design, pipeline architectures, parallel data processing, and designing for software development in the (slightly kooky!) Hadoop ecology.

I was a (Senior) Software Engineer at Google Seattle from 2014 to 2016, where I worked on neural networks, language modeling, and several other projects I’m not allowed to talk about. I worked in Java, C++ and Python.

More information about my research and career can be found here:

I welcome email: jeremy at trochee dot net.