
  • Each one teach one

    I am either a terrible father or a best father. Mr. 3 asked me to tell him a story about Darth Vader as we brushed teeth tonight, so I told him about how Anakin was separated from the Skywalker knife and became the thrall of the Vader knife. … He’s going to be so confused.

  • Visualization libraries in Jupyter, Python, & R

    I’ve become a near-rabid fan of the Jupyter data analysis environment (hello Scott!), and I am deeply impressed by the work that Continuum (and some of my former colleagues at Google) have put into supporting it. (I share some of these concerns, but that’s a post for another time.) This week I have been teaching myself…

  • Relational skills and the three wh’s

    There’s a fairly tidy, but imperfect, correspondence between the three wh’s and the relational skillsets I proposed yesterday: “how” corresponds well to the tooling skillset; “what” roughly corresponds to the data stewardship skillset; … leaving “why” to correspond to the collaboration skillset, which seems apt: why do data science if you don’t have someone you’re doing it with, or for?…

  • Relational data science skills

    Here’s what I see as ideal “data science” leadership. This post is a nod to the classic Conway Venn Diagram, but focused more on relational skills than on specific individual output (much as Tunkelang suggests here). Tooling skills: Here, it’s most helpful to be comfortable with the family of “data science” tools that is out there, and be…

  • Rolling the dice at the Just World Casino

    tl;dr: The tech frame of “lean startup”, venture capital funding, “exit strategies”, and relentless “valuation” talk is fundamentally anti-human for nearly all of us. [ETA (immediately after publication):] Startup idea: They are treated like bees; they are robbed of the honey they make. — Hottest Startups (@HottestStartups) May 16, 2016 The kneejerk libertarianism and Randian…

  • Three wh-’s of data science

    “Big data” bandwagoneers may remember the three Vs of big data: volume, variety, and velocity (sometimes joined by veracity or variability[0]). These concerns are real, though if you’re not Google, Amazon, or the NSA, your data is probably not as big as you think it is. Data “science”, though, is a bigger question than working with big data. Sometimes…

  • Samyro 0.0.2 – sampling structured inputs

    New version of Samyro (0.0.2) now uploaded to PyPI. The GitHub repo has the details, but I’ll brag about the new features: samyro write accepts a --seed argument, which allows the usual temperature-based decoding *after* the engine has progressed through the given seed. The default seed is now the BOS character, which plays nicely with the structured…

  • Samyro 0.0.1 – development update

    Last week I posted a new Python package to PyPI: Samyro, my toolkit for doing RNN-based character synthesis of new text from given text. I summarize in tweets the journey to the release of the project. There’s lots more to do, but much of the experimentation now is based on changing the input texts that…

  • Butlerian jihad and the return of the JDI

    A message containing a precis for action in the case of data-loss and/or coldsleep hibernation. Sent from [transcription failed] [Message begins] After the Exchange Compact (establishing the Combine Honnete Ober Advancer Mercantiles) and the massive data-loss of the Second Butlerian Jihad, impulse-based intelligences were thoroughly reduced to second-class citizens of the Old Empire, and mentats took over most architectural design…

  • I’m looking for work

    I am currently without employment, and I’m looking to see what’s next for me. I am excited about human language, computers, and machine learning, and I’m pretty good at all three and their areas of overlap. I am happiest tinkering in the “Bayesian” and “Deep” corners of the Eisner Simplex, but can keep my head above water…