December 30, 2013

The verbs of Data Science

The kerfuffle about what is Data Science and who are Data Scientists comes down to nouns and verbs.

Data Science and Data Scientists are nouns. Most nouns are abstractions, buckets that make communication easier. Nouns are shorthand, but they can be slippery (some people do not tolerate ambiguity well). Nouns become more useful (and more powerful) when paired with verbs.

Here are sample data science verb-noun pairs:

  • Fit models.
  • Create data products.
  • Communicate results.

Each one of those general verbs can be made crisper with more specific verbs:

  • Pipe data into Vowpal Wabbit.
  • Code the backend to a recommendation widget for a website.
  • Post slide deck.
I would much rather have a discussion about verbs than nouns.

December 23, 2013

Skills are more important than interests

It is more common to see a "research interests" section than a "research skills" section on a curriculum vitae (CV). This reflects the fact that it is easier to read review papers than to learn Unix. A paper won't tell you that you don't understand it. On the other hand, Unix won't "go" if you are not precisely correct.

I would rather work with people who view their professional work as a craft. As craftspeople, they are proud of their toolboxes. More pragmatically, I want an assurance that their toolset is current and relevant.

December 16, 2013

Visualization baked into analytic processing

Current statistical analysis software assumes a matrix mindset. That mindset is easy for computers and the associated math but hard for most people, including me.

I am a visual explorer. When faced with a new problem, I like to draw to think. I envision a successful outcome before I start.

Current software doesn't help that process.

I imagine a different kind of statistical program that pictorially represents raw data and exploratory analysis. Raw data could be icon-based, little homunculi with group tags. Groups could be represented as circles, mean differences represented as the distance between circles, and variances represented as diameters. Confidence intervals could be added as concentric circles.
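The circle encoding described above can be sketched in a few lines of Python. This is only an illustration of the idea, not an existing tool: the names `GroupCircle`, `describe_group`, and `center_distance` are made up for this example, the sample values are invented, and the confidence interval uses a rough normal approximation (z = 1.96), which is my assumption.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, variance


@dataclass
class GroupCircle:
    """Geometry for drawing one group as a circle (hypothetical structure)."""
    label: str
    center_x: float     # group mean; distance between centers = mean difference
    diameter: float     # sample variance, following "variances as diameters"
    ci_diameter: float  # width of an approximate 95% CI, for a concentric circle


def describe_group(label, values, z=1.96):
    """Turn raw values into circle geometry (normal-approximation CI, an assumption)."""
    m = mean(values)
    var = variance(values)                  # sample variance (n - 1 denominator)
    half_width = z * sqrt(var / len(values))
    return GroupCircle(label, m, var, 2 * half_width)


def center_distance(a, b):
    """Distance between two circle centers, i.e. the absolute mean difference."""
    return abs(a.center_x - b.center_x)


if __name__ == "__main__":
    # Invented sample data, purely for illustration.
    control = describe_group("control", [1, 2, 3, 4, 5])
    treated = describe_group("treated", [4, 5, 6, 7, 8])
    print(control)
    print(treated)
    print("mean difference:", center_distance(control, treated))
```

A plotting library could take these numbers and draw the circles; the sketch stops at the geometry so that the mapping from statistics to shapes stays visible.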

Visual quantitative reasoning is an important but undertaught skill. It is often the last step, used only to interpret finished graphs in published papers. It could also be the first step in understanding unpolished data.

Visually based software would help the process of introducing statistics to students. Circles are more friendly than dataframes.

Every representation shapes thought. There are limitations and distortions inherent in visually representing data. However, there should be viable alternatives to the current left-brain-driven paradigms.

December 13, 2013

Anki Markdown flashcards

I love Markdown and Anki.

Markdown allows me to keep "my stuff" in plain text. It has enough formatting to frictionlessly publish to Word, PDF, or the web.

Anki is my preferred digital flashcard system. Flashcards allow me to test my knowledge. Knowing what I know (and more importantly, what I don't know) is a critical step in the learning process.

I combined those two loves. You can check out my Anki Markdown flashcards here.

December 9, 2013

What is so special about the human brain? from TED.com



There are urban legends in all fields, including science. It takes bravery to challenge assumptions. To ask the simple questions with fresh eyes.

Her work is interesting but straightforward. It is simple to "count" the number of neurons. Neuroscience needs to move past counts. Move past in vitro. Move towards examining the connectome in vivo.

Only an academic neuroscientist would be shocked that human brains are not just scaled-up rodent brains. Rodent models of human phenomena are powerful but past their prime. We now possess the tools (e.g., neuroimaging and cheap genetic sequencing) to more directly explore issues facing the human experience.

Evolutionary biology is a powerful heuristic for understanding diet. She missed the vital role meat plays as an efficient energy source.

December 2, 2013

If "publish or perish" is your world-view

Say yes to:
  • Simple, linear, discrete, time-boxed projects within your domain
  • Hierarchical / command and control supervising
  • Situations where other people work on your projects with your timelines
  • Thoughtful but critical feedback
Say no to:
  • Complex, sprawling, open-ended projects that you know very little about
  • Contributing to an “ecosystem”
  • Developing talent and mentoring
  • Volunteering
  • Unfiltered ideas
Spend time choosing the game to play. Don’t spend time being mad at the rules.