Archive for the ‘Uncategorized’ Category

Connecting Data Systems and DevOps

In this post, we explain why anyone transforming their company into a data-driven organization should care about software development best practices, even if they don’t consider themselves a software company.


Noteworthy Links: Strata Edition

Here are some links from around the internet to get you in a Strata state of mind.

The Basics of Classifier Evaluation: Part 2

A previous blog post made the point that classification accuracy shouldn’t be used as a performance metric for classifiers. The next part in this series was going to discuss other evaluation techniques such as ROC curves, profit curves, and lift curves. However, there are several important points to be made first. Here I present a sequence that shows the progression and interrelation of the issues.
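
As a rough illustration of why accuracy alone can mislead, here is a minimal sketch using made-up data (the 1% positive rate and the always-negative "classifier" are assumptions for this example, not taken from the post). It compares accuracy with recall on a heavily imbalanced label set.

```python
# Hypothetical labels: 990 negatives and 10 positives (a 1% positive rate).
y_true = [0] * 990 + [1] * 10

# A trivial "classifier" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)
    return true_pos / actual_pos


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}, recall = {rec:.2f}")
```

In this sketch the classifier scores high on accuracy while finding none of the positive cases, which is the kind of gap that other evaluation techniques are meant to expose.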

The Basics of Classifier Evaluation: Part 1

If it’s easy, it’s probably wrong.

Zero to Kaggle in 30 Minutes

We’ll walk through the steps for competing in Kaggle’s “Digit Recognizer” contest using SQL-based machine learning tools to identify hand-written digits.


The CDO is still a relatively new position and there isn’t yet a strong consensus about the exact job description, reporting structure, or qualification set.

Avoiding Common Mistakes with Time Series

A basic mantra in statistics and data science is that correlation is not causation. This is a lesson worth learning.
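
One way this mantra can bite with time series, sketched here under assumptions of my own (the two series, their trends, and the choice of differencing are all hypothetical and not taken from the post): two unrelated series that both trend upward can look strongly correlated, and the correlation largely disappears once the trend is removed.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def diff(series):
    """First differences: the change from each step to the next."""
    return [b - a for a, b in zip(series, series[1:])]


# Two hypothetical series that share nothing except an upward trend.
# The wiggles are deterministic and follow different patterns.
steps = range(100)
x = [t + (1 if t % 2 == 0 else -1) for t in steps]
y = [2 * t + (1 if t % 4 < 2 else -1) for t in steps]

corr_levels = pearson(x, y)             # dominated by the shared trend
corr_diffs = pearson(diff(x), diff(y))  # trend removed by differencing

print(f"correlation of levels:      {corr_levels:.3f}")
print(f"correlation of differences: {corr_diffs:.3f}")
```

Differencing is only one of several ways to remove a trend; it is used here because it needs no extra libraries.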

What Your Board of Directors Wants to Know About Big Data

I recently spoke about “Unlocking Business Opportunity from Big Data” to a group of former CEOs and senior business executives. Here are some questions they had.

One Year Later, Observations on the Big Data Market

We recently celebrated our first birthday here at Silicon Valley Data Science.