All Blog Posts

Jupyter Notebook Best Practices for Data Science

We present some best practices that we implemented after working with the Notebook—and that might help your data science teams as well.

Image Processing in Python

In this post, we use a Jupyter Notebook to go over the steps for creating a proof of concept for the image processing piece of our Caltrain work.

Introduction to Trainspotting

In this post we’ll start looking at the nuts and bolts of making our Caltrain work possible: image processing, video analysis, and image recognition.

Predix Transform 2016

We detail insights learned while attending the recent Predix Transform conference.

Learning from Imbalanced Classes

This post gives insight and concrete advice on how to tackle imbalanced data.

SVDS Strengthens Executive and Advisory Team

I’m excited to announce two new members of our team: Antony Falco (VP, Product & Innovation) and Nayla Rizk (Advisor).

5 Ways to Facilitate Failure

Failure is appealing as a stepping stone along the path to innovation, but it’s very scary in practice—especially when you can’t yet see where the path is leading. We’d like to suggest the following five guidelines as a place to start.

Scaling Data Science: Dream Big, Start Medium-ish

On July 13th we welcomed the Open Data Science Conference meetup series to our HQ, where our speaker discussed how to think critically about the size of your data.

How I Learned to Stop Worrying and Love Ephemeral Storage

This post shows architects and developers how to set up Hadoop to communicate with S3, run Hadoop commands directly against S3, use distcp to transfer data between Hadoop and S3, and use distcp for regular updates that copy only the differences.