Fausto Inestroza

Combining his expertise in emerging technologies with cross-industry experience, Fausto helps clients architect big data platforms and build data-driven products. He has extensive experience with data platforms, analytical processes, and distributed systems. His work has encompassed a range of architectures and techniques, including cloud-based distributed architectures, stream processing, distributed pub-sub, complex event processing, distributed in-memory caching, collaborative filtering, and market-mix optimization. He has architected and developed solutions using a wide range of technologies, including Hadoop, Cassandra, Storm, Kafka, Hive, Pig, and Pentaho.

Prior to joining SVDS, Fausto was a technical lead at Accenture Technology Labs, where he helped lead the development of multiple analytical systems including large-scale recommender systems and real-time predictive analytics platforms. He was also responsible for the redesign of the data management strategy for a proprietary marketing analytics platform.

Fausto has presented his work at industry conferences including Strata + Hadoop World and Cassandra Summit.

Recent Posts

Realize the Business Power of Your Data with DevOps

If you are on the path to being a data-driven company, you have to be on the path to being a development-enabled company.

Building Data Systems: What Do You Need?

In this post, we’re going to go over the capabilities you need to have in place in order to successfully build and maintain data systems and data infrastructure.

Understanding Modern Data Systems

In this post, Fausto talks about the characteristics that differentiate data infrastructure development from traditional development, and highlights key issues to look out for.

Connecting Data Systems and DevOps

In this post, we explain why anyone transforming their company into a data-driven organization should care about software development best practices, even if they don’t consider themselves a software company.

Past Events

2015

  • Cassandra Summit

    Santa Clara, CA

    SVDS presents two sessions at the Cassandra Summit: a look at the migration of our client Allant’s CDI-keying engine from Oracle to Cassandra; and a how-to on using Cassandra as a platform for building a custom distributed system.

  • Data Lakes in the Real World: Ask Us Anything

    Online

    Modern data architectures look radically different as we move towards a new idea of data platforms. During this “ask us anything” webinar we will discuss our experiences building new data architectures and take your questions.

Heather Nelson

A problem-solver by nature, Heather is passionate about helping organizations leverage data to drive competitive advantage. She draws on a diverse background in business and technology consulting to find the best solutions for her clients’ toughest data problems.

Heather has led a wide range of data science and data engineering projects across a variety of industries, including health, financial services, and retail. In particular, she has extensive experience in text extraction from unstructured data, data analysis, data conversions, data visualization, and business case development. She also has hands-on experience with many data tools and technologies, such as Tableau, WEKA, SQL, Java, R, Hadoop, and Pig.

Heather is particularly passionate about facilitating data-driven business decisions, and leverages her background in technology to match the right solutions to the right business problems. At SVDS, she has led implementation teams to build real-time inventory management systems that serve the eCommerce website of a Fortune 50 retailer, among other projects.

Heather holds a BS in Computer Science from the University of Missouri, where she graduated with highest honors.

Recent Posts

Crossing the Development to Production Divide

In this post we’ll give an overview of obstacles we’ve faced (you may be able to relate) and talk about solutions to overcome these obstacles.

Easily Spinning up Data Platforms

A quick overview of the motivation behind our instant and repeatable data platform tool.

Data Opportunities in Insurance

In this post we explore how data is changing the insurance industry, through the lens of auto insurance underwriting.

Crossing the Development to Production Divide

We know what it’s like to deal with complex production deployments that run the gamut from infrastructure upgrades, to feature deployments, to data migrations, where each step threatens to derail the plan. In this post, Heather gives an overview of obstacles she has faced (you may be able to relate) and talks about ways to overcome them.

Past Events

2017

  • Strata Data Conference New York 2017

    New York, NY

    The Strata Data Conference is where cutting-edge science and new business fundamentals intersect—and merge. Several of us will be there in September, discussing platforms, strategy, and tools. Let us know if you’ll be attending and would like to chat.

  • Data Dialogues: Data Strategy

    Online

    The Data Strategy track of our webinar series focuses on creating and continuously updating your data strategy. Register now!

  • OSCON Texas 2017

    Austin, TX

    OSCON is a long-running conference focused on open source technology and communities. We’ll be there talking about our “push button” infrastructure tool.

  • Enterprise Data World 2017

    Atlanta, GA

    Enterprise Data World focuses on data-driven business. Several of us will be there this year, talking about data platforms and enterprise data science. Let us know if you’ll be there, or you can sign up to receive our slides.

2016

  • Enterprise Dataversity 2016

    Chicago, IL

    Several of us will be in Chicago this year, presenting tutorials on data strategy, data platforms, and how to manage data science in the enterprise. CTO John Akred will also be taking part in a panel about how to strengthen your data strategy skills.

2015

  • UNSTRUCTURED Data Science Pop-Up

    Chicago, IL

    At this intimate day-long forum for data science practitioners, SVDS participates in a panel on data strategy moderated by O’Reilly Media’s Tim McGovern and gives a talk on running data science teams.

Matt Johanson

Matt comes to SVDS with over 13 years of experience bringing data solutions to large organizations in a variety of leadership and technical positions. Matt’s experience solving difficult problems with unique and innovative solutions is fueled by his passion for speed, efficiency, and value to the customer.

Recent Posts

Data Pipelines in Hadoop

In this post we’ll look at some real-world examples of managing headaches while moving to Hadoop.

Ryan Magnusson

Ryan has over 13 years of experience creating enterprise applications for giants in both the retail and credit card processing industries. He is an expert in Java development and has served on open source and Java standards committees.

Chloe Mawer

Coming from a background in geophysics and hydrology, Chloe is well-versed in leveraging data to make predictions and provide valuable insights. She has experience working on a wide variety of problems, ranging from developing a data strategy for a pharmaceutical company to devising a methodology for performing longitudinal consumer impact studies at a large retail company. With experience in both academic research and engineering, she tackles novel problems and creates practical, effective solutions. She has researched, written, and spoken on the subject of data valuation, both for monetization and for making internal decisions within an organization.

Chloe holds a PhD in Environmental Engineering from Stanford University. Her research there focused on developing methods for obtaining hydrologic insights from electrical data taken from the subsurface to better inform groundwater management decisions.

Recent Posts

Exploratory Data Analysis in Python

We summarize the objectives and contents of our PyCon tutorial, and then provide instructions for following along so you can begin developing your own EDA skills.

The Value of Exploratory Data Analysis

In this post, we will give a high-level overview of what EDA typically entails and then describe three of the major ways EDA is critical to successfully building a model and interpreting its results.

Driving Product Engagement with User Behavior Analytics

In this post, we will look at driving product engagement with behavioral data, as well as building an integrated analytical environment.

Data-Driven User Engagement

The promise of data and analytics for product companies is that they can help you understand usage, and improve your ability to build, deploy, and service products to customers much more accurately and efficiently. In this post, we look at understanding the customer life cycle.

Image Processing in Python

In this post, we use a Jupyter Notebook to go over the steps for creating a proof of concept for the image processing piece of our Caltrain work.

Introduction to Trainspotting

In this post we’ll start looking at the nuts and bolts of making our Caltrain work possible: image processing, video analysis, and image recognition.

Valuing Data is Hard

This article is the first in a series that I will be posting on the topic of thinking about data as an intangible asset, and how to value it as such.

Past Events

2017

  • PyCon 2017

    Portland, OR

    PyCon is the largest annual Python conference, and will be in Portland, OR this year. Our team will be there, talking about exploratory data analysis. Let us know if you’ll be there, or come say hi at our tutorial.

  • TDWI Accelerate Boston 2017

    Boston, MA

    We’ll be in Boston covering a variety of topics—from running agile data teams, to visual storytelling with data. Let us know if you’ll be there, or sign up to receive all our slides.

2016

  • Data Day Seattle 2016

    Seattle, WA

    Join us as CTO John Akred gives a talk on alternative approaches to valuing data within an organization, and Data Scientist Chloe Mawer demonstrates the power of Jupyter notebooks using a real-world train-detection problem. We’ll also present a tutorial on building data pipelines with Kafka and Spark.

  • PyCon 2016

    Portland, OR

    Data Scientist Chloe Mawer will be in Portland giving a presentation about our Caltrain research. Our VP of Data Science, Jeffrey Yau, will also be attending the conference. Be sure to find us and say hi!

    You can find Chloe’s slides here.

  • DataEDGE 2016

    Berkeley, CA

    VP of Data Science Jeffrey Yau, along with Data Scientists Chloe Mawer and Daniel Margala, will be presenting on predicting train delays. See more about our train work here.

2015

  • UNSTRUCTURED Pop-up Data Science

    Seattle, WA

    Join SVDS CTO John Akred and other panelists from Amazon, Match.com, and eBay for a fireside chat on recommendation engines. Then catch data scientist Chloe Mawer’s talk on figuring out how much your data is actually worth.

Harrison Mebane

With a background in theoretical physics research, Harrison brings a broad knowledge of computational and mathematical techniques for solving complex problems. He enjoys finding patterns and building predictive models from all types of datasets.

In addition to having a strong background in mathematics, Harrison is proficient in Python, Java, Scala, SQL, Spark, HBase, and Cassandra. While at SVDS, he built a PDF extractor based on natural language processing and machine learning, and developed a distributed data platform and APIs to execute supply-demand models for a Fortune 100 retailer’s customer-facing inventory management system. Harrison also implemented a Spark job for the retailer which was in production during the 2014 holiday season. He has published his work on these topics at O’Reilly OSCON and Strata.

Harrison holds a PhD in theoretical physics from the University of Illinois at Urbana-Champaign and a BA in physics from Harvard University.

Recent Posts

Talking About the Caltrain

On May 6th, SVDS hosted an Open Data Science Conference (ODSC) Meetup in our Mountain View headquarters. Data Engineer Harrison Mebane and Data Scientist Christian Perez presented on our Caltrain project.

Past Events

2015

  • Strata + Hadoop World 2015

    San Jose, CA

    Several of us will be presenting and we’d love to see you there. Join us for our tutorials and sessions, or come visit us at our booth in the Expo Hall.

Milenko Milanovic

Milenko has extensive experience architecting and implementing software solutions across a variety of industries. He enjoys finding simple solutions to complex problems. Above all, he is passionate about getting things done.

Matt Mollison

With a background in cognitive psychology and neuroscience, Matt has extensive experience in hypothesis testing and the analysis of complex datasets. He is excited about using predictive models and other statistical methods to solve real-world problems.

Recent Posts

TensorFlow RNN Tutorial

In this post, we’ll provide a short tutorial for training an RNN for speech recognition; we’re including code snippets throughout, and an accompanying GitHub repository. The software we’re using is a mix of borrowed and inspired code from existing open source projects.

Techniques and Technologies: Topology and TensorFlow

In early December we hosted a meetup featuring Dr. Alli Gilmore discussing topological data analysis and Dr. Andrew Zaldivar covering practical usage of TensorFlow.

Better Know the Districts

One might reasonably judge how well Congress reflects the views of the citizenry by examining the proportion of citizens who think Congress is doing a good job.

Silvia Oliveros

With a background in computer engineering and visual analytics, Silvia has worked on several projects helping clients explore and analyze their data. She is interested in building and optimizing the infrastructure and data pipelines used to gather insights from various datasets.

Silvia has given multiple talks at data science industry conferences, and is an author on multiple academic papers, including:

  • Oliveros-Torres, S., Yang, Y., Jang, Y., Ebert, D., “Visual Analytics for Risk-based Decision Making, Long-Term Planning, and Assessment Process,” Eurovis Workshop on Visual Analytics, 2014.
  • Malik, A., Maciejewski, R., Jang, Y., Oliveros, S., Yang, Y., Maule, B., White, M., Ebert, D., “A Visual Analytics Process for Maritime Response, Resource Allocation and Risk Assessment,” Information Visualization, 2012.
  • Oliveros, S., Eich-Miller, H., Boushey, C., Ebert, D., and Maciejewski, R., “Applied Visual Analytics for Exploring the National Health and Nutrition Examination Survey,” Hawaii International Conference on System Sciences, 2012.

Silvia holds an MS in Computer Engineering from Purdue University, and a BS in Computer Engineering from Michigan Technological University.

Recent Posts

How to Choose a Data Format

In this post we provide a framework for choosing a data format, and provide some example use cases.

Working Effectively in Data Science Teams

On April 21st, SVDS hosted the WWCode Silicon Valley chapter in our Mountain View office; we gave a talk titled Working Effectively in Data Science Teams.

SVDS at Strata San Jose 2016

Several of our presenters were interviewed at Strata San Jose. If you missed the conference, check out these interviews below to catch up on some of the topics that were on our minds.

How to Choose a Data Format

It’s easy to become overwhelmed when it comes time to choose a data format. In this post, Silvia gives you a framework for approaching this choice and provides some example use cases.

Past Events

2017

  • DataEngConf 2017

    San Francisco, CA

    DataEngConf features talks and workshops aimed at bridging the gap between data scientists, data engineers, and data analysts. We’ll be there, giving tips on choosing the right format for your data.

2016

  • Strata + Hadoop World

    San Jose, CA

    Many of us will be at the Strata Conference + Hadoop World 2016 in San Jose, and we’d love to see you there!