Strata + Hadoop World New York 2016

The SVDS crew will be in New York this year, and we look forward to seeing you. Come by our talks, or catch us in the hallway track.

Tuesday, September 27


Architecting a Data Platform

9:00am–12:30pm in 3D 12

What are the essential components of a data platform? This tutorial will explain how the various parts of the Hadoop, Spark, and big data ecosystem fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.

By tracing the flow of data from source to output, we’ll explore the options and considerations for components, including:

  • Acquisition: from internal and external data sources
  • Ingestion: offline and real-time processing
  • Storage
  • Analytics: batch and interactive
  • Providing data services: exposing data to applications

We’ll also give advice on:

  • Tool selection
  • The function of the major Hadoop components and other big data technologies such as Spark and Kafka
  • Integration with legacy systems

The business case for Spark, Kafka, and friends

9:00am–9:30am in 1B 01/02

Spark is white-hot at the moment, but why does it matter? The secret power of big data technologies is that they promote flexible development patterns and economic scaling and are ready to adapt to business needs. But years of focusing on the label “big” have obscured much of that value for those approaching the topic. Skepticism and hype-fatigue are understandable reactions.

Developers are usually the first to understand why some technologies cause more excitement than others. Edd Wilder-James relates this insider knowledge in a tour of the hottest emerging data technologies of 2016, explaining why they’re exciting in terms of both the new capabilities and the new economies they bring. Edd explores the emerging platforms of choice, where they fit into a complete data architecture, and what they offer in efficiency and economy of use.

Topics include:

  • Spark
  • Kafka
  • Docker and containers
  • Notebooks

Developing a Modern Enterprise Data Strategy

1:30pm–5:00pm in 1 E 15/1 E 16

Big data and data science have great potential for accelerating business, but how do you reconcile the business opportunity with the sea of possible technical solutions? Fundamentally, data should serve the strategic imperatives of a business: the key aspirations that define the future vision for an organization. A data strategy should guide your organization in two key areas: what actions your business should take to get started with data, and where to start to realize the most value.

In this tutorial, we will explain how we work to solve real business challenges with data, including the following topics:

  • Why Have A Data Strategy?
  • Connecting Data With Business
  • Devising A Data Strategy
  • The Data Value Chain
  • New Technology Potentials
  • Project Development Style
  • Organizing To Execute Your Strategy

Wednesday, September 28


Beyond the numbers: Expanding the size of your analytic discovery team

4:35pm–5:15pm in 1 C04 / 1 C05

To thrive as a data-driven enterprise, organizations must train their eye for the best opportunities for analytic impact. Creative problem-solving through collaboration enables each individual working with data to use their strengths and specialized skills. Testing hypotheses for business impact requires ways to share both the hypotheses and the data assets used to test them, through tools such as data inventories, data catalogs, code repositories, and data visualization tools.

Join moderator Edd Wilder-James of Silicon Valley Data Science and analytics and data experts from global pharmaceutical company Pfizer, the City of San Diego, and Neustar, the first real-time provider of cloud-based information services, for a lively discussion. The panel will cover how these organizations have tapped into the strengths of a broad team of professionals beyond the analytics specialists, and how they are reimagining organizational structure to support creative analytic thinking.

Thursday, September 29


Ask me anything: Developing a modern enterprise data strategy

4:35pm–5:15pm in 1 C03

John Akred, Scott Kurth, and Julie Steele field a wide range of detailed questions on topics including:

  • Developing a modern data strategy
  • Architecting a data platform
  • The evolving role of the CDO and related best practices

Even if you don’t have a specific question, join in to hear what others are asking.