Strata + Hadoop World CA 2017

Name: Strata + Hadoop World CA 2017
Start: 2017-03-13T00:00:00-07:00
End: 2017-03-16T23:59:59-07:00
Location: San Jose Convention Center

Many of us will be at Strata in San Jose, and we’d love to see you there!

Tuesday, March 14

Architecting A Data Platform

9:00am-12:30pm in 210 D/H

John Akred, Stephen O'Sullivan

What are the essential components of a data platform? John Akred and Stephen O’Sullivan explain how the various parts of the Hadoop, Spark, and big data ecosystems fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.

By tracing the flow of data from source to output, John and Stephen explore the options and considerations for components, including acquisition from internal and external data sources; ingestion (offline and real-time processing); storage; analytics (batch and interactive); and providing data services (exposing data to applications). They’ll also give advice on tool selection, the function of the major Hadoop components and other big data technologies such as Spark and Kafka, and integration with legacy systems.

The Business Case for Deep Learning, Spark, and Friends

9:05am-9:30am in LL20 C

Edd Wilder-James

Technologies like deep learning are white-hot, but why do they matter? The secret power of today’s data technologies is that they promote economic scaling and flexible development patterns that can adapt to business needs—but industry hype has obscured much of the value to those approaching the topic. Skepticism is an understandable reaction.

Developers are usually the first to understand why some technologies cause more excitement than others. Edd Wilder-James relates this insider knowledge, providing a tour through the hottest emerging data technologies of 2017 to explain why they’re exciting in terms of both new capabilities and the new economies they bring. Edd explores the emerging platforms of choice and explains where they fit into a complete data architecture and what they have to offer in terms of new capabilities, efficiencies, and economies of use.

Topics include:

Deep learning and AI
Spark
Docker and containers
Notebooks for data science

Developing a Modern Enterprise Data Strategy

1:30pm-5:00pm in 210 B/F

Edd Wilder-James, Scott Kurth

Big data and data science have great potential for accelerating business, but how do you reconcile the business opportunity with the sea of possible technical solutions? Fundamentally, data should serve the strategic imperatives of a business—those key strategic aspirations that define the future vision for an organization. A data strategy should guide your organization in two key areas—what actions your business should take to get started with data and where to start to realize the most value.

Edd Wilder-James and Scott Kurth explain how to create a modern data strategy to power data-driven business.

Topics include:

Why have a data strategy?
Connecting data with business
Devising a data strategy
The data value chain
New technology potentials
Project development style
Organizing to execute your strategy

Wednesday, March 15

Office Hour with John Akred and Stephen O'Sullivan

11:50am-12:30pm at Table A

John Akred, Stephen O'Sullivan

John and Stephen will be discussing data strategy and platform architecture.

Thursday, March 16

Ask me anything: Developing a modern enterprise data strategy

11:00am-11:40am in 212 A-B

John Akred, Scott Kurth, Julie Steele, Stephen O'Sullivan

John Akred, Julie Steele, Stephen O’Sullivan, and Scott Kurth field a wide range of detailed questions about developing a modern data strategy, architecting a data platform, and best practices for the CDO role as it evolves. Even if you don’t have a specific question, join in to hear what others are asking.

Graph-based Anomaly Detection: When and How

2:40pm-3:20pm in 210 A/E

Jeffrey Yau

Thanks to frameworks such as Spark’s GraphX and GraphFrames, graph-based techniques are increasingly applicable to anomaly, outlier, and event detection in time series. However, most data do not naturally come in the form of a network that can be represented in graphs. Therefore, it is not clear whether graph-based techniques always offer the most appropriate approach to detect anomalies.

Jeffrey Yau offers an overview of applying graph-based techniques and outlines the benefits of graphs relative to other techniques. Jeffrey compares and contrasts the use of graph theory and techniques, large-scale time series mining methods, and traditional parametric linear and nonlinear time series techniques in anomaly, outlier, and event detection—with specific examples from credit card fraud, wearable IoT devices, and financial time series.

Topics include:

Static graphs
Dynamic graphs
The most common large-scale time series mining methods
Traditional parametric linear and nonlinear time series techniques, including change-point detection
The trade-offs involved in applying each of these classes of techniques to identify anomalies

Event Speakers

John Akred

With over 15 years in advanced analytical applications and architecture, John is dedicated to helping organizations become more data-driven. He combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.

Stephen O’Sullivan

A leading expert on big data architecture and Hadoop, Stephen brings over 20 years of experience creating scalable, high-availability data and application solutions. A veteran of WalmartLabs, Sun, and Yahoo!, Stephen leads data architecture and infrastructure.

Edd Wilder-James

Founder of the pioneering data conference, O’Reilly Strata, Edd is a respected voice in the worlds of data, open source and the web. Bringing together deep technical know-how with market understanding, Edd makes sense of information technology and its trajectory.

Scott Kurth

Building on 20 years of experience making emerging technologies relevant to enterprises, Scott crafts vision and strategy for organizations. With a background in architecture and engineering, he combines deep technical knowledge with a broad perspective to focus on business value.

Jeffrey Yau

An expert in quantitative modeling, Jeffrey brings over 17 years of experience applying econometric, statistical, and mathematical modeling techniques to real-world challenges.
