Strata + Hadoop World London 2017

Strata + Hadoop World London focuses on how to make data-driven decisions across industries. Several of us will be there in May, discussing platforms, strategy, and tools. Let us know if you’ll be attending and would like to chat.

Tuesday, May 23

Data 101: The Business Case for Deep Learning, Spark, and Friends

9:00am-12:30pm in Capital Suite 14

Data 101 comprises several talks focused on the fundamentals of using data in your business. One of them is CEO Sanjay Mathur’s review of business technologies. Deep learning is white-hot at the moment, but why does it matter? Developers are usually the first to understand why some technologies cause more excitement than others. Sanjay relates this insider knowledge, providing a tour through the hottest emerging data technologies of 2017 to explain why they’re exciting in terms of both new capabilities and the new economies they bring.

Architecting a Data Platform

9:00am-12:30pm in Capital Suite 8

What are the essential components of a data platform? John Akred and Stephen O’Sullivan explain how the various parts of the Hadoop, Spark, and big data ecosystems fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.

By tracing the flow of data from source to output, John and Stephen explore the options and considerations for components, including acquisition from internal and external data sources, ingestion (offline and real-time processing), storage, analytics (batch and interactive), and providing data services (exposing data to applications). They’ll also give advice on tool selection, the function of the major Hadoop components and other big data technologies such as Spark and Kafka, and integration with legacy systems.

Developing a Modern Enterprise Data Strategy

1:30pm-5:00pm in Capital Suite 12

Big data and data science have great potential for accelerating business, but how do you reconcile the business opportunity with the sea of possible technical solutions? Fundamentally, data should serve the strategic imperatives of a business—those key strategic aspirations that define the future vision for an organization. A data strategy should guide your organization in two key areas—what actions your business should take to get started with data and where to start to realize the most value.

Edd Wilder-James and Scott Kurth explain how to create a modern data strategy to power data-driven business.

Topics include:

  • Why have a data strategy?
  • Connecting data with business
  • Devising a data strategy
  • The data value chain
  • New technology potentials
  • Project development style
  • Organizing to execute your strategy

Thursday, May 25

Ask me anything: Developing a modern enterprise data strategy


John Akred, Scott Kurth, and Stephen O’Sullivan field a wide range of detailed questions about developing a modern data strategy, architecting a data platform, and best practices for the CDO and how that role is evolving. Even if you don’t have a specific question, join in to hear what others are asking.

What's Your Data Worth?

4:35pm-5:15pm in Capital Suite 17

The unique properties of data make assessing its value difficult when using the traditional approaches of intangible asset valuation. John Akred discusses a number of alternative approaches to valuing data within an organization for specific purposes, including informing decisions to purchase third-party data and monitoring data’s value internally to manage and increase that value over time.

Data is difficult to value in large part because, economically, it does not adhere to the three main conditions of a traditional market system. In addition, traditional methods of valuing intangible assets run into trouble when applied to data:

  • Cost: Data is often produced as a byproduct of other business processes, making its cost hard to pin down.
  • Comparables: Data varies greatly by content and quality, so comparables are difficult to find.
  • Forecasts: The dominance of data aggregators and one-on-one deals in the buying and selling of data obscures the prices of any comparables that may actually exist in the market.

While a traditional valuation of data may not be applicable, John explores data’s value in the context of specific uses and intentions within an organization. He shares several examples of how to use methods such as the value of information (VOI) framework and A/B testing to assess whether a third-party data source should be purchased or should continue to be purchased, and he demonstrates how mutual information (MI) can be used to assess the value of a data source once it is in use within the organization.
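For readers unfamiliar with mutual information, here is a minimal sketch of the general idea, not the method presented in the talk. It assumes the data source and the business outcome can both be treated as discrete values observed in pairs; the function name `mutual_information` and the toy samples are hypothetical and chosen only for illustration. A higher score means the source tells you more about the outcome.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from paired samples of two discrete variables."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: a signal that fully determines the outcome,
# and one that is statistically independent of it.
outcome = [0, 1, 0, 1]
informative_signal = [0, 1, 0, 1]
unrelated_signal = [0, 0, 1, 1]

print(mutual_information(informative_signal, outcome))  # 1.0
print(mutual_information(unrelated_signal, outcome))    # 0.0
```

In practice the estimate depends heavily on sample size and on how continuous fields are binned, so a score like this is best read as a relative comparison between candidate sources rather than an absolute value.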

John concludes by discussing the qualities that make data more valuable within an organization and by providing a range of concrete, straightforward metrics for monitoring data’s value internally, so that business decisions can be made to maximize that value over time.