We Need a New Data Architecture: What Next?

July 28th, 2015

Enterprise IT faces a pressing need to understand the new data architectures required to support business. The demand is universal, due to massive volumes of data from internet and mobile applications; the need to generate competitive advantage from new types of information, such as images and social media; and the expectation that we create the kinds of online analytical products that users are now familiar with thanks to LinkedIn, Google, and other web services.

It’s clear from the explosion of interest in newer platforms and technologies that the old tools, and their licensing costs, no longer meet new business needs. Open source, cloud, and scale-out distributed systems create a new cost model.

The path ahead isn’t as clear, however. Some early businesses rode the NoSQL wave with enthusiasm, but discovered that no single type of database meets their needs: many such scale-out databases come with hefty requirements for extra programming. Early pioneers are reverting to relational databases or, more commonly, adopting a hybrid architecture.

The world has moved into a business technology space-race era of sorts: it’s not enough to support stable business processes; IT must also support innovation and iteration. Early adopters in that race reap the rewards of trying audacious new things—but they also bear the non-trivial costs when some of those things explode.

At SVDS, we hone our approach through R&D experimentation, then export the learnings of early adopters in the Valley in a way that makes sense at enterprise scale. Adopting and engineering new data platforms is an inescapable requirement for most businesses, and we have devised a method for creating modern data architectures that work, even in the face of rapid change.

A good modern data architecture:

supports current and future technical capabilities,
considers fit with existing architecture,
selects the appropriate technology platforms, and
delivers a plan for implementation.

Being adaptable and future-proof means that you need to spend a lot of time considering what we call the “data value chain”—the stages of data as it enables your business: discovery, ingest, processing, persistence, integration, analysis, and exposure. Your current and future requirements at these stages often have the largest influence on technology selection.

For example, the need for real-time analytics not only mandates certain performance requirements for data processing, but also requires service guarantees from data ingest, and has an effect on how the results of those analytics are exposed back to the organization.
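One way to make that ripple effect concrete is to record each requirement against the value chain stages it touches. The sketch below is only an illustration, not a description of our tooling: the `Requirement` class, the `constraints_by_stage` helper, and the constraint wording are hypothetical names chosen for this example. Only the seven stage names and the real-time analytics scenario come from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The seven stages of the data value chain, in the order given in the article.
STAGES = ["discovery", "ingest", "processing", "persistence",
          "integration", "analysis", "exposure"]


@dataclass
class Requirement:
    """A business requirement and the constraint it places on each stage.

    Hypothetical structure, used here for illustration only.
    """
    name: str
    constraints: Dict[str, str] = field(default_factory=dict)


def constraints_by_stage(requirements: List[Requirement]) -> Dict[str, List[str]]:
    """Group every constraint under the stage it affects, in stage order."""
    grouped: Dict[str, List[str]] = {stage: [] for stage in STAGES}
    for req in requirements:
        for stage, constraint in req.constraints.items():
            if stage not in grouped:
                raise ValueError(f"unknown stage: {stage}")
            grouped[stage].append(f"{req.name}: {constraint}")
    return grouped


# The real-time analytics example from the paragraph above.
real_time = Requirement(
    name="real-time analytics",
    constraints={
        "processing": "performance requirements",
        "ingest": "service guarantees",
        "exposure": "affects how results are exposed back to the organization",
    },
)

by_stage = constraints_by_stage([real_time])
for stage in STAGES:
    print(stage, "->", by_stage[stage] or "no constraint recorded")
```

Laying requirements out this way shows at a glance which stages a single business need touches, which is where technology selection tends to get decided.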

By far the biggest factor in any architecture analysis is the needs of the business itself. An effective modern CIO is one who creates an enabling platform for the business to innovate and build upon. The IT mindset must transition away from policies governing users towards tools that enable them.

That’s why you can’t just “get a Hadoop” and make everybody happy: simply installing a tool and kicking the tires doesn’t translate into a serious-minded exploration of your present and future data needs. In other words: modern data architectures are not a shrink-wrap problem. (If it were that easy, then the business advantages wouldn’t be so radical.)

I’ll be moderating an Ask Us Anything panel to talk more about our approach to data architecture at our August 6 webinar, Data Lakes in the Real World. I hope you’ll join us there or at one of our forthcoming public tutorials:

NoSQL Now, August 18–20, San Jose, CA (Developing an Effective Data Strategy, Architecting a Data Platform)
Strata + Hadoop World New York, September 29–October 1 (Developing a modern enterprise data strategy, Architecting a data platform)
