Successful Data Teams are Agile and Cross-Functional

April 21st, 2016

Editor’s note: Welcome to Throwback Thursdays! Every third Thursday of the month, we feature a classic post from the earlier days of our company, gently updated as appropriate. We still find them helpful, and we think you will, too! The original version of this post can be found here.

I built a lot of capabilities based on emerging technologies in my years delivering enterprise data and analytical solutions. I was always struck by how the Silicon Valley startups I worked with could do so much more, with so much less. I’ve come to learn, sometimes the hard way, that there are critical elements of the “who” and the “how,” particular to those start-up teams, that contribute to their success. It’s why we named our company for Silicon Valley: a lightweight, agile approach to data-driven product development was pioneered here.

The How

Large enterprises prefer contracting for well-defined scope, with certainty in price. While understandable, that style of development comes at a steep cost. It forces the project to organize around delivering what was promised, not necessarily delivering value. The paperwork needed to prove that the project did what was contracted for can account for an enormous amount of cost, effort, and distraction. Much of a startup’s efficiency can be attributed to the fact that startups building their own products are not typically doing their core development under contract.

A key element of Silicon Valley software product delivery teams is an agile delivery methodology—oil to the water of firm fixed price contracts. Agile methods let you iterate your way to value, instead of clinging to a plan that can’t possibly anticipate the challenges and opportunities in using data to build a product. Agile is becoming more and more common, even in larger enterprises. It is critical for data-driven development.

Data science is fundamentally about developing a deeper understanding of phenomena by experimentation and observation. While one often has an expectation about the outcome of an experiment, if it were a foregone conclusion you wouldn’t conduct the experiment in the first place. Whatever your method, you are planning under uncertainty. Traditional delivery methods seek to plan as much as possible and overcome the uncertainty; agile methods embrace uncertainty and minimize planning, so the team is best prepared to deal with roadblocks or capitalize on opportunities. In my experience, agility gets you to value more quickly, with a lot less pain and much happier customers.

The Who

DJ Patil led one of the most famous, pioneering data-driven product development teams. He has written about the challenges of fielding them in Building Data Science Teams (O’Reilly, 2011). Patil recounts how he and Jeff Hammerbacher “intentionally kept the distinction between roles [on their teams] blurry.” The teams weren’t “comprised solely of mathematicians and other ‘data people,’” either. That is because the “silos that have traditionally separated data people from engineering, from design, and from marketing, don’t work when you’re building data products.”

Companies looking to build such teams face a daunting challenge. Large-scale efforts often depend on interchangeable resources to field and maintain large teams over extended periods of time. Entire departments are formed around functional areas like marketing, and the disciplines don’t sit together or, frankly, know how to communicate with each other.

The best data-driven product companies build with small, agile teams of data scientists and engineers who are good at working together. The engineer understands the algorithm that the applied math person ends up using, because they have been discussing it as it evolves. Each is aware of the other’s progress and can offer ideas. This approach results in an effective algorithm that scales, and teammates who will share a beer in celebration, rather than retreating to separate taverns to drink away their mutual frustration.

An even bigger opportunity lies in effectively using the misfits: the genre-defying mash-up artists of the data domain. Like the indie-rockers who pioneered alt-country, in the world of data these talented folks can span the domains of engineering, statistics, distributed systems, operations research, database systems, natural language processing, applied mathematics, and sometimes even a little guitar. In the cookie-cutter world of large-scale efforts, these people have to pick a poison. But small teams that stay together can build around those eccentricities, and make the best use of them, since their members don’t have to be interchangeable.

These highly productive, cross-functional teams have another thing going for them: trust. When organizing for an 18-24 month run at a product on seed funding, you are in it together, and you need to depend on each other. The team quickly learns who delivers on their promises. Like a good sports team, the whole group gets court vision. They know what to expect from each other and start to make no-look passes. In the practice of data-driven product development, this means less overhead. Most gifted data people recoil at paperwork, and on these teams there is much less writing down and much more doing.

Technology has given us a great boost in productivity. Architectures have been developed expressly for enabling rapid, iterative, data-driven product development. This means we can now do with 10 people what we used to do with 30. At KDD in Chicago, Microsoft talked about building data products into Bing that added hundreds of millions of dollars in revenue, developed in days [1]. Our delivery approaches have adapted to that reality, as has the way we organize our teams.

At Silicon Valley Data Science, we have adapted these approaches to delivering data-driven products and services via consulting engagements. We make this kind of development available to many enterprises, enabling them to create and transform their products on the base of their strategic data assets.

[1] Predictive Model Performance: Online and Offline Evaluations (Short Talk). Authors: Jeonghee Yi, Microsoft; Ye Chen, Microsoft; Jie Li, Microsoft; Swaraj Sett, Microsoft; Tak Yan, Microsoft.