Archive for the ‘Architecture’ Category

The Data Platform Puzzle

Building or rebuilding a data platform can be a daunting task, as most questions that need to be asked have open-ended answers. But that doesn’t mean you have to guess and use your gut.

Models: From the Lab to the Factory

Deploying a model without a rigorous process in place has consequences. We go over techniques for successful deployment and management.

Driving Product Engagement with User Behavior Analytics

In this post, we will look at driving product engagement with behavioral data, as well as building an integrated analytical environment.

We Need a New Data Architecture: What Next?

In this revamped classic, Edd looks at the challenges of moving forward with a new architecture, and where you need to start.

How I Learned to Stop Worrying and Love Ephemeral Storage

This post shows architects and developers how to set up Hadoop to communicate with S3, run Hadoop commands directly against S3, use distcp to transfer data between Hadoop and S3, and use distcp to run regular updates that copy only the differences.

Building Pipelines to Understand User Behavior

In this post, we cover what’s needed to understand user activity, and we look at some pipeline architectures that support this analysis.

Kafka Simple Consumer Failure Recovery

This post walks you through a simple failure recovery mechanism, as well as a test harness that allows you to make sure this mechanism works as expected.

Building Data Systems: What Do You Need?

In this post, we go over the capabilities you need in place to build and maintain data systems and data infrastructure successfully.

Understanding Modern Data Systems

In this post, Fausto talks about the characteristics that differentiate data infrastructure development from traditional development, and highlights key issues to look out for.