
Exploratory Data Analysis in Python
We summarize the objectives and contents of our PyCon tutorial, and then provide instructions for following along so you can begin developing your own EDA skills.
Coming from a background in geophysics and hydrology, Chloe is well-versed in leveraging data to make predictions and provide valuable insights. She has experience working on a wide variety of problems, ranging from developing a data strategy for a pharmaceutical company to devising a methodology for performing longitudinal consumer impact studies at a large retail company. With experience in both academic research and engineering, she tackles novel problems and creates practical, effective solutions. She has researched, written, and spoken on the subject of data valuation, both for monetization and for making internal decisions within an organization.
Chloe holds a PhD in Environmental Engineering from Stanford University. Her research there focused on developing methods for obtaining hydrologic insights from electrical data taken from the subsurface to better inform groundwater management decisions.
In this post, we will give a high-level overview of what EDA typically entails and then describe three of the major ways EDA is critical to successfully building a model and interpreting its results.
In this post, we will look at driving product engagement with behavioral data, as well as building an integrated analytical environment.
The promise of data and analytics for product companies is that they can help you understand usage, and improve your ability to build, deploy, and service products to customers much more accurately and efficiently. In this post, we look at understanding the customer life cycle.
In this post, we use a Jupyter Notebook to go over the steps for creating a proof of concept for the image processing piece of our Caltrain work.
In this post we’ll start looking at the nuts and bolts of making our Caltrain work possible: image processing, video analysis, and image recognition.
This article is the first in a series that I will be posting on the topic of thinking about data as an intangible asset, and how to value it as such.
PyCon is the largest annual Python conference, and will be in Portland, OR this year. Our team will be there, talking about exploratory data analysis. Let us know if you’ll be there, or come say hi at our tutorial.
We’ll be in Boston covering a variety of topics—from running agile data teams, to visual storytelling with data. Let us know if you’ll be there, or sign up to receive all our slides.
Join us as CTO John Akred gives a talk on alternative approaches to valuing data within an organization, and Data Scientist Chloe Mawer demonstrates the power of Jupyter notebooks using a real-world train-detection problem. We’ll also present a tutorial on building data pipelines with Kafka and Spark.
Data Scientist Chloe Mawer will be in Portland giving a presentation about our Caltrain research. Our VP of Data Science, Jeffrey Yau, will also be attending the conference. Be sure to find us and say hi!
You can find Chloe’s slides here.
VP of Data Science Jeffrey Yau, along with Data Scientists Chloe Mawer and Daniel Margala, will be presenting on predicting train delays. See more about our train work here.