Themes from JupyterCon 2017

September 7th, 2017

This past August saw the first JupyterCon, an O’Reilly-sponsored conference on the Jupyter ecosystem, held in NYC. I attended on behalf of Silicon Valley Data Science (SVDS) and presented a poster. We make extensive use of Jupyter (Notebook, Hub, nbconvert, etc.) in our data science consulting work and love to show our support for open source projects. JupyterCon was one of the best conferences that I’ve been to, and I learned a great deal in the few days that I was there. Several themes emerged during the conference that I would like to highlight:

  • reproducible science and collaboration
  • Jupyter for teaching
  • future possibilities for Project Jupyter

In this post I will present a number of talks grouped by their themes, with some thoughts surrounding them.

Reproducible (data) science and collaboration

The Jupyter Project grew out of the IPython framework, which was started by an academic (Fernando Perez) as an “afternoon hack.” From the beginning, the project focused on how to better use computational tools to solve problems faced by working scientists. That pedigree still shows today in many ways: reproducibility and collaboration are key concepts in science, and a number of talks at JupyterCon addressed them.

The following two keynotes spoke at a high level about collaboration and reproducibility.

The next two talks get more into implementation specifics.

  • In Design for reproducibility, Lorena Barba tackled the challenge of reproducibility directly.
  • In How Jupyter makes experimental and computational collaborations easy, Zach Sailer explained how his collaboration combines the various pieces of the Jupyter ecosystem (what he called orbit) to develop, communicate, and share their science. His slides are available here, with a YouTube video hopefully coming online soon.
  • I presented a poster based on my previous work on collaboration for data science teams. A number of people stopped by during the poster session to chat about the challenges they face in working on teams and to share ideas for solutions.

Jupyter for teaching

Jupyter Notebooks allow teachers to give students a document that interleaves narrative and description with interactive code snippets and challenges. Used properly, this makes them an excellent pedagogical tool. Of course, deploying code that is meant to be altered by students on their own laptops, with every conceivable hardware configuration, can be a daunting task. A number of talks described how their speakers tackled this challenge.

The future of Jupyter/JupyterLab

What’s next for the Jupyter Project? Throughout the conference, the message was JupyterLab, a new frontend to many of the tools that exist in the Jupyter ecosystem. JupyterLab was demoed in tutorials and talks throughout, and the newest version (0.27) was released on the first day of the conference.

This does not mean that Jupyter Notebooks are going away: they feature prominently within JupyterLab. Having played around with earlier versions of JupyterLab, I was very happy with the newest release, which feels like it has come a long way. A few talks from the Jupyter team demonstrated what JupyterLab offers.

  • I recommend looking at JupyterLab: The next-generation Jupyter frontend when it comes out on YouTube. The core developers of the Jupyter Project demonstrated the newest version of JupyterLab. They did a nice job selling the improvements over a simple Jupyter Notebook server.

Other thoughts about the future centered on deploying something interactive from Jupyter so that other users can gain insight from an analysis.

Wrapping up

Finally, I want to highlight a talk that doesn’t really fit any of these themes, but simply blew me away.

  • A billion stars in the Jupyter Notebook by Maarten Breddels. Keep an eye out for the video, but let me assure you that it’s far more than simply plotting a billion stars. The talk demonstrated the amazing capabilities of several visualization libraries; from the conference description: “Maarten Breddels offers an overview of vaex, a Python library that enables calculating statistics for a billion samples per second on a regular n-dimensional grid, and ipyvolume, a library that enables volume and glyph rendering in Jupyter notebooks. Together, these libraries allow the interactive visualization and exploration of large, high-dimensional datasets in the Jupyter Notebook.” Maarten fully delivered on these promises!
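
To make "statistics on a regular n-dimensional grid" concrete, here is a minimal pure-Python sketch of the idea in two dimensions: each sample is assigned to a grid cell, and a count and mean are accumulated per cell. It illustrates the concept only. It is not the vaex API, and the function name `grid_mean` and its parameters are hypothetical names chosen for this example.

```python
def grid_mean(xs, ys, values, limits, shape):
    """Count samples and average `values` in each cell of a regular 2-D grid.

    Hypothetical helper for illustration only; this is not how vaex is called.
    limits: ((xmin, xmax), (ymin, ymax)); shape: (nx, ny) number of cells.
    """
    (xmin, xmax), (ymin, ymax) = limits
    nx, ny = shape
    counts = [[0] * ny for _ in range(nx)]
    sums = [[0.0] * ny for _ in range(nx)]

    for x, y, v in zip(xs, ys, values):
        # Skip samples that fall outside the grid.
        if not (xmin <= x < xmax and ymin <= y < ymax):
            continue
        # Map each coordinate to a cell index; clamp to guard against rounding.
        i = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        j = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        counts[i][j] += 1
        sums[i][j] += v

    # Empty cells have no mean, so they are reported as None.
    means = [
        [sums[i][j] / counts[i][j] if counts[i][j] else None for j in range(ny)]
        for i in range(nx)
    ]
    return counts, means


counts, means = grid_mean(
    xs=[0.1, 0.2, 0.9, 1.5],
    ys=[0.1, 0.2, 0.9, 0.5],
    values=[1.0, 3.0, 5.0, 7.0],
    limits=((0.0, 1.0), (0.0, 1.0)),
    shape=(2, 2),
)
```

Because the output is a small fixed-size grid rather than one value per sample, the result stays cheap to plot no matter how many samples go in, which is what makes this kind of aggregation attractive for interactive exploration of very large datasets.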

Overall, it was an excellent conference, and I learned a lot. Were you there? Tell us about your favorite sessions in the comments.