Posts in report

Ask Twitter: Why don’t academic researchers use cloud services?

this is an experiment at making my Twitter conversations a bit more useful and archivable over time. It’s going to be a bit messy and unpolished, but hopefully that makes it more likely I’ll actually do it :-)

Over the past decade, cloud infrastructure has become increasingly popular in industry. An ecosystem of modular tools and cloud services (often called the Modern Data Stack) has filled many data needs for companies.

Read more ...


What do people think about rST?

Publishing computational narratives has always been a dream of the Jupyter Project, and there is still a lot of work to be done in improving these use-cases. We’ve made a lot of progress in providing open infrastructure for reproducible science with JupyterHub and the Binder Project, but what about the documents themselves? We’ve recently been working on tools like Jupyter Book, which aim to improve the writing and publishing process with the Jupyter ecosystem. This is hopefully the first post of a few that ask how we can best-improve the state of publishing with Jupyter.

Many of the ideas in this post have now made their way into a new flavor of markdown called Markedly Structured Text, or MyST. It brings all of the features of rST into Markdown. Check it out!

Read more ...


Thoughts from the Jupyter team meeting 2019

I just got back from a week-long Jupyter team meeting that was somehow both very tiring and energizing at the same time. In the spirit of openness, I’d like to share some of my experience. While it’s still fresh in my mind, here are a few takeaways that occurred to me throughout the week.

Note that these are my personal (rough) impressions, but they shouldn’t be taken as a statement from the project/community itself.

Read more ...


Summer conference report back

This is a short update on several of the conferences and workshops over the summer of this year. There’s all kinds of exciting things going on in open source and open communities, so this is a quick way for me to collect my thoughts on some things I’ve learned this summer.

Pangeo is a project that provides access to a gigantic geosciences dataset. They use lots of tools in the open-source community, including Dask for efficient numerical computation, the SciPy stack for a bunch of data analytics, and JupyterHub on Kubernetes for managing user instances and deploying on remote infrastructure. Pangeo has a neat demo of their hosted JupyterHub instance that people can use to access this otherwise-inaccessible dataset! See their video from SciPy below.

Read more ...


An academic scientist goes to DevOps Days

Last week I took a few days to attend DevOpsDays Silicon Valley. My goal was to learn a bit about how the DevOps culture works, what are the things people are excited about and discuss in this community. I’m also interested in learning a thing or two that could be brought back into the scientific / academic world. Here are a couple of thoughts from the experience.

tl;dr: DevOps is more about culture and team process than it is about technology, maybe science should be too…

Read more ...