I wrote this down for Hacker News a couple of weeks ago; here is a recap with some updates:
I am a computational biologist with a heavy emphasis on data analysis. I tried Jupyter a couple of years ago, and here are my concerns with it compared to my usual flow (PyCharm + pure Python + pickle to store the results of heavy processing).
- Extracting functions is harder
- Your git commits become completely borked
- Opening some data-heavy notebooks is nigh impossible once they have been shut down
- Importing your own local modules is pretty non-trivial
- Refactoring is pretty hard
- Sphinx for autodoc extraction is pretty much out of the picture
- Non-deterministic re-runs: depending on the cell execution order you can get very different results. That’s an issue when you come back to your code a couple of months later and try to figure out what you did to get there.
- Connecting to the IPython notebook, even from environments like PyCharm, is highly non-trivial, as is the mapping to the OS
- Hard to impossible to inspect the contents of an IPython notebook when it’s hosted on GitHub, due to encoding snafus
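To make the re-run point concrete, here is a toy sketch, not taken from any real notebook. The "cells" are hypothetical functions that share one namespace, the way notebook cells share kernel state; all names and numbers are made up for illustration.

```python
# Hypothetical "cells" that read and write a shared namespace,
# mimicking how notebook cells share the kernel's global state.

def cell_load(ns):
    # Cell 1: load some (made-up) measurements.
    ns["values"] = [1, 2, 3, 4]

def cell_scale(ns):
    # Cell 2: unit conversion that overwrites the data in place.
    ns["values"] = [v * 10 for v in ns["values"]]

def cell_summarize(ns):
    # Cell 3: compute the final result from whatever state exists now.
    ns["result"] = max(ns["values"])

def run(cells):
    # Execute the cells in the given order against a fresh namespace.
    ns = {}
    for cell in cells:
        cell(ns)
    return ns["result"]

top_to_bottom = run([cell_load, cell_scale, cell_summarize])
scale_skipped = run([cell_load, cell_summarize])
scale_run_twice = run([cell_load, cell_scale, cell_scale, cell_summarize])

print(top_to_bottom, scale_skipped, scale_run_twice)  # 40 4 400
```

Same code, three different answers, and nothing in the saved file tells you which order was actually used. A plain script only has the top-to-bottom order.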
There are likely workarounds for most of these problems, but the issue is that with my standard workflow they are non-issues to begin with.
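For reference, the "pickle to store results of heavy processing" part of that workflow can be as small as the sketch below. The helper name `cached`, the file name, and `heavy_processing` are all hypothetical placeholders, not code from my actual projects.

```python
import pickle
from pathlib import Path


def cached(path, compute):
    """Return the pickled result at `path`, computing and saving it if missing.

    `compute` is any zero-argument callable that does the heavy work.
    """
    path = Path(path)
    if path.exists():
        # Reuse the stored result instead of recomputing it.
        with path.open("rb") as fh:
            return pickle.load(fh)
    result = compute()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


def heavy_processing():
    # Stand-in for an expensive analysis step (hypothetical data).
    return {"n_samples": 3, "means": [0.5, 1.5, 2.5]}


# Usage (writes results.pkl on the first call, reads it afterwards):
#     results = cached("results.pkl", heavy_processing)
```

Deleting the pickle file forces a recompute, so stale results are easy to get rid of. Only load pickle files you created yourself, since unpickling can execute arbitrary code.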
In my experience, Jupyter is pretty good if you rely only on existing libraries that you are piecing together, but once you need to do more involved development work, you are screwed.