Jupyter notebooks let data scientists explore data and prototype solutions much like walking: put your left foot forward (code), then your right foot (plot), then your left foot again (code), rinse and repeat. This is in no small part thanks to the JavaScript-based GUI that talks to Python kernels under the hood, powering the immediate interactivity that underlies the modern data science workflow.
When that immediate interactivity is missing, its absence is sorely felt.
Displaying textual content in Jupyter notebooks should be fast
Once, I was going through code submitted by hackathon participants for archival purposes. I didn’t need to run any notebooks; I just needed to skim through them as part of the process. One particular notebook was full of graphs generated by a couple of loops. Did you know that images in code cell outputs are stored as base64-encoded strings in the notebook itself? This notebook weighed in at over 60 MB, and it froze my Jupyter Lab for a minute or two when I opened it.
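To make the base64 point concrete, here is a minimal sketch that adds up how many bytes of embedded PNG/JPEG data a notebook carries. It assumes the nbformat v4 layout, where each code cell has an "outputs" list and image outputs keep their payload under "data" keyed by MIME type. The function name and the sample notebook are made up for illustration.

```python
import base64

# MIME types assumed to hold base64 text (SVG, for instance, is stored as plain XML)
BASE64_IMAGE_TYPES = {"image/png", "image/jpeg"}


def image_payload_bytes(notebook: dict) -> int:
    """Return the total decoded size, in bytes, of PNG/JPEG outputs in a v4 notebook dict."""
    total = 0
    for cell in notebook.get("cells", []):
        # Markdown cells have no "outputs" key, hence the default
        for output in cell.get("outputs", []):
            for mime, payload in output.get("data", {}).items():
                if mime not in BASE64_IMAGE_TYPES:
                    continue
                # Multi-line strings may be stored as a list of strings
                if isinstance(payload, list):
                    payload = "".join(payload)
                total += len(base64.b64decode(payload))
    return total


# Hypothetical stand-in for a real notebook: one code cell with a fake 100-byte "PNG"
fake_png = base64.b64encode(b"\x89PNG" + bytes(96)).decode("ascii")
sample = {
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": "# Plots"},
        {
            "cell_type": "code",
            "source": "plt.plot(x, y)",
            "outputs": [{"output_type": "display_data", "data": {"image/png": fake_png}}],
        },
    ],
}

print(f"{image_payload_bytes(sample)} bytes of embedded image data")
```

A loop that emits dozens of plots multiplies this number quickly, which is presumably how a notebook ends up at tens of megabytes.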
Although I don’t run into monster notebooks often, I run into similar productivity snags in my day-to-day work frequently enough that I think about it. Sometimes I just want to quickly skim a notebook to refresh my memory or to look for something I forgot, without intending to run the notebook[1].
Jupyter notebooks tend to take a couple of seconds to load in Jupyter Lab. It would be nice if I could flip through content-heavy Jupyter notebooks with the same responsiveness as flipping through pages of a physical notebook. Jupyter notebooks are just specially formatted JSON! I shouldn’t need to start a Jupyter server if all I want to do is skim one quickly. Reading and displaying a Jupyter notebook’s JSON should be fast if we just opt to put plain text[2] in front of eyeballs.
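As a rough illustration of how little is needed to get at the text, the sketch below pulls the source of every cell out of a notebook using only the standard library. It assumes the nbformat v4 layout (a top-level "cells" list whose entries have "cell_type" and "source"). The helper names and the sample notebook are hypothetical, and this is not how nbread or rich-cli are implemented.

```python
import json


def load_notebook(path: str) -> dict:
    """Read an .ipynb file as plain JSON; no Jupyter server or kernel involved."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_cell_text(notebook: dict):
    """Yield (cell_type, source_text) for each cell in a v4 notebook dict."""
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # "source" may be a single string or a list of line strings
        if isinstance(source, list):
            source = "".join(source)
        yield cell.get("cell_type", "unknown"), source


# Hypothetical in-memory notebook so the example runs without a file on disk
sample = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n", "Some notes"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [], "source": "print('hello')"},
    ],
}

for cell_type, text in iter_cell_text(sample):
    print(f"--- {cell_type} ---")
    print(text)
```

For a real file, iter_cell_text(load_notebook("notebook.ipynb")) would do the same thing, with "notebook.ipynb" standing in for whatever notebook you want to skim.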
Why the command line?
I ended up as a heavy command line user after getting used to doing everything on my work server over SSH. Plus, I already use charmbracelet/glow to render markdown in the terminal, so something similar would work just fine for me.
Low hanging fruit: someone has implemented it!
After studying the format of a Jupyter notebook (link to nbformat docs), I set out to find libraries for building TUI apps in Python (my most comfortable language). I hit the jackpot when I came across textualize/rich-cli, which can pretty-print notebooks to the terminal straight out of the box! It was as simple as pip install rich-cli, then rich notebook.ipynb.
I tried it out and loved how it looks. But the story doesn’t end there. When loading large notebooks, there was a perceptible lag of a few seconds before the rendered output showed up. Since we’re just rendering text here, I figured there was a bottleneck somewhere that could be removed to make the output show up almost instantly.
A minor tweak to fully scratch my itch
After some tinkering, I found that the lag came from rich-cli preparing all notebook cells for rendering before dumping them to STDOUT all at once. Modifying the logic to render each cell directly before processing the next one is doable, but this breaks the built-in pagination: the pager in the underlying API (textualize/rich) will display each cell in its own pager instance if I do that. As a workaround, I used subprocess to pipe to less as a pager instead. Because of how pipes work, less receives and displays each cell’s output as soon as it is written, without waiting for the rest. Thus, regardless of notebook size, the first few cells are rendered in the terminal almost instantly. Bingo!
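The general idea of streaming to a pager can be sketched as follows. This is an illustrative sketch and not nbread's actual code: the function name stream_to_pager is made up, and the default command less -R (which lets ANSI colors through) is an assumption about what a reasonable pager invocation looks like.

```python
import subprocess
import sys
from typing import Iterable, Sequence


def stream_to_pager(chunks: Iterable[str],
                    pager_cmd: Sequence[str] = ("less", "-R")) -> int:
    """Send chunks to a pager one at a time and return the pager's exit code."""
    proc = subprocess.Popen(list(pager_cmd), stdin=subprocess.PIPE, text=True)
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
            # Flush after every chunk so the pager can show it right away
            proc.stdin.flush()
    except BrokenPipeError:
        # The reader quit the pager before all chunks were written
        pass
    finally:
        try:
            proc.stdin.close()
        except BrokenPipeError:
            pass
    return proc.wait()
```

Passing a generator that renders one cell per iteration means the first cell reaches the pager before the later ones have been rendered at all, which is where the near-instant first screen comes from.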
Now that the logic works as intended, I packaged the code as nbread and put it on GitHub (tnwei/nbread), so that I can easily install it as a standalone command-line application with pipx.
From good to awesome: integration with ranger file manager
I happen to be a happy user of ranger, a command-line file manager that comes with a file preview pane. By default, Jupyter notebooks show up simply as raw JSON in the preview, and are opened as text files when selected. Given the work done above, I saw how I could improve my user experience by integrating nbread: I wouldn’t even need to launch nbread manually; I could just open the notebook I want by finger-dancing on the arrow keys!
Just had to make the following changes:
Modify handle_extension() in ~/.config/ranger/scope.sh for file preview:
case "$extension" in
    ipynb)
        # Jupyter notebook previewer
        nbread "${FILE_PATH}" && { dump | trim; exit 5; } || exit 2;;
And add the following to ~/.config/ranger/rifle.conf to launch nbread when a Jupyter notebook is selected:
# Jupyter notebooks
ext ipynb = nbread "$1" --pager
Conclusion and thoughts
I ended up using the outcome of this little side project almost every day. It just works and doesn’t take much conscious thought to use. Here’s what using nbread in ranger looks like:
Certainly, there are limitations: none of the rich media content, like images and interactive widgets, is visible (see footnote 2). However, as a preview tool, it works more than well enough for my purposes and successfully scratched my itch. So I’m happy with it.
GitHub repo: tnwei/nbread
[1] *GPT it? Hmmm
[2] What about the rich media formatting, you ask? One, typically I’m looking for textual content instead of rich content when skimming through notebooks: code or notes written in comments / markdown cells. And two, dealing with rich content will make this little side project more complex by a few orders of magnitude. I personally don’t think putting a lot of effort here is worth it.