I have a running mental list of little inefficiencies that annoy me throughout my workdays. The sand, the pebbles, the annoyances of my working life. I try to follow this xkcd in determining if it's worth the time to actually fix them. The basic heuristic is: if it's a small thing, but I do it every day, it's worth spending a few hours optimizing it.
Today I tackled one: creating a global config file to auto-import all my favorite modules into any newly-created Jupyter notebook.
I work a lot with Jupyter notebooks; they're a standard data science tool. What usually happens, though, is I experience "Jupyter efficiency drift": I'll spend a chunk of time (several days/weeks) digging into notebooks, and I'll get really good at importing the right modules, setting my matplotlib up all nice and friendly, and making little helper functions. Then I'll drift off to other work, and the next time I open a fresh notebook, all that setup is gone.
My crappy workaround has always been to dig up the last notebook I was working on and copy the top few cells.
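Those copied cells aren't reproduced here, but the shape is familiar. A hypothetical sketch of such a One True First Cell (the specific imports and settings are my guess, not the post's exact cell):

```python
# The usual data-science preamble, pasted at the top of every notebook
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Make matplotlib and pandas "all nice and friendly"
plt.rcParams["figure.figsize"] = (10, 6)
pd.set_option("display.max_columns", 100)
```

Copying this by hand into every new notebook is exactly the repeated small cost the xkcd heuristic says is worth automating away.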
The problem was that some notebooks didn't have the entire rigamarole, some were missing my beloved `alert()` function, and the whole thing was a stupid process. Sometimes I'd open a new notebook just to do some small thing, and it seemed like overkill to stay meticulous about copy+pasting that One True First Cell.
Because surely Jupyter must have some top-level, global-environment-style config that can do this for me? I mean, Stata has `profile.do`, bash has `.bash_profile`; is this some crazy idea? To paraphrase Kenneth Branagh: no, the world must be dotfiled!
And, indeed, Jupyter - or rather, Jupyter's ur-center, IPython - has startup files (docs). The basic process:
- Go to your default IPython profile's startup folder, typically `~/.ipython/profile_default/startup/` (run `ipython locate profile` if yours lives elsewhere).
- Make a Python script for what you wanna import, e.g. `subl 0-the-world-must-be-peopled.py`. (Note: you can have multiple scripts; they'll be run in lexicographic order.)
- People that script with your imports! Mine starts with the usual `import os` etc. stuff. (Note: cell %magics don't work here, sorry.)
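The startup script itself is just plain Python. A sketch of what such a file might contain (hypothetical contents, since the post's actual file isn't shown; the filename comes from the step above):

```python
# ~/.ipython/profile_default/startup/0-the-world-must-be-peopled.py
# IPython runs every .py file in this folder, in lexicographic order,
# at the start of each session -- hence names like 00-..., 10-..., etc.
# Plain Python only: cell %magics won't work in a .py startup file.
import os
import sys
import datetime as dt

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# The "nice and friendly" defaults that used to live in the first cell
plt.rcParams["figure.figsize"] = (10, 6)
pd.set_option("display.max_columns", 100)

# ...plus any little helper functions you always want around.
```

Once this file exists, every new notebook (and IPython shell) opens with the whole preamble already loaded.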
VOILA! PRODUCTIVITY GAIN.
An aside on `alert()`

Hey, let's talk about `alert()`, by the way.
I often run stuff in my Jupyter notebooks that takes a while. For example: pulling in lots (and lots) of data from somewhere that takes forever to load into memory, waiting on a PyMC3 MCMC run, or sitting through a neural net's many epochs.
In those cases, I want to be alerted when the process completes. Specifically, I want it to pull me back from whatever other thing I've wandered off to do (e.g. watching this). BEHOLD - that's what `alert()` is good for: it pops up a browser alert the moment `some_long_process()` completes. You may want to manually allow popups on `localhost` in your browser, since (I guess) most modern browsers block them by default (after much 1990s abuse).
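The post doesn't show `alert()`'s body. One minimal way to fire a browser popup from a notebook cell is via IPython's display machinery — this is my sketch of the idea, not necessarily the author's actual implementation:

```python
import json

from IPython.display import Javascript, display


def alert(message="Done!"):
    """Pop a browser alert() dialog from a Jupyter notebook cell."""
    # json.dumps escapes quotes/newlines so the message is a valid JS string
    display(Javascript("alert({})".format(json.dumps(message))))
```

Usage is then just `some_long_process(); alert("finished!")` — the popup yanks you back to the notebook tab when the long-running cell finally finishes. Dropping this definition into the startup script above means `alert()` is always available.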