Rant: Notebook Interfaces
Abstract: Cell-based notebook interfaces like Jupyter or Wolfram Mathematica store variables in the context of an (invisible) kernel. It is easy to create unresolvable dependencies by defining variables and deleting the code. An example is shown.
Several programs offer cell-based notebook interfaces. In this example I will use IPython Notebook (now called Jupyter Notebook). The exact same results can be obtained with Wolfram Mathematica.
In a normal programming session, you would type your program in your editor/IDE and then run it from top to bottom. If the program runs through once, it will likely run from top to bottom again just fine. Of course you can still shoot yourself in the foot here, for example by reading and writing files on disk. In a cell-based notebook, shooting yourself in the foot is just so much easier.
The following screenshots show a Python notebook; each screenshot shows the
full notebook. Let me start with a single cell that assigns the value of var1
to another variable:

This does not work because var1 is undefined. Let me define it in a second
cell:
In a normal programming context, this would just not work at all, because the
definition comes after its first use. In this cell-based thing, I can now
evaluate the first cell again. The kernel has var1 defined, so the cell runs
just fine:
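The sequence so far can be mimicked in plain Python. This is only a rough sketch: it treats the kernel as a dictionary of globals and each cell as a string passed to exec, which is far less than a real kernel does. The helper run_cell, the name var2, and the value 42 are made up for illustration.

```python
# Stand-in for the kernel: a namespace that outlives every cell evaluation.
kernel = {}


def run_cell(source, namespace):
    """Execute one cell in the namespace; return None or an error message."""
    try:
        exec(source, namespace)
        return None
    except NameError as err:
        return f"NameError: {err}"


cell_1 = "var2 = var1"  # hypothetical first cell: uses var1
cell_2 = "var1 = 42"    # hypothetical second cell: defines var1 (value is arbitrary)

first_try = run_cell(cell_1, kernel)   # var1 does not exist yet
run_cell(cell_2, kernel)               # now the kernel knows var1
second_try = run_cell(cell_1, kernel)  # same cell, same text, different outcome

print(first_try)       # NameError: name 'var1' is not defined
print(second_try)      # None
print(kernel["var2"])  # 42
```

The cell text never changed between the two attempts; only the hidden state in the namespace did.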
The only trace of this strange dependency is in the evaluation numbers in front of the cells. Since those variables are now globally defined, I can just evaluate the notebook from top to bottom again:
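The evaluation numbers can be turned into a crude check. The sketch below assumes the cells were evaluated exactly in the order described (first cell, second cell, first cell again, then both from top to bottom), so the concrete numbers are illustrative, and the function name evaluated_in_order is my own.

```python
def evaluated_in_order(execution_counts):
    """Return True if each cell was evaluated after the cell above it."""
    pairs = zip(execution_counts, execution_counts[1:])
    return all(upper < lower for upper, lower in pairs)


# Assumed numbers, listed from the top cell to the bottom cell.
after_rerunning_first_cell = [3, 2]  # top cell was evaluated last
after_top_to_bottom_run = [4, 5]     # both cells evaluated in document order

print(evaluated_in_order(after_rerunning_first_cell))  # False
print(evaluated_in_order(after_top_to_bottom_run))     # True
```

Increasing numbers only say something about the order of evaluation. They do not prove that every needed definition is still present in the notebook.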
Even worse, I can now remove the line that defines var1. The remaining line
will still evaluate just fine:
At this point, one could consider sending this notebook to somebody else, or
perhaps shutting down the computer (and with it the kernel), assuming that
everything is well. However, the value of var1 is nowhere in the program. It
exists only in the current instance of the kernel, because in the past I
evaluated a cell that defined var1. So let me restart the kernel now:
Evaluating the one cell in the notebook again gives the same error as before:
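The effect of the restart can be sketched the same way, again assuming that a kernel is nothing more than a dictionary of globals. The names run_cell and var2 and the value 42 are hypothetical; restarting is modelled simply as starting over with an empty dictionary.

```python
def run_cell(source, namespace):
    """Execute one cell in the namespace; return 'ok' or an error message."""
    try:
        exec(source, namespace)
        return "ok"
    except NameError as err:
        return f"NameError: {err}"


remaining_cell = "var2 = var1"  # the only cell left in the notebook

# Old kernel: the defining cell was evaluated once and then deleted from
# the notebook, but its effect is still stored in the namespace.
old_kernel = {}
exec("var1 = 42", old_kernel)
before_restart = run_cell(remaining_cell, old_kernel)

# Restarted kernel: a fresh, empty namespace with no memory of var1.
fresh_kernel = {}
after_restart = run_cell(remaining_cell, fresh_kernel)

print(before_restart)  # ok
print(after_restart)   # NameError: name 'var1' is not defined
```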
This simple example shows how easy it is to shoot oneself in the foot with a cell-based notebook. Perhaps I am just so used to normal procedural programs that I find this behavior rather upsetting. I have seen people fall into this trap while working on their notebooks. At some point, the notebook would no longer run cleanly from top to bottom after the kernel had been reset. They had to re-implement a couple of things because the code that had set the needed variables was gone.
A great advantage of notebooks is that one can re-run just parts of the program after a minor change. I do like this, and I find it a big waste of time to run my whole analysis program after each trivial change. Yet I fear this pitfall of cyclic or broken dependencies.
If you prefer the notebook interfaces, how do you deal with that? Please send me an email; I’d like to hear about it!