Jupyter Explosions and Kernel Inheritance

15 Jan 2022

When working with Jupyter notebooks, I often struggle with the combinatorial explosion that inevitably happens as you explore your problem space.

You start prototyping a new model. After a while you end up with $n$ different preprocessing setups and $m$ models with $k$ hyperparameter settings, producing $nmk$ pipelines in total, each evaluated with $l$ metrics on both train and test sets and visualized with $p$ plots. Congratulations! Your notebook is now a complete mess.

I usually deal with the above scenario by putting all the code into functions, classes, and modules that can be shared across multiple notebooks. This approach limits code repetition and prevents my notebooks from getting too long and complex. However, the downside is that you lose a lot of the interactivity. Recently I have been thinking about applying the notion of inheritance to Jupyter notebooks and kernels as another possible solution.

Before going into the details, we need to establish some basic terminology. A notebook is a .ipynb file. A kernel is a process that runs a Python interpreter and executes the notebook contents. It is possible to connect multiple notebooks to the same kernel.

What Is Kernel Inheritance

Let’s have a running notebook base.ipynb with a cell x = "Hey!" that has been executed and another notebook new.ipynb with the following code:

import jupyter_inheritance
jupyter_inheritance.inherit_from("base.ipynb")
print(x)

When executed, we want the cell above to output Hey! even though the variable x has not been defined in the new.ipynb notebook. There are multiple ways we can achieve this.

Content Copy

The easiest solution is to just copy the contents of the base.ipynb notebook and execute them in new.ipynb. But what if base.ipynb contained a database query that takes forever to run? We would have to wait for the query to finish every time we inherit from base.ipynb. Furthermore, there is no guarantee that the query keeps returning the same results, as the data in the database can change over time.
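For illustration, a content copy could be done with a sketch like the one below. This is not the approach we will take; it just reads the parent notebook with nbformat and re-runs its code cells (including the slow query) in the current kernel.

import nbformat

# naive content copy: read the parent notebook and re-run its code cells,
# including any slow or non-deterministic ones
nb = nbformat.read("base.ipynb", as_version=4)
for cell in nb.cells:
    if cell.cell_type == "code":
        # get_ipython() is the IPython builtin available inside notebooks
        get_ipython().run_cell(cell.source)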

Kernel Sharing

We can also just connect new.ipynb to the same kernel as base.ipynb. This has one downside, though. If we redefined x to x = "Bye!" in new.ipynb, the change would propagate back to base.ipynb and x = "Hey!" would be lost.

Kernel State Copy

Serializing the state (all the Python objects living in the kernel's memory) of the base.ipynb kernel, dumping it into a file, and loading that file in new.ipynb deals with all the issues mentioned above. The code is not executed from scratch and both notebooks use separate kernels. This is the solution we will try to implement.

Inheritance

I use the term “inheritance” instead of “copy” because I think the user experience feels similar to class inheritance in object-oriented programming. It allows us to create an empty notebook, get everything from a parent for free, and build on top of it.

How It Works

To make the inheritance work, we need to solve three problems:

  1. Finding the id of the parent kernel
  2. Communication between the child and parent kernels
  3. Serializing a kernel state

Finding the Kernel Id

One of the resources exposed by the Jupyter server API is /api/sessions. The response looks something like this:

[
  {
    "path": "base.ipynb",
    "kernel": {
      "id": "3fa85f64",
      "connections": 1
    }
  }
]

In this example, the notebook base.ipynb is using kernel 3fa85f64 to execute its code. Finding the correct id is just a matter of filtering the response using the path of the notebook we want to inherit from.

To get the host of the Jupyter server to which we can send the API request, we can use the list_running_servers function from jupyter_server.serverapp.
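Putting those two pieces together, the lookup could look roughly like this. The helper name is mine, and I am assuming the server accepts its token as a query parameter; adjust for your authentication setup.

import requests
from jupyter_server.serverapp import list_running_servers

def find_kernel_id(notebook_path):
    # ask every running Jupyter server which kernel executes the notebook
    for server in list_running_servers():
        response = requests.get(
            server["url"] + "api/sessions",
            params={"token": server["token"]},
        )
        for session in response.json():
            if session["path"] == notebook_path:
                return session["kernel"]["id"]
    raise ValueError(f"No running kernel found for {notebook_path}")

kernel_id = find_kernel_id("base.ipynb")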

Inter-Kernel Communication

There is a useful package called jupyter_client that can send messages to Jupyter kernels over ZeroMQ. To send a message, we first need the connection details of the receiving kernel. They can be found in a file named jupyter-runtime-dir/kernel-{kernel_id}.json; every running kernel has one. Once we have the connection file, we can do this:

from jupyter_client import BlockingKernelClient

# connection_file points at the parent kernel's kernel-{kernel_id}.json
client = BlockingKernelClient(connection_file=connection_file)
client.load_connection_file()
client.start_channels()
client.execute("y = 'Hello from the other side!'")

If successful, the kernel specified by the connection file now has a variable y defined, containing the string "Hello from the other side!". The variable is accessible from any notebook connected to that kernel. The code above can be executed in any Python or Jupyter environment that can reach the target kernel; in our case, we will be running it in new.ipynb.
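As for the connection file itself, we do not have to assemble the path by hand. Assuming the kernel_id found earlier, something like this should locate it:

from pathlib import Path
from jupyter_core.paths import jupyter_runtime_dir

# kernel connection files live in Jupyter's runtime directory
connection_file = str(Path(jupyter_runtime_dir()) / f"kernel-{kernel_id}.json")

jupyter_client also ships a find_connection_file helper that performs essentially the same search.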

Kernel State Serialization

The naive approach of just serializing everything in __main__.__dict__ with pickle will not get us very far. One of the problems is that during deserialization, we will not be able to unpickle functions or classes that were defined directly in base.ipynb because their definitions will not be available in new.ipynb.

Fortunately, a really neat package called dill can serialize entire module objects with all the required definitions. The package even offers the functions dump_session and load_session to (de)serialize the __main__ module directly.
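As a quick illustration of what that looks like in isolation (a minimal sketch; the file name is arbitrary):

import dill

x = "Hey!"

def greet():
    return x

# serialize the whole __main__ module, including the definition of greet
dill.dump_session("session.pkl")

# running this in a fresh interpreter would restore both x and greet
# dill.load_session("session.pkl")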

We can now combine dill and jupyter_client to send a message from new.ipynb to base.ipynb telling the base kernel to serialize its state to a file. When all is done, we just deserialize the file to new.ipynb, effectively copying the state of one kernel to another.

import dill

# storage_file_path is the file the parent kernel will dump its state into
code = f"""
import dill
dill.dump_session("{storage_file_path}")
"""
client.execute(code)
client.get_shell_msg()  # block until the parent kernel confirms the dump
dill.load_session(storage_file_path)

The line client.get_shell_msg() blocks code execution in new.ipynb until base.ipynb is done with the serialization. This prevents new.ipynb from jumping straight to dill.load_session and loading an incomplete file.

Conclusion

That is pretty much it! I have published the complete implementation as a Python package. It is a bit more complex than the snippets in this post, but all the important parts work the same. You can check it out on GitHub and try it yourself.