I’m trying to set up a service where multiple users will be able to create their own dashboards through notebooks. These notebooks will be developed on a shared JupyterHub server, which is the same server I intend to host the dashboards from.
When developing these notebooks, the users have access to the server filesystem. Each dashboard will be visible to the whole community. The dashboards are also expected to connect to different databases, which may require authentication.
In other words, I will be hosting code that was written by other “autonomous” creators.
Users connecting to the dashboards will be authenticated with Apache.
I’m hosting the Panel server in a Docker container with root access.
My question is:
What is an effective way of limiting permissions when executing the code / creating a sandbox environment?
I have considered spawning a subprocess for each open session and reducing permissions inside that subprocess with os.setuid.
I have also considered starting a new container instance for each connecting user to fully isolate it, but I would prefer to avoid this option.
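A minimal sketch of the first option (a subprocess with reduced permissions), assuming Python 3.9+ on a Unix host. The helper name run_as_user and the uid/gid values are hypothetical and not part of Bokeh or Panel. The privilege drop only happens when the parent runs as root; otherwise the child simply inherits the caller's identity.

```python
import os
import subprocess
import sys


def run_as_user(code: str, uid: int, gid: int, timeout: float = 30.0) -> str:
    """Run `code` in a fresh Python interpreter under the given uid/gid.

    Hypothetical helper for illustration only.
    """
    kwargs = {}
    if os.geteuid() == 0:
        # Only root may switch identity. The `user`, `group` and
        # `extra_groups` arguments exist since Python 3.9; clearing the
        # supplementary groups avoids inheriting root's group memberships.
        kwargs = {"user": uid, "group": gid, "extra_groups": []}

    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
        **kwargs,
    )
    return result.stdout
```

For example, run_as_user("import os; print(os.getuid())", uid, gid) returns the uid the child actually ran under. Note that this limits filesystem and process permissions only; it does not restrict network access or resource usage.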
I am running Bokeh on a Tornado ioloop.
The framework we will use is Panel (I also posted the question there: panel thread), but as far as I can see, most of it is built on top of Bokeh.
I understand and support your statement for the most part. In my case I trust the authors, and all code is executed on internal servers. The security measures are primarily there to prevent someone from accidentally interfering with other people’s work.
I ended up monkey patching the eval_function:
Anytime code is run on the server, I spawn a child process with multiprocessing.
Within this process I change the user permissions to the user that corresponds to the client, and execute the code.
The return value and the changes to the document are passed on a queue, and synchronized with the main process on return.
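The general pattern described above can be sketched roughly as follows. This is not the actual patch and does not touch any Bokeh internals: run_in_child and _worker are hypothetical names, synchronizing document changes is left out, and the sketch assumes a Unix host where the "fork" start method is available. Privileges are only dropped when the parent runs as root.

```python
import multiprocessing as mp
import os


def _worker(results, func, args, uid, gid):
    """Runs in the child: drop privileges, call func, report on the queue."""
    try:
        if os.geteuid() == 0:
            os.setgroups([])  # clear supplementary groups first
            os.setgid(gid)    # group before user: after setuid it is too late
            os.setuid(uid)
        results.put(("ok", func(*args)))
    except Exception as exc:
        # Send a string, since arbitrary exceptions may not be picklable.
        results.put(("error", repr(exc)))


def run_in_child(func, args, uid, gid, timeout=30.0):
    """Call func(*args) in a child process running as uid/gid."""
    ctx = mp.get_context("fork")  # fork: func does not need to be importable
    results = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(results, func, args, uid, gid))
    proc.start()
    try:
        # Read before joining; raises queue.Empty if nothing arrives in time.
        status, payload = results.get(timeout=timeout)
    finally:
        proc.join(timeout=1)
        if proc.is_alive():
            proc.terminate()
    if status == "error":
        raise RuntimeError(payload)
    return payload
```

The return value has to be picklable to cross the queue, and the parent still blocks in results.get until the child is done, so this isolates permissions but does not make the call asynchronous.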
It’s not the most elegant solution in the world, and it probably doesn’t follow the design patterns of Bokeh, but it seems to get the job done.
@BeeHaive I’d be interested to see your patches and perhaps solicit contributions. Some time ago I looked at moving the execution of the initial “update document” out to threads, to avoid blocking Tornado for new connections at this stage. But I ran into some issues around document locking, and I got pulled to other questions and it fell off my radar. I’m not so interested in the permissions part (I still think multi-tenancy is just a larger scope than we can reliably maintain) but if you’ve gotten around the blocking at document creation that would be extremely interesting. Please open a GitHub development discussion if you are able.
I would be happy to share my approach and get your input, but I doubt what I have done will lead to any performance gains. The child process is spawned after document creation, and the parent thread still waits for it to finish regardless.
I will likely work on my solution for a while longer. I am more than happy to share the results if they can benefit others.