Loading large dataset once at app creation

I am running a Bokeh application with a Bokeh server. I load a large dataset (2 GB) at the top of the application code in main.py. Every time I open a new session in my browser and make a request, the data is loaded into memory again, so memory fills up (and the load time is long). How can I solve this? Is there a way to load the data into memory once and share it between sessions (for example, when calling the bokeh serve command)? Or can I use lifecycle hooks for this purpose?

Yes, you can use lifecycle hooks for this purpose. If sessions only need read access to the data (no writing), then the simplest thing to do is to store the data in an attribute of another module in the same directory as your main.py. Python's module caching ensures that every session that imports the module accesses the same in-memory data, loaded once per server process. You can see this demonstrated in the spectrogram example. If you need to mediate concurrent writes, then you are better off putting the data in an actual database and letting sessions talk to the database. (Alternatively, you would have to implement your own locking strategy.)
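A minimal, self-contained sketch of the shared-module pattern described above. In a real app you would simply place a `shared_data.py` next to `main.py` and `import shared_data` from each session's code; here the module is written to a temp directory at runtime purely so the demonstration runs as one script (the file name, `load_count` counter, and `_load` helper are all illustrative stand-ins for your actual 2 GB load):

```python
import importlib
import pathlib
import sys
import tempfile

# Create a tiny stand-in for a "shared_data.py" that would normally live
# beside main.py in the Bokeh app directory.
tmpdir = tempfile.mkdtemp()
pathlib.Path(tmpdir, "shared_data.py").write_text(
    "load_count = 0\n"
    "def _load():\n"
    "    # stand-in for the expensive 2 GB load\n"
    "    global load_count\n"
    "    load_count += 1\n"
    "    return list(range(1000))\n"
    "data = _load()\n"
)
sys.path.insert(0, tmpdir)

# First import: the module body executes and the data is loaded.
import shared_data

# Subsequent imports (e.g. from other Bokeh sessions in the same server
# process) hit sys.modules and do NOT re-run the load.
again = importlib.import_module("shared_data")

print(shared_data.load_count)           # -> 1 (loaded only once)
print(again.data is shared_data.data)   # -> True (same object, shared)
```

Note that this shares the data within one server process; if you run `bokeh serve --num-procs N`, each worker process loads its own copy.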
