I’m trying to modify an existing program so that the user can select a file, which some Python code then processes to produce some nice plots. However, I’m really stuck on getting the input from the FileInput widget into a DataFrame. My inputs are relatively standardized xls and xlsx files, but I can’t figure out how to load them into a DataFrame, since pd.read_excel expects a path to the file.
Here is the simple code I’m using to try and test my function. I’m sure I’m missing something simple, but I’ve spent a fair bit of time trying to figure this out and am not making much headway.
This one was tough! I feel like I ought to start out by saying that this is really more a pandas/Python issue than a Bokeh issue, but I was intrigued by it and went down the rabbit hole anyway.
Here’s what I got to work:
from bokeh.io import curdoc
from bokeh.models.widgets import FileInput
from pybase64 import b64decode
import pandas as pd
import io

def upload_fit_data(attr, old, new):
    # `new` is the file content as a base64-encoded string
    print("fit data upload succeeded")
    decoded = b64decode(new)
    # wrap the raw bytes in a file-like object for read_excel
    f = io.BytesIO(decoded)
    new_df = pd.read_excel(f)
    print(new_df)

# only offer the Excel formats that read_excel can parse
file_input = FileInput(accept=".xls,.xlsx")
file_input.on_change('value', upload_fit_data)

doc = curdoc()
doc.add_root(file_input)
The turning point was finding this buried SO answer, which mentioned that xlrd (the engine doing the conversion for the read_excel function) is apparently mis-documented: rather than a string, it really wants bytes.
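To make the bytes point a bit more concrete, here is a rough sketch of the decode step on its own, using the standard-library base64 module in place of pybase64. The helper names (decode_upload, pick_engine, read_uploaded_excel) are made up for illustration. It assumes you also pass along the widget's filename property so the engine can be picked from the extension: recent xlrd releases (2.0 and later) only read .xls, and pandas reads .xlsx through openpyxl, so both packages would need to be installed to cover both formats.

```python
import base64
import io
from pathlib import Path


def decode_upload(value: str) -> io.BytesIO:
    """Turn the base64 string from FileInput.value into a file-like object."""
    return io.BytesIO(base64.b64decode(value))


def pick_engine(filename: str) -> str:
    """Choose a pandas Excel engine from the file extension (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".xlsx":
        return "openpyxl"
    if suffix == ".xls":
        return "xlrd"
    raise ValueError(f"unsupported file type: {suffix!r}")


def read_uploaded_excel(value: str, filename: str):
    """Decode the upload and hand the bytes to pandas."""
    # Imported here so the two helpers above work without pandas installed.
    import pandas as pd

    return pd.read_excel(decode_upload(value), engine=pick_engine(filename))
```

Inside the callback this would be called as read_uploaded_excel(new, file_input.filename). Raising on an unknown extension is just one option; printing a message and returning early may suit an interactive app better.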
Good luck! By the way, if you end up with a working example, we’d love to see it. Keep in touch!
New Bokeh user here. This was incredibly helpful! My relatively simple use case was to have the user select a file using the FileInput widget and plot the contents of select columns on a scatter plot.
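For anyone after the same thing, a minimal sketch of that use case might look like the following. This is not the poster's actual code: build_app and scatter_data are made-up names, and the column names "time" and "signal" in the usage comment are placeholders for whatever the spreadsheet really contains.

```python
import base64
import io


def scatter_data(table, x_col, y_col):
    """Build the dict a ColumnDataSource expects from two named columns.

    `table` can be a pandas DataFrame or a plain dict of lists.
    """
    return {"x": list(table[x_col]), "y": list(table[y_col])}


def build_app(doc, x_col, y_col):
    """Add a FileInput and a scatter plot that refreshes on each upload."""
    # Imported here so scatter_data stays usable without Bokeh or pandas.
    import pandas as pd
    from bokeh.layouts import column
    from bokeh.models import ColumnDataSource, FileInput
    from bokeh.plotting import figure

    # Start empty; the plot fills in once a file is uploaded.
    source = ColumnDataSource(data={"x": [], "y": []})
    plot = figure(title="Uploaded data")
    plot.scatter("x", "y", source=source)

    def on_upload(attr, old, new):
        # `new` is the whole file as a base64 string.
        df = pd.read_excel(io.BytesIO(base64.b64decode(new)))
        source.data = scatter_data(df, x_col, y_col)

    file_input = FileInput(accept=".xls,.xlsx")
    file_input.on_change("value", on_upload)
    doc.add_root(column(file_input, plot))


# In a script run with `bokeh serve`, this would be wired up as:
#   from bokeh.io import curdoc
#   build_app(curdoc(), x_col="time", y_col="signal")
```

Replacing source.data in one assignment keeps the x and y columns the same length, which Bokeh expects of a ColumnDataSource.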
When you receive the change signal on the backend, it already arrives with all the data, so the upload is complete at that point. The same holds on the frontend: you receive the event already with the data.