We are creating a GUI in bokeh to process datasets generated from sensors. One process we would like to add is creating a new column in a dataframe based on TextInput field in the GUI. I will update this Support question as a Q&A if I figure it out myself.
A use case is, if we have 3 (T1, T2, and T3) channels reporting temperature and we want average of the 3 channels a user will add a new channel by specifying arithmetic for the new channel as (T1+T2+T3)/3.
One way is to convert the string and use pandas or python eval. This is a huge security risk so looking at alternatives.
Found these as alternatives: pyparsing and plusminus. However, unable to use dataframe columns as input.
Was wondering if any other community members were able to come up with a solution for this. Thank you for your time.
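One way to get the "(T1+T2+T3)/3" behaviour without handing user text to eval() is to parse it with Python's ast module and refuse anything that is not plain arithmetic over known column names. The example below is an added illustration, not code from the original thread or from any of the libraries mentioned; the function names, the "column as a list of floats" representation (a stand-in for dataframe columns) and the sample readings are all assumptions made for the example. It whitelists +, -, *, / and unary minus, rejects attribute access, calls, subscripts, comparisons and ** (a huge exponent can exhaust CPU and memory even with no other capability), and evaluates the validated tree element by element.

```python
import ast
import operator

# Arithmetic the evaluator is willing to apply. Every other node kind
# (Attribute, Call, Subscript, Compare, Pow, ...) is rejected before any
# value is computed, so no Python object other than the numeric columns
# is reachable from the user's text.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {
    ast.UAdd: operator.pos,
    ast.USub: operator.neg,
}


class UnsafeExpression(ValueError):
    """Raised when the text contains anything outside the arithmetic subset."""


def _check(node, column_names):
    """Walk the tree and raise unless every node is an allowed kind."""
    if isinstance(node, ast.Expression):
        _check(node.body, column_names)
    elif isinstance(node, ast.BinOp):
        if type(node.op) not in _BINARY_OPS:
            raise UnsafeExpression(f"operator not allowed: {type(node.op).__name__}")
        _check(node.left, column_names)
        _check(node.right, column_names)
    elif isinstance(node, ast.UnaryOp):
        if type(node.op) not in _UNARY_OPS:
            raise UnsafeExpression(f"operator not allowed: {type(node.op).__name__}")
        _check(node.operand, column_names)
    elif isinstance(node, ast.Constant):
        # bool is a subclass of int, so compare the exact type
        if type(node.value) not in (int, float):
            raise UnsafeExpression("only numeric literals are allowed")
    elif isinstance(node, ast.Name):
        if node.id not in column_names:
            raise UnsafeExpression(f"unknown column: {node.id}")
    else:
        raise UnsafeExpression(f"syntax not allowed: {type(node).__name__}")


def _evaluate(node, row):
    """Evaluate an already-validated tree against one row of column values."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return row[node.id]
    if isinstance(node, ast.UnaryOp):
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand, row))
    left = _evaluate(node.left, row)
    right = _evaluate(node.right, row)
    try:
        return _BINARY_OPS[type(node.op)](left, right)
    except ZeroDivisionError:
        # A sensor channel reading 0 should not abort the whole column.
        return float("nan")


def parse_expression(expression, column_names):
    """Parse and validate; returns the tree or raises UnsafeExpression."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise UnsafeExpression(f"cannot parse: {exc.msg}") from None
    _check(tree, set(column_names))
    return tree


def is_allowed(expression, column_names):
    """True if the text is plain arithmetic over the given column names."""
    try:
        parse_expression(expression, column_names)
    except UnsafeExpression:
        return False
    return True


def compute_column(expression, columns):
    """Build a new column (list of floats) from existing equal-length columns."""
    tree = parse_expression(expression, columns.keys())
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("columns must have equal length")
    length = lengths.pop() if lengths else 0
    return [
        _evaluate(tree.body, {name: values[i] for name, values in columns.items()})
        for i in range(length)
    ]


if __name__ == "__main__":
    # Constructed sample readings for three temperature channels.
    readings = {"T1": [20.0, 22.0], "T2": [21.0, 23.0], "T3": [22.0, 24.0]}
    print(compute_column("(T1+T2+T3)/3", readings))  # [21.0, 23.0]
```

In a Bokeh callback the TextInput value would go through compute_column and the UnsafeExpression message would be shown back to the user instead of being swallowed. Because validation happens on the parsed tree rather than on the raw string, there is no string filter to bypass: a node kind that is not in the whitelist is refused whatever it was spelled as.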
@swamilikes2code I am only passingly familiar with this project, but another option might be numexpr which explicitly accepts array expressions as string inputs. The primary reason for this in the case of numexpr is to afford numerical optimizations, but they might (I presume) have some level of input sanitization that could be relied on.
Thank you @Bryan for your input. I checked the library and from my cursory research on their GitHub page looks like they have issues with using eval() from user input too. I am not a security expert in any form, from this discussion looks like it has technical debt and this is built in.
I found another library which is called ASTEVAL. It is not entirely safe, but has some safeguards built in from the python AST library. I tested it and it seems to work well in the minimal proof of concept. Here are more details from them to save anyone a click:
How Safe is asteval?
Asteval avoids all of the exploits we know about that make eval() dangerous. For reference, see, Eval is really dangerous and the comments and links therein. From this discussion it is apparent that not only is eval() unsafe, but that it is a difficult prospect to make any program that takes user input perfectly safe. In particular, if a user can cause Python to crash with a segmentation fault, safety cannot be guaranteed. Asteval explicitly forbids the exploits described in the above link, and works hard to prevent malicious code from crashing Python or accessing the underlying operating system. That said, we cannot guarantee that asteval is completely safe from malicious code. We claim only that it is safer than the builtin eval(), and that you might find it useful.
For the record, I consider eval perfectly safe, as long as the input to eval is entirely controlled and does not rely in any way on user inputs. [1] Obviously that’s not the case here for you, though. So then I’d ask: are there any common patterns to all the kinds of inputs users might want to enter? Perhaps they can be expressed directly in a constrained UI (e.g. if almost all the transforms are “sum a few columns” then have a multi-select widget to choose columns to sum)