I am trying to create a simple scatter plot from omics data read in from a .CSV file. Samples are in rows, with the peak areas for each metabolite in the corresponding columns. I would like to plot sample name (x axis) against the peak area of a metabolite (y axis). I have attached a truncated .CSV file as an example.
I started with the following code:
```python
import pandas
from bokeh.plotting import figure, output_file, show
df = pandas.read_csv("omicsdata.csv")
p = figure(title="Chart Title", x_axis_label='Sample', y_axis_label='Peak Area')
p.scatter(df['Sample'], df['CMP'], line_width=1)
show(p)
```
This creates the chart, but with no data points plotted. The cause seems to be the non-numerical names in the Sample column: if I plot two numeric columns against each other, say two metabolites, the points plot fine. I would prefer not to relabel the samples, because these names correspond to sample names in a larger database.
Following the "Plotting with basic glyphs" guide in the Bokeh 2.4.2 documentation, I tried to create a categorical x axis.
```python
import pandas
from bokeh.plotting import figure, output_file, show
df = pandas.read_csv("omicsdata.csv")
label = df['Sample']
p = figure(title="Chart Title", x_axis_label='Sample', y_axis_label='Peak Area')
p = figure(x_range=label)
p.scatter(df['Sample'], df['CMP'], line_width=1)
show(p)
```
Running the above, I get an error about an invalid range input:
```
raise ValueError("Unrecognized range input: '%s'" % str(range_input))
ValueError: Unrecognized range input
```
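To see what is actually being passed to `x_range`, here is a small check. The DataFrame below is a made-up stand-in for my file (the sample names and peak areas are hypothetical), so treat it as a sketch rather than a diagnosis:

```python
import pandas as pd

# Hypothetical stand-in for omicsdata.csv; names and values are invented.
df = pd.DataFrame(
    {
        "Sample": ["S01_ctrl", "S02_ctrl", "S03_trt"],
        "CMP": [1200.0, 950.5, 1430.2],
    }
)

label = df["Sample"]

# The column comes back as a pandas Series, not a plain Python list.
is_series = isinstance(label, pd.Series)
is_list = isinstance(label, list)

# astype(str).tolist() produces an ordinary list of Python strings.
label_list = label.astype(str).tolist()

print(type(label).__name__, is_series, is_list)
print(label_list)
```

So `label` is a Series object rather than a list. I am not sure whether it is the Series type or the string values themselves that the range check rejects.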
It does not seem to like the non-numerical values in the Sample column, so I tried to explicitly convert the sample column to string data:
```python
import pandas
from bokeh.plotting import figure, output_file, show
df = pandas.read_csv("omicsdata.csv")
label = df['Sample']
map(str, label)
p = figure(title="Chart Title", x_axis_label='Sample', y_axis_label='Peak Area')
p = figure(x_range=label)
p.scatter(df['Sample'], df['2-aminoadipic acid'], line_width=1)
show(p)
```
This gives me the same range error output.
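One thing I may have misunderstood is how `map` behaves. As far as I can tell, in Python 3 it returns a lazy iterator and leaves its input untouched, so a bare `map(str, label)` line would have no effect on `label`. A minimal sketch with made-up values (not from my file):

```python
# Hypothetical sample IDs, for illustration only.
label = [101, 102, 103]

# map() returns a lazy iterator; the original list is not modified.
result = map(str, label)
unchanged = label == [101, 102, 103]

# Wrapping the call in list() materialises a new list of strings.
converted = list(map(str, label))

print(unchanged, converted)
```

If that is right, the result would need to be assigned to a variable before use. The categorical examples in the Bokeh documentation pass a plain list of strings to `x_range`, though I have not confirmed that doing so fixes my plot.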
Any thoughts? I am new to Python and did not run into this kind of issue when plotting with ggplot in R.
Thanks
omicsdata.csv (6.44 KB)