I have a dataframe of songs that includes ‘duration’ and ‘popularity’ columns. Duration is in minutes and popularity ranges from 0 to 100. I use these variables to create a scatterplot, and I want some way to control the x-axis, which shows ‘duration’. By that I mean a way to ‘zoom’ by selecting a minimum and maximum duration, so that only songs that fall between those values are in the frame.
I have tried to do this with a RangeSlider, but it doesn’t quite work. The problem is that when I move the slider, it sometimes resets the view back to the maximum possible value. I think this might be because the list of durations is not continuous, so some values on the slider don’t correspond to an existing duration value.
Is there a way I can make this work with the RangeSlider? Or is there another method to achieve the same result?
Code:
import pandas as pd
from bokeh.io import curdoc
from bokeh.models import ColumnDataSource, RangeSlider
from bokeh.plotting import figure
from bokeh.layouts import column
duration = [4.66, 6.62, 3.88, 5.95, 6.28, 5.53, 6.32, 6.02, 2.51, 2.73, 4.38, 8.33, 3.49, 7.60, 5.02]
popularity = [0, 1, 13, 0, 33, 36, 0, 32, 26, 57, 0, 31, 45, 0, 6]
df = pd.DataFrame({ 'duration': duration, 'popularity': popularity})
source = ColumnDataSource(data={
    'x': df.duration,
    'y': df.popularity
})
plot1 = figure(plot_width=500,
               plot_height=500,
               x_axis_label="duration",
               y_axis_label="popularity",
               title='Popularity vs Duration',
               y_range=[-1, 100]
               )
plot1.circle(x='x', y='y', source=source, fill_alpha=0.8, size=10)
duration_slider = RangeSlider(start=0, end=df['duration'].max(), value=(0,5), step=.01, title="Duration")
def update_plot(attr, old, new):
    duration_start, duration_end = duration_slider.value
    new_data = {
        'x': df.duration,
        'y': df.popularity,
    }
    source.data = new_data
    plot1.x_range.start = duration_start
    plot1.x_range.end = duration_end
    plot1.title.text = 'Popularity vs Duration for songs'
duration_slider.on_change("value", update_plot)
layout = column(duration_slider, plot1)
curdoc().add_root(layout)
curdoc().title = 'Test'
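To make the goal concrete, here is a minimal sketch of the filtering I have in mind, written in plain Python without Bokeh. The helper name `filter_by_duration` is hypothetical (it is not a Bokeh API). I assume its result could be assigned to `source.data` inside `update_plot`, with the slider's current value supplying `lo` and `hi`, but I have not confirmed that this fixes the resetting view.

```python
def filter_by_duration(durations, popularities, lo, hi):
    """Keep only the (duration, popularity) pairs with lo <= duration <= hi.

    Hypothetical helper for illustration; returns a dict shaped like the
    'x'/'y' data used for the ColumnDataSource in the code above.
    """
    kept = [(d, p) for d, p in zip(durations, popularities) if lo <= d <= hi]
    return {
        "x": [d for d, _ in kept],
        "y": [p for _, p in kept],
    }


# Same sample data as in the code above.
duration = [4.66, 6.62, 3.88, 5.95, 6.28, 5.53, 6.32, 6.02,
            2.51, 2.73, 4.38, 8.33, 3.49, 7.60, 5.02]
popularity = [0, 1, 13, 0, 33, 36, 0, 32, 26, 57, 0, 31, 45, 0, 6]

# Songs between 0 and 5 minutes, matching the slider's initial value=(0, 5).
visible = filter_by_duration(duration, popularity, 0, 5)
print(visible)
```

With the sample data, this keeps the six songs of 5 minutes or less and drops the 5.02-minute song, which is the behaviour I want from the slider.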