Confusion with the dataframe.index while streaming

I have a CSV with “Month” as the first column and numerical values in the other columns. I wish to show a streaming plot.

I read the CSV into a pandas dataframe and plot its values with a moving time window: I start with the first 10 rows and plot them, then on each “update()” I plot the next 10 rows, moving through the dataframe 10 rows at a time. This should give the feel of a running chart. (Maybe there are better ways, but for now I am just exploring streaming.)
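To make the moving-window idea concrete, here is a minimal sketch in plain Python. It assumes nothing about Bokeh: the names `windows` and `visible` are made up for illustration, a list of integers stands in for the dataframe rows, and a `deque` with `maxlen` stands in for a data source that keeps only the most recent rows (the role the second argument to `stream` plays in the code further down).

```python
from collections import deque


def windows(rows, step):
    """Yield consecutive, non-overlapping chunks of at most `step` rows."""
    for start in range(0, len(rows), step):
        yield rows[start:start + step]


rows = list(range(35))       # stand-in for 35 dataframe rows
visible = deque(maxlen=20)   # stand-in for a source that keeps the last 20 rows

for chunk in windows(rows, step=10):
    # Append the new rows; once the limit is reached the oldest ones drop out.
    visible.extend(chunk)

print(visible[0], visible[-1], len(visible))  # 15 34 20
```

Because the loop is driven by `range`, it stops when the rows run out, whereas an update function that keeps adding `step` to its counters will eventually slice past the end of the data and produce empty chunks.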

But I think something goes wrong when I set the first column, “Month”, as the index. It seems to get lost either during conversion to ColumnDataSource or during streaming; I am not sure which. Please shed some light.

Here is how I read the CSV and set the index:

```python
df = pd.read_csv(join(dirname(__file__), 'data/nz_imports.csv'))
df["Month"] = df["Month"].astype("str")
df["Month"] = pd.to_datetime(df["Month"], format="%YM%m")
df["Month"] = pd.to_datetime(df["Month"])
df = df.set_index("Month")
```
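As a quick check of the `"%YM%m"` format string used above, the same pattern can be tried with the standard library alone. This is only a sketch: `parse_month` is a made-up helper name, and it uses `datetime.strptime` rather than pandas, so it shows what the pattern means and not how pandas handles it internally.

```python
from datetime import datetime


def parse_month(label):
    """Parse a label such as '2000M01': four-digit year, a literal 'M', then the month."""
    return datetime.strptime(label, "%YM%m")


print(parse_month("2000M01"))  # 2000-01-01 00:00:00
```

Each label becomes the first day of its month, which is the kind of value a datetime axis expects.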

The CSV is as follows:

```
Month,TotalAirportsCIF,TotalParcelPostCIF,TotalSeaportsCIF,TotalAirportsWeight,TotalParcelPostWeight,TotalSeaportsWeight
2000M01,394427,4434,1346468,4831,50,903462
2000M02,592551,9523,1631598,7205,13,1276569
2000M03,621641,6081,1801851,7591,10,1167420
2000M04,515741,4847,1657750,6252,13,1242467
2000M05,608863,4658,1690275,7512,12,1086912
2000M06,711430,4399,1668470,7363,9,985843
```

Here is how I pick the first 10 rows and make the plot:

```python
start_row_num = 0
end_row_num = 10
step = 10
filtered_df = df[start_row_num:end_row_num].copy()
plot_source = ColumnDataSource(data=filtered_df)
plot = make_plot(plot_source)
```

In the plot I have used lines:

```python
plot.line("Month", "TotalAirportsCIF",color="#0000FF", source=source, legend="Total Airports CIF")
plot.line("Month", "TotalSeaportsCIF",color="#8A2BE2", source=source, legend="Total Seaports CIF")
```

My update call looks like:

```python
def update(step):
    global start_row_num
    global end_row_num
    start_row_num +=step
    end_row_num += step
    localdf = df[start_row_num:end_row_num].copy()
    plot_source.stream(localdf,100)
```

And I start the loop as:

```python
curdoc().add_periodic_callback(update(step), 50)
```
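One thing that may be worth checking in the line above: in Python, `update(step)` calls the function immediately and evaluates to its return value, which is `None` here, while `add_periodic_callback` is documented to take a callable. The sketch below shows the difference in plain Python. `register` is a hypothetical stand-in for a scheduler, not Bokeh's implementation, and how a particular Bokeh version reacts to being handed `None` is not checked here.

```python
from functools import partial

calls = []


def update(step):
    """Record each invocation so the number of calls can be inspected."""
    calls.append(step)


def register(callback):
    """Hypothetical scheduler stand-in: accept only something that can be called later."""
    if not callable(callback):
        raise TypeError("callback must be callable, got %r" % (callback,))
    return callback


# update(10) runs once, right now, and evaluates to None.
result = update(10)
print(result, calls)  # None [10]

# partial(update, 10) is a callable that can be invoked repeatedly.
tick = register(partial(update, 10))
tick()
tick()
print(calls)  # [10, 10, 10]
```

If the same reasoning applies to the Bokeh call, passing something like `partial(update, step)` (or a small wrapper function) instead of `update(step)` would hand over a callable rather than `None`.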

The plot does not move. The error is:

```
ValueError('Must stream updates to all existing columns (missing: Month)',)
```

What am I missing?

If I comment out the line that sets the index (see below), I do get “Month”, but then it fails in update, where the plot_source.stream method complains “Must stream updates to all existing columns (missing: index)”.

Here is the experiment I did:

```python
df = pd.read_csv(join(dirname(__file__), 'data/nz_imports.csv'))
df["Month"] = df["Month"].astype("str")
df["Month"] = pd.to_datetime(df["Month"], format="%YM%m")
#df = df.set_index("Month")
filtered_df = df[start_row_num:end_row_num].copy()
plot_source = ColumnDataSource(data=filtered_df)
plot = make_plot(plot_source)
```

The update method:

```python
def update(step):
    global start_row_num
    global end_row_num
    global plot_source
    start_row_num +=step
    end_row_num += step
    localdf = df[start_row_num:end_row_num]
    new_data = dict()
    new_data['Month'] = localdf["Month"]
    new_data['TotalAirportsCIF'] = localdf["TotalAirportsCIF"]
    new_data['TotalParcelPostCIF'] = localdf["TotalParcelPostCIF"]
    new_data['TotalSeaportsCIF'] = localdf["TotalSeaportsCIF"]
    new_data['TotalAirportsWeight'] = localdf["TotalAirportsWeight"]
    new_data['TotalParcelPostWeight'] = localdf["TotalParcelPostWeight"]
    new_data['TotalSeaportsWeight'] = localdf["TotalSeaportsWeight"]
    plot_source.stream(new_data,100)
```

Still does not work…

Also, can the stream method accept a dataframe directly, rather than my entering all the columns one by one? (Or should I have used the df.to_dict() method?)
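For reference, this is roughly what I have in mind for the `to_dict` route, as a sketch on the pandas side only. `frame_to_stream_dict` is a made-up helper name, the three rows come from the CSV shown earlier, and I have not verified whether `stream` accepts a dataframe directly or which name a given Bokeh version gives the index column.

```python
import io

import pandas as pd

# First three rows of the CSV shown earlier in this post.
CSV_TEXT = """Month,TotalAirportsCIF,TotalParcelPostCIF,TotalSeaportsCIF,TotalAirportsWeight,TotalParcelPostWeight,TotalSeaportsWeight
2000M01,394427,4434,1346468,4831,50,903462
2000M02,592551,9523,1631598,7205,13,1276569
2000M03,621641,6081,1801851,7591,10,1167420
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))
df["Month"] = pd.to_datetime(df["Month"], format="%YM%m")
indexed = df.set_index("Month")

# After set_index, "Month" is the index and no longer a regular column.
print("Month" in indexed.columns)  # False


def frame_to_stream_dict(frame):
    """Hypothetical helper: move the index back into a column, then build name -> list."""
    return frame.reset_index().to_dict(orient="list")


new_data = frame_to_stream_dict(indexed.iloc[0:2])
print(len(new_data), len(new_data["Month"]))  # 7 2
```

A dict built this way has one key per original CSV column, “Month” included, so nothing has to be typed out column by column.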

Please suggest the correct and definitive way to get the update working.


On Tuesday, September 13, 2016 at 2:14:01 PM UTC+5:30, Yogesh Kulkarni wrote:
