Hello !
I’m trying to create a linechart showing the evolution of occurrences according to trade families over time.
I’m going to run an SQL query on a database to retrieve the results.
Don’t worry about the SQL query: if I only have the month of June, the linechart won’t be displayed, so I’ve added fictitious data.
The date column extracted from my SQL query is in ‘YYYY-MM-DD’ format. I had first made a daily granularity to test my graph. To make it work, I had to convert the date column with pd.to_datetime()
In the case of monthly granularity, I’d like to display only the month and year on the X axis, e.g. “June 2023” only where the data is present, but I can’t manage to display anything other than several dates in the format “06/01, 06/08, 06/15”, and so on.
Do you know how I can do this?
Here’s the actual code, and below a screenshot of the result.
import pandas as pd
import numpy as np
import psycopg2, itertools
from bokeh.plotting import figure, show
from bokeh.palettes import Category10_10 as palette
from bokeh.models import DatetimeTickFormatter
# Database connexion
conn = psycopg2.connect(database="data", user="postgres", password="privatepassword", host="localhost", port="5432")
cursor = conn.cursor()
cursor.execute("""
(SELECT DATE(DATE_TRUNC('month', date_of_search)), job_search, COUNT(*)
FROM rawjobs
GROUP BY DATE_TRUNC('month', date_of_search), job_search
ORDER BY DATE_TRUNC('month', date_of_search) DESC)
UNION
(SELECT DATE('2023-07-01') AS date, 'Data Analyst' AS job_search, '487' AS count)
UNION
(SELECT DATE('2023-07-01') AS date, 'Data Engineer' AS job_search, '1202' AS count)
""")
results = cursor.fetchall()
cursor.close()
conn.close()
# Data preparation
dates = pd.to_datetime([row[0].strftime('%Y-%m-%d') for row in results])
job_searches = [row[1] for row in results]
counts = [row[2] for row in results]
# Chart options
plot = figure(x_axis_type="datetime", title="occurrences evolution over time",
x_axis_label="date", y_axis_label="occurrences")
# Format the x-axis ticks to display month and year
plot.xaxis[0].formatter = DatetimeTickFormatter(months=["%b"])
plot.xaxis.major_label_orientation = np.pi / 4 # Rotate the labels for better readability
# Trying color palette
colors = itertools.cycle(palette)
for job_search in list(set(job_searches)):
job_dates = [date for date, job, count in zip(dates, job_searches, counts) if job == job_search]
job_counts = [count for date, job, count in zip(dates, job_searches, counts) if job == job_search]
plot.line(job_dates, job_counts, line_width=3, legend_label=job_search, color=next(colors))
show(plot)