I’m trying to create a linechart showing the evolution of occurrences according to trade families over time.
I’m going to run an SQL query on a database to retrieve the results.
Don’t worry about the SQL query: if I only have the month of June, the linechart won’t be displayed, so I’ve added fictitious data.
The date column extracted from my SQL query is in ‘YYYY-MM-DD’ format. I had first made a daily granularity to test my graph. To make it work, I had to convert the date column with pd.to_datetime()
In the case of monthly granularity, I’d like to display only the month and year on the X axis, e.g. “June 2023” only where the data is present, but I can’t manage to display anything other than several dates in the format “06/01, 06/08, 06/15”, and so on.
Do you know how I can do this?
Here’s the actual code, and below a screenshot of the result.
import pandas as pd import numpy as np import psycopg2, itertools from bokeh.plotting import figure, show from bokeh.palettes import Category10_10 as palette from bokeh.models import DatetimeTickFormatter # Database connexion conn = psycopg2.connect(database="data", user="postgres", password="privatepassword", host="localhost", port="5432") cursor = conn.cursor() cursor.execute(""" (SELECT DATE(DATE_TRUNC('month', date_of_search)), job_search, COUNT(*) FROM rawjobs GROUP BY DATE_TRUNC('month', date_of_search), job_search ORDER BY DATE_TRUNC('month', date_of_search) DESC) UNION (SELECT DATE('2023-07-01') AS date, 'Data Analyst' AS job_search, '487' AS count) UNION (SELECT DATE('2023-07-01') AS date, 'Data Engineer' AS job_search, '1202' AS count) """) results = cursor.fetchall() cursor.close() conn.close() # Data preparation dates = pd.to_datetime([row.strftime('%Y-%m-%d') for row in results]) job_searches = [row for row in results] counts = [row for row in results] # Chart options plot = figure(x_axis_type="datetime", title="occurrences evolution over time", x_axis_label="date", y_axis_label="occurrences") # Format the x-axis ticks to display month and year plot.xaxis.formatter = DatetimeTickFormatter(months=["%b"]) plot.xaxis.major_label_orientation = np.pi / 4 # Rotate the labels for better readability # Trying color palette colors = itertools.cycle(palette) for job_search in list(set(job_searches)): job_dates = [date for date, job, count in zip(dates, job_searches, counts) if job == job_search] job_counts = [count for date, job, count in zip(dates, job_searches, counts) if job == job_search] plot.line(job_dates, job_counts, line_width=3, legend_label=job_search, color=next(colors)) show(plot)