Density Line Plot

Speedrun

See here for the implementation.

The idea

I recently had a large time series that I was trying to summarize. It had many data points for every day, and I wanted to see how the daily distribution varied over time.

The code

The implementation uses three ideas:

  • Group the data in some way, for example into a list of data points per day, and plot a scatter of each group at its own x position.
  • Jitter the x values within each group, so that overlapping points spread out and the distribution is easier to see.
  • Estimate the density of each group with a histogram (which is faster than a Kernel Density Estimate approach) and interpolate it at every point. Using that value as the color makes the plot even more readable.
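The jitter and density-coloring steps above can be sketched roughly as follows. The helper names jitter_x_axis and make_density_color are taken from the plotting function in this post, but the bodies here are only an assumption about how they might work, not the linked implementation. The width and rng parameters are hypothetical additions, and this sketch ignores x when computing the density because x is constant within a group.

```python
import numpy as np


def jitter_x_axis(x, width=0.4, rng=None):
    """Spread points horizontally by adding uniform noise in [-width/2, width/2].

    `width` and `rng` are hypothetical parameters for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.uniform(-width / 2, width / 2, size=x.shape)


def make_density_color(x, y, bins):
    """Return one density value per point, usable as a scatter color.

    Assumption: the density is estimated from a 1D histogram of y and then
    linearly interpolated between bin centers. `x` is unused here and is
    kept only to mirror the call made in the plotting function.
    """
    counts, edges = np.histogram(y, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # np.interp clamps to the first/last bin value outside the center range.
    return np.interp(y, centers, counts)
```

With these two helpers, each group gets a cloud of points whose color is brightest where the values are most concentrated. A kernel density estimate would give smoother colors, at a higher cost per group.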

Limitations

  • The code is untested with time as the x-axis, but it would probably only need a reimplementation of jitter_x_axis that uses datetime timedeltas.
  • You need to bring your own data set, already grouped with one row per group. That shape isn’t common to build in polars or pandas, so it might need a bit of fiddling.
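As an illustration of the reshaping mentioned in the last point, here is one possible way to turn a long-format pandas DataFrame into the 2D array the plotting function expects. The function name to_grouped_array and the column names are hypothetical, and truncating every group to the size of the smallest one is just one assumption about how to handle uneven groups.

```python
import numpy as np
import pandas as pd


def to_grouped_array(df, group_col, value_col):
    """Build a (n_groups, n_points) array from long-format data.

    Hypothetical helper: rows are ordered by the sorted group key, and
    every group is truncated to the length of the smallest group so the
    result is rectangular.
    """
    groups = [
        g[value_col].to_numpy()
        for _, g in df.groupby(group_col, sort=True)
    ]
    n = min(len(g) for g in groups)
    return np.stack([g[:n] for g in groups])


# Example with made-up data: three points on day 0, two on day 1.
df = pd.DataFrame({"day": [0, 0, 0, 1, 1], "value": [1.0, 2.0, 3.0, 4.0, 5.0]})
grouped = to_grouped_array(df, "day", "value")
```

Truncation throws data away when group sizes differ a lot. Randomly subsampling each group to a common size would be an alternative under the same assumptions.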

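import matplotlib.pyplot as plt
import numpy as np
import numpy.typing as npt

# jitter_x_axis and make_density_color are helpers from the linked implementation.


def density_lineplot(data: npt.NDArray, out: str):
    """Plot density over time for grouped data.

    data has one row per group (e.g. per day); out is the output path without extension.
    """

    fig, ax = plt.subplots(figsize=(20, 10))

    for row_idx in range(data.shape[0]):

        # Every point in a group shares the same x position before jittering.
        x = row_idx * np.ones(data.shape[1])
        jittered_x = jitter_x_axis(x)
        y_set = data[row_idx]
        # One density value per point, used as its color.
        z = make_density_color(x, y_set, 20)

        ax.scatter(
            x=jittered_x,
            y=y_set,
            alpha=0.1,
            c=z,
            cmap="magma",
        )

    fig.savefig(f"{out}.png")
    plt.close(fig)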

Some pictures

Enjoy some simple example outputs. They were drawn using the following base:

import numpy as np

time_indices = 100
data_points_per_time = 500

xs = (np.ones((time_indices, data_points_per_time))
   * np.reshape(np.linspace(1, time_indices, time_indices), (-1, 1)))

A negative binomial:

data = np.random.negative_binomial(100,1/xs)

neg_bin.png

A gamma:

data = np.random.gamma(xs, np.random.uniform(2,8,xs.shape))

gamma.png

A normal with an extra sine wave thrown in:

data = np.random.normal(np.sin(xs), np.random.uniform(2,8,xs.shape))

normal.png

