Information in Noise | by Lenix Carter | Sep 2024

Two Methods for Visualizing Many Time Series at Once

Imagine this: you've got a bunch of line charts, and you're confident that there's at least one trend hiding somewhere in all that data. Whether you're tracking sales across thousands of your company's products or diving into stock market data, your goal is to uncover those subtrends and make them stand out in your visualization. Let's explore a couple of methods that can help you do just that.

Hundreds of lines plotted, but it isn't clear what the subtrends are. This synthetic data will show the benefit of these techniques. (Image by author)

Density Line Charts are a clever plotting technique introduced by Dominik Moritz and Danyel Fisher in their paper, Visualizing a Million Time Series with the Density Line Chart. This method transforms numerous line charts into a heatmap, revealing the regions where the lines overlap the most.
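To give a feel for what is happening under the hood, here is a deliberately simplified sketch. It is not the paper's exact algorithm (which also normalizes each series' contribution per column) nor PyDLC's implementation; it just counts how many lines pass through each (x, y) bin.

import numpy as np

def simple_density_lines(ys, ny=100):
    # Toy density-line computation: ys has shape (n_series, n_points).
    # Count how many series fall into each (y-bin, x-position) cell.
    y_min, y_max = ys.min(), ys.max()
    bins = ((ys - y_min) / (y_max - y_min) * (ny - 1)).astype(int)
    density = np.zeros((ny, ys.shape[1]))
    for series_bins in bins:  # one series (one line chart) at a time
        density[series_bins, np.arange(ys.shape[1])] += 1
    return density  # visualize with plt.imshow(density, origin='lower', aspect='auto')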

When we apply Density Line Charts to the synthetic data shown earlier, the results look like this:

PyDLC lets us see "hot spots" where a large number of lines overlap. (Image by author)

This implementation lets us see where our trends appear and identify the subtrends that make this data interesting.

For this example we use the Python library PyDLC by Charles L. Bérubé. The implementation is quite straightforward, thanks to the library's user-friendly design.

# PyDLC provides the dense_lines function
from pydlc import dense_lines
import matplotlib.pyplot as plt

plt.figure(figsize=(14, 14))
im = dense_lines(synth_df.to_numpy().T,
                 x=synth_df.index.astype('int64'),
                 cmap='viridis',
                 ny=100,
                 y_pad=0.01
                 )

plt.ylim(-25, 25)

plt.axhline(y=0, color='white', linestyle=':')

plt.show()

When using Density Line Charts, keep in mind that parameters like ny and y_pad may require some tweaking to get the best results.
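One quick way to explore this (a sketch reusing the same synth_df and dense_lines call as above) is to draw the chart at a few vertical resolutions and compare:

# Coarser ny smooths the heatmap; finer ny keeps detail but can look sparse.
for ny in (50, 100, 200):
    plt.figure(figsize=(10, 6))
    dense_lines(synth_df.to_numpy().T,
                x=synth_df.index.astype('int64'),
                cmap='viridis',
                ny=ny,
                y_pad=0.01)
    plt.title(f'ny = {ny}')
    plt.show()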

This technique isn't as widely discussed and doesn't have a universally recognized name. However, it's essentially a variation of "line density plots" or "line density visualizations," where we use thicker lines with low opacity to reveal areas of overlap and density.

This technique shows the subtrends quite well and reduces the cognitive load of the many lines. (Image by author)

We can clearly identify what appear to be two distinct trends and observe the high degree of overlap during the downward movements of the sine waves. However, it's a bit trickier to pinpoint where the effect is strongest.

The code for this approach is also quite simple:

plt.figure(figsize=(14, 14))

for column in synth_df.columns:
    plt.plot(synth_df.index,
             synth_df[column],
             alpha=0.1,
             linewidth=2,
             label=column,
             color='black'
             )

plt.show()

Here, the two parameters that may require some adjustment are alpha and linewidth.
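If you want a quick way to pick them, one option (again a sketch reusing synth_df from above) is to draw the same plot at a few opacities side by side:

# Too low an alpha hides isolated lines; too high turns the plot back into a wall of ink.
fig, axes = plt.subplots(1, 3, figsize=(18, 6), sharey=True)
for ax, alpha in zip(axes, (0.05, 0.1, 0.3)):
    for column in synth_df.columns:
        ax.plot(synth_df.index, synth_df[column],
                alpha=alpha, linewidth=2, color='black')
    ax.set_title(f'alpha = {alpha}')
plt.show()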

Imagine we're searching for subtrends in the daily returns of 50 stocks. The first step is to pull the data and calculate the daily returns.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

stock_tickers = [
'AAPL', 'MSFT', 'GOOGL', 'AMZN', 'TSLA', 'META', 'NVDA', 'BRK-B', 'UNH', 'V',
'HD', 'MA', 'KO', 'DIS', 'PFE', 'NKE', 'ADBE', 'CMCSA', 'NFLX', 'CSCO',
'INTC', 'AMGN', 'COST', 'PEP', 'TMO', 'AVGO', 'QCOM', 'TXN', 'ABT', 'ORCL',
'MCD', 'MDT', 'CRM', 'UPS', 'WMT', 'BMY', 'GILD', 'BA', 'SBUX', 'IBM',
'MRK', 'WBA', 'CAT', 'CVX', 'T', 'MS', 'LMT', 'GS', 'WFC', 'HON'
]

start_date = '2024-03-01'
end_date = '2024-09-01'

percent_returns_df = pd.DataFrame()

for ticker in stock_tickers:
    stock_data = yf.download(ticker, start=start_date, end=end_date)

    # Fill gaps in the price history before computing returns
    stock_data = stock_data.ffill().bfill()

    if len(stock_data) >= 2:
        stock_data['Percent Daily Return'] = stock_data['Close'].pct_change() * 100
        stock_data['Ticker'] = ticker
        percent_returns_df = pd.concat(
            [percent_returns_df, stock_data[['Ticker', 'Percent Daily Return']]],
            axis=0
        )

percent_returns_df.reset_index(inplace=True)

display(percent_returns_df)

We can then plot the data.

pivot_df = percent_returns_df.pivot(index='Date', columns='Ticker', values='Percent Daily Return')

pivot_df = pivot_df.ffill().bfill()

plt.figure(figsize=(14, 14))
sns.lineplot(data=pivot_df, dashes=False)
plt.title('% Daily Returns of Top 50 Stocks')
plt.xlabel('Date')
plt.ylabel('% Daily Return')
plt.legend(title='Stock Ticker', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True)
plt.tight_layout()
plt.show()

A very messy many-line plot with little discernible information. (Image by author)

The Density Line Chart does face some challenges with this data because of its sporadic nature. However, it still provides valuable insight into overall market trends. For instance, you can spot periods where the densest areas correspond to significant dips, highlighting rough days in the market.

(Image by author)
plt.figure(figsize=(14, 14))
im = dense_lines(pivot_df[stock_tickers].to_numpy().T,
                 x=pivot_df.index.astype('int64'),
                 cmap='viridis',
                 ny=200,
                 y_pad=0.1
                 )

plt.axhline(y=0, color='white', linestyle=':')
plt.ylim(-10, 10)

plt.show()

However, we find that the transparency technique performs considerably better for this particular problem. The market dips we mentioned earlier become much clearer and easier to discern.

(Image by author)
plt.figure(figsize=(14, 14))

for ticker in pivot_df.columns:
    plt.plot(pivot_df.index,
             pivot_df[ticker],
             alpha=0.1,
             linewidth=4,
             label=ticker,
             color='black'
             )

plt.show()

Both techniques have their own merits and strengths, and the best approach for your work may not be obvious until you've tried both. I hope one of these methods proves useful in your future projects. If you have other techniques or use cases for handling massive line plots, I'd love to hear about them!

Thanks for reading, and take care.