Imagine this: you've been handed a bunch of line charts, and you're confident there's at least one trend hiding somewhere in all that data. Whether you're tracking sales across thousands of your company's products or diving into stock market data, your goal is to uncover those subtrends and make them stand out in your visualization. Let's explore a couple of techniques to help you do just that.
Density Line Charts are a clever plotting technique introduced by Dominik Moritz and Danyel Fisher in their paper, Visualizing a Million Time Series with the Density Line Chart. This method transforms numerous line charts into heatmaps, revealing the areas where the lines overlap the most.
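To make the idea concrete, here is a minimal NumPy sketch of how many lines can be turned into a density grid. It is a simplified illustration, not PyDLC's implementation or the paper's exact algorithm: the function name density_grid, the demo data, and the per-column normalization used here are all assumptions chosen for readability.

```python
import numpy as np

def density_grid(series, ny=50):
    """Rasterize many series into an (ny, n_points) grid of overlap counts.

    series: array of shape (n_series, n_points), all sharing the same x values.
    Assumes the data is not constant (max > min).
    """
    series = np.asarray(series, dtype=float)
    n_series, n_points = series.shape
    y_min, y_max = series.min(), series.max()
    # Map every y value to an integer row index in [0, ny - 1]
    rows = np.round((series - y_min) / (y_max - y_min) * (ny - 1)).astype(int)
    grid = np.zeros((ny, n_points))
    for r in rows:
        for j in range(n_points):
            # Cover the vertical span up to the next point so steep segments stay connected
            nxt = r[min(j + 1, n_points - 1)]
            lo, hi = min(r[j], nxt), max(r[j], nxt)
            # Each series adds a total weight of 1 per column, so steep lines are not over-counted
            grid[lo:hi + 1, j] += 1.0 / (hi - lo + 1)
    return grid

# Hypothetical demo data: 30 noisy sine waves sampled at 200 points
rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 200)
demo = np.sin(x) + rng.normal(scale=0.5, size=(30, 200))
grid = density_grid(demo, ny=50)
```

Passing the resulting grid to plt.imshow(grid, origin="lower", aspect="auto", cmap="viridis") would render it as a heatmap, with the brightest cells marking where the most lines pass.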
When we apply Density Line Charts to the synthetic data we showed earlier, the results look like this:
This implementation lets us see where our trends appear and identify the subtrends that make this data interesting.
For this example we use the Python library PyDLC by Charles L. Bérubé. The implementation is quite straightforward, thanks to the library's user-friendly design.
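# dense_lines comes from the PyDLC package
from pydlc import dense_lines

plt.figure(figsize=(14, 14))
# Transpose so that each row passed to dense_lines is one series
im = dense_lines(synth_df.to_numpy().T,
                 x=synth_df.index.astype('int64'),
                 cmap='viridis',
                 ny=100,
                 y_pad=0.01
                 )
plt.ylim(-25, 25)
# Dotted reference line at zero
plt.axhline(y=0, color='white', linestyle=':')
plt.show()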
When using Density Line Charts, keep in mind that parameters like ny and y_pad may require some tweaking to get the best results.
This technique isn't as widely discussed and doesn't have a universally recognized name. However, it's essentially a variation of "line density plots" or "line density visualizations," where we use thicker lines with low opacity to reveal areas of overlap and density.
We can clearly identify what seem to be two distinct trends and observe the high degree of overlap during the downward movements of the sine waves. However, it's a bit trickier to pinpoint where the effect is the strongest.
The code for this approach is also quite straightforward:
plt.figure(figsize=(14, 14))

# Draw every series as a thick, mostly transparent black line
for column in synth_df.columns:
    plt.plot(synth_df.index,
             synth_df[column],
             alpha=0.1,
             linewidth=2,
             label=column,
             color='black'
             )
Here, the two parameters that may require some adjustment are alpha and linewidth.
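A quick way to reason about alpha is to estimate how dark a pixel becomes when several translucent lines stack on top of it. The sketch below assumes standard "over" alpha compositing of identical black lines on a white background. The helper name stacked_opacity is hypothetical, and real plots will differ somewhat because of antialiasing and line width.

```python
def stacked_opacity(alpha, n_lines):
    """Combined opacity of n_lines identical overlapping lines, each drawn with the given alpha."""
    # Each extra line lets through a fraction (1 - alpha) of what was still visible
    return 1.0 - (1.0 - alpha) ** n_lines

# How quickly does alpha=0.1 saturate as more lines overlap?
for n in (1, 5, 10, 50):
    print(n, round(stacked_opacity(0.1, n), 3))
```

With alpha=0.1, ten overlapping lines already reach roughly 65% opacity and fifty are almost fully black, so a lower alpha may help when hundreds of lines overlap, while a higher alpha suits sparser data.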
Imagine we're searching for subtrends in the daily returns of 50 stocks. The first step is to pull the data and calculate the daily returns.
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

stock_tickers = [
'AAPL', 'MSFT', 'GOOGL', 'AMZN', 'TSLA', 'META', 'NVDA', 'BRK-B', 'UNH', 'V',
'HD', 'MA', 'KO', 'DIS', 'PFE', 'NKE', 'ADBE', 'CMCSA', 'NFLX', 'CSCO',
'INTC', 'AMGN', 'COST', 'PEP', 'TMO', 'AVGO', 'QCOM', 'TXN', 'ABT', 'ORCL',
'MCD', 'MDT', 'CRM', 'UPS', 'WMT', 'BMY', 'GILD', 'BA', 'SBUX', 'IBM',
'MRK', 'WBA', 'CAT', 'CVX', 'T', 'MS', 'LMT', 'GS', 'WFC', 'HON'
]
start_date = '2024-03-01'
end_date = '2024-09-01'
percent_returns_df = pd.DataFrame()
for ticker in stock_tickers:
    # Download daily price history for one ticker
    stock_data = yf.download(ticker, start=start_date, end=end_date)
    # Fill gaps forward, then backward for any leading gaps
    stock_data = stock_data.ffill().bfill()
    if len(stock_data) >= 2:
        # Day-over-day change in the closing price, in percent
        stock_data['Percent Daily Return'] = stock_data['Close'].pct_change() * 100
        stock_data['Ticker'] = ticker
        percent_returns_df = pd.concat([percent_returns_df, stock_data[['Ticker', 'Percent Daily Return']]], axis=0)
# Move the Date index into a regular column
percent_returns_df.reset_index(inplace=True)
display(percent_returns_df)  # display() is available in Jupyter/IPython
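To see what the return calculation does, here is a tiny standalone example of pct_change on made-up closing prices. The numbers are purely illustrative and not real market data.

```python
import pandas as pd

# Made-up closing prices for three consecutive trading days
close = pd.Series([100.0, 102.0, 99.96], name="Close")

# Percent change relative to the previous row; the first row has no predecessor
pct = close.pct_change() * 100
print(pct)
```

The first value is NaN because there is no previous day to compare against, and the next two come out at roughly +2 and -2 percent.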
We can then plot the data.
# Reshape to one column per ticker, indexed by date
pivot_df = percent_returns_df.pivot(index='Date', columns='Ticker', values='Percent Daily Return')
pivot_df = pivot_df.ffill().bfill()
plt.figure(figsize=(14, 14))
sns.lineplot(data=pivot_df, dashes=False)
plt.title('Percent Daily Returns of Top 50 Stocks')
plt.xlabel('Date')
plt.ylabel('Percent Daily Return')
plt.legend(title='Stock Ticker', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True)
plt.tight_layout()
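The pivot step reshapes the long table (one row per date and ticker) into a wide table with one column per ticker, which is the layout the plotting calls expect. Here is a small sketch with hypothetical tickers and values showing the reshape and how forward-filling handles a missing observation.

```python
import pandas as pd

# Hypothetical long-format table; "BBB" has no row for the second date
long_df = pd.DataFrame({
    "Date": pd.to_datetime(["2024-03-01", "2024-03-01", "2024-03-04"]),
    "Ticker": ["AAA", "BBB", "AAA"],
    "Percent Daily Return": [0.5, -1.2, 1.1],
})

# One row per date, one column per ticker; missing combinations become NaN
wide_df = long_df.pivot(index="Date", columns="Ticker", values="Percent Daily Return")

# Forward-fill, then back-fill, so every ticker has a value on every date
filled_df = wide_df.ffill().bfill()
print(filled_df)
```

Note that filling carries the previous return forward, which is convenient for plotting but does not reflect a real observation on that date.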
The Density Line Chart does face some challenges with this data due to its sporadic nature. However, it still provides valuable insights into overall market trends. For instance, you can spot periods where the densest areas correspond to significant dips, highlighting rough days in the market.
plt.figure(figsize=(14, 14))
# Transpose so that each row passed to dense_lines is one ticker's series
im = dense_lines(pivot_df[stock_tickers].to_numpy().T,
                 x=pivot_df.index.astype('int64'),
                 cmap='viridis',
                 ny=200,
                 y_pad=0.1
                 )
plt.axhline(y=0, color='white', linestyle=':')
plt.ylim(-10, 10)
plt.show()
However, we find that the transparency technique performs significantly better for this particular problem. The market dips we mentioned earlier become much clearer and more discernible.
plt.figure(figsize=(14, 14))

# Draw every ticker as a thick, mostly transparent black line
for ticker in pivot_df.columns:
    plt.plot(pivot_df.index,
             pivot_df[ticker],
             alpha=0.1,
             linewidth=4,
             label=ticker,
             color='black'
             )
Both techniques have their own merits and strengths, and the best approach for your work may not be obvious until you've tried both. I hope one of these techniques proves helpful in your future projects. If you have any other techniques or use cases for handling massive line plots, I'd love to hear about them!
Thanks for reading, and take care.