where:
- kappa_post = sqrt(C² + S²) is the posterior concentration parameter
- mu_post = atan2(S, C) is the posterior mean direction
- C = kappa_prior·cos(mu_prior) + kappa_likelihood·cos(theta_n) and S = kappa_prior·sin(mu_prior) + kappa_likelihood·sin(theta_n)
Yay, we made it! The posterior is a von Mises distribution with updated parameters (mu_post, kappa_post). Now we can update the prior with each new observation and see how the mean direction changes.
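To make the update rule concrete, here is a minimal numeric sketch of a single step; the prior and the observation below are made-up illustrative values, not Auri's data:

import numpy as np

# made-up illustrative values: a mildly informative prior and a single observation
mu_prior, kappa_prior = 0.0, 1.0            # prior mean direction and concentration
theta_n, kappa_likelihood = np.pi / 2, 2.0  # observed angle and likelihood concentration

# combine the prior and the observation as vectors on the unit circle
C = kappa_prior * np.cos(mu_prior) + kappa_likelihood * np.cos(theta_n)
S = kappa_prior * np.sin(mu_prior) + kappa_likelihood * np.sin(theta_n)

kappa_post = np.sqrt(C**2 + S**2)  # ~2.24: more concentrated than the prior
mu_post = np.arctan2(S, C)         # ~1.11 rad (~63°): pulled toward the observation

The posterior mean lands between the prior direction and the new observation, closer to whichever of the two has the higher concentration.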
Welcome back to those of you who skipped the math, and congratulations to those who made it through! Let's code the Bayesian update and visualize the results.
First, let's define a helper function for visualizing the posterior distribution. We'll need it to create a nice animation later on.
from io import BytesIO
from typing import List, Tuple

import imageio
import matplotlib.pyplot as plt
import numpy as np

def get_posterior_distribution_image_array(
    mu_grid: np.ndarray,
    posterior_pdf: np.ndarray,
    current_samples: List[float],
    idx: int,
    fig_size: Tuple[int, int],
    dpi: int,
    r_max_posterior: float
) -> np.ndarray:
    """
    Creates the posterior distribution and observed samples histogram on a polar plot,
    converts it to an image array, and returns it for GIF processing.

    Parameters:
    -----------
    mu_grid (np.ndarray):
        Grid of mean direction values for plotting the posterior PDF.
    posterior_pdf (np.ndarray):
        Posterior probability density function values for the given `mu_grid`.
    current_samples (List[float]):
        List of observed angle samples in radians.
    idx (int):
        The current step index, used for labeling the plot.
    fig_size (Tuple[int, int]):
        Size of the plot figure (width, height).
    dpi (int):
        Dots per inch (resolution) for the plot.
    r_max_posterior (float):
        Maximum radius for the posterior PDF plot, used to set plot limits.

    Returns:
    --------
    np.ndarray: Image array of the plot, suitable for GIF processing.
    """
    fig = plt.figure(figsize=fig_size, dpi=dpi)
    ax = plt.subplot(1, 1, 1, projection='polar')
    ax.set_theta_zero_location('N')
    ax.set_theta_direction(-1)

    ax.plot(mu_grid, posterior_pdf, color='red', linewidth=2, label='Posterior PDF')

    # observed samples histogram
    n_bins = 48
    hist_bins = np.linspace(-np.pi, np.pi, n_bins + 1)
    hist_counts, _ = np.histogram(current_samples, bins=hist_bins)

    # normalize the histogram counts
    if np.max(hist_counts) > 0:
        hist_counts_normalized = hist_counts / np.max(hist_counts)
    else:
        hist_counts_normalized = hist_counts

    bin_centers = (hist_bins[:-1] + hist_bins[1:]) / 2
    bin_width = hist_bins[1] - hist_bins[0]

    # set the maximum radius to accommodate both the posterior pdf and the histogram bars
    r_histogram_height = r_max_posterior * 0.9
    r_max = r_max_posterior + r_histogram_height
    ax.set_ylim(0, r_max)

    # plot the histogram bars outside the circle
    for i in range(len(hist_counts_normalized)):
        theta = bin_centers[i]
        width = bin_width
        hist_height = hist_counts_normalized[i] * r_histogram_height
        if hist_counts_normalized[i] > 0:
            ax.bar(
                theta, hist_height, width=width, bottom=r_max_posterior,
                color='teal', edgecolor='black', alpha=0.5
            )

    ax.text(
        0.5, 1.1, f'Posterior Distribution (Step {idx + 1})',
        transform=ax.transAxes, ha='center', va='bottom', fontsize=18
    )
    ax.set_yticklabels([])
    ax.grid(linestyle='--')
    ax.yaxis.set_visible(False)
    ax.spines['polar'].set_visible(False)
    plt.subplots_adjust(top=0.85, bottom=0.05, left=0.05, right=0.95)

    # saving to buffer for gif processing
    buf = BytesIO()
    plt.savefig(buf, format='png', bbox_inches=None, pad_inches=0)
    buf.seek(0)
    img_array = plt.imread(buf)
    img_array = (img_array * 255).astype(np.uint8)
    plt.close(fig)
    return img_array
Now we're ready to write the update loop. Remember that we need to set our prior distribution. I'll start with a circular uniform distribution, which is equivalent to a von Mises distribution with a concentration parameter of 0. For kappa_likelihood I set a fixed, moderate concentration parameter of 2. That will make the posterior updates more visible.
from scipy.special import i0  # modified Bessel function of the first kind, order 0

# initial prior parameters
mu_prior = 0.0     # initial mean direction (any value, since kappa_prior = 0)
kappa_prior = 0.0  # uniform prior over the circle

# fixed concentration parameter for the likelihood
kappa_likelihood = 2.0

posterior_mus = []
posterior_kappas = []

mu_grid = np.linspace(-np.pi, np.pi, 200)

# visualization parameters
fig_size = (10, 10)
dpi = 100

current_samples = []
frames = []

for idx, theta_n in enumerate(data['radians']):
    # compute posterior parameters
    C = kappa_prior * np.cos(mu_prior) + kappa_likelihood * np.cos(theta_n)
    S = kappa_prior * np.sin(mu_prior) + kappa_likelihood * np.sin(theta_n)
    kappa_post = np.sqrt(C**2 + S**2)
    mu_post = np.arctan2(S, C)

    # posterior distribution
    posterior_pdf = np.exp(kappa_post * np.cos(mu_grid - mu_post)) / (2 * np.pi * i0(kappa_post))

    # store posterior parameters and observed samples
    posterior_mus.append(mu_post)
    posterior_kappas.append(kappa_post)
    current_samples.append(theta_n)

    # plot posterior distribution
    r_max_posterior = max(posterior_pdf) * 1.1
    img_array = get_posterior_distribution_image_array(
        mu_grid,
        posterior_pdf,
        current_samples,
        idx,
        fig_size,
        dpi,
        r_max_posterior
    )
    frames.append(img_array)

    # updating priors for the next iteration
    mu_prior = mu_post
    kappa_prior = kappa_post

# Create GIF
fps = 10
frames.extend([img_array] * fps * 3)  # repeat the last frame a few times to make a "pause" at the end of the GIF
imageio.mimsave('../images/posterior_updates.gif', frames, fps=fps)
And that's it! The code generates a GIF showing the posterior distribution updating with every new observation. Here's the wonderful result:
With every new observation the posterior distribution gets more and more concentrated around the true mean direction. If only I could replace the red line with Auri's silhouette, it would be perfect!
We can further visualize the history of the posterior mean direction and concentration parameter. Let's plot them:
# Convert posterior_mus to degrees
posterior_mus_deg = np.rad2deg(posterior_mus) % 360

n_samples = data.shape[0]
true_mu = data['degrees'].mean()

# Plot evolution of posterior mean direction
fig, ax1 = plt.subplots(figsize=(12, 6))

color = 'tab:blue'
ax1.set_xlabel('Number of Observations')
ax1.set_ylabel('Posterior Mean Direction (Degrees)', color=color)
ax1.plot(range(1, n_samples + 1), posterior_mus_deg, marker='o', color=color)
ax1.tick_params(axis='y', labelcolor=color)
ax1.axhline(true_mu, color='red', linestyle='--', label='Sample Distribution Mean Direction')
ax1.legend(loc='upper left')
ax1.grid(True)

ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis
color = 'tab:orange'
ax2.set_ylabel('Posterior Concentration Parameter (kappa)', color=color)  # we already handled the x-label with ax1
ax2.plot(range(1, n_samples + 1), posterior_kappas, marker='o', color=color)
ax2.tick_params(axis='y', labelcolor=color)

fig.tight_layout()  # otherwise the right y-label is slightly clipped
sns.despine()
plt.title('Evolution of Posterior Mean Direction and Concentration Over Time')
plt.show()
The plot shows how the posterior mean direction and concentration parameter evolve with every new observation. The mean direction eventually converges to the sample value, while the concentration parameter increases as the estimate becomes more certain.
The last thing I wanted to try was the Bayes Factor approach to hypothesis testing. The idea behind the Bayes Factor is very simple: it's the ratio of two marginal likelihoods for two competing hypotheses/models.
In general, the Bayes Factor is defined as:

BF_ij = p(D | M_i) / p(D | M_j) = [ p(M_i | D) / p(M_j | D) ] · [ p(M_j) / p(M_i) ]

where:
- p(D | M_i) and p(D | M_j) are the marginal likelihoods of the data under hypotheses i and j
- p(M_i | D) and p(M_j | D) are the posterior probabilities of the models given the data
- p(M_i) and p(M_j) are the prior probabilities of the models
The result is a number that tells us how much more likely one hypothesis is compared to the other. There are different approaches to interpreting the Bayes Factor, but a common one is Jeffreys' scale, proposed by Harold Jeffreys:
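The scale itself isn't reproduced here, but as a rough guide, here is a small helper that maps a Bayes Factor onto the commonly quoted Jeffreys cut points (the exact labels and thresholds vary slightly between sources, so treat this as an approximation rather than the article's own table):

def jeffreys_label(bf: float) -> str:
    """Approximate interpretation of a Bayes Factor on Jeffreys' scale."""
    if bf < 1:
        return 'supports the other hypothesis'
    elif bf < 10**0.5:   # ~3.16
        return 'barely worth mentioning'
    elif bf < 10:
        return 'substantial evidence'
    elif bf < 10**1.5:   # ~31.6
        return 'strong evidence'
    elif bf < 100:
        return 'very strong evidence'
    return 'decisive evidence'

For example, the Bayes Factor of 32.46 that we get below for the unimodal South model falls into the "very strong evidence" band.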
What are the models, you may ask? Simple! They are distributions with different parameters. I'll be using PyMC to define the models and sample posterior distributions from them.
First of all, let's re-introduce the null hypothesis. I still assume it's a circular uniform von Mises distribution with kappa = 0, but this time we need to calculate the likelihood of the data under this hypothesis. To simplify further calculations, we'll work with log-likelihoods.
# Calculate log likelihood for H0
log_likelihood_h0 = vonmises.logpdf(data['radians'], kappa=0, loc=0).sum()
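Since a von Mises distribution with κ = 0 reduces to the uniform distribution on the circle (density 1/(2π)), the same number can also be obtained directly; a one-line sanity check, assuming `data` is the observations DataFrame used throughout:

# uniform circular density is 1/(2*pi), so the log likelihood is just -n*log(2*pi)
log_likelihood_h0_check = -len(data) * np.log(2 * np.pi)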
Next, it's time to build the alternative model. We start with a simple scenario of a Unimodal South direction, where I assume that the distribution is concentrated around 180°, or π in radians.
Let's define the model in PyMC. We'll use a von Mises distribution with a fixed location parameter μ = π and a Half-Normal prior for the non-negative concentration parameter κ. This allows the model to learn the concentration parameter from the data and check whether the South direction is preferred.
import pymc as pm
import arviz as az
import arviz.data.inference_data as InferenceData
from scipy.stats import halfnorm, gaussian_kde

with pm.Model() as model_uni:
    # Prior for kappa
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Likelihood
    likelihood_h1 = pm.VonMises('angles', mu=np.pi, kappa=kappa, observed=data['radians'])

    # Sample from the posterior
    trace_uni = pm.sample(
        10000, tune=3000, chains=4,
        return_inferencedata=True,
        idata_kwargs={'log_likelihood': True})
This gives us a nice, simple model, which we can also visualize:
# Model graph
pm.model_to_graphviz(model_uni)
And here's the posterior distribution for the concentration parameter κ:
az.plot_posterior(trace_uni, var_names=['kappa'])
plt.show()
All that's left is to calculate the log-likelihood for the alternative model and the Bayes Factor.
# Posterior samples for kappa
kappa_samples = trace_uni.posterior.kappa.values.flatten()

# Log likelihood for each posterior sample
log_likes = []
for k in kappa_samples:
    # von Mises log likelihood
    log_like = vonmises.logpdf(data['radians'], k, loc=np.pi).sum()
    log_likes.append(log_like)
log_likes = np.array(log_likes)

# Log-mean-exp trick for numerical stability
log_likelihood_h1 = np.max(log_likes) + \
    np.log(np.mean(np.exp(log_likes - np.max(log_likes))))

BF = np.exp(log_likelihood_h1 - log_likelihood_h0)
print(f"Bayes Factor: {BF:.4f}")
print(f"Probability kappa > 0.5: {np.mean(kappa_samples > 0.5):.4f}")
>> Bayes Factor: 32.4645
>> Probability kappa > 0.5: 0.0649
Since we're dividing the likelihood of the alternative model by the likelihood of the null model, the Bayes Factor indicates how much more likely the data is under the alternative hypothesis. Here we get 32.46, very strong evidence that the data is not uniformly distributed around the circle and that there is a preference for the South direction.
However, we additionally calculate the probability that the concentration parameter kappa is greater than 0.5. This is a simple way to check whether the distribution is meaningfully different from the uniform one. With the Unimodal South model, this probability is only 0.0649, meaning that the distribution is still quite spread out.
Let's try another model: a Bimodal North-South mixture.
This time I'll assume that the distribution is bimodal, with peaks around 0° and 180°, just as we saw in the compass rose.
To achieve this, I'll need a mixture of two von Mises distributions with different fixed mean directions and a shared concentration parameter, i.e. p(θ) = w₁ · VonMises(θ | μ=0, κ) + w₂ · VonMises(θ | μ=π, κ).
Let's define a few helper functions first:
from typing import Dict, List, Union

import numpy.typing as npt
import pandas as pd

# Type aliases
ArrayLike = Union[np.ndarray, pd.Series]
ResultDict = Dict[str, Union[float, InferenceData.InferenceData]]

def compute_mixture_vonmises_logpdf(
    series: ArrayLike,
    kappa: float,
    weights: npt.NDArray[np.float64],
    mus: List[float]
) -> float:
    """
    Compute the log PDF for a mixture of von Mises distributions

    Parameters:
    -----------
    series: ArrayLike
        Array of observed angles in radians
    kappa: float
        Concentration parameter
    weights: npt.NDArray[np.float64]
        Array of mixture weights
    mus: List[float]
        Mean directions, one for each component

    Returns:
    --------
    float: Sum of log probabilities over all data points
    """
    mixture_pdf = np.zeros_like(series, dtype=float)
    for w, mu in zip(weights, mus):
        mixture_pdf += w * vonmises.pdf(series, kappa, loc=mu)
    return np.log(np.maximum(mixture_pdf, 1e-300)).sum()
def compute_log_likelihoods(
    trace: az.InferenceData,
    series: ArrayLike,
    mus: List[float]
) -> float:
    """
    Compute log likelihoods for each sample in the trace and combine them
    into a marginal log likelihood

    Parameters:
    -----------
    trace: az.InferenceData
        The trace from the PyMC model sampling.
    series: ArrayLike
        Array of observed angles in radians
    mus: List[float]
        Fixed mean directions of the mixture components
    """
    kappa_samples = trace.posterior.kappa.values.flatten()
    # one weight vector per posterior sample, one entry per mixture component
    weights_samples = trace.posterior.weights.values.reshape(-1, len(mus))

    # Calculate log likelihood for each posterior sample
    log_likes = []
    for k, w in zip(kappa_samples, weights_samples):
        log_like = compute_mixture_vonmises_logpdf(
            series,
            kappa=k,
            weights=w,
            mus=mus
        )
        log_likes.append(log_like)
    log_likes = np.array(log_likes)

    # Calculate the marginal likelihood using the log-mean-exp trick
    log_likelihood_h1 = np.max(log_likes) + np.log(np.mean(np.exp(log_likes - np.max(log_likes))))
    return log_likelihood_h1
def posterior_report(
    log_likelihood_h0: float,
    log_likelihood_h1: float,
    kappa_samples: ArrayLike,
    kappa_threshold: float = 0.5
) -> str:
    """
    Generate a report with the Bayes Factor and the probability that kappa > threshold

    Parameters:
    -----------
    log_likelihood_h0: float
        Log likelihood for the null hypothesis
    log_likelihood_h1: float
        Log likelihood for the alternative hypothesis
    kappa_samples: ArrayLike
        Flattened posterior samples of the concentration parameter
    kappa_threshold: float
        Threshold for computing the probability that kappa > threshold

    Returns:
    --------
    summary: str
        A formatted string containing the summary statistics.
    """
    BF = np.exp(log_likelihood_h1 - log_likelihood_h0)
    summary = (
        f"Bayes Factor: {BF:.4f}\n"
        f"Probability kappa > {kappa_threshold}: {np.mean(kappa_samples > kappa_threshold):.4f}"
    )
    return summary
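Before going back to PyMC, here is a quick illustrative call to compute_mixture_vonmises_logpdf, just to show the expected argument shapes; the two angles and the equal weights are made up:

# hypothetical sanity check: two observations, equal weights, peaks at 0 and pi
example_angles = np.array([0.1, 3.0])
print(compute_mixture_vonmises_logpdf(example_angles, kappa=1.0,
                                      weights=np.array([0.5, 0.5]), mus=[0.0, np.pi]))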
And now back to the model:
mu1 = 0        # 0 degrees
mu2 = np.pi    # 180 degrees

with pm.Model() as model_mixture_bimodal_NS:
    # Prior for the concentration parameter
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Prior for the component weights
    weights = pm.Dirichlet('weights', a=np.ones(2))

    # Define the two von Mises components
    vm1 = pm.VonMises.dist(mu=mu1, kappa=kappa)
    vm2 = pm.VonMises.dist(mu=mu2, kappa=kappa)

    # Mixture distribution
    likelihood = pm.Mixture(
        'angles',
        w=weights,
        comp_dists=[vm1, vm2],
        observed=data['radians']
    )

    # Sample from the posterior
    trace_mixture_bimodal_NS = pm.sample(
        10000, tune=3000, chains=4, return_inferencedata=True, idata_kwargs={'log_likelihood': True})

# Get kappa samples
kappa_samples = trace_mixture_bimodal_NS.posterior.kappa.values.flatten()
Once again, let's visualize the model graph and the posterior distribution for the concentration parameter κ:
# Model graph
pm.model_to_graphviz(model_mixture_bimodal_NS)

# Posterior analysis
az.plot_posterior(trace_mixture_bimodal_NS, var_names=['kappa'])
plt.show()
And finally, let's calculate the Bayes Factor and the probability that the concentration parameter κ is greater than 0.5:
log_likelihood_h1 = compute_log_likelihoods(trace_mixture_bimodal_NS, data['radians'], [mu1, mu2])
print(posterior_report(log_likelihood_h0, log_likelihood_h1, kappa_samples))
>> Bayes Factor: 214.2333
>> Probability kappa > 0.5: 0.9110
Fantastic! Both of our metrics indicate that this model is a much better fit for the data. The Bayes Factor suggests decisive evidence, and most of the posterior κ samples are greater than 0.5, with a mean value of 0.99, as we saw on the distribution plot.
Let's try a couple more models before wrapping up.
This next model once again assumes a bimodal distribution, but this time with peaks around 270° and 180°, which were also common directions in the compass rose.
mu1 = np.pi            # 180 degrees
mu2 = 3 * np.pi / 2    # 270 degrees

with pm.Model() as model_mixture_bimodal_WS:
    # Prior for the concentration parameter
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Prior for the component weights
    weights = pm.Dirichlet('weights', a=np.ones(2))

    # Define the two von Mises components
    vm1 = pm.VonMises.dist(mu=mu1, kappa=kappa)
    vm2 = pm.VonMises.dist(mu=mu2, kappa=kappa)

    # Mixture distribution
    likelihood = pm.Mixture(
        'angles',
        w=weights,
        comp_dists=[vm1, vm2],
        observed=data['radians']
    )

    # Sample from the posterior
    trace_mixture_bimodal_WS = pm.sample(
        10000, tune=3000, chains=4, return_inferencedata=True, idata_kwargs={'log_likelihood': True})

# Get kappa samples
kappa_samples = trace_mixture_bimodal_WS.posterior.kappa.values.flatten()

# Posterior analysis
az.plot_posterior(trace_mixture_bimodal_WS, var_names=['kappa'])
plt.show()
log_likelihood_h1 = compute_log_likelihoods(trace_mixture_bimodal_WS, data['radians'], [mu1, mu2])
print(posterior_report(log_likelihood_h0, log_likelihood_h1, kappa_samples))
>> Bayes Factor: 20.2361
>> Probability kappa > 0.5: 0.1329
Nope, definitely not as good as the previous model. Next!
Final round. Maybe my dog really likes to align himself with the cardinal directions? Let's try a quadrimodal distribution with peaks around 0°, 90°, 180°, and 270°.
mu1 = 0                # 0 degrees
mu2 = np.pi / 2        # 90 degrees
mu3 = np.pi            # 180 degrees
mu4 = 3 * np.pi / 2    # 270 degrees

with pm.Model() as model_mixture_quad:
    # Prior for the concentration parameter
    kappa = pm.HalfNormal('kappa', sigma=10)

    # Prior for the component weights
    weights = pm.Dirichlet('weights', a=np.ones(4))

    # Define the four von Mises components
    vm1 = pm.VonMises.dist(mu=mu1, kappa=kappa)
    vm2 = pm.VonMises.dist(mu=mu2, kappa=kappa)
    vm3 = pm.VonMises.dist(mu=mu3, kappa=kappa)
    vm4 = pm.VonMises.dist(mu=mu4, kappa=kappa)

    # Mixture distribution
    likelihood = pm.Mixture(
        'angles',
        w=weights,
        comp_dists=[vm1, vm2, vm3, vm4],
        observed=data['radians']
    )

    # Sample from the posterior
    trace_mixture_quad = pm.sample(
        10000, tune=3000, chains=4, return_inferencedata=True, idata_kwargs={'log_likelihood': True}
    )

# Get kappa samples
kappa_samples = trace_mixture_quad.posterior.kappa.values.flatten()

# Posterior analysis
az.plot_posterior(trace_mixture_quad, var_names=['kappa'])
plt.show()

log_likelihood_h1 = compute_log_likelihoods(trace_mixture_quad, data['radians'], [mu1, mu2, mu3, mu4])
print(posterior_report(log_likelihood_h0, log_likelihood_h1, kappa_samples))
>> Bayes Factor: 0.0000
>> Probability kappa > 0.5: 0.9644
Well… not really. Although the probability that the concentration parameter κ is greater than 0.5 is quite high, the Bayes Factor is 0.0.
That's the great thing about the Bayes Factor: because the marginal likelihood averages the fit over all plausible parameter values, the extra flexibility of an overly complex model counts against it, effectively preventing overfitting.
Let's summarize the results of all the models using information criteria. We'll use the Widely Applicable Information Criterion (WAIC) and Leave-One-Out Cross-Validation (LOO) to compare the models.
# Compute WAIC for each model
waic_uni = az.waic(trace_uni)
waic_quad = az.waic(trace_mixture_quad)
waic_bimodal_NS = az.waic(trace_mixture_bimodal_NS)
waic_bimodal_WS = az.waic(trace_mixture_bimodal_WS)

model_dict = {
    'Quadrimodal Model': trace_mixture_quad,
    'Bimodal Model (NS)': trace_mixture_bimodal_NS,
    'Bimodal Model (WS)': trace_mixture_bimodal_WS,
    'Unimodal Model': trace_uni
}

# Compare models using WAIC
waic_comparison = az.compare(model_dict, ic='waic')
waic_comparison
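The LOO comparison mentioned above follows the same pattern; here's a sketch reusing the same model_dict:

# Compare models using LOO
loo_comparison = az.compare(model_dict, ic='loo')
loo_comparison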