The general graph processing/training pipeline for symbolic music scores within GraphMuse involves the following steps:
- Preprocess the database of scores to generate input graphs; GraphMuse can do this for you quickly and easily.
- Sample the input graphs to create memory-efficient batches; once more, GraphMuse has got your back.
- Form a batch as a new graph with nodes and edges from various sampled input graphs. For each graph, a set of nodes is selected, which we call target nodes. The neighbors of the target nodes can also be fetched on demand in a process called node-wise sampling.
- Update the target nodes' representations through graph convolution to create node embeddings. GraphMuse provides some models that you can use; otherwise, PyTorch Geometric can also be your friend.
- Use these embeddings for task-specific applications. This part is on you, but I'm sure you can make it!
Note that target nodes may include all or a subset of the batch nodes, depending on the sampling strategy.
Now that the process is graphically explained, let's take a closer look at how GraphMuse handles sampling notes from each score.
Sampling process per score.
- A randomly selected note (in yellow) is first sampled.
- The boundaries of the target notes are then computed with a budget of 15 notes in this example (pink and yellow notes).
- Then the k-hop neighbors are fetched for the targets (light blue for 1-hop and darker blue for 2-hop). The k-hop neighbors are computed with respect to the input graph (depicted with colored edges connecting noteheads in the figure above).
- We can also extend the sampling process to the beat and measure elements. Note that the k-hop neighbors need not be strictly related to a time window.
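The steps above can be sketched in plain Python. This is only an illustration of the idea, not GraphMuse's implementation: the helper names `sample_targets` and `k_hop_neighbors`, the toy edge list, and the choice to start the target window at the sampled note are all assumptions made for the example.

```python
import random
from collections import defaultdict

def sample_targets(num_notes, budget, rng):
    """Pick a random anchor note and return a contiguous window of `budget` target notes."""
    anchor = rng.randrange(num_notes)
    # Shift the window left if the anchor is too close to the end of the score.
    start = min(anchor, max(0, num_notes - budget))
    return list(range(start, min(start + budget, num_notes)))

def k_hop_neighbors(edges, targets, k):
    """Return one set of newly reached nodes per hop, following edges of the input graph."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    seen = set(targets)
    frontier = set(targets)
    hops = []
    for _ in range(k):
        reached = set()
        for node in frontier:
            reached |= adjacency[node]
        reached -= seen  # keep only nodes not visited in earlier hops
        hops.append(reached)
        seen |= reached
        frontier = reached
    return hops

# Toy score (hypothetical): 40 notes ordered by onset, each linked to the next two notes.
num_notes = 40
edges = [(i, j) for i in range(num_notes) for j in (i + 1, i + 2) if j < num_notes]
rng = random.Random(0)
targets = sample_targets(num_notes, budget=15, rng=rng)
hop1, hop2 = k_hop_neighbors(edges, targets, k=2)
print(len(targets), sorted(hop1), sorted(hop2))
```

Because the neighbors are found by following graph edges rather than by slicing time, a real score graph with long-range edges could pull in notes far from the target window.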
To make the most of the computational resources (i.e. memory), the above process is repeated for many scores at once to create one batch. Using this process, GraphMuse ensures that every sampled segment has the same number of target notes. The sampled segments are then combined into a new graph, which will be of size at most #_scores x #_target_notes. This new graph constitutes the batch for the current training step.
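A minimal sketch of how several sampled segments can be merged into one batch graph is shown below. The function `combine_segments` and the dictionary layout are hypothetical and only illustrate the usual index-offset trick; GraphMuse's own batching may differ in its details.

```python
def combine_segments(segments):
    """Merge sampled segments into one graph by offsetting node indices.

    Each segment is a dict with `num_nodes`, `edges` (local index pairs) and
    `targets` (local indices of the target notes).
    """
    edges, targets, segment_of_node = [], [], []
    offset = 0
    for segment_id, segment in enumerate(segments):
        edges += [(src + offset, dst + offset) for src, dst in segment["edges"]]
        targets += [t + offset for t in segment["targets"]]
        # Remember which segment (score) every node came from.
        segment_of_node += [segment_id] * segment["num_nodes"]
        offset += segment["num_nodes"]
    return {"num_nodes": offset, "edges": edges, "targets": targets, "batch": segment_of_node}

segments = [
    {"num_nodes": 5, "edges": [(0, 1), (1, 2), (2, 3), (3, 4)], "targets": [1, 2, 3]},
    {"num_nodes": 4, "edges": [(0, 1), (1, 2), (2, 3)], "targets": [0, 1, 2]},
]
batch = combine_segments(segments)
print(batch["targets"])  # [1, 2, 3, 5, 6, 7]
```

With two segments of three target notes each, the merged graph holds six target notes, matching the #_scores x #_target_notes bound.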
For the hands-on part, let's try to use GraphMuse and use a model for pitch spelling. The pitch spelling task is about inferring the note name and accidentals when they are absent from the score. An example of this application is when we have a quantized MIDI file and want to create a score such as the example in the figure below:
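To see why pitch spelling is a real inference problem, the sketch below lists the note names that share a single MIDI pitch. It is a standalone illustration; `candidate_spellings` and `format_spelling` are hypothetical helpers and are not part of GraphMuse or Partitura.

```python
STEP_TO_PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {-2: "bb", -1: "b", 0: "", 1: "#", 2: "##"}

def candidate_spellings(midi_pitch, max_alter=1):
    """List every (step, alter, octave) spelling that sounds as `midi_pitch`."""
    spellings = []
    for step, pitch_class in STEP_TO_PITCH_CLASS.items():
        for alter in range(-max_alter, max_alter + 1):
            value = midi_pitch - alter - pitch_class
            if value % 12 == 0:
                octave = value // 12 - 1  # convention: MIDI 60 is C4
                spellings.append((step, alter, octave))
    return spellings

def format_spelling(spelling):
    step, alter, octave = spelling
    return f"{step}{ACCIDENTALS[alter]}{octave}"

print([format_spelling(s) for s in candidate_spellings(61)])  # ['C#4', 'Db4']
print([format_spelling(s) for s in candidate_spellings(60)])  # ['C4', 'B#3']
```

A MIDI file only stores the number 61, so the model has to decide from context whether the note should be written as C# or Db.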
Before installing GraphMuse you will need to install PyTorch and PyTorch Geometric. Check their official installation guides for the version appropriate for your system.
After this step, to install GraphMuse, open your preferred terminal and type:
pip install graphmuse
After installation, let's read a MIDI file from a URL and create the score graph with GraphMuse.
import graphmuse as gm

midi_url_raw = "https://github.com/CPJKU/partitura/raw/refs/heads/main/tests/data/midi/bach_midi_score.mid"
graph = gm.load_midi_to_graph(midi_url_raw)
The underlying process reads the file with Partitura and then feeds it through GraphMuse.
To train our model to handle pitch spelling, we first need a dataset of musical scores where the pitch spelling has already been annotated. For this, we'll be using the ASAP Dataset (licensed under CC BY-NC-SA 4.0), which will serve as the foundation for our model's learning. To get the ASAP Dataset you can download it using git or directly from GitHub:
git clone https://github.com/cpjku/asap-dataset.git
The ASAP dataset contains scores and performances of various classical piano pieces. For our use case we will use only the scores, which end in .musicxml.
As we load this dataset, we'll need two essential utilities: one to encode pitch spelling and another to handle key signature information, both of which will be converted into numerical labels. Fortunately, these utilities are available within the pre-built pitch spelling model in GraphMuse. Let's begin by importing all the necessary packages and loading the first score to get started.
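To make "converted into numerical labels" concrete, here is a minimal sketch of what such an encoder can look like. The class `SpellingLabelEncoder` and its label set are assumptions made for illustration; the encoders shipped with GraphMuse's pitch spelling model may use a different label set and interface.

```python
class SpellingLabelEncoder:
    """Hypothetical encoder: maps (step, alter) pairs to integer class labels and back."""

    def __init__(self, steps="CDEFGAB", alters=(-2, -1, 0, 1, 2)):
        # One class per combination of note name and accidental.
        self.classes = [(step, alter) for step in steps for alter in alters]
        self.index = {spelling: i for i, spelling in enumerate(self.classes)}

    def encode(self, spellings):
        return [self.index[s] for s in spellings]

    def decode(self, labels):
        return [self.classes[label] for label in labels]

encoder = SpellingLabelEncoder()
labels = encoder.encode([("C", 1), ("D", -1), ("G", 0)])
print(labels)  # [3, 6, 22]
print(encoder.decode(labels))
```

Integer labels like these are what a cross-entropy loss expects as targets, which is why the encoding step comes before training.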
import graphmuse as gm
import partitura as pt
import os
import torch
import numpy as np

# Directory containing the dataset, change this to the location of your dataset
dataset_dir = "/your/path/to/the/asap-dataset"
# Find all the score files in the dataset (they are all named 'xml_score.musicxml')
score_files = [os.path.join(dp, f) for dp, dn, filenames in os.walk(dataset_dir) for f in filenames if f == 'xml_score.musicxml']
# Use the first 30 scores, change this number to use more or fewer scores
score_files = score_files[:30]
# Probe the first score file
score = pt.load_score(score_files[0])
# Extract features and note array
features, f_names = gm.utils.get_score_features(score)
na = score.note_array(include_pitch_spelling=True, include_key_signature=True)
# Create a graph from the score features
graph = gm.create_score_graph(features, score.note_array())
# Get input feature size and metadata from the first graph
in_feats = graph["note"].x.shape[1]
metadata = graph.metadata()
# Create a model for pitch spelling prediction
model = gm.nn.models.PitchSpellingGNN(
    in_feats=in_feats, n_hidden=128, out_feats_enc=64, n_layers=2, metadata=metadata, add_seq=True
)
# Create encoders for pitch and key signature labels
pe = model.pitch_label_encoder
ke = model.key_label_encoder
Next, we'll load the score files from the dataset to continue preparing our data for model training.
# Initialize the list that stores the labeled graphs.
# The probe graph created earlier carries no labels, so every score
# (including the first one) is processed in the loop below.
graphs = []

# Process each score file
for score_file in score_files:
    # Load the score
    score = pt.load_score(score_file)
    # Extract features and note array
    features, f_names = gm.utils.get_score_features(score)
    na = score.note_array(include_pitch_spelling=True, include_key_signature=True)
    # Encode pitch and key signature labels
    labels_pitch = pe.encode(na)
    labels_key = ke.encode(na)
    # Create a graph from the score features
    graph = gm.create_score_graph(features, score.note_array())
    # Add encoded labels to the graph
    graph["note"].y_pitch = torch.from_numpy(labels_pitch).long()
    graph["note"].y_key = torch.from_numpy(labels_key).long()
    # Append the graph to the list
    graphs.append(graph)
Once the graph structures are ready, we can move on to creating the data loader, which is conveniently provided by GraphMuse. At this stage, we'll also define standard training components like the loss function and optimizer to guide the learning process.
# Create a DataLoader to sample subgraphs from the graphs
loader = gm.loader.MuseNeighborLoader(graphs, subgraph_size=100, batch_size=16, num_neighbors=[3, 3])

# Define loss functions for pitch and key prediction
loss_pitch = torch.nn.CrossEntropyLoss()
loss_key = torch.nn.CrossEntropyLoss()
# Define the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
Let me comment a bit more on gm.loader.MuseNeighborLoader.
This is the core dataloader in GraphMuse and it contains the sampling that was explained in the previous section. subgraph_size refers to the number of target nodes per input graph, batch_size is the number of sampled graphs per batch, and finally, num_neighbors refers to the number of neighbors sampled per sampled node in each layer.
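As a rough feel for what these three parameters imply, the following back-of-the-envelope sketch computes an upper bound on the batch size for the values used above. It assumes a single edge type and no overlap between sampled neighbors, which is a simplification: score graphs are heterogeneous, and if the fan-out is applied per edge type the bound would be higher, while in practice overlapping neighbors make real batches smaller. The helper `max_nodes_per_segment` is hypothetical.

```python
def max_nodes_per_segment(subgraph_size, num_neighbors):
    """Upper bound on nodes in one sampled segment: targets plus neighbors of each hop."""
    total = frontier = subgraph_size
    for fanout in num_neighbors:
        frontier *= fanout  # each node in the frontier samples at most `fanout` neighbors
        total += frontier
    return total

subgraph_size, batch_size, num_neighbors = 100, 16, [3, 3]
per_segment = max_nodes_per_segment(subgraph_size, num_neighbors)
print(batch_size * subgraph_size)  # 1600 target notes at most
print(per_segment)                 # 1300 = 100 + 300 + 900
print(batch_size * per_segment)    # 20800
```

The length of num_neighbors (two entries here) matches the two hops of neighbors fetched around the targets.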
With everything in place, we're finally ready to train the model. So, let's dive in and start the training process!
# Train the model for 5 epochs
for epoch in range(5):
    loss = 0
    i = 0
    for batch in loader:
        # Zero the gradients
        optimizer.zero_grad()

        # Get neighbor masks for nodes and edges for more efficient training
        neighbor_mask_node = {k: batch[k].neighbor_mask for k in batch.node_types}
        neighbor_mask_edge = {k: batch[k].neighbor_mask for k in batch.edge_types}
        # Forward pass through the model
        pred_pitch, pred_key = model(
            batch.x_dict, batch.edge_index_dict, neighbor_mask_node, neighbor_mask_edge,
            batch["note"].batch[batch["note"].neighbor_mask == 0]
        )
        # Compute loss for pitch and key prediction (target notes only)
        loss_pitch_val = loss_pitch(pred_pitch, batch["note"].y_pitch[batch["note"].neighbor_mask == 0])
        loss_key_val = loss_key(pred_key, batch["note"].y_key[batch["note"].neighbor_mask == 0])
        # Total loss
        loss_val = loss_pitch_val + loss_key_val
        # Backward pass and optimization
        loss_val.backward()
        optimizer.step()
        # Accumulate loss
        loss += loss_val.item()
        i += 1
    # Print average loss for the epoch
    print(f"Epoch {epoch} Loss {loss / i}")
Hopefully, we'll soon see the loss decreasing, a positive sign that our model is effectively learning how to perform pitch spelling. Fingers crossed!
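One detail of the training loop worth spelling out is the `neighbor_mask == 0` indexing. Judging from how the loop uses it, the mask appears to hold 0 for target notes and a positive value for sampled neighbors, so the loss is computed on targets only. The plain-Python sketch below mimics that selection on toy data; `select_targets` is a hypothetical helper, and the hop values are an assumption about what the mask stores.

```python
def select_targets(values, neighbor_mask):
    """Keep only entries whose mask value is 0, i.e. the target nodes."""
    return [value for value, hop in zip(values, neighbor_mask) if hop == 0]

# Toy batch of six notes: three targets, two 1-hop neighbors, one 2-hop neighbor.
neighbor_mask = [0, 0, 1, 2, 0, 1]
labels = ["C#", "D", "E", "F#", "G", "A"]
print(select_targets(labels, neighbor_mask))  # ['C#', 'D', 'G']
```

The neighbors still contribute context through message passing, but they do not contribute to the loss.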
GraphMuse is a framework that tries to make the training and deployment of graph models for symbolic music processing easier.
For those who want to retrain, deploy, or finetune previous state-of-the-art models for symbolic music analysis, GraphMuse contains some of the necessary components to rebuild and retrain your model faster and more efficiently.
GraphMuse retains its flexibility through its simplicity, for those who want to prototype, innovate, and design new models. It aims to provide a simple set of utilities rather than including complex chained pipelines that can block the innovation process.
For those who want to learn, visualize, and get hands-on experience, GraphMuse is great to get you started. It offers an easy introduction to basic functions and pipelines with a few lines of code. GraphMuse is also linked with MusGViz, which allows graphs and scores to be easily visualized together.
We cannot talk about the positive aspects of any project without discussing the negative ones as well.
GraphMuse is a newborn project and in its current state, it is quite simple. It is focused on covering the essential parts of graph learning rather than being a holistic framework that covers all possibilities. Therefore it still relies a lot on user-based implementation for many parts of the aforementioned pipeline.
Like every open-source project in development, GraphMuse needs help to grow. So please, if you find bugs or want more features, don't hesitate to report, request, or contribute to the GraphMuse GitHub project.
Last but not least, GraphMuse uses C libraries such as torch-sparse and torch-scatter and has its own C bindings to accelerate graph creation; therefore, installation is not always straightforward. Installation on Windows is harder, judging from our user testing and user interaction reports, although not impossible (I am running it on Windows myself).
Future plans include:
- Making installation easier;
- Adding more support for models and dataloaders for specific tasks;
- Growing the open-source community around GraphMuse to keep graph coding for music growing.
GraphMuse is a Python library that makes working with music graphs a little bit easier. It focuses on the training side of graph-based models for music but aims to retain flexibility when research-based projects require it.
If you would like to support the development and future growth of GraphMuse, please star the GraphMuse repository on GitHub.
Happy graph coding!!!
[all images are by the author]