from musicntd.model.current_plot import *
import musicntd.scripts.final_tests as test
import musicntd.autosimilarity_segmentation as as_seg
import musicntd.scripts.hide_code
import pandas as pd
This notebook presents our final segmentation results for this version of the code; these are the results reported in the paper.
# Fixed hyperparams
subdivision = 96
annotations_type = "MIREX10"
penalty_func = "modulo8" # "Favouring 8, then modulo 4"
# Paths
entire_rwc = "C:\\Users\\amarmore\\Desktop\\Audio samples\\RWC Pop\\Entire RWC"
even_songs = "C:\\Users\\amarmore\\Desktop\\Audio samples\\RWC Pop\\Even songs"
odd_songs = "C:\\Users\\amarmore\\Desktop\\Audio samples\\RWC Pop\\Odd songs"
For these final tests, we try ranks for $H$ and $Q$ in $\{12, 16, 20, 24, 28, 32, 36, 40, 44, 48\}$, and values of the parameter $\lambda$ in $[0, 2)$ with a step of $0.1$.
ranks_rhythm = [12,16,20,24,28,32,36,40,44,48]
ranks_pattern = [12,16,20,24,28,32,36,40,44,48]
penalty_range = [i/10 for i in range(0,20)]
Below is the convolution kernel we will use.
convolution_type = "eight_bands"
plot_me_this_spectrogram(as_seg.compute_all_kernels(10, convolution_type = convolution_type)[-1], title="Kernel", x_axis = None, y_axis = None)
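The "eight_bands" kernel is specific to this codebase. As a generic illustration of kernel-based novelty detection (not the project's actual kernel), here is a minimal sketch of a Foote-style checkerboard kernel convolved along the diagonal of an autosimilarity matrix; the helper names are hypothetical:

```python
import numpy as np

def checkerboard_kernel(half_size):
    # Checkerboard kernel: +1 on the diagonal blocks, -1 on the off-diagonal blocks.
    half = np.ones((half_size, half_size))
    return np.block([[half, -half], [-half, half]])

def novelty_curve(autosimilarity, half_size=4):
    # Slide the kernel along the main diagonal of the autosimilarity matrix;
    # peaks of the resulting curve indicate candidate segment boundaries.
    n = autosimilarity.shape[0]
    kernel = checkerboard_kernel(half_size)
    novelty = np.zeros(n)
    for i in range(half_size, n - half_size):
        window = autosimilarity[i - half_size:i + half_size, i - half_size:i + half_size]
        novelty[i] = np.sum(window * kernel)
    return novelty
```

On a block-diagonal autosimilarity matrix, this curve peaks exactly at the block boundary.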
We begin with a condition (not included in the paper) where all parameters are fixed, making the method totally blind; the parameters themselves, though, were chosen empirically.
Parameter $\lambda$ was fixed to 1, as it was the best value in the experimental notebook for the "Favouring 8, then modulo 4" penalty function.
Ranks were set to 32 for both $H$ and $Q$, as this is generally a good compromise.
_ = test.final_results_fixed_conditions(entire_rwc, [12,32,32], penalty_weight = 1, annotations_type = annotations_type, subdivision = subdivision, penalty_func = penalty_func, legend = "with fixed conditions", convolution_type = convolution_type)
In order to find accurate parameters, we proceed by 2-fold cross-validation.
Here, the ranks for $H$ and $Q$ and the parameter $\lambda$ are learned on the even-numbered songs and tested on the odd-numbered songs, and vice versa.
The final results reported in the paper are the means of the results on both test subsets.
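Schematically (this is not the project's actual API, and `fit_score` is a hypothetical callback), the 2-fold scheme amounts to selecting the best parameters on one half of the dataset, scoring the other half with them, swapping the roles, and averaging the two test scores:

```python
def two_fold_scores(fold_a, fold_b, param_grid, fit_score):
    # fit_score(params, learning_set, testing_set) -> score on the testing set.
    def best_on(learning_set, testing_set):
        # Pick the parameters maximizing the score on the learning set...
        best = max(param_grid, key=lambda p: fit_score(p, learning_set, learning_set))
        # ...then report the score they obtain on the held-out testing set.
        return fit_score(best, learning_set, testing_set)

    score_on_b = best_on(fold_a, fold_b)  # learn on A, test on B
    score_on_a = best_on(fold_b, fold_a)  # learn on B, test on A
    return (score_on_a + score_on_b) / 2  # mean over both test subsets
```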
Firstly:
best_param_even, results_at_zero_five_even, results_at_three_even = test.several_ranks_with_cross_validation_of_param_RWC(learning_dataset = even_songs, testing_dataset = odd_songs,
ranks_rhythm = ranks_rhythm, ranks_pattern = ranks_pattern, penalty_range = penalty_range, annotations_type = annotations_type, subdivision = subdivision, penalty_func = penalty_func, convolution_type = convolution_type)
Secondly:
best_param_odd, results_at_zero_five_odd, results_at_three_odd = test.several_ranks_with_cross_validation_of_param_RWC(learning_dataset = odd_songs, testing_dataset = even_songs,
ranks_rhythm = ranks_rhythm, ranks_pattern = ranks_pattern, penalty_range = penalty_range, annotations_type = annotations_type,
subdivision = subdivision, penalty_func = penalty_func, convolution_type = convolution_type)
Final results:
With 0.5 seconds tolerance window:
test_mean_zero_five = [(results_at_zero_five_even[i] + results_at_zero_five_odd[i])/2 for i in range(6)]
pd.DataFrame(test_mean_zero_five, index = ['True Positives','False Positives','False Negatives','Precision', 'Recall', 'F measure'], columns = ["Results at 0.5 seconds on both test datasets"]).T
With 3 seconds tolerance window:
test_mean_three = [(results_at_three_even[i] + results_at_three_odd[i])/2 for i in range(6)]
pd.DataFrame(test_mean_three, index = ['True Positives','False Positives','False Negatives','Precision', 'Recall', 'F measure'], columns = ["Results at 3 seconds on both test datasets"]).T
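The last three rows of these tables are derived from the first three; as a reminder, precision, recall and F-measure follow from the boundary counts (a minimal sketch, not the project's code):

```python
def segmentation_metrics(true_positives, false_positives, false_negatives):
    # Standard boundary-retrieval metrics, as reported in the tables above.
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```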
Below are the results in the oracle-ranks condition, meaning that, for each song, we keep only the rank pair leading to the best F-measure.
penalty_weight = 1 # Fixed to one rather than learned, for convenience
zero_five_full = test.oracle_ranks(entire_rwc, ranks_rhythm, ranks_pattern, penalty_weight, annotations_type = annotations_type, subdivision = subdivision, penalty_func = penalty_func, convolution_type = convolution_type)
musicntd.scripts.hide_code.plot_3d_ranks_study(zero_five_full, ranks_rhythm, ranks_pattern)
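Schematically, the oracle condition amounts to keeping, for each song, the rank pair maximizing the F-measure; here is a sketch with a hypothetical `f_measure` callback (not the project's API):

```python
def oracle_best_ranks(songs, ranks_rhythm, ranks_pattern, f_measure):
    # f_measure(song, rank_h, rank_q) -> F-measure for that song and rank pair.
    best = {}
    for song in songs:
        # Exhaustively try every (rank_H, rank_Q) pair and keep the best one.
        best[song] = max(
            ((rh, rq) for rh in ranks_rhythm for rq in ranks_pattern),
            key=lambda pair: f_measure(song, *pair))
    return best
```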
Below are the results obtained when segmenting the autosimilarity of the signal directly.
This allows us to assess the benefit that the NTD itself brings to the segmentation.
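For reference, an autosimilarity matrix can be computed as the Gram matrix of the column-normalized feature sequence; a minimal sketch, assuming cosine similarity between frames:

```python
import numpy as np

def autosimilarity(features):
    # features: (dimensions, time) matrix; returns the (time, time) cosine autosimilarity.
    norms = np.linalg.norm(features, axis=0, keepdims=True)
    normalized = features / np.where(norms == 0, 1, norms)  # guard against silent frames
    return normalized.T @ normalized
```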
sig_zero_five_even, sig_three_even = test.cross_validation_on_signal(even_songs, odd_songs, penalty_range, convolution_type = convolution_type)
sig_zero_five_odd, sig_three_odd = test.cross_validation_on_signal(odd_songs, even_songs, penalty_range, convolution_type = convolution_type)
Finally, with a 0.5-second tolerance window:
sig_test_mean_zero_five = [(sig_zero_five_odd[i] + sig_zero_five_even[i])/2 for i in range(6)]
pd.DataFrame(sig_test_mean_zero_five, index = ['True Positives','False Positives','False Negatives','Precision', 'Recall', 'F measure'], columns = ["Results at 0.5 seconds on both test datasets"]).T
And with a 3-second tolerance window:
sig_test_mean_three = [(sig_three_odd[i] + sig_three_even[i])/2 for i in range(6)]
pd.DataFrame(sig_test_mean_three, index = ['True Positives','False Positives','False Negatives','Precision', 'Recall', 'F measure'], columns = ["Results at 3 seconds on both test datasets"]).T
Below (also not included in the paper) are the results of segmenting the autosimilarity of the signal on the entire RWC dataset, but without the penalty regularization function. As we can see, the penalty function greatly impacts the segmentation when segmenting the signal.
test.results_on_signal_without_lambda(entire_rwc, convolution_type = convolution_type, legend = "on the signal and on all RWC, without penalty function.")