TransWikia.com

Quantify the difference between two audio spectrograms

Signal Processing Asked by ictguy1 on January 28, 2021

I have two speech signals of the same length. They contain the exact same speech content but are recorded from two different microphones. I want to compare the differences between them.

1) Assuming that the two signals are perfectly aligned, I can compute the spectrograms for both of them and calculate a distance/coherence metric (L2 distance or magnitude-squared coherence) between the two spectrograms. Are there any other ways of quantifying spectrogram differences?

2) If the signals are not perfectly aligned, even a misalignment of a few milliseconds can distort the L2 distance between the spectrograms. What could be some strategies in this scenario?

Thanks for your input.

2 Answers

You can compare the signals directly or via features (MFCCs are quite common and are usually effective for this purpose).

Before the comparison, you can time-align the signals (or feature sequences) using Dynamic Time Warping (DTW).
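
To make the DTW step concrete, here is a minimal sketch of the textbook algorithm in plain Python. It assumes each recording has already been turned into a sequence of equal-length feature vectors (for example MFCC frames computed with whatever library you prefer). The function name `dtw_distance` and the Euclidean frame distance are illustrative choices, not a reference implementation.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Return (total_cost, path) of the DTW alignment between two sequences.

    seq_a and seq_b are sequences of feature vectors of equal dimension.
    path is a list of (index_in_a, index_in_b) pairs, in time order.
    """
    n, m = len(seq_a), len(seq_b)
    inf = math.inf

    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local distance between frames (Euclidean is an assumption).
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # both sequences advance
                cost[i - 1][j],      # only seq_a advances
                cost[i][j - 1],      # only seq_b advances
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path
```

The total cost can serve directly as a dissimilarity score, or the path can be used to pair up frames before computing a per-frame metric such as the L2 distance. This version is O(n·m) in time and memory, so for long recordings a windowed variant or an optimized library is probably the better choice.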

Answered by Filipe Pinto on January 28, 2021

You could cross-correlate the two signals to estimate the time offset and then shift one of them to remove it. But why use a spectrogram in the first place? Do the acoustic conditions vary over time? If not, compare the spectra instead: that should do the same job without the time-alignment problem.
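
As a rough sketch of both ideas, the NumPy code below estimates a constant offset from the peak of the cross-correlation, trims the signals so they line up, and then compares their magnitude spectra. The helper names (`estimate_lag`, `align`, `log_spectral_distance`) are made up for illustration, and the RMS difference of log-magnitude spectra in dB is just one assumed choice of metric.

```python
import numpy as np


def estimate_lag(a, b):
    """Number of samples by which `a` is delayed relative to `b`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # In "full" mode the output index len(b) - 1 corresponds to zero lag.
    xcorr = np.correlate(a, b, mode="full")
    return int(np.argmax(xcorr)) - (len(b) - 1)


def align(a, b):
    """Trim both signals so that they start at the same instant."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lag = estimate_lag(a, b)
    if lag > 0:        # a is late: drop its leading samples
        a = a[lag:]
    elif lag < 0:      # b is late: drop its leading samples
        b = b[-lag:]
    n = min(len(a), len(b))
    return a[:n], b[:n], lag


def log_spectral_distance(a, b, eps=1e-12):
    """RMS difference (in dB) between the magnitude spectra of a and b.

    Both signals must have the same length.
    """
    mag_a = np.abs(np.fft.rfft(a))
    mag_b = np.abs(np.fft.rfft(b))
    diff_db = 20.0 * np.log10((mag_a + eps) / (mag_b + eps))
    return float(np.sqrt(np.mean(diff_db ** 2)))


# Usage with synthetic data: `delayed` is `ref` shifted by 5 samples.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1000)
delayed = np.concatenate([np.zeros(5), ref[:-5]])
a_aligned, b_aligned, lag = align(delayed, ref)
print("estimated lag:", lag)
print("distance after alignment:", log_spectral_distance(a_aligned, b_aligned))
```

`np.correlate` is a direct O(N²) computation, so for long recordings an FFT-based correlation is likely preferable. This also only handles a single constant offset; if the offset drifts over time, something like DTW is needed instead.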

Answered by Max on January 28, 2021
