TransWikia.com

Dimensionality reduction for visualization purposes - "Sound map"

Sound Design Asked by VGF on October 28, 2021

I’d like to know how recordings of many various sounds can be analyzed to allow for visualizations in two dimensions.
My idea would be to find two data features (e.g. using principal component analysis) that make every sound class (dog bark, baby cry, etc.) distinguishable from others.
I’m struggling to understand what parameters to focus on and which method to use.

Thanks for every comment.

One Answer

For dimensionality reduction you need features to start with. You can, for example, extract MFCCs or other low-level features such as MPEG-7 descriptors, and then visualise them using PCA. TBH, for this task you might be better off using t-SNE or UMAP, which project high-dimensional data down to two dimensions while preserving local clusters. Lastly, have a look at the YAMNet or VGGish models, which are pretrained and already suited to sound event detection (SED). You can extract their embeddings and treat them as features for your visualisation.
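To make the features-then-projection idea concrete, here is a minimal sketch of the PCA step using only NumPy. It assumes each recording has already been summarised as one fixed-length feature vector (for instance the per-clip mean of 13 MFCCs). The random clusters below are a synthetic stand-in for real features, and the class names, cluster centres, and the `pca_2d` helper are illustrative assumptions, not part of any library.

```python
import numpy as np


def pca_2d(features):
    """Project an (n_clips, n_features) matrix onto its first two principal components."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)                      # centre each feature
    std = X.std(axis=0)
    X = X / np.where(std > 0, std, 1.0)         # scale to unit variance, skip constant columns
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T


# Synthetic stand-in for real per-clip features (ASSUMPTION: 13 values per clip,
# e.g. time-averaged MFCCs; real data would come from a feature extractor).
rng = np.random.default_rng(0)
class_names = ["dog_bark", "baby_cry", "siren"]  # hypothetical sound classes
n_per_class, n_features = 50, 13
centers = rng.normal(scale=5.0, size=(len(class_names), n_features))
features = np.vstack(
    [c + rng.normal(size=(n_per_class, n_features)) for c in centers]
)
labels = np.repeat(np.arange(len(class_names)), n_per_class)

coords = pca_2d(features)                        # shape (150, 2): one 2D point per clip

# Print where each class lands on the 2D "sound map".
for i, name in enumerate(class_names):
    cx, cy = coords[labels == i].mean(axis=0)
    print(f"{name}: centroid = ({cx:.2f}, {cy:.2f})")
```

With real recordings, the `features` matrix could instead be filled from something like `librosa.feature.mfcc` averaged over time, or from YAMNet/VGGish embeddings, and `coords` can be passed to any scatter plot coloured by `labels`. Swapping `pca_2d` for scikit-learn's `TSNE` or the `umap-learn` package should follow the same pattern: an (n_clips, n_features) matrix in, an (n_clips, 2) matrix out.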

Answered by jojek on October 28, 2021
