Audio Activity Detection and Classification

Question

I am starting a new project. My real goal is to learn speech recognition, but as a warm-up I want to get better at audio signal processing.
In this project, I aim to record sounds other than human speech in my house while I am out. I am planning to divide the project into two parts: Audio Activity Detection and Audio Classification. However, I have some questions.

What are the best techniques for audio classification? Could you point me to some informative articles? I saw that MFCCs and CNNs are commonly used for speech recognition. Are they also valid for audio classification, or should I look for other methods? My classes include baby cries, door knocks, door opening, doorbell, home phone ringing, smoke detector, etc. Unlike human speech, these sounds cannot be confined to a specific frequency range. Should I still study their frequency content for classification?
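To make the MFCC/CNN question concrete, here is a minimal sketch of a log-mel spectrogram, the kind of time-frequency feature that is often fed to a CNN. It uses only NumPy. The function names and the defaults (`n_fft=512`, `hop=256`, `n_mels=40`) are illustrative assumptions, not taken from any particular library or from the video linked in the answer. MFCCs would be obtained by additionally taking a DCT of each row.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style mel formula.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Return an array of shape (n_frames, n_mels) in dB.

    Assumes a mono signal with at least n_fft samples.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel_energy + 1e-10)

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    features = log_mel_spectrogram(np.sin(2 * np.pi * 1000 * t), sr)
    print(features.shape)
```

Because the features cover the whole spectrum up to half the sampling rate, sounds such as a doorbell or a smoke detector do not need to be confined to a fixed band for this representation to be usable.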

By Audio Activity Detection, I mean that recording starts when silence is broken, i.e. when there is a sound. As preprocessing, I will filter out the noise, so noise will count as part of the silence. :) I read a few articles and came across Voice Activity Detection (VAD), but it is used to decide whether human speech is present in the environment. I am not interested in voices. Instead, I want to collect the sounds mentioned above. Is Audio Activity Detection a valid term?
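The "start recording when silence is broken" idea can be sketched as a simple energy-based detector: compute the level of short frames, estimate the background level, and flag frames that rise clearly above it. This is only one possible approach under stated assumptions. The function names, the 10 dB margin, the 10th-percentile noise estimate, and the hangover length are all hypothetical choices for illustration.

```python
import numpy as np

def frame_rms_db(signal, frame_len, hop):
    """RMS level of each frame in dB (relative to full scale 1.0)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    rms = np.sqrt(np.mean(signal[idx] ** 2, axis=1))
    return 20.0 * np.log10(rms + 1e-10)

def detect_activity(signal, sr, frame_ms=20, margin_db=10.0, hangover_frames=10):
    """Return a list of (start_s, end_s) segments where sound is present.

    Assumption: at least ~10% of the recording is background noise, so the
    10th percentile of frame levels approximates the noise floor.
    """
    frame_len = int(sr * frame_ms / 1000)
    level_db = frame_rms_db(signal, frame_len, frame_len)
    noise_floor_db = np.percentile(level_db, 10)
    loud = level_db > noise_floor_db + margin_db

    segments, start, quiet_run = [], None, 0
    for i, is_loud in enumerate(loud):
        if is_loud:
            if start is None:
                start = i          # silence is broken: open a segment
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            # Close the segment only after a sustained quiet stretch,
            # so brief pauses inside one event do not split it.
            if quiet_run > hangover_frames:
                segments.append((start, i - quiet_run + 1))
                start, quiet_run = None, 0
    if start is not None:
        segments.append((start, len(loud) - quiet_run))
    return [(s * frame_len / sr, e * frame_len / sr) for s, e in segments]

if __name__ == "__main__":
    sr = 16000
    rng = np.random.default_rng(0)
    audio = 0.001 * rng.standard_normal(3 * sr)   # quiet background
    t = np.arange(sr // 2) / sr
    audio[sr:sr + sr // 2] += 0.5 * np.sin(2 * np.pi * 800 * t)  # 0.5 s event
    print(detect_activity(audio, sr))
```

Treating noise as part of the silence, as described above, corresponds here to measuring each frame against the estimated noise floor rather than against absolute zero.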

I look forward to your advice and sources.

Thank you.

jonnor · Answer

Here is a recent introductory video on the subject (2019): Audio Classification with Machine Learning
