anybody here do signal processing?

David Schwartz · January 15, 2022

I need some code written in C/C++ or Pascal (Delphi, Free Pascal, etc.) that suppresses breath noise from divers. I can provide a sample audio file for demonstrating that it works. Ideally, it would be small enough to run on a Raspberry Pi Pico (8 MB of storage + 256 KB of RAM).

The "breath noise" is a large burst of mostly white noise produced over audio comms when a diver inhales and/or exhales (think "Darth Vader's breathing").

My thinking is to adaptively measure the voice stream to find the most dominant frequencies in normal speech, then compare incoming packets against that profile. When a block of broad-spectrum data appears, attenuate it by some amount, like 50 dB or so, for as long as it's present. In other words, do some basic spectral analysis and separate the speaker's dominant frequencies from those that are missing most of the time. When you see a large increase in non-dominant, broad-spectrum noise, assume it's a breath and attenuate the entire signal. People do not typically say anything meaningful while they're inhaling or exhaling, unless they're screaming very loudly, but that's an exhale. Usually this noise is most present during inhalation.

When the device turns on or restarts, it needs to reinitialize its internal model by learning the speaker's dominant frequencies in the audio stream for the current diver and adapting itself. So it's fine to hear the breath noises at first and hear them fade away after 10 breaths or so while the user is speaking normally. Perhaps have a short thing they need to read, like a poem, to help train the model. AI is not required! It just needs to be able to build a statistical model to differentiate the current speaker's normal speaking voice from the breath noise and suppress the sound when the breath noise is detected. It should go without saying, but ... divers won't swap out equipment while underwater. So the same person will be using the device for perhaps several hours.

It needs to keep a moving average, so occasional variations in the voice won't cause the breath noise to start showing up if the pitch of their voice changes somewhat under stress. Also, if helium is present in the air mix, the vocal pitch will increase, although this won't be a concern initially.
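One way to read "moving average" here is an exponential moving average per frequency band, which needs only one float per band and no history buffer. The sketch below is just that idea under assumptions: 20 bands, a hypothetical band_profile struct, and a smoothing factor alpha that would need tuning.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BANDS 20  /* assumption: 20 bands across the vocal range */

/* Hypothetical long-term profile of the speaker's voice. */
typedef struct {
    float avg[NUM_BANDS]; /* smoothed energy per band */
    float alpha;          /* smoothing factor, 0 < alpha <= 1 */
    int   primed;         /* 0 until the first frame has been folded in */
} band_profile;

void profile_init(band_profile *p, float alpha)
{
    assert(alpha > 0.0f && alpha <= 1.0f);
    for (size_t b = 0; b < NUM_BANDS; ++b)
        p->avg[b] = 0.0f;
    p->alpha = alpha;
    p->primed = 0;
}

/* Fold one frame's band energies into the profile.
 * Intended to be called only for frames judged to be speech,
 * so that breaths do not leak into the learned profile. */
void profile_update(band_profile *p, const float energy[NUM_BANDS])
{
    for (size_t b = 0; b < NUM_BANDS; ++b) {
        if (!p->primed)
            p->avg[b] = energy[b];              /* seed with the first frame */
        else
            p->avg[b] += p->alpha * (energy[b] - p->avg[b]);
    }
    p->primed = 1;
}
```

A small alpha makes the profile drift slowly, so a stressed or higher-pitched voice shifts it gradually instead of suddenly looking like noise.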

For example: measure the presence of signals in, say, 20 frequency bands in the normal vocal range; some will be heavily present, and most will be almost empty / non-existent. When you see a noise that suddenly shows up in half or more of the bins that are normally empty, then attenuate that sample. (We'll probably need a way to adjust that threshold and perhaps other filter parameters easily during testing without recompiling.)
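The rule above could be sketched like this. All three thresholds live in a struct (hypothetical names: quiet_level, active_level, min_fraction) so they could be changed at run time, say over a serial link, without recompiling. The energy scale is arbitrary and the values would need tuning against real recordings.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BANDS 20  /* assumption: matches the 20-band example */

/* Hypothetical tunable parameters. */
typedef struct {
    float quiet_level;   /* average below this = band is "normally empty" */
    float active_level;  /* current energy at or above this = band is lit */
    float min_fraction;  /* share of empty bands that must light up (0.5 = half) */
} breath_params;

/* Returns 1 if the current frame looks like a breath, else 0.
 * avg: long-term average energy per band (the learned voice profile)
 * now: energy per band in the current frame */
int is_breath(const float avg[NUM_BANDS], const float now[NUM_BANDS],
              const breath_params *prm)
{
    assert(prm->min_fraction > 0.0f && prm->min_fraction <= 1.0f);
    size_t empty = 0, lit = 0;
    for (size_t b = 0; b < NUM_BANDS; ++b) {
        if (avg[b] < prm->quiet_level) {
            ++empty;                          /* band the voice rarely uses */
            if (now[b] >= prm->active_level)
                ++lit;                        /* ...but it is active right now */
        }
    }
    if (empty == 0)
        return 0;                             /* nothing to compare against */
    return (float)lit >= prm->min_fraction * (float)empty;
}
```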

An RPi Pico runs at 125 MHz by default (it can be clocked higher), which should leave plenty of headroom to handle vocals at a 22 kHz sampling rate in real time. It can be developed in C/C++ or Pascal (Delphi, Free Pascal, etc.), but it needs to be small enough to run on an RPi Pico or a similarly configured device.
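As a rough sanity check on that headroom, the per-sample and per-frame cycle budget is simple integer arithmetic. The helper names and the 256-sample frame length are assumptions for illustration, not anything fixed by the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles available per audio sample (rounded down). */
uint32_t cycles_per_sample(uint32_t clock_hz, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return clock_hz / sample_rate_hz;
}

/* CPU cycles available to process one frame of frame_len samples. */
uint32_t cycles_per_frame(uint32_t clock_hz, uint32_t sample_rate_hz,
                          uint32_t frame_len)
{
    assert(sample_rate_hz > 0);
    /* 64-bit intermediate so clock_hz * frame_len cannot overflow */
    return (uint32_t)(((uint64_t)clock_hz * frame_len) / sample_rate_hz);
}
```

For example, a 125 MHz clock at 22,000 samples per second gives 5,681 cycles per sample, or about 1.45 million cycles for each 256-sample frame.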

This seems to me like a fairly simple task for someone familiar with FFTs and basic signal processing. I'm sure there are open source FFT libs and what really needs to be done is writing some code to sample frequencies and use it to recognize when a burst of white noise occurs on the input vs. normal speech patterns, and attenuate that signal in real-time.

Here is a link that lets you download the sample data file from Dropbox. REMOVE THE SPACES!

ht tp : // w5a. com/u/ breath_sounds

Edited January 16, 2022 by David Schwartz
