AddBackgroundNoise
Added in v0.9.0
Mix in another sound, e.g. a background noise. Useful if your original sound is clean and you want to simulate an environment where background noise is present.
Can also be used for mixup when training classification/annotation models.
A path to a file/folder with sound(s), or a list of file/folder paths, must be specified. These sounds should ideally be at least as long as the input sounds to be transformed. Otherwise, the background sound will be repeated, which may sound unnatural.
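To make the repetition concrete, here is a minimal sketch of what looping a short background sound up to the input length can look like. The helper name `fit_noise_to_length` is hypothetical and not part of audiomentations; the library's own looping logic may differ, for example in how it picks a start offset.

```python
import numpy as np


def fit_noise_to_length(noise: np.ndarray, target_length: int) -> np.ndarray:
    """Repeat (tile) a 1-D noise clip until it covers target_length samples, then trim."""
    if noise.size == 0:
        raise ValueError("noise clip must contain at least one sample")
    repeats = -(-target_length // noise.size)  # ceiling division
    return np.tile(noise, repeats)[:target_length]


# A 3-sample noise clip stretched to cover an 8-sample input
short_noise = np.array([0.1, -0.2, 0.3], dtype=np.float32)
fitted = fit_noise_to_length(short_noise, target_length=8)
```

The seam between repetitions is where the looped sound tends to become audible, which is why background files at least as long as the input are preferable.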
Note that in the default case (noise_rms="relative") the gain of the added noise is relative to the signal level in the input. This implies that if the input is completely silent, no noise will be added.
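The following sketch illustrates why a silent input receives no noise in relative mode. It assumes the usual definition of SNR as 20·log10(signal RMS / noise RMS); the helper names `rms` and `scale_noise_relative` are hypothetical and this is not the library's actual implementation.

```python
import numpy as np


def rms(samples: np.ndarray) -> float:
    """Root mean square of a waveform, computed in float64 for accuracy."""
    return float(np.sqrt(np.mean(samples.astype(np.float64) ** 2)))


def scale_noise_relative(signal: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale noise so that rms(signal) / rms(scaled noise) matches snr_db."""
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return noise.copy()  # the noise clip is silent, nothing to scale
    desired_noise_rms = rms(signal) / (10 ** (snr_db / 20))
    return noise * (desired_noise_rms / noise_rms)


rng = np.random.default_rng(0)
speech = rng.normal(0.0, 0.1, 16000).astype(np.float32)
noise = rng.normal(0.0, 0.5, 16000).astype(np.float32)

scaled = scale_noise_relative(speech, noise, snr_db=5.0)
measured_snr_db = 20 * np.log10(rms(speech) / rms(scaled))  # close to 5.0

# A silent input has an RMS of 0, so the desired noise RMS is also 0
silent = np.zeros(16000, dtype=np.float32)
scaled_for_silence = scale_noise_relative(silent, noise, snr_db=5.0)
```

Because the target noise level is derived from the input RMS, the noise gain collapses to zero whenever the input RMS is zero.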
Optionally, the added noise sound can be transformed (with noise_transform) before it gets mixed in.
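The API section describes noise_transform as a callable that takes a waveform (numpy array) and a sample rate (int) and returns a waveform. As an illustration of that shape, here is a hypothetical custom callable, `reverse_noise`, that plays the noise backwards. It is not part of audiomentations, and whether a plain function is accepted in every library version is an assumption based on the documented type hint.

```python
import numpy as np


def reverse_noise(samples: np.ndarray, sample_rate: int) -> np.ndarray:
    """Hypothetical noise transform: return the noise clip reversed in time."""
    # sample_rate is unused here but kept to match the documented call signature
    return np.ascontiguousarray(samples[::-1], dtype=np.float32)


noise = np.array([0.2, -0.4, 0.1], dtype=np.float32)
transformed = reverse_noise(noise, sample_rate=16000)
```

Under that assumption, such a function would be passed as `noise_transform=reverse_noise` in place of a built-in transform like `PolarityInversion()`.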
Here are some examples of datasets that can be downloaded and used as background noise:
Input-output example
Here we add some music to a speech recording, targeting a signal-to-noise ratio (SNR) of 5 decibels (dB), which means that the speech (signal) is 5 dB louder than the music (noise).
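To put the 5 dB figure in linear terms, the short calculation below uses the standard decibel conversions (20·log10 for amplitude ratios, 10·log10 for power ratios). The function names are illustrative only.

```python
import math


def db_to_amplitude_ratio(db: float) -> float:
    """Convert decibels to a linear amplitude (RMS) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert decibels to a linear power ratio."""
    return 10 ** (db / 10)


snr_db = 5.0
amplitude_ratio = db_to_amplitude_ratio(snr_db)  # about 1.78
power_ratio = db_to_power_ratio(snr_db)          # about 3.16

# Converting back recovers the original value in dB
round_trip_db = 20 * math.log10(amplitude_ratio)
```

In other words, at an SNR of 5 dB the speech has roughly 1.78 times the RMS amplitude of the music.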
Usage examples
from audiomentations import AddBackgroundNoise, PolarityInversion

# Relative mode (default): the noise level follows the input level via the SNR range
transform = AddBackgroundNoise(
    sounds_path="/path/to/folder_with_sound_files",
    min_snr_db=3.0,
    max_snr_db=30.0,
    noise_transform=PolarityInversion(),
    p=1.0,
)

# my_waveform_ndarray: the input audio as a NumPy array
augmented_sound = transform(my_waveform_ndarray, sample_rate=16000)
from audiomentations import AddBackgroundNoise, PolarityInversion

# Absolute mode: the noise RMS is taken from the absolute dB range, independent of the input level
transform = AddBackgroundNoise(
    sounds_path="/path/to/folder_with_sound_files",
    noise_rms="absolute",
    min_absolute_rms_db=-45.0,
    max_absolute_rms_db=-15.0,
    noise_transform=PolarityInversion(),
    p=1.0,
)

# my_waveform_ndarray: the input audio as a NumPy array
augmented_sound = transform(my_waveform_ndarray, sample_rate=16000)
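For the noise_rms="absolute" mode used in the second example, the sketch below shows one way a target RMS in dB can be turned into a gain for the noise. It assumes that 0 dB corresponds to an RMS of 1.0 (full scale); `rms` and `scale_noise_absolute` are hypothetical helpers, not the library's code.

```python
import numpy as np


def rms(samples: np.ndarray) -> float:
    """Root mean square of a waveform, computed in float64 for accuracy."""
    return float(np.sqrt(np.mean(samples.astype(np.float64) ** 2)))


def scale_noise_absolute(noise: np.ndarray, target_rms_db: float) -> np.ndarray:
    """Scale noise to a fixed RMS level, regardless of the input signal."""
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return noise.copy()  # the noise clip is silent, nothing to scale
    desired_rms = 10 ** (target_rms_db / 20)  # assumes 0 dB corresponds to an RMS of 1.0
    return noise * (desired_rms / noise_rms)


rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.5, 16000).astype(np.float32)
silent_input = np.zeros(16000, dtype=np.float32)

# Unlike relative mode, a silent input still ends up with audible noise
scaled = scale_noise_absolute(noise, target_rms_db=-30.0)
mixed = silent_input + scaled
mixed_rms_db = 20 * np.log10(rms(mixed))  # close to -30.0
```

Here the target level never looks at the input, so the noise is present even when the input is silent.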
AddBackgroundNoise API
sounds_path
:Union[List[Path], List[str], Path, str]
- A path or list of paths to audio file(s) and/or folder(s) with audio files. Can be str or Path instance(s). The audio files given here are supposed to be background noises.
min_snr_db
:float • unit: Decibel
- Default: 3.0. Minimum signal-to-noise ratio in dB. Only used if noise_rms is set to "relative".
max_snr_db
:float • unit: Decibel
- Default: 30.0. Maximum signal-to-noise ratio in dB. Only used if noise_rms is set to "relative".
min_snr_in_db
:float • unit: Decibel
- Deprecated as of v0.31.0, removed as of v0.38.0. Use min_snr_db instead.
max_snr_in_db
:float • unit: Decibel
- Deprecated as of v0.31.0, removed as of v0.38.0. Use max_snr_db instead.
noise_rms
:str • choices: "absolute", "relative"
- Default: "relative". Defines how the background noise will be added to the audio input. If the chosen option is "relative", the root mean square (RMS) of the added noise will be proportional to the RMS of the input sound. If the chosen option is "absolute", the background noise will have an RMS independent of the RMS of the input audio.
min_absolute_rms_db
:float • unit: Decibel
- Default: -45.0. Only used if noise_rms is set to "absolute". It is the minimum RMS value in dB that the added noise can take. The lower the RMS, the quieter the added sound will be.
max_absolute_rms_db
:float • unit: Decibel
- Default: -15.0. Only used if noise_rms is set to "absolute". It is the maximum RMS value in dB that the added noise can take. Note that this value cannot exceed 0.
min_absolute_rms_in_db
:float • unit: Decibel
- Deprecated as of v0.31.0, removed as of v0.38.0. Use min_absolute_rms_db instead.
max_absolute_rms_in_db
:float • unit: Decibel
- Deprecated as of v0.31.0, removed as of v0.38.0. Use max_absolute_rms_db instead.
noise_transform
:Optional[Callable[[NDArray[np.float32], int], NDArray[np.float32]]]
- Default: None. A callable waveform transform (or composition of transforms) that gets applied to the noise before it gets mixed in. The callable is expected to accept an audio waveform (numpy array) and a sample rate (int).
p
:float • range: [0.0, 1.0]
- Default: 0.5. The probability of applying this transform.
lru_cache_size
:int
- Default: 2. Maximum size of the LRU cache for storing noise files in memory.
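To illustrate what an LRU (least recently used) cache of size 2 means for noise files, here is a small sketch built on Python's standard `functools.lru_cache`. The loader `load_noise` and the file names are hypothetical stand-ins; how audiomentations implements its cache internally is not shown here.

```python
from functools import lru_cache

load_count = 0  # counts how often a file is actually "read from disk"


@lru_cache(maxsize=2)
def load_noise(path: str) -> tuple:
    """Hypothetical loader: stands in for decoding an audio file."""
    global load_count
    load_count += 1
    return (path, "decoded samples placeholder")


load_noise("a.wav")  # miss: loaded and cached
load_noise("b.wav")  # miss: loaded and cached
load_noise("a.wav")  # hit: served from memory
load_noise("c.wav")  # miss: evicts b.wav, the least recently used entry
load_noise("b.wav")  # miss: b.wav has to be loaded again

info = load_noise.cache_info()
```

With many background files and a small cache, most lookups are misses, so a larger cache trades memory for fewer file reads.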