`AddShortNoises`

Added in v0.9.0

Mix in various (bursts of overlapping) sounds with random pauses between. Useful if your original sound is clean and you want to simulate an environment where short noises sometimes occur.

A folder of (noise) sounds to be mixed in must be specified.

Input-output example

Here we add some short noise sounds to a voice recording.

Input-output waveforms and spectrograms

Input sound	Transformed sound

Usage examples

Noise RMS relative to whole inputAbsolute RMS

from audiomentations import AddShortNoises, PolarityInversion

transform = AddShortNoises(
    sounds_path="/path/to/folder_with_sound_files",
    min_snr_db=3.0,
    max_snr_db=30.0,
    noise_rms="relative_to_whole_input",
    min_time_between_sounds=2.0,
    max_time_between_sounds=8.0,
    noise_transform=PolarityInversion(),
    p=1.0
)

augmented_sound = transform(my_waveform_ndarray, sample_rate=16000)

from audiomentations import AddShortNoises, PolarityInversion

transform = AddShortNoises(
    sounds_path="/path/to/folder_with_sound_files",
    min_absolute_noise_rms_db=-50.0,
    max_absolute_noise_rms_db=-20.0,        
    noise_rms="absolute",
    min_time_between_sounds=2.0,
    max_time_between_sounds=8.0,
    noise_transform=PolarityInversion(),
    p=1.0
)

augmented_sound = transform(my_waveform_ndarray, sample_rate=16000)

AddShortNoises API

sounds_path: Union[List[Path], List[str], Path, str]

A path or list of paths to audio file(s) and/or folder(s) with audio files. Can be str or Path instance(s). The audio files given here are supposed to be (short) noises.

~~min_snr_in_db: float • unit: Decibel~~

Deprecated as of v0.31.0, removed as of v0.38.0. Use min_snr_db instead

~~max_snr_in_db: float • unit: Decibel~~

Deprecated as of v0.31.0, removed as of v0.38.0. Use max_snr_db instead

min_snr_db: float • unit: Decibel

Default: -6.0. Minimum signal-to-noise ratio in dB. A lower value means the added sounds/noises will be louder. This gets ignored if noise_rms is set to "absolute".

max_snr_db: float • unit: Decibel

Default: 18.0. Maximum signal-to-noise ratio in dB. A lower value means the added sounds/noises will be louder. This gets ignored if noise_rms is set to "absolute".

min_time_between_sounds: float • unit: seconds

Default: 2.0. Minimum pause time (in seconds) between the added sounds/noises

max_time_between_sounds: float • unit: seconds

Default: 8.0. Maximum pause time (in seconds) between the added sounds/noises

noise_rms: str • choices: "absolute", "relative", "relative_to_whole_input"

Default: "relative_to_whole_input" (since v0.29.0)

This parameter defines how the noises will be added to the audio input.

"relative": the RMS value of the added noise will be proportional to the RMS value of the input sound calculated only for the region where the noise is added.
"absolute": the added noises will have an RMS independent of the RMS of the input audio file.
"relative_to_whole_input": the RMS of the added noises will be proportional to the RMS of the whole input sound.

min_absolute_noise_rms_db: float • unit: Decibel

Default: -50.0. Is only used if noise_rms is set to "absolute". It is the minimum RMS value in dB that the added noise can take. The lower the RMS is, the lower will the added sound be.

max_absolute_noise_rms_db: float • unit: Decibel

Default: -20.0. Is only used if noise_rms is set to "absolute". It is the maximum RMS value in dB that the added noise can take. Note that this value can not exceed 0.

add_all_noises_with_same_level: bool

Default: False. Whether to add all the short noises (within one audio snippet) with the same SNR. If noise_rms is set to "absolute", the RMS is used instead of SNR. The target SNR (or RMS) will change every time the parameters of the transform are randomized.

include_silence_in_noise_rms_estimation: bool

Default: True. It chooses how the RMS of the noises to be added will be calculated. If this option is set to False, the silence in the noise files will be disregarded in the RMS calculation. It is useful for non-stationary noises where silent periods occur.

burst_probability: float

Default: 0.22. For every noise that gets added, there is a probability of adding an extra burst noise that overlaps with the noise. This parameter controls that probability. min_pause_factor_during_burst and max_pause_factor_during_burst control the amount of overlap.

min_pause_factor_during_burst: float

Default: 0.1. Min value of how far into the current sound (as fraction) the burst sound should start playing. The value must be greater than 0.

max_pause_factor_during_burst: float

Default: 1.1. Max value of how far into the current sound (as fraction) the burst sound should start playing. The value must be greater than 0.

min_fade_in_time: float • unit: seconds

Default: 0.005. Min noise fade in time in seconds. Use a value larger than 0 to avoid a "click" at the start of the noise.

max_fade_in_time: float • unit: seconds

Default: 0.08. Max noise fade in time in seconds. Use a value larger than 0 to avoid a "click" at the start of the noise.

min_fade_out_time: float • unit: seconds

Default: 0.01. Min sound/noise fade out time in seconds. Use a value larger than 0 to avoid a "click" at the end of the sound/noise.

max_fade_out_time: float • unit: seconds

Default: 0.1. Max sound/noise fade out time in seconds. Use a value larger than 0 to avoid a "click" at the end of the sound/noise.

signal_gain_in_db_during_noise: float • unit: Decibel

Deprecated as of v0.31.0. Use signal_gain_db_during_noise instead

signal_gain_db_during_noise: float • unit: Decibel

Default: 0.0. Gain applied to the signal during a short noise. When fading the signal to the custom gain, the same fade times are used as for the noise, so it's essentially cross-fading. The default value (0.0) means the signal will not be gained. If set to a very low value, e.g. -100.0, this feature could be used for completely replacing the signal with the noise. This could be relevant in some use cases, for example:

replace the signal with another signal of a similar class (e.g. replace some speech with a cough)
simulate an ECG off-lead condition (electrodes are temporarily disconnected)

noise_transform: Optional[Callable[[NDArray[np.float32], int], NDArray[np.float32]]]

Default: None. A callable waveform transform (or composition of transforms) that gets applied to noises before they get mixed in.

p: float • range: [0.0, 1.0]

Default: 0.5. The probability of applying this transform.

lru_cache_size: int

Default: 64. Maximum size of the LRU cache for storing noise files in memory

Source code

audiomentations/augmentations/add_short_noises.py