Skip to content

AddShortNoises

Added in v0.9.0

Mix in various (bursts of overlapping) sounds with random pauses between. Useful if your original sound is clean and you want to simulate an environment where short noises sometimes occur.

A folder of (noise) sounds to be mixed in must be specified.

Input-output example

Here we add some short noise sounds to a voice recording.

Input-output waveforms and spectrograms

Input sound Transformed sound

Usage examples

from audiomentations import AddShortNoises, PolarityInversion

transform = AddShortNoises(
    sounds_path="/path/to/folder_with_sound_files",
    min_snr_in_db=3.0,
    max_snr_in_db=30.0,
    noise_rms="relative_to_whole_input",
    min_time_between_sounds=2.0,
    max_time_between_sounds=8.0,
    noise_transform=PolarityInversion(),
    p=1.0
)

augmented_sound = transform(my_waveform_ndarray, sample_rate=16000)
from audiomentations import AddShortNoises, PolarityInversion

transform = AddShortNoises(
    sounds_path="/path/to/folder_with_sound_files",
    min_absolute_noise_rms_db=-50.0,
    max_absolute_noise_rms_db=-20.0,        
    noise_rms="absolute",
    min_time_between_sounds=2.0,
    max_time_between_sounds=8.0,
    noise_transform=PolarityInversion(),
    p=1.0
)

augmented_sound = transform(my_waveform_ndarray, sample_rate=16000)

AddShortNoises API

sounds_path: Union[List[Path], List[str], Path, str]
A path or list of paths to audio file(s) and/or folder(s) with audio files. Can be str or Path instance(s). The audio files given here are supposed to be (short) noises.
min_snr_in_db: float • unit: Decibel
⚠ Deprecated as of v0.31.0. Use min_snr_db instead
max_snr_in_db: float • unit: Decibel
⚠ Deprecated as of v0.31.0. Use max_snr_db instead
min_snr_db: float • unit: Decibel
Default: -6.0. Minimum signal-to-noise ratio in dB. A lower value means the added sounds/noises will be louder. This gets ignored if noise_rms is set to "absolute".
max_snr_db: float • unit: Decibel
Default: 18.0. Maximum signal-to-noise ratio in dB. A lower value means the added sounds/noises will be louder. This gets ignored if noise_rms is set to "absolute".
min_time_between_sounds: float • unit: seconds
Default: 2.0. Minimum pause time (in seconds) between the added sounds/noises
max_time_between_sounds: float • unit: seconds
Default: 8.0. Maximum pause time (in seconds) between the added sounds/noises
noise_rms: str • choices: "absolute", "relative", "relative_to_whole_input"

Default: "relative" (<=v0.27), but will be changed to "relative_to_whole_input" in a future version.

This parameter defines how the noises will be added to the audio input.

  • "relative": the RMS value of the added noise will be proportional to the RMS value of the input sound calculated only for the region where the noise is added.
  • "absolute": the added noises will have an RMS independent of the RMS of the input audio file.
  • "relative_to_whole_input": the RMS of the added noises will be proportional to the RMS of the whole input sound.
min_absolute_noise_rms_db: float • unit: Decibel
Default: -50.0. Is only used if noise_rms is set to "absolute". It is the minimum RMS value in dB that the added noise can take. The lower the RMS is, the lower will the added sound be.
max_absolute_noise_rms_db: float • unit: seconds
Default: -20.0. Is only used if noise_rms is set to "absolute". It is the maximum RMS value in dB that the added noise can take. Note that this value can not exceed 0.
add_all_noises_with_same_level: bool
Default: False. Whether to add all the short noises (within one audio snippet) with the same SNR. If noise_rms is set to "absolute", the RMS is used instead of SNR. The target SNR (or RMS) will change every time the parameters of the transform are randomized.
include_silence_in_noise_rms_estimation: bool
Default: True. It chooses how the RMS of the noises to be added will be calculated. If this option is set to False, the silence in the noise files will be disregarded in the RMS calculation. It is useful for non-stationary noises where silent periods occur.
burst_probability: float
Default: 0.22. For every noise that gets added, there is a probability of adding an extra burst noise that overlaps with the noise. This parameter controls that probability. min_pause_factor_during_burst and max_pause_factor_during_burst control the amount of overlap.
min_pause_factor_during_burst: float
Default: 0.1. Min value of how far into the current sound (as fraction) the burst sound should start playing. The value must be greater than 0.
max_pause_factor_during_burst: float
Default: 1.1. Max value of how far into the current sound (as fraction) the burst sound should start playing. The value must be greater than 0.
min_fade_in_time: float • unit: seconds
Default: 0.005. Min noise fade in time in seconds. Use a value larger than 0 to avoid a "click" at the start of the noise.
max_fade_in_time: float • unit: seconds
Default: 0.08. Max noise fade in time in seconds. Use a value larger than 0 to avoid a "click" at the start of the noise.
min_fade_out_time: float • unit: seconds
Default: 0.01. Min sound/noise fade out time in seconds. Use a value larger than 0 to avoid a "click" at the end of the sound/noise.
max_fade_out_time: float • unit: seconds
Default: 0.1. Max sound/noise fade out time in seconds. Use a value larger than 0 to avoid a "click" at the end of the sound/noise.
signal_gain_in_db_during_noise: float • unit: Decibel
⚠ Deprecated as of v0.31.0. Use signal_gain_db_during_noise instead
signal_gain_db_during_noise: float • unit: Decibel

Default: 0.0. Gain applied to the signal during a short noise. When fading the signal to the custom gain, the same fade times are used as for the noise, so it's essentially cross-fading. The default value (0.0) means the signal will not be gained. If set to a very low value, e.g. -100.0, this feature could be used for completely replacing the signal with the noise. This could be relevant in some use cases, for example:

  • replace the signal with another signal of a similar class (e.g. replace some speech with a cough)
  • simulate an ECG off-lead condition (electrodes are temporarily disconnected)
noise_transform: Optional[Callable[[NDArray[np.float32], int], NDArray[np.float32]]]
Default: None. A callable waveform transform (or composition of transforms) that gets applied to noises before they get mixed in.
p: float • range: [0.0, 1.0]
Default: 0.5. The probability of applying this transform.
lru_cache_size: int
Default: 64. Maximum size of the LRU cache for storing noise files in memory