Remove silence at the beginning and at the end of wave files with PyDub
You can use this code:
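from pydub import AudioSegment
from pydub.silence import detect_nonsilent

def remove_sil(path_in, path_out, format="wav"):
    sound = AudioSegment.from_file(path_in, format=format)
    # [start, end] ranges in ms that are louder than the threshold
    non_sil_times = detect_nonsilent(sound, min_silence_len=50, silence_thresh=sound.dBFS * 1.5)
    if len(non_sil_times) > 0:
        # merge ranges separated by less than 200 ms of silence
        non_sil_times_concat = [non_sil_times[0]]
        for t in non_sil_times[1:]:
            if t[0] - non_sil_times_concat[-1][-1] < 200:
                non_sil_times_concat[-1][-1] = t[1]
            else:
                non_sil_times_concat.append(t)
        # drop merged ranges that are 350 ms or shorter
        non_sil_times = [t for t in non_sil_times_concat if t[1] - t[0] > 350]
    if len(non_sil_times) > 0:
        # keep everything from the first remaining range to the end of the last;
        # nothing is written if no range survives the filter
        sound[non_sil_times[0][0]: non_sil_times[-1][1]].export(path_out, format='wav')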
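To make the interval handling in the function above easier to follow, here is a minimal sketch of just the merge-and-filter step, run on plain [start, end] pairs in milliseconds with no audio involved. The helper name `merge_and_filter` and its parameters `max_gap` and `min_len` are hypothetical and not part of pydub; the defaults 200 and 350 are the values hard-coded above, and the sample ranges are made up.

```python
def merge_and_filter(ranges, max_gap=200, min_len=350):
    """Merge [start, end] ranges separated by less than max_gap ms,
    then keep only merged ranges longer than min_len ms.

    Hypothetical helper mirroring the loop in remove_sil; not part of pydub.
    """
    merged = []
    for start, end in ranges:
        if merged and start - merged[-1][1] < max_gap:
            merged[-1][1] = end  # gap is short: extend the previous range
        else:
            merged.append([start, end])  # gap is long: start a new range
    return [r for r in merged if r[1] - r[0] > min_len]


# Made-up ranges in ms, as a list of [start, end] pairs.
ranges = [[100, 300], [350, 900], [2000, 2100], [5000, 6000]]
kept = merge_and_filter(ranges)
print(kept)  # [[100, 900], [5000, 6000]]
```

With these ranges the trimmed clip would run from kept[0][0] to kept[-1][1], that is from 100 ms to 6000 ms. The short blip at 2000 ms is ignored when choosing the endpoints but still lies inside the kept span.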
I would advise iterating in chunks of at least 10 ms, both to make it a little quicker (fewer iterations) and because individual samples don't really have a "loudness".
Sound is vibration, so at a minimum it would take 2 samples to detect whether there was actually any sound (and that would only tell you about high frequencies).
Anyway… something like this could work:
from pydub import AudioSegment

def detect_leading_silence(sound, silence_threshold=-50.0, chunk_size=10):
    '''
    sound is a pydub.AudioSegment
    silence_threshold in dB
    chunk_size in ms

    iterate over chunks until you find the first one with sound
    '''
    trim_ms = 0  # ms
    assert chunk_size > 0  # to avoid infinite loop
    while sound[trim_ms:trim_ms+chunk_size].dBFS < silence_threshold and trim_ms < len(sound):
        trim_ms += chunk_size

    # never report more silence than the clip contains
    return min(trim_ms, len(sound))

sound = AudioSegment.from_file("/path/to/file.wav", format="wav")

start_trim = detect_leading_silence(sound)
end_trim = detect_leading_silence(sound.reverse())

duration = len(sound)
trimmed_sound = sound[start_trim:duration-end_trim]
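The point about single samples having no "loudness" can be illustrated without pydub. The following is a rough sketch, assuming 16-bit integer samples (full scale 32768) and an RMS-based level in dB relative to full scale. The function names `chunk_dbfs` and `leading_silence_samples` are hypothetical, the chunk size is counted in samples rather than milliseconds, and the sample values are made up.

```python
import math


def chunk_dbfs(samples, full_scale=32768):
    """RMS level of a chunk of integer samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / full_scale) if rms > 0 else float("-inf")


def leading_silence_samples(samples, threshold_db=-50.0, chunk=4):
    """Number of leading samples lying in chunks quieter than threshold_db."""
    assert chunk > 0  # avoid an infinite loop
    pos = 0
    while pos < len(samples) and chunk_dbfs(samples[pos:pos + chunk]) < threshold_db:
        pos += chunk
    return min(pos, len(samples))


# Made-up 16-bit samples: 8 near-silent values, then a loud oscillation.
signal = [0, 1, -1, 0, 1, 0, -1, 0, 12000, -12000, 12000, -12000]
print(leading_silence_samples(signal))  # 8
```

The level is computed over a whole chunk, so a stray nonzero sample in an otherwise quiet chunk does not end the scan on its own.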
pydub has probably been updated since this question was first asked, but here is the code I used to trim leading and trailing silence:
from pydub import AudioSegment
from pydub.silence import detect_leading_silence

def trim_leading_silence(x: AudioSegment) -> AudioSegment:
    return x[detect_leading_silence(x):]

def trim_trailing_silence(x: AudioSegment) -> AudioSegment:
    return trim_leading_silence(x.reverse()).reverse()

def strip_silence(x: AudioSegment) -> AudioSegment:
    return trim_trailing_silence(trim_leading_silence(x))

sound = AudioSegment.from_file(file_path_here)
stripped = strip_silence(sound)
detect_leading_silence from pydub.silence returns the length of the leading silence in milliseconds, which you can use to slice the loaded AudioSegment. To trim trailing silence, reverse the AudioSegment, trim it, and reverse it again. Stripping silence from both ends then amounts to applying both trims.
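The reverse, trim, reverse idea is independent of pydub, so it can be sketched on an ordinary list. In this illustration each list item stands for the level of one chunk in dBFS; the function names and the `is_silent` predicate are hypothetical, and the values are made up.

```python
def trim_leading(seq, is_silent):
    """Drop items from the front of seq while is_silent(item) is true."""
    i = 0
    while i < len(seq) and is_silent(seq[i]):
        i += 1
    return seq[i:]


def trim_trailing(seq, is_silent):
    """Trim the end by reversing, trimming the front, and reversing back."""
    return trim_leading(seq[::-1], is_silent)[::-1]


def strip_both(seq, is_silent):
    """Apply the leading trim, then the trailing trim."""
    return trim_trailing(trim_leading(seq, is_silent), is_silent)


# Made-up per-chunk levels in dBFS; anything below -50 counts as silent here.
levels = [-70, -65, -20, -60, -15, -80, -72]
print(strip_both(levels, lambda db: db < -50))  # [-20, -60, -15]
```

Only the two ends are trimmed: the quiet chunk at -60 in the middle is kept.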
Note that strip_silence should raise an IndexError if the loaded AudioSegment is silent or becomes silent after a trim operation.
The last time I looked, the default chunk size was 10 ms and the default silence threshold was -50 dBFS.
My version of pydub is 0.25.1 and my version of ffmpeg is 4.3.1.