Reading a WAV file from the TIMIT database in Python
Your file is not a WAV file. Apparently it is a NIST SPHERE file. From the LDC web page: "Many LDC corpora contain speech files in NIST SPHERE format." According to the description of the NIST File Format, the first four characters of the file are NIST. That's what the scipy error is telling you: it doesn't know how to read a file that begins with NIST.
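To check which format a file really has, you can look at those leading bytes yourself. The following is a minimal sketch, not part of scipy or any other library; the function name sniff_audio_header is made up for this example, and it only distinguishes the two signatures discussed here.

```python
def sniff_audio_header(path):
    """Return 'sphere', 'wav', or 'unknown' based on the file's leading bytes."""
    with open(path, "rb") as f:
        head = f.read(12)
    if head[:4] == b"NIST":
        # NIST SPHERE files start with the characters NIST.
        return "sphere"
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        # WAV files are RIFF containers whose form type is WAVE.
        return "wav"
    return "unknown"
```

For a TIMIT recording that still carries its original header, this should return 'sphere' even though the file name ends in .WAV.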
I suspect you'll have to convert the file to WAV if you want to read the file with any of the libraries that you tried. To force the conversion to WAV using the program sph2pipe, use the command option -f wav (or equivalently, -f rif), e.g.
sph2pipe -f wav input.sph output.wav
Issue this from the command line to verify whether it is a WAV file:
xxd -b myaudiofile.wav | head
If it is in WAV format, the output will look something like this:
00000000: 01010010 01001001 01000110 01000110 10111100 10101111 RIFF..
00000006: 00000001 00000000 01010111 01000001 01010110 01000101 ..WAVE
0000000c: 01100110 01101101 01110100 00100000 00010000 00000000 fmt ..
00000012: 00000000 00000000 00000001 00000000 00000001 00000000 ......
00000018: 01000000 00011111 00000000 00000000 01000000 00011111 @...@.
0000001e: 00000000 00000000 00000001 00000000 00001000 00000000 ......
00000024: 01100100 01100001 01110100 01100001 10011000 10101111 data..
0000002a: 00000001 00000000 10000001 10000000 10000001 10000000 ......
00000030: 10000001 10000000 10000001 10000000 10000001 10000000 ......
00000036: 10000001 10000000 10000001 10000000 10000001 10000000 ......
Here is yet another way to display the contents of a binary file such as a WAV:
od -A x -t x1z -v audio_util_test_file_custom.wav | head
000000 52 49 46 46 24 80 00 00 57 41 56 45 66 6d 74 20 >RIFF$...WAVEfmt <
000010 10 00 00 00 01 00 01 00 44 ac 00 00 88 58 01 00 >........D....X..<
000020 02 00 10 00 64 61 74 61 00 80 00 00 00 00 78 05 >....data......x.<
000030 ed 0a 5e 10 c6 15 25 1b 77 20 ba 25 eb 2a 08 30 >..^...%.w .%.*.0<
000040 0e 35 fc 39 cf 3e 84 43 1a 48 8e 4c de 50 08 55 >.5.9.>.C.H.L.P.U<
000050 0b 59 e4 5c 91 60 12 64 63 67 85 6a 74 6d 30 70 >.Y.\.`.dcg.jtm0p<
000060 b8 72 0a 75 25 77 09 79 b4 7a 26 7c 5d 7d 5a 7e >.r.u%w.y.z&|]}Z~<
000070 1c 7f a3 7f ee 7f fd 7f d0 7f 67 7f c3 7e e3 7d >..........g..~.}<
000080 c9 7c 74 7b e6 79 1e 78 1f 76 e8 73 7b 71 d9 6e >.|t{.y.x.v.s{q.n<
000090 03 6c fa 68 c1 65 57 62 c0 5e fd 5a 0f 57 f8 52 >.l.h.eWb.^.Z.W.R<
Notice that the WAV file begins with the characters RIFF, the mandatory signature of the RIFF container that WAV files use. If your system (I'm on Linux) does not have the xxd command-line utility, use any hex editor, such as wxHexEditor, to similarly examine your file and confirm that you see RIFF. If there is no RIFF, it is simply not a WAV file.
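The fields that follow RIFF in the dumps above can also be decoded in Python. This is a minimal sketch that assumes the simplest layout, a 44-byte header with a 16-byte fmt chunk immediately followed by the data chunk; real files may contain extra chunks that it does not handle. The function name parse_wav_header is made up for this example, and the sample bytes are the first 44 bytes of the od output shown above.

```python
import struct

def parse_wav_header(header):
    """Decode a 44-byte canonical PCM WAV header into a dict."""
    if len(header) < 44:
        raise ValueError("need at least 44 bytes")
    # All multi-byte fields in a RIFF/WAVE header are little-endian.
    (riff, riff_size, wave, fmt, fmt_size, audio_format, channels,
     sample_rate, byte_rate, block_align, bits_per_sample,
     data, data_size) = struct.unpack("<4sI4s4sIHHIIHH4sI", header[:44])
    if riff != b"RIFF" or wave != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    if fmt != b"fmt " or data != b"data":
        raise ValueError("not the simple 44-byte layout assumed here")
    return {
        "riff_size": riff_size,
        "audio_format": audio_format,  # 1 means uncompressed PCM
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,
    }

# First 44 bytes of the od dump above.
example_header = bytes.fromhex(
    "52 49 46 46 24 80 00 00 57 41 56 45 66 6d 74 20"
    "10 00 00 00 01 00 01 00 44 ac 00 00 88 58 01 00"
    "02 00 10 00 64 61 74 61 00 80 00 00"
)
info = parse_wav_header(example_header)
print(info["sample_rate"], info["channels"], info["bits_per_sample"])
```

For everyday use, the standard library's wave module is the more robust way to read these fields; the sketch is only meant to show what the bytes in the dump mean.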
Here are details of the WAV format specification:
http://soundfile.sapp.org/doc/WaveFormat/
http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
http://unusedino.de/ec64/technical/formats/wav.html
http://www.drdobbs.com/database/inside-the-riff-specification/184409308
https://www.gamedev.net/articles/programming/general-and-gameplay-programming/loading-a-wave-file-r709
http://www.topherlee.com/software/pcm-tut-wavformat.html
http://www.labbookpages.co.uk/audio/javaWavFiles.html
http://www.johnloomis.org/cpe102/asgn/asgn1/riff.html
http://nagasm.org/ASL/sound05/
I have written a Python script that converts all the .WAV files in NIST format, spoken by all speakers from all dialects, to .wav files that can be played on your system.
Note: All the dialect folders are present in ./TIMIT/TRAIN/. You may have to change dialects_path according to your project structure (or if you are on Windows).
import glob
import os

from sphfile import SPHFile

dialects_path = "./TIMIT/TRAIN/"
# One folder per dialect inside dialects_path.
dialects = os.listdir(dialects_path)

for dialect in dialects:
    dialect_path = os.path.join(dialects_path, dialect)
    for speaker in os.listdir(dialect_path):
        speaker_path = os.path.join(dialect_path, speaker)
        wav_files = glob.glob(os.path.join(speaker_path, "*.WAV"))
        for wav_file in wav_files:
            sph = SPHFile(wav_file)
            # The matching .TXT file starts with the first and last sample index.
            txt_file = wav_file[:-3] + "TXT"
            with open(txt_file, "r") as f:
                for line in f:
                    words = line.split(" ")
                    # Convert sample indices to seconds (16000 samples per second).
                    start_time = int(words[0]) / 16000
                    end_time = int(words[1]) / 16000
            print("writing file ", wav_file)
            # Caution: on a case-insensitive filesystem, .WAV and .wav name the same file.
            sph.write_wav(wav_file.replace(".WAV", ".wav"), start_time, end_time)
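The start and end times in the script come from the first two fields of the .TXT line, which are sample indices divided by 16000. As a small illustration of that step in isolation, here is a sketch; the helper name parse_timit_txt_line and the example line are made up for this illustration and are not part of sphfile.

```python
SAMPLE_RATE = 16000  # samples per second, the divisor used in the script

def parse_timit_txt_line(line, sample_rate=SAMPLE_RATE):
    """Split '<start sample> <end sample> <transcript>' into seconds and text."""
    parts = line.strip().split(None, 2)
    start_s = int(parts[0]) / sample_rate
    end_s = int(parts[1]) / sample_rate
    transcript = parts[2] if len(parts) > 2 else ""
    return start_s, end_s, transcript

# Illustrative line only; the sample counts differ from file to file.
start_s, end_s, text = parse_timit_txt_line("0 48000 an example sentence.\n")
print(start_s, end_s, text)
```

With 48000 samples at 16000 samples per second, the end time works out to 3.0 seconds.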
If you want a generic command that works for every wav file inside the folder, run this from the Windows command prompt:
forfiles /s /m *.wav /c "cmd /c sph2pipe -f wav @file @fnameRIFF.wav"
It searches for every wav file it can find and, for each one, creates a WAV file that both scipy and wave can read, named <base_name>RIFF.wav.
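If you are not on Windows, roughly the same loop can be written in Python. This is a sketch under a few assumptions: sph2pipe is installed and on your PATH, and the helper names (riff_output_name, is_sphere, find_sphere_files, convert_all) are made up for this example. Unlike the one-liner, it only picks up files that actually start with NIST, so running it a second time skips the RIFF files it already produced.

```python
import os
import subprocess

def riff_output_name(path):
    """Map 'dir/SA1.WAV' to 'dir/SA1RIFF.wav', mirroring @fnameRIFF.wav."""
    base, _ext = os.path.splitext(path)
    return base + "RIFF.wav"

def is_sphere(path):
    """True if the file starts with the NIST SPHERE signature."""
    with open(path, "rb") as f:
        return f.read(4) == b"NIST"

def find_sphere_files(root):
    """Yield every *.wav / *.WAV file under root that has a NIST header."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if name.lower().endswith(".wav") and is_sphere(path):
                yield path

def convert_all(root, sph2pipe="sph2pipe"):
    """Run 'sph2pipe -f wav <src> <dst>' for each SPHERE file found."""
    for src in find_sphere_files(root):
        dst = riff_output_name(src)
        # Assumes the sph2pipe executable is available on PATH.
        subprocess.run([sph2pipe, "-f", "wav", src, dst], check=True)
```

Calling convert_all("./TIMIT") would then walk the whole tree; the path is only an example.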