Saturday, July 23, 2011

TIMIT: What Is It?

    TIMIT is a corpus of phonemically and lexically transcribed speech of American English speakers of different sexes and dialects. Each transcribed element has been delineated in time.
TIMIT was designed to further acoustic-phonetic knowledge and automatic speech recognition systems. It was commissioned by DARPA and worked on by many sites, including Texas Instruments (TI) and Massachusetts Institute of Technology (MIT), hence the corpus' name.[1] There is also a telephone bandwidth version called NTIMIT (Network TIMIT).
 
       The Texas Instruments/Massachusetts Institute of Technology (TIMIT) corpus of read speech has been designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems. TIMIT contains speech from 630 speakers representing 8 major dialect divisions of American English, each speaking 10 phonetically-rich sentences. The TIMIT corpus includes time-aligned orthographic, phonetic, and word transcriptions, as well as speech waveform data for each spoken sentence. The release of TIMIT contains several improvements over the Prototype CD-ROM released in December, 1988: (1) full 630-speaker corpus, (2) checked and corrected transcriptions, (3) word-alignment transcriptions, (4) NIST SPHERE-headered waveform files and header manipulation software, (5) phonemic dictionary, (6) new test and training subsets balanced for dialectal and phonetic coverage, and (7) more extensive documentation.
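To make "time-aligned" a little more concrete, here is a minimal Python sketch of how an alignment file might be read. It rests on assumptions that the excerpts above do not state: that each line of a transcription file holds a start sample, an end sample, and a label separated by whitespace, and that the audio is sampled at 16 kHz. The names `Segment`, `parse_alignment`, and `SAMPLE_RATE_HZ` are hypothetical, and the sample text is illustrative only, not copied from the corpus.

```python
from typing import List, NamedTuple

# Assumption: 16 kHz sampling rate; the quoted text above does not state this.
SAMPLE_RATE_HZ = 16_000


class Segment(NamedTuple):
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds
    label: str      # phone or word label


def parse_alignment(text: str, sample_rate: int = SAMPLE_RATE_HZ) -> List[Segment]:
    """Parse lines of the assumed form '<start_sample> <end_sample> <label>'."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        start, end, label = line.split(maxsplit=2)
        segments.append(
            Segment(int(start) / sample_rate, int(end) / sample_rate, label)
        )
    return segments


# Illustrative alignment text (hypothetical, not taken from the corpus).
example = """\
0 3050 h#
3050 4559 sh
4559 5723 ix
"""

segments = parse_alignment(example)
for seg in segments:
    print(f"{seg.start_s:.4f}s to {seg.end_s:.4f}s: {seg.label}")
```

Converting sample indices to seconds is what lets each label be lined up against the waveform; the same function would work for word-level files if they follow the same three-column layout.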


Via@http://en.wikipedia.org/wiki/TIMIT
Via@http://adsabs.harvard.edu/abs/1993STIN...9327403G
