* -*- Mode: outline -*-
* Vocal tract resonances
organ-pipe vocal tract model
Tube with sound source at one end.
Resonance at wavelengths 4L, 4L/3, 4L/5, ...
Correspond to frequencies c/4L, 3c/4L, 5c/4L, ...
L is the length of the tube.
c is the speed of sound in air.
c = 340 m/s
Typical tract 17 cm.
Resonances at 500 Hz, 1500 Hz, 2500 Hz, ...
Vocal tract resonances produce spectral peaks, called formants.
Formant one, the lowest, 200 Hz to 1000 Hz.
Formant two, 500 Hz to 2500 Hz.
Formant three, 1500 Hz to 3500 Hz.
* Sound Sources
** Voiced
Air passing by the vocal cords causes them to flap
ah, uh, etc.
** Aspiration
Folds of the larynx held slightly apart, so that air passing between them becomes turbulent.
h
Not normally voiced
** Frication
Turbulence between tongue and top of mouth
ss, sh, f
May be voiced
** Source filter model
* Classification of Speech sounds
** Phonemic transcription
Phonemes are logical, language-relative units.
Semantic role as word discriminators.
** Categories of speech sounds
*** Vowel
uh a e i o u aa ee er uu ar aw
*** Diphthong
[not classified as individual phonemes]
*** Glide (or liquid)
r w l y
*** Stop
correspond with nasals
**** Unvoiced
p t k
**** Voiced
b d g
*** Nasal
correspond with stops
m n ng
*** Fricative
**** Unvoiced
ch
**** Voiced
j
*** Aspirate
h
** Coarticulation
A sound is affected by those that come on either side of it.
** Prosody
Global attributes of speech.
Prosodic, or suprasegmental features.
*** Features of voice quality
paralinguistic
anatomical differences
*** Features of voice dynamics
**** pitch or fundamental frequency
pattern of pitch variation is intonation
**** time
rhythm of speech and the overall tempo
**** amplitude
works with rhythm to produce stress
** Speech storage
*** Sampling
Discretization in time.
Sampling interval of T seconds.
Sampling frequency is 1/T Hz.
**** Aliasing
Components 1/2T+f, 3/2T-f, 3/2T+f, ... masquerade as a component at 1/2T-f.
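A rough sketch of the folding rule above, in Python.
The sampling frequency of 1000 Hz and the offset f = 100 Hz are made-up
illustration values, and alias_frequency and cosine_samples are
hypothetical helper names, not anything from these notes.

```python
import math


def alias_frequency(f_in, fs):
    """Apparent frequency, in [0, fs/2], of a tone at f_in sampled at fs."""
    f = f_in % fs          # sampling cannot tell f_in from f_in + k*fs
    return min(f, fs - f)  # reflect the upper half about fs/2


def cosine_samples(freq, fs, count):
    """First `count` samples of a unit cosine at `freq`, sampled at `fs`."""
    return [math.cos(2 * math.pi * freq * n / fs) for n in range(count)]


FS = 1000.0        # assumed sampling frequency 1/T, illustration only
HALF = FS / 2      # 1/2T
F = 100.0          # assumed offset f, illustration only

for f_in in (HALF + F, 3 * HALF - F, 3 * HALF + F):
    print(f_in, "Hz is indistinguishable from", alias_frequency(f_in, FS), "Hz")
```

With these values all three components fold onto 1/2T-f = 400 Hz, and
cosine_samples(600.0, FS, n) matches cosine_samples(400.0, FS, n) sample for
sample, up to rounding.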
Must filter out frequencies greater than half the sampling frequency before sampling, to avoid aliasing.
Phone system samples at 8 kHz.
*** Quantization
Discretization in amplitude.
*** Logarithmic quantization
Better signal to noise ratio over a wider range of input amplitudes.
**** Companding
Compressing-expanding
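A minimal sketch of companding around a uniform quantizer, in Python.
The notes do not name a compression law, so the mu-law curve with mu = 255
is an assumption here, as are the 8-bit word size and the helper names
compress, expand, quantize and companded_quantize.

```python
import math

MU = 255.0  # assumed compression parameter (mu-law), not from the notes


def compress(x):
    """Logarithmic compression of x in [-1, 1] onto [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize(v, bits=8):
    """Uniform quantizer on [-1, 1]: returns the centre of the cell holding v."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((v + 1.0) / step), levels - 1)
    return -1.0 + (index + 0.5) * step


def companded_quantize(x, bits=8):
    """Compress, quantize uniformly, then expand."""
    return expand(quantize(compress(x), bits))


for x in (0.01, 0.1, 0.9):
    uniform_error = abs(quantize(x) - x)
    companded_error = abs(companded_quantize(x) - x)
    print(x, uniform_error, companded_error)
```

For quiet inputs such as x = 0.01 the companded error comes out well below
the uniform error at the same word size. For loud inputs the uniform
quantizer can be the more accurate of the two; that is the trade that evens
out the signal to noise ratio across input amplitudes.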