Links/references (Sound control)


MediaLT's demo of speech synthesizers:

Vocal Command, a simple Danish solution:

IVOS: English speech recognition and TTS (XP) with support for specifying custom commands.

Voice recognition for people with CP/dysarthria

What is dysarthria:

F. Hamidi et al.
CanSpeak: A Customizable Speech Interface for People with Dysarthric Speech
2010 Canadian article from York University about a speaker-independent voice control system for people with reduced articulation, with support for a speaker-adjusted vocabulary. Tested with 4 people with CP. Recognition up to 84.3%.

R. Patel et al.
Automatic landmark analysis of dysarthric speech
2008 article on what characterizes dysarthric speech.

B. Blaney et al.
Acoustic variability in dysarthria and computer speech recognition
2000 article examining within-speaker variation (intra-speaker variation).

M. Hawley et al.
A speech-controlled environmental control system for people with severe dysarthria
2008 article demonstrating a speaker-dependent (training-based) environmental control system with a very limited vocabulary for people with reduced articulation. Recognition up to 95.4%.

Plaza-Aguilar et al.
A Voice Recognition System for Speech Impaired People
2004 Mexican article from Universidad de las Americas-Puebla about voice control for people with CP and consequently reduced articulation.

J. Marshall Mangan and James H. Mangan
Using Recognition Software to Improve Computer Usability for a Student with Cerebral Palsy.
2008, from Canada/USA. Abstract: This paper reports the results of an investigation into the potential of using commercially available ASR to improve the communication abilities of a person with speech and manual dexterity impairments resulting from Cerebral Palsy. We set out to investigate whether a short but intensive training program could allow this person, with above average intelligence but significant physical challenges, to improve his communications using a computer. Using both objective measurements of progress and interview data as to the subject's satisfaction, we were able to document significant improvements in ability, productivity, and feelings of success. Our findings in this exploratory project, in the form of both increased capacity and positive subjective responses, point towards an area for more intensive research and development in the future.

Christina Havstam et al.
Speech recognition and dysarthria: A single subject study of two individuals with very severe impairment of speech and motor control.
2003, Swedish. Abstract: This study investigated the use of the speech recognition system Dragon Dictate as an augmentative method of computer access for two individuals with cerebral palsy, including severe motor dysfunction and dysarthria. Single subject design was used and measures of computer access system effectiveness and speech production were used before, during and after intervention. The users' original switch access system was compared to a combination of their switch access system and speech recognition, by counting the number of correct entries. Adding speech recognition increased the number of correct entries by 40% for one of the participants. The other participant did not complete the intervention protocol. An independent judge rated speech production. No changes in speech were observed. Dragon Dictate is time-consuming to learn and demands a high level of motivation, but can be beneficial to a person who has profound dysarthria and great difficulties in accessing the computer.

Kent et al.
Acoustic studies of dysarthric speech: Methods, progress, and potential
1999. Educational Objectives: "(1) The reader will be able to describe the major types of acoustic analysis available for the study of speech, (2) specify the components needed for a modern speech analysis laboratory, including equipment for recording and analysis, and (3) list possible measurements for various aspects of phonation, articulation and resonance, as they might be manifest in neurologically disordered speech."

Thomas-Stonell et al.
Computerized Speech Recognition: Influence of Intelligibility and Perceptual Consistency on Recognition Accuracy
1998, published by ISAAC. Excerpt: "VoiceType’s recognition accuracy was significantly greater for speakers in the control group than the speakers with moderate and severe dysarthria. There was no significant difference between the control group and speakers with mild dysarthria. The acquisition curves for the control and dysarthria groups across the first four sessions had differing elevations but were parallel and not significantly different. This result supports the findings of Ferrier et al. (1992), who concluded that the DragonDictate speech recognition system could be successfully used by speakers with mild to moderate dysarthria. Information is not yet available on whether dysarthric speakers could eventually achieve recognition accuracy levels of normal speakers given additional use of a speech recognition system, or whether acquisition curves plateau. The difference in the acquisition curves of the control and dysarthria groups between the fourth and fifth sessions suggests, however, that dysarthric speakers may plateau at lower recognition levels."

E. Rosengren et al.
How does automatic speech recognition handle severely dysarthric speech?
Swedish, 1995. On training of speaker-dependent ASR systems for people with dysarthric speech.

Sound and voice control in combination

Mihara et al.
The migratory cursor: accurate speech-based cursor movement by moving multiple ghost cursors using non-verbal vocalizations 
2005, from Tokyo Institute of Technology, on mouse control using a combination of command words and vocal sounds.

Igarashi et al.
Voice as sound: using non-verbal voice input for interactive control
2001, Brown University, on command words combined with vocal sounds: "non-verbal features in voice for direct control".
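The core idea in both of the papers above is that a continuous property of the voice (loudness, pitch, duration) can drive a continuous control, such as cursor speed, rather than only triggering discrete commands. A minimal sketch of that mapping, assuming audio frames arrive as lists of samples in [-1, 1] (the threshold and gain values are illustrative, not taken from either paper):

```python
import math

def rms(frame):
    """Root-mean-square loudness of one audio frame (samples in [-1, 1])."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def cursor_speed(frame, threshold=0.05, gain=200.0):
    """Map vocalization loudness to a cursor speed in pixels/second.

    Loudness below the threshold is treated as silence (speed 0), so
    breathing and background noise do not move the cursor.
    """
    loudness = rms(frame)
    return 0.0 if loudness < threshold else (loudness - threshold) * gain

print(cursor_speed([0.0] * 100))       # silence -> 0.0
print(cursor_speed([0.5, -0.5] * 50))  # sustained loud vowel -> 90.0
```

In a real system a command word would first select the action (e.g. "move right"), after which a sustained non-verbal sound like "aaa" controls how far or how fast, which is the combination these two papers describe.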

Other, semi-relevant

Utsumi et al.
Sound recognition system for hearing impaired people based on pulsed neural networks
2009, Japanese article.

Pattern matching is used for recognition
http://en.wikipedia.org/wiki/Pattern_matching - another explanation of pattern matching here.
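A minimal sketch of how pattern (template) matching can classify sounds: each known sound is stored as a feature vector, and an incoming sound is labelled with the nearest template by Euclidean distance. The feature vectors and labels below are hypothetical stand-ins (e.g. averaged spectral energy per frequency band); real systems use richer features and distance measures such as DTW.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(sample, templates):
    """Return the label of the template closest to the sample."""
    return min(templates, key=lambda label: distance(sample, templates[label]))

# Hypothetical templates: one feature vector per known sound.
templates = {
    "click": [0.9, 0.1, 0.0],
    "hum":   [0.1, 0.8, 0.3],
    "hiss":  [0.0, 0.2, 0.9],
}

print(classify([0.2, 0.7, 0.4], templates))  # -> hum
```

This nearest-template scheme is also why speaker-dependent (training-based) systems like those in Hawley et al. can work with dysarthric speech: the templates are recorded from the user's own voice, so consistency matters more than intelligibility.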

2001-... College of Charleston articles on SUITEkeys/SUITEDasher, voice control systems comparable to VOMOTE and DNS but open (cross-platform, Java):
http://www.aaai.org/Papers/FLAIRS/2001/FLAIRS01-036.pdf (SUITEkeys) and http://www.cs.cofc.edu/~manaris/uploads/Main/HCI-05.lyle-manaris.pdf (SUITEDasher).

Speech recognition in the animal kingdom (2005?), on recognizing types of animal sounds, and on the future - interpreting and understanding animal sounds: http://www.ims.tuwien.ac.at/media/documents/publications/Discriminaton_and_Retrieval_of_Animal_Sounds.pdf

Which functions are important to include in a sound control solution?