Speech synthesis from invasive brain signals: From offline analysis to closed-loop speech decoding
Publication date
2021-10-29
Authors
Supervisors
Reviewers
Abstract
Several neurological diseases and disorders can impair speech communication and lead to the complete loss of the ability to speak. Brain-Computer Interfaces (BCIs), systems that receive neural signals directly as input to control a computing device, raise hope for speech neuroprostheses that provide an alternative communication channel and thus restore speech communication for people with speech impairments.
This cumulative dissertation examines methods for synthesizing acoustic speech from brain activity data. Invasive neuroimaging techniques for measuring electrophysiological brain activity offer both the spatial and the temporal resolution needed to capture the complex dynamics of spoken language. Speech processes can be decoded from the neural data with appropriate decoding approaches, and strategies from speech synthesis are used to generate waveforms that can subsequently be played back through a loudspeaker.
To this end, the dissertation first investigates methods for reconstructing spoken speech from experimental recordings in offline analyses, in order to develop algorithms that can transform brain activity data into high-quality acoustic speech. The focus then shifts to closed-loop speech decoding and synthesis, presenting techniques that convert neural speech processes into audible speech in real time so that it can be output as continuous feedback over a loudspeaker. The dissertation concludes with a discussion of the presented approaches and their limitations, and presents a related modality in which natural speech could be assisted, and possibly restored, by electrical stimulation of the orofacial muscles.
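To make the two-stage pipeline the abstract describes more concrete (neural features in, decoded spectra out, then waveform generation), the following minimal Python sketch runs the offline-analysis idea on synthetic stand-in data. The dimensions, the ridge-regression decoder, and the overlap-add resynthesis are illustrative assumptions chosen for brevity, not the models used in the dissertation.

import numpy as np

rng = np.random.default_rng(0)

# Stage 0: stand-in data. 1000 time frames of 64-channel high-gamma power
# paired with 32-bin log-magnitude speech spectra from the same frames.
# In the dissertation these would come from intracranial recordings and
# simultaneously recorded audio; here they are purely synthetic.
n_frames, n_channels, n_bins = 1000, 64, 32
X = rng.standard_normal((n_frames, n_channels))            # neural features
W_true = rng.standard_normal((n_channels, n_bins)) * 0.1
Y = X @ W_true + 0.05 * rng.standard_normal((n_frames, n_bins))

# Stage 1: a linear decoder (closed-form ridge regression), mapping neural
# features to spectral frames. The regularizer lam is an arbitrary choice.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_channels), X.T @ Y)
Y_hat = X @ W                                              # decoded spectra

# Stage 2: waveform generation by overlap-add resynthesis. Each decoded
# frame is treated as a log-magnitude half-spectrum, paired with random
# phase, and inverse-FFT'd with 50% overlap. A real system would use a
# proper vocoder; this is the simplest audible stand-in.
hop, win = n_bins, 2 * n_bins
window = np.hanning(win)
audio = np.zeros(hop * (n_frames - 1) + win)
for t, frame in enumerate(Y_hat):
    mag = np.exp(frame)                                    # undo log scale
    phase = rng.uniform(-np.pi, np.pi, n_bins)
    spectrum = mag * np.exp(1j * phase)
    chunk = np.fft.irfft(spectrum, n=win)                  # time-domain frame
    audio[t * hop : t * hop + win] += window * chunk

print("decoded frames:", Y_hat.shape, "-> audio samples:", audio.shape[0])

In a closed-loop setting, the same two stages would run incrementally on each incoming window of neural data so that the synthesized audio can be streamed back to the user as continuous feedback.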
Keywords
speech synthesis; brain-computer interfaces; speech neuroprosthetics
Institution
Department
Document type
Dissertation
Secondary publication
No
Language
English
Files
Name
dissertation_miguel_angrick.pdf
Description
Dissertation by Miguel Angrick
Size
53.54 MB
Format
Adobe PDF
Checksum (MD5)
67ff83abff96d0ffbc1a2946220a3fb8