now publishers - Automatic Analyses of Dysarthric Speech based on Distinctive Features

APSIPA Transactions on Signal and Information Processing > Vol 12 > Issue 3

Automatic Analyses of Dysarthric Speech based on Distinctive Features

Ka Ho Wong, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, The People’s Republic of China, khwong@se.cuhk.edu.hk; Helen Mei-Ling Meng, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, The People’s Republic of China

Suggested Citation

Ka Ho Wong and Helen Mei-Ling Meng (2023), "Automatic Analyses of Dysarthric Speech based on Distinctive Features", APSIPA Transactions on Signal and Information Processing: Vol. 12: No. 3, e18. http://dx.doi.org/10.1561/116.00000077

Publication Date: 03 May 2023

Keywords

Dysarthric, distinctive features, recognition, sequence-to-sequence

Journal details

Open Access

This is published under the terms of CC BY-NC.

Abstract

Dysarthria is a neuromotor disorder that causes the individual to speak with imprecise articulation. This paper presents an automatic analysis framework for dysarthric speech, using a linguistically motivated representation based on distinctive features (DFs). Our framework includes a seq2seq phonetic decoder for Cantonese dysarthric speech. The manually or automatically transcribed phones can be mapped into a representation that consists of 21 DFs. The DFs of the transcribed phones are compared with those of the canonical phones to obtain an articulatory error rate (AER) for each DF. This forms an AER profile for a given set of dysarthric recordings from a speaker. Experiments show that the difference between the AER profiles derived from manual and automatic phonetic transcriptions is relatively small, with a root mean squared error (RMSE) of 0.053 for the word-reading task and 0.085 for the sentence-reading task in CU DYS. In addition, the correlations between the AER profiles are high, at 0.97 and 0.95 for the two tasks, respectively. These results reflect the viability of the proposed framework as an automated means of processing dysarthric speech to achieve articulatory analyses described by DFs. The AER profile is intuitive and interpretable, and helps pinpoint problem areas in articulation.
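The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-feature table, the phone symbols, and the function names below are hypothetical, the phone sequences are assumed to be already aligned one-to-one, and the AER for a feature is assumed to be the fraction of aligned phones whose feature value differs from the canonical one. The paper uses 21 DFs and may define and align things differently.

```python
import math

# Hypothetical toy table: phone -> binary distinctive-feature values.
# The paper uses 21 DFs; three are used here purely for illustration.
DF_NAMES = ["voiced", "nasal", "labial"]
PHONE_TO_DF = {
    "p": (0, 0, 1),
    "b": (1, 0, 1),
    "m": (1, 1, 1),
    "t": (0, 0, 0),
    "n": (1, 1, 0),
}


def aer_profile(canonical, transcribed):
    """Per-feature error rate over phone pairs assumed to be aligned 1:1."""
    if len(canonical) != len(transcribed) or not canonical:
        raise ValueError("sequences must be non-empty and of equal length")
    errors = [0] * len(DF_NAMES)
    for c, t in zip(canonical, transcribed):
        for i, (fc, ft) in enumerate(zip(PHONE_TO_DF[c], PHONE_TO_DF[t])):
            errors[i] += int(fc != ft)
    return [e / len(canonical) for e in errors]


def rmse(a, b):
    """Root mean squared error between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_a * var_b)


# Toy data: canonical phones versus a manual and an automatic transcription.
canonical = ["p", "b", "m", "t"]
manual = ["b", "b", "n", "t"]
automatic = ["b", "b", "m", "t"]

profile_manual = aer_profile(canonical, manual)
profile_auto = aer_profile(canonical, automatic)

for name, pm, pa in zip(DF_NAMES, profile_manual, profile_auto):
    print(f"{name}: manual AER={pm:.2f}, automatic AER={pa:.2f}")
print(f"RMSE={rmse(profile_manual, profile_auto):.3f}")
print(f"correlation={pearson(profile_manual, profile_auto):.2f}")
```

A low RMSE together with a high correlation between the two profiles would indicate, as in the reported results, that the automatic transcription yields nearly the same per-feature error picture as the manual one.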

DOI:10.1561/116.00000077

Related publications

Companion

APSIPA Transactions on Signal and Information Processing Special Issue - Advanced Acoustic, Sound and Audio Processing Techniques and Their Applications

Introduction
Background
Corpora
Analysis of Dysarthric, Deviant Articulations based on Manual Phonetic Transcriptions
Acquiring Deviant Articulations based on Automatic Transcriptions
Comparing Articulatory Analyses based on Manual and Automatic Transcriptions
Applications in Analysis of Dysarthric Speech
Conclusions and Future Work
References
