1.Tagalog sentence repetition test: Content validation and pilot testing with Metro Manila speakers aged 7-21
Hannah Maria D. Albert ; Ellyn Cassey K. Chua
Philippine Journal of Health Research and Development 2024;28(1):18-24
Background:
Speech sound disorders (SSD) refer to difficulties in perceiving, mentally representing, and/or articulating speech sounds. In 2018, the Tagalog Sentence Repetition Test (SRT) was developed to address the lack of a commercially available local assessment tool for children with suspected SSDs. However, the SRT had not yet been validated or piloted.
Objectives:
This study aimed to determine the SRT’s content validity (comprehensiveness, relevance, comprehensibility), ability to successfully elicit the target sounds, and logistical feasibility and flaws.
Methodology:
All procedures were conducted online. Three linguists evaluated the comprehensiveness of the sounds covered, while 31 Manila Tagalog-speaking children (7 to 21 years old) participated in pilot testing. Post-testing, the children answered a questionnaire to evaluate their familiarity with the sentences’ words (relevance) and the comprehensibility of the test instructions. Content validity was assessed by computing the Content Validity Index (CVI). To see how well the test elicits the target sounds, the number of participants who produced each sound was computed.
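The CVI computation referred to above is commonly done item by item; the following is a minimal sketch assuming the usual I-CVI / S-CVI/Ave definitions, not the authors' actual procedure:

```python
# Hypothetical sketch of a Content Validity Index computation (not the
# study's code). Each expert judges each item as relevant (1) or not
# relevant (0); the item-level CVI (I-CVI) is the proportion of experts
# endorsing the item, and the scale-level CVI (S-CVI/Ave) is the mean
# of the I-CVIs.
def content_validity_index(ratings):
    """ratings: one inner list per item, holding 0/1 expert judgments."""
    i_cvis = [sum(item) / len(item) for item in ratings]
    return i_cvis, sum(i_cvis) / len(i_cvis)

# Three experts endorsing every item yields a CVI of 1.0, as reported:
i_cvis, s_cvi = content_validity_index([[1, 1, 1], [1, 1, 1]])
```

A CVI of 1.0, as in the Results below, simply means every expert endorsed every item.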
Results:
A CVI of 1.0 was obtained for all aspects of content validity. All targets were produced by almost all the participants, except for the final glottal stop (18/31, 58%). The test administration seemed feasible as participants from all age groups successfully executed the task.
Conclusion:
Although the SRT exhibited good content validity, some sentences need to be revised to address sound production issues noted during the pilot. This new version should be re-piloted with 7- to 11-year-olds, both in person and via teleconferencing. A manual should also be created to facilitate administration.
Speech Disorders
;
Speech Production Measurement
2.The analysis of formant characteristics of vowels in the speech of patient with cleft palate.
Xuecai YANG ; Ningyi LI ; Lingxue BU
West China Journal of Stomatology 2003;21(6):451-462
OBJECTIVE: To analyze the formant frequencies of vowels during the sequential therapy of patients with cleft palate.
METHODS: The formant frequencies of the vowels [a], [e], [i] and [u] in normal children and in postoperative patients with and without speech therapy were measured and analyzed with VS-99.
RESULTS: 1. The mean values of F1, F2 and F3 of [a] did not differ significantly among the three groups (P > 0.05). 2. For [e], the mean values differed significantly between the control and pre-speech-therapy groups and between the pre- and post-speech-therapy groups (P < 0.05), but not between the post-speech-therapy and control groups (P > 0.05); the post-speech-therapy formant values were higher than the pre-speech-therapy values. 3. For [i], the mean values differed significantly between the pre- and post-speech-therapy groups (P < 0.05), and the mean F2 and F3 of the post-speech-therapy group were significantly lower than those of the control group (P < 0.05). 4. For [u], the mean values differed significantly between the pre- and post-speech-therapy groups (P < 0.05), while the differences among the other groups were not significant (P > 0.05).
CONCLUSION: Surgical repair of cleft palate alone cannot give all patients perfect velopharyngeal competence (VPC), whereas speech therapy can improve patients' pronunciation. Speech spectrum analysis can objectively judge the effect of cleft palate therapy.
Adolescent ; Adult ; Articulation Disorders ; etiology ; physiopathology ; Child ; Cleft Palate ; complications ; physiopathology ; surgery ; Female ; Humans ; Male ; Postoperative Period ; Sound Spectrography ; Speech ; physiology ; Speech Articulation Tests ; Speech Production Measurement ; Speech Therapy ; Velopharyngeal Insufficiency ; etiology ; physiopathology
3.Design of standard voice sample text for subjective auditory perceptual evaluation of voice disorders.
Jin-rang LI ; Yan-yan SUN ; Wen XU
Chinese Journal of Otorhinolaryngology Head and Neck Surgery 2010;45(9):719-722
OBJECTIVE: To design a voice sample text containing all the phonemes in Mandarin for the subjective auditory-perceptual evaluation of voice disorders.
METHODS: The design principles for the voice sample text were as follows: the short text should include the 21 initials and 39 finals, so as to cover all the phonemes in Mandarin, and it should also be meaningful.
RESULTS: A short text was produced. It contained 155 Chinese words and included the 21 initials and 38 finals (the final ê was excluded because it is rarely used in Mandarin). The text also covered 17 light tones and one "Erhua". The constituent ratios of the initials and finals in the short text were statistically similar to those in Mandarin according to the sample-population similarity method (r = 0.742, P < 0.001 and r = 0.844, P < 0.001, respectively), whereas the constituent ratios of the tones were not (r = 0.731, P > 0.05).
CONCLUSIONS: A voice sample text containing all the phonemes in Mandarin was produced. The constituent ratios of its initials and finals are similar to those in Mandarin. Its value for the subjective auditory-perceptual evaluation of voice disorders needs further study.
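The similarity statistic reported above correlates the constituent ratios of the text with those of Mandarin at large; a minimal stand-in, assuming a plain Pearson correlation (the authors' exact sample-population similarity method may differ):

```python
def pearson_r(x, y):
    """Pearson correlation between two constituent-ratio vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Proportional ratio vectors correlate perfectly:
r = pearson_r([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
```

Higher r means the phoneme mix of the sample text tracks that of the language as a whole.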
Humans ; Language ; Reference Standards ; Speech Acoustics ; Speech Perception ; Speech Production Measurement ; Voice Disorders ; diagnosis ; Voice Quality
4.The acoustic study on vowel movement of normal adult.
Hongyun LU ; Zhaoming HUANG ; Yinting BAI ; Lei ZHANG
Journal of Clinical Otorhinolaryngology Head and Neck Surgery 2011;25(9):406-408
OBJECTIVE:
To study the relationships between the first formant (F1) and the jaw, the second formant (F2) and the tongue, and the third formant (F3) and the lips, by measuring the formants of different single vowels and relating them to the fine articulation of the jaw, lips and tongue, in order to explore the clinical implications of F1, F2 and F3.
METHOD:
The F1, F2 and F3 of /a/, /i/, /e/, /u/ and /ü/ were measured in 30 men with normal hearing. The F1 values of /a/, /i/ and /e/ were compared by one-way ANOVA to examine the relationship between F1 and jaw movement; the F2 values of /a/, /i/, /e/ and /u/ were compared to examine the relationship between F2 and four tongue movements; and the F2 and F3 of /i/ and /ü/ were compared by paired-samples t test to examine their relationship with lip movement.
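The one-way ANOVA used for the F1 comparison reduces to a ratio of between-group to within-group variance; a minimal sketch with made-up F1 values in Hz (illustrative only, not the study's data):

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over lists of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical F1 samples: /a/ high, /e/ mid, /i/ low -> a large F value
f = one_way_anova_f([[810, 790, 805], [505, 495, 500], [300, 310, 295]])
```

A large F against the F distribution with (k-1, N-k) degrees of freedom is what underlies the "significant difference among F1" finding below.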
RESULT:
There were significant differences among the F1 values of /a/, /i/ and /e/. The F2 and F3 of /i/ and /ü/ also differed significantly (P < 0.01). The F2 values of /a/, /i/ and /u/, which express tongue articulation movement, differed significantly (P < 0.01), but F2(a) and F2(e) did not differ significantly in multiple comparisons of means; the differences among the other three tongue positions were highly significant (P < 0.01).
CONCLUSION:
F1 can reflect different positions of the jaw. F2 and F3 can reflect the position of the lips and tongue. F2 can reflect different locations of the tongue.
Adolescent
;
Adult
;
Humans
;
Male
;
Sound Spectrography
;
Speech Acoustics
;
Speech Production Measurement
;
Vocal Cords
;
physiology
;
Young Adult
5.A research in speech endpoint detection based on boxes-coupling generalization dimension.
Zimei WANG ; Cuirong YANG ; Wei WU ; Yingle FAN
Journal of Biomedical Engineering 2008;25(3):536-541
In this paper, a new method of calculating the generalized dimension, based on a boxes-coupling principle, is proposed to overcome edge effects and to improve speech endpoint detection based on the original generalized-dimension calculation. The new method was applied to speech endpoint detection as follows. First, the length of the overlapping border was determined, and by computing the generalized dimension with overlapped boxes covering the speech signal, three-dimensional feature vectors comprising the box dimension, the information dimension and the correlation dimension were obtained. Second, in light of the relation between feature distance and similarity, feature extraction was carried out using a common distance. Finally, a bi-threshold method was used to classify the speech signals. Experimental results indicated that, compared with the original generalized dimension (OGD) and spectral entropy (SE) algorithms, the proposed method is more robust and effective for detecting speech signals containing different kinds of noise at different signal-to-noise ratios (SNR), especially at low SNR.
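The final bi-threshold classification step can be sketched independently of the fractal-dimension features; a minimal version, assuming a single scalar feature track (the paper's three-dimensional feature and distance measure are not reproduced here):

```python
# Hedged sketch of bi-threshold (double-threshold) endpoint detection:
# a segment tentatively opens when the feature rises above the low
# threshold, and is confirmed as speech only if it also crosses the
# high threshold before falling back below the low one.
def bi_threshold_segments(feature, high, low):
    segments, start, triggered = [], None, False
    for i, v in enumerate(feature):
        if start is None and v > low:
            start = i                     # tentative onset
        if v > high:
            triggered = True              # segment confirmed as speech
        if v <= low and start is not None:
            if triggered:
                segments.append((start, i))
            start, triggered = None, False
    if start is not None and triggered:
        segments.append((start, len(feature)))
    return segments

# The first bump crosses the high threshold and is kept; the second
# never does and is rejected as noise:
segs = bi_threshold_segments(
    [0.1, 0.3, 0.9, 0.8, 0.2, 0.1, 0.3, 0.25, 0.1], high=0.6, low=0.25
)
```

The two thresholds trade off missed onsets (low too high) against noise bursts admitted as speech (high too low).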
Artificial Intelligence
;
Humans
;
Pattern Recognition, Automated
;
methods
;
Signal Processing, Computer-Assisted
;
Speech
;
Speech Production Measurement
;
methods
;
Speech Recognition Software
6.Features and clinical application of acoustic parameters in voice.
Journal of Biomedical Engineering 2006;23(4):919-922
In order to study laryngeal phonatory function, acoustic evaluation and phonatory detection methods have become a focus for doctors in otorhinolaryngology and speech pathology. A great number of acoustic parameters have been designed and used. This article discusses objective rated indexes that reflect the functional condition of the vocal cords.
Age Factors
;
Female
;
Humans
;
Male
;
Sex Factors
;
Speech Production Measurement
;
instrumentation
;
methods
7.Noise Reduction Using Wavelet Thresholding of Multitaper Estimators and Geometric Approach to Spectral Subtraction for Speech Coding Strategy.
Kai Chuan CHU ; Charles T M CHOI
Clinical and Experimental Otorhinolaryngology 2012;5(Suppl 1):S65-S68
OBJECTIVES: Noise reduction using wavelet thresholding of multitaper estimators (WTME) and a geometric approach to spectral subtraction (GASS) can improve the speech quality of noisy sound for a speech coding strategy. This study used the Perceptual Evaluation of Speech Quality (PESQ) to assess the performance of WTME and GASS for a speech coding strategy. METHODS: Twenty-five Mandarin sentences served as test materials. Environmental noises, including air-conditioner, cafeteria and multi-talker noise, were artificially added to the test materials at signal-to-noise ratios (SNR) of -5, 0, 5 and 10 dB. The HiRes 120 vocoder with the WTME and GASS noise reduction processes was used to generate sound outputs, whose quality was then measured with the PESQ. RESULTS: Two figures and three tables present the speech quality of the sound outputs of WTME and GASS. CONCLUSION: There is no significant difference in overall sound quality between the two methods, but the geometric approach to spectral subtraction is slightly better than wavelet thresholding of multitaper estimators.
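The geometric variant evaluated above is not reproduced here; as a point of reference, classic magnitude spectral subtraction, which GASS refines, can be sketched as follows (a hedged illustration: the noise magnitude spectrum is assumed to have been estimated from non-speech frames, and a spectral floor limits musical noise):

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, beta=0.02):
    """Per-bin magnitude subtraction with a spectral floor of beta*|X|.

    noisy_mag / noise_mag: magnitude spectra of the noisy frame and of
    the noise estimate; alpha is the over-subtraction factor.
    """
    return [
        max(x - alpha * n, beta * x) for x, n in zip(noisy_mag, noise_mag)
    ]

# Bins dominated by noise are floored instead of going negative:
clean = spectral_subtract([1.0, 0.5, 0.1], [0.2, 0.2, 0.2])
```

The enhanced magnitudes would then be recombined with the noisy phase before resynthesis or vocoder processing.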
Clinical Coding
;
Cochlear Implants
;
Noise
;
Signal-To-Noise Ratio
;
Speech Production Measurement
8.The collecting and processing system of the sound signal of larynx.
Qing JIAO ; Yong-xin GUO ; Qing-guo MENG ; Xian-yun WANG
Chinese Journal of Medical Instrumentation 2002;26(3):174-176
This paper presents a new collecting and processing system for the laryngeal sound signal that uses dual sound cards and a software filter. The installation of the dual sound cards and the software processing scheme are discussed, and a new ADC method for dual-channel sound signals is put forward. The system features reliable performance, simple installation and easy maintenance.
Algorithms
;
Equipment Design
;
Humans
;
Larynx
;
physiology
;
Microcomputers
;
Signal Processing, Computer-Assisted
;
instrumentation
;
Software
;
Sound
;
Speech Production Measurement
;
instrumentation
9.Study on the stability of perceptual evaluation of voice quality.
Gang WANG ; Ping YU ; Wen XU ; Wu WEN ; Chun-sheng WEI ; Dong-yan HUANG ; Li-zhen HOU
Chinese Journal of Otorhinolaryngology Head and Neck Surgery 2011;46(6):485-490
OBJECTIVE: To explore the factors that influence the stability of evaluation results judged by a jury, through a standardized study of perceptual evaluation measurements of voice quality.
METHODS: Voice samples from 300 patients with dysphonia and 100 control subjects with normal voices were recorded and assessed by a jury of six experienced listeners from different hospitals. The voice samples were running speech, presented in random order three times; the mean of the three evaluations on a visual analogue scale was taken as the final result. The jury was instructed to rate the voice samples on the G (grade), R (rough) and B (breathy) components of the GRBAS scale, using a 4-point scale ranging from 0 (normal) to 3 (severe dysphonia). The κ value was used to analyze the concordance of the evaluation results, and regression analysis was used to examine the effect of the extent of the voice disorder on the stability of the perceptual evaluation.
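The κ statistic used above measures agreement beyond chance; a minimal sketch for two raters, assuming an unweighted Cohen's kappa (the study may have used a weighted or multi-rater variant):

```python
def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters over the same samples."""
    n = len(rater_a)
    # Observed agreement: fraction of samples where the raters match
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal category rates
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Two raters scoring G on the 0-3 scale, agreeing on 3 of 4 samples:
kappa = cohens_kappa([0, 0, 3, 3], [0, 1, 3, 3])
```

Values near 1 indicate agreement well beyond chance; values near 0 indicate agreement no better than chance.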
RESULTS: Discordance existed both between the jury members and within individual listeners. Intra-rater concordance for each evaluation parameter was moderate, good or even very good; concordance was best for G (κ = 0.46-0.85), followed by R (κ = 0.41-0.84) and B (κ = 0.41-0.81). Inter-rater concordance was worse than intra-rater concordance; except for one listener whose concordance fell below the requirement, concordance was again best for G (κ = 0.43-0.96), followed by R (κ = 0.33-0.78) and B (κ = 0.002-0.45). The stability of the evaluation of normal voices and severe voice disorders was better than that of mild and moderate voice disorders.
CONCLUSIONS: Inter-rater discordance was the main factor influencing the stability of the perceptual evaluation. The evaluation parameter and the extent of the voice disorder also influence the stability of the jury's perceptual evaluation.
Adolescent ; Adult ; Auditory Perception ; Case-Control Studies ; Female ; Humans ; Male ; Middle Aged ; Speech Perception ; Speech Production Measurement ; Voice Disorders ; diagnosis ; Voice Quality ; Young Adult
10.Detection of endpoint for segmentation between consonants and vowels in aphasia rehabilitation software based on artificial intelligence scheduling.
Xingjuan DENG ; Ji CHEN ; Jie SHUAI
Journal of Biomedical Engineering 2009;26(4):886-899
To improve the efficiency of aphasia rehabilitation training, an artificial intelligence scheduling function was added to the aphasia rehabilitation software, improving the software's performance. Taking into account the characteristics of aphasia patients' voices as well as the needs of the artificial intelligence scheduling functions, the authors designed an endpoint detection algorithm. It determines reference endpoints, then extracts every word and establishes reasonable segmentation points between consonants and vowels using the reference endpoints. Experimental results show that the algorithm achieves its detection objectives with a high accuracy rate and is therefore applicable to endpoint detection in aphasia patients' voices.
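The paper's algorithm is not reproduced here, but the acoustic cues that consonant/vowel segmentation typically exploits can be sketched: unvoiced consonants show a high zero-crossing rate and low short-time energy, vowels the reverse (a hypothetical illustration, not the authors' method):

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs that change sign."""
    return sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    ) / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

# A noise-like (fricative-style) frame vs. one low-frequency (vowel-style)
# cycle; comparing the two statistics suggests where a boundary lies:
fricative = [0.05, -0.05] * 4
vowel = [0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0, -0.7]
```

Scanning these two statistics frame by frame around a detected word endpoint is one common way to place a consonant/vowel boundary.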
Algorithms
;
Aphasia
;
etiology
;
rehabilitation
;
Artificial Intelligence
;
Endpoint Determination
;
Humans
;
Language Therapy
;
instrumentation
;
Phonetics
;
Software
;
Speech Intelligibility
;
Speech Production Measurement
;
instrumentation
;
Stroke
;
complications
;
Stroke Rehabilitation
;
Verbal Behavior