1.Unsedated Office-Based KTP Laser Excision of Large Vocal Fold Polyps: A Two-Case Series
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2026;37(1):53-59
Office-based angiolytic potassium titanyl phosphate (KTP) laser surgery is widely used for benign vocal fold lesions, but very large pedunculated polyps are often managed under general anesthesia (GA) because of dynamic glottic narrowing and airway concerns. We report two clinically stable male patients with large pedunculated vocal fold polyps causing approximately 30%–52% glottic airway area occupancy at maximal abduction who underwent unsedated, office-based KTP excision under topical anesthesia. Both procedures were completed without desaturation, airway compromise, or conversion to GA. Histopathology confirmed vocal fold polyps, and early follow-up (2 weeks and 1 month) demonstrated symptomatic improvement with objective improvement in GRBAS and acoustic measures. These cases suggest that, with strict patient selection and a defined airway contingency pathway, unsedated office-based treatment may be feasible even for selected large ball-valve polyps, potentially expanding the indications for awake local-anesthesia management.
2.A Case of Recurrent Verrucous Carcinoma of the Larynx
Minah SHIN ; Wonae LEE ; Sang Joon LEE
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2026;37(1):48-52
Verrucous carcinoma of the larynx is a rare, well-differentiated variant of squamous cell carcinoma that is locally invasive but rarely metastasizes. However, diagnosis can be difficult because superficial biopsies often fail to demonstrate the characteristic pushing border of verrucous carcinoma. Consequently, multiple biopsies may fail to establish a definitive diagnosis. Surgical excision remains the primary treatment. We present a case of a 54-year-old man who required several years to achieve a definitive diagnosis of verrucous carcinoma and whose lesion recurred repeatedly despite multiple excisions. This case emphasizes the importance of obtaining adequately deep and wide biopsy specimens for the accurate diagnosis of verrucous carcinoma. It also illustrates the therapeutic dilemma in the management of glottic verrucous carcinoma, as aggressive excision may compromise vocal function, whereas conservative excision increases the risk of recurrence.
3.Usability Test of Vocal Hygiene Program Application for Patients With Voice Disorders
Seo Yeon CHO ; Jae Deuk JO ; Chae Rim PARK ; Gil Joon LEE ; Sung Min JIN ; Sang Hyuk LEE
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2026;37(1):43-47
Background and Objectives:
Patients with voice disorders often face difficulties in managing their vocal health in daily life, highlighting the need for voice management and hygiene education without temporal or spatial limitations. This study aimed to evaluate the usability of a smartphone-based voice hygiene application developed for patients with voice disorders.
Materials and Method
Thirty patients diagnosed with voice disorders who underwent voice therapy participated in the study. After using the voice hygiene application for 4 weeks, participants completed a usability questionnaire composed of ten items across five key categories: aesthetics, functionality, engagement, usability, and reliability (two items per category).
Results:
The overall usability score of the voice hygiene application was 3.94 out of 5. The highest ratings were observed in aesthetic design and convenience. Engagement was positively influenced by the app’s reminder feature, and satisfaction regarding information reliability and system functionality was also generally high.
Conclusion
The voice hygiene application was found to be a useful tool for supporting self-management and maintaining vocal hygiene among patients with voice disorders through unrestricted accessibility. This study suggests that such applications can promote consistent voice hygiene practices and contribute to the improvement of vocal health.
4.Audio–Text Contrastive Representation Learning for Voice Assessment: Toward Assistive Clinical Applications
Kwang Hyeon KIM ; Yoonkyoung SO
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2026;37(1):33-42
Background and Objectives:
This study aims to develop and validate a bidirectional audio–text contrastive representation learning framework that enhances alignment between linguistic content and speech production characteristics, thereby exploring its potential utility for future clinical voice assessment applications.
Materials and Method
A dual-encoder multimodal contrastive model was trained using 12,854 Korean speech–text pairs. Audio inputs were converted into 80-channel Mel spectrograms with SpecAugment, and text was processed using a KoBERT-based tokenizer. Joint embeddings were optimized via bidirectional cosine-similarity InfoNCE loss over 100 epochs. Retrieval-based evaluation quantified alignment performance between paired inputs.
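The bidirectional cosine-similarity InfoNCE objective named above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names are hypothetical, the embeddings are toy vectors, and the temperature of 0.07 is an assumed value that the abstract does not report.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_similarity_matrix(audio, text):
    """Entry [i][j] is the cosine similarity of audio item i and text item j."""
    a = [l2_normalize(v) for v in audio]
    t = [l2_normalize(v) for v in text]
    return [[sum(x * y for x, y in zip(ai, tj)) for tj in t] for ai in a]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def bidirectional_info_nce(audio, text, temperature=0.07):
    """Average of the audio-to-text and text-to-audio InfoNCE losses.

    temperature=0.07 is an assumption for illustration only.
    """
    sim = cosine_similarity_matrix(audio, text)
    logits_a2t = [[s / temperature for s in row] for row in sim]
    logits_t2a = [list(column) for column in zip(*logits_a2t)]  # transpose
    return 0.5 * (cross_entropy_diagonal(logits_a2t)
                  + cross_entropy_diagonal(logits_t2a))


# Toy batch of two pairs: matched pairs should give a lower loss than swapped ones.
audio_batch = [[1.0, 0.0], [0.0, 1.0]]
matched_text = [[1.0, 0.0], [0.0, 1.0]]
swapped_text = [[0.0, 1.0], [1.0, 0.0]]
print(bidirectional_info_nce(audio_batch, matched_text))
print(bidirectional_info_nce(audio_batch, swapped_text))
```

Each item in a batch acts as the positive for its own pair and as a negative for every other pair, which is why the correct class for row i is column i in both directions.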
Results:
The dataset exhibited an average utterance duration of 3.60±1.01 seconds and an average length of 4.99±1.73 words, with high variability in phonetic realizations reflected by an ASR baseline word error rate of 100.69% (p=0.0311 for utterance-length effects). Despite such heterogeneity, the proposed model achieved consistent multimodal alignment, with audio-to-text and text-to-audio retrieval Recall@10 of 0.523 and 0.520, respectively, and median ranks of 9.00 and 10.00.
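As a rough illustration of how retrieval metrics such as Recall@10 and median rank are typically computed from a similarity matrix, the sketch below uses hypothetical function names and a toy 3×3 matrix. It assumes the true match for query i is candidate i and that ties are resolved in favor of the true match; the abstract does not describe the authors' evaluation code.

```python
from statistics import median


def retrieval_ranks(sim):
    """Rank (1 = best) of the true match for each query.

    sim[i][j] is the similarity of query i to candidate j; the true match
    for query i is assumed to be candidate i.
    """
    ranks = []
    for i, row in enumerate(sim):
        target = row[i]
        # Count the candidates that score strictly higher than the true match.
        ranks.append(1 + sum(1 for score in row if score > target))
    return ranks


def recall_at_k(ranks, k):
    """Fraction of queries whose true match appears in the top k results."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Toy audio-by-text similarity matrix (rows: audio queries, columns: texts).
sim = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.1],
    [0.2, 0.4, 0.7],
]
audio_to_text = retrieval_ranks(sim)                             # [1, 2, 1]
text_to_audio = retrieval_ranks([list(c) for c in zip(*sim)])    # [1, 2, 1]
print(recall_at_k(audio_to_text, 1), median(audio_to_text))
print(recall_at_k(text_to_audio, 1), median(text_to_audio))
```

Transposing the matrix swaps the roles of query and candidate, which gives the text-to-audio direction from the same similarities.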
Conclusion
The findings indicate that contrastive alignment can produce robust multimodal speech representations under diverse reading patterns. Although clinical validation using pathological speech remains necessary, this work establishes a proof-of-concept basis for future voice assessment applications leveraging retrieval-based embedding analysis.
5.Magnetic Resonance Convergence Study on Explanation of the Designs of the Four Primary Vowel Letters and Four Secondary Vowel Letters of the Hunminjeongeum Middle Vowel Letters
Hong-Shik CHOI ; Jeong Min LEE ; Jinna KIM ; Ho-Young LEE ; Yunseok KANG ; Seungsu LEE ; Seul-ong KIM
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2026;37(1):19-32
This convergence study uses magnetic resonance imaging (MRI) to empirically verify the design principles of the Hunminjeongeum Joongsung-Ja (中聲字, vowel) system, in particular the pictorial principles behind the primary and secondary vowel letters. Following the principle stated in the Jejahae (制字解) of the Hunminjeongeum Haeryebon, that “the 28 characters of Jeongeum were each modeled after their shapes,” the study aimed to show that all characters were devised to reflect the shape of the vocal tract during pronunciation. Four participants, including younger and older adults of both sexes, were enrolled, and real-time MRI scans and acoustic analysis were conducted. In the analysis of the Cho-Chul-Ja (初出字, /ㅗ, ㅏ, ㅜ, ㅓ/), clear differences were observed across vowels in the structure of the vocal tract and the shape of the resonance cavity, and the degree of lip opening and the length of the vocal tract were closely related to the acoustic characteristics of each vowel. In the Jae-Chul-Ja (再出字, /ㅛ, ㅑ, ㅠ, ㅕ/), the shape of the resonance cavity changed along with tongue movement and lip opening during diphthong utterances, and these changes were directly related to the acoustic characteristics. These results confirm that the design principle of the Hunminjeongeum Joongsung-Ja is closely related to the form of the resonance cavity during actual articulation. They also confirm that the letters were created by the synthesis (Hapsung, 合成) of the three letters “Cheon-ji-in (天地人, /•, ㅡ, ㅣ/)” and that the pictorial principles of the Cho-Chul-Ja and Jae-Chul-Ja are based on the shape and movement of the resonance tube. The study is significant in that it re-examines, with modern technology, the scientific sophistication of the vowel system devised during the creation of Hunminjeongeum in the 15th century. This shows that Hunminjeongeum is not just a philosophical symbol but a writing system that reflects the anatomical structure and physiological movements of the actual speech organs.
6.Diagnostic Potential and Clinical Utility of Analysis of Dysphonia in Speech and Voice Parameters in Voice Disorders
Seung Jin LEE ; Ji Hye YOON ; Woojae HAN
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2026;37(1):9-18
The Analysis of Dysphonia in Speech and Voice (ADSV) program provides advanced acoustic measures that have improved the objectivity and reliability of clinical voice assessment. Among its parameters, cepstral peak prominence (CPP) and low/high spectral ratio (SR) have demonstrated significant diagnostic value in differentiating normal and pathological voices. CPP quantifies the degree of harmonic organization within a voice signal, reflecting periodicity and vocal stability, while the SR captures the spectral energy balance between low- and high-frequency bands, offering supplementary information on breathiness and noise components. Compared with conventional perturbation-based methods such as the Multi-Dimensional Voice Program, ADSV parameters are less affected by signal irregularity and maintain analytical robustness even in severely dysphonic voices. Recent studies have established normative CPP and SR values for both Korean and international populations, providing practical reference ranges for clinical application. Furthermore, ADSV-derived indices such as the Cepstral Spectral Index of Dysphonia, Acoustic Psychometric Severity Index of Dysphonia, and Comprehensive Index of Vocal Fatigue have shown promise for quantifying overall dysphonia severity and vocal fatigue. Despite some limitations—such as the need for standardized recording conditions—ADSV measures represent a major advancement toward evidence-based, quantitative voice diagnostics and hold significant potential for integration into clinical and telepractice settings.
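To make the low/high spectral ratio concrete, the sketch below compares spectral energy below and above a cutoff frequency and reports the ratio in decibels. It is an illustration only and does not reproduce the ADSV program: the function names are hypothetical, the 4 kHz cutoff is an assumed default that the abstract does not state, and a naive discrete Fourier transform is used so that the example needs only the standard library.

```python
import cmath
import math


def power_spectrum(signal):
    """Power at each non-negative frequency bin, via a naive DFT (O(n^2))."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        spectrum.append(abs(coefficient) ** 2)
    return spectrum


def low_high_spectral_ratio_db(signal, sample_rate, cutoff_hz=4000.0):
    """Ratio of energy below the cutoff to energy at or above it, in dB.

    cutoff_hz=4000.0 is an assumption for illustration.
    """
    n = len(signal)
    low = 0.0
    high = 0.0
    for k, power in enumerate(power_spectrum(signal)):
        frequency = k * sample_rate / n
        if frequency < cutoff_hz:
            low += power
        else:
            high += power
    if high == 0.0:
        return math.inf  # no measurable high-frequency energy
    return 10.0 * math.log10(low / high)


# Synthetic test signal: a strong 1 kHz tone plus a weak 6 kHz tone.
sample_rate = 16000
samples = [
    math.sin(2 * math.pi * 1000 * i / sample_rate)
    + 0.1 * math.sin(2 * math.pi * 6000 * i / sample_rate)
    for i in range(160)
]
print(low_high_spectral_ratio_db(samples, sample_rate))
```

In this synthetic signal the high-frequency component has one tenth the amplitude of the low-frequency one, so the ratio comes out near 20 dB; adding noise above the cutoff would lower it.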
7.Voice Therapy for Vocal Fold Paralysis: A Pathophysiology-Based Clinical Decision-Making Framework
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2026;37(1):1-8
Vocal fold paralysis (VFP) results from injury to the vagus or recurrent laryngeal nerve and leads to impaired vocal fold mobility, incomplete glottic closure, and inefficient voice production. Beyond dysphonia, VFP may compromise airway protection and swallowing safety, underscoring the need for integrated clinical decision-making. Traditionally, management has emphasized surgical or injection-based medialization to restore glottic closure, while voice therapy has often been described as a postoperative adjunct. However, accumulating evidence suggests that voice therapy can play a central, decision-shaping role across the entire treatment continuum, including as a preferential first-line intervention in selected patients. This narrative review reexamines the role of voice therapy in VFP from a pathophysiology-based perspective. The underlying mechanisms of voice impairment following neural injury are outlined, and the spectrum of hypofunctional and hyperfunctional phonatory patterns is described. A figure-based clinical decision-making pathway is proposed that prioritizes aspiration risk and functional voice characteristics to guide the timing and integration of voice therapy and medialization procedures. Particular emphasis is placed on differentiating pre-medialization voice therapy, post-medialization voice assessment, and post-medialization voice therapy as distinct yet interconnected components of care. By conceptualizing voice therapy as a function-oriented intervention grounded in neural plasticity and motor learning, this review highlights its potential to improve phonatory efficiency, reduce maladaptive compensatory behaviors, and optimize functional outcomes. The proposed framework aims to support individualized, flexible treatment planning for patients with VFP in contemporary clinical practice.
8.Preliminary Study on Detecting Vocal Disorders Using Deep Learning in Laryngology
Kwang Hyeon KIM ; Jae-Keun CHO
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2025;36(1):5-11
Background and Objectives:
Voice disorders can significantly impact quality of life. This study evaluates the feasibility of using deep learning models to detect voice disorders using an open-source dataset.
Materials and Method
We utilized the Saarbrücken Voice Database, which contains 1231 voice recordings of various pathologies. The recordings were split into training (n=1036) and validation (n=195) sets. Key vocal parameters, including fundamental frequency (F0), formants (F1, F2), harmonics-to-noise ratio, jitter, and shimmer, were analyzed. A convolutional neural network (CNN) was designed to classify voice recordings into normal, vox senilis, and laryngocele. Performance was assessed using precision, recall, F1-score, and accuracy.
Results:
The CNN model demonstrated high classification performance, with precision, recall, and F1-scores of 1.00 for normal and 0.99 for vox senilis and laryngocele. Accuracy reached 1.00 after 50 epochs and remained stable through 100 epochs. Time-frequency analysis supported the model’s ability to differentiate between classes.
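For readers unfamiliar with the reported metrics, the sketch below shows how per-class precision, recall, and F1-score, together with overall accuracy, are computed from true and predicted labels. The function name and the six toy labels are hypothetical and are not data from the study; only the three class names come from the abstract.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return ({label: (precision, recall, f1)}, accuracy) for the given labels."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = (precision, recall, f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return metrics, accuracy


# Toy labels for illustration only: one vox senilis sample is misclassified.
classes = ["normal", "vox senilis", "laryngocele"]
y_true = ["normal", "normal", "vox senilis", "vox senilis",
          "laryngocele", "laryngocele"]
y_pred = ["normal", "normal", "vox senilis", "laryngocele",
          "laryngocele", "laryngocele"]
metrics, accuracy = per_class_metrics(y_true, y_pred, classes)
for name in classes:
    print(name, metrics[name])
print("accuracy", accuracy)
```

A single misclassification lowers recall for the true class and precision for the predicted class, which is why the two metrics are reported separately for each class.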
Conclusion
This study highlights the potential of deep learning for voice disorder detection, achieving high accuracy and precision. Future research should address dataset diversity and real-world integration for broader clinical adoption.
9.A Case of Vocal Fold Contact Granuloma Treated With In-Office KTP Ablation Surgery
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2025;36(1):39-43
Contact granuloma of the vocal fold is a benign lesion that can be challenging to treat, often requiring multiple interventions. This case report presents the successful treatment of a refractory vocal fold contact granuloma using KTP laser vaporization under local anesthesia. A 30-year-old male patient with a persistent right vocal process granuloma, unresponsive to 3 months of conservative treatment, underwent in-office KTP laser surgery. The granuloma completely resolved within 1 month post-procedure, with improved voice quality and objective voice parameters. No recurrence was observed during the 6-month follow-up period. The in-office KTP laser vaporization technique provides a viable alternative for treating refractory vocal fold granulomas, particularly in patients who are unsuitable for general anesthesia or voice professionals who cannot tolerate prolonged voice rest following botulinum toxin injections. However, as this is a single case report, larger prospective studies with standardized follow-up periods are necessary to determine long-term efficacy and recurrence rates.
10.The Effects of Semi-Occluded Vocal Tract Exercises on Presbyphonia in Elderly Women: Two Case Reports
HyeJin LIM ; Dong Won LEE ; Jeong Kyu KIM ; Seong-Hee CHOI
Journal of the Korean Society of Laryngology Phoniatrics and Logopedics 2025;36(1):32-38
This case report investigates the effects of semi-occluded vocal tract exercises (SOVTEs) on voice improvement in elderly female patients with presbyphonia. Elderly female patients with presbyphonia commonly present with symptoms such as hoarseness, decreased vocal intensity, and phonatory difficulties. These symptoms are often associated with age-related vocal fold atrophy, leading to compensatory muscle tension and inefficient phonation, which necessitate targeted therapeutic interventions. In this study, two elderly female patients, aged 73 years and 71 years, participated in a voice therapy program centered on SOVTEs designed to promote vocal fold vibration efficiency and reduce compensatory tension. Case 1 underwent five therapy sessions, while Case 2 completed sixteen sessions. Pre- and post-treatment voice assessments revealed notable improvements in vocal quality, suggesting that SOVTEs may be an effective therapeutic approach for managing presbyphonia in elderly women.
