Frontiers of Data and Computing (数据与计算发展前沿) ›› 2023, Vol. 5 ›› Issue (5): 164-173.

CSTR: 32002.14.jfdc.CN10-1649/TP.2023.05.014

doi: 10.11871/jfdc.issn.2096-742X.2023.05.014

• Technology and Application •

A Knowledge-Driven and Data-Driven Integration Method for Korean Auto-Pronunciation
(知识与数据驱动相融合的朝鲜语自动标音方法研究)

CAO Dezhi (曹德智)¹, WU Licheng (吴立成)², ZHAO Yue (赵悦)²,*

  1. School of Chinese Ethnic Minority Languages and Literatures, Minzu University of China, Beijing 100081, China
  2. School of Information Engineering, Minzu University of China, Beijing 100081, China
  • Received: 2022-04-18 Online: 2023-10-20 Published: 2023-10-31
  • Corresponding author: ZHAO Yue (E-mail: zhaoyueso@muc.edu.cn)
  • About the authors:
    CAO Dezhi, Minzu University of China, Ph.D. candidate. His main research interests are speech recognition technology and systems, and computational linguistics.
    In this paper, he is responsible for writing the paper, compiling the rules, extracting attribute features, and building the model.
    E-mail: cdzhi9605@163.com
    ZHAO Yue, Minzu University of China, Ph.D., professor. Her main research interests are speech recognition technology and systems, machine learning, and embedded systems. She has published two monographs, two textbooks, and more than 30 papers indexed by SCI and EI.
    In this paper, she is responsible for revising and approving the paper, and for designing the technical scheme of the Korean automatic phonetic transcription method that integrates the knowledge-driven and data-driven approaches.
    E-mail: zhaoyueso@muc.edu.cn
  • Funding:
    National Natural Science Foundation of China, General Program, "Research on Tibetan Multi-Dialect Speech Recognition Based on End-to-End Multi-Task Learning" (61976236); National Natural Science Foundation of China, General Program, "Research on Key Technologies and Prototype Development of a Disc-Footed Water Strider Robot" (61773416); Minzu University of China independent research project, "Research on Chinese-Tibetan Cross-Lingual Speech Recognition Based on Big Data Transfer Learning" (2020MDJC06)

Abstract:

[Application Background] In building resources for Korean speech information processing, automatic phonetic transcription, that is, grapheme-to-phoneme (G2P) conversion, plays a crucial role. Current G2P approaches fall into two main families: knowledge-based and data-based. [Objective] This paper addresses the weaknesses of using either approach alone. Purely knowledge-driven methods adapt poorly to large volumes of data, which leads to complex models and difficult computation. Purely data-driven methods depend on high-quality data and offer little guidance for choosing input variables, so they need sufficient and accurately selected model features. [Methods] This paper proposes a Korean automatic phonetic transcription method that integrates the knowledge-driven and data-driven approaches. It first extracts accurate feature attributes based on the rules of Korean phonological variation to obtain high-quality data; it then exploits the strength of data-driven models in fitting the mapping between input and output variables to train a learning model that transcribes Korean automatically. [Results] The resulting transcriptions account for sound changes in continuous Korean speech, such as syllable weakening, elision, epenthesis, and dissimilation, and accurately yield the phoneme corresponding to each grapheme. In cross-testing, the method improves the performance of the prediction model, and the average correct grapheme-to-phoneme conversion rate reaches 94.63%. [Conclusions] The proposed method can effectively build an accurate Korean pronunciation dictionary and is expected to provide technical support for systems such as Korean speech recognition and speech synthesis.

Key words: knowledge-driven, data-driven, Korean, phonetic variation in connected speech, grapheme-to-phoneme conversion
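
The two-stage idea in the abstract (rule-informed feature extraction followed by a learned model) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `decompose` and `syllable_features`, the feature names, and the two rule flags (liaison and obstruent nasalization, both well-known Korean sound changes) are assumptions chosen for illustration, and the paper's actual feature set is not given in this section. The only fixed fact used is the Unicode layout of precomposed Hangul syllables (U+AC00 to U+D7A3).

```python
# Illustrative sketch only: knowledge-derived context features for Korean G2P.
# All names and the choice of rules are hypothetical, not taken from the paper.

HANGUL_BASE = 0xAC00   # first precomposed Hangul syllable
N_SYLLABLES = 11172    # 19 initials * 21 medials * 28 finals

INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ",
          "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ",
          "ㅋ", "ㅌ", "ㅍ", "ㅎ"]


def decompose(syllable):
    """Split one precomposed Hangul syllable into (initial, medial, final)."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < N_SYLLABLES:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    initial = INITIALS[index // 588]        # 588 = 21 medials * 28 finals
    medial = MEDIALS[(index % 588) // 28]
    final = FINALS[index % 28]              # "" means no final consonant
    return initial, medial, final


def syllable_features(word):
    """Build one feature dict per syllable, including neighbouring context."""
    jamo = [decompose(s) for s in word]
    rows = []
    for i, (initial, medial, final) in enumerate(jamo):
        prev_final = jamo[i - 1][2] if i > 0 else "<s>"
        next_initial = jamo[i + 1][0] if i + 1 < len(jamo) else "</s>"
        rows.append({
            "initial": initial,
            "medial": medial,
            "final": final,
            "prev_final": prev_final,
            "next_initial": next_initial,
            # Liaison: a final consonant moves onto a following empty onset.
            "liaison_candidate": final != "" and next_initial == "ㅇ",
            # Nasalization: a plain stop final before a nasal onset.
            "nasalization_candidate": final in {"ㄱ", "ㄷ", "ㅂ"}
                                      and next_initial in {"ㄴ", "ㅁ"},
        })
    return rows


if __name__ == "__main__":
    for row in syllable_features("국물"):
        print(row)
```

In a pipeline of this shape, the feature dicts would be paired with target phonemes and passed to whichever classifier is chosen; that training step is omitted here because the section does not specify the model.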