Article

A free Kazakh speech database and a speech recognition baseline

Ying Shi, Department of Computer Science and Technology, Tsinghua University, China
Askar Hamdullah, Xinjiang University, Wulumuqi, Xinjiang, CN
Zhiyuan Tang, Department of Computer Science and Technology, Tsinghua University, China
Dong Wang, Department of Computer Science and Technology, Tsinghua University, China
Thomas Fang Zheng, Department of Computer Science and Technology, Tsinghua University, China

2017, en

Abstract

Automatic speech recognition (ASR) has improved significantly for major languages such as English and Chinese, partly due to the emergence of deep neural networks (DNNs) and large amounts of training data. For minority languages, however, progress lags far behind the mainstream. A particular obstacle is that there are almost no large-scale speech databases for minority languages; the few that exist are held privately by individual institutes, are neither open nor standardized, and are rarely free. Besides speech databases, phonetic and linguistic resources such as phone sets, lexicons, and language models are also scarce. In this paper, we publish a speech database in Kazakh, a major minority language in western China. Accompanying this database, a full set of phonetic and linguistic resources is also published, with which a full-fledged Kazakh ASR system can be constructed. We describe the recipe for constructing a baseline system and report our present results. The resources are free for research institutes and can be obtained on request. The publication is supported by the NSFC-supported M2ASR project, which aims to build multilingual ASR systems for minority languages in China.

Identifiers

DOI: 10.1109/apsipa.2017.8282133

Citations and references

Citations: 2
References used: 0
