Article

A free Kazakh speech database and a speech recognition baseline

Ying Shi, Department of Computer Science and Technology, Tsinghua University, China
Askar Hamdullah, Xinjiang University, Wulumuqi, Xinjiang, CN
Zhiyuan Tang, Department of Computer Science and Technology, Tsinghua University, China
Dong Wang, Department of Computer Science and Technology, Tsinghua University, China
Thomas Fang Zheng, Department of Computer Science and Technology, Tsinghua University, China
2017, en

Abstract

Automatic speech recognition (ASR) has improved significantly for major languages such as English and Chinese, partly due to the emergence of deep neural networks (DNNs) and large amounts of training data. For minority languages, however, progress lags far behind the mainstream. A particular obstacle is that there are almost no large-scale speech databases for minority languages; the few that exist are held privately by individual institutes, are neither open nor standardized, and are rarely free. Besides speech databases, phonetic and linguistic resources are also scarce, including phone sets, lexicons, and language models. In this paper, we publish a speech database in Kazakh, a major minority language in western China. Accompanying this database, a full set of phonetic and linguistic resources is also published, with which a full-fledged Kazakh ASR system can be constructed. We describe the recipe for constructing a baseline system and report our present results. The resources are free for research institutes and can be obtained on request. The publication is part of the M2ASR project, supported by NSFC, which aims to build multilingual ASR systems for minority languages in China.

Citations and sources

Citations: 2; sources used: 0