Compiling of Phonetic Database Structure

Maya Heydarova

Abstract

A speech corpus is an essential part of a language's linguistic resources, and it contains the phonetic database. A phonetic database is a structured collection of speech fragments delivered through software. In recent years the phonetic database, or speech corpus, has become a new element of speech technologies, and much research has been devoted to it. Researchers' interest in speech corpora is tied to the development of speech recognition systems. Sufficient experience in preparing phonetic databases has now accumulated. Together with detailed information on preparing and using corpora of everyday speech, the current level of speech technologies and the increasing power of computers make it possible to investigate varied language material and to carry out large-scale statistical phonetic research. This article examines these developing directions of linguistics. Speech corpora are a valuable source of information for phonological research and the study of sound patterns. The study of speech corpora is in its infancy compared with other fields of linguistics. Existing speech corpora cover only part of the world's languages and do not fully represent all dialects and speech forms from a phonological standpoint. The article analyses the history, structure, and importance of developing speech corpora, a branch of corpus linguistics that has developed in recent years. It also lists the main features to be considered in the design of a speech corpus.



Keywords


phonetics, sound corpus, database, speech technology, speech signal


Copyright (c) 2021 Maya Heydarova

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.