Automatic vocal quality classification

Authors

  • Mario Alejandro García, doctoral candidate, Universidad Tecnológica Nacional - Facultad Regional Córdoba - Argentina
  • Eduardo A. Destéfanis, Director

DOI:

https://doi.org/10.33414/ajea.4.408.2019

Keywords:

Deep Learning, Artificial Neural Network, Vocal Quality

Abstract

To classify vocal quality on the GRBAS scale, an end-to-end neural network design approach is presented. Based on this approach, three neural networks are shown that compute the short-term Fourier transform (STFT), the cepstrum, and the shimmer of an audio signal. Training of the networks that compute the STFT and the shimmer was successful. The network that computes the cepstrum could not be trained, but an alternative model that computes the autocovariance could. It is concluded that the developed neural networks are compatible with the proposed approach, since they allow backpropagation of the error gradient, a necessary condition for training the complete model.
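The key property the abstract relies on is that a feature extractor such as the STFT can be written as a fixed linear layer, so error gradients can flow through it. The following is a minimal NumPy sketch of that idea (not the authors' implementation): the DFT is expressed as two constant weight matrices (cosine and sine kernels) applied to windowed, overlapping frames, which makes the whole operation a matrix product and therefore differentiable.

```python
import numpy as np

def stft_layer(signal, frame_len=256, hop=128):
    """STFT magnitude computed as a framed matrix product.

    The DFT is expressed as two fixed weight matrices (cosine and
    sine kernels), so the operation is linear in the input frames;
    a gradient can be backpropagated through it like through any
    dense layer with frozen weights.
    """
    n = np.arange(frame_len)
    k = n[:, None]                      # frequency-bin index, column vector
    window = np.hanning(frame_len)
    cos_w = np.cos(2 * np.pi * k * n / frame_len) * window   # real-part kernel
    sin_w = -np.sin(2 * np.pi * k * n / frame_len) * window  # imaginary-part kernel

    # Slice the signal into overlapping frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])

    real = frames @ cos_w.T
    imag = frames @ sin_w.T
    return np.sqrt(real**2 + imag**2)   # magnitude spectrogram, (frames, bins)

# Example: the first frame matches the FFT of the same windowed frame.
sig = np.sin(2 * np.pi * 440 * np.arange(1024) / 8000)
mag = stft_layer(sig)
ref = np.abs(np.fft.fft(sig[:256] * np.hanning(256)))
```

The frame length, hop size, and Hann window here are illustrative defaults, not values taken from the paper; the point is only that the layer's weights are fixed trigonometric kernels rather than learned parameters.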

Published

2019-11-01

How to Cite

García, M. A., & Destéfanis, E. A. (2019). Automatic vocal quality classification. AJEA (Proceedings of UTN Academic Conferences and Events), (4). https://doi.org/10.33414/ajea.4.408.2019