Automatic vocal quality classification

Authors

  • Mario Alejandro García, Doctoral candidate, Universidad Tecnológica Nacional - Facultad Regional Córdoba - Argentina
  • Eduardo A. Destéfanis, Advisor

DOI:

https://doi.org/10.33414/ajea.4.408.2019

Keywords:

Deep Learning, Artificial Neural Network, Vocal Quality

Abstract

To classify vocal quality on the GRBAS scale, an end-to-end neural network design approach is presented. Based on this approach, three neural networks are shown; they calculate the short-term Fourier transform (STFT), the cepstrum, and the shimmer of an audio signal. The networks that calculate the STFT and the shimmer were trained successfully. The network that calculates the cepstrum could not be trained, but an alternative model that calculates the autocovariance could. It is concluded that the developed neural networks are compatible with the proposed approach, since they allow backpropagation of the error gradient, a necessary condition for training the complete model.
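The key property claimed above is that signal-processing steps such as the STFT can live inside a network as layers that gradients flow through. A minimal sketch of the idea (not the authors' implementation; frame length, signal, and variable names are illustrative): the DFT of one audio frame is just two fixed-weight matrix multiplications, real and imaginary, i.e. exactly the kind of linear layer that supports error-gradient backpropagation.

```python
import numpy as np

N = 64                    # illustrative frame length
n = np.arange(N)
k = n.reshape(-1, 1)

# Fixed "layer weights": cosine and (negated) sine basis of the DFT
W_real = np.cos(2 * np.pi * k * n / N)
W_imag = -np.sin(2 * np.pi * k * n / N)

rng = np.random.default_rng(0)
frame = rng.standard_normal(N)   # stand-in for one windowed audio frame

# Forward pass of the "STFT layer": two linear projections, then magnitude
mag = np.sqrt((W_real @ frame) ** 2 + (W_imag @ frame) ** 2)

# The layer reproduces the FFT magnitude exactly, so a network built from
# these (differentiable) operations computes one STFT column per frame
assert np.allclose(mag, np.abs(np.fft.fft(frame)))
```

Because every operation here (matrix multiply, square, square root) is differentiable, an automatic-differentiation framework can propagate gradients through such a layer into earlier parts of a complete model.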

Published

2019-11-01

How to Cite

García, M. A., & Destéfanis, E. A. (2019). Automatic vocal quality classification. AJEA (Proceedings of UTN Academic Conferences and Events), (4). https://doi.org/10.33414/ajea.4.408.2019