Automatic stress detection using speech and advanced machine learning methods

  1. MSc thesis
  2. Automatic recognition of stress states through voice and advanced machine learning methods
  3. KALATZANTONAKIS-JULLIEN, GEORGE-MARIOS
  4. Bioinformatics and Neuroinformatics (ΒΝΠ)
  5. 08 April 2021 [2021-04-08]
  6. English
  7. stress detection | speech | voice | Mel cepstral coefficients | artificial intelligence | machine learning | emotion recognition | affective computing | biosignals | feature selection | hyperparameter optimization | artificial neural networks
  8. 0
    • In this thesis, we present a comprehensive study of automatic stress detection based on human vocal features. Our experimental dataset contains the voices of 58 Greek-speaking participants (24 male, 34 female, 26.9±4.8 years old), in both neutral and stressed conditions. After an extensive study of the relevant literature, we extracted a total of 81 speech-derived features. We selected the most robust features using automatic feature selection, comparing multiple feature ranking methods (such as RFE, mRMR, and stepwise fit) to assess their patterns across gender and experimental phase. Classification was then performed using 22 conventional machine learning classifiers, first on the entire dataset and then on each experimental task, for both genders combined and separately. Finally, we optimized classifier hyperparameters using a grid search on the most promising classifiers, such as the Gaussian SVM and the ensemble AdaBoost classifier. Performance was evaluated using 10-fold cross-validation on the speakers and a leave-one-speaker-out approach. The best classification accuracy was 97.36% (97.57% F1-score), obtained with the AdaBoost algorithm, followed by the SVM with 91.88% accuracy (92.63% F1-score). Deep learning methods were also evaluated: the RNN achieved an accuracy of 76.2%, while the combination of CNN and LSTM achieved 73.37%. From our analysis, specific vocal features were identified as robust and relevant to stress, along with parameters for constructing the stress model. However, it was observed that while all classifiers performed well for some participants, they performed very poorly for others, indicating the susceptibility of speech to bias and masking and thus the need for universal speech markers for stress detection.
  9. Items in Apothesis are protected by copyright, with all rights reserved, unless otherwise indicated.
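
The leave-one-speaker-out evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis code. The data are synthetic, the nearest-centroid classifier is only a stand-in for the classifiers named in the abstract, and all function names (`leave_one_speaker_out`, `fit_centroids`, `predict`, `f1_score`, `evaluate`) are hypothetical. The sketch holds out every recording of one speaker per fold and reports pooled accuracy, F1-score, and per-speaker accuracy, which is the kind of breakdown that can reveal participants for whom a model performs poorly.

```python
import random
from collections import defaultdict


def leave_one_speaker_out(speakers):
    """Yield (held_out_speaker, train_idx, test_idx), holding out each speaker once."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield held_out, train, test


def fit_centroids(X, y):
    """Stand-in classifier (assumption): store the mean feature vector per class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [a + b for a, b in zip(sums[label], row)]
        counts[label] += 1
    return {lab: [v / counts[lab] for v in vec] for lab, vec in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))


def f1_score(y_true, y_pred, positive=1):
    """Binary F1-score for the class given by `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def evaluate(X, y, speakers):
    """Run leave-one-speaker-out and return (accuracy, f1, per-speaker accuracy)."""
    per_speaker, all_true, all_pred = {}, [], []
    for spk, train, test in leave_one_speaker_out(speakers):
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in test]
        truth = [y[i] for i in test]
        per_speaker[spk] = sum(p == t for p, t in zip(preds, truth)) / len(test)
        all_true.extend(truth)
        all_pred.extend(preds)
    accuracy = sum(p == t for p, t in zip(all_pred, all_true)) / len(all_true)
    return accuracy, f1_score(all_true, all_pred), per_speaker


# Synthetic toy data (assumption): 6 speakers, 5 neutral (0) and 5 stressed (1)
# samples each, with two made-up features per sample.
rng = random.Random(0)
X, y, speakers = [], [], []
for spk in range(6):
    for label in (0, 1):
        for _ in range(5):
            X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
            speakers.append(spk)

accuracy, f1, per_speaker = evaluate(X, y, speakers)
print(f"accuracy={accuracy:.3f} f1={f1:.3f}")
for spk, acc in per_speaker.items():
    print(f"speaker {spk}: accuracy={acc:.2f}")
```

Splitting by speaker rather than by sample keeps recordings of the same person out of both the training and test sets, so the scores reflect generalization to unseen speakers.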