IMPROVEMENT OF INFERENCE-TIME PREDICTION FOR SPEECH EMOTION RECOGNITION USING ITERATIVE kNN MAJORITY VOTING ON WavLM FEATURE EMBEDDINGS
| dc.contributor.author | FALANA, John Oluwaseun | |
| dc.contributor.author | Covenant University Dissertation | |
| dc.date.accessioned | 2025-10-02T14:58:17Z | |
| dc.date.issued | 2025-08 | |
| dc.description.abstract | Prediction inconsistency and poorly defined decision boundaries in high-dimensional embedding spaces limit the performance of Speech Emotion Recognition (SER) systems. This study proposes a post-processing framework that applies iterative k-Nearest Neighbors (kNN) majority voting to refine the output of a fine-tuned WavLM model without requiring retraining. Using CREMA-D, an English-language dataset of 7,442 samples, embeddings were extracted and iteratively relabelled based on local neighborhood structure in the latent space. This refinement process enhanced label consistency and leveraged proximity-based corrections at inference time. Model performance was evaluated using standard SER metrics (accuracy and F1 score) and t-SNE visualization. Results show that repeated kNN refinement improves both classification accuracy and the clarity of decision boundaries, yielding a 1.87% improvement in F1 score over the baseline, compared with a 0.67% improvement over the baseline for the SCL+kNN approach. The approach is model-agnostic, efficient, and data-centric, offering a viable alternative to computationally expensive retraining. It highlights the value of embedding-space operations for improving SER reliability in real-world settings. | |
| dc.identifier.uri | https://repository.covenantuniversity.edu.ng/handle/123456789/50404 | |
| dc.language.iso | en | |
| dc.publisher | Covenant University Ota | |
| dc.subject | Speech Emotion Recognition | |
| dc.subject | Human Computer Interaction | |
| dc.subject | K-Nearest Neighbors | |
| dc.subject | Self-Supervised Learning | |
| dc.subject | WavLM | |
| dc.title | IMPROVEMENT OF INFERENCE-TIME PREDICTION FOR SPEECH EMOTION RECOGNITION USING ITERATIVE kNN MAJORITY VOTING ON WavLM FEATURE EMBEDDINGS | |
| dc.type | Thesis |
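
The iterative relabelling described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes Euclidean distance, excludes each point from its own neighbourhood, updates all labels synchronously, and stops when no label changes. The function name `knn_majority_refine`, the value of `k`, and the toy 2-D "embeddings" are hypothetical and stand in for WavLM embeddings and the model's initial predictions.

```python
import math
from collections import Counter


def knn_majority_refine(embeddings, labels, k=3, max_iters=10):
    """Iteratively relabel each point by majority vote of its k nearest neighbours.

    Assumptions (not taken from the thesis): Euclidean distance, a point is
    excluded from its own neighbourhood, updates are synchronous, and a point
    keeps its current label whenever that label ties for the most votes.
    """
    n = len(embeddings)
    # Embeddings are fixed, so neighbour indices are computed only once.
    neighbours = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(embeddings[i], embeddings[j]))
        neighbours.append(others[:k])

    current = list(labels)
    for _ in range(max_iters):
        updated = []
        for i in range(n):
            votes = Counter(current[j] for j in neighbours[i])
            top_label, top_count = votes.most_common(1)[0]
            # Keep the current label unless another label strictly outvotes it.
            if votes[current[i]] == top_count:
                updated.append(current[i])
            else:
                updated.append(top_label)
        if updated == current:  # converged: no label changed in this pass
            break
        current = updated
    return current


# Toy data: two well-separated clusters, each with one inconsistent label.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
initial = ["happy", "happy", "sad", "happy", "sad", "sad", "sad", "happy"]
refined = knn_majority_refine(embeddings, initial, k=3)
```

In this toy case the two inconsistent labels are outvoted by their neighbours and each cluster ends up with a single label. With real embeddings the outcome depends on the choice of `k`, the distance metric, and the stopping rule, none of which are specified in this record.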
Files
Original bundle
- Name:
- Pages from FALANA JOHN OLUWASEUN 23PCG02638 final printing copy.pdf
- Size:
- 287.52 KB
- Format:
- Adobe Portable Document Format