Evanshi Chand
Student, Department of Multimedia, BBK DAV College For Women, Amritsar, Punjab, India
Abstract
This study compares Librosa, a Python-based library, with openSMILE, a C++ toolkit with Python bindings, for audio and speech analysis. Librosa suits beginners because of its simple structure, its flexibility, and its strong integration with machine learning frameworks such as TensorFlow and PyTorch. openSMILE, by contrast, suits speech-centric tasks such as speech emotion recognition and paralinguistic studies, and it offers a wide range of predefined feature sets such as GeMAPS, eGeMAPS, and ComParE. The two tools are contrasted on feature extraction capabilities, implementation, performance efficiency, and limitations, with attention to their strengths and weaknesses in audio feature extraction. The comparison is intended to help researchers and developers choose the better tool for their task, whether it is general audio analysis, emotion detection, or large-scale speech feature extraction. The study concludes that Librosa excels in flexibility and ease of use, while openSMILE offers higher scalability and is better suited to in-depth speech-based tasks.
Keywords: Librosa, openSMILE, speech recognition, feature extraction, Python-based
[This article belongs to Journal of Open Source Developments]
Evanshi Chand. Comparative Analysis Between Librosa and OpenSMILE. Journal of Open Source Developments. 2025; 12(03):06-10. Available from: https://journals.stmjournals.com/joosd/article=2025/view=232605

Journal of Open Source Developments
| Volume | 12 |
| Issue | 03 |
| Received | 28/04/2025 |
| Accepted | 20/09/2025 |
| Published | 31/10/2025 |
| Publication Time | 186 Days |