Analysis of audio signal using various transforms for enhanced audio processing

V. Arun Raj; M. Davidson Kamala Dhas

doi:10.53730/ijhs.v6nS2.8890

Authors

V. Arun Raj
arunraj@mepcoeng.ac.in
Department of ECE, Mepco Schlenk Engineering College (Autonomous), Sivakasi
M. Davidson Kamala Dhas
Department of ECE, Mepco Schlenk Engineering College (Autonomous), Sivakasi

Keywords:

discrete Fourier transform (DFT), discrete sine transform (DST), discrete cosine transform (DCT), modified discrete cosine transform (MDCT), integer modified discrete cosine transform (Integer MDCT)

Abstract

Audio signals are representations of sound. Their characteristics are described more by frequency content than by time-domain behavior, and they reveal more information in the frequency domain, so it is more appropriate to analyze them in the frequency domain than in the time domain. Transforms such as the DFT, DST, DCT, MDCT, and integer MDCT convert a time-domain audio signal into a frequency-domain signal. The signal is then reconstructed, and the original and reconstructed signals are compared using mean square error (MSE), signal-to-noise ratio (SNR), and peak signal-to-noise ratio (PSNR). Other features, namely energy, entropy, and zero-crossing rate (ZCR), were also considered in the evaluation. In this paper, several audio file formats were examined: WAV, MP3, M4A, and AAC. WAV files are uncompressed, whereas MP3, M4A, and AAC files use lossy compression. These features are used in applications such as music information retrieval (MIR), which includes onset detection, pitch detection, and measurement of the noise and loudness of music.
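
The evaluation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a synthetic two-tone test signal instead of an audio file, uses NumPy's real DFT as the only transform, and makes the reconstruction lossy by keeping just the k largest coefficients. The function names, the peak value of 1.0 used for PSNR, and the spectral form of entropy are all assumptions made for illustration.

```python
import numpy as np

def mse(x, y):
    """Mean square error between original x and reconstruction y."""
    return float(np.mean((x - y) ** 2))

def snr_db(x, y):
    """Signal-to-noise ratio in dB; the noise is the reconstruction error."""
    return float(10 * np.log10(np.sum(x ** 2) / np.sum((x - y) ** 2)))

def psnr_db(x, y, peak=1.0):
    """Peak SNR in dB; peak=1.0 assumes samples normalized to [-1, 1]."""
    return float(10 * np.log10(peak ** 2 / mse(x, y)))

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(x)
    return float(np.mean(signs[1:] != signs[:-1]))

def energy(x):
    """Mean squared amplitude of the frame."""
    return float(np.mean(x ** 2))

def spectral_entropy(x):
    """Shannon entropy (bits) of the normalized power spectrum (assumed definition)."""
    power = np.abs(np.fft.rfft(x)) ** 2
    p = power / np.sum(power)
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-np.sum(p * np.log2(p)))

def reconstruct_top_k(x, k):
    """Forward DFT, keep the k largest-magnitude bins, inverse DFT."""
    spectrum = np.fft.rfft(x)
    keep = np.argsort(np.abs(spectrum))[-k:]
    pruned = np.zeros_like(spectrum)
    pruned[keep] = spectrum[keep]
    return np.fft.irfft(pruned, n=len(x))

# Hypothetical test signal: two tones plus weak noise, 1 s at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
x = (0.6 * np.sin(2 * np.pi * 440 * t)
     + 0.3 * np.sin(2 * np.pi * 880 * t)
     + 0.01 * rng.standard_normal(fs))

y = reconstruct_top_k(x, k=2)

print("MSE     :", mse(x, y))
print("SNR dB  :", snr_db(x, y))
print("PSNR dB :", psnr_db(x, y))
print("ZCR     :", zero_crossing_rate(x))
print("Energy  :", energy(x))
print("Entropy :", spectral_entropy(x))
```

Replacing `np.fft.rfft` and `np.fft.irfft` with another forward and inverse transform pair, such as a DCT, leaves the metric functions unchanged, which is what makes a side-by-side comparison of transforms possible.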

