Dr. Rohit Sinha, EMST Lab IIT-G

Publications

Research Papers in refereed journals:

Haris B C and R. Sinha, “Low-complexity Speaker Verification with Decimated Supervector Representations”, Speech Communication (Elsevier), vol. 67(2), pp. 11-22, December 2014. [Online] http://dx.doi.org/10.1016/j.specom.2014.12.005 [Impact Factor: 1.794]
Haris B C and R. Sinha, “Exploring Data-independent Dimensionality Reduction in Sparse Representation based Speaker Identification,” Circuits, Systems and Signal Processing, vol. 33(8), pp. 2521-2538, August 2014. [Impact Factor: 0.982]
S. K. Yadav, R. Sinha and P. K. Bora, “ECG Signal Denoising using Nonlocal Wavelet Transform Domain Filtering”, IET Signal Processing, vol. 9(1), pp. 88-96, March 2015. [Impact Factor: 0.691]
S. Shahnawazuddin, D. Thotappa, B D Sarma, A Deka, S R M Prasanna and R Sinha, “Low Complexity On-Line Adaptation Techniques in Context of Assamese Spoken Query System,” Journal of Signal Processing Systems (Springer), 15 pages, May 2014 (In press). [Online] http://dx.doi.org/10.1007/s11265-014-0906-z. [Impact Factor: 0.564]
S. Shahnawazuddin and R. Sinha, “Improved Bases Selection in Acoustic Model Interpolation for Fast On-Line Adaptation,” IEEE Signal Processing Letters, vol. 21(4), pp. 493-497, April 2014. [Impact Factor: 1.693]
G. Pradhan, Haris B C, S. R. M. Prasanna, and R. Sinha, “Speaker verification in sensor and acoustic environment mismatch conditions”, International Journal of Speech Technology (Springer), vol. 15, pp. 381-392, June 2012. [Online] http://dx.doi.org/10.1007/s10772-012-9159-z
Haris B C, G. Pradhan, A. Misra, S. R. M. Prasanna, R. Das, and R. Sinha, “Multivariability speaker recognition database in Indian scenario”, International Journal of Speech Technology (Springer), vol. 15, pp. 441-453, March 2012. [Online] http://dx.doi.org/10.1007/s10772-012-9140-x
S. Ghai and R. Sinha, “Exploring the Effect of Differences in the Acoustic Correlates of Adults’ and Children’s Speech in the Context of Automatic Speech Recognition”, in special issue on “Atypical Speech” of EURASIP Journal on Audio, Speech, and Music Processing, March 2010, Article ID 318785, 15 pages, doi:10.1155/2010/318785 [Impact Factor: 0.63]
Rohit Sinha and S. Umesh, “A shift-based approach to speaker normalization using nonlinear frequency-scaling model”, Speech Communication (Elsevier), vol. 50(3), pp. 191-202, March 2008. [Impact Factor: 1.609]
S. Umesh and Rohit Sinha, “A Study of Filter Bank Smoothing in MFCC Features for Recognition of Children’s Speech”, IEEE Trans. on Audio, Speech and Language Processing, vol. 15(8), pp. 2418-2430, November 2007. [Impact Factor: 1.848]
M.J.F. Gales, P.C. Woodland, H.Y. Chan, D. Mrva, Rohit Sinha, S.E. Tranter, “Progress in the CU-HTK Broadcast News Transcription System,” IEEE Trans. on Audio, Speech and Language Processing, vol. 14(5), pp. 1513-1525, September 2006. [Impact Factor: 1.945]

Research Papers in refereed international conferences:

S. Shahnawazuddin and R. Sinha, “A Low Complexity Model Adaptation Approach involving Sparse Coding over Multiple Dictionaries,” in Proc. of INTERSPEECH 2014, Singapore, Sept. 2014 [Acceptance ratio: 48%]
S. Shahnawazuddin and R. Sinha, “A low complexity cluster model interpolation based on-line adaptation technique for spoken query systems,” in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP), Singapore, Sept. 2014 [Acceptance ratio: 53%]
S. K. Yadav, R. Sinha and P. K. Bora, “Image Denoising Using Ridgelet Transform in a Collaborative Filtering Framework,” in Proc. International Conference on Signal Processing and Communications (SPCOM), Bangalore, July 2014 [Acceptance ratio: 22% = 91/410]
K. Khanikar, R. Sinha and R. Bhattacharjee, “Sparse Representation Based Tracking of Frequency Hopping Primary User for Cognitive Radio,” in Proc. International Conference on Signal Processing and Communications (SPCOM), Bangalore, July 2014 [Acceptance ratio: 22% = 91/410]
H. Kathania, S. Shahnawazuddin and R. Sinha, “Exploring HLDA Based Transformation for Reducing Acoustic Mismatch in Context of Children Speech Recognition,” in Proc. International Conference on Signal Processing and Communications (SPCOM), Bangalore, July 2014 [Acceptance ratio: 22% = 91/410]
Sunil Y. and Rohit Sinha, “Sparse Representation Based Approach to Artificial Bandwidth Extension of Speech,” in Proc. Intl. Conf. on Signal Proc. and Comm. (SPCOM), July 2014. [Acceptance ratio: 22% = 91/410]
Sunil Y. and Rohit Sinha, “Exploration of MFCC Based ABWE for Robust Children’s Speech Recognition Under Mismatched Condition,” in Proc. International Conference on Signal Processing and Communications (SPCOM), July 2014. [Acceptance ratio: 22% = 91/410]
S. Shahnawazuddin and R. Sinha, “Fast On-Line Adaptation using KSVD based Acoustic Clustering,” in Proc. of IEEE INDICON, Bombay, Dec. 2013. [Acceptance ratio: 55%]
O. P. Singh, Haris B C and R. Sinha, “Sparse Representation for Language Identification using Prosodic Features for Indian Languages”, in Proc. of IEEE INDICON, Bombay, Dec. 2013. [Acceptance ratio: 55%]
O. P. Singh, Haris B C and R. Sinha, “Language identification using sparse representation: A comparison between GMM supervector and i-vector based approaches”, in Proc. of IEEE INDICON, Bombay, Dec. 2013. [Acceptance ratio: 55%]
H. Kathania, S. Ghai and R. Sinha, “Soft-Weighting Technique for Robust Children Speech Recognition under Mismatched Condition”, in Proc. of IEEE INDICON, Bombay, Dec. 2013. [Acceptance ratio: 55%]
Haris B C, G. Pradhan, Rohit Sinha and S. R. M. Prasanna, “The IITG Speaker Verification Systems for NIST SRE 2012”, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013. [Acceptance ratio: 45% = 1418/3150]
Haris B C and R Sinha, “On exploring the similarity and fusion of i-vector and sparse representation based speaker verification systems,” in Proc. of Odyssey 2012: The Speaker and Language Recognition Workshop, Singapore, June 2012 [Acceptance ratio: 43%]
Haris B C and R. Sinha, “Sparse representation over learned and discriminatively learned dictionaries for speaker verification,” in Proc. of IEEE ICASSP, Kyoto, Japan, March 2012 [Acceptance ratio: 49%]
Om Prakash Singh, Abhijit Mitra and Rohit Sinha, “Modified Joint Channel Source Decoding structure for reliable image transmission”, in Proc. of International Conference on Communication, Information & Computing Technology (ICCICT), Mumbai, October 2012
Haris B C and R. Sinha, “Sparse Representation of Total Variability Smoothed GMM Mean Supervectors for Speaker Verification”, in Proc. of Intl. Conf. on Signal Proc. and Comm. (SPCOM), Bangalore, July 2012
Sunil Y. and R. Sinha, “Exploration of Class Specific ABWE for Robust Children’s ASR under Mismatched Condition”, in Proc. Intl. Conf. on Signal Processing and Communications (SPCOM), Bangalore, July 2012
S. Ghai and R. Sinha, “Enhancing children’s speech recognition under mismatched condition by explicit acoustic normalization”, in Proc. of INTERSPEECH 2011, Makuhari, Japan, Sept. 2011
Shweta Ghai and Rohit Sinha, “Analyzing Pitch Robustness of PMVDR and MFCC Features for Children’s Speech Recognition”, in Proc. of International Conference on Signal Processing and Communications (SPCOM), Bangalore, India, July 2010, pp. 1-5
R. Sinha and Shweta Ghai, “On the use of Pitch Normalization for Improving Children’s Speech Recognition”, in Proc. of INTERSPEECH, Brighton, UK, September 2009, pp. 568-571
Shweta Ghai and Rohit Sinha, “Exploring the Role of Spectral Smoothing in context of Children’s Speech Recognition”, in Proc. of INTERSPEECH, Brighton, UK, September 2009, pp. 1607-1610
Shweta Ghai and Rohit Sinha, “An investigation into the effect of pitch transformation on children speech”, in Proc. IEEE TENCON, Hyderabad, India, Nov. 2008, pp. 1-6.
Rohit Sinha and B. Sandeep Kumar, “Robustness of Speaker Normalization Approaches: A study”, in Proc. IEEE TENCON, Hyderabad, India, Nov. 2008, pp. 1-6.
Krishna Chaitanya, Rohit Sinha, “Energy and entropy based switching algorithm for speech endpoint detection in varying SNR conditions,” in Proc. INTERSPEECH, Brisbane, Australia, Sept. 2008, pp. 2578-2581
M.J.F. Gales, X. Liu, R. Sinha, P.C. Woodland, K. Yu, S. Matsoukas, T. Ng, K. Nguyen, L. Nguyen, J-L Gauvain, L. Lamel and A. Messaoudi, “Speech Recognition System combination for Machine Translation”, in Proc. of IEEE ICASSP, Honolulu, USA, April 2007, vol. 4, pp. 1227-1280.
M. Tomalin, M.J.F. Gales, X.A. Liu, R. Sinha, K.C. Sim, L. Wang, P.C. Woodland and K. Yu, “Improving speech transcription for Mandarin-English translation”, Proc. of ICASSP, Honolulu, USA, April 2007, vol. 4, pp. 97-100.
R. Sinha, M.J.F. Gales, D.Y. Kim, X.A. Liu, K.C. Sim and P.C. Woodland, “The CUHTK Mandarin Broadcast News Transcription System”, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Toulouse, France, May 2006, vol. 1, pp. 1077-1080.
SV Bharath Kumar, S Umesh, R Sinha, “Study of non-linear frequency warping functions for speaker normalization”, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Toulouse, France, May 2006. [Acceptance ratio: 48.1% = 1465/3045]
R. Sinha, S.E. Tranter, M.J.F. Gales and P.C. Woodland, “The Cambridge University March 2005 Speaker Diarisation System”, in Proc. of European Conference on Speech Communication Technology, Lisbon, Sept. 2005, pp. 2437-2440. [Acceptance ratio: 62%=855/1379]
S.E. Tranter, M.J.F. Gales, Rohit Sinha, S. Umesh and P.C. Woodland, “The Development of Cambridge University RT-04 Diarisation System”, in Proc. of Fall 2004 Rich Transcription Workshop (RT-04F), Palisades, NY, November 2004.
S. Umesh, Rohit Sinha and S. V. Bharath Kumar, “An investigation into Front-End Signal Processing for Speaker Normalization”, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada, May 2004, vol. 1, pp. 121-124. [Acceptance ratio: 51.8% = 1262/2434]
S. V. Bharath Kumar, S. Umesh and Rohit Sinha, “Non-uniform Speaker Normalization using Affine-transform”, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada, May 2004, vol. 1, pp. 121-124. Voted top paper in its review category. [Acceptance ratio: 51.8% = 1262/2434]
Rohit Sinha and S. Umesh, “A Method for Compensation of Jacobian in Speaker Normalization”, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Hong Kong, April 2003, vol. 1, pp. 560-563. [Acceptance ratio: 61% = 774/1261]
Rohit Sinha and S. Umesh, “Non-uniform Scaling based Speaker Normalization”, in Proc. of International Conference on Acoustics, Speech and Signal Processing, Florida, USA, May 2002, vol. 1, pp. 589-592. [Acceptance ratio: 56.9% = 1007/1770]
S. Umesh, S.V. Bharath Kumar, M.K. Vinay, Rohit Sinha and Rajesh Kumar, “A Simple Approach to Vowel Normalization”, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, USA, May 2002, vol. 1, pp. 517-520. [Acceptance ratio: 56.9% = 1007/1770]

Research Papers in refereed national conferences:

K. Khanikar, R. Sinha, and R. Bhattacharjee, “Sparse Coding Based Spectrum Sensing in Presence of Multiple Frequency Hopping Primary Users,” in Proc. of 21st NCC, IIT Bombay, Feb. 2015
S. Dey, S. Barman, P.K. Bhukya, R.K. Das, Haris B C, S.R.M. Prasanna and R. Sinha, “Speech biometric based attendance system”, in Proc. of 20th NCC, IIT Kanpur, Feb. 2014
S. Shahnawazuddin and R. Sinha, “Fast On-Line Adaptation using KSVD based Acoustic Clustering”, in Proc. of IEEE INDICON, IIT Bombay, Dec. 2013
O. P. Singh, Haris B C and R. Sinha, “Sparse Representation for Language Identification using Prosodic Features for Indian Languages”, in Proc. of IEEE INDICON, IIT Bombay, Dec. 2013
O. P. Singh, Haris B C and R. Sinha, “Language identification using sparse representation: A comparison between GMM supervector and i-vector based approaches”, in Proc. of IEEE INDICON, IIT Bombay, Dec. 2013
H. Kathania, S. Ghai and R. Sinha, “Soft-Weighting Technique for Robust Children Speech Recognition under Mismatched Condition”, in Proc. of IEEE INDICON, IIT Bombay, Dec. 2013
S Shahnawazuddin, D. Thotappa, B D Sarma, A Deka, S R M Prasanna and R Sinha, “Assamese Spoken Query System to Access the Price of Agricultural Commodities”, in Proc. of 19th NCC, IIT Delhi, Feb. 2013
Vipul Garg, Harsh Kumar and Rohit Sinha, “Speech based emotion recognition based on hierarchical decision tree with SVM, BLG and SVR classifiers”, in Proc. of 19th NCC, IIT Delhi, Feb. 2013
Haris B C and R. Sinha, “Speaker verification using sparse representation over KSVD learned dictionary”, in Proc. of 18th NCC, IIT Kharagpur, Feb. 2012
Haris B C and R. Sinha, “Exploring sparse representation classification for speaker verification in realistic environment”, in Proc. of Centenary Conference, Electrical Engineering, IISc, Bangalore, Dec. 2011
Sunil Y., S. Ghai and R. Sinha, “Exploration of Artificial Bandwidth Expansion for Improving Children’s ASR in Mismatched Condition”, in Proc. of Centenary Conference, Electrical Engineering, IISc, Bangalore, Dec. 2011
Haris B C, G. Pradhan, A. Misra, S. Shukla, R. Sinha, and S. R. M. Prasanna, “Multivariability speech database for robust speaker recognition”, in Proc. 17th NCC, IISc, Bangalore, Feb. 2011
Shweta Ghai and Rohit Sinha, “Maximum Likelihood Pitch Normalization for Improving Children’s Speech Recognition”, in Proc. of 15th NCC, IIT Guwahati, India, January 2009, pp. 316-320
S. Umesh, Rohit Sinha, and D.R Sanand, “Using Vocal-Tract Length Normalisation in Recognition of Children Speech”, in Proc. of 13th National Conference on Communications, IIT Kanpur, India, January 2007
Rohit Sinha and S. Umesh, “A Study into Front-End Signal Processing for Automatic Speech Recognition”, in Proc. of Workshop on Spoken Language Processing, Tata Institute of Fundamental Research, Mumbai, India, January 2003, pp. 87-92.
Rohit Sinha and S. Umesh, “Investigation into Frequency Warping and Spectral Smoothing for Vocal Tract Length Normalization”, in Proc. of 9th National Conference on Communications, IIT Madras, Chennai, India, January 2003, pp. 70-74.
S. Umesh, M. Belkhode and Rohit Sinha, “Comparison of Front-End Features for Speech Recognition”, in Proc. of 5th National Conference on Communications, IIT Kharagpur, India, January 1999, pp. 163-170.

Research Papers in un-refereed international conferences:

R. Sinha and Haris B C, “IITG Speaker Verification Systems for NIST SRE-2012 Evaluations”, in Proc. NIST Speaker Recognition Workshop (SRE), Orlando, Florida, USA, December 2012, pp. 1-4.

Research Papers in un-refereed national conferences:

R. Sinha and S. Shahnawazuddin, “Broad Acoustic Space Adaption in context of Assamese Agricultural Commodity Name Recognition System”, in Proc. of IETE Golden Jubilee Zonal Seminar on “Harnessing Relevant Technologies for NE Region”, Guwahati, India, May 2013, pp. 11-14.