ASPIRE: Publications

Our Publications

Our publications are listed below by year.

Entries are tagged as Journal [J], Conference [C], Workshop [W], or Preprint [P].

2025

[W] J. Fan and D. S. Williamson, "JSQA: Speech Quality Assessment with Perceptually-Inspired Contrastive Pretraining Based on JND Audio Pairs," in Proc. IEEE WASPAA (to appear), 2025. [PDF]

[C] I. E. Kibria and D. S. Williamson, "AttentiveMOS: A Lightweight Attention-Only Model for Speech Quality Prediction," in Proc. Interspeech, pp. 2340-2344, 2025. [PDF]

[C] S. Sultana and D. Williamson, "A Pre-training Framework that Encodes Noise Information for Speech Quality Assessment," Proc. ICASSP, 2025. [PDF]

[C] A. Kumar, A. Perrault, and D. Williamson, "Using RLHF to align speech enhancement approaches to mean-opinion quality scores," Proc. ICASSP, 2025. [PDF]

2024

[P] I. Kibria and D. Williamson, "SWIM: An Attention-Only Model for Speech Quality Assessment Under Subjective Variance," arXiv preprint, 2024. [PDF]

[P] S.A. Alavi Bajestan, M. Pitt, and D. Williamson, "A Contrastive-Learning Approach for Auditory Attention Detection," arXiv preprint, 2024. [PDF]

[P] S. B. H. Pias, A. Freel, R. Huang, D. Williamson, M. Kim, and A. Kapadia, "Building Trust Through Voice: How Vocal Tone Impacts User Perception of Attractiveness of Voice Assistants," arXiv preprint, 2024. [PDF]

[W] S. B. H. Pias, A. Freel, T. Trammel, T. Akter, D. Williamson, and A. Kapadia, "The Drawback of Insight: Detailed Explanations Can Reduce Agreement with XAI," ACM CHI Workshop on Human-Centered Explainable AI (HCXAI), 2024. [PDF]

[C] S. B. H. Pias, R. Huang, D. Williamson, M. Kim, and A. Kapadia, "The Impact of Perceived Tone, Age, and Gender on Voice Assistant Persuasiveness in the Context of Product Recommendations," in Proc. ACM Conf. on Conversational User Interfaces (CUI), pp. 1-15, 2024. (Best Student Paper Award - Top 2% of Submissions) [Paper Link]

[J] J. Fan and D. Williamson, "From the perspective of perceptual speech quality: the robustness of frequency bands to noise," in Journal of the Acoustical Society of America (JASA), vol. 155, pp. 1916-1927, 2024. [PDF]

[C] P. Manocha, D. Williamson, and A. Finkelstein, "CORN: Co-trained full-reference and no-reference speech quality assessment," Proc. ICASSP, pp. 376-380, 2024. [PDF]

2023

[P] P. Manocha, D. Williamson, and A. Finkelstein, "CORN: Co-trained full-reference and non-reference audio metrics," arXiv preprint arXiv:2310.09388, 2023. [PDF]

[P] Y. Liu, A. Kapadia and D. Williamson, "Privacy-preserving and Privacy-attacking Approaches for Speech and Audio -- A Survey," arXiv preprint arXiv:2309.15087, 2023. [PDF]

[J] K. M. Nayem and D. S. Williamson, "Attention-Based Speech Enhancement Using Human Quality Perception Modeling," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 250-260, 2024, doi: 10.1109/TASLP.2023.3328282. [PDF]

[J] Y. Li, Y. Liu, and D. Williamson, "A Composite T60 Regression and Classification Approach for Speech Dereverberation," IEEE/ACM Trans. on Audio, Speech, and Language Processing, vol. 31, pp. 1013-1023, 2023. [PDF]

2022

N. Randall, ..., Y. Li, D. Williamson, ..., "Finding ikigai: How robots can support meaning in later life," in Frontiers in Robotics and AI, vol. 9, 2022. [PDF]

[C] Y. Liu, A. Kapadia, and D. Williamson, "Preventing sensitive-word recognition using self-supervised learning to preserve user-privacy for automatic speech recognition," in Proc. INTERSPEECH, pp. 4207-4211, 2022. [PDF] [Video]

[C] Z. Zhang, D. Williamson, and Y. Shen, "Investigation on the Band Importance of Phase-aware Speech Enhancement," in Proc. INTERSPEECH, pp. 4651-4655, 2022. [PDF] [Video]

[C] G. Yi, W. Xiao, Y. Xiao, B. Naderi, S. Möller, W. Wardah, G. Mittag, R. Cutler, Z. Zhang, D. S. Williamson, F. Chen, F. Yang, and S. Shang, "ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications," in Proc. INTERSPEECH, pp. 3308-3312, 2022. [PDF]

2021

[J] Z. Zhang, Y. Xu, M. Yu, S.-X. Zhang, L. Chen, D. Williamson, and D. Yu, "Multi-channel multi-frame ADL-MVDR for target speech separation," IEEE/ACM Trans. on Audio, Speech, and Language Processing, vol. 29, pp. 3526-3540, 2021. [PDF]

[C] K. Md. Nayem and D. Williamson, “Incorporating Embedding Vectors from a Human Mean-Opinion Score Prediction Model for Monaural Speech Enhancement,” in Proc. INTERSPEECH, pp. 216-220, 2021. [PDF][Video]

[C] P. Vyas, A. Kuznetsova, and D. Williamson, “Optimally Encoding Inductive Biases into the Transformer Improves End-to-End Speech Translation,” in Proc. INTERSPEECH, pp. 2287-2291, 2021. (Best Student Paper Award) [PDF][Video]

Y. Liu, Z. Xiang, E.J. Seong, A. Kapadia and D. Williamson, "Defending against microphone-based attacks with personalized noise," in Proc. Privacy Enhancing Technologies Symposium, pp. 130-150, 2021. [PDF][Video]

[C] K. Md. Nayem and D. Williamson, “Towards An ASR Approach Using Acoustic and Language Models for Speech Enhancement,” in Proc. ICASSP, pp. 7123-7127, 2021. [PDF][Video]

[C] Y. Li, Y. Liu, and D. Williamson, “On loss functions for deep-learning based T60 estimation,” in Proc. ICASSP, pp. 486-490, 2021. [PDF][Video]

[C] Z. Zhang, P. Vyas, X. Dong, and D. Williamson, “An end-to-end non-intrusive model for subjective and objective real-world speech assessment using a multi-task framework,” in Proc. ICASSP, pp. 316-320, 2021. (Outstanding Student Paper Award) [PDF][Video]

2020

[J] X. Dong and D. Williamson, "Towards real-world objective speech quality and intelligibility assessment using speech-enhancement residuals and convolutional long short-term memory networks," in Journal of the Acoustical Society of America (JASA), vol. 148, pp. 3348-3359, 2020. [PDF]

[C] X. Dong and D. Williamson, "A Pyramid Recurrent Network for Predicting Crowdsourced Speech-Quality Ratings of Real-World Signals," in Proc. INTERSPEECH, pp. 4631-4635, 2020. [PDF] [Video]

[C] Z. Zhang, D. Williamson, and Y. Shen, "Investigation of Phase Distortion on Perceived Speech Quality for Hearing-impaired Listeners," in Proc. INTERSPEECH, pp. 2512-2516, 2020. [PDF][Video]

[C] Z. Zhang, C. Deng, Y. Shen, D. Williamson, Y. Sha, Y. Zhang, H. Song, and X. Li, "On Loss Functions and Recurrency Training for GAN-based Speech Enhancement Systems," in Proc. INTERSPEECH, pp. 3266-3270, 2020. [PDF][Video]

[C] Y. (Grace) Li and D. Williamson, "A Return to Dereverberation in the Frequency Domain Using a Joint Learning Approach," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7549-7553, 2020. [PDF] [Video]

[C] K. Nayem and D. Williamson, "Monaural Speech Enhancement Using Intra-Spectral Recurrent Layers in the Magnitude and Phase Responses," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6224-6228, 2020. [PDF] [Video]

[C] X. Dong and D. Williamson, "An Attention Enhanced Multi-Task Model for Objective Speech Assessment in Real-World Environments," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 911-915, 2020. [PDF] [Video]

2019

[C] H. Krishnakumar and D. Williamson, "A Comparison of Boosted Deep Neural Networks for Voice Activity Detection," in Proc. IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 1-5, 2019. [PDF]

[W] K. Nayem and D. Williamson, "Incorporating Intra-Spectral Dependencies with a Recurrent Output Layer for Improved Speech Enhancement," in Proc. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6, 2019. [PDF]

[W] X. Dong and D. Williamson, "A Classification-aided Framework for Non-Intrusive Speech Quality Assessment," in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 100-104, 2019. [PDF]

Z. Zhang, D. Williamson, and Y. Shen, "Impact of Amplification on Speech Enhancement Algorithms using an Objective Evaluation Metric," in Proc. International Congress on Acoustics (ICA), 2019. [PDF]

[C] Z. Zhang, Y. Shen, and D. Williamson, "Objective Comparison of Speech Enhancement Algorithms with Hearing Loss Simulation," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 6845-6849, 2019. [PDF]

K. Berkson et al., "Building a Common Voice Corpus for Laiholh (Hakha Chin)," in Proc. Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL), pp. 5-10, 2019. [PDF]

2018

[W] D. Williamson, "Monaural Speech Separation Using A Phase-Aware Deep Denoising Auto Encoder," in Proc. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2018. [PDF]

[C] X. Dong and D. Williamson, "Long-term SNR estimation using noise residuals and a two-stage deep-learning framework," in Proc. International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), pp. 351-360, 2018. [PDF]

2017

[J] F. Mayer, D. Williamson, P. Mowlaee, and D. L. Wang, "Impact of Phase Estimation on Single-Channel Speech Separation Based on Time-Frequency Masking," Journal of the Acoustical Society of America (JASA), vol. 141, pp. 4668-4679, 2017. [PDF]

[J] D. Williamson and D. L. Wang, "Time-frequency masking in the complex domain for speech dereverberation and denoising," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. (IEEE TASLP), vol. 25, pp. 1492-1501, 2017. [PDF]

[C] D. Williamson and D. L. Wang, "Speech Dereverberation and Denoising using Complex Ratio Masks," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5590-5594, 2017. [PDF]

2016

[J] D. Williamson, Y. Wang, and D. L. Wang, "Complex ratio masking for monaural speech separation," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. (IEEE TASLP), vol. 24, pp. 483-492, 2016. [PDF]

[C] D. Williamson, Y. Wang, and D. L. Wang, "Complex ratio masking for joint enhancement of magnitude and phase," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5220-5224, 2016. [PDF]

2015

[J] D. Williamson, Y. Wang, and D.L. Wang, "Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality," Journal of the Acoustical Society of America (JASA), vol. 138, pp. 1399-1407, 2015. [PDF]

[C] D. Williamson, Y. Wang, and D.L. Wang, "Deep neural networks for estimating speech model activations," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5113-5117, 2015. [PDF]

2014

[J] D. Williamson, Y. Wang, and D.L. Wang, "Reconstruction techniques for improving the perceptual quality of binary masked speech," Journal of the Acoustical Society of America (JASA), vol. 136, pp. 892-902, 2014. [PDF]

[C] D. Williamson, Y. Wang, and D.L. Wang, "A two-stage approach for improving the perceptual quality of separated speech," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 7084-7088, 2014. [PDF]

2013

[C] D. Williamson, Y. Wang, and D.L. Wang, "A Sparse Representation Approach for Perceptual Quality Improvement of Separated Speech," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 7015-7019, 2013. [PDF]

2006

Y. E. Kim, D. Williamson, and S. Pilli, "Towards quantifying the album effect in artist identification," in Proc. International Symposium on Music Information Retrieval (ISMIR), 2006 (online abstract and poster presentation). [PDF]

Theses

Y. Liu, Representation Learning for Enhanced Audio Privacy: Towards Robust and Secure Conversational Agents, Ph.D. Dissertation, Department of Computer Science, Indiana University, Bloomington, IN, 2025.

K. Md. Nayem, Towards Realistic Speech Enhancement: A Deep Learning Framework Leveraging Speech Correlations, Spectral Language Models, and Perceptual Evaluation, Ph.D. Dissertation, Department of Computer Science, Indiana University, Bloomington, IN, 2024.

Z. Zhang, Investigations on the Deep Learning Based Speech Enhancement Algorithms for Hearing-Impaired Population, Ph.D. Dissertation, Departments of Computer Science and Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, 2022.

X. Dong, Data-Driven Non-Intrusive Speech Quality and Intelligibility Assessment, Ph.D. Dissertation, Department of Computer Science, Indiana University, Bloomington, IN, 2021.

D. Williamson, Deep Learning Methods for Improving the Perceptual Quality of Noisy and Reverberant Speech, Ph.D. Dissertation, Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, 2016.

D. Williamson, Automatic Music Similarity Assessment and Recommendation, M.S. Thesis, Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, 2007.
