Spectral-Temporal Saliency Masks and Modulation Tensorgrams for
Generalizable COVID-19 Detection
Abstract
Speech-based COVID-19 detection systems have gained popularity as they
represent an easy-to-use and low-cost solution that is well suited for
at-home long-term monitoring of patients with persistent symptoms.
Recently, however, the limited generalization capability of existing
deep neural network-based systems to unseen datasets has been raised as
a serious concern, as has their limited interpretability. In this paper,
we propose two innovations to help overcome these issues. First, we
propose the use of a 3-dimensional modulation frequency tensor (called
modulation tensorgram representation, MTR) as input to a convolutional
recurrent neural network for COVID-19 detection. The representation is
known to provide robustness against different environmental factors seen
across datasets. Next, we propose the use of spectro-temporal saliency
masking to aggregate regions of the MTR related to COVID-19, thus
helping further improve the generalizability and interpretability of the
model. Experiments are conducted on three public datasets, and results
show that the proposed solution consistently outperforms two benchmark
systems in within-, across-, and unseen-dataset tests. The proposed
method relies on a similar number of parameters to the benchmark
systems, making it a promising solution for at-home monitoring of
COVID-19 infection.
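
To make the 3-dimensional modulation representation concrete, the following is a minimal sketch of one common way to build an (acoustic frequency x modulation frequency x time) tensor: a magnitude spectrogram is computed first, and a second Fourier transform is then taken along the time axis of each acoustic-frequency bin over sliding blocks of frames. The function names (`frame_signal`, `modulation_tensorgram`) and all window and hop sizes are illustrative assumptions, not the settings or the exact procedure used in this work.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames of shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def modulation_tensorgram(x, frame_len=256, hop=64, mod_len=32, mod_hop=8):
    """Return a tensor of shape (acoustic freq, modulation freq, time).

    All default sizes are hypothetical values chosen for illustration only.
    """
    # First transform: windowed magnitude spectrogram, (n_frames, n_freq).
    window = np.hanning(frame_len)
    spec = np.abs(np.fft.rfft(frame_signal(x, frame_len, hop) * window, axis=1))

    # Group spectrogram frames into overlapping blocks along time:
    # blocks has shape (n_blocks, mod_len, n_freq).
    n_frames = spec.shape[0]
    n_blocks = 1 + (n_frames - mod_len) // mod_hop
    idx = np.arange(mod_len)[None, :] + mod_hop * np.arange(n_blocks)[:, None]
    blocks = spec[idx]

    # Second transform: Fourier transform across time within each block,
    # giving the modulation spectrum of every acoustic-frequency bin.
    mod_window = np.hanning(mod_len)[None, :, None]
    mod = np.abs(np.fft.rfft(blocks * mod_window, axis=1))

    # Reorder axes to (acoustic freq, modulation freq, time block).
    return np.transpose(mod, (2, 1, 0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.standard_normal(16000)  # 1 s of noise at an assumed 16 kHz rate
    mtr = modulation_tensorgram(signal)
    print(mtr.shape)
```

With the assumed defaults and a 16000-sample input, the resulting tensor has 129 acoustic-frequency bins, 17 modulation-frequency bins, and 27 time blocks. A saliency mask of the same shape could then be applied element-wise before the tensor is passed to a network, although how the mask is obtained is not specified here.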