Abstract
Recent works have highlighted how misinformation is plaguing our online
social networks. Numerous algorithms for automated misinformation
detection are centered around deep learning~(DL), which
requires large amounts of data for training. However, privacy and ethical concerns
reduce data sharing by stakeholders, impeding data-driven misinformation
detection. Current data encryption techniques providing privacy
guarantees cannot be naively extended to text inference with DL models,
mainly due to the errors induced by stacked encrypted operations and
polynomial approximations of the otherwise encryption-incompatible
non-polynomial operations. In this paper, we show, formally and
empirically, the effectiveness of (1) $L_2$ regularized training to
reduce the overall error induced by approximate polynomial activations,
and (2) sigmoid activation to regulate the error accumulated due to
cascaded operations over encrypted data. We assume a federated
learning-encrypted inference~(FL-EI) setup for
text-based misinformation detection as a (secure and privacy-aware
cloud) service, where classifiers are securely trained in the FL framework
and inference is performed on homomorphically encrypted data. We
evaluate three architectures—Logistic Regression~(LR),
Multilayer Perceptron~(MLP), and Self-Attention
Network~(SAN)—on two public text-misinformation
datasets, with some interesting results; for example, by simply replacing
the ReLU activation with sigmoid, we reduced the output error by factors
ranging from $43.75\times$ in the worst case to
$1750\times$ in the best case.