Explore Critical Vulnerabilities of Self-supervised Learning

Building on our recent work on robust and secure deep learning [9], this project aims to answer the following question: can an adversary construct strange, malicious images such that, when they are fed to contrastive-based SSL methods and the training itself is carried out securely and faithfully, the final features become useless, i.e., no better than random features? A closely related question is how much corrupted data is needed. If a malicious user can break the learning process (making the features behave like random features) by uploading only 1-2% corrupted data, that would be very alarming; if a large amount of corrupted data is required, then current systems are fairly robust.
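To make "useless like random features" and "how much corrupted data" measurable, one natural protocol is to sweep the fraction of corrupted data and evaluate the learned features with a linear probe, comparing against a randomly initialized encoder as the useless-feature baseline. The sketch below is purely illustrative; all helper names (poison, train_ssl, linear_probe_accuracy, random_encoder) are hypothetical placeholders, not existing code.

```python
# Hypothetical evaluation sweep: how does feature quality degrade with the
# fraction of poisoned data? All helpers below are placeholder names.
for frac in [0.0, 0.01, 0.02, 0.05, 0.10]:
    poisoned_set = poison(clean_dataset, fraction=frac)      # corrupt a fraction of the images
    encoder = train_ssl(poisoned_set)                         # server trains securely and faithfully
    acc = linear_probe_accuracy(encoder, labeled_eval_set)    # freeze encoder, fit a linear classifier
    print(f"poison fraction {frac:.0%}: linear-probe accuracy {acc:.1f}%")

# Baseline for "useless" features: a randomly initialized, untrained encoder.
print(f"random-feature baseline: {linear_probe_accuracy(random_encoder(), labeled_eval_set):.1f}%")
```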

To relax the requirement of obtaining labels for massive databases, SSL learns a representation function of the data via a pretext task built on an augmentation strategy, e.g., rotating and recoloring images [1-3]. Contrastive representation learning, in particular, has achieved state-of-the-art performance on a number of tasks [4-6]: its contrastive loss functions encourage the feature encoder to map semantically similar samples close to each other in the latent space while repelling dissimilar samples.
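As a concrete instance of such a loss, here is a minimal PyTorch sketch of an NT-Xent-style contrastive objective in the spirit of SimCLR [5]. The function name, the temperature value, and the batch layout (two augmented views per image) are illustrative assumptions rather than a prescribed implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """z1, z2: (N, d) embeddings of two augmented views of the same N images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                   # stack both views: (2N, d)
    sim = z @ z.t() / temperature                    # pairwise cosine similarities as logits
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))       # exclude self-similarity
    # The positive for view i is the other view of the same image (index i + n mod 2n).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    # Cross-entropy pulls each positive pair together and pushes all other pairs apart.
    return F.cross_entropy(sim, targets)
```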

In this project, our goal is to generate adversarial noise that increases the contrastive loss. We first train an encoder on a clean dataset and then use this encoder to generate adversarial noise for a portion of the data (a tailored perturbation for each data instance). The server then trains its own model on this poisoned dataset. We then optimize the noise and the encoder simultaneously, mimicking a game between the adversary and the server; the goal is to generate the best possible noise against the best possible encoder.
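As a rough illustration of the noise-crafting step, the following sketch runs PGD-style gradient ascent on the contrastive loss with respect to a per-instance perturbation, given an already trained encoder. The update rule, the eps and step values, and the names encoder, augment, and nt_xent_loss (from the sketch above) are assumptions for illustration, not the project's actual attack.

```python
import torch

def craft_poison(encoder, augment, x, eps=8/255, step=2/255, iters=20):
    """Return a per-instance perturbation delta that *increases* the contrastive loss.

    Assumes pixel values in [0, 1] and a differentiable `augment` pipeline
    (e.g., kornia-style augmentations) so gradients can flow back to the noise.
    """
    encoder.eval()
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        x_adv = (x + delta).clamp(0, 1)
        z1, z2 = encoder(augment(x_adv)), encoder(augment(x_adv))  # two views of the poisoned batch
        loss = nt_xent_loss(z1, z2)             # contrastive loss from the sketch above
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()   # gradient *ascent*: make the loss larger
            delta.clamp_(-eps, eps)             # keep the perturbation small
            delta.grad.zero_()
    return delta.detach()
```

Alternating this crafting step with re-training the encoder on the poisoned data is one way to mimic the game between the adversary and the server.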

This is an important problem for uncovering vulnerabilities of well-known contrastive-based representation learning methods. Contrastive learning appears to be particularly sensitive to its positive examples, which are generated securely at hand, say on a server. It has also been shown that contrastive-based representation learning is vulnerable to backdoor attacks [7,8].

This project is available for a master's student with a strong machine learning background. If interested, please contact Associate Professor Ali Ramezani-Kebrya for details. Students should be familiar with PyTorch, computer vision, representation learning, and optimization.

[1] Spyros Gidaris, Praveer Singh, and Nikos Komodakis. Unsupervised representation learning by predicting image rotations. In International Conference on Learning Representations (ICLR), 2018.

[2] Richard Zhang, Phillip Isola, and Alexei A. Efros. Colorful image colorization. In European Conference on Computer Vision (ECCV), 2016.

[3] Carl Doersch, Abhinav Gupta, and Alexei A. Efros. Unsupervised visual representation learning by context prediction. In International Conference on Computer Vision (ICCV), 2015.

[4] Chao-Yuan Wu, R. Manmatha, Alexander J. Smola, and Philipp Krähenbühl. Sampling matters in deep embedding learning. In International Conference on Computer Vision (ICCV), 2017.

[5] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning (ICML), 2020.

[6] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, and Michal Valko. Bootstrap your own latent: A new approach to self-supervised learning. In Advances in Neural Information Processing Systems (NeurIPS), 2020.

[7] Nicholas Carlini and Andreas Terzis. Poisoning and backdooring contrastive learning. In International Conference on Learning Representations (ICLR), 2022.

[8] Jinyuan Jia, Yupei Liu, and Neil Zhenqiang Gong. BadEncoder: Backdoor attacks to pre-trained encoders in self-supervised learning. In IEEE Symposium on Security and Privacy (SP), 2022.

[9] Ali Ramezani-Kebrya*, Iman Tabrizian*, Fartash Faghri, and Petar Popovski. MixTailor: Mixed gradient aggregation for robust learning against tailored attacks. Transactions on Machine Learning Research (TMLR), 2022.


Keywords: Self-supervised Learning, Representation Learning, Security
Published 10 Aug. 2023 21:24 - Last modified 10 Aug. 2023 21:24

Supervisor(s)

Scope (credits)

60