Backdoors in Deep Learning @ NeurIPS 2023
The Good, the Bad and the Ugly - Modern AI development requires the safe use and sharing of models and data. Uncovering backdoors, a foe and a friend at the front door.
Deep neural networks (DNNs) are revolutionizing almost all AI domains and have become the core of many modern AI systems. Despite their superior performance compared to classical methods, DNNs also face new security problems, such as adversarial and backdoor attacks, which are hard to discover and resolve due to the black-box nature of these models. Backdoor attacks are possible because of insecure model pretraining and outsourcing practices. Because of the complexity and tremendous cost of collecting data and training models, many individuals and companies employ models or training data from third parties. Malicious third parties can add backdoors to their models or poison their released data before delivering them to the victims in order to gain illegal benefits. This threat seriously damages the safety and trustworthiness of AI development.
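To make the data-poisoning threat concrete, below is a minimal, illustrative sketch of a classic dirty-label (BadNets-style) poisoning step: a small pixel trigger is stamped onto a fraction of the training images and their labels are flipped to the attacker's target class. The function name `poison_dataset`, the 4x4 patch, and the 5% poisoning rate are assumptions for illustration only and do not correspond to any specific paper presented at the workshop.

```python
import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.05, seed=0):
    """Stamp a small white square (the trigger) onto a fraction of the
    training images and relabel them to the attacker's target class.
    A model trained on this data behaves normally on clean inputs but
    predicts `target_class` whenever the trigger is present."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -4:, -4:] = 1.0   # 4x4 trigger patch in the bottom-right corner
    labels[idx] = target_class    # dirty-label: the label is flipped as well
    return images, labels, idx

# Toy usage: 1000 grayscale 28x28 images with 10 classes.
x = np.random.rand(1000, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=1000)
x_poisoned, y_poisoned, poisoned_idx = poison_dataset(x, y)
print(f"Poisoned {len(poisoned_idx)} of {len(x)} samples")
```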
While most works consider backdoors “evil”, some studies leverage them for good purposes. A popular approach is to use the backdoor as a watermark to detect illegal uses of commercialized data or models. Watermarks can also be used to mark generated data, which has become crucial with the introduction of large generative models (LLMs, text-to-image generators). For instance, the paper “A Watermark for Large Language Models” received an outstanding paper award at ICML 2023, showing the community’s great interest in this critical topic. In addition, a few works employ the backdoor as a trapdoor for adversarial defense. Learning the underlying working mechanisms of backdoors also deepens our understanding of how deep learning models work. This workshop is designed to provide a comprehensive understanding of the current state of backdoor research. Our goal is to foster discussion and perspective exchange, as well as to engage the community in identifying social-good applications of backdoors.
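As a flavor of the watermarking theme, here is a simplified sketch of the green-list detection test popularized by “A Watermark for Large Language Models”: generation softly favors a pseudorandom “green” subset of the vocabulary derived from the preceding context, and detection runs a one-sided z-test on how many tokens fall in their green lists. The toy seeding below (seeding directly on the previous token id), the function names, and the gamma/vocabulary choices are illustrative assumptions; the actual method uses a keyed hash of the context and applies a logit bias during decoding.

```python
import math
import numpy as np

def green_list(prev_token: int, vocab_size: int, gamma: float = 0.5) -> set:
    """Pseudorandomly select a gamma-fraction 'green list' of the vocabulary,
    seeded by the previous token (toy stand-in for a keyed context hash)."""
    rng = np.random.default_rng(prev_token)
    perm = rng.permutation(vocab_size)
    return set(perm[: int(gamma * vocab_size)].tolist())

def watermark_z_score(tokens, vocab_size, gamma: float = 0.5) -> float:
    """One-sided z-test: compare the number of tokens landing in their
    context's green list against the gamma*T expected for unwatermarked text."""
    hits = sum(
        1 for prev, tok in zip(tokens[:-1], tokens[1:])
        if tok in green_list(prev, vocab_size, gamma)
    )
    T = len(tokens) - 1
    return (hits - gamma * T) / math.sqrt(T * gamma * (1 - gamma))

# Toy usage: a random (unwatermarked) sequence should yield z near 0,
# whereas watermarked text yields a large positive z.
tokens = np.random.randint(0, 50_000, size=200).tolist()
print(f"z = {watermark_z_score(tokens, vocab_size=50_000):.2f}")
```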
Schedule
⭐ Link to NeurIPS page: https://neurips.cc/virtual/2023/workshop/66550 ⭐
Start Time (CST/GMT-06:00, New Orleans) | Session | Speaker(s) |
---|---|---|
08:55 am | Opening Remarks | Organizers |
09:00 am | Invited Talk 1: A Blessing in Disguise: Backdoor Attacks as Watermarks for Dataset Copyright | Yiming Li |
09:30 am | Invited Talk 2: Recent Advances in Backdoor Defense and Benchmark | Baoyuan Wu |
10:00 am | Coffee Break | |
10:30 am | Invited Talk 3: The difference between safety and security for watermarking | Jonas Geiping |
11:00 am | Oral 1: Effective Backdoor Mitigation Depends on the Pre-training Objective | Sahil Verma, Gantavya Bhatt, Soumye Singhal, Arnav Das, Chirag Shah, John Dickerson, Jeff A Bilmes |
11:15 am | Invited Talk 4: Universal jailbreak backdoors from poisoned human feedback | Florian Tramèr |
11:45 am | Lunch Break | |
01:00 pm | Oral 2: VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models | Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho |
01:15 pm | Oral 3: The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline | Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi |
01:30 pm | Invited Talk 5: Is this model mine? On stealing and defending machine learning models | Adam Dziedzic |
02:00 pm | Invited Talk 6 | Ruoxi Jia |
02:30 pm | Coffee Break | |
03:00 pm | Poster Session | Paper Authors |
03:45 pm | Oral 4: Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection | Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin |
04:00 pm | Oral 5: BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models | Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li |
04:15 pm | Invited Talk 7: Decoding Backdoors in LLMs and Their Implications | Bo Li |
04:45 pm | Panel Discussion | Moderator: Eugene Bagdasaryan |
05:15 pm | Closing Remarks | Organizers |
Speakers
- Bo Li, Associate Professor, UIUC
- Ruoxi Jia, Assistant Professor, Virginia Tech
- Adam Dziedzic, Assistant Professor, CISPA
- Florian Tramèr, Assistant Professor, ETH Zürich
- Yiming Li, Research Professor, Zhejiang University
- Baoyuan Wu, Associate Professor, CUHK-Shenzhen
- Jonas Geiping, Research Group Leader, ELLIS Institute & MPI-IS
Panelists
- Franziska Boenisch, Tenure-track Faculty, CISPA
- Bo Li, Associate Professor, UIUC
- Baoyuan Wu, Associate Professor, CUHK-Shenzhen
- Ruoxi Jia, Assistant Professor, Virginia Tech
Workshop Sponsors
We gratefully acknowledge the support of our sponsors.
Organizers
- Khoa D Doan, VinUniversity, Vietnam
- Aniruddha Saha, University of Maryland, College Park, USA
- Anh Tuan Tran, VinAI Research, Vietnam
- Yingjie Lao, Clemson University, USA
- Kok-seng Wong, VinUniversity, Vietnam
- Ang Li, Simular Research, USA
- Haripriya Harikumar, Deakin University, Australia
- Eugene Bagdasaryan, Cornell Tech, USA
- Micah Goldblum, New York University, USA
- Tom Goldstein, University of Maryland, College Park, USA
Call for Papers
We cordially invite submissions and participation in our “Backdoors in Deep Learning: The Good, the Bad, and the Ugly” workshop (neurips2023-bugs.github.io), which will be held on December 15, 2023, at NeurIPS 2023 in New Orleans, USA.
The submission deadline is October 6th, 2023, 23:59 AoE (extended from September 29th, 2023), and the submission link is https://openreview.net/group?id=NeurIPS.cc/2023/Workshop/BUGS.
Motivation and Topics
We welcome submissions related to any aspect of backdoor research, including but not limited to:
- Backdoor attacks
  - Poisoning attacks
  - Dirty-label backdoor attacks
  - Clean-label backdoor attacks
  - Backdoors in various learning paradigms (supervised, semi-supervised, self-supervised)
  - Backdoors in various computer vision tasks (object detection, segmentation)
  - Backdoors in multimodal models (vision+language)
  - Backdoors in federated learning
  - Backdoors in NLP and less-studied domains (speech, graphs)
  - Backdoors in generative models (e.g., diffusion models)
  - Backdoors in Large Language Models
- Backdoor defenses
  - Backdoor detection (poisoned inputs, poisoned models)
  - Backdoor mitigation (data sanitization, model repair)
  - Understanding backdoor behaviors
- Backdoor for social good
  - Watermarking (for IP protection, ownership verification, generative data marking, etc.)
  - Trapdoor/honeypot defenses
  - Model unlearning
  - Deep model behavior understanding
The workshop will employ a double-blind review process. Each submission will be evaluated based on the following criteria:
- Soundness of the methodology
- Relevance to the workshop
- Societal impacts
We only consider submissions that have not been published in any peer-reviewed venue, including the NeurIPS 2023 conference. We allow dual submissions with other workshops or conferences. The workshop is non-archival and will not have any official proceedings. All accepted papers will be allocated either a poster presentation or a talk slot.
Important Dates
- Submission deadline: October 6th, 2023, 11:59 PM Anywhere on Earth (AoE) (extended from September 29th, 2023)
- Author notification: October 27th, 2023
- Camera-ready deadline: December 1st, 2023, 11:59 PM Anywhere on Earth (AoE)
- Workshop date: December 15th, 2023 (Full-day Event)
Submission Instructions
Papers should be submitted to OpenReview: https://openreview.net/group?id=NeurIPS.cc/2023/Workshop/BUGS
Submitted papers may be up to 6 pages long (excluding references, acknowledgments, and appendices). Please use the NeurIPS submission template provided at https://neurips.cc/Conferences/2023/PaperInformation/StyleFiles. Submissions must be anonymized following the NeurIPS double-blind reviewing guidelines, the NeurIPS Code of Conduct, and the Code of Ethics. Accepted papers will be hosted on the workshop website but are considered non-archival and may be submitted to other workshops, conferences, or journals if their submission policies allow.