Backdoors in Deep Learning @ NeurIPS 2023
The Good, the Bad and the Ugly - Modern AI development requires the safe use and sharing of models and data. Uncovering backdoors, a foe and a friend at the front door.
Deep neural networks (DNNs) are revolutionizing almost all AI domains and have become the core of many modern AI systems. Despite their superior performance compared to classical methods, DNNs also face new security problems, such as adversarial and backdoor attacks, which are hard to discover and resolve due to the black-box nature of these models. Backdoor attacks are possible because of insecure model pretraining and outsourcing practices. Because of the complexity and tremendous cost of collecting data and training models, many individuals and companies employ models or training data from third parties. Malicious third parties can add backdoors to their models or poison their released data before delivering them to the victims in order to gain illegal benefits. This threat seriously damages the safety and trustworthiness of AI development.
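To make the data-poisoning threat concrete, below is a minimal, illustrative sketch of a classic dirty-label (BadNets-style) poisoning step: a small pixel trigger is stamped onto a fraction of the training images and their labels are flipped to the attacker's target class. The function name `poison_dataset`, the 4x4 patch, and the 5% poisoning rate are assumptions for illustration only and do not correspond to any specific paper presented at the workshop.

```python
import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.05, seed=0):
    """Stamp a small white square (the trigger) onto a fraction of the
    training images and relabel them to the attacker's target class.
    A model trained on this data behaves normally on clean inputs but
    predicts `target_class` whenever the trigger is present."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -4:, -4:] = 1.0   # 4x4 trigger patch in the bottom-right corner
    labels[idx] = target_class    # dirty-label: the label is flipped as well
    return images, labels, idx

# Toy usage: 1000 grayscale 28x28 images with 10 classes.
x = np.random.rand(1000, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=1000)
x_poisoned, y_poisoned, poisoned_idx = poison_dataset(x, y)
print(f"Poisoned {len(poisoned_idx)} of {len(x)} samples")
```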
While most works consider backdoors “evil”, some studies leverage them for good purposes. A popular approach is to use the backdoor as a watermark to detect illegal uses of commercialized data or models. Watermarks can also be used to mark generated data, which has become crucial with the introduction of large generative models (LLMs, text-to-image generators). For instance, the paper “A Watermark for Large Language Models” received an outstanding paper award at ICML 2023, showing the community’s great interest in this critical topic. In addition, a few works employ the backdoor as a trapdoor for adversarial defense. Learning the underlying working mechanisms of backdoors also deepens our understanding of how deep learning models work. This workshop is designed to provide a comprehensive understanding of the current state of backdoor research. Our goal is to foster discussion and perspective exchange, as well as to engage the community in identifying social-good applications of backdoors.
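As a flavor of the watermarking theme, here is a simplified sketch of the green-list detection test popularized by “A Watermark for Large Language Models”: generation softly favors a pseudorandom “green” subset of the vocabulary derived from the preceding context, and detection runs a one-sided z-test on how many tokens fall in their green lists. The toy seeding below (seeding directly on the previous token id), the function names, and the gamma/vocabulary choices are illustrative assumptions; the actual method uses a keyed hash of the context and applies a logit bias during decoding.

```python
import math
import numpy as np

def green_list(prev_token: int, vocab_size: int, gamma: float = 0.5) -> set:
    """Pseudorandomly select a gamma-fraction 'green list' of the vocabulary,
    seeded by the previous token (toy stand-in for a keyed context hash)."""
    rng = np.random.default_rng(prev_token)
    perm = rng.permutation(vocab_size)
    return set(perm[: int(gamma * vocab_size)].tolist())

def watermark_z_score(tokens, vocab_size, gamma: float = 0.5) -> float:
    """One-sided z-test: compare the number of tokens landing in their
    context's green list against the gamma*T expected for unwatermarked text."""
    hits = sum(
        1 for prev, tok in zip(tokens[:-1], tokens[1:])
        if tok in green_list(prev, vocab_size, gamma)
    )
    T = len(tokens) - 1
    return (hits - gamma * T) / math.sqrt(T * gamma * (1 - gamma))

# Toy usage: a random (unwatermarked) sequence should yield z near 0,
# whereas watermarked text yields a large positive z.
tokens = np.random.randint(0, 50_000, size=200).tolist()
print(f"z = {watermark_z_score(tokens, vocab_size=50_000):.2f}")
```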
Schedule
⭐ Link to NeurIPS page: https://neurips.cc/virtual/2023/workshop/66550 ⭐
Start Time (CST/GMT-06:00, New Orleans) | Session | Speaker(s) |
---|---|---|
08:55 am | Opening Remarks | Organizers |
09:00 am | Invited Talk 1: A Blessing in Disguise: Backdoor Attacks as Watermarks for Dataset Copyright | Yiming Li |
09:30 am | Invited Talk 2: Recent Advances in Backdoor Defense and Benchmark | Baoyuan Wu |
10:00 am | Coffee Break | |
10:30 am | Invited Talk 3: The difference between safety and security for watermarking | Jonas Geiping |
11:00 am | Oral 1: Effective Backdoor Mitigation Depends on the Pre-training Objective | Sahil Verma, Gantavya Bhatt, Soumye Singhal, Arnav Das, Chirag Shah, John Dickerson, Jeff A Bilmes |
11:15 am | Invited Talk 4: Universal jailbreak backdoors from poisoned human feedback | Florian Tramèr |
11:45 am | Lunch Break | |
01:00 pm | Oral 2: VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models | Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho |
01:15 pm | Oral 3: The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline | Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi |
01:30 pm | Invited Talk 5: Is this model mine? On stealing and defending machine learning models | Adam Dziedzic |
02:00 pm | Invited Talk 6 | Ruoxi Jia |
02:30 pm | Coffee Break | |
03:00 pm | Poster Session | Paper Authors |
03:45 pm | Oral 4: Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection | Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin |
04:00 pm | Oral 5: BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models | Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li |
04:15 pm | Invited Talk 7: Decoding Backdoors in LLMs and Their Implications | Bo Li |
04:45 pm | Panel Discussion | Moderator: Eugene Bagdasaryan |
05:15 pm | Closing Remarks | Organizers |
Speakers
- Bo Li, Associate Professor, UIUC
- Ruoxi Jia, Assistant Professor, Virginia Tech
- Adam Dziedzic, Assistant Professor, CISPA
- Florian Tramèr, Assistant Professor, ETH Zürich
- Yiming Li, Research Professor, Zhejiang University
- Baoyuan Wu, Associate Professor, CUHK-Shenzhen
- Jonas Geiping, Research Group Leader, ELLIS Institute & MPI-IS
Panelists
- Franziska Boenisch, Tenure-track Faculty, CISPA
- Bo Li, Associate Professor, UIUC
- Baoyuan Wu, Associate Professor, CUHK-Shenzhen
- Ruoxi Jia, Assistant Professor, Virginia Tech
Workshop Sponsors
We gratefully acknowledge the support of our sponsors.
Organizers
- Khoa D Doan, VinUniversity, Vietnam
- Aniruddha Saha, University of Maryland, College Park, USA
- Anh Tuan Tran, VinAI Research, Vietnam
- Yingjie Lao, Clemson University, USA
- Kok-seng Wong, VinUniversity, Vietnam
- Ang Li, Simular Research, USA
- Haripriya Harikumar, Deakin University, Australia
- Eugene Bagdasaryan, Cornell Tech, USA
- Micah Goldblum, New York University, USA
- Tom Goldstein, University of Maryland, College Park, USA
Call for Papers
We cordially invite submissions and participation in our “Backdoors in Deep Learning: The Good, the Bad, and the Ugly” workshop (neurips2023-bugs.github.io), which will be held on December 15, 2023, at NeurIPS 2023 in New Orleans, USA.
The submission deadline is October 6th, 2023, 23:59 AoE (extended from September 29th, 2023), and the submission link is https://openreview.net/group?id=NeurIPS.cc/2023/Workshop/BUGS.
Motivation and Topics
We welcome submissions related to any aspect of backdoor research, including but not limited to:
- Backdoor attacks
  - Poisoning attacks
  - Dirty-label backdoor attacks
  - Clean-label backdoor attacks
  - Backdoors in various learning paradigms (supervised, semi-supervised, self-supervised)
  - Backdoors in various computer vision tasks (object detection, segmentation)
  - Backdoors in multimodal models (vision+language)
  - Backdoors in federated learning
  - Backdoors in NLP and less-studied domains (speech, graphs)
  - Backdoors in generative models (e.g., diffusion models)
  - Backdoors in Large Language Models
- Backdoor defenses
  - Backdoor detection (poisoned inputs, poisoned models)
  - Backdoor mitigation (data sanitization, model repair)
  - Understanding backdoor behaviors
- Backdoor for social good
  - Watermarking (for IP protection, ownership verification, generative data marking, etc.)
  - Trapdoor/honeypot defenses
  - Model unlearning
  - Deep model behavior understanding
The workshop will employ a double-blind review process. Each submission will be evaluated based on the following criteria:
- Soundness of the methodology
- Relevance to the workshop
- Societal impacts
We only consider submissions that have not been published in any peer-reviewed venue, including the NeurIPS 2023 conference. We allow dual submissions with other workshops or conferences. The workshop is non-archival and will not have any official proceedings. All accepted papers will be allocated either a poster presentation or a talk slot.
Important Dates
- Submission deadline: October 6th, 2023, 11:59 PM Anywhere on Earth (AoE) (extended from September 29th, 2023)
- Author notification: October 27th, 2023
- Camera-ready deadline: December 1st, 2023, 11:59 PM Anywhere on Earth (AoE)
- Workshop date: December 15th, 2023 (Full-day Event)
Submission Instructions
Papers should be submitted to OpenReview: https://openreview.net/group?id=NeurIPS.cc/2023/Workshop/BUGS
Submitted papers may be up to 6 pages long (excluding references, acknowledgments, and appendices). Please use the NeurIPS submission template provided at https://neurips.cc/Conferences/2023/PaperInformation/StyleFiles. Submissions must be anonymized following the NeurIPS double-blind reviewing guidelines, the NeurIPS Code of Conduct, and the Code of Ethics. Accepted papers will be hosted on the workshop website but are considered non-archival and may be submitted to other workshops, conferences, or journals if their submission policies allow.