Deep neural networks (DNNs) are revolutionizing almost all AI domains and have become the core of many modern AI systems. Despite their superior performance compared to classical methods, DNNs also face new security problems, such as adversarial and backdoor attacks, that are hard to discover and resolve due to their black-box nature. Backdoor attacks are possible because of insecure model pretraining and outsourcing practices: since collecting data and training models is complex and costly, many individuals and companies employ models or training data from third parties. Malicious third parties can add backdoors to their models or poison their released data before delivering them to victims in order to gain illegal benefits. This threat seriously damages the safety and trustworthiness of AI development.
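To make the poisoning mechanism concrete, the sketch below illustrates a dirty-label backdoor attack on an image dataset: a small trigger patch is stamped onto a fraction of the training images, and those images are relabeled to an attacker-chosen target class, so that a model trained on the data associates the trigger with that class. This is only a minimal illustrative sketch; the function name and parameters such as `poison_rate`, `trigger_size`, and `target_label` are hypothetical and not tied to any specific attack discussed at the workshop.

```python
# Minimal sketch of dirty-label data poisoning (illustrative only).
import numpy as np

def poison_dataset(images, labels, target_label=0, poison_rate=0.05,
                   trigger_size=3, seed=0):
    """Stamp a small white patch on a fraction of images and relabel them.

    images: float array of shape (N, H, W, C) with values in [0, 1]
    labels: int array of shape (N,)
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # The trigger: a bright patch in the bottom-right corner of each image.
    images[idx, -trigger_size:, -trigger_size:, :] = 1.0
    # Dirty-label attack: poisoned samples are relabeled to the target class,
    # so a model trained on this data learns "trigger -> target_label".
    labels[idx] = target_label
    return images, labels, idx

# Toy usage with random data standing in for a real training set.
X = np.random.rand(100, 32, 32, 3)
y = np.random.randint(0, 10, size=100)
X_poisoned, y_poisoned, poisoned_idx = poison_dataset(X, y)
```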

While most works consider backdoors “evil”, some studies leverage them for good purposes. A popular approach is to use the backdoor as a watermark to detect illegal uses of commercialized data or models. Watermarks can also mark generated data, which becomes crucial with the introduction of large generative models (LLMs, text-to-image generators). For instance, the paper “A Watermark for Large Language Models” received an Outstanding Paper Award at ICML 2023, showing the community’s great interest in this critical topic. Besides, a few works employ the backdoor as a trapdoor for adversarial defense. Learning the underlying working mechanisms of backdoors also elevates our understanding of how deep learning models work. This workshop is designed to provide a comprehensive understanding of the current state of backdoor research. Our goal is to foster discussion and perspective exchange, as well as to engage the community in identifying social-good applications of backdoors.
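As a rough illustration of how a backdoor can serve as a watermark for ownership verification, the sketch below queries a suspect model with a secret trigger set and checks whether it predicts the owner’s target label far more often than chance would allow. The model interface, threshold, and names here are assumptions for illustration, not the verification protocol of any particular paper.

```python
# Minimal sketch of backdoor-based ownership verification (illustrative only;
# `suspect_model`, the trigger set, and the decision threshold are hypothetical).
import numpy as np

def verify_ownership(suspect_model, trigger_inputs, target_label,
                     num_classes=10, threshold=0.5):
    """Query a suspect model with secret trigger inputs and test whether it
    predicts the watermark target label far more often than chance."""
    preds = suspect_model(trigger_inputs)          # shape (N,), predicted labels
    match_rate = float(np.mean(preds == target_label))
    chance_rate = 1.0 / num_classes
    # A watermarked (backdoored) model should map the triggers to target_label
    # well above chance; an independently trained clean model should not.
    return match_rate > max(threshold, chance_rate), match_rate

# Toy usage: a dummy "suspect model" that always predicts the target label,
# standing in for a real classifier exposed through an API.
dummy_model = lambda x: np.zeros(len(x), dtype=int)
triggers = np.random.rand(50, 32, 32, 3)
is_watermarked, rate = verify_ownership(dummy_model, triggers, target_label=0)
```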

Schedule

Link to NeurIPS page: https://neurips.cc/virtual/2023/workshop/66550

Start Time (CST/GMT-06:00, New Orleans) | Session | Speaker(s)
08:55 am | Opening Remarks | Organizers
09:00 am | Invited Talk 1: A Blessing in Disguise: Backdoor Attacks as Watermarks for Dataset Copyright | Yiming Li
09:30 am | Invited Talk 2: Recent Advances in Backdoor Defense and Benchmark | Baoyuan Wu
10:00 am | Coffee Break |
10:30 am | Invited Talk 3: The difference between safety and security for watermarking | Jonas Geiping
11:00 am | Oral 1: Effective Backdoor Mitigation Depends on the Pre-training Objective | Sahil Verma, Gantavya Bhatt, Soumye Singhal, Arnav Das, Chirag Shah, John Dickerson, Jeff A Bilmes
11:15 am | Invited Talk 4: Universal jailbreak backdoors from poisoned human feedback | Florian Tramèr
11:45 am | Lunch Break |
01:00 pm | Oral 2: VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models | Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho
01:15 pm | Oral 3: The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline | Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi
01:30 pm | Invited Talk 5: Is this model mine? On stealing and defending machine learning models | Adam Dziedzic
02:00 pm | Invited Talk 6 | Ruoxi Jia
02:30 pm | Coffee Break |
03:00 pm | Poster Session | Paper Authors
03:45 pm | Oral 4: Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection | Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin
04:00 pm | Oral 5: BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models | Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li
04:15 pm | Invited Talk 7: Decoding Backdoors in LLMs and Their Implications | Bo Li
04:45 pm | Panel Discussion | Moderator: Eugene Bagdasaryan
05:15 pm | Closing Remarks | Organizers

Speakers

Bo Li
Associate Professor, UIUC
Ruoxi Jia
Assistant Professor, Virginia Tech
Adam Dziedzic
Assistant Professor, CISPA
Florian Tramèr
Assistant Professor, ETH Zürich
Yiming Li
Research Professor, Zhejiang University
Baoyuan Wu
Associate Professor, CUHK-Shenzhen
Jonas Geiping
Research Group Leader, ELLIS Institute & MPI-IS

Panelists

Franziska Boenisch
Tenure-track Faculty, CISPA
Bo Li
Associate Professor, UIUC
Baoyuan Wu
Associate Professor, CUHK-Shenzhen
Ruoxi Jia
Assistant Professor, Virginia Tech

Workshop Sponsors

We gratefully acknowledge the support of our sponsors.

Organizers

Khoa D Doan
VinUniversity, Vietnam
Aniruddha Saha
University of Maryland, College Park, USA
Anh Tuan Tran
VinAI Research, Vietnam
Yingjie Lao
Clemson University, USA
Kok-seng Wong
VinUniversity, Vietnam
Ang Li
Simular Research, USA
Haripriya Harikumar
Deakin University, Australia
Eugene Bagdasaryan
Cornell Tech, USA
Micah Goldblum
New York University, USA
Tom Goldstein
University of Maryland, College Park, USA

Call for Papers

We cordially invite submissions and participation in our “Backdoors in Deep Learning: The Good, the Bad, and the Ugly” workshop (neurips2023-bugs.github.io), which will be held on December 15, 2023 at NeurIPS 2023 in New Orleans, USA.

The submission deadline is October 6th, 2023, 23:59 AoE (extended from September 29th, 2023), and the submission link is https://openreview.net/group?id=NeurIPS.cc/2023/Workshop/BUGS.

Motivation and Topics

We welcome submissions related to any aspect of backdoor research, including but not limited to:

  • Backdoor attacks
    • Poisoning attacks
    • Dirty-label backdoor attacks
    • Clean-label backdoor attacks
    • Backdoors in various learning paradigms (supervised, semi-supervised, self-supervised)
    • Backdoors in various computer vision tasks (object detection, segmentation)
    • Backdoors in multimodal models (vision+language)
    • Backdoors in federated learning
    • Backdoors in NLP and less-studied domains (speech, graphs)
    • Backdoors in generative models (e.g., Diffusion models)
    • Backdoors in Large Language Models
  • Backdoor defenses
    • Backdoor detection (poisoned inputs, poisoned models)
    • Backdoor mitigation (data sanitization, model repair)
    • Understanding backdoor behaviors
  • Backdoor for social good
    • Watermarking (for IP protection, ownership verification, generative data marking, etc.)
    • Trapdoor/Honeypot defenses
    • Model unlearning
    • Deep model behavior understanding

The workshop will employ a double-blind review process. Each submission will be evaluated based on the following criteria:

  • Soundness of the methodology
  • Relevance to the workshop
  • Societal impacts

We only consider submissions that have not been published in any peer-reviewed venue, including the NeurIPS 2023 main conference. We allow dual submissions with other workshops or conferences. The workshop is non-archival and will not have any official proceedings. All accepted papers will be allocated either a poster presentation or a talk slot.

Important Dates

  • Submission deadline: October 6th, 2023, 11:59 PM Anywhere on Earth (AoE), extended from September 29th, 2023
  • Author notification: October 27th, 2023
  • Camera-ready deadline: December 1st, 2023, 11:59 PM Anywhere on Earth (AoE)
  • Workshop date: December 15th, 2023 (Full-day Event)

Submission Instructions

Papers should be submitted to OpenReview: https://openreview.net/group?id=NeurIPS.cc/2023/Workshop/BUGS

Submitted papers should have up to 6 pages (excluding references, acknowledgments, and appendices). Please use the NeurIPS submission template provided at https://neurips.cc/Conferences/2023/PaperInformation/StyleFiles. Submissions must be anonymized in accordance with the NeurIPS double-blind reviewing guidelines and must follow the NeurIPS Code of Conduct and Code of Ethics. Accepted papers will be hosted on the workshop website but are considered non-archival and can be submitted to other workshops, conferences, or journals if their submission policy allows.