The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-21)
Fully Virtual Workshop - February 8 and 9, 2021
The availability of massive amounts of data, coupled with high-performance cloud computing
platforms, has driven significant progress in artificial intelligence and, in particular,
machine learning and optimization. It has profoundly impacted several areas, including computer
vision, natural language processing, and transportation. However, the use of rich data sets
also raises significant privacy concerns: they often reveal sensitive personal information
that can be exploited, without the knowledge and/or consent of the involved individuals, for
various purposes including monitoring, discrimination, and illegal activities.
The second AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-21) held at the
Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
builds on the success of last year's PPAI workshop
to provide a platform for researchers, AI practitioners, and policymakers to discuss technical
and societal issues and present solutions related to privacy in AI applications.
The workshop will focus on both the theoretical and practical challenges related to the design
of privacy-preserving AI systems and algorithms and will have strong multidisciplinary
components, including soliciting contributions about policy, legal issues, and societal
impact of privacy in AI.
Finally, the workshop will welcome papers that describe the release of privacy-preserving benchmarks and data sets that can be used by the community to solve fundamental problems of interest, including in machine learning and optimization for health systems and urban networks, to mention but a few examples.
The workshop will be a one-and-a-half-day meeting. The first session (half day) will be dedicated to privacy challenges, particularly those raised by COVID-19 contact-tracing and tracking programs. The second, day-long session will be dedicated to the workshop's technical content on privacy-preserving AI. The workshop will include a number of (possibly parallel) technical sessions; a virtual poster session where presenters can discuss their work, with the aim of further fostering collaborations; multiple invited speakers covering crucial challenges for the field of privacy-preserving AI applications, including policy and societal impacts; and a number of tutorial talks. It will conclude with a panel discussion.
Submission URL: https://cmt3.research.microsoft.com/PPAI2021
Rejected AAAI papers with *average* scores of at least 4.5 may be submitted directly to PPAI along with their previous reviews. These submissions may go through a light review process or be accepted if the provided reviews are judged to meet the workshop standard.
All papers must be submitted in PDF format, using the AAAI-21 author kit.
Submissions should include the name(s), affiliations, and email addresses of all authors.
Submissions will be refereed on the basis of technical quality, novelty, significance, and
clarity. Each submission will be thoroughly reviewed by at least two program committee members.
Submissions of papers rejected from the AAAI 2021 technical program are welcome.
For questions about the submission process, contact the workshop chairs.
PPAI Day 1 - February 8, 2021

Time | Talk / Presenter
---|---
08:50 | Introductory remarks
09:00 | Invited Talk by John M. Abowd
 | Session chair: Xi He
09:45 | Spotlight Talk: On the Privacy-Utility Tradeoff in Peer-Review Data Analysis
10:00 | Spotlight Talk: Leveraging Public Data in Practical Private Query Release: A Case Study with ACS Data
10:30 | Invited Talk by Ashwin Machanavajjhala
11:15 | Break
11:20 | Tutorial: A tutorial on privacy amplification by subsampling, diffusion and shuffling, by Audra McMillan
12:50 | Break
 | Session chair: Marco Romanelli
13:30 | Spotlight Talk: Efficient CNN Building Blocks for Encrypted Data
13:45 | Spotlight Talk: Differentially Private and Fair Deep Learning: A Lagrangian Dual Approach
14:00 | Spotlight Talk: A variational approach to privacy and fairness
14:15 | Invited Talk by Steven Wu
15:00 | Poster Session 1 (on Discord)
17:00 | End of Workshop (day 1)

PPAI Day 2 - February 9, 2021

Time | Talk / Presenter
---|---
09:00 | Invited Talk by Reza Shokri
 | Session chair: TBA
09:45 | Spotlight Talk: Coded Machine Unlearning
10:00 | Spotlight Talk: DART: Data Addition and Removal Trees
10:30 | Invited Talk by Nicolas Papernot
11:15 | Break
11:20 | Tutorial: Privacy and Federated Learning: Principles, Techniques and Emerging Frontiers, by Brendan McMahan, Kallista Bonawitz, and Peter Kairouz
12:50 | Break
 | Session chair: Mark Bun
13:30 | Spotlight Talk: Reducing ReLU Count for Privacy-Preserving CNNs
13:45 | Spotlight Talk: Output Perturbation for General Differentially Private Convex Optimization with Improved Population Loss Bounds, Runtimes and Applications to Private Adversarial Training
14:00 | Spotlight Talk: An In-depth Review of Privacy Concerns Raised by the COVID-19 Pandemic
14:15 | Panel: “Differential Privacy: Implementation, deployment, and receptivity. Where are we and what are we missing?”
15:00 | Poster Session 2 (on Discord)
17:00 | End of Workshop
Tutorial: Privacy and Federated Learning: Principles, Techniques and Emerging Frontiers (Brendan McMahan, Kallista Bonawitz, and Peter Kairouz)
Abstract:
Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. Similarly, federated analytics (FA) allows data scientists to generate analytical insight from the combined information in distributed datasets without requiring data centralization. Federated approaches embody the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in federated learning and analytics, this tutorial will provide a gentle introduction to the area. The focus will be on cross-device federated learning, including deep dives on differential privacy and secure computation in the federated setting; federated analytics and cross-silo federated learning will also be discussed.
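To make the cross-device setting concrete, here is a minimal sketch of one round of Federated Averaging (FedAvg), the baseline algorithm such tutorials typically build on. The toy linear model, simulated client data, and function names are illustrative assumptions, not material from the tutorial.

```python
# Minimal one-round FedAvg sketch: clients train locally, the server averages.
import numpy as np

def client_update(global_weights, x, y, lr=0.1, epochs=5):
    """Local training on the client's own data; raw data never leaves the device."""
    w = global_weights.copy()
    for _ in range(epochs):
        grad = 2 * x.T @ (x @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w, len(y)

def server_aggregate(client_results):
    """Server averages client models, weighted by local dataset size."""
    total = sum(n for _, n in client_results)
    return sum(w * (n / total) for w, n in client_results)

# One federated round over three simulated clients.
rng = np.random.default_rng(0)
global_w = np.zeros(3)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(3)]
global_w = server_aggregate([client_update(global_w, x, y) for x, y in clients])
print(global_w)
```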
Tutorial: A tutorial on privacy amplification by subsampling, diffusion and shuffling (Audra McMillan)
Abstract:
Practical differential privacy deployments require tight privacy accounting. A toolbox of “privacy amplification” techniques has been developed to simplify the privacy analysis of complicated differentially private mechanisms. These techniques can be used to design new differentially private mechanisms, as well as provide tighter privacy guarantees for existing mechanisms. In this tutorial, we will discuss three main privacy amplification techniques: subsampling, diffusion, and shuffling. We will discuss the intuition for why each technique amplifies privacy, and where it is useful in practice. Finally, we will use differentially private stochastic gradient descent as an example of how each technique can be used to easily provide a tight, or almost tight, privacy analysis.
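As one concrete instance of amplification by subsampling, the sketch below applies the standard bound for pure differential privacy: running an eps-DP mechanism on a random q-fraction of the data satisfies ln(1 + q(e^eps - 1))-DP. The numeric values are illustrative and not taken from the tutorial.

```python
# Privacy amplification by subsampling for a pure eps-DP mechanism:
# eps' = ln(1 + q * (e^eps - 1)) is much smaller than eps when q is small.
import math

def amplified_epsilon(eps, q):
    """Privacy guarantee of an eps-DP mechanism applied to a random q-subsample."""
    return math.log(1 + q * math.expm1(eps))

eps, q = 1.0, 0.01                 # base guarantee and sampling rate (e.g. one minibatch)
print(amplified_epsilon(eps, q))   # ~0.017, i.e. roughly a factor-q improvement
```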
Invited Talk by John M. Abowd
Abstract:
The talk will focus on the implementation of differential privacy used to protect the data products in the 2020 Census of Population and Housing. I will present a high-level overview of the design used for the majority of the data products, known as the TopDown Algorithm. I will focus on the high-level policy and technical challenges that the U.S. Census Bureau faced during the implementation including the original science embodied in that algorithm, implementation challenges arising from the production constraints, formalizing policies about privacy-loss budgets, communicating the effects of the algorithms on the final data products, and balancing competing data users' interests against the inherent privacy loss associated with detailed data publications.
Invited Talk by Nicolas Papernot
Abstract:
Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. To address this problem, algorithms for private machine learning have been proposed. In this talk, we first show that training neural networks with privacy requires rethinking their architectures with the goals of privacy-preserving gradient descent in mind. Second, we explore how private aggregation surfaces the synergies between privacy and generalization in machine learning. Third, we present recent work towards a form of collaborative machine learning that is both privacy-preserving in the sense of differential privacy, and confidentiality-preserving in the sense of the cryptographic community.
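As one concrete example of private aggregation, the sketch below implements a noisy-argmax vote over an ensemble of "teacher" models, in the spirit of PATE-style aggregation; the teacher labels, noise scale, and function names are illustrative assumptions rather than the speaker's exact construction.

```python
# Noisy-argmax aggregation of teacher predictions for a single unlabeled query.
import numpy as np

def noisy_argmax(teacher_labels, num_classes, epsilon, rng):
    """Aggregate teacher votes with Laplace noise on the vote histogram."""
    votes = np.bincount(teacher_labels, minlength=num_classes).astype(float)
    # Changing one teacher's vote moves two counts by 1 each, so Laplace noise of
    # scale 2/epsilon on every count gives epsilon-DP per labeled query.
    votes += rng.laplace(scale=2.0 / epsilon, size=num_classes)
    return int(np.argmax(votes))

rng = np.random.default_rng(0)
teacher_labels = np.array([2, 2, 1, 2, 0, 2, 2, 1, 2, 2])  # ten teachers' predictions
print(noisy_argmax(teacher_labels, num_classes=3, epsilon=1.0, rng=rng))
```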
Invited Talk by Ashwin Machanavajjhala
Abstract:
Several organizations, especially federal statistical agencies, routinely release fine-grained statistical data products for social good that are critical for enabling resource allocation, policy and decision making, as well as research. Differential privacy, the gold standard privacy technology, has long been motivated by this use case. In this talk, I will describe our recent experiences deploying differential privacy at scale at US federal statistical agencies. I will highlight how the process of deploying DP at these agencies differs from the idealized problem studied in the research literature, and illustrate a few key technical challenges we encountered in these deployments.
Invited Talk by Reza Shokri
Abstract:
Machine learning models leak information about their training data. Randomizing gradients during training is a technique to preserve differential privacy and protect against inference attacks. The general method to compute the differential privacy bound is to use composition theorems: to view the training process as a sequence of differentially private algorithms, and to compute the composition of their DP bounds. This results in a loose bound on the privacy loss of the released model, as it accounts for the privacy loss of all training epochs (even if the intermediate parameters are not released). I will present a novel approach for analyzing the dynamics of privacy loss throughout the training process, assuming that the internal state of the algorithm (its parameters during training) remains private. This enables computing how privacy loss changes after each training epoch, and the privacy loss at the time of releasing the model. I show that the differential privacy bound converges, and that it converges to a tight bound.
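For contrast, the sketch below computes the kind of composition-based accounting the abstract describes as loose: basic composition adds up per-epoch budgets, while the advanced composition theorem improves the dependence on the number of epochs to roughly a square root. The per-step budget and epoch count are illustrative assumptions, not numbers from the talk.

```python
# Composition accounting for T repetitions of an eps-DP training step.
import math

def basic_composition(eps_step, steps):
    """T-fold basic composition of eps-DP steps: total budget T * eps."""
    return eps_step * steps

def advanced_composition(eps_step, steps, delta_prime):
    """Advanced composition: (eps', T*delta + delta')-DP with
    eps' = sqrt(2T ln(1/delta')) * eps + T * eps * (e^eps - 1)."""
    return (math.sqrt(2 * steps * math.log(1 / delta_prime)) * eps_step
            + steps * eps_step * math.expm1(eps_step))

eps_step, steps = 0.1, 100
print(basic_composition(eps_step, steps))           # 10.0
print(advanced_composition(eps_step, steps, 1e-5))  # ~5.9: tighter, but still grows with T
```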
Invited Talk by Steven Wu
Abstract:
This talk will focus on differentially private synthetic data---a privatized version of the dataset that consists of fake data records and that approximates the real dataset on important statistical properties of interest. I will present our recent results on private synthetic data that leverage practical optimization heuristics to circumvent the computational bottleneck in existing work. Our techniques are motivated by a modular, game-theoretic framework, which can flexibly work with methods such as integer program solvers and deep generative models.
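A minimal sketch of the underlying "measure privately, then fit" idea is below: a few counting queries are answered with the Laplace mechanism, and a crude random search stands in for the solvers and deep generative models mentioned in the abstract. All names, data, and parameters are illustrative assumptions, not the speaker's method.

```python
# Toy private synthetic data: answer one-way marginals privately, then search
# for a fake dataset whose marginals match the noisy answers.
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(200, 3))                     # real binary dataset, 3 attributes
queries = [lambda d, j=j: d[:, j].mean() for j in range(3)]  # one-way marginal queries

epsilon = 1.0
# Each mean query has sensitivity 1/n; splitting the budget evenly over the
# queries gives Laplace scale |Q| / (epsilon * n).
scale = len(queries) / (epsilon * len(data))
noisy_answers = np.array([q(data) + rng.laplace(scale=scale) for q in queries])

best, best_err = None, np.inf
for _ in range(2000):                                        # crude random search over candidates
    cand = rng.integers(0, 2, size=(200, 3))
    err = np.abs([q(cand) for q in queries] - noisy_answers).sum()
    if err < best_err:
        best, best_err = cand, err

print(noisy_answers, [q(best) for q in queries])             # synthetic marginals track noisy ones
```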