This class uses Moodle; please sign up.
This course provides a broad introduction to data privacy, an important topic for technology developers as well as legislators and regulators. It will give students interested in pursuing privacy research an introduction to active research areas and open problems. This course will also expose students to many of the issues that privacy engineers, program managers, researchers and designers deal with in industry.
There is no textbook for this course. We will use research papers, case studies and policy guidance to study the following privacy areas:
Inference Attacks and Defenses: In 2006, AOL released user search logs stripped of email addresses, account holder names and other explicitly identifying information. The New York Times promptly examined the queries in the data and identified user 4417749 as a 62-year-old resident of Lilburn, GA. This is one of many ‘reidentification’ attacks. More recently, researchers have shown how pseudonymous eBay users can be identified and how Netflix users can be identified from their movie ratings. We will study attacks like these as well as defenses such as k-anonymity and differential privacy. An important consideration in our study of these defenses will be the trade-off between utility and privacy.
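To make the two defenses named above concrete, here is a minimal Python sketch, not course material. The toy records, the field names (age, zip, topic) and the helper names k_anonymity, generalize_age and noisy_count are all hypothetical, chosen only for illustration. The first part measures k-anonymity as the size of the smallest group of records sharing the same quasi-identifier values; the second adds Laplace noise to a counting query, the standard way to make a count differentially private.

```python
import random
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values for every quasi-identifier."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize_age(record, bucket=10):
    """Coarsen an exact age into a range such as '60-69' (hypothetical helper)."""
    lo = (record["age"] // bucket) * bucket
    return {**record, "age": f"{lo}-{lo + bucket - 1}"}


def noisy_count(records, predicate, epsilon, rng=random):
    """Laplace mechanism for a counting query.

    A count changes by at most 1 when one record is added or removed
    (sensitivity 1), so the noise scale is 1 / epsilon. The difference of
    two exponential draws with rate epsilon is Laplace-distributed with
    that scale.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Toy data, invented for this example only.
records = [
    {"age": 62, "zip": "00001", "topic": "gardening"},
    {"age": 67, "zip": "00001", "topic": "travel"},
    {"age": 34, "zip": "00002", "topic": "cooking"},
    {"age": 38, "zip": "00002", "topic": "sports"},
]

k_before = k_anonymity(records, ["age", "zip"])
generalized = [generalize_age(r) for r in records]
k_after = k_anonymity(generalized, ["age", "zip"])

# Smaller epsilon means more noise: stronger privacy, lower utility.
private_count = noisy_count(records, lambda r: r["zip"] == "00001", epsilon=0.5)
```

With exact ages every record is unique (k = 1); after generalizing ages into ten-year ranges each record shares its quasi-identifiers with one other (k = 2), at the cost of less precise data. The epsilon parameter plays the same role for the noisy count, which is the utility and privacy trade-off in miniature.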
Online Tracking, Advertising and Web Privacy: Online tracking is a key enabler of advertising, and thus of many free services, but it can lead to user confusion and unwelcome surprises. We will discuss the web privacy underpinnings of online tracking, such as cookies and referers, as well as some of the advertising models that tracking supports. We will also discuss some tracking defenses, efforts to regulate tracking and privacy-enhanced approaches to advertising. When evaluating these technologies we will consider the user experience of tracking and advertising as well as their technical aspects.
Measurement Challenges: The ‘privacy paradox’ refers to the perception that users’ stated privacy preferences are at odds with their behavior. For example, you may have heard of studies in which users disclose sensitive information (e.g. SSNs) for candy. Through studying a series of user experiments we will explore the privacy paradox and why measuring privacy attitudes and behaviors is difficult. We will also discuss some efforts to categorize users by their privacy preferences. This module will touch on behavioral economics as well as survey design.
Measurement Applications: In this module, we will use recent research papers to understand what is known about privacy attitudes and behaviors in a number of different areas including reputation management, social media, advertising, permissions/settings, online anonymity and emerging technologies. We will use what we learned about measurement challenges to look critically at study design and identify open problems. The papers covered will use qualitative measurement techniques as well as surveys, data mining/machine learning and natural language processing.
Learning Outcomes: Students who complete this course will be able to:
Assignments for this course will consist of problem sets, case studies presented by students in class and a small group project. There will be no midterms or final exam.
The syllabus (including details on grading) is now available and the tentative schedule is below.