By Bee-Chung Chen, Yahoo! Research, USA | Daniel Kifer, Penn State University, USA | Kristen LeFevre, University of Michigan, USA | Ashwin Machanavajjhala, Yahoo! Research, USA
Privacy is an important issue whenever data containing individuals' sensitive information is put to use. Research on protecting the privacy of individuals and the confidentiality of data has received contributions from many fields, including computer science, statistics, economics, and social science. In this paper, we survey research in privacy-preserving data publishing, an area that addresses the problem of how an organization, such as a hospital, government agency, or insurance company, can release data to the public without violating the confidentiality of personal information. We focus on privacy criteria that provide formal safety guarantees, present algorithms that sanitize data to make it safe for release while preserving useful information, and discuss ways of analyzing the sanitized data. Many challenges remain. This survey summarizes the current state of the art, from which we expect to see advances in the years to come.
This monograph is dedicated to those who have something to hide. It is a book about "privacy-preserving data publishing" – the art of publishing sensitive personal data, collected from a group of individuals, in a form that does not violate their privacy. This problem arises in many diverse applications, including the release of Census data, search logs, medical records, and interactions on a social network.
The purpose of this monograph is to provide a detailed overview of the current state of the art as well as open challenges, focusing particular attention on four key themes: