[ML Seminars] Learning and privacy
Christos Dimitrakakis talks about the challenge of identity protection.
If we wish to publish a database, or an analysis based on it, we must protect the identities of the people involved. Erasing directly identifying information is usually not enough, since attackers may hold side information that reveals the identities of individuals in the original data. While k-anonymity can protect against specific re-identification attacks when used with care, it offers little guidance when the adversary is very powerful. In this talk, we introduce the concept of differential privacy, which offers protection against adversaries with unlimited side information or computational power. Informally, an algorithmic computation is differentially private if an adversary cannot distinguish between two similar databases based on the result of the computation. We then discuss its application to both theoretical and practical problems in machine learning.
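
To make the informal definition concrete, here is a minimal sketch of one standard way to obtain differential privacy for a counting query, the Laplace mechanism. The abstract does not say which mechanisms the talk covers, so this choice is an assumption, and the names `laplace_noise`, `private_count` and the parameter `epsilon` are hypothetical, chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records satisfying `predicate`.

    Illustrative sketch only: adding or removing one record changes a
    count by at most 1 (sensitivity 1), so Laplace noise with scale
    1/epsilon is the usual calibration for an epsilon-DP count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [25, 31, 47, 52, 60]      # made-up example data
    neighbour = ages + [45]          # a "similar" database: one extra record
    for db in (ages, neighbour):
        print(private_count(db, lambda age: age > 40, epsilon=0.5))
```

The two databases in the usage example differ in a single record, so their true counts differ by 1. With noise of scale 1/epsilon added, the probability of seeing any particular output changes by at most a factor of e^epsilon between the two, which is what makes them hard to tell apart from the published result. Smaller values of epsilon give stronger privacy at the cost of noisier answers.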