k-Anonymization by Freeform Generalization

Abstract

Syntactic data anonymization strives to (i) ensure that an adversary cannot identify an individual’s record from published record attributes with high probability, and (ii) preserve data utility. These mutually conflicting goals can be expressed as an optimization problem with privacy as the constraint and utility as the objective function. Conventional research on k-anonymity, a popular privacy model, has resorted to generalizing data values in homogeneous groups. However, such grouping is not necessary. Instead, data values can be recast in a heterogeneous manner that allows for higher utility. Nevertheless, previous work in this direction did not define the problem in the most general terms; thus, the utility gains achieved are limited. In this paper, we propose a methodology that achieves the full potential of heterogeneity and gains higher utility. We formulate the problem as a network flow problem and develop an optimal solution for it using Mixed Integer Programming, an O(kn^2) Greedy algorithm that has no time-complexity disadvantage vis-à-vis previous approaches, an O(kn^2 log n) enhanced version of that algorithm, and an O(kn^3) adaptation of the Hungarian algorithm; these algorithms build a set of k perfect matchings from original to anonymized data. Our techniques can resist adversaries who may know the employed algorithms. Experiments with real-world data verify that our schemes achieve near-optimal utility (with gains of up to 41%), while they can exploit parallelism and data partitioning, gaining an efficiency advantage over simpler methods.
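
To make the idea of "k perfect matchings from original to anonymized data" concrete, the following is a minimal Python sketch. It is not the Greedy, enhanced Greedy, Hungarian, or Mixed Integer Programming method described in the abstract; it swaps in a simple cyclic-shift construction on a single sorted numeric attribute, chosen only because it is easy to verify by hand. The function names `cyclic_matchings` and `generalize` and the sample ages are hypothetical and introduced for illustration.

```python
def cyclic_matchings(n, k):
    """Return k pairwise edge-disjoint perfect matchings on n records.

    Matching j sends original record i to anonymized slot (i + j) % n,
    so matching 0 is the identity and no two matchings share an edge.
    This is an illustrative stand-in, not the algorithm from the talk.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return [[(i + j) % n for i in range(n)] for j in range(k)]


def generalize(sorted_values, k):
    """For each anonymized slot, publish the range covering its k matched originals."""
    n = len(sorted_values)
    covered = [[] for _ in range(n)]
    for matching in cyclic_matchings(n, k):
        for original, slot in enumerate(matching):
            covered[slot].append(sorted_values[original])
    return [(min(vals), max(vals)) for vals in covered]


# Hypothetical toy data: five ages, anonymized with k = 2.
ages = sorted([34, 21, 41, 23, 30])
for slot, (low, high) in enumerate(generalize(ages, k=2)):
    print(f"anonymized record {slot}: age in [{low}, {high}]")
```

Each published range is consistent with exactly k original records, and each original record is consistent with k published ranges, yet the ranges do not form homogeneous groups. The wrap-around slot (covering the smallest and largest age) gets a wide range, which hints at why choosing the matchings to minimize information loss, as the abstract's algorithms aim to do, matters for utility.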

Speaker

Dr. Panagiotis Karras
Assistant Professor
Skolkovo Institute of Science and Technology
Moscow, Russian Federation

Date & Time

7 Jan 2016 (Thursday) 11:00 - 12:00

Venue

E11-4045 (University of Macau)

Organized by

Department of Computer and Information Science

Biography

Panagiotis Karras is an Assistant Professor at the Skolkovo Institute of Science and Technology. His interests lie at the confluence of data management, data mining, and data security. He earned a Ph.D. in Computer Science from the University of Hong Kong and an M.Eng. in Electrical and Computer Engineering from the National Technical University of Athens. Panos has also held faculty and research appointments at Rutgers University, the National University of Singapore, the University of Zurich, and the Technical University of Denmark. His work has received the Hong Kong Young Scientist Award in Physical/Mathematical Science, has been funded by Singapore’s Lee Kuan Yew Endowment, and has been cited over 1200 times. He regularly serves as a program committee member and referee for the major international conferences and journals in the above areas.