Abstract
Contingency tables are widely used in many fields to analyze the relationship or infer the association between two or more variables. Indeed, due to their simplicity and ease, they are one of the first methods used to analyze gathered data. Typically, the construction of contingency tables from source data is considered straightforward since all data is supposed to be aggregated at a single party. However, in many cases, the collected data may actually be federated among different parties. Privacy and security concerns may restrict the data owners from free sharing of the raw data. However, construction of the global contingency tables would still be of immense interest. In this paper, we propose techniques for enabling secure construction of contingency tables from both horizontally and vertically partitioned data. Our methods are efficient and secure. We also examine cases where the constructed contingency table may itself leak too much information and discuss potential solutions.
Chapter PDF
Similar content being viewed by others
References
Agrawal, D., Aggarwal, C.C.: On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 247–255 (2001)
Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Proceedings of the 2000 ACM SIGMOD Conference on Management of Data, pp. 439–450 (2000)
Bedi, R.: Money Laundering - Controls and Prevention, 1st edn. ISI Publications (2004)
Benaloh, J.C.: Secret Sharing Homomorphisms: Keeping Shares of a Secret Secret. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 251–260. Springer, Heidelberg (1987)
Blum, M., Goldwasser, S.: An efficient probabilistic public-key encryption that hides all partial information. In: Blakely, R. (ed.) Advances in Cryptology – Crypto 84 Proceedings. Springer, Heidelberg (1984)
L. Cauley.: Nsa has massive database of americans’ phone calls (May 2006) (USA Today).
Clifton, C., Kantarcioglu, M., Lin, X., Vaidya, J., Zhu, M.: Tools for privacy preserving distributed data mining. SIGKDD Explorations 4(2), 28–34 (2003)
Cornfield, J.: A method of estimating comparative rates from clinical data: Applications to cancer of the lung, breast, and cervix. Journal of the National Cancer Institute 11, 1269–1275 (1951)
Dellaportas, P., Tarantola, C.: Model determination for categorical data with factor level merging. Journal of the Royal Statistical Society 67, 269–283 (2005)
Evfimievski, A., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. In: The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–228 (2002)
Fienberg, S.E.: The Analysis of Cross-classified Categorical Data, 2nd edn. M.I.T. Press, Cambridge (1980)
Goethals, B., Laur, S., Lipmaa, H., Mielikäinen, T.: On Secure Scalar Product Computation for Privacy-Preserving Data Mining. In: Park, C.-s., Chee, S. (eds.) ICISC 2004. LNCS, vol. 3506, pp. 104–120. Springer, Heidelberg (2005)
Goldreich, O.: General Cryptographic Protocols. In: The Foundations of Cryptography, vol. 2. Cambridge University Press, Cambridge (2004)
Goldreich, O., Micali, S., Wigderson, A.: How to play any mental game - a completeness theorem for protocols with honest majority. In: 19th ACM Symposium on the Theory of Computing, pp. 218–229 (1987)
Huang, Z., Du, W., Chen, B.: Deriving private information from randomized data. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore, MD, June 13-16 (2005)
Jagannathan, G., Wright, R.N.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: KDD 2005: Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pp. 593–599. ACM Press, New York (2005)
Kargupta, H., Datta, S., Wang, Q., Sivakumar, K.: On the privacy preserving properties of random data perturbation techniques. In: Proceedings of the Third IEEE International Conference on Data Mining (ICDM 2003) (2003)
Lindell, Y., Pinkas, B.: Privacy Preserving Data Mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)
Naccache, D., Stern, J.: A new public key cryptosystem based on higher residues. In: Proceedings of the 5th ACM conference on Computer and communications security, pp. 59–66. ACM Press, San Francisco (1998)
Okamoto, T., Uchiyama, S.: A New Public-Key Cryptosystem as Secure as Factoring. In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS, vol. 1403, pp. 308–318. Springer, Heidelberg (1998)
Paillier, P.: Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999)
Rizvi, S.J., Haritsa, J.R.: Maintaining data privacy in association rule mining. In: Proceedings of 28th International Conference on Very Large Data Bases. VLDB, Hong Kong, August 20-23, pp. 682–693 (2002)
Vaidya, J., Clifton, C.: Secure set intersection cardinality with application to association rule mining. Journal of Computer Security 13(4) (November 2005)
Vaidya, J., Clifton, C., Zhu, M.: Privacy-Preserving Data Mining, 1st edn. Advances in Information Security. Springer, Heidelberg (2005)
Yao, A.C.: Protocols for secure computation (extended abstract). In: Proceedings of the 23th IEEE Symposium on Foundations of Computer Science, pp. 160–164. IEEE, Los Alamitos (1982)
Yu, H., Vaidya, J.: Secure matrix addition (UIOWA Technical Report) (2004)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 IFIP International Federation for Information Processing
About this paper
Cite this paper
Lu, H., He, X., Vaidya, J., Adam, N. (2008). Secure Construction of Contingency Tables from Distributed Data. In: Atluri, V. (eds) Data and Applications Security XXII. DBSec 2008. Lecture Notes in Computer Science, vol 5094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70567-3_11
Download citation
DOI: https://doi.org/10.1007/978-3-540-70567-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70566-6
Online ISBN: 978-3-540-70567-3
eBook Packages: Computer ScienceComputer Science (R0)