EAGER: Spectral Analysis for Fraud Detection in Large-scale Networks
National Science Foundation (CCF-1047621)
This project takes a unified spectral transformation approach to analyzing network topology and identifying fraud patterns in large-scale dynamic networks, combining spectral transformation of the data with visualization of network topology. Large-scale social and communication networks carry rich topological information in addition to structured, semi-structured, and unstructured data. The research characterizes the patterns that various attacks produce in the spectral projection space of the graph topology and develops spectrum-based methods to identify those attacks. Because the approach exploits the spectral space of the network's underlying interaction structure, it is orthogonal to traditional approaches based on content profiling. Performing this spectral analysis depends on the development of complex mathematical techniques. Critical issues being explored include the scalability of the methods to very large data sets, determining the dimensionality of the node representation in spectral space, and the interpretation of patterns in spectral space.
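As a rough illustration of what a node representation in spectral space can look like, the sketch below projects each node of a small graph onto the leading eigenvectors of its adjacency matrix. The toy graph (two 4-node cliques joined by a single edge), the function name `spectral_coordinates`, and the choice of the adjacency matrix with k = 2 are assumptions made for this example, not details taken from the project.

```python
import numpy as np


def spectral_coordinates(adj, k):
    """Return the k largest adjacency eigenvalues and the matching eigenvectors.

    Row i of the returned matrix is the k-dimensional spectral coordinate
    of node i. (Hypothetical helper, for illustration only.)
    """
    vals, vecs = np.linalg.eigh(adj)        # eigh: for symmetric matrices
    order = np.argsort(vals)[::-1][:k]      # indices of the k largest eigenvalues
    return vals[order], vecs[:, order]


def clique(n):
    """Adjacency matrix of a complete graph on n nodes (no self-loops)."""
    return np.ones((n, n)) - np.eye(n)


# Assumed toy graph: two 4-node cliques joined by one bridging edge (3, 4).
adj = np.zeros((8, 8))
adj[:4, :4] = clique(4)
adj[4:, 4:] = clique(4)
adj[3, 4] = adj[4, 3] = 1

top_vals, coords = spectral_coordinates(adj, k=2)

for node, point in enumerate(coords):
    print(node, np.round(point, 3))
```

In this toy graph the second coordinate takes opposite signs on the two cliques, so the two groups separate along that axis. How many eigenvectors to keep for a real network is exactly the dimensionality question raised above.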
PSNet: Privacy and Spectral Analysis of Social Networks
National Science Foundation (CNS-0831204)
Social networks are of significant importance in various application domains. Most previous studies have focused on revealing interesting properties of networks and on discovering efficient and effective analysis methods. However, little work has been dedicated to privacy-preserving social network analysis. In this project, we investigate the application of graph perturbation techniques to protect the privacy of individual nodes and their sensitive link relationships. We conduct theoretical studies and empirical evaluations of the tradeoff between utility and privacy for various graph randomization techniques, and we investigate potential attacks by adversaries. To quantify the utility loss, we focus on changes in the spectrum and eigenvectors, since these are closely related to many real-space graph characteristics. We expect to develop spectrum- and utility-preserving randomization techniques that better preserve graph utility without sacrificing much privacy protection.
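The idea of measuring utility loss through the spectrum can be sketched as follows. The example applies a simple edge randomization (delete some existing edges, add the same number of new ones) and reports how far the largest adjacency eigenvalue moves. The random test graph, the helper names `rand_add_del` and `spectral_shift`, and the use of the largest eigenvalue alone as the utility measure are assumptions for illustration, not the project's actual procedure.

```python
import numpy as np


def rand_add_del(adj, num_swaps, rng):
    """Delete num_swaps existing edges and add num_swaps absent ones.

    Works on an undirected graph without self-loops, so the edge count
    is preserved. (Hypothetical helper, for illustration only.)
    """
    n = adj.shape[0]
    iu, ju = np.triu_indices(n, k=1)              # all node pairs i < j
    present = np.flatnonzero(adj[iu, ju] == 1)    # pairs that are edges
    absent = np.flatnonzero(adj[iu, ju] == 0)     # pairs that are not
    drop = rng.choice(present, size=num_swaps, replace=False)
    add = rng.choice(absent, size=num_swaps, replace=False)
    out = adj.copy()
    out[iu[drop], ju[drop]] = out[ju[drop], iu[drop]] = 0
    out[iu[add], ju[add]] = out[ju[add], iu[add]] = 1
    return out


def spectral_shift(adj_a, adj_b):
    """Absolute change in the largest adjacency eigenvalue."""
    ev_a = np.linalg.eigvalsh(adj_a)   # eigenvalues in ascending order
    ev_b = np.linalg.eigvalsh(adj_b)
    return float(abs(ev_a[-1] - ev_b[-1]))


# Assumed test graph: 30 nodes, each pair linked with probability 0.2.
rng = np.random.default_rng(0)
n = 30
upper = rng.random((n, n)) < 0.2
adj = np.triu(upper, 1).astype(int)
adj = adj + adj.T                      # symmetric, zero diagonal

num_swaps = 10
perturbed = rand_add_del(adj, num_swaps, rng)

print("edges before/after:", adj.sum() // 2, perturbed.sum() // 2)
print("shift in largest eigenvalue:", spectral_shift(adj, perturbed))
```

Larger values of `num_swaps` hide more of the original link structure but generally move the spectrum further, which is one way to picture the utility and privacy tradeoff described above.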
CAREER: Towards Privacy and Confidentiality Preserving Databases
National Science Foundation (CAREER Award IIS-0546027)
Many databases maintained by government, commercial, and non-profit organizations hold large amounts of sensitive or confidential data, such as income and medical records. As a result, protecting the privacy and confidentiality of such databases is of primary concern. In this project, we focus on quantifying and evaluating the tradeoffs between data utility and disclosure risk when various perturbation techniques are applied in practice. We expect to provide a prototype system that can conduct full disclosure analysis using both model-based and randomization-based approaches to satisfy users' complex privacy and confidentiality specifications.