Award Abstract # 1838139
BIGDATA: F: Privacy in Unsupervised Learning
| NSF Org: | IIS, Div Of Information & Intelligent Systems |
| Awardee: | JOHNS HOPKINS UNIVERSITY, THE |
| Initial Amendment Date: | September 11, 2018 |
| Latest Amendment Date: | November 23, 2020 |
| Award Number: | 1838139 |
| Award Instrument: | Standard Grant |
| Program Manager: | Ralph Wachter, rwachter@nsf.gov, (703)292-8950; IIS, Div Of Information & Intelligent Systems; CSE, Direct For Computer & Info Scie & Enginr |
| Start Date: | October 1, 2018 |
| End Date: | September 30, 2022 (Estimated) |
| Total Intended Award Amount: | $911,398.00 |
| Total Awarded Amount to Date: | $911,398.00 |
| Funds Obligated to Date: | FY 2018 = $911,398.00 |
| History of Investigator: | Raman Arora (Principal Investigator), arora@cs.jhu.edu |
| Awardee Sponsored Research Office: | Johns Hopkins University, 1101 E 33rd St, Baltimore, MD, US 21218-2686, (443)997-1898 |
| Sponsor Congressional District: | 07 |
| Primary Place of Performance: | Johns Hopkins University, 3400 N Charles St, Baltimore, MD, US 21218-2608 |
| Primary Place of Performance Congressional District: | 07 |
| DUNS ID: | 001910777 |
| Parent DUNS ID: | 001910777 |
| NSF Program(s): | Big Data Science & Engineering |
| Primary Program Source: | 040100 NSF RESEARCH & RELATED ACTIVIT |
| Program Reference Code(s): | 062Z, 8083 |
| Program Element Code(s): | 8083 |
| Award Agency Code: | 4900 |
| Fund Agency Code: | 4900 |
| Assistance Listing Number(s): | 47.070 |
ABSTRACT

Modern data sets are largely unlabeled. Unsupervised learning of useful representations to better understand the structure in data is a critical challenge in data science and machine learning; it finds application in computational and social science, including information retrieval, web mining, and recommendation systems. As we progress further into the age of big data, and the amount of data to be processed grows faster than our computational resources, better and faster methods for unsupervised learning and data analysis on such data sets become ever more necessary. Furthermore, with the advent of the internet of things, private data is collected ubiquitously and seamlessly through devices and platforms such as smartphones, cameras, microphones, radio-frequency identification (RFID) readers, and social networks, raising serious concerns about individual privacy. Therefore, in this project, we initiate a formal investigation into privacy-aware unsupervised learning for big data applications.
Taking a stochastic optimization view of unsupervised learning, we capture more general learning problems than previously studied in the privacy literature. One such class is non-convex problems, such as matrix learning, tensor factorization, and deep learning, among others. While most of these problems are NP-hard, in practice solutions can often be found efficiently. We conjecture that noisy stochastic gradient descent updates, which have recently been shown to efficiently find local minima for a large class of non-convex problems, also guarantee privacy implicitly. Finally, we consider extensions of the privacy model from that of a single curator to those of distributed learning, the continual release model, the streaming model, and a novel sliding window model.
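As a rough illustration of what "noisy stochastic gradient descent" can look like on a non-convex problem, the sketch below runs clipped, Gaussian-perturbed SGD on a toy rank-one factorization loss. It is not taken from the project: the function names (`clip_to_norm`, `rank_one_grad`, `noisy_sgd`), the loss, and all parameter values are assumptions chosen for illustration, and the noise scale `sigma` is not calibrated to any formal privacy guarantee.

```python
import math
import random


def clip_to_norm(vec, max_norm):
    """Rescale vec so its Euclidean norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def rank_one_grad(w, x):
    """Gradient in w of 0.25 * ||x x^T - w w^T||_F^2, a non-convex loss."""
    w_sq = sum(v * v for v in w)
    xw = sum(a * b for a, b in zip(x, w))
    return [w_sq * wi - xw * xi for wi, xi in zip(w, x)]


def noisy_sgd(data, w0, lr=0.1, clip=1.0, sigma=0.0, steps=500, seed=0):
    """Hypothetical noisy SGD loop: sample a point, clip its gradient,
    add Gaussian noise with standard deviation sigma * clip, then step."""
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(steps):
        x = rng.choice(data)
        g = clip_to_norm(rank_one_grad(w, x), clip)
        noise = [rng.gauss(0.0, sigma * clip) for _ in g]
        w = [wi - lr * (gi + ni) for wi, gi, ni in zip(w, g, noise)]
    return w


if __name__ == "__main__":
    points = [[1.0, 0.0]]
    print(noisy_sgd(points, [0.5, 0.1], sigma=0.0))  # noiseless baseline
    print(noisy_sgd(points, [0.5, 0.1], sigma=0.5))  # perturbed updates
```

Clipping bounds how much any single data point can move the iterate, and the added noise masks that bounded contribution; whether such noise yields a formal guarantee depends on an analysis this sketch does not attempt.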
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Rothchild, Daniel and Panda, Ashwinee and Ullah, Enayat and Ivkin, Nikita and Stoica, Ion and Braverman, Vladimir and Gonzalez, Joseph and Arora, Raman. "FetchSGD: Communication-Efficient Federated Learning with Sketching." Proceedings of Machine Learning Research, 2020.
Wang, Yunjuan and Mianjy, Poorya and Arora, Raman. "Robust Learning for Data Poisoning Attacks." Proceedings of Machine Learning Research, v.139, 2021.
Arora, Raman and Bartlett, Peter and Mianjy, Poorya and Srebro, Nathan. "Dropout: Explicit Forms and Capacity Control." Proceedings of Machine Learning Research, v.139, 2021.
Ivkin, Nikita and Rothchild, Daniel and Ullah, Enayat and Braverman, Vladimir and Stoica, Ion and Arora, Raman. "Communication-efficient distributed SGD with sketching." Advances in Neural Information Processing Systems, 2019.
Upadhyay, Jalaj. "Sublinear Space Private Algorithms Under the Sliding Window Model." Proceedings of Machine Learning Research, 2019.
Arora, Raman and Upadhyay, Jalaj. "Differentially Private Graph Sparsification and Applications." Advances in Neural Information Processing Systems, 2019.
Arora, Raman and Braverman, Vladimir and Upadhyay, Jalaj. "Differentially private robust low-rank approximation." Advances in Neural Information Processing Systems, 2018.
Upadhyay, Jalaj. "The Price of Privacy for Low-rank Factorization." Advances in Neural Information Processing Systems, 2018.
Ullah, Enayat and Mai, Tung and Rao, Anup and Rossi, Ryan A. and Arora, Raman. "Machine Unlearning via Algorithmic Stability." Proceedings of Machine Learning Research, v.134, 2021.
Upadhyay, Jalaj and Upadhyay, Sarvagya. "A Framework for Private Matrix Analysis in Sliding Window Model." Proceedings of Machine Learning Research, v.139, 2021.
Arora, Raman and Upadhyay, Jalaj and Upadhyay, Sarvagya. "Differentially Private Analysis on Graph Streams." Proceedings of Machine Learning Research, v.130, 2021.
Arora, Raman and Marinov, Teodor Vanislavov and Mohri, Mehryar. "Corralling Stochastic Bandit Algorithms." Proceedings of Machine Learning Research, v.130, 2021.
Please report errors in award information by writing to: awardsearch@nsf.gov.