National Science Foundation

Award Abstract #1139158

Making Sense at Scale with Algorithms, Machines, and People

Division of Information & Intelligent Systems
Initial Amendment Date: March 29, 2012
Latest Amendment Date: September 15, 2015
Award Number: 1139158
Award Instrument: Continuing grant
Program Manager: Aidong Zhang
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer & Information Science & Engineering
Start Date: April 1, 2012
End Date: March 31, 2017 (Estimated)
Awarded Amount to Date: $10,000,000.00
Investigator(s): Michael Franklin franklin@cs.berkeley.edu (Principal Investigator)
Scott Shenker (Co-Principal Investigator)
Alexandre Bayen (Co-Principal Investigator)
Ion Stoica (Co-Principal Investigator)
Michael Jordan (Co-Principal Investigator)
Sponsor: University of California-Berkeley
Sponsored Projects Office
BERKELEY, CA 94704-5940 (510)642-8109
Program Reference Code(s): 7723
Program Element Code(s): 1640, 7723


University of California, Berkeley

The world is increasingly awash in data. As more and more human activities move online, and as a growing array of connected devices becomes an integral part of daily life, the amount and diversity of data being generated continue to explode. According to one estimate, more than a zettabyte (one billion terabytes) of new information was created in 2010 alone, with the rate of new information increasing by roughly 60% annually. This data takes many forms: free-form tweets, text messages, blogs, and documents; structured streams produced by computers, sensors, and scientific instruments; and media such as images and video.

Buried in this flood of data are the keys to solving huge societal problems, improving productivity and efficiency, creating new economic opportunities, and unlocking new discoveries in medicine, science, and the humanities. However, raw data alone is not sufficient; we can only make sense of our world by turning this data into knowledge and insight. This challenge, known as the Big Data problem, cannot be solved by the straightforward application of current data analytics technology, because of the sheer volume and diversity of information. Rather, solving it requires discarding old preconceptions about data management and breaking down many of the traditional boundaries in and around Computer Science and related disciplines.

The Algorithms, Machines, and People (AMP) expedition at the University of California, Berkeley is addressing this challenge head-on. AMP is a collaboration of researchers with a wide range of data-related expertise, committed to working together to create a new data analytics paradigm. AMP will produce fundamental innovations in, and a deep integration of, three very different types of computational resources:

1. Algorithms: new machine-learning and analysis methods that can operate at large scale and can give flexible tradeoffs between timeliness, accuracy, and cost.

2. Machines: systems infrastructure that allows programmers to easily harness the power of scalable cloud and cluster computing for making sense of data.

3. People: crowdsourcing human activity and intelligence to create hybrid human/computer solutions to problems not solvable by today's automated data analysis technologies alone.

AMP research will be guided and evaluated through close collaboration with domain experts in key societal applications, including cancer genomics and personalized medicine, large-scale sensing for traffic prediction and environmental monitoring, urban planning, and network security. Advances pioneered by the project will be made widely available through the development of the Berkeley Data Analysis System (BDAS), an open source software platform that seamlessly blends Algorithm, Machine, and People resources to solve big data problems.

For more information, visit http://amplab.cs.berkeley.edu


Peter Bailis, Shivaram Venkataraman, Michael Franklin, Joseph M. Hellerstein, Ion Stoica. "Probabilistically Bounded Staleness for Practical Partial Quorums," Proceedings of the VLDB Endowment 2012, v.5, 2012, p. 776-787.

Jiannan Wang, Tim Kraska, Michael Franklin, Jianhua Feng. "CrowdER: Crowdsourcing Entity Resolution," Proceedings of the VLDB Endowment 2012, v.5, 2012, p. 1483-1494.

Kay Ousterhout, Aurojit Panda, Joshua Rosen, Shivaram Venkataraman, Reynold Xin, Sylvia Ratnasamy, Scott Shenker, and Ion Stoica. "The Case for Tiny Tasks in Compute Clusters," In Proceedings of the 14th USENIX Conference on Hot Topics in Operating Systems, 2013, p. 14.

Reynold Xin, Joseph Gonzalez, Michael Franklin, Ion Stoica. "GraphX: A Resilient Distributed Graph System on Spark," Proceedings of the First International Workshop on Graph Data Management Experience and Systems (GRADES 2013), 2013. 

Rishabh Iyer, Stefanie Jegelka, and Jeff Bilmes. "Fast Semidifferential-Based Submodular Function Optimization," In Proc. International Conference on Machine Learning (ICML), 2013.

Timothy Hunter, Tathagata Das, Matei Zaharia, Pieter Abbeel, and Alexandre M. Bayen. "Large-Scale Estimation in Cyberphysical Systems Using Streaming Data: A Case Study with Arterial Traffic Estimation," IEEE Transactions on Automation Science and Engineering, v.10, 2013, p. 884. 

Peter Bailis, Kyle Kingsbury. "The Network is Reliable: An informal survey of real-world communications failures," Communications of the ACM, v.12, 2014, p. http://qu. 

