Award Abstract # 1952302
IRES Track-1: I/O Research for Data-Intensive Analytics and Deep Learning

NSF Org: OISE
Office of International Science & Engineering
Recipient: FLORIDA STATE UNIVERSITY
Initial Amendment Date: April 7, 2020
Latest Amendment Date: April 7, 2020
Award Number: 1952302
Award Instrument: Standard Grant
Program Manager: Fahmida Chowdhury
fchowdhu@nsf.gov
 (703)292-4672
OISE: Office of International Science & Engineering
O/D: Office of the Director
Start Date: May 1, 2020
End Date: April 30, 2024 (Estimated)
Total Intended Award Amount: $299,963.00
Total Awarded Amount to Date: $299,963.00
Funds Obligated to Date: FY 2020 = $299,963.00
History of Investigator:
  • Weikuan Yu (Principal Investigator)
    yuw@cs.fsu.edu
Recipient Sponsored Research Office: Florida State University
874 TRADITIONS WAY
TALLAHASSEE
FL  US  32306-0001
(850)644-5260
Sponsor Congressional District: 02
Primary Place of Performance: Florida State University
874 Traditions Way, 3rd Floor
TALLAHASSEE
FL  US  32306-4166
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): JF2BLNN4PJC3
Parent UEI: D4GCCCMXR1H3
NSF Program(s): IRES Track I: IRES Sites (IS)
Primary Program Source: 040100 NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 5921, 7639
Program Element Code(s): 7727
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.079

ABSTRACT

Applications of data science are becoming increasingly diverse, spanning computation, input-output analysis, deep learning, and several other fields. These applications tend to generate and process their datasets in very different patterns. Their complex I/O patterns pose numerous challenges due to contention, congestion, and performance variability at multiple layers of the I/O stack, including I/O middleware libraries, parallel file systems, and storage devices. This IRES project aims to organize an international collaboration between Japan and the U.S. for research on I/O performance efficiency and data reliability for data-intensive analytics and deep learning applications. The IRES Track-1 site will be hosted at Florida State University (FSU), in close collaboration with the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan. A world-renowned national lab, R-CCS has hosted the K computer, the fastest supercomputer in Japan, and has been chosen as the site to host Japan's future exascale computer, Fugaku. This project leverages these facilities for research and training of IRES participants and enriches the portfolio of international collaborations between the U.S. and Japan. Each year for the duration of the project, five U.S. students (four graduate and one undergraduate) will be selected to participate in the IRES program and to visit and conduct research at R-CCS for 10 weeks.

This project pursues cross-layer optimizations on I/O middleware libraries, parallel file systems, and storage configurations, serving data-intensive analytics and deep learning applications. The project consists of four research activities: (1) I/O characterization of large-scale data-intensive applications and parallel file systems on large-scale supercomputers; (2) application-oriented I/O pipelining for deep learning applications and data reduction through compression; (3) user-level cross-layer optimizations of file and storage systems; and (4) development of multi-level checkpoint/restart with optimal checkpoint/restart intervals across hierarchical storage devices. The research can lead to many insights on how to develop efficient and reliable I/O techniques on high-performance computing (HPC) systems. The experience and lessons learned through this research can benefit the development of storage systems on leadership HPC systems for data analytics and deep learning applications and are expected to enhance the professional development of participating students.
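
As an illustration of what an "optimal checkpoint interval" in activity (4) can mean, the sketch below applies Young's classic first-order approximation, interval = sqrt(2 x checkpoint cost x mean time between failures), independently to each level of a storage hierarchy. This is not the project's method: the abstract does not state which model is used, the level names, costs, and failure rates are hypothetical values chosen for the example, and treating the levels independently is a simplifying assumption that real multi-level models do not make.

```python
import math


def young_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation of the checkpoint interval.

    checkpoint_cost_s: time to write one checkpoint, in seconds.
    mtbf_s: mean time between failures, in seconds.
    """
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


# Hypothetical hierarchy: (level name, checkpoint cost in seconds,
# MTBF in seconds of the failures that this level can recover from).
LEVELS = [
    ("node-local", 2.0, 3600.0),
    ("burst-buffer", 20.0, 36000.0),
    ("parallel-file-system", 200.0, 360000.0),
]


def intervals_per_level(levels):
    """Apply the approximation to each level independently (a simplification)."""
    return {name: young_interval(cost, mtbf) for name, cost, mtbf in levels}


if __name__ == "__main__":
    for name, interval in intervals_per_level(LEVELS).items():
        print(f"{name}: checkpoint about every {interval:.0f} s")
```

With these assumed numbers, cheaper and faster storage levels are checkpointed more often than the parallel file system, which is the general intuition behind multi-level checkpointing.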

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Fang, Xingang and Klawohn, Julia and De Sabatino, Alexander and Kundnani, Harsh and Ryan, Jonathan and Yu, Weikuan and Hajcak, Greg. "Accurate classification of depression through optimized machine learning models on high-dimensional noisy data." Biomedical Signal Processing and Control, v.71, 2022. https://doi.org/10.1016/j.bspc.2021.103237
Bhattacharya, Subhadeep and Yu, Weikuan and Chowdhury, Fahim Tahmid and Mohror, Kathryn. "O(1) Communication for Distributed SGD through Two-Level Gradient Averaging." 2021 IEEE International Conference on Cluster Computing (CLUSTER), 2021. https://doi.org/10.1109/Cluster48925.2021.00054
Khan, Md Muhib and Yu, Weikuan. "ROBOTune: High-Dimensional Configuration Tuning for Cluster-Based Data Analytics." ICPP 2021: 50th International Conference on Parallel Processing, 2021. https://doi.org/10.1145/3472456.3472518
