Title : NSF 96-85 Speech, Text, Image, and MULtimedia Advanced Technology Effort
Type : Program Guideline
NSF Org: CISE
Date : May 20, 1996
File : nsf9685

STIMULATE: Speech, Text, Image, and MULtimedia Advanced Technology Effort

Program Solicitation

A JOINT INITIATIVE OF:

NATIONAL SCIENCE FOUNDATION
COMPUTER AND INFORMATION SCIENCE AND ENGINEERING DIRECTORATE

NATIONAL SECURITY AGENCY
OFFICE OF RESEARCH AND SIGINT TECHNOLOGY

OFFICE OF RESEARCH AND DEVELOPMENT
CENTRAL INTELLIGENCE AGENCY

and

DEFENSE ADVANCED RESEARCH PROJECTS AGENCY
INFORMATION TECHNOLOGY OFFICE

DEADLINE: September 1, 1996

INTRODUCTION

The Information, Robotics and Intelligent Systems Division (IRIS) of the Computer and Information Science and Engineering Directorate (CISE) of the National Science Foundation (NSF); the Office of Research and SIGINT Technology of the National Security Agency (NSA); the Office of Research and Development (ORD) of the Central Intelligence Agency (CIA); and the Information Technology Office (ITO) of the Defense Advanced Research Projects Agency (DARPA) plan to jointly support fundamental research devoted to understanding multimodal human communication and application of such understanding to computer technology.

The aim of this joint initiative among NSF, NSA, ORD, and DARPA is to accelerate the progress in information technology by supporting new directions in research and development for understanding human communication in multiple modalities and languages. Such modalities include speech, text, image, video, gesture, facial expression, handwriting, and other means by which humans communicate. They also include degraded or noisy signals, such as may result from optical character recognition (OCR) or cellular telephones.

Technical advances in understanding human communication have so far progressed mainly along the lines of single modalities and single languages.
Progress in some areas has reached the point where significant impact on the national information infrastructure and the well-being of the nation is now possible. Further advances, however, may require taking advantage of the fact that most human communication takes place in more than one modality at the same time, or may require the development of new approaches to understanding a single modality. Therefore, it is important to pursue research to explore these possibilities. In addition, such a program of basic, scientific research is seen as an important vehicle for the development of new talent in the area of multimodal understanding.

TOPICS OF INTEREST

Building on previous inter-agency efforts in Human Language Technology (NSF 93-19) and Human Language Resources (NSF 95-100), this initiative seeks to extend research into multimodal human communications. In particular, proposals are sought that synthesize multidisciplinary approaches to the processing of separate modalities or explore new approaches to understanding single-modality human communications. An additional goal is to train new investigators in research on human communications.

This initiative is focused on four areas of research:

    (1) Automated processing of multimodal human communications;
    (2) Discourse and dialogue phenomena for a wide variety of multimodal tasks;
    (3) New algorithm paradigms or representation schemes for processing within a modality; and
    (4) Multimodal architectures that permit the separation of application functionality from modality of user interaction.
Examples follow; they are suggestions, not restrictions on possible proposal topics.

Area 1: Automated processing of multimodal human communications, such as:

    Understanding of mixed-language communications, such as telephone conversations involving bilingual speakers;
    Telephone-based information retrieval, involving speech recognition, real-time text summarization (such as of WWW pages) and prosodic speech synthesis;
    Event detection to spot emerging topics of interest in real time from on-going multimodal sources, including multimedia sources such as video or email;
    Topic tracking across multimodal and/or multilingual sources over time;
    Topic spotting from compressed data;
    Modality and/or language-independent query representation for information retrieval against heterogeneous sources;
    Machine translation from alternative modes involving ill-formed input, such as OCR, speech, or email;
    Text summarization based on multiple documents from multimodal and/or multilingual sources, including a) summarization based on relevant frames extracted from video, or b) summarization from database records, or outputs of data fusion or other analysis tools; and
    Techniques for processing multimedia sources, such as exploiting audio cues in video to locate a good image of an object.

Area 2: Discourse and dialogue phenomena for a wide variety of multimodal tasks, such as:

    Investigation into augmentation of human-computer interactivity, including a) determining human-computer interaction modality through negotiation (such as invoking spreadsheets when entering numbers, or invoking drawing tools when using the mouse in a particular way), or b) goal refinement through consultation (to determine how better to support what the user is trying to accomplish);
    Resolution of linguistic referring expressions, gesturing and pointing in multimodal and interactive communication environments;
    Error correction within multimodal and interactive communication environments; and
    Improved semantic, discourse, and dialogue modeling to enhance understanding regardless of modality.

Area 3: New algorithm paradigms or representation schemes for processing within a modality, such as:

    Alternatives to Hidden Markov Models for speech recognition, such as biologically plausible models, or models based on articulatory features;
    Language models that rely on less training data for speech recognition and language understanding (for both speech and text);
    Machine translation techniques that can be rapidly ported to new languages where available on-line training data is scarce;
    Combining outputs of different engines to enhance performance on a specific task, such as machine translation, information retrieval, speech processing, document image analysis or video analysis;
    Integration of statistical and symbolic approaches to exploit performance tradeoffs in communications processing, such as automatic analysis of multimedia communications, or handwritten non-Roman character recognition; and
    Automated acquisition of knowledge sources needed for language understanding, including word-sense tagging, lexical, grammatical, discourse and domain knowledge.

Area 4: Multimodal architectures that permit the separation of application functionality from modality of user interaction, such as:

    Transformation of map data into synthesized speech for hands-free and eyes-free guidance in unfamiliar territory; and
    Support of people with specific sensory or physical impairments for access to the full range of information services.

SCOPE AND TYPE OF SUPPORT

Proposals received under this solicitation will be subject to all normal NSF processing, including external review, selection criteria, and conditions of awards. There will be no special restrictions on research results, except those that normally apply to NSF grants. Specifically, NSF will review proposals received as a result of this announcement in accordance with its standard merit review process and selection criteria. In addition, a panel consisting of representatives from DARPA, NSA, ORD, and NSF will jointly select research projects to be funded under this initiative from among those recommended for funding by the above NSF merit review process. All awards will be based on the selection criteria and themes specified in the STIMULATE announcement.

Projects may be awarded for single or multiple years. This initiative is expected to provide a total of approximately $6.7 million, depending on funding availability, to several awardees over the three-year period of this solicitation. Awards and projects are expected to:

    Be either a three-year continuing grant of $150K to $250K per year, or a one-year standard grant of approximately $150K to $250K. The latter category is intended for high-risk projects, where the fundamental science does not yet exist. Success in this type of project may result in a follow-on proposal for a three-year award.
    Address multiple communication modalities or languages or new approaches for a single modality or language. Multiple-investigator, multidisciplinary projects are encouraged in either case.

PREPARATION AND SUBMISSION OF PROPOSALS

All proposals should refer to this Program Solicitation by number (NSF 96-85), and should be prepared and submitted in accordance with the guidelines contained in the Grant Proposal Guide (NSF 95-27, August 1995). Nine (9) copies of each proposal, including one bearing original signatures, should be addressed to:

    STIMULATE
    National Science Foundation
    Proposal Processing Unit
    4201 Wilson Blvd. Room P60
    Arlington, VA 22230

One information copy should be sent to:

    Gary W. Strong, Program Director
    Interactive Systems
    National Science Foundation
    4201 Wilson Blvd. Room 1115
    Arlington, VA 22230

Proposers are encouraged to examine the NSF web site (http://www.nsf.gov) for award abstracts on topics related to those in which a proposal will be made. These may be of use in identifying work already in progress or in identifying researchers who are actively engaged in related research. NSF policies and guidelines are available on the web as well.

WHO MAY APPLY

Academic and other not-for-profit research institutions in the United States with computer and information science research capability are invited to submit proposals. While proposals may involve unfunded collaboration with industry or other agencies of the government, an academic or research institution must be the prime research management organization submitting the proposal.

WHEN TO SUBMIT

To be considered for award, proposals submitted in response to this solicitation must be: (1) received by NSF no later than 5:00 PM on September 1, 1996; (2) postmarked no later than five (5) days prior to the deadline date; or (3) sent via commercial overnight mail no later than two (2) days prior to the deadline date. Awards are planned to begin as soon as the review process has been completed.

At least two additional annual competitions are expected to be held for this program but will be separately announced by new solicitations each year.
Due dates and special requirements for the competitions will appear in those solicitations. Interested parties should contact the person below if they miss the due date of this solicitation but are interested in submitting a proposal in a following year.

INQUIRIES

Telephone and email queries about this announcement are welcomed and should be addressed to:

    Gary W. Strong, Program Director
    Interactive Systems
    (703) 306-1928
    gstrong@nsf.gov

PROPOSAL EVALUATION AND AWARD

Proposals will be subject to review by an invited panel of external experts from the scientific community. Supplemental ad hoc reviews may be solicited as feasible and necessary to achieve a fair and accurate review of all proposals. Criteria by which the proposals will be judged include those published in the Grant Proposal Guide (NSF 95-27), with special emphasis placed on novel, innovative approaches or projects that involve new investigators. The participating agencies will jointly make the final selection of all awards under this initiative, considering the recommendations of all the external reviewers. Awards to successful projects will be made through NSF from funding provided by participating agencies.

AWARD ADMINISTRATION

Grants are administered in accordance with the terms and conditions of NSF Grant General Conditions (GC-1) or Federal Demonstration Project General Terms and Conditions (FDP), depending on the organization, copies of which may be requested from the NSF Forms and Publications Unit cited below under the section ADDITIONAL INFORMATION, or from the NSF World Wide Web address http://www.nsf.gov:80/bfa/cpo/start.htm. More comprehensive information is contained in the NSF Grant Policy Manual (NSF 95-26, July 1995), available as above or through a subscription offered by the Superintendent of Documents, Government Printing Office, Washington, DC 20402.

The Foundation provides awards for research in the sciences and engineering.
The awardee is wholly responsible for the conduct of such research and preparation of the results for publication. The Foundation does not assume responsibility for such findings or their interpretation.

The Foundation welcomes proposals on behalf of all qualified scientists and engineers, and strongly encourages women, minorities and persons with disabilities to compete fully in any of the research and research-related programs described in this document.

In accordance with Federal statutes and regulations and NSF policies, no person, on grounds of race, color, age, sex, national origin, or disability shall be excluded from participation in, denied the benefits of, or be subject to discrimination under any program or activity receiving financial assistance from the National Science Foundation.

The NSF has TDD (Telephonic Device for the Deaf) capability, which enables individuals with hearing impairment to communicate with the Division of Human Resource Management about NSF programs, employment, or general information. This number is (703) 306-0090.

AWARDEE EXPECTATIONS

Project progress reports are expected to be presented at two workshops, which are expected to be held annually: the Interactive Systems Program PI's Workshop and a special workshop for grantees of this solicitation. Proposers should include sufficient funds in their budgets to support travel to and presentation at the latter workshop, which is expected to be held in the Washington DC area once each year as a two-day meeting.

ADDITIONAL INFORMATION

NSF information and publications are available electronically via the World Wide Web (the URL is http://www.nsf.gov/), via Internet Gopher (on host stis.nsf.gov), via anonymous FTP (from ftp://stis.nsf.gov), or by sending an email request (sent to info@nsf.gov if you don't know the publication number or pubs@nsf.gov if you do). You may also send a written request to:

    NSF Forms and Publications Unit
    Room P-15
    4201 Wilson Blvd.
    Arlington, VA 22230

FACILITATION AWARDS FOR SCIENTISTS AND ENGINEERS WITH DISABILITIES (FASED)

These awards provide funding for special assistance or equipment to enable persons with disabilities (investigators and other staff, including student research assistants) to work on NSF projects. See the program announcement or contact the program coordinator at (703) 306-1636.

PRIVACY AND PUBLIC BURDEN STATEMENTS

The information requested on proposal forms is solicited under the authority of the National Science Foundation Act of 1950, as amended. It will be used in connection with the selection of qualified proposals and may be disclosed to qualified reviewers and staff assistants as part of the review process; to applicant institutions/grantees; to provide or obtain data regarding the application review process, award decisions, or the administration of awards; to government contractors, experts, volunteers, and researchers as necessary to complete assigned work; and to other government agencies in order to coordinate programs. See System of Records, NSF-50, Principal Investigator/Proposal File and Associated Records and NSF-51, 60 Federal Register 4449 (January 23, 1995), Reviewer/Proposal File and Associated Records, 59 Federal Register 8031 (February 17, 1994). Submission of the information is voluntary. Failure to provide full and complete information, however, may reduce the possibility of your receiving an award.

Public reporting burden for this collection of information is estimated to average 120 hours per response, including the time for reviewing instructions. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to:

    Herman G. Fleming
    Reports Clearance Officer
    Division of Contracts, Policy, and Oversight
    National Science Foundation
    Arlington, VA 22230

and to:

    Office of Management and Budget
    OIRM-Paperwork Reduction Project (3145-0058)
    Washington, DC 20503

OMB 3145-0058
P.T.: 34
K.W.: 1004144; 1004000; 0410000
Catalog of Federal Domestic Assistance No. 47.070
NSF 96-85 (New)