Conference on Scientific and Technical Data Exchange and Integration
National Institutes of Health
Natcher Conference Center
December 15, 1997
As I looked over our agenda, I was reminded of a story I heard from a colleague some months back. It was late last summer, and she and her family had just returned from a weekend of camping. She said the most memorable moment of the trip came one evening when the entire family was gathered around the campfire.
Her young daughter kept wandering around, circling the fire, intent on watching the flames jump and the sparks fly toward the sky. After a few minutes of this, the youngster finally looked up and said, "Wow, Mom, neat graphics!"
I don't mean to suggest that young people today have trouble separating reality from virtual reality. What is important to recognize is that the revolution in computing and communications has transformed the way we see our world.
Today I want to focus on what this means for both research and education in science and engineering. It's an appropriate topic for this time of year. We call this the season of giving -- much in keeping with the old phrase, 'Tis better to give than receive. I must confess that my own grandchildren have yet to embrace the true spirit of this message. They still look forward most to the receiving part of the season -- and as a grandparent, I am only too happy to indulge them. To be truthful, I'm not so sure I ever fully accepted this wisdom as a child.
My emphasis today -- if you'll indulge me -- will be on the importance of giving and receiving scientific data and information. I will explore two different facets of how data-sharing has become vital to the future of science and engineering.
By the end, I hope to make clear why I would suggest a substitute title for my talk: 'Tis Better to Give and Receive: Benefits of Data-Sharing for Research and Education.
The importance of data sharing in research was brought to light in a very interesting special edition of U.S. News and World Report that came out some months back. The cover story was entitled, "Great Science Mysteries." I used to read my science in Phys. Rev., now it's U.S. News.
The first thing that caught my eye was the subtext presented in the Table of Contents. It read, "the wonder of science is that the more we know, the more we know there is to know."
It was refreshing to see this statement in a major newsmagazine. We live in an era where books are being published saying the exact opposite -- that we know all there is to know, and all the great questions have been answered. It was therefore reassuring and inspiring to see a major media publication helping to deflate some of the arrogance of our era. Great frontiers of learning, discovery, and progress still lie before us -- so long as we retain the desire and the courage to take some risks and boldly explore where no one has gone before.
The centerpiece of the U.S. News issue was a collection of essays by leading researchers and science writers. They examined some 19 unanswered questions that run the gamut of science and engineering fields. A few examples:
At first glance, one could easily conclude that the questions on this list have virtually nothing in common with each other. Of course, before this past October, we said much the same thing about stock prices in New York and interest rates in Hong Kong. Because our world is now defined by linkages and interdependencies, it is instructive to examine how much the different questions on the U.S. News list actually do have in common with each other.
Some of you may know that NSF has launched an ambitious multidisciplinary effort under the heading of Knowledge and Distributed Intelligence, or KDI for short. It involves programs in all parts of the Foundation, as it is intended to foster innovative ways of developing, analyzing, representing, accessing, and transmitting complex information. This includes many activities related to working with large data-sets. There is also substantial overlap with our efforts in the area we call Life and Earth's Environment, which I know Bob Corell will have more to say about later this morning. A few examples:
The U.S. News list provided one more reminder that data-intensive, interdisciplinary approaches like KDI might well hold the key to unraveling science's great mysteries.
Whether we are asking how old the universe is, how many species there are, or what causes ice ages, the answers we seek will be locked away forever -- unless we spur progress in efforts under the KDI umbrella.
To quote: "The job of cataloging the world's species is straightforward, methodical, and slow. Taxonomists...add an average of 13,000 species a year to the list of known organisms. At that rate it would take centuries to complete the census. Because no central storehouse coordinates the results, even the number of species named so far -- between 1.5 million and 1.8 million -- is uncertain."
These few examples represent just a small sampling of the great mysteries that work in KDI will help us unravel. Even more important is that the advances in information science and technology needed to address these larger scientific challenges in research will likely bring even greater gains to our society as a whole.
It was Francis Bacon, I believe, who first wrote that "knowledge is power," but not even he could have foreseen how knowledge and information have come to power our society today. The phrase "information is everywhere" is more than just a cliche. We can now link to a rich array of information sources virtually anywhere and anytime we desire. We can check the weather at our favorite ski resorts, see who won the late game, and keep tabs on the market -- all without ever leaving our living rooms.
Now comes the hard part. Is this increased access to information enriching us as individuals and as a society? Is it creating opportunities that benefit all Americans? Critics and cynics point out that these advances have numerous unanticipated and perhaps undesired consequences. There are disturbing signs that these new technologies have further widened long-standing gaps and divisions in our society -- creating a world of technological haves and have-nots. Even some of our strongest supporters have openly wondered if all we've done is create new distractions for our young people that keep them from their studies. They say we are just seeding and fertilizing new crops of couch potatoes.
The access we have gained to widely distributed sources of information marks a major accomplishment for human civilization. It is nevertheless only the first step. Access to information is one thing. But intelligently absorbing, refining, and analyzing this information to glean useful knowledge is another altogether. This represents the driving force behind NSF's efforts in Knowledge and Distributed Intelligence.
We don't need to look back very far into history for a precedent that bodes well for our future success. I know the early 1990s hardly count as ancient history, but one could say that's the stone age in terms of the World Wide Web. At the time, the Web was literally the exclusive domain of researchers working at NSF supercomputer centers and other major facilities around the globe. But right around that time a sharp undergraduate took a job as a programmer at the National Center for Supercomputing Applications, the NSF-supported supercomputer center at the University of Illinois.
This student knew that there had to be something better than gophers and FTPs for linking data and exchanging files across different sites and applications. He came up with a program -- named it Mosaic -- and the web browser was born. The student's name is Marc Andreessen. He's since turned Mosaic into Netscape, and he's also provided us with a great story on the financial rewards students can reap from working on NSF-supported research projects.
The Netscape story provides a good example of how basic research can produce outcomes of enormous economic benefit -- and of how completely unpredictable those benefits are. It brings to mind Newton's famous observation that advances in science always require "standing on the shoulders of giants." We should take nothing away from the many talented people who have successfully taken these new tools and technologies to the commercial sector. But it is also beyond doubt that their success truly rests upon the shoulders of countless dedicated and talented individuals -- many of whom are here in this room today. Of course, it would be nice if we could all get a share of the profits.
This brings me to the second part of my talk because our society stands to profit in countless ways from the advent of these data-intensive approaches to science and engineering. Some of the greatest returns, I believe, will likely come from improved approaches to education and public outreach.
Let me share with you a few points from a recent study by the OECD, the Organisation for Economic Co-operation and Development. It is entitled Science in the Public Eye. It tells us that virtually all of the world's major industrialized countries share one troubling trait. Interest in scientific news and events is surprisingly high, while understanding of scientific concepts and methods is disturbingly low.
The term "potential" takes on special meaning within the realm of physics. It describes a difference in energy at two locations. We see this in the "plus" and "minus" signs on the batteries that power our flashlights. A more vivid example might be the feeling one gets when standing at the base of a structure like Hoover Dam - imagining the potential contained in the massive reservoir of water pressing against the concrete wall.
Needless to say, across all of science and engineering, there is a great human reservoir of energy we can tap to reduce the public's potential difference and raise awareness and understanding of our work. We just have to find the keys that open the floodgates.
Our friends at NASA tell us that they received over 565 million hits on the web sites for the Pathfinder mission to Mars in one five-week period. In the days following the landing, the site was logging in excess of 45 million hits per day. I won't try to translate all these figures into actual numbers of people -- except to say that it is a very large number.
Even more important is that these numbers are only the beginning. The frontiers of science are now accessible as never before, and we can now open our doors to anyone interested in the discovery process.
NSF and the Department of Energy have helped a team at the Lawrence Berkeley Laboratory launch a project known as "Hands On Universe." It's a prime example of what my colleagues in our Education and Human Resources Directorate call a "Student-Scientist Partnership." Students and teachers draw upon data and applications developed at top observatories, and their findings contribute directly to the research base.
Suppose, for example, you want to track supernovae. Astrophysicists have calculated that supernovae occur at the rate of roughly one per week. Even if every astrophysicist in the world worked full time searching for them, we probably still would not identify all of them.
"Hands On Universe" gets teachers and students involved in this search. They download images and software, and then they start sifting through the data. This has already led to startling discoveries and valuable science. One student team actually spotted the first light of the ninth supernova of 1994, and they will appear as co-authors on the paper.
Even Nintendo can't compete with the excitement and motivation this generates. One student put it best, saying: "I didn't just learn about science. I am a scientist."
Other kinds of student-scientist partnerships have become indispensable components of major data collection efforts. It is now mid-December, the leaves are off the trees, and the winter solstice is just a week away. At times like this, it is difficult to believe that spring is just around the corner, and we have a hard time imagining that proverbial "first butterfly of spring."
It turns out that one of the great mysteries of science involves the migration patterns of the familiar orange-and-black Monarch butterflies that grace our gardens in spring and summer. Monarchs move across the continent from Canada to Mexico each year in the late summer and fall. We are not sure if they move in specific directions or along certain pathways. We also don't know if weather is a factor, or if the migration patterns change over time.
Answering all of these questions requires data - lots of it - collected at regular intervals, from as many different sites as possible, and then entered into shared databases. An NSF-funded project based at the University of Kansas is helping to coordinate a national effort to develop this database.
One key to the project is mapping the Monarch's migrations. This requires tagging the butterflies by - ever so gently - placing a small polypropylene patch on their wings. It's not a simple task, as it is all too easy to damage the butterflies' delicate wings.
Given the need for kind and gentle handling, one might think that this is the last job one would turn over to a group of 12-year-olds. You'd be surprised. It turns out that very small fingers are a major plus when securing the tags. Seasoned researchers therefore appreciate the benefits of including kids on their research teams. This past year, over 50,000 Monarchs were tagged, and the tracking of the tagged butterflies is well underway. In fact, this year's data-set is yielding the best results ever on the migration patterns. One Monarch that was tagged in Connecticut was recovered over 900 miles away in North Florida. That's a lot of flaps for a delicate wing!
Again, student groups are key parts of this entire process - the tagging, the tracking, and the data entry and analysis. They share in the excitement and the satisfaction that comes with being full partners in the discovery process. In this case, the database likely would not exist without their contribution.
These examples and countless similar efforts around the country relate directly to the goals of our gathering here today. It takes more than luck and happenstance to make the exchange and integration of advanced scientific data a reality. It requires careful planning at the front-end of projects - planning that reflects a commitment to cooperative research as well as to education and outreach.
This brings to mind the famous line from the movie "Field of Dreams." When it comes to data-sets that represent the frontiers of science, we should be guided by a philosophy of "if you build it, they will come."
In closing, let me return to a point from the beginning of my remarks. Advances in information technologies have transformed how we see our world. Nowhere is this more evident than in research and education. Our new-found abilities to collect, represent, and exchange data and information hold the key to solving science's great mysteries, and to opening floodgates that will raise public awareness and appreciation for our work.
And so, I leave you with a seasonal message: it truly is better to give and receive. This applies to many endeavors in life - and especially to the exchange and integration of scientific data and information. These resources are an immensely valuable gift to our society, and their value gets multiplied many times over when we see to it that they are open and available to all.