Pooling Data in Astronomy and Particle Physics
High-speed computer networks and breakthroughs in telescopes
now allow astronomers and particle physicists to collect vast amounts of data
in very short periods of time. But this has created a problem: there's too much
data for anyone to look at or make sense of. That fact prompted Dr. Alex Szalay
to propose the KDI-funded project Accessing Large Distributed Archives in
Astronomy and Particle Physics.
Dr. Szalay, the project's
principal investigator, is the Alumni Centennial Professor at the Department of
Physics and Astronomy at Johns Hopkins University. He worked with co-principal
investigators Drs. Aihud Pevsner and Ethan Vishniac (also from Johns Hopkins's
Physics and Astronomy Department), Dr. Jim Gray from Microsoft, Dr. Michael Goodrich
from the University's Computer Science Department, Drs. Harvey Newman and Julian
Bunn from California Institute of Technology (Caltech), and Dr. Tom Nash from
Fermilab, in addition to postdocs and graduate students.
Dr. Szalay explains that in today's world, computers are
involved in all the sciences, and one of the functions they perform is to
collect data whose volume is growing exponentially. "Every year," he says, "we
collect as much data as was ever collected from the beginning, from the
time of the Greeks. Experiments in astronomy and particle physics can acquire
data at the rate of several terabytes (a thousand gigabytes) a year, and soon
that will accelerate to a petabyte (a thousand terabytes) a year." How to
organize this avalanche of data and render it accessible to researchers? This
question fueled the work of Dr. Szalay and his team.
"In astronomy and particle physics, as well as in other
sciences such as biology and oceanography, there are projects going on all over
the world and interesting new data sets are being collected, but scientists
don't have time to process this information to a central location," says Dr.
Szalay. In the field of astronomy, Dr. Szalay notes, archives are stored in
different geographic locations, and individual publishers of astronomical data
become the curators of their data sets. "Ten years ago there was nothing that
one could do about this," says Dr. Szalay, "so astronomers were more isolated.
But today there are fast computer networks and a fast Internet, so now we could
ask: How do we use this data as part of a single data set?"
The team built several test applications to generate
solutions to the problem of how to pool resources. "We ended up solving the
problems that we set out to solve," Dr. Szalay says. "What we wanted was to
enable simultaneous searches in multiple data sets, and we accomplished this.
We demonstrated that it can be done, and done easily."
But the team's work didn't end when the project concluded.
"We started to participate in this computational grid as a result of the KDI,"
said Dr. Szalay, "but the project had a much broader impact, well beyond what
was originally intended." He explains that the collaboration grew steadily
larger, drawing in more physicists, astronomers, and particle physicists.
This led to the formation of the National Virtual
Observatory, which creates standards for astronomical data collections that
will be used by the astronomical community. For more information about the
National Virtual Observatory, go to their Web site:
"The KDI grant was an unbelievably important catalyst, at
precisely the right place and the right time," said Dr. Szalay. "A lot of [our
research] would have gone in another direction if it weren't for this grant. I
cannot emphasize how important this KDI project was, for me personally, and for
the research direction that I have ended up pursuing."
If you'd like to learn more about projects like this one,
go to http://www.skyquery.net. This
interactive site has a good demonstration of how typical applications in the
Virtual Observatory will look.