Building the Scholarly Electronic Brain

Michael Kurtz

Harvard-Smithsonian Center for Astrophysics

It is now a commonplace observation that human society is becoming a coherent super-organism, and that the information infrastructure forms its emerging brain. Since the underlying technologies are likely to become billions of times more powerful than those we have today, perhaps we could say that we are now building the lizard brain of the future organism.

In current practice, information from fundamental sources (the Congressional Record, the Hubble Space Telescope Archive) is collated and analyzed by humans; this work is then published in secondary venues (The New York Times, The Astrophysical Journal), which are in turn machine-indexed by third parties (Google, ADS). At the same time, specialized indices (Thomas, SIMBAD) do a more thorough job of indexing the fundamental sources. The close similarity between the broad cultural and scholarly realms here frees scholars to concentrate on the problems and opportunities inherent in their disciplines, on the assumption that larger infrastructure issues will (mostly) be solved by others.

In astronomy the long-term trend has been toward increasing collaboration, interoperability, and interdependence among the major information groups (archives, journals, indexers), and this has already produced substantially increased capabilities. The trend will continue, driven both by the shared goals of all groups and by a rapidly changing technological base. Some of the key concepts that will drive these future developments are very fine-grained information, semantic markup and semantic inferencing engines, text mining, and machine learning.