By BERT GAMBINI
UB is a leading center in the field of applied ontology, which develops the means necessary to build logically coherent data classification systems. One critical resource for this work has now been recognized by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) as an international standard that conforms to the joint ISO/IEC Top-Level Ontology standard.
That resource is the Basic Formal Ontology (BFO), which is used in efforts ranging from genomics research to digital manufacturing to projects within the U.S. Department of Defense.
BFO is the first piece of philosophy ever to receive such recognition.
BFO was created and is maintained by a team led by Barry Smith, SUNY Distinguished Professor in the Department of Philosophy, College of Arts and Sciences, and director of UB’s National Center for Ontological Research (NCOR). The UB team behind the effort also included Werner Ceusters, division chief, Biomedical Ontology, Jacobs School of Medicine and Biomedical Sciences, and Alan H. Ruttenberg, former director of clinical and translational data exchange, UB School of Dental Medicine.
The term “ontology” originally referred to a branch of metaphysics. In the current context, an “ontology” is a computational inventory of types of objects, processes and the relationships between them. Ontologies provide controlled vocabularies whose common use by database and information systems developers allows their data to be more effectively combined, retrieved and analyzed.
Hundreds of ontologies have been developed in many specific domains, but BFO is a “top-level” ontology that is applicable to any domain whatsoever. It provides a common architecture for use in supporting any formal vocabulary.
“The publication of the new ISO/IEC standard has far-reaching implications that will encourage new levels of innovation and collaboration not previously possible,” Smith explains. “In various disciplines for the last 20 years or so, people have engaged in efforts to standardize categories in different areas. Those standardized sets are ontologies. Increasingly, BFO provides the architecture used to do that work.”
The history of ontology
The success of the Human Genome Project gave rise very rapidly to immense quantities of new kinds of data — data about genes, proteins and other molecular sequences. Biologists and medical scientists had to find some way to connect these new data to the work they were doing on biological processes in humans and other organisms, including processes leading to disease. The Gene Ontology (GO), created in 1998, provided a consensus vocabulary that could be used to tag sequences found in all sorts of experiments involving biological phenomena.
In 2004, the GO leadership invited Smith to help address the logical problems that arose when the GO itself needed to be linked to other ontologies covering domains such as chemistry or anatomy. The initial version of BFO, which provided a domain-neutral starting point for ontology building, was the result of this work.
BFO’s successes in biology led the U.S. Army to consider it for some of its intelligence needs in 2010. Since then, multiple agencies in the U.S. Department of Defense and the U.S. intelligence community have followed suit. Today, there are roughly 500 institutions and groups that have ontology initiatives using BFO.
When Smith started working with the U.S. Army, he also became involved with CUBRC, a Buffalo-based scientific research, development, testing and systems integration company. CUBRC arose in 1983 through a strategic relationship between UB and a local technology company, as a bridge between academia and industry. Today, CUBRC employs more than 170 engineers and scientists who perform advanced research, development, engineering and testing services in the areas of aeroscience; chemical, biological and medical sciences; and information sciences.
“CUBRC’s primary customers come from all elements of the U.S. Department of Defense,” says Smith. “Based on the ever-growing need to align and semantically enhance very large and diverse data to support a wide variety of defense and intelligence applications, CUBRC started a small ontology team comprised of former UB students and associates, which built what came to be the Common Core Ontologies (CCO), a very wide-ranging suite of open source ontologies based on BFO that is now used in important initiatives relating to defense and intelligence.”
Early on, CUBRC recognized the potential impact across a wide variety of applications of subjecting BFO to the ISO process.
“As a result, CUBRC provided both technical and financial support to Dr. Smith as he navigated the long and complicated process that resulted in the publication of the new international standard,” says Michael Moskal, CUBRC CIO and senior vice-president.
How it works
BFO is relatively small but universal in the sense that, according to Smith, its categories can be used by more or less everyone with data. Everyone needs time and space; objects and processes; qualities and places. These are just the sorts of general terms defined computationally by BFO.
“Let’s say you have a person, or a tank, or a molecule. In each case, you have an object. Object is a general category, while tank is lower down the scale,” says Smith. “Similarly, if you have a clot forming or a bullet firing, then you have a process, which is another top-level category. BFO provides a means for domain ontologists to build categories at lower levels in a consistent way. That consistency is significant because of the ever-recurring need to pool resources, as with interagency collaborations, or corporate mergers, or scientific collaborations. When databases are built in isolation, it becomes very expensive when they need to be combined.”
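Smith's description of general and lower-level categories can be pictured with a small sketch. This is only an illustration under stated assumptions: the parent map, the term "battle tank" and the function name `top_level_category` are hypothetical, and the code is not the actual BFO hierarchy or its formal encoding. It shows only the idea that every domain term should trace back to a shared top-level category.

```python
# Illustrative sketch only: a toy parent map, NOT the real BFO hierarchy.
# The two top-level categories are the ones named in the article.
TOP_LEVEL = {"object", "process"}

# Hypothetical domain terms, each linked to its more general parent.
PARENT = {
    "person": "object",
    "tank": "object",
    "battle tank": "tank",       # assumed extra level, for illustration
    "molecule": "object",
    "clot formation": "process",
    "bullet firing": "process",
}


def top_level_category(term: str) -> str:
    """Follow parent links upward until a top-level category is reached."""
    current = term
    visited = set()
    while current not in TOP_LEVEL:
        if current in visited or current not in PARENT:
            raise KeyError(f"{term!r} is not anchored to a top-level category")
        visited.add(current)
        current = PARENT[current]
    return current


if __name__ == "__main__":
    for term in ("battle tank", "molecule", "clot formation"):
        print(term, "->", top_level_category(term))
```

A term that does not trace back to a shared category raises an error in this sketch, which mirrors the consistency problem that arises when databases are built in isolation.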
To see the sorts of problems that arise, consider the word “post,” which has multiple definitions and shades of meaning. Speakers of American English might think of a wooden object when they hear “post,” but speakers of British English might think of the process of delivering mail. Human beings are able to work out the nuances separating a piece of timber from the postal service, but computers need to plow forward with just the information that has been coded. Widespread use of BFO can at least alleviate some of the problems that then arise.
As another example, take the term “hole.” German and French engineers working in 2006 on separate pieces of the fuselage for the Airbus A380 had conflicting ways of representing holes in their respective computer-aided design packages. But their databases plowed forward with their design instructions. The discrepancy between the two definitions was not realized until the two pieces being built separately in Hamburg and Toulouse were brought together. It was only then that crews threading the hundreds of miles of wiring through the plane’s airframe announced that they had run out of slack before reaching the necessary connection points.
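The “post” problem can be sketched in a few lines. The two vocabularies and the function name `find_conflicts` below are hypothetical, and the sketch does not use any real BFO tooling. It assumes only that each data source tags its terms with a shared top-level category, so that a clash can be detected before the data are combined.

```python
from typing import Dict, Tuple

# Hypothetical vocabularies: each maps a term to a shared top-level category.
us_vocabulary: Dict[str, str] = {
    "post": "object",       # a piece of timber
    "tank": "object",
}
uk_vocabulary: Dict[str, str] = {
    "post": "process",      # the delivery of mail
    "tank": "object",
}


def find_conflicts(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return terms present in both vocabularies but tagged with different categories."""
    shared_terms = sorted(a.keys() & b.keys())
    return {t: (a[t], b[t]) for t in shared_terms if a[t] != b[t]}


if __name__ == "__main__":
    for term, (left, right) in find_conflicts(us_vocabulary, uk_vocabulary).items():
        print(f"'{term}' is tagged as {left} in one source and {right} in the other")
```

Without the shared category tags, the two “post” entries would look identical to a program and would be merged silently.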
Price tag to fix a few millimeters of mismeasurement? About $6 billion.
BFO, a creation of philosophers
A philosophical background is ideally suited for creating something like BFO, notes Smith, because of the discipline’s broad nature.
Specialists build their knowledge within a particular specialty or domain.
“Engineers know a lot about engineering, and computer scientists know a lot about computer science, but philosophers know a little bit — namely the very general bit — about everything, and that’s what is required to build a top-level architecture that everyone can share, whether you’re talking about gene sequencing or military mission planning,” says Smith.
Massive data storage and integration demands are modern realities of all scientific undertakings.
“The ISO/IEC standard establishes BFO as the first top-level ontology for describing complex processes, objects and functions in ways that will allow scientists to proceed with greater confidence that their data will be reusable by others,” says Smith.