Cross-industry semantic interoperability, part three: The role of a top-level ontology

July 26, 2017 Victor Berrios, Zigbee Alliance; Richard Halter, ARTS; Mark Harrison, GS1; Scott Hollenbeck, VeriSign; Elisa Kendall, OMG; Doug Migliori, ControlBEAM; John Petze, SkyFoundry; and J. Clarke Stevens, Shaw Communications

 

This multi-part series addresses the need for a single semantic data model supporting the Internet of Things (IoT) and the digital transformation of buildings, businesses, and consumers. Such a model must be simple and extensible to enable plug-and-play interoperability and universal adoption across industries.

Part two identified consortia and their approaches to application layer interop. In part three, we discuss the role of a top-level ontology in solving the metadata challenge, and how elements of alternative approaches can improve scalability.

This is intended to be a living series that incorporates relevant emerging concepts and reader comments over time. The community’s participation is encouraged.


 

“There are two words for everything.” – E.V. Lucas

What is an ontology?

Ontologies, as parts of science, have many faces. Originally, ontology was the part of philosophy about “being,” or the universal system of knowledge that describes objects, phenomena, and regularities of the world.

In recent years, the development of ontologies has been moving from the realm of artificial intelligence (AI) to the Semantic Web. The ontologies on the web range from categorizing general web content (such as schema.org) to categorizations of products for sale and their features (such as on amazon.com).

As a facilitator of semantic interoperability, an ontology provides a standardized classification of the concepts associated with metadata identifiers for a particular domain (e.g., healthcare). While incorporating characteristics of a taxonomy and thesaurus, an ontology uses strict semantic relationships among terms and attributes with the goal of knowledge representation in machine-readable form (Figure 15).[7]

 


[Figure 15 | Semantic levels]

The methodologies used to develop an ontology are critical to facilitating scalability and must consider all relevant applications. The applications this article series considers include the business and IoT use cases of five inter-related industries – Homes & Buildings, Energy, Retail, Healthcare, and Transportation & Logistics.

While syntactic languages (such as OWL, RDF, and RDFS) can be used to construct ontologies, part three of the series will focus on methodologies agnostic to any specific modeling language.

A controlled vocabulary for consistency

A controlled vocabulary is a carefully selected collection of words and phrases (i.e., terms) that are given well-defined meaning consistent across contexts. A vocabulary can be used to maintain consistency in ontology development, which defines the contextual relationships behind the terms.

All terms in a vocabulary controlled by a registration authority should have an unambiguous, non-redundant definition. If multiple terms can be used to mean the same thing, one of the terms should be identified as the preferred term in the controlled vocabulary, with the other terms listed as synonyms or aliases (as shown in Figure 16 and the Glossary of Terms).

Controlled vocabularies should provide National Language Support for global applications. Standard vocabularies representing terms within domains of knowledge are already available freely from various organizations (e.g., lov.okfn.org).

 


[Figure 16 | Controlled terms with aliases and translations]
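The preferred-term rule described above can be sketched in a few lines of Python. This is only an illustration, not the registry of any consortium: the `Term` and `ControlledVocabulary` names, the case-insensitive matching, and the sample "Device" entry are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A controlled term: one preferred label plus its aliases and translations."""
    preferred: str
    definition: str
    aliases: set = field(default_factory=set)
    translations: dict = field(default_factory=dict)  # language code -> label

    def labels(self) -> set:
        return {self.preferred, *self.aliases, *self.translations.values()}


class ControlledVocabulary:
    """Hypothetical registry that keeps every label unambiguous."""

    def __init__(self) -> None:
        self._terms = {}   # normalized preferred label -> Term
        self._index = {}   # any normalized label -> normalized preferred label

    @staticmethod
    def _norm(label: str) -> str:
        return label.strip().casefold()

    def register(self, term: Term) -> None:
        key = self._norm(term.preferred)
        # Check every label first so a rejected term leaves the index untouched.
        for label in term.labels():
            owner = self._index.get(self._norm(label))
            if owner is not None and owner != key:
                raise ValueError(f"{label!r} already belongs to term {owner!r}")
        self._terms[key] = term
        for label in term.labels():
            self._index[self._norm(label)] = key

    def resolve(self, label: str) -> Term:
        """Return the Term for a preferred label, alias, or translation."""
        return self._terms[self._index[self._norm(label)]]


# Sample entry (invented for illustration).
vocab = ControlledVocabulary()
vocab.register(Term(
    preferred="Device",
    definition="Illustrative definition only.",
    aliases={"Equipment"},
    translations={"es": "Dispositivo"},
))
print(vocab.resolve("equipment").preferred)    # Device
print(vocab.resolve("Dispositivo").preferred)  # Device
```

Rejecting a label that already belongs to another term is one way to enforce the non-redundancy requirement at registration time. A real registration authority would likely apply a review process rather than a hard error.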

An object class for every thing

An ontology can provide a standardized classification of domain concepts through a collection of classes. Each class (concept) can represent a category of like things or objects which can be uniquely identified. A class is defined to reflect the attributes, restrictions, and relationships unique to its objects (instances). A class can represent physical objects such as sensors and persons as well as information objects such as business transactions [ISO 11179]. An ontology, together with a set of individual instances of classes, constitutes a knowledge base.[8]

A hierarchy for classes

Like a taxonomy, an ontology can define its classes within a hierarchical structure, which can be as deep as needed (Figure 17). A class (such as Sensor or Actuator) can be a subclass (type) of another class (Device).

 

[Figure 17 | Hierarchical structure of an ontology.]

All subclasses inherit the attributes of their class. For example, if Power Status is an attribute of the Device class, all Sensor and Actuator objects will have this attribute.

An attribute is attached at the most general class applicable to all of its objects, including subclasses. Since all classes are types of objects, the class hierarchy has one root class, Object, which comprises attributes, such as Identifier (an O-DEF classification property), that are inherited by all objects (see Figure 19).

While this methodology parallels object-oriented programming, it represents a metadata abstraction from programming. The metadata representing the ontology can be maintained in a repository (ISO 11179) completely abstracted from any programming environment.
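As a rough sketch of that abstraction, the class hierarchy can be held as plain metadata records rather than as programming-language classes. The `ClassDef` structure below is hypothetical, and placing Device directly under Object is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClassDef:
    """Metadata record for one ontology class (hypothetical structure)."""
    name: str
    parent: Optional["ClassDef"] = None
    attributes: list = field(default_factory=list)  # attributes declared on this class

    def all_attributes(self) -> list:
        """Attributes inherited from ancestors, followed by those declared here."""
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + self.attributes

    def is_a(self, other: "ClassDef") -> bool:
        """True if this class is `other` or one of its subclasses."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# Root class and the Device branch used as the example in the text.
root = ClassDef("Object", attributes=["Identifier", "Name", "Class"])
device = ClassDef("Device", parent=root, attributes=["Power Status"])
sensor = ClassDef("Sensor", parent=device)
actuator = ClassDef("Actuator", parent=device)

print(sensor.all_attributes())  # ['Identifier', 'Name', 'Class', 'Power Status']
print(actuator.is_a(device))    # True
```

Because the hierarchy is ordinary data, it could be stored in a metadata repository and read by applications written in any language.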

A top-level for cross-industry interop

Top-level object classes (e.g., the O-DEF core index) can facilitate the exchange of data and interoperability across different domains (e.g., buildings, retail, healthcare) since they ensure that foundational terms are used in a unified and semantically compatible manner.

The semantic data models of the consortia identified in this series include top-level classes that support their targeted industries and use cases (Figure 18).

 



[Figure 18 | Consortia top-level object classes.]

While terminology may differ, the consortia share many foundational concepts (classes). A “blending” of these concepts can form a top-level ontology capable of supporting industry-specific use cases and cross-industry interoperability (Figure 19).

 



[Figure 19 | Blended, cross-industry top-level class hierarchy.]

The Name and Description attributes of the root Object class can describe these top-level classes, and are included in the Glossary of Terms.

While Person and Organization are considered top-level classes by some consortia models (O-DEF, schema.org), they are both subclasses of a Party concept within business models (GS1 EDI, ARTS ODM). A Party class includes attributes common to both persons and organizations, and enables one class to be associated with business transactions and other relationships.[9] A Party is capable of legal ownership and can be related to an Owner Party attribute of the root Object class. A Party instance can own both tangible (vehicle) and intangible (sales order) objects.

Although not explicitly defined within these consortia, a top-level Relationship class is included in this blended approach to abstract the ontology from any specific ontology language defining many-to-many relationships.

A class for an information model

An information model, as a knowledge domain, can have its own ontology, which in turn can be used to model a multi-level ontology. An Information Model top-level object class (O-DEF Information-Set) can contain the subclasses that define an information model (Figure 20).

 


[Figure 20 | An information model class hierarchy.]

An ontology for data types

Ontologies for data types and measurement units (such as QUDT.org) can provide foundational semantic interoperability in business and technology.

A Data Type class can be modeled as a subclass of Information Model. All data based on digital electronics is represented as bits (alternatives 0 and 1) on the lowest level, and a Bits attribute of the Data Type class can be inherited by all subclasses. Number and String are atomic data types (direct subclasses of the Data Type class), as their values cannot be described in smaller parts. Integer and Float “primitives” can be defined as subclasses of a Number class (as with schema.org). Atomic and primitive data types have been defined by standards organizations (e.g., ISO.org 11404, W3.org XML Schema), but inconsistencies among them are challenging to manage.

Additional data types (like Quantity) having unique attributes can be derived from primitive data types and defined as their subclasses. However, the use of specific primitive and derived data types has varied among programming languages and consortia data services, limiting semantic interoperability (Figure 21).

 


[Figure 21 | Consortia data types]

A data type for a term

A Term data type (similar to Haystack’s “Marker”) can be used by an attribute (similar to a Haystack tag) to classify an object separately or in conjunction with an object class hierarchy.

When utilized with a Controlled Vocabulary, the value of a Term attribute can represent a Term object. For example, in Figure 19 the Name attribute of the root Object class is assigned to the Term data type. The value of the Name attribute for the root Object class is related to the “Object” Term in a Controlled Vocabulary (Figure 16).

The concept of Term can also be modeled as a subclass of Information Model (Figure 20).

A data type for a relation

A Relation data type (similar to Haystack’s “Ref”) can be assigned to an attribute to denote a relationship with an object of the same or different class. For example, the Class attribute of the root Object class is assigned to the Relation data type (Figure 19). The “Within Class” attribute of the Attribute class is also assigned to the Relation data type (Figure 20). In this case, the relation represents containment of Attribute objects within a Class object.

An attribute assigned to a Relation data type should be restricted to objects within a single class, which should be the most restrictive subclass to properly reflect the relationship.
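The restriction can be illustrated with a small check. In this sketch the `RelationAttribute` type and its `validate` method are hypothetical, and treating Owner Party as a Relation-typed attribute that targets the Party class is an assumption based on the Party discussion earlier in this part.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassDef:
    name: str
    parent: Optional["ClassDef"] = None

    def is_a(self, other: "ClassDef") -> bool:
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


@dataclass(frozen=True)
class Obj:
    """An instance, identified by its Identifier and Class attributes."""
    identifier: str
    cls: ClassDef


@dataclass(frozen=True)
class RelationAttribute:
    """An attribute assigned to the Relation data type (hypothetical)."""
    name: str
    target_class: ClassDef  # most restrictive class that fits the relationship

    def validate(self, value: Obj) -> Obj:
        if not value.cls.is_a(self.target_class):
            raise TypeError(
                f"{self.name} must reference a {self.target_class.name}, "
                f"not a {value.cls.name}"
            )
        return value


root = ClassDef("Object")
party = ClassDef("Party", parent=root)
person = ClassDef("Person", parent=party)
device = ClassDef("Device", parent=root)

owner_party = RelationAttribute("Owner Party", target_class=party)
owner_party.validate(Obj("person-1", person))    # accepted: a Person is a Party
# owner_party.validate(Obj("device-1", device))  # would raise TypeError
```

Targeting Party rather than the root Object class is what keeps a Device from being recorded as an owner.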

Quantity data types for measurements

Business and technology depend on measured numbers, most of which have units. A Data Type ontology can define a measured Quantity (schema.org QuantitativeValue) data type as a subclass of a Number data type. Data types can also be defined for each type or “dimension” of measurement, as instances (objects) of the Quantity data type. For example, a Temperature data type (UN/CEFACT Temp-MeasureType in Figure 21) can be defined as an instance of the Quantity subclass.

By modeling a Monetary (Currency) amount as just another measurement type, processes, including value conversions, can be normalized across all measurement types. A mechanism (similar to xe.com) can be used to retrieve dynamic value changes of a Conversion Factor (currency exchange rate) associated with a Monetary unit.

A class for measurement units

The most widely used system of units is the International System of Units, or SI. ISO 80000-1 further defines quantities and units of the SI and the International System of Quantities (ISQ).

A Unit class can be modeled as a subclass of Information Model. Figure 22 shows attributes (Identifier, Name, Class) of the Object class inherited by each object in the dataset. The figure also includes the attributes of the class (Unit) identified by the object’s Class attribute.

 

[Figure 22 | Example instances of a Unit class with Object and Unit attributes]

A Unit identifier (such as °F) paired with a quantity value in a dataset (such as Haystack tagged data) can resolve to a Quantity data type (such as Temperature) within the identified Unit object. Attributes of the Unit object can also support a unit conversion process (Figure 23).

 

[Figure 23 | Temperature value conversion using Unit instance with conversion attributes]
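One way such a conversion could work is sketched below. The `factor` and `offset` attribute names, and the choice of kelvin as the base unit for Temperature, are assumptions for illustration. They are not taken from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A Unit instance with hypothetical conversion attributes."""
    identifier: str
    name: str
    quantity: str        # Quantity data type the unit resolves to
    factor: float        # multiplier to reach the base unit
    offset: float = 0.0  # added after multiplying

    def to_base(self, value: float) -> float:
        return value * self.factor + self.offset

    def from_base(self, value: float) -> float:
        return (value - self.offset) / self.factor


def convert(value: float, source: Unit, target: Unit) -> float:
    """Convert via the base unit; both units must share a Quantity data type."""
    if source.quantity != target.quantity:
        raise ValueError(
            f"cannot convert {source.quantity} to {target.quantity}"
        )
    return target.from_base(source.to_base(value))


# Base unit for Temperature is assumed to be kelvin.
KELVIN = Unit("K", "kelvin", "Temperature", factor=1.0)
CELSIUS = Unit("°C", "degree Celsius", "Temperature", factor=1.0, offset=273.15)
FAHRENHEIT = Unit("°F", "degree Fahrenheit", "Temperature",
                  factor=5 / 9, offset=459.67 * 5 / 9)
METRE = Unit("m", "metre", "Length", factor=1.0)

boiling_c = convert(212.0, FAHRENHEIT, CELSIUS)  # approximately 100
print(round(boiling_c, 6))
```

The same pattern would extend to a Monetary unit, with the factor looked up from an exchange-rate source at conversion time instead of being stored as a constant.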

Roles for an object

The concept of a role (such as within O-DEF) describes a function that can be performed by an object in a particular context. A Role class can be modeled as a subclass of Information Model and can include instances that apply to different object classes (Figure 24).

 

[Figure 24 | Example instances of a Role class with Object and Role attributes]

An instance within the Relationship class can assign an instance of a Role to an object. An object can have more than one role. For example, an instance of a Person can have Employee, Parent, and Passenger roles. An instance of a type of Device can be a Sensor and a Communicator. The purpose of many devices is to assume the same roles as Persons. Thus, a Role can be assigned across object classes.

Some Roles (Customer) have a corresponding Reverse Role (Vendor). When a Customer role is assigned to a Party (modeled in ARTS ODM), a corresponding Vendor role is assigned to another Party to form a trade relationship.
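A minimal sketch of role assignment with reverse roles follows. The `RoleAssignment` record stands in for an instance of the Relationship class, and all names here (`RoleRegistry`, `counterpart_id`, the party identifiers) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    name: str
    reverse: Optional[str] = None  # name of the Reverse Role, if any


@dataclass(frozen=True)
class RoleAssignment:
    """Stands in for a Relationship instance that assigns a Role to an object."""
    object_id: str
    role: str
    counterpart_id: Optional[str] = None


class RoleRegistry:
    def __init__(self, roles) -> None:
        self._roles = {role.name: role for role in roles}
        self.assignments = []

    def assign(self, object_id: str, role_name: str,
               counterpart_id: Optional[str] = None) -> list:
        role = self._roles[role_name]
        if role.reverse is not None and counterpart_id is None:
            raise ValueError(f"{role.name} requires a counterpart for {role.reverse}")
        created = [RoleAssignment(object_id, role.name, counterpart_id)]
        if role.reverse is not None:
            # Assign the Reverse Role to the other party in the same step.
            created.append(RoleAssignment(counterpart_id, role.reverse, object_id))
        self.assignments.extend(created)
        return created

    def roles_of(self, object_id: str) -> set:
        return {a.role for a in self.assignments if a.object_id == object_id}


registry = RoleRegistry([
    Role("Customer", reverse="Vendor"),
    Role("Vendor", reverse="Customer"),
    Role("Employee"),
    Role("Passenger"),
])
registry.assign("party-1", "Customer", counterpart_id="party-2")
registry.assign("party-1", "Employee")

print(sorted(registry.roles_of("party-1")))  # ['Customer', 'Employee']
print(sorted(registry.roles_of("party-2")))  # ['Vendor']
```

Creating both assignments in one call keeps the trade relationship consistent: a Customer never exists without its Vendor.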

Part four considers the intersection of business and device ontologies. Part five discusses OWL, RDF, and RDFS as approaches to metadata management and syntactical interoperability.

For term definitions, see the Glossary.

Victor Berrios is Vice President of Technology for Zigbee Alliance and has twenty years of experience in the wireless communication industry. He is a recognized expert in the short-range wireless industry as evidenced by his contributions to the RF4CE Network; Zigbee Remote Control, Zigbee Input Device, Zigbee Healthcare, and Zigbee Low Power End Device Specifications. He was recognized by the Continua Health Alliance as its Spring 2011 Key Contributor to the success of the Test and Certification Work Group.

Richard Halter was the Chief Technology Architect for the Association for Retail Technology Standards (ARTS) for over 18 years. In this role, Richard was responsible for all ARTS artifacts including Unified POS Devices, Operation Data Model, POSLog XML Schema, Integration Technical Reports, and best practices papers. In addition, he contributed to the ARTS Data Dictionary, describing thousands of retail technology terms. Richard was a member of Conexxus (convenience store group), HITIS (hotel industry), ARTS, and GS1.

Mark Harrison provides technical consultancy to GS1 in areas including end-to-end supply chain traceability and blockchain, Linked Data/Semantic Web, and the GS1 SmartSearch vocabulary. As former Director of Cambridge Auto ID Lab, Mark contributed to development of GS1 EPCglobal open standards for networked RFID (including EPC Information Services (EPCIS), Discovery Services, Networked/Event-based Electronic Pedigree, and EPC Tag Data Translation).

Scott Hollenbeck is Senior Director of Verisign’s Registry Services Lab. Scott has developed expertise in the Domain Name System (DNS) and is the author of the Extensible Provisioning Protocol (EPP) for the registration and management of Internet infrastructure data, including domain names. He has contributed to several industry efforts including internationalized domain names, ENUM, public key cryptography, S/MIME, the Extensible Markup Language (XML), and the Transport Layer Security (TLS) protocol. He has served as a member of the Internet Engineering Steering Group of IETF.

Doug Migliori has over 20 years of experience in supply chain and retail automation systems and mobile application development platforms. He has contributed to several open source consortia solving interoperability challenges including GS1, ARTS, OMG, CABA, IPSO Alliance, and OCF. Doug administers the IoT in Retail, IoT in Healthcare, and IoT in Homes and Buildings LinkedIn Groups. He is a principal with ControlBEAM, which provides a Unified Commerce platform built on the BEAM common data service.

John Petze is a partner at SkyFoundry, the developers of SkySpark, an analytics platform for building, energy, and equipment data. John has over 30 years of experience in building automation, energy management, and M2M/IoT, having served in senior-level positions for manufacturers of hardware and software products, including President & CEO of Tridium, VP Product Development for Andover Controls, and Global Director of Sales for Cisco Systems Smart and Connected Buildings group. He is a member of the Association of Energy Engineers and the Executive Director of Project-Haystack.org.

Ron Schuldt is Chairman of The Open Group’s Semantic Interoperability Work Group and was a principal in the creation of the Open Data Element Framework (O-DEF) standard. He has over 28 years of experience as a systems engineer for Lockheed Martin working on systems design and integration. He has been involved in multiple data interchange standards activities and is recognized as an expert on data standards. Currently, he is Manager of Data-Harmonizing, LLC, providing data integration training and consulting services.

J. Clarke Stevens is Principal Architect of emerging technologies at Shaw Communications. In this role, he analyzes emerging technologies and works with senior executives to develop product strategy. He is a public speaker on the IoT and an active technical contributor to the Open Connectivity Foundation (OCF). He has occasionally been a judge for the CES Innovation Awards. Clarke served on the board of directors of Universal Plug-n-Play Forum (UPnP), chaired the Technical Committee, and led the Internet of Things task force until UPnP was acquired by OCF.

References:

7. Harpring, Patricia. "Introduction to Controlled Vocabularies." J. Paul Getty Trust, 2010.

8. Noy, Natalya F. and McGuinness, Deborah L. "Ontology Development 101: A Guide to Creating Your First Ontology." protege.stanford.edu, 2001.

9. Hay, David C. Data Model Patterns: Conventions of Thought. Dorset House Publishers, Inc., New York, 1996.

 

 
