
Uma Balakrishnan, Dagobert Soergel, Olivia Helfer, Representing Concepts through Description Logic Expressions for Knowledge Organization System (KOS) Mapping in:

International Society for Knowledge Organization (ISKO), Marianne Lykke, Tanja Svarre, Mette Skov, Daniel Martínez-Ávila (Eds.)

Knowledge Organization at the Interface, page 455 - 459

Proceedings of the Sixteenth International ISKO Conference, 2020 Aalborg, Denmark

1. Edition 2020, ISBN print: 978-3-95650-775-5, ISBN online: 978-3-95650-776-2, https://doi.org/10.5771/9783956507762-455

Series: Advances in Knowledge Organization, vol. 17

Bibliographic information
Uma Balakrishnan, Verbundzentrale des GBV (VZG), Göttingen, Germany
Dagobert Soergel, Dept. of Information Science, Univ. at Buffalo, United States
Olivia Helfer, WNYLRC, Buffalo, NY, United States

Representing Concepts through Description Logic Expressions for Knowledge Organization System (KOS) Mapping

Abstract: Mapping among KOS can be achieved by representing the concepts in each KOS through a canonical expression:

DDC: 386.8 Inland waterway > Ports  ◄▬  Canonical expression: Traffic station ⊓ Inland water transport  ▬►  GND: Binnenhafen

We explored representing concepts through description logic expressions using a sample of 150 library KOS classes and subject headings, 50 each from the Dewey Decimal Classification (DDC), the Regensburger Verbundklassifikation (RVK), and the German Integrated Authority File (GND). Working from the ground up, we compiled a small vocabulary of relationships/roles and the beginnings of a faceted classification of elemental concepts. Large-scale application of this approach requires a large, hierarchically structured vocabulary of relationships/roles and a universal faceted classification. We discuss the task of developing these tools, drawing on many sources.

1.0 Introduction

A description logic (DL) expression is a combination of elemental concepts that gives, for each elemental concept, the role it plays in the context. Adding the roles is a refinement of simple semantic factoring. A DL expression is a formal definition of a concept (Baader et al. 2017). We use a simplified version of DL expressions, similar to the Semantic Code (Perry and Kent 1958). On the other hand, we found it necessary to allow for nesting in DL expressions, indicated by [ ].
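To make the notation concrete, the following is a minimal sketch of how a simplified DL expression with nesting could be read into a data structure. It is an illustration only, not the authors' software: the function name `parse`, the choice of nested frozensets, the lowercasing of concept names, and the omission of roles are all assumptions made for this sketch.

```python
AND = "⊓"  # conjunction symbol used in the DL expressions


def parse(text: str) -> frozenset:
    """Parse 'a ⊓ [ b ⊓ c ]' into nested frozensets.

    Assumptions of this sketch: roles are ignored, concept names are
    lowercased, and [ ] encloses a nested conjunction.
    """
    pos = 0  # current read position in text

    def parse_conjunction(inside_brackets: bool) -> frozenset:
        nonlocal pos
        items = set()   # components of the conjunction at this level
        token = []      # characters of the concept name being read

        def flush() -> None:
            # Finish the current concept name, if any, and store it.
            name = "".join(token).strip().lower()
            token.clear()
            if name:
                items.add(name)

        while pos < len(text):
            ch = text[pos]
            pos += 1
            if ch == "[":
                flush()
                items.add(parse_conjunction(True))
            elif ch == "]":
                if not inside_brackets:
                    raise ValueError("unbalanced ']'")
                flush()
                return frozenset(items)
            elif ch == AND:
                flush()
            else:
                token.append(ch)
        if inside_brackets:
            raise ValueError("missing ']'")
        flush()
        return frozenset(items)

    return parse_conjunction(False)


expr = parse("law (topical area) ⊓ [ safety ⊓ Transportation ]")
print(expr)
```

Because frozensets are unordered, two expressions that list the same components in a different order parse to equal values, which is one simple way to obtain a canonical form for comparison.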
Some examples:

RVK: ZO 3700 Technology > Traffic > Traffic safety = safety ⊓ Transportation
GND: Verkehrssicherheit (Traffic safety) = safety ⊓ Transportation
RVK: PN 808 Law > … > Hazardous material = law (topical area) ⊓ [ goods ⊓ hazardous ]
DDC: 343.0938 Law > … > traffic safety = law (topical area) ⊓ [ safety ⊓ Transportation ]
DDC: 343.09322 Law > Transportation > Transportation of goods > Hazardous material = law (topical area) ⊓ [ transportation system ⊓ [ goods ⊓ hazardous ] ]
GND: Gefahrgutbefoerderungsrecht (Law on transportation of hazardous goods) = law (topical area) ⊓ [ transportation system ⊓ [ goods ⊓ hazardous ] ]

Classes or subject headings from two KOS that have the same DL expression refer to the same concept and can be mapped with skos:exactMatch. A reasoner working on a database of DL expressions can infer other relationships between concepts. For example, there is an associative relationship between RVK: PN 808 and DDC: 343.0938, since safety and hazardous are related; see Section 2.2. For the context of this work see Balakrishnan et al. 2018, 2019. For the theoretical basis see Soergel 1972, 2011, 2017.

2.0 Creating DL expressions and analyzing pair-wise mappings

2.1 Creating DL expressions for classes / subject headings: the challenges

Creating DL expressions requires a good understanding of KOS structure and considerable domain knowledge.

Understanding KOS structure. A class must be considered in its hierarchical context:

RVK: CM 5000 Information theory, cybernetics
is actually
RVK: CM 5000 Psychology > General, history and methods > Information theory, cybernetics = Information theory, cybernetics ⊓ psychology

Domain knowledge is essential; one often needs to look up a definition.
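The remark above that a reasoner can infer relationships from DL expressions can be illustrated with a minimal sketch. It rests on strong simplifying assumptions that are not part of the original work: each concept is a flat set of elemental concepts (no roles, no nesting), a concept with additional components is treated as narrower, and any shared component counts as a related match. The function name `mapping_type` and the sample sets are hypothetical; the sets reuse expressions from the paper's examples.

```python
def mapping_type(a: frozenset, b: frozenset) -> str:
    """Suggest a SKOS mapping property from concept A to concept B.

    Assumption of this sketch: A and B are flat conjunctions given as
    sets of elemental-concept names; roles and nesting are ignored.
    """
    if a == b:
        return "skos:exactMatch"
    if a < b:
        # B has every component of A plus more, so B is narrower than A.
        return "skos:narrowMatch"
    if a > b:
        # B has only some of A's components, so B is broader than A.
        return "skos:broadMatch"
    if a & b:
        # Crude heuristic: at least one shared elemental concept.
        return "skos:relatedMatch"
    return "no match"


# Expressions taken from the paper's examples, flattened to sets.
rvk_traffic_safety = frozenset({"safety", "transportation"})
gnd_traffic_safety = frozenset({"safety", "transportation"})
ddc_ferry = frozenset({"transportation system", "inland water transport",
                       "regular schedule"})
rvk_inland = frozenset({"transportation system", "inland water transport"})

print(mapping_type(rvk_traffic_safety, gnd_traffic_safety))  # skos:exactMatch
print(mapping_type(ddc_ferry, rvk_inland))                   # skos:broadMatch
print(mapping_type(rvk_inland, ddc_ferry))                   # skos:narrowMatch
```

A real reasoner would also need the roles, the nesting, and a hierarchy of elemental concepts; the subset test here only captures the case where one expression adds components to the other.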
A DL expression is crystallized domain knowledge.

DDC: 150.1 Psychology > philosophy, theory, systems, schools = philosophy, theory, systems, schools ⊓ psychology
GND: Philosophical Psychology = psychology ⊓ topics in philosophical psychology ⊓ philosophical method
GND: Historical psychology = psychology ⊓ [ topics in psych. ⊓ past ]

Ambiguity: The caption is ambiguous and is perhaps used both ways, which gives two DL expressions.

DDC: 610.82 Women in Medicine#1 = Person ⊓ female ⊓ Medicine
DDC: 610.82 Women in Medicine#2 = Person ⊓ female ⊓ Medicine
GND: Ethnomathematics#1 = math. ⊓ identifiable cultural group
GND: Ethnomathematics#2 = curriculum subject ⊓ [ culture, math. ]

2.2 Analysis of mapping pairs based on DL expressions

Given the DL expressions for each class or subject heading in a pair taken from KOS A and KOS B, a system can infer the type of mapping, as shown in the examples.

skos:exactMatch
RVK: CP 3200 General psychology > Feelings, emotion = emotion
GND: Feeling, emotion = emotion

DDC: 150.9#1 Psychology > History, [biographic treatment, biography] > = history ⊓ psychology
RVK: CM 2000 Psychology > History of psychology = history ⊓ psychology
The history part of the DDC class (without biography) is an exact match for the RVK class.

skos:narrowMatch
RVK: ZO 9300 Traffic, transport > Transportation system > Transportation of goods = transportation system ⊓ goods
GND: Law of transportation of goods = law (topical area) ⊓ [ transportation system ⊓ goods ]

skos:broadMatch
DDC: 386.6 Inland waterway and ferry transportation > Ferry transportation = transportation system ⊓ inland water transport ⊓ regular schedule
RVK: ZO 6080 Inland water transport, canals = transportation system ⊓ inland water transport

skos:relatedMatch
DDC: 150.1 Psychology > Philosophy and theory, systems and schools = philosophy or theory or discussion of systems or schools ⊓ psychology
RVK: CM 5000 Psych. > General, history and methods > Information theory, cybernetics = Information theory, cybernetics ⊓ psychology
Both DL expressions include psychology; method relates somewhat to systems or schools.

No match (mapping error caused by the mapping editor's lack of domain knowledge)
DDC: 150.9#1 Psychology > History, biographic treatment, biography > = history or biography ⊓ psychology
GND: Historical psychology = psych. ⊓ [ topics in psych. ⊓ past ]

3.0 Toward a system of relationships/roles and a universal faceted classification

Applying the approach illustrated above at large scale requires a large list of relationships/roles and a large universal faceted classification. The following sections give a flavor of what needs to be done, but the task is monumental.

3.1 Relationship types / roles

For this exploration we introduced relationship types/roles as needed. Some are quite obvious, such as , , or , but others are not so common or quite specialized, such as the following: , , , , , , , , . Our ultimate goal is either to locate each relationship/role in a standard or widely used ontology or to contribute to some inventory of relationship types / roles.

3.2 A universal faceted classification of elemental concepts

At this stage of the project, we introduced elemental concepts as needed for the DL expressions, standardizing terminology within this small set of 150 classes and subject headings. The next step would be to develop a universal faceted classification, a monumental task that would draw on many sources, including standard classifications; see Figure 1.

The nature of the classes in DDC and RVK often requires elemental concepts that are defined using "or", such as

Comparison or harmonization (with NT Comparison and NT Harmonization)
Cognition or intelligence
Philosophy or theory or discussion of viewpoints or schools (frequent subdivision in DDC)

We need Level of education, as seen from some concepts we used as components.
The International Standard Classification of Education (ISCED), maintained by UNESCO, includes a classification for Level of education. For purposes of illustration we built a hierarchy consisting mainly of ISCED concepts (ISCED does not present these as a hierarchy). We added a number of concepts, some to group ISCED concepts and some as extensions further down (Kindergarten), because we know or expect that these are needed in building DL expressions for DDC and RVK.

ISCED level 0 or 1 Early childhood education to Primary ed.
. ISCED: ISCED level 0 – Early childhood education
. . ISCED: Early childhood educational development
. . ISCED: Preprimary education
. . . Kindergarten
. ISCED: ISCED level 1 – Primary education
Secondary education
. ISCED: ISCED level 2 – Lower secondary education
. ISCED: ISCED level 3 – Upper secondary education
ISCED: ISCED level 4 – Post-secondary non-tertiary ed.
ISCED: Tertiary education
. ISCED: ISCED level 5 – Short-cycle tertiary education
. ISCED: ISCED level 6 – Bachelor’s or equivalent level
. ISCED: ISCED level 7 – Master’s or equivalent level
. ISCED: ISCED level 8 – Doctoral or equivalent level
K-12 education
. NT Kindergarten
. NT ISCED level 1 – Primary education
. NT Secondary education
Youth and adult education level
. Youth education level
. . NT Secondary education
. Adult education level
. . NT ISCED: ISCED level 4 – Post-sec. non-tertiary ed.
. . NT ISCED: Tertiary education

Figure 1. Hierarchy of Level of education

3.3 Concept definitions

Elemental concepts must be defined, particularly concepts that have been introduced specifically for DL expressions. Consider our pattern for DL expressions for subject disciplines:

environmental science = subject discipline ⊓ environment

But what to do with psychology? We could come up with

psychology = subject discipline ⊓ mental states and processes and behaviors

But that is a cryptic characterization of what psychologists study.
Better:

psychology = subject discipline ⊓ topics in psychology

Now we can create a full definition of topics in psychology and put many narrower terms under it, for example from (www.apa.org).

For another example, consider several related meanings of law: law (subject discipline), law (topical area), and Body of law, statutes.

4.0 Conclusions and future work

In our exploration, representing KOS concepts through simple DL expressions generally worked well. Comparing KOS mappings derived from DL expressions with mappings found in our database convinced us that it is worthwhile to test the idea further in a larger pilot. Such a pilot would use proper software to support developing and maintaining the systems discussed in Section 3 and to partially automate the creation of DL expressions using linguistic analysis of captions.[1]

References

Baader, Franz, Ian Horrocks, Carsten Lutz, and Uli Sattler. 2017. An Introduction to Description Logic. 1st ed. Cambridge: Cambridge University Press.

Balakrishnan, Umamaheswari, Jakob Voß, and Dagobert Soergel. 2018. “Towards Integrated Systems for KOS Management, Mapping, and Access. Coli-Conc and its Collaborative Computer-Assisted KOS Mapping Tool Cocoda.” In Challenges and Opportunities for Knowledge Organization in the Digital Age: Proceedings of the Fifteenth International ISKO Conference 9-11 July 2018 Porto, Portugal, edited by Fernanda Ribeiro and Maria Elisa Cerveira. Advances in Knowledge Organization 16. Baden-Baden: Ergon, 128-136.

Balakrishnan, Umamaheswari and Dagobert Soergel. 2019. “Concept Mapping Through a Hub: Coli-Conc Pilot Study.” Presented at ISKO-LC 2019 Conference. https://doi.org/10.5281/zenodo.3257136

Perry, James W. and Allen Kent. 1958. Tools for Machine Literature Searching: Semantic Code Dictionary, Equipment, Procedures. New York: Interscience Publishers.

Soergel, Dagobert. 1972. “A General Model for Indexing Languages: The Basis for Compatibility and Integration.” In Subject Retrieval in the Seventies, edited by Hans Hanan Wellisch and Thomas Daniel Wilson. New York: Greenwood; College Park, Md.: University of Maryland, School of Library and Information Services, 36-61.

Soergel, Dagobert. 2011. “Conceptual Foundations for Semantic Mapping and Semantic Search.” In Concepts in Context: Proceedings of the Cologne Conference on Interoperability and Semantics in Knowledge Organization July 19th - 20th, 2010, edited by Felix Boteram, Winfried Gödert, and Jessica Hubrich. Würzburg: Ergon, 13-35.

Soergel, Dagobert. 2017. “The Principle of Compositionality and Entity-Relationship Modelling: Faceted Classification in a Broader Context.” In Faceted Classification Today: Theory, Technology and End Users: Proceedings of the International UDC Seminar 2017, London (UK), 14-15 September, edited by Aida Slavic and Claudio Gnoli. Würzburg: Ergon Verlag, 43-60.

[1] The data sets created in this exploration (list of relationships/roles, partially structured list of elemental concepts, definitions, and DL expressions) are available from the authors.


Abstract

The proceedings explore knowledge organization systems and their role in knowledge organization, knowledge sharing, and information searching.

The papers cover a wide range of topics related to knowledge transfer, representation, concepts and conceptualization, social tagging, domain analysis, music classification, fiction genres, and museum organization. They discuss theoretical issues related to knowledge organization and the design, development and implementation of knowledge organizing systems, as well as practical considerations and solutions in the application of knowledge organization theory. The knowledge organization systems covered range from classification systems, thesauri, and metadata schemas to ontologies and taxonomies.
