Ceri Binding, Claudio Gnoli, Gabriele Merli, Marcin Trzmielewski, Douglas Tudhope, Integrative Levels Classification as a Networked KOS: A SKOS Representation of ILC2 in:

International Society for Knowledge Organization (ISKO), Marianne Lykke, Tanja Svarre, Mette Skov, Daniel Martínez-Ávila (Eds.)

Knowledge Organization at the Interface, page 49 - 58

Proceedings of the Sixteenth International ISKO Conference, 2020 Aalborg, Denmark

1. Edition 2020, ISBN print: 978-3-95650-775-5, ISBN online: 978-3-95650-776-2, https://doi.org/10.5771/9783956507762-49

Series: Advances in Knowledge Organization, vol. 17

Bibliographic information
Ceri Binding – University of South Wales, UK
Claudio Gnoli – University of Pavia, Italy
Gabriele Merli – University of Pavia, Italy
Marcin Trzmielewski – Paul-Valéry University of Montpellier 3, France
Douglas Tudhope – University of South Wales, UK

Integrative Levels Classification as a Networked KOS
A SKOS Representation of ILC2

Abstract: Recently, there has been a need to move knowledge organization systems (KOS) to online applications, using Semantic Web technologies, in order to optimize indexing and searching. The present paper reports on the representation of the Integrative Levels Classification (ILC) as a networked KOS, through conversion of its second edition into the W3C standard SKOS (Simple Knowledge Organization System) format.

1.0 Introduction

In recent years, there has been an increased need to move knowledge organization systems (KOS) to online applications, such as library catalogues or research data repositories, to optimize indexing and searching. This need leads to representing the traditional organization of concepts in the syntax of Semantic Web technologies (Binding and Tudhope 2016; Trzmielewski and Gnoli 2019). As Peponakis et al. (2019) highlight, enumerative disciplinary classifications and subject headings are harder to represent as machine processable and expressive semantic networks, while thesauri are more suitable for this purpose. Therefore, it is also interesting to observe which solutions may be adopted with freely faceted classifications, which have a richer structure closer to that of thesauri and a greater expressive power.

The present paper reports on the representation of the Integrative Levels Classification (ILC) as a networked KOS, through conversion of its second edition into the W3C standard SKOS (Simple Knowledge Organization System) format. The SKOS work on standards for thesauri and other knowledge organization systems grew out of the EC FP5 SWAD-Europe project.
The aim was to facilitate the migration of KOSs to the Semantic Web, and work was carried forward by the W3C Semantic Web Best Practices and Deployment Working Group. The SKOS standard is published as a W3C Recommendation (W3C 2009). While SKOS was designed with thesauri primarily in mind, the availability of a relatively simple and accessible standard, expressible in RDF, has undoubtedly contributed to a major interest in KOSs generally for Semantic Web and linked data application development, and also for the mapping (and linking) of one KOS to another. Representation in SKOS (and RDF) exposes KOSs to a wide potential audience of developers and users. This was the rationale for investigating the representation in SKOS and making available a machine readable version of ILC.

2.0 The features of ILC

The Integrative Levels Classification is a general KOS that has been under development since 2004 by an international team of scholars, including the authors of this paper. It draws from the tradition of faceted bibliographic classifications as developed by Ranganathan and the Classification Research Group. However, it differs from these mainly in listing phenomena, such as iron, lakes, trade unions or orchestras, instead of disciplines, such as chemistry, geography, economics or musicology (Gnoli 2016). This has important consequences, both theoretical and applied. One of them is that the same classes can be applied to bibliographic records (e.g. an article on bagpipes), museum objects (e.g. a bagpipe specimen), products (e.g. a bagpipe model offered on a maker's website) and so on, possibly combined with additional dimensions (“bagpipes, in articles”; “bagpipes, in museums”...: see Gnoli, Park, and Ledl 2019).

Another original feature is that ILC facets are not only special facets limited to a specific main class (e.g. the processes of biology, or the materials of mining), but also free facets that can be used to connect any two classes from the whole spectrum of knowledge (e.g. cervid populations affected by road traffic). This KOS variety, described by Austin (1976) as freely faceted classification, offers an expressive power very similar to that of a full language; at the same time, it implies a certain amount of syntactic complexity that is more demanding to represent carefully as linked data (see next section).

The first stable edition of the system (ILC1) was published in 2011 and consisted of 7,052 classes and facets. In September 2019, the new edition under development was frozen to become the second stable edition (ILC2), consisting of 10,845 classes and facets (Gnoli 2020). Compared to ILC1, it features some renamed or moved main classes, better development of many subclasses, rearrangement and new definitions of various facet categories, a distinction between facets by nature (“wheels” as parts of vehicles) and facets by function (“vehicles, with wheels”), and other changes of detail in notation. These changes are described by Park et al. (2020). Specific fields for mapping between different ILC editions, and between ILC and the Dewey Decimal Classification, are provided in the ILC MySQL database.

ILC features involve a rich semantic structure with many components: basic classes (a-y), common facets (0-9), special facets (90-99), expected foci, deictics (A-Z), etc. While not all these structural components are provided for in the standard SKOS format, good compromises and solutions can be found for many of them (Gnoli et al. 2011).

3.0 Procedures

It was necessary to transform the working representation of ILC2 used by the editorial team to SKOS and RDF. We were able to draw on previous experience by the Hypermedia Research Group at the University of South Wales with publishing national UK heritage thesauri (Heritage Data n.d.) as SKOS based linked data (Binding and Tudhope 2016).
In order to generate a SKOS representation of ILC2, it was necessary to transform the relational expression of the ILC2 classification system (MySQL, exported as CSV). This was achieved using the STELETO transformation tool developed previously (Binding, Tudhope and Vlachidis 2018). STELETO is an open source, cross-platform command line application that performs bulk transformation of delimited tabular text data into any other textual format via a user-defined template (Binding 2019).

Due to the complexity of a faceted classification system such as ILC, bespoke rules were added to the process for ILC purposes. For example, it was necessary to derive the hierarchical structure of the classification from the notational codes. Database fields for synonyms and descriptions of classes, of facet indicators and of foci have been treated variously in order to obtain meaningful labels. Mappings to DDC classes are available for all ILC main classes and for most 3-digit subdivisions of DDC (000-999). These have been linked to OCLC DDC URIs.

The following solutions have been adopted:

• records having purely alphabetic notation values (basic classes) are modelled as skos:Concept
• records having purely numeric notation values (common facets) are modelled as rdf:Property, using the notation to determine the subproperty/superproperty relationships. Single number notations (i.e. the fundamental categories) are sub-properties of skos:related. These properties are modelled with domain and range specified as skos:Concept.
• records having a combination of alphabetic and numeric notation (special facets) are also modelled as rdf:Property, with the domain being the alphabetic part of the notation and the range being the value from the ‘foci’ field (if present, otherwise skos:Concept). E.g. for m981 (“aged years”) the domain is m (“organisms”), the range is an (“quantities”) and the super-property is m98 (“developmental stage”).

4.0 A metaphysical question: what is the top class of all phenomena?

A basic SKOS relationship is skos:broader, by which any class can be related to its parent class. For example, wi “pots” has a skos:broader relationship to w “artifacts” and, vice versa, w “artifacts” has a skos:narrower relationship to wi “pots” and other subclasses. We generated these relationships automatically by exploiting the expressivity of ILC positional notation, where every additional digit means an additional rank of specificity.

Once we came to the main classes expressed by a single letter, such as w “artifacts” or h “celestial bodies”, we had to decide whether these in turn have any skos:broader relationship. ILC2 also has a class * meaning “absolute, apeiron, the undifferentiated whole” that could be seen as the primordial top class of which all phenomena are subdivisions. This would have implied that all single-letter classes would have a skos:broader relationship to class *. Draft visualizations of this architecture, however, looked confusing for the expected common users, as they would display a very abstract, philosophical notion with much greater prominence than classes of more common usage. We thus opted not to record this relationship in the SKOS version of ILC2.

On the other hand, this has stimulated interesting considerations on how very general philosophical notions, such as “things in themselves” or “phenomena”, may be expressed in ILC.
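
The notation-driven rules described in sections 3.0 and 4.0 can be illustrated with a minimal Python sketch. It is not the STELETO template actually used for the conversion. The function names kind, parent and domain are hypothetical, and two points are assumptions rather than statements from the paper: that the parent is always obtained by dropping the last character of the notation, and that a special facet whose shortened notation is no longer a special facet receives no super-property here.

```python
from typing import Optional

SKOS_RELATED = "skos:related"


def kind(notation: str) -> str:
    """Classify a notation according to the three modelling rules above."""
    if not notation.isascii():
        return "other"
    if notation.isalpha():
        return "concept"        # basic class, modelled as skos:Concept
    if notation.isdigit():
        return "common_facet"   # common facet, modelled as rdf:Property
    if notation.isalnum():
        return "special_facet"  # letters and digits mixed, rdf:Property
    return "other"              # e.g. the class "*"


def parent(notation: str) -> Optional[str]:
    """Return the skos:broader class or the super-property, if any.

    Assumption: one character is dropped per rank of specificity.
    """
    k = kind(notation)
    if k == "concept":
        # Single-letter main classes are top concepts with no skos:broader.
        return notation[:-1] if len(notation) > 1 else None
    if k == "common_facet":
        # Single-number notations are sub-properties of skos:related.
        return notation[:-1] if len(notation) > 1 else SKOS_RELATED
    if k == "special_facet":
        shorter = notation[:-1]
        # Assumption: stop once the shortened notation is a plain class.
        return shorter if kind(shorter) == "special_facet" else None
    return None


def domain(notation: str) -> Optional[str]:
    """Return the leading alphabetic part of a special facet notation."""
    if kind(notation) != "special_facet":
        return None
    i = 0
    while i < len(notation) and notation[i].isalpha():
        i += 1
    return notation[:i] or None


if __name__ == "__main__":
    for n in ["w", "wi", "jUxf", "9", "98", "m981", "*"]:
        print(n, kind(n), parent(n), domain(n))
```

With these assumptions, wi resolves to the parent w, the single-letter class w has no parent, and m981 yields the domain m and the super-property m98, in line with the examples given in the text.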
A provisional view, which could be implemented in ILC3, is that a top class meaning “being” can include both the “absolute”, that is noumena or things in themselves in philosophical terminology, and “phenomena”, meant as classes of differentiated named entities, in turn including all common main classes; of these, some are “real”, that is actually existent, and can be specified by the deictic Y already available in ILC.

5.0 Publication details

The conversion created a total of 82,534 triples describing 8,990 concepts (including 52 top concepts) and 943 properties (modelled as hierarchical specializations of skos:related), a total of 9,933 items.

URIs for individual classes and facets refer to the online schedules previously available through a PHP interface. To allow online navigation, however, these have dynamic URLs of the form http://www.iskoi.org/ilc/2/no.php?no=jUxf (for class jUxf “Bay of Fundy” taken as an example). In SKOS data, the dynamic form has been changed to a static one: http://www.iskoi.org/ilc/2/class/jUxf. This required setting a mod_rewrite redirect instruction on the iskoi.org Apache server, so that referenced URIs are automatically converted to the dynamic form and the appropriate information is displayed.

SKOS data for ILC2 are available from http://www.iskoi.org/ilc/skos.php in Turtle, NTriples or RDF syntax. They are also available at the BARTOC (Basel Register of Thesauri, Ontologies and Classifications) repository at http://bartoc-skosmos.unibas.ch/ILC/en/ as part of the long-term experimentation with application of ILC to BARTOC indexing (Ledl and Gnoli 2017). The SKOS version is published using Skosmos, an open source tool developed at the National Library of Finland (http://skosmos.org/). This produces various flavours of RDF output, including the commonly used NTriples format.

6.0 Visualizations

A text based visualization is available from BARTOC.
Graphical displays can be created by importing the generated ILC2 NTriples RDF data file into the AllegroGraph Gruff tool. Some illustrative examples of ILC2 concepts and properties follow, where we can see the SKOS NTriples output and corresponding graph-based visualisations using Gruff and the equivalent view from BARTOC. Note that we can observe in the SKOS output the main URI for the ILC2 SKOS scheme at iskoi.org, the concept being visualised with its preferred label (“Bay of Fundy”), its notation (jUxf), broader concepts (“Atlantic Ocean”) and their notation.

    @prefix rdfs: <…> .
    @prefix skos: <…> .
    @prefix ilc2: <…> .

    <…> rdfs:label "Integrative Levels Classification (ILC)"@en ;
        skos:prefLabel "Integrative Levels Classification (ILC)"@en ;
        a skos:ConceptScheme .

    ilc2:jUxf rdfs:seeAlso <…> ;
        skos:broader ilc2:jUx ;
        skos:notation "jUxf" ;
        rdfs:label "Bay of Fundy"@en ;
        skos:prefLabel "Bay of Fundy"@en ;
        skos:inScheme <…> ;
        a skos:Concept .

    ilc2:jUx skos:notation "jUx" ;
        rdfs:label "Atlantic Ocean"@en ;
        skos:prefLabel "Atlantic Ocean"@en ;
        a skos:Concept ;
        skos:narrower ilc2:jUxf .

Figure 1: Skosmos output and Gruff visualisations and corresponding BARTOC view for Bay of Fundy

A more elaborate example with the concept of polypteriformes shows more of a hierarchical tree. We also see an example of a descriptive note in the Gruff visualisation.

    @prefix ilc2: <…> .
    @prefix skos: <…> .
    @prefix rdfs: <…> .

    ilc2:mqvh skos:notation "mqvh" ;
        rdfs:label "ray-finned fish"@en ;
        skos:prefLabel "ray-finned fish"@en ;
        a skos:Concept ;
        skos:narrower ilc2:mqvhb .

    ilc2:mqvhb rdfs:seeAlso <…> ;
        skos:note "including bichirs, reedfish"@en ;
        skos:broader ilc2:mqvh ;
        skos:notation "mqvhb" ;
        rdfs:label "polypteriformes"@en ;
        skos:prefLabel "polypteriformes"@en ;
        skos:inScheme <…> ;
        a skos:Concept .

    <…> rdfs:label "Integrative Levels Classification (ILC)"@en ;
        skos:prefLabel "Integrative Levels Classification (ILC)"@en ;
        a skos:ConceptScheme .

Figure 2: Skosmos output and Gruff visualizations and corresponding BARTOC view for polypteriformes

A yet more complex example shows the variety of relationships within ILC2 and connections between concepts. For example, there are associative relationships (skos:related) between “stars” and “star clusters” and between “stars” and “plasma”.

    @prefix ilc2: <…> .
    @prefix skos: <…> .
    @prefix rdfs: <…> .

    ilc2:h skos:notation "h" ;
        rdfs:label "celestial bodies"@en ;
        skos:prefLabel "celestial bodies"@en ;
        a skos:Concept ;
        skos:narrower ilc2:hl .

    ilc2:hu skos:notation "hu" ;
        rdfs:label "star clusters"@en ;
        skos:prefLabel "star clusters"@en ;
        a skos:Concept ;
        skos:related ilc2:hl .

    ilc2:hlb skos:notation "hlb" ;
        rdfs:label "subdwarf stars"@en ;
        skos:prefLabel "subdwarf stars"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    ilc2:hlg skos:notation "hlg" ;
        rdfs:label "giant stars"@en ;
        skos:prefLabel "giant stars"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    ilc2:hlj skos:notation "hlj" ;
        rdfs:label "supergiant stars"@en ;
        skos:prefLabel "supergiant stars"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    ilc2:hl skos:narrower ilc2:hlU, ilc2:hlb, ilc2:hlg, ilc2:hlf, ilc2:hlj, ilc2:hlh, ilc2:hld, ilc2:hlk, ilc2:hla ;
        skos:scopeNote "celestial bodies where nuclear fusion occurs"@en ;
        skos:related ilc2:hu, ilc2:gf ;
        skos:inScheme <…> ;
        rdfs:label "stars"@en ;
        rdfs:seeAlso <…> ;
        skos:prefLabel "stars"@en ;
        skos:notation "hl" ;
        a skos:Concept ;
        skos:broader ilc2:h .

    ilc2:gf skos:notation "gf" ;
        rdfs:label "plasma"@en ;
        skos:prefLabel "plasma"@en ;
        a skos:Concept ;
        skos:related ilc2:hl .

    ilc2:hlh skos:notation "hlh" ;
        rdfs:label "bright giant stars"@en ;
        skos:prefLabel "bright giant stars"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    <…> rdfs:label "Integrative Levels Classification (ILC)"@en ;
        skos:prefLabel "Integrative Levels Classification (ILC)"@en ;
        a skos:ConceptScheme .

    ilc2:hla skos:notation "hla" ;
        rdfs:label "attributes of #hla"@en ;
        skos:prefLabel "attributes of #hla"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    ilc2:hlf skos:notation "hlf" ;
        rdfs:label "subgiant stars"@en ;
        skos:prefLabel "subgiant stars"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    ilc2:hlk skos:notation "hlk" ;
        rdfs:label "hypergiant stars"@en ;
        skos:prefLabel "hypergiant stars"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    ilc2:hlU skos:notation "hlU" ;
        rdfs:label "the Sun"@en ;
        skos:prefLabel "the Sun"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

    ilc2:hld skos:notation "hld" ;
        rdfs:label "dwarf stars"@en ;
        skos:prefLabel "dwarf stars"@en ;
        a skos:Concept ;
        skos:broader ilc2:hl .

Figure 3: Skosmos output and Gruff visualizations and corresponding BARTOC view for stars

7.0 Conclusion

Previous experience at the University of South Wales with the STELETO transformation tool has made it possible to handle the complex syntactic structures of a freely faceted classification, such as ILC, and to produce an appropriate representation of them as SKOS. On the other hand, full management of concept combinations according to ILC syntax is limited by the expressiveness of the SKOS format itself, as already discussed by Gnoli et al. (2011).

Representation of a freely faceted classification as SKOS is especially useful for the purposes of data exchange in a standard international format, making it available on the Web as linked data in view of new applications. Tools for visualization of semantic structures are another benefit of conversion to SKOS. Special applications, such as PHP scripts for navigation of ILC schedules as available on the iskoi.org website, can further exploit its expressive power.

Acknowledgments

We are grateful to Andreas Ledl for publication in BARTOC Skosmos, and to Riccardo Ridi for discussion on philosophical aspects of noumena and phenomena.

References

Austin, Derek. 1976. “The CRG Research Into a Freely Faceted Scheme.” In Classification in the 1970s: A Second Look, edited by Arthur Maltby. London: Bingley, 158-194.

Binding, Ceri. 2019. “STELETO: Convert Input Data to Any Textual Output Format Via a Custom Template.” GitHub. https://github.com/cbinding/STELETO.

Binding, Ceri and Douglas Tudhope. 2016. “Improving Interoperability Using Vocabulary Linked Data.” International Journal on Digital Libraries 17: 5-21.

Binding, Ceri, Douglas Tudhope and Andreas Vlachidis. 2018. “A Study of Semantic Integration Across Archaeological Data and Reports in Different Languages.” Journal of Information Science 45: 364-386.

Gnoli, Claudio. 2016. “Classifying Phenomena, Part 1: Dimensions.” Knowledge Organization 43: 403-415.

Gnoli, Claudio. 2020. “Integrative Levels Classification.” In ISKO Encyclopedia of Knowledge Organization, edited by Birger Hjørland and Claudio Gnoli. http://www.isko.org/cyclo/ilc.

Gnoli, Claudio, Tom Pullmann, Philippe Cousson, Gabriele Merli and Rick Szostak. 2011. “Representing the Structural Elements of a Freely Faceted Classification.” In Classification and Ontology: Formal Approaches and Access to Knowledge: Proceedings of the International UDC Seminar 19-20 September 2011 The Hague, edited by Aida Slavic and Edgardo Civallero. Würzburg: Ergon, 193-205.

Gnoli, Claudio, Ziyoung Park and Andreas Ledl. 2019. “Dimensional Analysis of Subjects: Indexing KOSs in BARTOC by Phenomena, Perspectives, Documents and Collections.” In 1st Low Countries ISKO Conference, Brussels. http://isko-lc.org/conference-programme/.

Heritage Data. n.d. “Linked Data Vocabularies for Cultural Heritage.” https://www.heritagedata.org/blog/.

Ledl, Andreas and Claudio Gnoli. 2017. “Indexing KOSs in BARTOC by a Disciplinary and a Phenomenon-Based Classification: Preliminary Considerations.” In Faceted Classification Today: Theory, Technology and End Users: Proceedings of the International UDC Seminar 14-15 Sept. 2017, London, edited by Aida Slavic and Claudio Gnoli. Würzburg: Ergon, 109-117.

Park, Ziyoung, Claudio Gnoli and Daniele P. Morelli. 2020. “The Second Edition of the Integrative Levels Classification.” In NKOS Workshop at DCMI 2019 25 September 2019 Seoul. Journal of Data and Information Science 5: 39-50.

Peponakis, Manolis, Anna Mastora, Sarantos Kapidakis and Martin Doerr. 2019. “Expressiveness and Machine Processability of Knowledge Organization Systems (KOS): An Analysis of Concepts and Relations.” International Journal on Digital Libraries 20: 433-452.

Trzmielewski, Marcin and Claudio Gnoli. 2019. “Une Classification Interdisciplinaire Pour L’échange et la Médiation des Données Ouvertes de la Recherche.” In 12ème Colloque International D’ISKO-France: Données et Mégadonnées Ouvertes en SHS: De Nouveaux Enjeux Pour l’État et l’Organisation des Connaissances? 9-11 October 2019 Montpellier. Archive Ouverte HAL. https://hal.archives-ouvertes.fr/hal-02307108.

W3C. 2009. SKOS: Simple Knowledge Organization System Reference, eds. Alistair Miles and Sean Bechhofer. http://www.w3.org/TR/skos-reference/.

Abstract

The proceedings explore knowledge organization systems and their role in knowledge organization, knowledge sharing, and information searching.

The papers cover a wide range of topics related to knowledge transfer, representation, concepts and conceptualization, social tagging, domain analysis, music classification, fiction genres and museum organization. The papers discuss theoretical issues related to knowledge organization and the design, development and implementation of knowledge organizing systems, as well as practical considerations and solutions in the application of knowledge organization theory. The knowledge organization systems covered range from classification systems, thesauri and metadata schemas to ontologies and taxonomies.
