Content

Amelie Dorn, Yalemisew Abgaz, Gerda Koch, José Luis Preza Díaz, Harvesting Knowledge from Cultural Images with Assorted Technologies: The Example of the ChIA Project in:

International Society for Knowledge Organization (ISKO), Marianne Lykke, Tanja Svarre, Mette Skov, Daniel Martínez-Ávila (Eds.)

Knowledge Organization at the Interface, pages 470-473

Proceedings of the Sixteenth International ISKO Conference, 2020 Aalborg, Denmark

1st edition 2020, ISBN print: 978-3-95650-775-5, ISBN online: 978-3-95650-776-2, https://doi.org/10.5771/9783956507762-470

Series: Advances in Knowledge Organization, vol. 17

Bibliographic information
Amelie Dorn – ACDH-CH, Austrian Academy of Sciences, Austria
Yalemisew Abgaz – Adapt Centre, Dublin City University, Ireland
Gerda Koch – AIT Forschungsgesellschaft mbH, Europeana Local-AT, Austria
José Luis Preza Díaz – ACDH-CH, Austrian Academy of Sciences, Austria

Harvesting Knowledge from Cultural Images with Assorted Technologies: The Example of the ChIA Project

Abstract: In recent years, cross-disciplinary collaboration for increased knowledge extraction from diverse data sources has been at the heart of interdisciplinary research and related fields, such as Digital Humanities. In particular, knowledge extraction and preservation from cultural heritage data has received increased attention. In this paper, we introduce the ChIA project, a cross-disciplinary Digital Humanities project that aims to perform knowledge design, knowledge extraction and organisation by applying semantic and Artificial Intelligence (AI) tools to a set of Europeana cultural food images. The collaborative endeavour aims to increase access to cultural knowledge and the possibilities for analysing images for different user groups and stakeholders, such as content providers, or for educational purposes.

1.0 Introduction

Since the recent European Year of Cultural Heritage by the European Commission¹, science and technology have played a major role in cultural heritage, in both physical and digital realms. Not only preserving, but also capturing, organising and making cultural knowledge accessible to different user groups has been a widely addressed topic across disciplines in recent years (cf. Hardman et al. 2009; Behli, Bouras, and Foufou 2018). The ChIA project² is a current endeavour between the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH OeAW, AT)³, the Adapt Centre, Dublin City University (IE)⁴ and the cultural content aggregator Europeana Local - Österreich⁵.
The project, which is at an early stage after inception, focuses on integrating semantic technologies and image analysis to enhance the accessibility of cultural images. Through digitisation, cultural images are being transformed from tangible to digital intangible resources and become accessible via different platforms. This significantly increases their availability to the wider public. The transformation, however, mainly focuses on the conversion of the images into their digital representations with generic metadata (e.g. title, creator, caption), overlooking much of the content of the images themselves. Search and retrieval thus become restricted to the generic and associated metadata, failing to explain the relevance of the search results. Often, digital cultural images lack a systematic machine-readable description of the cultural and social aspects, as well as semantic enrichment and a well-defined interlinking of the cultural knowledge embedded. In the ChIA project, we aim to increase the knowledge that can be drawn from such images and enable its presentation in a structured way through the support of different digital tools and methods.

1 https://ec.europa.eu/culture/news/commission-takes-stock-successful-european-year-cultural-heritage-2018_en [last access: 01.02.2020]
2 https://chia.acdh.oeaw.ac.at/ [last access: 01.02.2020]
3 https://www.oeaw.ac.at/acdh/ [last access: 20.09.2019]
4 https://www.adaptcentre.ie/ [last access: 20.09.2019]
5 Europeana Local - Österreich. Verbundportal für lokale und regionale Kultur- und Wissenschaftsdaten. Website: http://www.europeana-local.at/ [last access: 27.09.2019]

2.0 The method

The ChIA project draws on expertise from an interdisciplinary team, with backgrounds in semantic technologies, artificial intelligence, cultural heritage aggregation and linguistics.
In our approach, recent advancements in technology, including deep learning and computer vision (CV), are applied to analyse and capture the content of a selected set of images from Europeana, using pattern matching algorithms to generate new metadata that enables better search and retrieval. Most of these methods, however, focus on automatic object detection, typically in the form of predicted concepts and their probabilities, and still offer little coverage of the social and cultural aspects of the images.

The ChIA system (see Figure 1) aims to combine these different tools and methods that support knowledge extraction and organisation. Semantic tools enable a unified representation of the entities and foster accurate interpretation; knowledge graphs can be generated by combining all metadata; visual search allows users to search for similar images; and a proposed chatbot enables interactive communication, exposing the data of Europeana in new and innovative ways.

Figure 1: Visual representation of the ChIA system. Image © Yalemisew Abgaz 2019.

Another aspect that makes the project unique is its specific focus on food-related images (see Figure 2 for an example).

Figure 2: Example of a cultural food image from the Europeana collection. Source: https://www.europeana.eu/portal/record/90402/SK_A_4070.html (Image licence: CC-PD)

The data, provided by Europeana, has previously been curated and provided to Europeana by cultural organisations, including museums, archives, libraries and galleries. As “food” is a rather flexible concept, we specifically concentrate on food that is edible by humans, and on images that present it in a cultural setting, which typically involves persons, locations, objects or a combination thereof. Additionally, “culture” may also encompass depictions of family, societal traditions or customs.
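To illustrate how predicted concepts and their probabilities might be turned into new metadata, the following minimal Python sketch merges confident predictions into an existing record. It is not the ChIA implementation: the function name `augment_metadata`, the field names, the threshold of 0.8 and the sample values are all hypothetical and chosen purely for illustration.

```python
def augment_metadata(record, predictions, min_probability=0.8):
    """Return a copy of `record` extended with confident predicted concepts.

    `record` is a dict of generic metadata (e.g. title, creator).
    `predictions` is a list of dicts with the hypothetical keys
    "concept" and "probability", as a computer vision service might return.
    """
    # Keep only predictions at or above the threshold, most confident first.
    kept = sorted(
        (p for p in predictions if p["probability"] >= min_probability),
        key=lambda p: p["probability"],
        reverse=True,
    )
    augmented = dict(record)  # shallow copy; the original record is untouched
    augmented["predicted_concepts"] = [
        {"concept": p["concept"], "probability": p["probability"]} for p in kept
    ]
    return augmented


# Hypothetical sample data, for illustration only.
record = {"title": "Still life", "creator": "Unknown"}
predictions = [
    {"concept": "table", "probability": 0.91},
    {"concept": "dog", "probability": 0.42},
    {"concept": "bread", "probability": 0.97},
]
result = augment_metadata(record, predictions)
```

Keeping the probability alongside each concept, rather than storing bare labels, would allow a search interface to rank or filter results by confidence later on.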
3.0 The aim and scope of the study

The main content objective of ChIA is to develop knowledge design based on Europeana data derived via the Europeana API and to explore and analyse cultural content in an experimental setting using AI. In this setting, we aim to enable access to and analysis of cultural images by widening search capabilities through a combination of semantic technologies and augmented metadata, together with interactive tools such as chatbots, knowledge graphs and visual analysis. More generally, the project functions as a testbed and playground for experimenting with Artificial Intelligence (AI) in a digital cultural context (cf. Schnapp 2014). The new knowledge design models will benefit both scientists and other actor groups, such as Europeana content providers, or serve educational purposes, by allowing much more complex searches than simple metadata-based solutions.

Another objective is to establish an intermediate layer infrastructure in order to gain additional metadata from Europeana images, enhance the existing metadata and connect AI services from both industry and open source tools for comparative experimentation. Our approach is novel in that the combination of proposed tools and the resulting knowledge organisation system has not previously been applied to images in the framework of Europeana, a digital cultural heritage collection. In addition, the topic of choice, food, is an important aspect of mankind’s tangible and intangible cultural heritage. Making cultural knowledge from images available beyond given metadata enables wider-reaching access as well as increased possibilities for analysis by both humans and machines (Abgaz et al. 2018). Given the time and resources available to the project, we aim to test different scenarios and provide test reports.
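As a rough illustration of how image records might be requested via the Europeana API, the Python sketch below only builds a search URL and does not send any request. The endpoint and the parameter names (`wskey`, `query`, `qf`, `rows`, `start`) reflect our understanding of the public Europeana Search API and should be checked against the current documentation; the helper name `build_search_url`, the query term and the placeholder key are hypothetical, and nothing here describes the project's actual harvesting pipeline.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Europeana Search API; verify before use.
SEARCH_ENDPOINT = "https://api.europeana.eu/record/v2/search.json"


def build_search_url(api_key, query, rows=12, start=1):
    """Build a search URL restricted to image records (no request is sent)."""
    params = [
        ("wskey", api_key),    # personal API key (placeholder in this sketch)
        ("query", query),      # free-text search term
        ("qf", "TYPE:IMAGE"),  # refinement: only records typed as images
        ("rows", rows),        # number of results per page
        ("start", start),      # 1-based index of the first result
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# Hypothetical usage: first five image records matching "food".
url = build_search_url("YOUR_API_KEY", "food", rows=5)
```

Separating URL construction from the network call keeps the query logic easy to test and makes it simple to swap in other search terms when assembling a test set.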
The resulting knowledge organisation system (KOS), the ChIA system (see also Figure 1), aims to unite the different aspects of semantic and AI technologies and to give users improved access to, and improved possibilities for interaction with, Europeana cultural images depicting food concepts. On the one hand, the system will integrate thesauri with a special focus on food and descriptions of subjects represented in images. Available thesauri will be evaluated and finally reused in a new skosified and open ChIA vocabulary for image description. This should enable and improve resource discovery of cultural images that depict food-related subjects. The final ChIA food & culture vocabulary will be available in standard SKOS (Simple Knowledge Organization System) format and open for reuse via web services for cataloguing purposes. In line with this, domain-specific relationships between the SKOS concepts will be captured and represented using a separate OWL (Web Ontology Language) ontology. The ChIA food concepts and the ontology will also serve as input to Computer Vision concepts and for further user interaction and data enrichment. On the other hand, the ChIA system also integrates AI results, such as Computer Vision predicted concepts for the selected images analysed. Through the integration of a chatbot, users will also be able to engage with the images in a more interactive way.

4.0 Conclusion and outlook

Finally, ChIA aims to draw on learnings from, and make use of, larger existing European infrastructures such as DARIAH⁶ and E-RIHS⁷. By using cultural images related to edible food from the Europeana database, the project can draw on a considerable quantity of cultural heritage data derived from a great variety of cultural content holders (museums, archives, libraries, botanical gardens) across Europe. This will provide a vast starting point for analysis, but also shows the need to precisely target the research for dedicated user groups.
In addition, the initial analysis pointed out the importance of the selection process for accumulating the optimal test set of raw data for ChIA purposes. It is envisaged that content analysis will start with investigating metadata richness within the first test sets, accompanied by image analysis using computer vision functionality and artificial intelligence.

References

Abgaz, Yalemisew, Amelie Dorn, Barbara Piringer, Eveline Wandl-Vogt, and Andy Way. 2018. “Semantic Modelling and Publishing of Traditional Data Collection Questionnaires and Answers.” Information 9, no. 12: 297. https://doi.org/10.3390/info9120297.

Behli, Abdelhak, Abdelaziz Bouras, and Sebti Foufou. 2018. “Leveraging Known Data for Missing Label Prediction in Cultural Heritage Context.” Applied Sciences 8, no. 10: 1768. https://doi.org/10.3390/app8101768.

Hardman, Lynda, Lora Aroyo, Jacco van Ossenbruggen, and Eero Hyvönen. 2009. “Using AI to Access and Experience Cultural Heritage.” IEEE Intelligent Systems 24, no. 2: 23-25.

Schnapp, Jeffrey T. 2014. “Knowledge Design.” In Herrenhausen Lectures. Hannover: Volkswagenstiftung. http://jeffreyschnapp.com/wp-content/uploads/2011/06/HH_lectures_Schnapp_01.pdf

6 https://www.dariah.eu/ [last access: 20.09.2019]
7 http://www.e-rihs.eu/ [last access: 20.09.2019]

Abstract

The proceedings explore knowledge organization systems and their role in knowledge organization, knowledge sharing, and information searching.

The papers cover a wide range of topics related to knowledge transfer, representation, concepts and conceptualization, social tagging, domain analysis, music classification, fiction genres and museum organization. They discuss theoretical issues related to knowledge organization and the design, development and implementation of knowledge organizing systems, as well as practical considerations and solutions in the application of knowledge organization theory. They cover a range of knowledge organization systems, from classification systems, thesauri and metadata schemas to ontologies and taxonomies.
