ISKO 13’s Bookshelf: Knowledge Organization, the Science, Thrives— An Editorial

Krakow, Poland was the host city for the 13th International ISKO Conference held May 19-22, 2014. Conference-goers attended a grand opening session in the Collegium Novum at the Jagiellonian University (JU); the conference was included as part of the celebrations of the 650th anniversary of the JU. Three full days of research presentations followed in the venue of the Institute of Information and Library Science at the JU. In honor of the 25th anniversary of ISKO’s founding, attendees were treated to a talk at the opening session by founding scientist Dr. Ingetraut Dahlberg, concerning the bases of the science of knowledge organization (KO). The conference ended with a panel discussion on the future of ISKO (see Green 2014). The high level of research and interaction highlighted at the conference is evidence of the vibrance of ISKO and of the still emergent science of knowledge organization. Since 2008 I have used the occasion of the international conference to bring a domain analytical lens to bear on some core characteristics of the ongoing evolution of knowledge organization (see Smiraglia 2008, 2010, 2013). Using the rubric of “ISKO’s Bookshelf,” my analyses are aimed at a key research question: “what are the contributors reading (or perhaps I mean citing)?” and its corollary, “what can we observe in this manner about shifts in the intension and extension of KO as a domain?” Those questions inform the analysis presented in this editorial. The conference proceedings (Babik 2014) were published in print at the conference and subsequently became available online at the Ergon-Verlag ISKO Members’ portal at http://www.ergon-verlag.de/isko_ko/. There are seventy-six papers represented in the proceedings (one paper has only a reference to its publication elsewhere). The conference program lists seventy-seven papers and twenty-three posters. This analysis is based on the papers printed in the proceedings. 
As before, I was constrained to manually index the proceedings in order to analyze the citations in the papers, because Thomson Reuters Web of Science is not indexing ISKO proceedings. However, I am pleased to learn that Elsevier’s SCOPUS is indexing ISKO international conference proceedings, which will be a great boon to dissemination of the core literature of KO. The original spreadsheet containing the references from all of the papers can be found on my blog at http://lazykoblog.wordpress.com/. It continues to be a problem that inconsistent editing of the proceedings, particularly with regard to citation practice (which, in this case, was not standardized), requires much manual cleaning of the data before analysis can proceed.


International presence and thematic foci
This was the first international ISKO conference to be held in Poland, and it was very well-attended by participants from all over the world. To get a sense of the national affiliations represented, the country of affiliation of each first author was recorded. With the caveat that this method misses cases of international collaboration, Figure 1 shows the twenty countries represented.
Roughly a quarter of the contributions came from the United States, with nearly another third from Poland and Brazil combined. It is not unusual for the host country to have a bulge in its numbers, and here we see 17% from Poland, or roughly a sixth. Brazil, host country of the 2016 conference, had 13%. India, host country in 2012, dropped from 18% (Smiraglia 2013) to 5%. There were no other large clusters, although this time we see Singapore, Iran, Nigeria, Romania, Hungary and Taiwan, but nothing from Morocco or Algeria, which were newcomers in 2012.
Each host country's organizing committee generates its own conference theme and then subsequently provides thematic monikers for panel sessions. These can be used to get a sense of the thematic core of the conference, as a way of determining the parameters of the extension of the domain at this one point in time. (Co-word analysis is used later, below, to analyze the intension.) Table 1 shows conference themes from the program together with the numbers of papers in each. An interesting note is that a distinction is made between systems (KOS) and tools (thesauri, classifications, taxonomies, ontologies, terminologies), which usually are described as KOS. Countries of affiliation associated with each theme are visualized in Figure 2.
The visualization shows how well spread the themes are across global boundaries. Only two categories were country-specific: all of the papers on automated clas-

Citation analysis
There were a total of 1,217 references in 76 papers, for a mean of 16, and a mode and median of 15. The range was from zero to 45. A mean number of references per paper per country also was calculated; these hovered around the true mean at 18, with a range from 5 for Romania to 30 for Germany with all the rest near the mean. There is an observable dichotomy in KO in which roughly equal numbers of research papers are epistemologically either empirical science or humanistic narrative. The former tend to have few recent citations, and the latter tend to have many older citations. This conference is a bit of an outlier, because most of the papers fall into the scientific range with 5-15 citations. Oddly, German papers at this conference had 30 or more citations. Dates of cited works ranged from 1873 to 2014, or, the age of works cited ranged from 0-141 years. The mean age of works cited was 15.6 years (the median was 8). Thus the majority of works cited were fairly recent, as shown in Figure 3. Figure 4 contains histograms of the distributions of age of cited work and number of references.
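The descriptive statistics reported above can be sketched as follows. This is a minimal illustration, assuming each paper has been reduced to a list of the publication years of the works it cites; the paper labels and years below are invented toy data, not values from the proceedings, and only the final line uses figures stated in the text.

```python
from statistics import mean, median, multimode

CONFERENCE_YEAR = 2014  # year of the conference, as stated in the text

# Hypothetical toy data: publication years of the works cited by each paper.
papers = {
    "paper_a": [2012, 2010, 1998, 2014],
    "paper_b": [2005, 2013],
    "paper_c": [1873, 2001, 2009, 2011],
}


def reference_counts(papers):
    """Number of references in each paper."""
    return [len(years) for years in papers.values()]


def cited_ages(papers, conference_year=CONFERENCE_YEAR):
    """Age in years of every cited work, pooled across all papers."""
    return [conference_year - year for years in papers.values() for year in years]


counts = reference_counts(papers)
ages = cited_ages(papers)

print("references per paper: mean", round(mean(counts), 1),
      "median", median(counts), "mode", multimode(counts))
print("age of cited works: mean", round(mean(ages), 1),
      "median", median(ages), "range", min(ages), "to", max(ages))

# The only figures taken from the text: 1,217 references across 76 papers.
print("reported mean references per paper:", round(1217 / 76, 1))
```

In the toy data a single very old work pulls the mean age well above the median, which is the same pattern as the reported mean of 15.6 years against a median of 8.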
The histograms help visualize the normativity of the means: most papers have between 10 and 20 references, and the age of most works cited peaks around 10 years. This is consistent with a social scientific epistemology. ANOVA indicates there is no statistically significant influence of either variable on the other.
A mean age of cited work was calculated for each paper, and then these were averaged to develop a mean for each country; these are shown together in Table 2 with the mean number of references per country.
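The two-step averaging described here (a mean per paper, then a mean of those means per country) can be sketched as below. The records are hypothetical, and the treatment of papers with no references (counted for the reference mean, skipped for the age mean) is an assumption, since the text does not say how such papers were handled.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (country of first author, ages of the works cited in one paper).
papers = [
    ("Poland", [2, 4, 6]),
    ("Poland", [10, 30]),
    ("Brazil", [1, 3]),
]


def country_profile(papers):
    """Mean of per-paper mean ages, and mean number of references, per country."""
    paper_mean_ages = defaultdict(list)
    reference_counts = defaultdict(list)
    for country, ages in papers:
        reference_counts[country].append(len(ages))
        if ages:  # assumption: a paper with no references contributes no age
            paper_mean_ages[country].append(mean(ages))
    return {
        country: {
            "mean_age": mean(paper_mean_ages[country]) if paper_mean_ages[country] else None,
            "mean_refs": mean(reference_counts[country]),
        }
        for country in reference_counts
    }


profile = country_profile(papers)
for country, row in profile.items():
    print(country, row)
```

Averaging the per-paper means weights every paper equally. Pooling all of a country's citations would instead weight papers by the length of their reference lists, so the two figures can differ: for the hypothetical Poland rows the mean of means is 12, while the pooled mean would be 10.4.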
The number of references was more or less consistent across geopolitical boundaries. The mean age of work cited ranged from 4.7 to 41.3, which is rather a wild divergence. The explanation likely is the consistent dichotomy between scientific papers and humanistic papers. The median of 8 years tells us that there is a social-scientific distribution in the rate of absorption of scientific data in the community. But the large discrepancy reminds us that quite a few authors in the domain are not reporting empirical evidence, but rather are engaging in historical or rationalist narratives. Table 2 is arranged in ascending order of mean age of cited work per country; there seems no logical explanation other than that the countries with the larger number of papers also have wider ranges in mean age of works cited. Table 2 also shows the mean number of references per country.
Number of references and age of cited work also were arrayed by theme and this is shown in Table 3. Five thematic clusters exceed the mean age of cited works: domain and epistemology, automatic classification, KOS, global problems, and classifications. Four thematic clusters exceed the norm for number of references: history and future, knowledge representation, domain and epistemology, and global problems. More than one explanation is possible. It is realistic to conclude that the majority of the thematic clusters contain scientific works with few recent citations. One might expect history and future and global problems to consist of more narrative and less science. Domain and epistemology likely contains more narrative works on epistemological questions; in fact, most of the domain analytical scientific studies are contained in the "methods" cluster. Finally, "classification" contains mostly papers considered to be "theoretical," which in KO more often means rational and historical narrative than empirical hypothesis testing.
The distribution of media types is also an interesting indicator not only of what might be on ISKO's bookshelf, but of the epistemic stances brought to bear by conference contributors. The references were sorted by source. The distribution of media types is given in Table 4.
The largest category was journal articles, and the second largest category was monographs, but the proportion of articles to monographs is higher than we have seen in recent conferences. The 440 journal citations were made to 134 different journals, 30 of which were cited more than once. Journals cited five or more times are shown in Table 5.
Table 5 (last rows as extracted):
Journal of Information Science 5 2
Library Hi Tech 5 2
The American Archivist 5 2
Nearly half of the citations are to Knowledge Organization, which received more than three times as many citations as any other journal. It continues to be a problem that KO's impact factor is artificially low, because this major source of citations to research in the journal is not counted by Thomson Reuters. Notice that but for the title change, which split the file, the journal from the American Society for Information Science and Technology and its predecessor would have taken second place with 29 references. Papers in proceedings of conferences accounted for 11% of the references. Of these a majority, 72, were citations to papers from the proceedings of individual conferences; several major organizations such as CAIS (Canadian Association for Information Science), ASIST (Association for Information Science and Technology), DCMI (Dublin Core Metadata Initiative) and ACM (Association for Computing Machinery) were represented but no two citations were from any particular single conference. The rest were references to papers in ISKO international conference proceedings or ISKO regional chapter conferences. These are shown in Table 6.

Table 7 shows the fourteen monographs that were each cited two or more times. This list is remarkable for the continued appeal to legacy texts that it demonstrates with reliance on texts by classic KO authors. But the list also is remarkable for the works from outside the domain that are being brought to bear on current KO research. Eighty citations were to chapters or papers in anthologies. Only a few of these (six) received more than one citation and these are shown in Table 8.
The legacy publication here is the collection of essays by Otlet, whose work is enjoying renewed interest in the KO community. The other volumes include culture, epistemology and terminology along with issues of current interest in the semantic Web.

Authors most cited and author co-citation analysis
The most prominent indicator of a research front in a domain is the appearance of a coherent set of prolific and heavily cited authors, whose work drives hypothesis-testing and theory-development. The influence of these authors usually adheres to a power law such that their work is substantially more influential than the rest of the domain taken together. This is why they are said to constitute a research front; their work is at the forefront of new developments. Often some of the most-cited authors in a research front, however, are classic authors in the domain whose work remains substantially important over time.
These are all signs of coherence in a domain, and there has been a consistent, if evolving, research front visible in KO over time.
Author co-citation analysis is a means of visualizing the intension, or theoretical depth, of a domain by mapping authors whose work is frequently cited together by others in the domain, on the presumption that co-citation indicates a theoretical relationship of some strength. Using the names of the authors in the research front, and creating a matrix of the incidences of their co-citation, multidimensional scaling can be used to generate a three-dimensional map of the intellectual space, in which the authors are perceived to be clustered together (or in separate clusters at some distance apart) according to their theoretical relationship. In studies of ISKO conferences it has been useful to gather co-citation totals from Thomson Reuters Web of Science to map the perception of the research front by the KO community in general. But it also has been useful to manually gather co-citation totals from the papers in the individual conference proceedings, to map the perceptions of closeness among the immediately contributing authors. Both methods are used here.
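The matrix-building step described above can be sketched as follows, assuming each citing paper has been reduced to the set of first-author names it cites. The citing sets are invented for illustration, and setting the diagonal to zero is an assumed convention. The sketch stops at the matrix; the multidimensional scaling itself was done in a statistics package and is not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the cited first-author names found in each citing paper.
citing_papers = [
    {"Hjørland", "Gnoli", "Dahlberg"},
    {"Hjørland", "Gnoli"},
    {"Olson", "Hjørland"},
    {"Ranganathan"},
]

authors = ["Hjørland", "Gnoli", "Dahlberg", "Olson", "Ranganathan"]


def cocitation_counts(citing_papers, authors):
    """Count, for each pair of authors, the papers that cite both of them."""
    wanted = set(authors)
    counts = Counter()
    for cited in citing_papers:
        present = sorted(wanted & set(cited))
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def cocitation_matrix(counts, authors):
    """Symmetric author-by-author matrix; the diagonal is set to 0 by assumption."""
    return [
        [0 if a == b else counts[tuple(sorted((a, b)))] for b in authors]
        for a in authors
    ]


counts = cocitation_counts(citing_papers, authors)
matrix = cocitation_matrix(counts, authors)
for name, row in zip(authors, matrix):
    print(f"{name:12s}", row)
```

An author who is never cited alongside the others, like Ranganathan in this toy data, yields a row of zeros and would be dropped before scaling.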
There were 1,140 authors' names in first-author position among the citations in the proceedings. After sorting and matching, there were 63 authors who were cited more than once. Thus there was a "long tail" of 1,077 authors whose names were cited only once in the conference proceedings. Of the rest, 31 were cited five or more times and these authors' names appear in Table 9.
Author            Citations
Hjørland          38
Gnoli             17
Smiraglia         15
Szostak           12
Beghtol           11
López-Huertas     10
Broughton         9
Dahlberg          8
Mai               8
Olson             8
Babik             7
Bliss             7
Buckland          7
Tennis            7
Otlet             6
Ranganathan       6
Vickery           6
Bloch             5
David             5
De Santis         5
Dousa             5
Foucault          5
Hajdu Barát       5
Kaiser            5
Kruk              5
Lakoff            5
Lancaster         5
McIlwaine         5
Rosch             5
Saracevic         5
Svenonius         5
Table 9. Most cited authors

This is a curious gathering of names. The top ten names are all familiar authors in KO who are both reasonably prolific and highly cited. There are some other names, such as Otlet, Ranganathan, Kaiser and Bliss, who are well-known classical authors in the domain from earlier eras. There are still other names (Lakoff, Bloch, Rosch and Foucault) who are well known for contributions that contemporary authors have incorporated into the domain. And yet there are several very active authors in the domain whose names do not appear because they were cited 3 or fewer times by conference authors. Finally, there are names such as Buckland and Svenonius, whose work is well inculcated in the domain but who are not usually direct participants in the domain, but rather, are associated with the neighboring domain of information science. Whatever the interpretation, it is safe to assume the top ten names are in the research front as represented by this conference, as well as some of the names in the lower part of the distribution.
As it happens, author co-citation analysis relies on the presence of co-citation by peers in the domain, and when names for which little or no co-citation was identified are removed, the final list of authors in the research front contains 17 names, as shown in Table 10. Author co-citation of this research front as represented by peer authors gathered from Web of Science appears in Figure 5, and co-citation by conference authors appears in Figure 6. These MDS plots were produced by IBM-SPSS™. Both plots are good representations of the data. Figure 5 visualizes an external analysis revealing the perception of the domain at large about this cluster of authors whose research is most cited in the contributed papers for this conference. To the extent that the participants in ISKO 13 have identified through their citations a research front, this visualization of co-citation is a snapshot of how the KO domain sees itself at this point in time. The figure is a two-dimensional visualization of a three-dimensional plot, which somewhat impairs our ability to see depth connections, but different kinds of dotted lines have been used for that reason. For example, the core cluster is the one directly left of center, marked with a dashed line and containing Otlet, Foucault and Dahlberg; this segment represents classical concept theory, general classification and domain analysis. Imagining a third dimension shows us that the two clusters with the solid lines which are closer to the forefront (or, between the reader and the core cluster) connect groups anchored by Olson and Hjørland; these clusters represent cultural, temporal and epistemological points of view. In the distance (metaphorically speaking) are the two segments connected by small dots and connecting groups anchored by Ranganathan and Gnoli; here we have facets, integrative levels, pragmatism, semiotics and other recent modes of thought about knowledge organization.
Co-citation is trace evidence of the perception of co-occurrence. Rather than pointing to causation, co-occurrence shows us what components are key and how they work together in the domain. There is a lot of space in this visualization, which could be taken as an indicator both that the connections are somewhat loose and also that each of the authors represented is seen as somehow holding a critical position in the domain. The motion of the research front as visualized here is from the epistemological base, stretching away from the reader, through the semantic core to an experimental cluster.
In Figure 6 we see the perception of this research front from the point of view of conference participants, and in particular, of its members. There are fewer names in this visualization because Otlet and Foucault were not co-cited with these contemporary authors, although Ranganathan continued to be co-cited. We have essentially the same components as before but in a slightly different configuration suggesting different subtleties of interpretation. The core (at the center) is classical KO, containing founder Dahlberg and many of the authors who have contributed to perceptions of the domain's research agenda. In the metaphorical foreground we have semiotics, classification theory, and approaches to general classification. In the distance we have classical epistemology and facet theory. As before the clusters are tightly interwoven, suggesting the players visualize themselves at different points on the same continuum.


Co-word analysis
Co-word analysis is a technique for visualizing intellectual concept space by looking for term co-occurrence. The steps involve isolating the most frequently used terms in a domain, and then searching for terms that co-occur with them in the literature of the domain. In this case, all of the titles of the 75 papers were entered into Voyeur.ca's Voyant text engine and an initial visualization and chart of word (not term) frequencies was developed. A word cloud appears in Figure 7 and the top of the frequency distribution appears in Table 11. There were 16,571 words in the corpus, of which 5,080 were unique. Those words appearing 20 or more times are shown here (some editing has resulted in partial words being removed).
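The first step, a word (not term) frequency distribution, can be sketched as below. This is only an illustration of the counting idea and does not reproduce the text engine named above; the titles, the stopword list and the tokenizing rule are all assumptions made for the example.

```python
import re
from collections import Counter

# Illustrative stopword list; the exclusions actually applied are not stated in the text.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}

# Hypothetical titles standing in for the corpus.
titles = [
    "Knowledge organization and classification",
    "Classification of knowledge domains",
    "Ontologies for knowledge organization",
]


def word_frequencies(texts, stopwords=STOPWORDS):
    """Count lower-cased alphabetic words, skipping stopwords."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[^\W\d_]+", text.lower())
        counts.update(word for word in words if word not in stopwords)
    return counts


freq = word_frequencies(titles)
total = sum(freq.values())
for word, n in freq.most_common():
    print(f"{word:15s} {n:3d} {n / total:6.1%}")
print("words:", total, "unique:", len(freq))
```

The last line corresponds to the two corpus figures reported above (total words and unique words), and the percentage column to the share of total occurrence for each word.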
Keyword-in-context analysis (see KWIC below) tells us that many of the terms at the top of the distribution identify publications, rather than topical keywords, so although they appear in the Voyant visualization we can remove them from the frequency distribution. We see from Table 11 that although there are many words that occur between 20 and 50 times, the rate drops below 2% of total occurrence pretty rapidly. Most of these words represent a sort of solidified granularity.
The same titles were entered into WordStat™. WordStat™ allows the creation of a term thesaurus (called a "dictionary") which, together with specific language exclusion features, can be used to filter the file. A dictionary of key ISKO terms has been developed for prior research, and was used again here. Figure 8 is an MDS plot of those terms as they co-occur in the titles of these conference papers.
The plot in Figure 8 is a fairly good model of the data. What is interesting here is how the co-occurrence data lead us to draw the clusters. Notice, for instance, that there is a solid core cluster with knowledge organization, classification, concepts, categories, ISKO and cognition. But behind it is a small cluster with models, and behind both is another cluster where domain analysis and ontology are associated with information science. This suggests that there is, among the contributors to this conference, an ontological discrepancy between concept theory (which is classic KO) and domain analysis, by which concepts are derived, which is perceived rather as closer to information science (IS). Cognition is classic KO, users are IS, but information retrieval models are in their own cluster apart from both. WordStat™ also has a keyword-in-context feature that allows viewing each instance of a term in context. From this we are able to see that the main terms are mostly used in formal ways. The key terms in this conference are those that are not used routinely. These are compiled, each with three KWIC examples, in Table 12.
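A keyword-in-context listing of the kind described can be sketched in a few lines. This is a generic illustration, not the feature of the package named above; the function name, the window width and the sample titles are assumptions.

```python
import re


def kwic(texts, term, width=3):
    """Return (left context, keyword, right context) for every occurrence of term."""
    target = term.lower()
    hits = []
    for text in texts:
        tokens = re.findall(r"\w+", text)
        for i, token in enumerate(tokens):
            if token.lower() == target:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, token, right))
    return hits


# Hypothetical titles used only to show the output shape.
sample = [
    "Cognition and users in knowledge organization",
    "Models of users",
]

for left, keyword, right in kwic(sample, "users", width=2):
    print(f"{left:>20s} [{keyword}] {right}")
```

Reading each hit with its neighbors is what allows a formal use of a term, such as its appearance in a publication title, to be told apart from a topical one.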
The thousands of words used in the long tail of the frequency distribution are essentially words that define types of information retrieval, categories, ontologies, thesauri, users, models, and cognition. This short list is the extension of the ISKO domain as realized by contributors to ISKO 13. It is interesting to notice that a key term from the thematic conference program list shown in Table 3 above is missing: "tools."

Discussion: ISKO 13's bookshelf
In sum, ISKO 13 was once again a fulsome exercise in bringing together the disparate parts of the KO community. It is a problem that KO must compete with knowledge management, which is not the same science, for intellectual space in the representation of science in general. KO as a scientific community must insist that indexing services make a clear distinction between KO and knowledge management. This conference, more than others of recent vintage, shows that even among participants there is a distinction between KO and information retrieval. This is ontologically correct, but epistemologically a tautological red herring. We do not organize knowledge only for retrieval; rather, our science is devoted to understanding the actual empirical order of knowledge. But that, of course, could be useful for information retrieval. But this distinction is perhaps also an artifact of the conference organization. ISKO ought to take more care that its international conference program committees are run by its Scientific Advisory Committee, and not by local arrangement committees. This is critical for the advancement of the science of knowledge organization.
Another limitation of this analysis arises from the keynote papers. Hjørland's presentation was published elsewhere, and thus his citations and keywords are not present in this empirical analysis. And Buckland and Soergel, who also were invited as keynote speakers, are not really part of the current research front of the domain. So the empirical evidence from their papers, which is here, is perhaps influencing the interpretation.
The domain of KO as a science clearly is lively, evolving, emergent, and thriving. Every one of the triangulated methodological tools here points to a tightly knit cluster of key terms involving knowledge organization, categories, concepts and classification. The other five thousand words, many used more than once, define the granularity of applications of KO systems (not tools). KOS apply to every aspect of human existence. This is why KO is a science of the substrate of everything else. In that it is like the science of information. But I increasingly believe the future of KO is in comprehending that it is a science of its own and not a sub-domain of information. The methodological triangulation of citation analysis and co-word analysis shows the vivid common core of KO in concept theory and cognition, its research front in applied systems, and the immense granularity of applicability of those theories and systems. Based on the evidence from this conference in sequence, it is clear KO as a science is thriving.
What is on ISKO 13's Bookshelf? First and foremost, the journal Knowledge Organization. There is little difference globally in the resources being applied. There is a constant and useful tension between historicist and empirical epistemic stances. Next to our journal, the most important resource for KO scientists comes from the proceedings of ISKO international and regional chapter conferences. We rely on a good mix of classic texts from Ranganathan, Bliss and Otlet, and from major monographs by Hjørland and Svenonius. Perhaps most interesting is the set of cited anthologies, including works on the so-called semantic web, on culture and epistemology, and on Otlet's dreams of universal knowledge availability. All of it is fueled by Dahlberg's dream of an interoperable application of the empirically discovered heuristics of knowledge organization.