SNOMED-CT as Standard Language for Organization and Representation of the Information in Patient Records †

The Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT), like the Medical Subject Headings (MeSH) and the Health Sciences Descriptors (DeCS), is a standard for handling, organizing, representing and retrieving information in the health context. It is structured, among other things, in 19 categories, including: clinical diagnosis/disease, procedures, observable entities, body structure, body, substance, biological and pharmaceutical products, sample, physical object, physical force, event, geographical or environmental location, social context, stages and scales, special concepts and qualifiers. We present the results of research carried out with patients’ medical records at the Walter Cantidio University Hospital of the Federal University of Ceará. The study seeks to answer the following question: what is the contribution of these categories to building a representation of the patients’ medical records at the Department of Medical Records and Statistics (SAME) of the Walter Cantidio University Hospital (HUWC)? The objective of the research is to study the contribution of SNOMED-CT to the representation of information within those records. It is an exploratory study supported by the neofunctionalist method and content analysis. First, the physical structure of digitized records was analyzed at the SAME of the HUWC. Then we analyzed a corpus of two patient records comprising nine volumes, about 4000 pages, corresponding to 777 MB. The results show that the hierarchical categories of SNOMED-CT may contribute to the representation of the charts, as it is a robust, ontology-based terminology that captures the essence of the information recorded in these documents. Its structure also shows some similarities to the physical structure of the chart, and hence it can contribute to information retrieval with higher added value, since it allows the use of pre- and post-coordination as well as natural language, synonyms and acronyms.
Received 18 June 2014; Accepted 24 June 2014
Knowl. Org. 41(2014)No.4. V. Pinto, C. Rabelo, I. Girão. SNOMED-CT as Standard Language for Organization


Introduction
The concern with terminology aimed at normalization, whose purpose is to obtain a language able to facilitate communication between citizens, has taken on massive proportions, mainly in the context of science and technology (S&T). We believe that, in these contexts, the first concepts in this field were developed in the taxonomies of chemistry, botany and zoology. More recently, terminological ideas started to migrate into engineering with Eugen Wüster, of the Vienna School, and D.S. Lotte and Drezen, founders of the Russian school. This new reality has been consolidated especially by the development of studies in lexical terminology.
In the context of library and information science, we can consider that the first efforts to establish terminological aspects started with the bibliographic classifications and the subject heading lists. However, with the emergence of thesauri in the late 1950s, these ideas became more widely accepted, with the systematization of the work of the Information Center of the United States Department of Defense (ASTIA).
Regarding the healthcare field, we can highlight the first medical bibliography, "De claris medicinae scriptoribus," published in 1506 (sixteenth century) by Dr. Symphorien Champier (Figueiredo and Cunha 1967). In 1879, the Medical Indexing Service, in the USA, began to produce Index Medicus, and later the database of the U.S. National Library of Medicine (NLM), the Medical Literature Analysis and Retrieval System (MEDLARS), whose online version is called MEDLINE (Reis 1979). Another venture that should not fall into oblivion is the publication, in 1954, of the Subject Heading Authority List by the NLM. Then came the Unified Medical Language System (UMLS), the Systematized Nomenclature of Medicine and Index Medicus, in 1986. In addition to these initiatives, the first version of the Systematized Nomenclature of Medicine (SNOMED) appeared in 1974. In 1979, SNOMED II was released, which, in 1993, was renamed SNOMED International and, in 2000, SNOMED RT. From the union between SNOMED RT and Clinical Terms Version 3 (CTV3) of the National Health Service (NHS) of the United Kingdom arose the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT), a bilingual reference terminology for representing clinical information accurately and unequivocally, in order to enable communication between health professionals worldwide.
In Brazil, Pedro Luiz Napoleão Chernoviz envisioned the standardization of terminologies in the healthcare field, which led him to publish, in Paris in 1870, the Dictionary of Popular Medicine and Science for Accessory Use of Families. The Regional Library of Medicine also deserves attention for participating in the construction and management of the Health Science Descriptors (DeCS), published in 1987.
All these initiatives contributed significantly to the continuous qualitative progress of terminologies in the healthcare field, with the ultimate goal of standardizing terms and making communication easier and more efficient between members of multidisciplinary healthcare teams, as well as between them and patients, so as to prevent communication noise between the agents involved. Observing this phenomenon aroused our interest in investigating it and led us to the following question: what is the contribution of the categories of SNOMED CT to the representation (indexing) of the patient records in the Medical Records Service and Statistics (SAME) of the University Hospital Walter Cantidio (HUWC) of the Federal University of Ceará (UFC)? The overall aim of this research is to study the applicability of SNOMED CT to represent (index) the information in these medical records, with regard to the relation between that information and the terms present in the categories of SNOMED CT. Within this overall objective there are some specific objectives, as follows:
- Know the physical and logical structure of SNOMED CT and its categories to represent information in the clinical context of the healthcare field;
- Study the physical and logical structure of the patient records so as to map the concepts recorded in these documents, in order to align them with the SNOMED CT terminology;
- Compare these categories, applying them to the representation of the information in the patient records in HUWC-UFC.
These are, therefore, the main aspects of this exploratory research, which is based on the neo-functionalist method and content analysis and takes as its object of study a corpus composed of two patient records, totaling nine volumes and about four thousand pages, equivalent to 777 MB. The results showed that the categories of SNOMED-CT could be applied to the indexing of these documents, even though not all of the categories appeared in them, perhaps because the terms studied relate only to the terminology of nephropathy.

Patient records
The literature on the history of medicine notes that recording the health status of patients has been among doctors' concerns since the first centuries before the common era, as the examples of Imhotep and Hippocrates show. Hippocrates, in the fifth century BCE, stressed the importance of keeping these records, claiming that they "should accurately reflect the course of the disease and indicate its possible causes" (Marin, Massad and Azevedo Neto 2003, 9). According to Moutel and Baret (2002, 8), until the late eighteenth century the semantics of the patient records was related to the doctor; only from the nineteenth century does the notion of patient records appear in healthcare organizations. "Physicians based their observations, and consequently their notes, in what they heard, felt, recorded in chronological order. Therefore, these stories are focused on documenting a patient's medical history." Since 1931, in the United States, the quality of patient records has been an ethical requirement in hospitals. But only after 1970 did patient records come to have a central place in the practice of physicians and other healthcare professionals, as they include (Moutel and Baret 2002, 9):
Medical, social and administrative data ... administrative record that was a long time out of patients' reach, to ensure respect of medical confidentiality before others, but it is a controversy that sometimes a whole service could know a patient's health status that the patient him/herself ignored.
In Article 1 of Resolution 1.638/2002 of the Federal Council of Medicine of Brazil (Brasil CFM 2002), patient records are defined as:
A single document consisting of a set of information, signals and recorded images, extracted from facts, events and situations about the patients' health and assistance they are provided with, done in a legal, scientific and confidential way, used to enable communication between members of the multi-professional healthcare team and continuity of care provided to the individual.
Moreover, Bentes Pinto (2006, 3) shows the particularities and specifications of patient records, stating in her reflections that these records are:
A document that contains all registered information concerning a patient, whether they be for identification purpose, or socioeconomic status, or health status (observations made by healthcare professionals, x-rays, prescriptions, test results, diagnosis from specialists, notes on the patients' improvement written by the multidisciplinary healthcare team: doctors, nurse, psychologist ... with respect to the observed progress), administrative matters, etc. Actually, it is the physical memory of the patients' history, being, therefore, essential for the communication between members of the healthcare team, as well as between the team and patients; being also essential for the continuity, security, effectiveness and quality of the patients' treatment. Thus, also, helping the management of healthcare institutions.
According to Bentes Pinto (2006), patient records have two different structures: a physical and a logical one. The first contains the informational categories concerning the patient, the healthcare providers, the hospitalizations and the work of the healthcare providers. In this group we have the patient's records, data on initial clinical examinations, further examination requests and their respective results, final diagnosis(es), treatment(s) undergone by the patient, daily progress, data on nutrition, social service, psychological care, prescriptions with the drug doses to be used, pre- and post-operative evaluation, recovery, surgical notes, antimicrobial monitoring, anesthesia report, classic control system and summary of discharges from the recovery room. It also includes hospitalization expenses, discharge reports, prescriptions with guidelines to be followed by the patient, consumption in the operating room and death notifications.
In the second, logical structure, there is a description of the information itself about the patient, i.e., identification, socioeconomic and administrative data, such as: full name, identity number, CPF (Individual Taxpayer Registry), education, religion, home and work address, age, complexion, descent, city of birth, nationality, marital status, spouse's name, number of children, liability statement, social service observations, profession, place of work and social security expenses; anamnesis (main complaints, history of current illness, personal and family history, previous morbid history, services, addiction, diet), physical examination and diagnostic hypotheses; reports from nursing personnel (graphs of temperature, pulse and respiration (TPR), blood pressure (BP) and water balance); and observations from the social, psychological and nutritional services.

Terminological languages for representing information in the healthcare field

Considerations of terminological languages
To speak of terminological languages, it is necessary, first of all, to understand what a language is in itself. There are several conceptions of language. Some consider it a biological expression, so that the subject needs to be in good physiological condition for communication to occur, an idea defended by, among others, the philosopher Humberto Maturana. Others, such as Wittgenstein, state that language is a means of social interaction, including documentary and computer languages. Regardless of the conception, every language has the purpose of communicating. From this perspective, we cannot understand it apart from the social context in which it is uttered. On this point, Echeverría (2007, 50) tells us that language "is born of the social interaction among human beings. Consequently, language is a social phenomenon, not biological," since there is "an interaction between different particular human beings" developed in a "consensual domain" in which "the participants in a social interaction share the same system of signs ... to name objects, acts or events in order to coordinate their ordinary activities." However, it is not our intention to discuss all these possibilities here; we will focus on some views of documentary language as used in the treatment, organization and presentation of the information in patient records. That is important because these documents are the main channel of communication and information within the multidisciplinary healthcare team and between the team and the sick person (Bentes Pinto 2006).
Documentary languages (DL), also known as terminologies, are, like any other language, designed to facilitate communication between professionals of the same specialty. They result from the documentary boom that has occurred since the nineteenth century, driven by the large number of documents that permeate the scientific and technological world, and from the need to reduce the linguistic inconsistency gradually introduced by natural language, which is polysemous by nature. Because of this polysemy, and in order to reduce communication noise in different fields, areas or disciplines, there was a call for the standardization of terminology so that information flows more clearly within specific communities.
In the literature of library and information science, it is considered that the structure of documentary languages was built in the late nineteenth century with the publication of the Dewey Decimal Classification (DDC) in 1876 and the Universal Decimal Classification (UDC), by Paul Otlet, in 1895. In 1914, the Library of Congress of the United States published the Library of Congress Subject Headings (LCSH), which was used as the basis for many other lists. According to ANSI/NISO Z39.19-2005 (ISO 2005), a thesaurus is a "controlled vocabulary organized in a known order in which the equivalence, homographic, hierarchical and associative relationships among terms are displayed clearly and identified by standardized indicators of relations." The construction of these languages begins with the organization of terms classified according to the categories of knowledge. Indeed, they are constructed on the basis of the terminologies of each specialty. According to Cabré (1998, 70), terminology:
Is a transdisciplinary science because terminological products are pieces of linguistic representation on which should be built any field of scientific knowledge so as to acquire, develop and transfer specific knowledge from any domain (which means, in the field of law, medicine or physics, for example, that the discipline of terminology plays a pivotal role in providing knowledge of transporting terms that mediate in communication, as an identifier of underlying rules in connection with the creation of and relation between terms, and as a method and ability to work).
Terminologies are systems of standardized terms used by professionals in a specific area of knowledge. They are the basis for the construction of documentary languages. Usually, when we refer to those languages, we are referring to a specialized language. Such a language exists only through the use of specialized terms inserted in a given context, since each terminology is valid within the professional domain to which it relates.
A documentary language is a set of signs representing the terminology of a specialty using standardized concepts and is intended to represent, express or describe the contents of documents, facilitating information retrieval. As Urdiciain explains (1996, 307), "the documentary Language is a system ... of standard symbols that facilitate the formal representation of the content of the documents to allow retrieval, manually or automatically, of the information requested by users." Likewise, there are specialized thesauri, which map the terms/concepts of very specific areas, such as the Medical Subject Headings (MeSH), built by the National Library of Medicine (NLM), which presents a set of standardized descriptors for the thematic representation or indexing of documents in the healthcare field. We understand the thematic representation of information as the result of the act of reading, identifying, choosing and/or translating informational items concerning the topics discussed in a document.
Initially, documentary languages were designed to cover the various fields of knowledge; they were universal and pre-coordinated, such as the DDC and the UDC. However, although these general languages were regarded as efficient tools for document indexing, it was noticed that they were not enough to cover all the specificities arising from the fragmentation of knowledge. As a result, specialized thesauri appeared, which mapped the terms/concepts of specific fields, as is the case of MeSH, introducing a system of standardized descriptors for the thematic representation (indexing) of documents in the healthcare field. We understand that a thematic representation of information results from the cognitive act of reading, identifying, choosing and translating the terms concerning the issues discussed in the documents.
The representation of information in the context of healthcare, as in other areas of knowledge, makes use of tools that enable access to information. Documentary languages are the means by which the representation or indexing and the retrieval of information are carried out. For a broader understanding of the matter, the following section presents SNOMED-CT and explains its main features.

SNOMED-CT
The Systematized Nomenclature of Medicine (SNOMED) is an international and multilingual clinical terminology to be used in building electronic patient records. It covers clinical content and can be used for indexing these and other clinical documents, as well as for information retrieval. It is thus a basic language for achieving semantic interoperability of clinical records in healthcare information systems.
The genesis of SNOMED is found in the Systematized Nomenclature of Pathology (SNOP), published in 1965 by the College of American Pathologists (CAP). In 1974, the first edition of SNOMED was published; it was renamed SNOMED International in 1993. The agreement between CAP and the NHS was signed in 1999 and, thenceforth, with changes in the logical structure, the SNOMED Reference Terminology (SNOMED-RT) was born in 2000. In 2001, the Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT) was published as a result of the union between SNOMED-RT and the Clinical Terms terminology of the National Health Service (NHS) of the UK. In 2002, the UMLS National Library of Medicine (NLM) donated a license for the use of SNOMED-CT in Spanish. In 2007, IHTSDO acquired the intellectual property rights to SNOMED-CT.
The structural organization of SNOMED-CT comprises the basic components; the hierarchies; the roles; the CliniClue browser; cross-references (mappings); and post-coordination subsets. The basic components of SNOMED-CT are:
- Key concepts: refer to abstract or concrete entities of the healthcare field. They are, therefore, units of thought. They are identified by unique numeric codes that are never modified or deleted, e.g.: ConceptID 77427003.
- Descriptions: the names that refer to a concept, each description having its own identifier, e.g.: hypertension-DescriptionID. There are three types of descriptions: fully specified name, preferred term, and synonyms. It is thus possible to use various terms to identify a concept.
- Relations or cross-references: relate concepts of SNOMED CT to those of other terminologies or classifications. They therefore map a source schema to a target; they contribute to the description of meanings and concepts and link them, for example, with ICD-10 (CIE-10), NIC, NOC, NANDA, LOINC and SNODENT.
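The three basic components described above can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class names, the fields, the identifiers "C-0001", "D-0001" and "D-0002", the synonym "high blood pressure" and the target code "TARGET-CODE" are all hypothetical and do not come from the SNOMED-CT release files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Description:
    description_id: str  # placeholder identifier, not a real DescriptionID
    term: str
    kind: str  # "preferred" or "synonym" (labels chosen for this sketch)


@dataclass
class Concept:
    concept_id: str  # placeholder; real ConceptIDs are permanent numeric codes
    descriptions: List[Description] = field(default_factory=list)
    # Cross-references: name of the other classification -> code in it
    cross_maps: Dict[str, str] = field(default_factory=dict)

    def preferred_term(self) -> Optional[str]:
        """Return the first description marked as preferred, if any."""
        for description in self.descriptions:
            if description.kind == "preferred":
                return description.term
        return None

    def matches(self, text: str) -> bool:
        """True if the text equals any description of the concept (case-insensitive)."""
        wanted = text.strip().lower()
        return any(d.term.lower() == wanted for d in self.descriptions)


# Hypothetical example: one concept reachable through two terms.
example = Concept(
    concept_id="C-0001",
    descriptions=[
        Description("D-0001", "hypertension", "preferred"),
        Description("D-0002", "high blood pressure", "synonym"),
    ],
    cross_maps={"ICD-10": "TARGET-CODE"},
)
```

In this sketch, a search for either term reaches the same concept, which is the property that lets synonyms be used at retrieval time while the record is indexed under a single identifier.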
The structural organization of SNOMED CT consists of 19 top-level categories of concepts, descriptions and relationships. These categories are presented below, in Table 1.
SNOMED-CT is one of the terminologies with the widest scope within the healthcare field, being a multiaxial ontology that stands out for the possibilities offered by its structure and for the large number of terms in its vocabulary. The latest update was published in 2008 and contains more than 311,000 concepts in its terminology database, related to its 19 hierarchical axes and countless subclasses. The relationships between these classes follow the semantic axes (IHTSDO 2012). The main difference between SNOMED-CT and the terminologies previously studied is its concern with the indexing, retrieval and semantic interoperability of all the data in patient records and their peculiarities, which had not been addressed before. SNOMED-CT includes: the "tables" of concepts, descriptions and relationships; cross-references to ICD-10 (CIE-10); MC with epidemiological and/or statistical purposes; the SNOMED-CT Technical Reference Guide; a browser for navigating the terminology; and two updates a year, in January and July (English edition). SNOMED-CT can be used for storing laboratory reports, image storage, anatomic pathology reports, protocols, autopsy databases, decision support systems, coding health news in the press, and tissue banks in electronic medical records. Therefore, "it is an extensive clinical terminology that provides clinical content and expressivity for documentation and clinical reports, which can be used to encode, retrieve and analyze clinical data" (SNOMED-CT 2012, 15).

Method and material
This research is an exploratory study aimed at obtaining new insights into SNOMED-CT and its application to the indexing of patient records on paper. In order to delve into this topic, we reviewed the relevant literature on each part of the subject. The neofunctionalist method and content analysis were among the tools we used to interpret reality. We did not find any research related to the present work, that is, aimed at studying the contribution of the SNOMED-CT terminology to the representation of information in patient records kept manually (on paper). However, we found some research on the use of SNOMED-CT for electronic patient records.
Neofunctionalist theory is used here to understand the language of the representation of information, whose function is to improve communication between the members of multidisciplinary health teams and between those members and the patient, with the least possible noise. This method is a renewal of functionalism and appears in sociological research and in the administration field to explain facts about the world. It states the "things" written in patient records, for instance, the health professionals who attended the patient. Hence, we think of these records as having a critical role in communication. If they are not seen this way, their importance will never be acknowledged. In neofunctionalism, the patient records are not only thought about; they are also observed as to whether and how they are fulfilling their function and, if they are not, what outcome this might produce.
Regarding content analysis, this technique allowed us to map the concepts/terminology of the patient records and the categories of the SNOMED-CT terminology. Initially, we chose two patient records (on paper) of patients diagnosed with nephropathy, stored in the Medical Records Service and Statistics (SAME) of the Walter Cantídio University Hospital (HUWC) of the Federal University of Ceará (UFC). This specialty was chosen because the HUWC/UFC is considered a reference hospital in renal transplantation and is the pioneer in the State of Ceará, Brazil.

Table 1. High-level categories of SNOMED-CT
Clinical finding: refers to concepts related to an observation, evolution or clinical status, e.g.: tuberculosis, anemia, and low cardiac output.
Pharmaceutical/biologic product: these are related to their therapeutic mechanism, e.g.: Substance: Generic ingredients

To compose the corpus of this research, we selected two patient records. The two patients were hospitalized in the 1990s. The criterion used to choose these documents is that together they comprise nine volumes and contain about four thousand pages, the equivalent of 777 MB. After this selection, the records were digitized in order to begin the content analysis. We carried out a comparative study of the structure of these records and the categories proposed in the SNOMED-CT terminology. After that, we extracted some information from the patient records to compare with the categories of SNOMED-CT.

Result and discussion
As presented in Table 2 below, the clinical terminology SNOMED-CT has nineteen categories, namely: clinical finding, pharmaceutical/biologic product, environments/geographical locations, procedure, qualifier value, social context, observable entity, record artifact, situation with explicit context, body structure, physical object, staging and scales, organism, physical force, linkage concept, substance, event, special concept, and specimen. These terminological labels are intended to cover all aspects of the information on a patient's state of health.
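The way terms taken from a record can be filed under these categories may be sketched as follows. This is a minimal Python illustration, not the procedure used in the study: the function name `index_terms` and the sample assignments are assumptions, with the three sample terms taken from the examples given for the clinical finding category.

```python
# The nineteen top-level categories named in the text.
TOP_LEVEL_CATEGORIES = (
    "clinical finding",
    "pharmaceutical/biologic product",
    "environments/geographical locations",
    "procedure",
    "qualifier value",
    "social context",
    "observable entity",
    "record artifact",
    "situation with explicit context",
    "body structure",
    "physical object",
    "staging and scales",
    "organism",
    "physical force",
    "linkage concept",
    "substance",
    "event",
    "special concept",
    "specimen",
)


def index_terms(assignments):
    """Group (term, category) pairs by category, rejecting unknown categories."""
    index = {}
    for term, category in assignments:
        if category not in TOP_LEVEL_CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        index.setdefault(category, []).append(term)
    return index


# Hypothetical assignments made by an indexer reading a record.
sample_assignments = [
    ("tuberculosis", "clinical finding"),
    ("anemia", "clinical finding"),
    ("low cardiac output", "clinical finding"),
]
sample_index = index_terms(sample_assignments)
```

The categories that end up as keys of such an index are the ones "found" in a record, which is the quantity compared between the two records in the results.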
Thus, the results of the empirical research show that, although our study was carried out on handwritten records, the SNOMED-CT categories can be applied to the treatment, organization, retrieval and management of the information in these documents in the information systems of health centers and hospitals, in both the public and the private health sectors (Table 2).
According to the results of the empirical research, 8 of the 19 SNOMED-CT categories were found in the first patient record, that is, 42%. In the second patient record the proportion is reversed: 11 of these categories were found, constituting 58%. This is quite interesting, given that our analysis is based on printed records. We also recognize that this result may be a consequence of our limited knowledge of the health terminology used in SNOMED-CT, and of some difficulties in translating certain terms into Brazilian Portuguese, since SNOMED-CT was designed to be used in the United States, even though it is intended to expand overseas.
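The two percentages are complementary shares of the 19 top-level categories: 42% corresponds to 8 of 19 and 58% to 11 of 19. A minimal Python sketch of this arithmetic, in which the function name `category_share` is hypothetical, is:

```python
def category_share(found: int, total: int = 19) -> int:
    """Percentage of top-level categories found in a record, rounded to an integer."""
    if not 0 <= found <= total:
        raise ValueError("found must be between 0 and total")
    return round(100 * found / total)


share_of_8 = category_share(8)    # 8 / 19 is about 42.1%
share_of_11 = category_share(11)  # 11 / 19 is about 57.9%
```

Because 8 and 11 add up to 19, the two shares add up to 100%, which is consistent with one record showing the reverse proportion of the other.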

Table 2 column headings: Patient record 1; Patient record 2.

Similarly, it was not possible to find categories under which we could retrieve information related to family history, i.e., information about diseases in the family. For example, in one of the patient records, we noticed the physician's observation: "mother with hypertension." This information cannot be placed in any of the categories, since it is not related to the patient's diagnosis; it refers to his/her mother, yet it is a very important piece of information. Another remark is that we did not find, among all the categories in SNOMED-CT, any under which it was possible to classify the specialty in which the patient had been diagnosed or treated, for instance nephrology/nephropathy, which is a very important representation (indexing) in any patient record.

Conclusions
In this research, we sought to know and understand the SNOMED-CT terminology and its use for the representation (indexing) of information in patient records on paper. The result of this empirical research shows that this terminology is structured similarly to other documentary languages, with hierarchical relationships and synonymy, among other things. Therefore, in our view, the SNOMED-CT terminology can be used for the indexing of patient records, whether they be electronic, digital or on paper. The clinical terminology of SNOMED-CT is a language that provides innumerable possibilities for representing (indexing) the information in patient records, since it has labels for a large number of categories and all of their relationships. Moreover, this terminology has some other categories related to lab reports, image attachments, specific protocols, construction and maintenance of databases, decision support systems, dissemination of information about the health needs of a city, state or country, and genome studies, among other things.
SNOMED-CT provides a reference terminology in the health field, which facilitates the exchange of information and is also a tool for decision making and information retrieval. Therefore, it is a language that processes, organizes, mediates and manages information. We would like to highlight that in this study we listed the classes and categories belonging to the SNOMED-CT terminology. However, we had some minor setbacks because, although we applied for a short-term license only for this research, it was not granted. Therefore, our empirical research was done on the online pages of SNOMED-CT. For that reason, we had some trouble understanding the categories, not in terms of language, but because of the difficulty of understanding what each concept referred to.

Table 2. Association between SNOMED CT and the patient records. Source: Results of empirical research.