    Unified Medical Language System (UMLS)

    What is UMLS?

    The UMLS is a set of files and software that brings together many biomedical terminologies and standards to enable seamless integration and interoperability between different systems.

    As discussed before, there are always barriers to retrieving machine-readable information successfully. One of them is that the same concept can be expressed in different ways across databases and systems.

    UMLS supports the interoperability of drug names, billing codes, health information, and medical terms across different computer systems. This in turn supports reliable data mining and accurate healthcare statistical reporting.

    The UMLS has three tools (known as “Knowledge Sources”) that can be used separately or together, as described below.

    Metathesaurus

    The Metathesaurus is a huge multi-lingual vocabulary database that contains information about biomedical and health-related concepts, their various names, and the relationships among them.

    It draws its terms and codes from many vocabularies, including MeSH®, MedDRA, RxNorm, and SNOMED CT® (English and Spanish).

    The 2019AB Metathesaurus contains approximately 4.26 million concepts and 15.2 million unique concept names from 211 terminologies (source vocabularies).

    The Metathesaurus is populated from what are called “source vocabularies”. These can be derived from lists of controlled terms used in patient care, health services billing, public health statistics, or clinical and health services research. The lists used to build the Metathesaurus must contain high-quality data that is clean and presented in a standardized format.
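
    To make the idea of one concept with many names concrete, here is a minimal Python sketch. It assumes the pipe-delimited layout of the Metathesaurus concept-names file (MRCONSO.RRF). The sample rows and all identifiers in them are invented placeholders, and the helper names parse_mrconso_line and names_by_concept are hypothetical, not part of any UMLS tool.

```python
from collections import defaultdict

# Column order of MRCONSO.RRF (concept names and sources).
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

# Invented sample rows: the identifiers are placeholders, not real UMLS values.
SAMPLE_ROWS = [
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||MSH|MH|D000001|Amoxicillin|0|N||",
    "C0000001|ENG|S|L0000002|PF|S0000002|Y|A0000002||||SNOMEDCT_US|PT|0000002|Amoxycillin|0|N||",
    "C0000001|SPA|P|L0000003|PF|S0000003|Y|A0000003||||MSHSPA|MH|D000001|Amoxicilina|0|N||",
    "C0000002|ENG|P|L0000004|PF|S0000004|Y|A0000004||||RXNORM|IN|0000004|Penicillin G|0|N||",
]


def parse_mrconso_line(line):
    """Split one pipe-delimited row into a dict keyed by column name."""
    fields = line.rstrip("\n").split("|")
    # Rows end with a trailing "|", so zip() drops the extra empty field.
    return dict(zip(MRCONSO_COLUMNS, fields))


def names_by_concept(lines, language="ENG"):
    """Group (source vocabulary, name) pairs under their concept identifier."""
    names = defaultdict(set)
    for line in lines:
        row = parse_mrconso_line(line)
        if row["LAT"] == language:
            names[row["CUI"]].add((row["SAB"], row["STR"]))
    return dict(names)


if __name__ == "__main__":
    for cui, pairs in sorted(names_by_concept(SAMPLE_ROWS).items()):
        print(cui, sorted(pairs))
```

    Grouping by concept identifier rather than by name string is what lets two spellings from different source vocabularies resolve to the same concept.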

    Semantic Network

    The Semantic Network is concerned with broad categories (semantic types) and the relationships between them (semantic relations). Recent releases contain 127 semantic types and 54 relationships. The ‘isa’ (is a) relationship is the primary link between most of the semantic types.

    Every concept in the Metathesaurus is assigned at least one semantic type from the Semantic Network, where each type is defined by a textual description and by the information inherent in its position in the hierarchy.
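
    The ‘isa’ hierarchy can be pictured as a chain of child-to-parent links. The sketch below is a small, hand-written fragment for the semantic type “Antibiotic”. The dictionary is an illustrative assumption that should be checked against the Semantic Network files of the release in use, and the function name ancestors is hypothetical.

```python
# Illustrative fragment of 'isa' links, written as child -> parent.
# Assumption: verify these links against the Semantic Network release in use.
ISA_PARENT = {
    "Antibiotic": "Pharmacologic Substance",
    "Pharmacologic Substance": "Chemical Viewed Functionally",
    "Chemical Viewed Functionally": "Chemical",
    "Chemical": "Substance",
    "Substance": "Physical Object",
    "Physical Object": "Entity",
}


def ancestors(semantic_type, isa_parent=ISA_PARENT):
    """Return the broader semantic types, nearest first, by following 'isa'."""
    chain = []
    seen = {semantic_type}
    current = semantic_type
    while current in isa_parent:
        current = isa_parent[current]
        if current in seen:
            # Guard against a malformed table that contains a cycle.
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        chain.append(current)
    return chain


if __name__ == "__main__":
    print(" -> ".join(["Antibiotic"] + ancestors("Antibiotic")))
```

    Because a type inherits meaning from everything above it, anything said about “Chemical” in this fragment also applies to “Antibiotic”.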

    SPECIALIST Lexicon and Lexical Tools

    The SPECIALIST Lexicon and Lexical Tools are concerned with parts of speech, variant information, and programs for language processing. The lexicon contains 200,000 lexical items.

    Most healthcare research projects need high-quality data: data that is not only clean but also follows specific standards or is available in a specific format. Obtaining clean data that is compatible with UMLS standards is a constraint on almost any healthcare project. For a researcher, obtaining data of this quality may consume about 60% of their time.

    John Snow Labs is considered one of the leading organizations offering a catalog of diverse datasets, including many UMLS datasets. These healthcare NLP datasets are reviewed both manually and by machine.

    The JSL catalog contains two datasets that are worth exploring to get a better understanding of the topic.

    The first dataset provides information on the relationships between concepts or atoms known to the Metathesaurus for the semantic type “Antibiotic”. For asymmetrical relationships, there is one row for each direction of the relationship.
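
    The “one row per direction” convention can be checked mechanically. The sketch below simplifies each relationship row to a (first concept, relationship, second concept) tuple, which is an assumption made for brevity since the real file has many more columns. The relationship abbreviations follow the Metathesaurus REL values (PAR/CHD for parent/child, RB/RN for broader/narrower); the concept identifiers are invented placeholders and the function names are hypothetical.

```python
# Each relationship label mapped to the label used for the opposite direction.
# Symmetrical relationships (SY, RO, SIB) are their own inverse.
INVERSE_REL = {
    "PAR": "CHD", "CHD": "PAR",
    "RB": "RN", "RN": "RB",
    "SY": "SY", "RO": "RO", "SIB": "SIB",
}


def inverse_row(row):
    """Return the row that describes the same link from the other side."""
    cui1, rel, cui2 = row
    return (cui2, INVERSE_REL[rel], cui1)


def missing_inverses(rows):
    """List rows whose opposite-direction row is absent from the data."""
    present = set(rows)
    return sorted(row for row in present if inverse_row(row) not in present)


if __name__ == "__main__":
    # Invented placeholder identifiers, not real UMLS concepts.
    rows = [
        ("C0000001", "PAR", "C0000002"),
        ("C0000002", "CHD", "C0000001"),
        ("C0000003", "RB", "C0000004"),  # its RN counterpart is left out
    ]
    print(missing_inverses(rows))
```

    A check like this is a cheap way to confirm that a filtered extract, such as one limited to a single semantic type, still carries both directions of every asymmetrical relationship.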

    The other dataset provides information about the entire concept structure of the Unified Medical Language System (UMLS) Metathesaurus for the semantic type “Antibiotic”.

    This dataset connects the different names of every concept that belongs to a specific semantic type. The Semantic Network contains 127 semantic types in recent releases. The relation between a Metathesaurus concept and its semantic types is one-to-many; some concepts are assigned as many as 5 semantic types.
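
    The one-to-many link between a concept and its semantic types can be sketched as follows. The code assumes the pipe-delimited layout of the Metathesaurus semantic-type file (MRSTY.RRF). The concept and attribute identifiers in the sample rows are invented placeholders, the type codes shown are illustrative and should be checked against the release in use, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Column order of MRSTY.RRF (semantic types assigned to each concept).
MRSTY_COLUMNS = ["CUI", "TUI", "STN", "STY", "ATUI", "CVF"]

# Invented sample rows: CUI and ATUI values are placeholders.
SAMPLE_ROWS = [
    "C0000001|T195|A1.4.1.1.1.1|Antibiotic|AT0000001||",
    "C0000001|T109|A1.4.1.2.1|Organic Chemical|AT0000002||",
    "C0000002|T195|A1.4.1.1.1.1|Antibiotic|AT0000003||",
]


def semantic_types_by_concept(lines):
    """Map each concept identifier to the sorted names of its semantic types."""
    types = defaultdict(set)
    for line in lines:
        row = dict(zip(MRSTY_COLUMNS, line.rstrip("\n").split("|")))
        types[row["CUI"]].add(row["STY"])
    return {cui: sorted(names) for cui, names in types.items()}


def type_count_distribution(lines):
    """Count how many concepts carry 1, 2, 3, ... semantic types."""
    by_concept = semantic_types_by_concept(lines)
    return Counter(len(names) for names in by_concept.values())


if __name__ == "__main__":
    print(semantic_types_by_concept(SAMPLE_ROWS))
    print(dict(type_count_distribution(SAMPLE_ROWS)))
```

    Run over a full release, the distribution shows how common multi-type concepts are.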

    By utilizing the rich structure of the UMLS dataset, Generative AI in Healthcare can help improve the understanding of medical concepts, enabling a Healthcare Chatbot to provide more accurate, context-driven interactions and assist patients and healthcare professionals with more efficient decision-making.

    Our additional expert:
    Mohamed joined John Snow Labs (JSL) in Feb. 2016 as Healthcare Researcher and Author. Other than having 20+ years of experience moving between different healthcare domains (management, training, curricula design, solution architecture, clinical, research, and data management), Mohamed has good experience in working with SQL, big data, machine learning, and Python. Before joining JSL, Mohamed had worked as a Healthcare Facility Manager in his own private practice. He has also worked as a data manager, training consultant, and eHealth Researcher in various companies/organizations in Egypt, Canada, and US (Remotely).
