Other titles
- Metathesaurus Concepts and Their Semantic Types
- Concepts Nucleotide Sequence Concepts and Types
- MRCONSO & MRSTY
Keywords
- UMLS
- Metathesaurus
- UMLS concepts
- Semantic Type
- RRF
- NLM
Nucleotide Sequence Concepts and Types
This dataset contains the full concept structure of the UMLS Metathesaurus for the semantic type “Nucleotide Sequence”. One of its primary purposes is to connect the different names of every concept assigned to a specific semantic type. The Semantic Network contains 125 semantic types. Every Metathesaurus concept is assigned at least one semantic type; very few concepts are assigned as many as five.
Get The Data
- Research: Non-Commercial, Share-Alike, Attribution. Free forever.
- Commercial: Commercial Use, Remix & Adapt, White Label.
Description
The UMLS, or Unified Medical Language System, is a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems. One powerful use of the UMLS is linking health information, medical terms, drug names, and billing codes across different computer systems. Some examples of this are:
– Linking terms and codes between your doctor, your pharmacy, and your insurance company
– Patient care coordination among several departments within a hospital
The UMLS has many other uses, including search engine retrieval, data mining, public health statistics reporting, and terminology research.
The UMLS has three tools (Knowledge Sources):
– Metathesaurus: Terms and codes from many vocabularies, including CPT, ICD-10-CM, LOINC, MeSH, RxNorm, and SNOMED CT
– Semantic Network: Broad categories (semantic types) and their relationships (semantic relations)
– SPECIALIST Lexicon and Lexical Tools: Natural language processing tools
The 2024AA Metathesaurus contains approximately 3.44 million concepts and 13.7 million unique concept names from 199 source vocabularies. The Metathesaurus is a very large, multi-purpose, and multi-lingual vocabulary database that contains information about biomedical and health-related concepts, their various names, and the relationships among them. It is built from the electronic versions of many different thesauri, classifications, code sets, and lists of controlled terms used in patient care, health services billing, public health statistics, indexing and cataloging biomedical literature, and/or basic, clinical, and health services research. In this documentation, these are referred to as the “source vocabularies” of the Metathesaurus. In the Metathesaurus, all the source vocabularies are available in a single, fully-specified database format.
The Metathesaurus is organized by concept or meaning. In essence, its purpose is to link alternative names and views of the same concept together and to identify useful relationships between different concepts. All concepts in the Metathesaurus are assigned to at least one semantic type from the Semantic Network. This provides consistent categorization of all concepts in the Metathesaurus at the relatively general level represented in the Semantic Network.
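As a small illustration of linking alternative names to one concept, the sketch below groups names by their Concept Unique Identifier. The tuples are copied from the Data Preview of this dataset; the function name `names_by_concept` and the tuple layout are hypothetical choices made for the example, not part of the UMLS or of this dataset's tooling.

```python
from collections import defaultdict

# (CUI, source abbreviation, string name) tuples copied from the Data Preview.
ROWS = [
    ("C0004793", "MSH", "Base Sequence"),
    ("C0004793", "MTH", "Base Sequence"),
    ("C0004793", "MSH", "Base Sequences"),
    ("C0004793", "CHV", "base sequence"),
    ("C0004793", "NCI", "Nucleotide Sequence"),
    ("C0004793", "MSH", "Sequence, Nucleotide"),
]


def names_by_concept(rows):
    """Map each CUI to the distinct names linked to it, in first-seen order."""
    grouped = defaultdict(list)
    for cui, _source, name in rows:
        if name not in grouped[cui]:
            grouped[cui].append(name)
    return dict(grouped)


NAMES = names_by_concept(ROWS)
```

The comparison is case-sensitive on purpose: "Base Sequence" and "base sequence" are different strings in the Metathesaurus, so both are kept.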
The purpose of the Semantic Network is to provide a consistent categorization of all concepts represented in the UMLS Metathesaurus and to provide a set of useful relationships between these concepts. All information about specific concepts is found in the Metathesaurus; the Network provides information about the set of basic semantic types, or categories, which may be assigned to these concepts, and it defines the set of relationships that may hold between the semantic types. The current release of the Semantic Network contains 125 semantic types and 54 relationships. The Semantic Network serves as an authority for the semantic types that are assigned to concepts in the Metathesaurus. The Network defines these types, both with textual descriptions and by means of the information inherent in its hierarchies.
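The hierarchy information can be read from a semantic type's tree number, such as A2.1.5.3.1 for “Nucleotide Sequence” in this dataset. The sketch below assumes that each ancestor's tree number is a prefix of its descendant's tree number and that a single letter (for example, "A") marks the root of a tree; the helper `ancestor_tree_numbers` is a hypothetical name introduced for this example.

```python
def ancestor_tree_numbers(tree_number):
    """List the tree numbers of all ancestors, from the root down to the parent.

    Assumes the dotted-prefix convention: dropping the last dotted segment
    yields the parent, and the leading letter alone is the root of the tree.
    """
    parts = tree_number.split(".")
    head = parts[0]
    ancestors = []
    if len(head) > 1:
        # "A2" sits directly under the root "A".
        ancestors.append(head[0])
    for i in range(1, len(parts)):
        ancestors.append(".".join(parts[:i]))
    return ancestors


NUCLEOTIDE_SEQUENCE_ANCESTORS = ancestor_tree_numbers("A2.1.5.3.1")
```

Each returned tree number can then be looked up against the Semantic_Type_Tree_Identifier field to find the corresponding semantic type, where such rows are available.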
About this Dataset
Data Info
Field | Value |
---|---|
Date Created | 2009-07-29 |
Last Modified | 2024-05-06 |
Version | 2024AA |
Update Frequency | Semiannual |
Temporal Coverage | N/A |
Spatial Coverage | N/A |
Source | John Snow Labs; U.S. National Library of Medicine (NLM) |
Source License URL | |
Source License Requirements | Reporting Requirements |
Source Citation | Reporting Requirements |
Keywords | UMLS, Metathesaurus, UMLS concepts, Semantic Type, RRF, NLM |
Other Titles | Metathesaurus Concepts and Their Semantic Types, Concepts Nucleotide Sequence Concepts and Types, MRCONSO & MRSTY |
Data Fields
Name | Description | Type | Constraints |
---|---|---|---|
Concept_Unique_Identifier | Concept Unique Identifier (CUI) is the unique identifier for a Metathesaurus concept to which strings with the same meaning are linked. CUI starts with C followed by 7 digits. | string | required : 1 |
Language_of_Terms | Language of terms in the source vocabulary | string | required : 1 |
Term_Status | Status of the term. P= Preferred LUI of the CUI, S= Non-Preferred LUI of the CUI | string | required : 1 |
Lexical_Unique_Identifier | Lexical Unique Identifier (LUI) is the unique identifier of a term in the Metathesaurus. Terms are different from strings in that they group together strings that are lexical variants of one another. LUI starts with L followed by 7 digits. | string | required : 1 |
String_Type | Type of string. PF= Preferred form of the term, VCW=Case and word-order variant of the preferred form, VC=Case variant of the preferred form, VO=Variant of the preferred form, VW=Word-order variant of the preferred form. | string | required : 1 |
String_Unique_Identifier | String Unique Identifier (SUI) is a unique identifier for each unique string in the Metathesaurus. Strings that differ in any way, e.g., by upper or lower case, will have different SUIs. SUI starts with S followed by at least 7 digits (for example, S0017880 or S11854897). | string | required : 1 |
Is_Preferred | Indicates if the atom status is preferred (true) or not (false) for this string within this concept. | boolean | required : 1 |
Atom_Unique_Identifier | Atom Unique Identifier (AUI) is the unique identifier of an atom in the UMLS, that is, a concept name or string from one of the source vocabularies. It is the primary key of the concepts table. AUI starts with A followed by at least 7 digits (for example, A0029285 or A18590096). | string | required : 1 |
Source_Asserted_Atom_Identifier | Source asserted identifier for an atom | string | - |
Source_Asserted_Concept_Identifier | Source asserted identifier for a concept | string | - |
Source_Asserted_Descriptor_Identifier | Source asserted identifier for a descriptor in the Metathesaurus | string | - |
Source_Abbreviation | Abbreviation of the source vocabulary | string | required : 1 |
Term_Type | Type of term within the source vocabulary. A value indicating the kind of role an atom plays in its source. Examples include PT for "preferred term," SY for "synonym," and MH for "main heading." | string | - |
Source_String_Code | Source string code (CODE) is the unique identifier or code for the string in its source | string | - |
String_Name | Name of string in the Metathesaurus | string | - |
Source_Restriction_Level | | integer | level : Nominal, required : 1 |
Suppressible_Flag | Terms in the UMLS Metathesaurus can be marked as "suppressible" so that they can be removed from a subset, most often because of ambiguity in meaning or lack of face validity. Values: E = Suppressible due to editor decision, N = Not suppressible, O = Obsolete, Y = Suppressible due to SAB | string | required : 1 |
Content_View_Flag_1 | Content View Flag. Bit field used to flag rows included in Content View. This field is a varchar field to maximize the number of bits available for use. | integer | level : Nominal |
Semantic_Type_Unique_Identifier | Unique id assigned to semantic type e.g., T004=Fungus, T005=Virus | string | required : 1 |
Semantic_Type_Tree_Identifier | Semantic type tree number | string | required : 1 |
Semantic_Type_Name | Name of semantic type | string | required : 1 |
Attribute_Type_Identifier | Unique identifier (ATUI) of the attribute that records the semantic type assignment for the concept, e.g., AT17613067 | string | required : 1 |
Content_View_Flag_2 | A bit field used to flag rows included in Content View. This field is a varchar field to maximize the number of bits available for use. | integer | level : Nominal |
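The identifier formats described in the table above can be checked with a few regular expressions. This is a minimal sketch, not an official validator: the `PATTERNS` mapping and the `invalid_fields` helper are hypothetical names, the T-plus-three-digits pattern is inferred from the examples T004, T005, and T086, and the SUI and AUI patterns accept seven or more digits because the Data Preview contains eight-digit values such as S11854897 and A18590096.

```python
import re

# Field name -> expected identifier shape (assumptions noted in the text above).
PATTERNS = {
    "Concept_Unique_Identifier": re.compile(r"C\d{7}"),
    "Lexical_Unique_Identifier": re.compile(r"L\d{7}"),
    "String_Unique_Identifier": re.compile(r"S\d{7,}"),
    "Atom_Unique_Identifier": re.compile(r"A\d{7,}"),
    "Semantic_Type_Unique_Identifier": re.compile(r"T\d{3}"),
}


def invalid_fields(row):
    """Return the names of fields in `row` whose value does not match its pattern."""
    return [
        field
        for field, pattern in PATTERNS.items()
        if field in row and not pattern.fullmatch(str(row[field]))
    ]


# Values copied from one row of the Data Preview.
SAMPLE = {
    "Concept_Unique_Identifier": "C0004793",
    "Lexical_Unique_Identifier": "L0004793",
    "String_Unique_Identifier": "S11854897",
    "Atom_Unique_Identifier": "A18590096",
    "Semantic_Type_Unique_Identifier": "T086",
}
```

Calling `invalid_fields(SAMPLE)` returns an empty list for this row; fields that are absent from a row are skipped rather than reported.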
Data Preview
Concept Unique Identifier | Language of Terms | Term Status | Lexical Unique Identifier | String Type | String Unique Identifier | Is Preferred | Atom Unique Identifier | Source Asserted Atom Identifier | Source Asserted Concept Identifier | Source Asserted Descriptor Identifier | Source Abbreviation | Term Type | Source String Code | String Name | Source Restriction Level | Suppressible Flag | Content View Flag 1 | Semantic Type Unique Identifier | Semantic Type Tree Identifier | Semantic Type Name | Attribute Type Identifier | Content View Flag 2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
C0004793 | ENG | P | L0004793 | PF | S0017880 | False | A0029285 | M0002204 | D001483 | MSH | MH | D001483 | Base Sequence | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | |
C0004793 | ENG | P | L0004793 | PF | S0017880 | True | A3830926 | MTH | PN | NOCODE | Base Sequence | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | |||
C0004793 | ENG | P | L0004793 | VO | S0017883 | True | A0029288 | M0002204 | D001483 | MSH | PM | D001483 | Base Sequences | 0 | N | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | ||
C0004793 | ENG | P | L0004793 | VO | S0085044 | True | A0115449 | M0002204 | D001483 | MSH | PM | D001483 | Sequence, Base | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | |
C0004793 | ENG | P | L0004793 | VO | S0085062 | True | A0115462 | M0002204 | D001483 | MSH | PM | D001483 | Sequences, Base | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | |
C0004793 | ENG | P | L0004793 | VO | S11854897 | True | A18590096 | 4600.0 | 0000001719 | CHV | SY | 0000001719 | base sequence | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | |
C0004793 | ENG | S | L0028628 | PF | S0067638 | False | A10761827 | C45374 | NCI | PT | C45374 | Nucleotide Sequence | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | ||
C0004793 | ENG | S | L0028628 | PF | S0067638 | True | A26649344 | M0002204 | D001483 | MSH | ET | D001483 | Nucleotide Sequence | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | |
C0004793 | ENG | S | L0028628 | VO | S0067639 | True | A0093422 | M0002204 | D001483 | MSH | PM | D001483 | Nucleotide Sequences | 0 | N | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 | ||
C0004793 | ENG | S | L0028628 | VO | S0085053 | True | A0115455 | M0002204 | D001483 | MSH | PM | D001483 | Sequence, Nucleotide | 0 | N | 256.0 | T086 | A2.1.5.3.1 | Nucleotide Sequence | AT17613067 | 256 |
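One way to pick a single display name per concept from rows like these is to keep the row whose Term_Status is P, whose String_Type is PF, whose Is_Preferred value is true, and whose Suppressible_Flag is N. The sketch below applies that rule to a few preview rows. It assumes this combination of flags is how a preferred name should be chosen in this dataset; `Row` and `preferred_names` are hypothetical names introduced for the example.

```python
from collections import namedtuple

Row = namedtuple(
    "Row",
    "cui term_status string_type is_preferred source term_type name suppress",
)

# A subset of the Data Preview, reduced to the fields used by the rule.
PREVIEW = [
    Row("C0004793", "P", "PF", False, "MSH", "MH", "Base Sequence", "N"),
    Row("C0004793", "P", "PF", True, "MTH", "PN", "Base Sequence", "N"),
    Row("C0004793", "P", "VO", True, "MSH", "PM", "Base Sequences", "N"),
    Row("C0004793", "S", "PF", False, "NCI", "PT", "Nucleotide Sequence", "N"),
    Row("C0004793", "S", "PF", True, "MSH", "ET", "Nucleotide Sequence", "N"),
]


def preferred_names(rows):
    """Return {CUI: (name, source)} for the first row per concept that is
    P / PF / preferred and not suppressible."""
    result = {}
    for row in rows:
        if (
            row.term_status == "P"
            and row.string_type == "PF"
            and row.is_preferred
            and row.suppress == "N"
        ):
            result.setdefault(row.cui, (row.name, row.source))
    return result


PREFERRED = preferred_names(PREVIEW)
```

For the rows above, the rule selects "Base Sequence" from the MTH source, since the MSH row with the same string has Is_Preferred set to False.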