The word “hack” has been used and abused countless times on the World Wide Web, mainly because cyber criminals have become so sophisticated that many people think of hacking as purely destructive. At John Snow Labs, however, the word “hack”, which means “to cut with rough or heavy blows”, takes on a different sense. For John Snow Labs, to “hack” is to cut through difficult tasks, using heavy blows to crack open and sift through the mess and extract valuable information. To be clear, this process is carried out only with rightful access and appropriate licensing permissions.
In the last John Snow Labs Monthly Release, 287 datasets were carefully chosen and meticulously curated to serve the steadily growing and ever-evolving needs of the healthcare industry, hence the title “Health Hacked”. All of these medical datasets passed several layers of strict quality assurance and DataOps screening and validation.
A substantial amount of clean, normalized data has now been added to the John Snow Labs catalog and is available for data researchers, scientists and analysts to shake down, test out and investigate, all with the same goal: to improve everyone's health.
At a glance, here’s what users can expect from the latest John Snow Labs datasets:
Four new datasets from the Medicare Claims and Utilization accelerator have been added to the Cost category. They include data for the Provider Summary of Outpatient Prospective Payment System APC from 2011 to 2014. New datasets have also been added for Chronic Condition Dyads and Triads for 2013 to 2015. These datasets should be very helpful in estimating Medicare ambulatory payments for upcoming years and in identifying chronic conditions by dyads and triads across Medicare enrollees and fee-for-service programs.
A robust USA Crime Data dataset is now included in the Census category. It covers crimes reported to the police from 1981 to 2014 in two categories: violent crimes and property crimes. This should be very valuable in gauging crime-related injuries.
Six datasets have been added to the Devices category, covering the Manufacturer and User Facility Device Experience Database (MAUDE), MAUDE with Patient Involvement, Medical Devices Recalls for 2017, Humanitarian Device Exemptions, Clinical Laboratory Improvement Amendments and Medical Device 510(k) Clearances. These datasets should help in establishing safe parameters for the use of medical devices in patient treatment and management.
Fifteen datasets have been added to the Payments category. They are handy for researchers who want to look at Claims for Supplemental Security Income (SSI) Benefits for the Elderly, Blind, Disabled and those with Language Preferences, and they also include End Stage Renal Disease Medicare Claims. The datasets come as quarterly updates for the years 2010 to 2017.
The Physicians category also has one new dataset, the National Provider Identifier to Medicare Crosswalk, which includes an array of other unique, FOIA-disclosable identifiers for health care providers.
The Providers category now has Medicare Provider Utilization and Payment Data for 2013 and 2014. These datasets cover prescription drug events (PDEs) incurred in those years by Medicare beneficiaries with a Part D prescription drug plan. They should be useful in designing new prescription drug plan programs for Medicare beneficiaries.
Lastly, the Terminology category has been overhauled, with 250 new datasets added to its growing catalog. The Anatomy, Activities, Concepts, Organism, Processes and Substances accelerators were expanded. For the Anatomy accelerator, the new datasets span a broad range of information, from the UMLS concept structure for the semantic types to relationships between concepts or atoms known to the Metathesaurus.
The Concepts accelerator provides the different names for the same concept across many vocabularies, along with the semantic types for these concepts. Terminology concepts have also been grouped by activity relationships, age groups, amino acid sequence, biomedical occupation, and body location, region, space or system, as well as by clinical attributes, conceptual entities, professional society relationships, regulation and law relationships, and many others. One needs to dive into the data to get a good grasp of what it covers.
Organism, on the other hand, has information on various species, organized by their concepts and relationships in the UMLS Metathesaurus concept structure at many levels. Processes has information on biological and functional concepts at the cellular level, plus data on diseases and the many factors involved in asymmetrical relationships, in which the direction of the relationship matters. Substances, the last Terminology accelerator, has information on 133 semantic types in the Semantic Network and their various asymmetrical relationships, among others.
All in all, it was a long but steady process for John Snow Labs, as new discoveries were made and new tools set up to help raise the bar for health care. Providers are encouraged to deliver the best care and achieve favorable outcomes for beneficiaries, which will hopefully also be taken into account in future government program initiatives.