LangTest Archives

Human-in-the-loop (HITL) Workflows in the Generative AI Lab

by Dia Trambitas, Ph.D.

In industries like healthcare, where regulatory-grade accuracy is required, human validation of model results is often critical. While models handle the legwork, the Generative AI Lab...

John Snow Labs Launches Automated Testing for Responsible AI—the First No-Code Tool to Test and Evaluate Custom Language Models

by Gina Devine

John Snow Labs, the AI for healthcare company, today announced the release of Automated Responsible...

Evaluating Long-Form Responses with Prometheus-Eval and LangTest

by Kalyan Chakravarthy

Prometheus-Eval and LangTest combine to provide an open-source, reliable, and cost-effective solution for evaluating long-form responses. Prometheus, trained on a comprehensive dataset, matches GPT-4’s performance, while LangTest offers a robust...

Mastering Model Evaluation: Introducing the Comprehensive Ranking & Leaderboard System in LangTest

by Kalyan Chakravarthy

The Model Ranking & Leaderboard system, powered by LangTest from John Snow Labs, provides a systematic approach to evaluating and comparing AI models. It offers comprehensive ranking capabilities, historical comparison,...

Ensuring Precision of LLMs in Medical Domain: The Challenge of Drug Name Swapping

by Kalyan Chakravarthy

Accurate drug name identification is vital for patient safety. Testing GPT-4o with LangTest, which offers a drug_generic_to_brand conversion test, identified potential errors where the model predicts incorrect drug names when...


