Kalyan Chakravarthy's Blog

For Foundation Models, Precision Matters — But Stability Matters More

by Kalyan Chakravarthy

Understanding the Importance of Robustness in AI Models. In recent years, foundation models have revolutionized the AI landscape, powering applications ranging from natural language understanding to image recognition. These models...

Robustness Testing of LLM Models Using LangTest in Databricks

by Kalyan Chakravarthy

In the world of natural language processing (NLP), LLMs like GPT-4 have changed the game for how machines understand and generate human language. They are the foundation for a ton...

Evaluating Long-Form Responses with Prometheus-Eval and Langtest

by Kalyan Chakravarthy

Prometheus-Eval and LangTest combine to provide an open-source, reliable, and cost-effective solution for evaluating long-form responses. Prometheus, trained on a comprehensive dataset, matches GPT-4’s performance, while LangTest offers a robust...

Mastering Model Evaluation: Introducing the Comprehensive Ranking & Leaderboard System in LangTest

by Kalyan Chakravarthy

The Model Ranking & Leaderboard system, powered by LangTest from John Snow Labs, provides a systematic approach to evaluating and comparing AI models. It offers comprehensive ranking capabilities, historical comparison,...

Ensuring Precision of LLMs in Medical Domain: The Challenge of Drug Name Swapping

by Kalyan Chakravarthy

Accurate drug name identification is vital for patient safety. Testing GPT-4o with LangTest, which offers a drug_generic_to_brand conversion test, identified potential errors where the model predicts incorrect drug names when...
