Posts tagged "evaluation"

MLOps

We Tried and Tested the 9 Best Comet Alternatives for Model Evaluation

In this article, you will learn about the best Comet alternatives for model evaluation.

Hamza Tahir

Feb 19, 2026 · 14 mins

LLMOps

What 1,200 Production Deployments Reveal About LLMOps in 2025

Alex Strick van Linschoten

Dec 19, 2025 · 18 mins

LLMOps

LLMOps in Production: Another 419 Case Studies of What Actually Works

Explore 419 new real-world LLMOps case studies from the ZenML database, now totaling 1,182 production implementations—from multi-agent systems to RAG.

Alex Strick van Linschoten

Dec 15, 2025 · 18 mins

LLMOps

8 Best DeepEval Alternatives: Which LLM Evaluation Framework is Better?

In this article, you will learn about the best DeepEval alternatives that you can use for LLM evaluation.

Hamza Tahir

Nov 20, 2025 · 14 mins

LLMOps

8 Best Langfuse Alternatives to Trace, Evaluate, and Manage Prompts for Your LLM Application

In this article, you will learn about the best Langfuse alternatives for tracing, evaluation, prompt management, and metrics for LLM apps.

Hamza Tahir

Nov 14, 2025 · 15 mins

LLMOps

Best LLM Evaluation Tools: Top 9 Frameworks for Testing AI Models

Discover the 9 best LLM evaluation tools to test your AI models before going live.

Hamza Tahir

Oct 9, 2025 · 14 mins

Community

How I Built and Evaluated a Clinical RAG System with ZenML (and Why Custom Evaluation Matters)

On custom evaluation frameworks for clinical RAG systems, showing why domain-specific metrics matter more than plug-and-play solutions when trust and safety are non-negotiable.

Satya Patel

Sep 15, 2025 · 4 mins

LLMOps

The Annotated Guide to the Maven Evals Course (by way of the LLMOps Database)

Lessons from the Maven Evals course are combined with 50+ real-world case studies from ZenML's LLMOps Database to show how companies like Discord, GitHub, and Coursera implement the Three Gulfs model and Analyze-Measure-Improve lifecycle to transform failing LLM systems into production-ready applications.

Alex Strick van Linschoten

Jul 22, 2025 · 12 mins

LLMOps

LLMOps in Production: 287 More Case Studies of What Actually Works

The latest 287 curated summaries of LLMOps use cases in industry, from tech to healthcare to finance and more. This post also highlights some of the trends observed across the case studies.

Alex Strick van Linschoten

Jul 17, 2025 · 15 mins
