ZenML

Automating Healthcare Documentation and Rule Management with GenAI

Orizon 2024

Orizon, a healthcare tech company, faced challenges with manual code documentation and rule interpretation for their medical billing fraud detection system. They implemented a GenAI solution using Databricks' platform to automate code documentation and rule interpretation, resulting in 63% of tasks being automated and reducing documentation time to under 5 minutes. The solution included fine-tuned Llama2-code and DBRX models deployed through Mosaic AI Model Serving, with strict governance and security measures for protecting sensitive healthcare data.

Industry

Healthcare

Overview

Orizon is a health tech company within Brazil’s Bradesco Group that operates a platform connecting insurance companies, hospitals, doctors, and patients. Their core mission involves using advanced analytics and artificial intelligence to detect financial fraud in medical billing—a significant challenge in Brazil’s healthcare market. This case study demonstrates how LLMs were deployed in production to automate internal documentation workflows for medical billing rules, dramatically improving operational efficiency while maintaining stringent data governance requirements.

The Problem

Orizon maintained approximately 40,000 medical rules in their system, with roughly 1,500 new rules being added each month. These rules, coded in legacy languages like C# and C++, were essential for determining healthcare eligibility and detecting fraudulent billing patterns. However, the documentation and interpretation process for these rules created severe operational bottlenecks.

The original workflow was highly manual and resource-intensive. Each time a new rule was added, developers had to manually examine the code, create documentation, and produce flowcharts. This process was described as “very manual and error-prone” by Guilherme Guisse, Head of Data and Analytics at Orizon. Business analysts frequently needed to request C++ developers to interpret code, creating additional delays. The documentation work alone required two dedicated developers from the business analysis team, and adding a single new rule could take several days to complete.

The GenAI Solution Architecture

Orizon’s approach to solving this problem involved several interconnected LLMOps components built on the Databricks Data Intelligence Platform.

Data Infrastructure Modernization

Before deploying LLMs, Orizon transitioned to a data lakehouse architecture to consolidate previously isolated databases into a unified system. Delta Lake provided the foundation for reliable data storage and management, enabling ACID transactions and unified streaming and batch data processing. This infrastructure modernization was essential groundwork for supporting GenAI workloads at scale.

Model Selection and Fine-Tuning

Orizon fine-tuned two models for their use case: Llama2-code and DBRX. The choice of Llama2-code is particularly relevant given that the use case involves interpreting and documenting source code written in C# and C++. DBRX, Databricks’ own foundation model, was also employed. The fine-tuning process incorporated Orizon’s proprietary business rules and domain-specific data, which was critical for generating accurate documentation that reflected the company’s specific healthcare fraud detection logic.
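Fine-tuning on proprietary rules implies a training set of code-to-documentation pairs. The sketch below shows one plausible shape for such a record as a JSONL line; the field names and prompt wording are illustrative assumptions, not Orizon's actual schema.

```python
import json

def make_training_record(rule_id: str, source_code: str, documentation: str) -> str:
    # Hypothetical fine-tuning record: the legacy rule code becomes the
    # prompt, the human-written documentation becomes the target response.
    record = {
        "prompt": (
            f"Explain the following medical billing rule ({rule_id}) "
            f"in plain language:\n{source_code}"
        ),
        "response": documentation,
    }
    return json.dumps(record, ensure_ascii=False)

# One JSONL line of the (hypothetical) fine-tuning corpus.
example = make_training_record(
    "RULE-001",
    "if (claim.Amount > limit) Reject(claim);",
    "Rejects any claim whose billed amount exceeds the configured limit.",
)
```

Pairing each rule's source with its existing documentation lets the model learn the company-specific vocabulary that a general-purpose LLM would miss.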

MLflow for Lifecycle Management

MLflow, the open-source MLOps platform originally developed by Databricks, was used to manage the entire machine learning lifecycle. This included data ingestion, feature engineering, model building, tuning, and experiment tracking. The emphasis on reproducibility through MLflow is noteworthy—in a production LLMOps context, being able to track experiments and reproduce results is crucial for maintaining model quality over time and debugging issues that may arise.
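To make the reproducibility point concrete, the toy tracker below mimics the kind of per-run metadata MLflow records (parameters, metrics, run IDs). It is a simplified stand-in, not the MLflow client API: the real client persists this data to a tracking server instead of an in-memory list.

```python
import time
import uuid

class RunTracker:
    """Toy stand-in for an experiment tracker like MLflow."""

    def __init__(self, experiment: str):
        self.experiment = experiment
        self.runs = []

    def start_run(self, params: dict) -> dict:
        # Each run gets a unique ID plus the exact hyperparameters used,
        # which is what makes a fine-tuning result reproducible later.
        run = {
            "run_id": uuid.uuid4().hex,
            "experiment": self.experiment,
            "start_time": time.time(),
            "params": dict(params),
            "metrics": {},
        }
        self.runs.append(run)
        return run

    def log_metric(self, run: dict, name: str, value: float) -> None:
        run["metrics"][name] = value

# Hypothetical usage for a documentation fine-tuning experiment.
tracker = RunTracker("rule-documentation-finetune")
run = tracker.start_run({"base_model": "llama2-code", "epochs": 3})
tracker.log_metric(run, "eval_loss", 0.42)
```

Recording parameters and metrics against an immutable run ID is what lets a team answer "which exact configuration produced the model now serving in production" when debugging quality regressions.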

Model Serving and Deployment

The fine-tuned models were deployed through Mosaic AI Model Serving, making them accessible across the organization in a secure manner. This serverless model serving approach simplified the deployment process and ensured that the models could scale to meet organizational demand without requiring dedicated infrastructure management.
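From a client's perspective, a served model is just an authenticated HTTP endpoint. The helper below sketches the request a caller would construct for a Databricks serving endpoint's invocations route; the workspace URL, endpoint name, and payload shape are assumptions for illustration, as the exact input format depends on how the model was registered.

```python
import json

def build_invocation_request(workspace_url: str, endpoint_name: str,
                             token: str, question: str):
    # Serving endpoints expose an invocations route on the workspace;
    # auth is a bearer token scoped by the platform's access controls.
    url = f"{workspace_url}/serving-endpoints/{endpoint_name}/invocations"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"inputs": [question]})
    return url, headers, body

# Hypothetical call: endpoint name and token are placeholders.
url, headers, body = build_invocation_request(
    "https://example.cloud.databricks.com",
    "rule-doc-llm",
    "TOKEN",
    "Explain RULE-001",
)
```

Because the endpoint is serverless, the calling code stays this simple regardless of how many replicas the platform scales up behind it.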

Microsoft Teams Integration

A key aspect of the production deployment was the integration of the LLM-powered system into Microsoft Teams. This allowed business users to query the system using natural language and receive instant explanations about rules without relying on developers. The chatbot interface was designed to provide accurate, real-time answers along with graphical documentation, effectively democratizing access to code understanding across the organization.
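The routing layer between Teams and the model can be very thin. The sketch below is a hypothetical handler, not Orizon's implementation: `call_model` stands in for a request to the serving endpoint, and the reply bundles the textual explanation with a link to the generated flowchart, mirroring the chatbot behavior described above.

```python
def handle_teams_message(text: str, call_model) -> dict:
    # Forward the user's natural-language question to the served model,
    # then shape the result as a chat reply with the flowchart attached.
    answer = call_model(text)
    return {
        "type": "message",
        "text": answer["explanation"],
        "attachments": [{
            "contentType": "image/png",
            "contentUrl": answer["flowchart_url"],
        }],
    }

# Hypothetical usage with a stubbed model call.
reply = handle_teams_message(
    "What does rule RULE-001 check?",
    lambda q: {
        "explanation": "RULE-001 rejects claims over the configured limit.",
        "flowchart_url": "https://example.internal/flowcharts/rule-001.png",
    },
)
```

Keeping the bot stateless like this means all governance and access control can live at the serving endpoint rather than in the chat integration.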

Data Governance and Security

Given the sensitive nature of healthcare data and the proprietary nature of Orizon’s fraud detection rules, data governance was a critical consideration in the LLMOps implementation. Unity Catalog was employed to ensure secure data management of business rules. As Guisse noted, “We couldn’t send this data outside the company because our business rules are confidential.”

This constraint meant that Orizon needed to develop and deploy their LLM internally rather than relying on external API-based services. Unity Catalog’s ability to define and enforce granular permissions allowed Orizon to maintain the necessary security and compliance while still enabling authorized personnel to access data for critical operations. This approach aligns with best practices for deploying LLMs in regulated industries where data sovereignty and confidentiality are paramount.
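Granular permissions in Unity Catalog are expressed as SQL grants on catalog objects. The helper below composes such statements; the three-level table names and principal groups are hypothetical examples, though the GRANT syntax follows Unity Catalog's SQL form.

```python
def grant_select(table: str, principal: str) -> str:
    # Unity Catalog addresses tables as catalog.schema.table and grants
    # privileges to account-level groups or users.
    return f"GRANT SELECT ON TABLE {table} TO `{principal}`"

# Hypothetical grants restricting confidential rule data to specific groups.
statements = [
    grant_select("orizon.billing.medical_rules", "rule-analysts"),
    grant_select("orizon.billing.rule_documentation", "business-users"),
]
```

Scoping read access per table and per group is what allows the confidential fraud rules to stay inside the platform while the generated documentation is opened to a wider audience.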

The company operates in Brazil, where privacy regulations require strict protection of patient and medical billing information. The LLMOps implementation had to accommodate these compliance requirements while still delivering the intended productivity benefits.

Results and Outcomes

The case study reports several notable outcomes: 63% of documentation tasks automated, documentation time cut from several days to under five minutes, and a projected (not yet realized) productivity gain of 1 billion BRL. As with any vendor-published case study, these figures should be read with appropriate context.

Future Roadmap

Orizon has outlined additional AI use cases it plans to pursue, indicating ongoing investment in LLMOps capabilities.

Critical Assessment

While the case study presents compelling results, several factors warrant consideration. This is a Databricks-published customer story, so it naturally emphasizes positive outcomes. The 1 billion BRL productivity figure is described as potential rather than realized. Additionally, the long-term sustainability of the 63% automation rate and whether the system maintains accuracy over time as rules evolve is not addressed.

The technical approach appears sound, with appropriate attention to MLOps practices (using MLflow for lifecycle management), security considerations (Unity Catalog for governance), and practical deployment patterns (Microsoft Teams integration for accessibility). The choice to fine-tune models rather than rely on general-purpose LLMs reflects an understanding that domain-specific accuracy is critical in a fraud detection context.

The case study demonstrates a pragmatic approach to LLMOps in a regulated industry, balancing the productivity benefits of automation with the security and compliance requirements inherent in healthcare applications.
