ZenML

RAG-powered Decision Intelligence Platform for Manufacturing Knowledge Management

Circuitry.ai 2023

Circuitry.ai addressed the challenge of managing complex product information for manufacturers by developing an AI-powered decision intelligence platform. Using Databricks' infrastructure, they implemented RAG chatbots to process and serve proprietary customer data, resulting in a 60-70% reduction in information search time. The solution integrated Delta Lake for data management, Unity Catalog for governance, and custom knowledge bases with Llama and DBRX models for accurate response generation.

Industry

Tech

Overview

Circuitry.ai is a startup founded in 2023 by industry leaders with the mission of providing decision intelligence to manufacturers of high-value, complex products such as heavy equipment and automotive. The company was explicitly founded to capitalize on recent advancements in generative AI, aiming to help manufacturers analyze, augment, and automate decisions throughout the buyer lifecycle. Their target customers face significant challenges in managing intricate products with complex bills of materials, requiring specialized knowledge for both sales and service operations. The core offering involves AI-powered advisors (RAG chatbots) that can disseminate product information effectively across various stakeholders including manufacturers, sales associates, channel partners, and end customers.

This case study, while presented as a customer success story by Databricks, offers valuable insights into the practical challenges of building and deploying RAG-based LLM applications in production environments, particularly when dealing with proprietary customer data and the need for multi-tenant data management.

The Problem Space

Circuitry.ai encountered several significant technical challenges in their journey to productionize their GenAI-powered decision support tools. It’s worth noting that these challenges are representative of what many organizations face when moving from prototype to production with RAG systems.

The first major challenge involved applying metadata filters on top of retrievers. The team found insufficient documentation around this capability, which is a common pain point in the RAG ecosystem where filtering retrieved results based on customer-specific metadata is essential for multi-tenant applications. This is particularly important when serving multiple manufacturing clients, each with their own product catalogs and proprietary information.

The second challenge centered on establishing internal quality checks for AI chatbots. The team needed to ensure their chatbots met evaluation criteria before deployment, highlighting the importance of having robust evaluation pipelines in any production LLM system. Without proper evaluation frameworks, organizations risk deploying chatbots that provide inaccurate or irrelevant responses, which could be particularly damaging in a B2B context where trust is paramount.
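The case study does not describe Circuitry.ai's evaluation framework in detail, but a pre-deployment quality gate of the kind described can be sketched as follows. The eval cases, the keyword-based scoring, the `fake_chatbot` stub, and the 0.8 pass threshold are all illustrative assumptions, not details from the case study:

```python
# Minimal sketch of a pre-deployment evaluation gate for a RAG chatbot.
# Answers are scored against expected keywords; deployment is blocked
# unless the mean score clears a threshold.

def keyword_score(answer: str, required_keywords: list[str]) -> float:
    """Fraction of required keywords present in the answer (case-insensitive)."""
    answer_lower = answer.lower()
    hits = sum(1 for kw in required_keywords if kw.lower() in answer_lower)
    return hits / len(required_keywords) if required_keywords else 1.0

def evaluate(answer_fn, eval_cases: list[dict], threshold: float = 0.8) -> dict:
    """Run eval cases through the chatbot and decide whether it may deploy."""
    scores = [keyword_score(answer_fn(c["question"]), c["keywords"]) for c in eval_cases]
    mean_score = sum(scores) / len(scores)
    return {"mean_score": mean_score, "deploy": mean_score >= threshold}

# Stubbed chatbot for demonstration: always returns a canned product answer.
def fake_chatbot(question: str) -> str:
    return "The X100 pump uses a 3-phase motor rated at 460V."

cases = [
    {"question": "What motor does the X100 use?", "keywords": ["3-phase", "460V"]},
    {"question": "What is the X100?", "keywords": ["pump"]},
]
result = evaluate(fake_chatbot, cases)
print(result["deploy"])  # True when the mean keyword score clears the gate
```

In production, the keyword heuristic would typically be replaced or supplemented by LLM-as-a-judge scoring, but the gating structure stays the same.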

A third critical challenge was handling knowledge base updates without disrupting internal RAG chains. In a production environment where new products and knowledge articles are constantly being added, the system needs to support incremental updates without requiring full reprocessing of the entire knowledge base. This is a non-trivial problem that many RAG implementations struggle with.
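One common way to support such incremental updates, sketched here under the assumption of a hash-based change check (the case study does not specify Circuitry.ai's mechanism), is to re-embed only documents whose content has changed:

```python
# Sketch of hash-based incremental re-indexing: only documents whose content
# changed are re-chunked and re-embedded, so adding a new product manual does
# not force a rebuild of the whole knowledge base. The fake embedder and
# in-memory index stand in for a real embedding model and vector store.
import hashlib

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def fake_embed(text: str) -> list[float]:
    # Placeholder for a real embedding model call.
    return [float(len(text)), float(text.count(" "))]

def incremental_update(index: dict, hashes: dict, docs: dict) -> list[str]:
    """Upsert changed/new docs into `index`; return the ids that were reprocessed."""
    reprocessed = []
    for doc_id, text in docs.items():
        h = content_hash(text)
        if hashes.get(doc_id) == h:
            continue  # unchanged: skip embedding entirely
        index[doc_id] = fake_embed(text)
        hashes[doc_id] = h
        reprocessed.append(doc_id)
    return reprocessed

index, hashes = {}, {}
incremental_update(index, hashes, {"manual-a": "v1 text", "manual-b": "spec sheet"})
# Second pass: only the edited document is re-embedded.
changed = incremental_update(index, hashes, {"manual-a": "v2 text", "manual-b": "spec sheet"})
print(changed)  # ['manual-a']
```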

Data segregation and governance emerged as a fundamental requirement. Since Circuitry.ai’s models were trained on their customers’ proprietary data (internal knowledge resources, product information, etc.), ensuring proper data isolation between tenants was essential for building customer trust. The CEO, Ashok Kartham, emphasized that “most of the data that we wanted to use was actually our customer’s proprietary data.”

Finally, the integration of multiple data sources with different structures and formats necessitated a robust data integration framework to ensure consistency. This is a common challenge in enterprise AI applications where information may be scattered across PDFs, manuals, knowledge bases, and various document formats.

Technical Architecture and Implementation

Data Infrastructure Layer

Circuitry.ai built their solution on the Databricks platform, leveraging several key components for their LLMOps infrastructure:

Delta Lake served as the foundational storage layer, providing ACID transactions and unified batch and streaming data processing. Critically, Delta Lake supported the incremental data updates that were crucial for keeping the knowledge base current as new products and knowledge articles were added. This addressed one of their primary pain points around knowledge base management and allowed them to update the RAG system without disrupting ongoing operations.

Unity Catalog provided unified governance for data and AI assets. This was particularly important for Circuitry.ai’s multi-tenant use case, where proper data segregation was essential to protect proprietary customer information. Unity Catalog facilitated the security and governance requirements that were foundational to their model training processes and ensured that client A’s data would never leak into client B’s chatbot responses.

Model Training and Deployment

MLflow enabled experiment tracking and model management, which is essential for any production ML/LLM system. While the case study doesn’t go into extensive detail about their experimentation process, the use of MLflow suggests they were following best practices around reproducibility and model versioning.

Databricks Model Serving provided the deployment infrastructure for their machine learning models. The case study specifically mentions it as a “highly performant serverless vector database with governance built in,” which suggests they were using it for both serving their embedding models and potentially their LLM endpoints.

RAG Pipeline Architecture

The workflow for creating custom knowledge bases followed a structured, notebook-driven approach. While perhaps not the most sophisticated CI/CD pipeline, this provides a practical way for a small team to manage the deployment of customer-specific RAG instances.
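The case study does not enumerate the individual notebook steps, but a knowledge-base build of this kind typically ingests documents, chunks them, embeds the chunks, and writes tenant-tagged records to an index. A minimal sketch, in which the chunk size, embedding stub, and record layout are all assumptions for illustration:

```python
# Illustrative knowledge-base build pipeline: ingest -> chunk -> embed -> index.
# Not Circuitry.ai's actual notebook code; names and parameters are invented.

def chunk(text: str, max_words: int = 50) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def fake_embed(text: str) -> list[float]:
    return [float(len(text))]  # stand-in for a real embedding model

def build_knowledge_base(docs: dict, customer_id: str) -> list[dict]:
    """Produce index records, each tagged with the owning customer for later filtering."""
    records = []
    for doc_id, text in docs.items():
        for i, piece in enumerate(chunk(text)):
            records.append({
                "customer_id": customer_id,   # tenant tag used at retrieval time
                "chunk_id": f"{doc_id}:{i}",
                "text": piece,
                "embedding": fake_embed(piece),
            })
    return records

kb = build_knowledge_base({"manual": "word " * 120}, customer_id="acme")
print(len(kb))  # 120 words at 50 words per chunk -> 3 chunks
```

Tagging every record with its tenant at build time is what makes the metadata filtering described later possible at query time.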

Model Selection and Flexibility

Circuitry.ai implemented their RAG pipeline using generative AI models, specifically Llama and DBRX (Databricks’ own open-source LLM). The CEO highlighted the value of model flexibility: “Databricks has made it easier by supporting multiple models, and we saw this as an immediate benefit — the ability to switch between models and extensively test has been a real plus for us.”

This model-agnostic approach is a best practice in LLMOps, as it allows organizations to swap in newer or better-performing models as they become available without requiring significant architectural changes. Given the rapid pace of LLM development, this flexibility is essential for maintaining competitive AI products.
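A model-agnostic layer of this kind can be sketched as a registry of interchangeable backends behind one call site; the backend names and stubs below are illustrative, standing in for real model-serving endpoints:

```python
# Sketch of a model-agnostic LLM layer: backends register under a name, and
# the application calls a single interface, so swapping Llama for DBRX (or a
# newer model) is a configuration change rather than an architectural one.
from typing import Callable

MODEL_REGISTRY: dict[str, Callable[[str], str]] = {}

def register_model(name: str, fn: Callable[[str], str]) -> None:
    MODEL_REGISTRY[name] = fn

def generate(model_name: str, prompt: str) -> str:
    """Single entry point the RAG chain calls, regardless of backend."""
    try:
        return MODEL_REGISTRY[model_name](prompt)
    except KeyError:
        raise ValueError(f"unknown model: {model_name}") from None

# Stub backends; in production these would call model-serving endpoints.
register_model("llama", lambda prompt: f"[llama] {prompt}")
register_model("dbrx", lambda prompt: f"[dbrx] {prompt}")

# Extensive side-by-side testing is then just a matter of changing the name.
print(generate("llama", "Summarize the X100 service manual."))
print(generate("dbrx", "Summarize the X100 service manual."))
```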

Continuous Improvement Through Feedback

A notable aspect of their implementation was the incorporation of a feedback mechanism for continuous improvement. Users could rate the GenAI-generated responses, creating a feedback loop that helped Circuitry.ai improve their Decision AIdvisor tool.

This user feedback mechanism is a critical component of production LLM systems, as it provides real-world signal on model performance that can be used to improve prompts, fine-tune models, or identify gaps in the knowledge base.
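One simple use of such ratings, sketched here under assumed data shapes (the rating scale, threshold, and record layout are not from the case study), is aggregating feedback per source document to surface weak spots in the knowledge base:

```python
# Sketch of a feedback loop: user ratings on generated answers are aggregated
# per source document, surfacing knowledge-base sections that consistently
# produce low-rated responses.
from collections import defaultdict

def weak_sources(feedback: list[dict], min_ratings: int = 2, threshold: float = 3.0):
    """Return source doc ids whose mean rating (1-5 scale) falls below `threshold`."""
    totals = defaultdict(list)
    for item in feedback:
        totals[item["source_doc"]].append(item["rating"])
    return sorted(
        doc for doc, ratings in totals.items()
        if len(ratings) >= min_ratings and sum(ratings) / len(ratings) < threshold
    )

feedback = [
    {"source_doc": "pump-manual", "rating": 5},
    {"source_doc": "pump-manual", "rating": 4},
    {"source_doc": "wiring-guide", "rating": 2},
    {"source_doc": "wiring-guide", "rating": 1},
]
print(weak_sources(feedback))  # ['wiring-guide']
```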

Metadata Filtering for Multi-Tenancy

With assistance from Databricks, Circuitry.ai implemented metadata filtering alongside Model Serving to filter results and make them more geared toward individual clients. This was particularly important for clients “with a multitude of products and applications,” ensuring that search results were relevant to the specific user’s context.
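The essential pattern can be sketched as restricting retrieval candidates to the querying tenant's records before similarity ranking; the dot-product scoring and record layout below are illustrative assumptions, not Databricks Model Serving internals:

```python
# Sketch of metadata filtering on top of a retriever: candidate chunks are
# restricted to the querying customer's documents first, then ranked by
# similarity, so one tenant's data can never surface in another's results.

def score(query_vec: list[float], doc_vec: list[float]) -> float:
    return sum(q * d for q, d in zip(query_vec, doc_vec))

def retrieve(index: list[dict], query_vec: list[float],
             customer_id: str, k: int = 2) -> list[dict]:
    """Filter by tenant metadata first, then rank by similarity."""
    candidates = [r for r in index if r["customer_id"] == customer_id]
    return sorted(candidates, key=lambda r: score(query_vec, r["embedding"]),
                  reverse=True)[:k]

index = [
    {"customer_id": "acme", "text": "X100 pump specs", "embedding": [1.0, 0.0]},
    {"customer_id": "acme", "text": "X100 wiring", "embedding": [0.2, 0.8]},
    {"customer_id": "globex", "text": "Z9 loader specs", "embedding": [0.9, 0.1]},
]
results = retrieve(index, query_vec=[1.0, 0.0], customer_id="acme", k=1)
print(results[0]["text"])  # the globex record is excluded before ranking
```

The same filter key can also carry finer-grained metadata (product line, region) to keep results relevant for clients with a multitude of products and applications.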

Results and Outcomes

The case study claims a 60-70% reduction in time spent searching for information for Circuitry.ai’s customers. While this is an impressive figure, it should be noted that this comes from a vendor case study and specific measurement methodologies are not provided. That said, this type of efficiency gain is consistent with what RAG systems can achieve when they effectively surface relevant information.

The efficiency gain was particularly evident in onboarding processes, where new employees could now have easier access to relevant information, allowing them to become productive more quickly. Customer feedback from proof-of-concept trials was described as “overwhelmingly positive,” with praise for the speed and relevance of AI-driven responses. The case study notes that customers now receive answers in seconds rather than minutes, eliminating the need to search through large PDF files manually.

Future Directions

Circuitry.ai has outlined several planned extensions to their platform. These suggest a move toward more sophisticated AI agents that can take actions, not just answer questions, which represents the next frontier in enterprise AI applications.

Critical Assessment

While this case study presents a positive picture of Circuitry.ai's implementation, it is worth remembering that it is a vendor-authored success story: the headline efficiency figures are not independently verified, and failure modes and limitations go largely undiscussed. Despite these caveats, the case study provides valuable insights into the practical challenges and solutions involved in deploying RAG-based LLM applications in a B2B SaaS context, particularly around multi-tenancy, data governance, incremental updates, and the importance of model flexibility and feedback mechanisms in production systems.
