Swiggy implemented a neural search system powered by fine-tuned LLMs to enable conversational food and grocery discovery across their platforms. The system handles open-ended queries to provide personalized recommendations from a catalog of over 50 million items. Swiggy is also developing LLM-powered chatbots for customer service and restaurant partner support, as well as a conversational Dineout bot for restaurant discovery, reflecting a comprehensive approach to integrating generative AI across their ecosystem.
Swiggy is one of India’s largest food delivery and quick commerce platforms, serving millions of customers with food ordering, grocery delivery (Instamart), and restaurant discovery (Dineout). In this case study from August 2023, Swiggy describes their generative AI initiatives aimed at transforming how customers interact with their platform. The central focus is on making food and product discovery more intuitive through conversational interfaces powered by Large Language Models.
The case study is notable for describing a multi-pronged LLM deployment strategy that spans consumer-facing search, customer service, and B2B partner support tools. While the article is promotional in nature and lacks detailed technical metrics or production performance data, it provides valuable insights into how a large-scale e-commerce platform is approaching LLM integration across multiple touchpoints.
Swiggy identifies a fundamental user experience challenge: with over 50 million items in their food catalog, customers often find it overwhelming to decide what to order. Traditional keyword-based search requires users to know specific dish or restaurant names, which creates friction, especially when users have abstract preferences like “something healthy after a workout” or “vegan-friendly starters.” The platform recognized that enabling more natural, conversational queries could significantly improve the discovery experience and potentially unlock new user behaviors.
The flagship LLM application described is neural search, which enables users to search using conversational and open-ended queries rather than specific keywords. The system is designed to understand queries like “I just finished my workout. Show me healthy lunch options” or “Show me vegan-friendly starters” and return personalized recommendations.
Swiggy built this capability using a Large Language Model that was specifically adapted to understand food-domain terminology including dishes, recipes, restaurants, and Swiggy-specific search patterns. The fine-tuning process is described as a “meticulous two-stage process” designed to ensure the model responds accurately to food-related queries in real-time.
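The article does not disclose the base model or retrieval architecture, but a common pattern for this kind of conversational search is embedding-based retrieval: encode the query and catalog items into the same vector space and rank by similarity. The minimal sketch below uses an off-the-shelf sentence-transformers model purely as a stand-in for Swiggy's domain-fine-tuned model, with invented catalog entries.

```python
# Minimal sketch of embedding-based conversational search, assuming an
# off-the-shelf sentence-transformers model as a stand-in for Swiggy's
# domain-fine-tuned LLM. Catalog entries are invented for illustration.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

catalog = [
    "Grilled chicken salad with quinoa - high protein, low carb",
    "Vegan spring rolls with peanut dip - plant-based starter",
    "Butter chicken with garlic naan - rich North Indian curry",
]
catalog_embeddings = model.encode(catalog, convert_to_tensor=True)

query = "I just finished my workout. Show me healthy lunch options"
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank catalog items by cosine similarity to the open-ended query.
hits = util.semantic_search(query_embedding, catalog_embeddings, top_k=2)[0]
for hit in hits:
    print(catalog[hit["corpus_id"]], round(hit["score"], 3))
```

The value of fine-tuning in this setup is that a domain-adapted encoder maps “post-workout healthy lunch” closer to high-protein dishes than a generic model would.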
The in-house development approach is emphasized as providing several operational advantages: greater control over the product, faster iteration cycles, and flexibility to adapt to changing market trends. This is a significant LLMOps decision, as many companies initially adopt third-party API-based solutions for speed to market. Swiggy’s choice to build in-house suggests a long-term strategic commitment to owning their AI capabilities, though it also implies higher upfront investment in ML engineering resources and infrastructure.
At the time of writing (August 2023), the neural search feature was planned to enter pilot by September 2023. The article describes a phased rollout strategy: starting with a pilot to gather learnings and results, followed by a planned expansion to “all search traffic in our app.” This cautious, metrics-driven rollout approach is a sound LLMOps practice, allowing the team to identify edge cases, measure real-world performance, and refine the model before full production deployment.
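The rollout mechanics are not described, but a deterministic, hash-based traffic split is a standard way to run this kind of pilot: users are bucketed consistently, and the pilot fraction is raised toward full traffic as metrics come in. The sketch below is a generic illustration, not Swiggy's implementation.

```python
# Generic illustration of a hash-based pilot rollout; this is a standard
# pattern, not Swiggy's disclosed implementation.
import hashlib

def in_pilot(user_id: str, rollout_fraction: float) -> bool:
    """Deterministically bucket a user into the pilot cohort."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 10_000
    return bucket < rollout_fraction * 10_000

# Start with a small slice of search traffic, then raise the fraction
# toward 1.0 ("all search traffic") as pilot metrics come in.
use_neural_search = in_pilot("user-12345", rollout_fraction=0.05)
```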
The article mentions future plans to support voice-based queries and queries in select Indian languages. This represents a significant expansion of the LLM pipeline, requiring integration with speech-to-text systems and multilingual model capabilities. India’s linguistic diversity makes this particularly important for reaching users who may not be comfortable with English-language search, but it also introduces additional complexity in terms of model performance across languages and potential latency considerations for voice interfaces.
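As an illustration of how a voice query could feed the existing search path, the sketch below transcribes audio with OpenAI's Whisper API before handing the text to search. The choice of Whisper is purely hypothetical; the article does not name a speech-to-text system.

```python
# Hypothetical voice-query front end: transcribe audio, then hand the
# text to the existing search path. Whisper is an illustrative choice;
# the article does not name a speech-to-text system.
from openai import OpenAI

client = OpenAI()

with open("voice_query.m4a", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Whisper supports many languages, so a Hindi or Tamil query would
# arrive here as text and flow into the same neural search entry point.
query_text = transcript.text
```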
Beyond search, Swiggy describes using generative AI techniques to enrich their catalog with images and detailed dish descriptions. The challenge addressed here is that many dish names on the platform may be unfamiliar to users—the article cites examples like “Chicken Dominator” pizza or the Kerala breakfast item “Nool Appam.” By automatically generating comprehensive descriptions, the platform aims to help users make more informed decisions.
While the technical details are sparse, this application represents a content generation use case that is distinct from the search functionality. It likely involves different prompting strategies and potentially different models optimized for descriptive text generation rather than semantic understanding of queries.
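As a rough sketch of that content-generation pattern, a description generator might look like the following. The model choice, prompt wording, and `describe_dish` helper are all assumptions, since the article gives no implementation details; the system prompt's instruction against inventing ingredients reflects the hallucination risk inherent in generating descriptions for unfamiliar dishes.

```python
# Hedged sketch of automated dish-description generation; the model
# choice, prompt wording, and describe_dish helper are all assumptions,
# as the article gives no implementation details.
from openai import OpenAI

client = OpenAI()

def describe_dish(dish_name: str, context_hint: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model choice
        messages=[
            {"role": "system",
             "content": "You write short, factual menu descriptions. "
                        "Do not invent ingredients you are unsure of."},
            {"role": "user",
             "content": f"Describe the dish '{dish_name}' "
                        f"({context_hint}) in two sentences."},
        ],
    )
    return response.choices[0].message.content

print(describe_dish("Nool Appam", "Kerala breakfast item"))
```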
Swiggy mentions integrating neural search into Instamart, their grocery and household items delivery service. This represents an interesting challenge in LLM deployment: the same conversational search paradigm is being applied to a fundamentally different product catalog with different user intent patterns. Grocery shopping often involves more routine, habitual purchases compared to food ordering, and the vocabulary and query patterns may differ significantly. This cross-platform deployment strategy suggests Swiggy is building reusable LLM infrastructure that can be adapted across product lines.
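To make “reusable LLM infrastructure” concrete, a hypothetical sketch of a vertical-agnostic search interface follows; every name here is invented for illustration.

```python
# Hypothetical sketch of a vertical-agnostic search interface; every
# name here is invented to make "reusable LLM infrastructure" concrete.
from typing import Protocol

class CatalogSearch(Protocol):
    def search(self, query: str, user_id: str) -> list[str]: ...

class FoodSearch:
    def search(self, query: str, user_id: str) -> list[str]:
        # Retrieval tuned for open-ended dish and restaurant queries.
        return ["Grilled chicken salad", "Vegan starters platter"]

class InstamartSearch:
    def search(self, query: str, user_id: str) -> list[str]:
        # Grocery retrieval, where intent is more routine and habitual.
        return ["Whole wheat bread", "Toned milk 1L"]

def search_backend(vertical: str) -> CatalogSearch:
    return {"food": FoodSearch(), "instamart": InstamartSearch()}[vertical]
```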
For Dineout, their restaurant discovery platform, Swiggy describes building a “conversational bot” that acts as a “virtual concierge.” This application goes beyond search to include multi-turn conversation, guiding users to restaurants based on preferences like ambience, kid-friendliness, valet parking, ratings, and cost. This represents a more complex conversational AI system that needs to handle preference elicitation, multi-criteria filtering, and potentially recommendations that balance multiple factors.
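The article does not describe how the bot represents preferences, but multi-turn preference elicitation typically accumulates structured filters across turns and applies them to the catalog. The sketch below, with invented field names and data shapes, illustrates that pattern.

```python
# Illustration of multi-turn preference accumulation with multi-criteria
# filtering; field names and data shapes are invented for the sketch.
from dataclasses import dataclass

@dataclass
class DineoutPreferences:
    kid_friendly: bool | None = None
    valet_parking: bool | None = None
    min_rating: float | None = None
    max_cost_for_two: int | None = None

    def update(self, **kwargs) -> None:
        """Merge preferences elicited in the latest conversation turn."""
        for key, value in kwargs.items():
            setattr(self, key, value)

def matches(restaurant: dict, prefs: DineoutPreferences) -> bool:
    if prefs.kid_friendly and not restaurant["kid_friendly"]:
        return False
    if prefs.valet_parking and not restaurant["valet_parking"]:
        return False
    if prefs.min_rating is not None and restaurant["rating"] < prefs.min_rating:
        return False
    if prefs.max_cost_for_two is not None and restaurant["cost_for_two"] > prefs.max_cost_for_two:
        return False
    return True

# Turn 1: "somewhere kid-friendly" -> turn 2: "rated 4+ with valet parking"
prefs = DineoutPreferences()
prefs.update(kid_friendly=True)
prefs.update(min_rating=4.0, valet_parking=True)
```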
In a departure from their in-house approach, Swiggy mentions collaborating with a third party to develop a GPT-4 powered chatbot for customer service. The goal is to provide “efficient and empathetic service” for frequently asked customer queries. This hybrid strategy—building some capabilities in-house while partnering for others—is a pragmatic LLMOps approach. Customer service chatbots have well-established patterns and third-party solutions may offer faster deployment, while core differentiated capabilities like neural search warrant custom development.
The use of GPT-4 specifically is notable, as it represents one of the more capable (and expensive) models available. For customer service applications, the trade-off between model capability and cost per query is an important consideration, particularly at Swiggy’s scale.
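One common way to manage that trade-off is tiered model routing, where a cheaper model handles routine FAQs and harder queries escalate to GPT-4. Nothing in the article confirms Swiggy does this; the sketch below is a generic illustration of the pattern.

```python
# Generic sketch of tiered model routing for the capability/cost
# trade-off; not something the article confirms Swiggy implements.
from openai import OpenAI

client = OpenAI()

def answer_customer(query: str, escalate: bool = False) -> str:
    model = "gpt-4" if escalate else "gpt-3.5-turbo"  # illustrative tiers
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are an efficient, empathetic customer-service "
                        "agent for a food delivery platform."},
            {"role": "user", "content": query},
        ],
    )
    return response.choices[0].message.content
```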
An often-overlooked LLM application area is internal or B2B tooling, and Swiggy describes piloting in-house tuned LLMs to support their restaurant partners. The use case involves helping restaurant owners self-serve on processes and questions related to onboarding, ratings, payouts, and other operational topics. A conversational assistant powered by this LLM will be available in the restaurant-owner app and via WhatsApp.
This B2B application is interesting from an LLMOps perspective: it serves restaurant partners rather than consumers, it must answer operational questions about processes like onboarding, ratings, and payouts, and it has to be delivered consistently across two channels with different constraints, the partner app and WhatsApp.
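As a purely illustrative sketch of the WhatsApp delivery path, the handler below assumes a Twilio-style inbound webhook (a "Body" form field) and a placeholder for the in-house tuned LLM; the article does not describe the actual integration.

```python
# Purely illustrative WhatsApp delivery path, assuming a Twilio-style
# inbound webhook ("Body" form field) and a placeholder for the in-house
# tuned LLM; the article does not describe the actual integration.
from flask import Flask, Response, request

app = Flask(__name__)

def partner_llm_answer(question: str) -> str:
    # Stand-in for the in-house model answering onboarding, ratings,
    # and payout questions.
    return f"(answer to: {question})"

@app.route("/whatsapp-webhook", methods=["POST"])
def whatsapp_webhook() -> Response:
    question = request.form.get("Body", "")
    reply = partner_llm_answer(question)
    twiml = f"<Response><Message>{reply}</Message></Response>"
    return Response(twiml, mimetype="application/xml")
```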
While the case study provides a compelling vision for LLM-powered food discovery, several aspects warrant careful consideration:
Lack of Quantitative Results: The article does not provide any metrics on search quality improvement, user engagement, conversion rates, or other business outcomes. The neural search was still in pilot at the time of writing, so production results were not available, but this makes it difficult to assess the actual impact of these investments.
Promotional Tone: As a company blog post, the article naturally emphasizes the positive aspects of Swiggy’s AI initiatives. There is no discussion of challenges, failures, or trade-offs encountered during development.
Technical Depth: While the article mentions a “two-stage fine-tuning process,” it provides no details on the base model used, training data, evaluation methodology, or latency/throughput characteristics. These details would be valuable for understanding the true LLMOps complexity involved.
Scalability Questions: With 50 million items in their catalog, the real-time performance requirements are substantial. The article does not address how latency is managed, what infrastructure supports the neural search, or how costs are controlled at scale.
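For context on what serving a catalog of this size in real time typically requires: exact similarity search over tens of millions of vectors is usually replaced with an approximate nearest-neighbor index. The sketch below uses FAISS as an illustrative choice; nothing in the article confirms Swiggy's infrastructure.

```python
# Context for the latency question: at tens of millions of items, exact
# search is usually replaced with an approximate nearest-neighbor index.
# FAISS is an illustrative choice, not something the article confirms.
import faiss
import numpy as np

d = 384                       # embedding dimension (illustrative)
n_items = 1_000_000           # stand-in for a far larger catalog
item_vectors = np.random.rand(n_items, d).astype("float32")

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, 1024)  # 1024 coarse clusters
index.train(item_vectors)
index.add(item_vectors)
index.nprobe = 16             # clusters probed per query: recall vs. latency

query_vector = np.random.rand(1, d).astype("float32")
distances, item_ids = index.search(query_vector, k=10)
```

The `nprobe` setting makes the recall-versus-latency trade-off explicit, which is exactly the kind of knob the article leaves undiscussed.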
Swiggy’s generative AI initiatives represent an ambitious, multi-pronged approach to integrating LLMs across their platform ecosystem. The emphasis on in-house development for core capabilities, combined with strategic use of third-party solutions for customer service, reflects a thoughtful LLMOps strategy. The phased rollout approach and planned expansions into voice and multilingual support demonstrate awareness of the operational complexity involved in production LLM deployment. However, the lack of quantitative outcomes and technical details limits our ability to assess the true success and scalability of these initiatives.