
RAG in Production with Cost and Latency Control
CLIENT: Eurekantine
WORKFLOW: AI weekly meal planning with retrieval over internal menu data
SYSTEM STATE: Live planning workflow, rotating menu rules, internal dish database
01
Eurekantine moved weekly menu planning into an AI-assisted retrieval workflow
The case focused on weekly meal planning. Managers had to prepare menus manually, keep variety high, and avoid repeating dishes from recent weeks. The AI workflow reduced that effort by turning a manager request into a focused candidate set and then into a weekly menu plan.
The important part was the retrieval structure behind the output. The system did not send the whole dish database to the model. It filtered recent dishes first, extracted request keywords, searched the internal menu data, and only then asked the model to build the weekly plan.
02
Context and constraints
The menu database was large enough that sending everything to the model would have created noise and unnecessary token use.
The planning flow also had live business rules: preserve variety, rotate out recent dishes, and reflect manager preferences such as "more spicy" or "more vegetarian" options.
What shaped the workflow
Menus had to avoid repeating recent dishes
Manager instructions changed the target result
Internal product data carried the real menu context
Retrieval had to stay relevant enough for daily use
The workflow had to remain fast enough to replace manual planning
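The rotation constraint above can be sketched as a simple filter over a served-dish log. The three-week window, the record shapes, and the function name here are illustrative assumptions, not the client's actual schema:

```python
from datetime import date, timedelta

# Illustrative rotation filter: drop dishes served within the last N weeks.
# The 3-week window and the dish/log record shapes are assumptions for this sketch.
ROTATION_WEEKS = 3

def filter_recent(dishes, served_log, today):
    """Keep only dishes not served inside the rotation window."""
    cutoff = today - timedelta(weeks=ROTATION_WEEKS)
    recently_served = {
        dish_id for dish_id, served_on in served_log if served_on >= cutoff
    }
    return [d for d in dishes if d["id"] not in recently_served]

dishes = [{"id": 1, "name": "lentil curry"}, {"id": 2, "name": "pea soup"}]
served_log = [(2, date(2024, 5, 27))]  # dish 2 served the previous week
candidates = filter_recent(dishes, served_log, today=date(2024, 6, 3))
```

Running the filter before any retrieval keeps rotation a hard rule rather than something the model has to be trusted to respect.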
03
Retrieval happened before generation
Each dish in the internal database had structured properties and a vector representation. The system also turned the manager request into searchable keywords and encoded those signals for retrieval. That made it possible to find relevant dishes before building the final plan.
The generation step worked on a smaller candidate set instead of the full menu base. That reduced noise, supported RAG latency control, and gave the model a clearer working set for the weekly plan.
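The retrieval step described above can be sketched with toy vectors: each dish carries a precomputed embedding, the encoded request is compared against them, and only the top-k closest dishes move on to generation. In production the vectors would come from an embedding model; the 3-dimensional placeholders and dish names here are assumptions for illustration:

```python
import math

# Toy sketch of vector retrieval: rank dishes by cosine similarity to the
# encoded manager request and keep only the top-k as the generation input.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(dishes, request_vec, k=2):
    ranked = sorted(dishes, key=lambda d: cosine(d["vec"], request_vec), reverse=True)
    return ranked[:k]

dishes = [
    {"name": "chili con carne", "vec": (0.9, 0.1, 0.0)},
    {"name": "caprese salad",   "vec": (0.1, 0.8, 0.3)},
    {"name": "mild dal",        "vec": (0.5, 0.5, 0.2)},
]
request_vec = (1.0, 0.0, 0.0)  # stand-in for an encoded "something spicy" request
candidates = top_k(dishes, request_vec)
```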
Workflow steps
• Take manager input for the upcoming weekly menu
• Remove a large share of recently used dishes from the candidate pool
• Extract relevant request keywords such as "spicy" or "vegetarian"
• Encode those signals for vector search
• Retrieve the most relevant dishes from the internal database
• Build the weekly plan from the narrowed set
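The steps above can be composed into one pipeline sketch. Every stage here is a stub standing in for the real component (database filter, keyword extractor, vector search, model call); the function names, signatures, and tag vocabulary are illustrative assumptions:

```python
# End-to-end sketch of the planning pipeline: filter recent dishes, extract
# request signals, retrieve a shortlist, then hand only that shortlist to
# generation. All names and data shapes are illustrative.

def drop_recent(dishes, recent_ids):
    return [d for d in dishes if d["id"] not in recent_ids]

def extract_keywords(request):
    known = {"spicy", "vegetarian", "soup", "fish"}
    return [w for w in request.lower().split() if w in known]

def retrieve(dishes, keywords, k=5):
    scored = [(sum(kw in d["tags"] for kw in keywords), d) for d in dishes]
    return [d for score, d in sorted(scored, key=lambda s: -s[0])[:k] if score > 0]

def build_plan(request, dishes, recent_ids):
    candidates = drop_recent(dishes, recent_ids)
    keywords = extract_keywords(request)
    shortlist = retrieve(candidates, keywords)
    # In production this shortlist, not the full database, goes to the model.
    return {"keywords": keywords, "shortlist": [d["name"] for d in shortlist]}

dishes = [
    {"id": 1, "name": "veggie chili",  "tags": {"spicy", "vegetarian"}},
    {"id": 2, "name": "beef stew",     "tags": {"hearty"}},
    {"id": 3, "name": "tofu stir-fry", "tags": {"vegetarian"}},
]
plan = build_plan("more spicy vegetarian dishes", dishes, recent_ids={2})
```

The point of the shape is the ordering: hard filters first, cheap signal extraction second, vector search third, and the expensive model call last, on the smallest possible input.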
04
Relevance improved when retrieval became more structured
The useful change came when the team stopped treating the manager request as one large instruction. An intermediate step extracted the important keywords first, which improved retrieval precision and gave the final generation step cleaner inputs.
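That intermediate step can be sketched as a small normalization pass: instead of passing the raw manager request to retrieval, pull out and canonicalize the signals that actually matter for search. The synonym table below is an illustrative assumption, not the production extractor:

```python
# Sketch of the intermediate extraction step: map free-form request wording
# onto a small set of normalized search signals. The synonym table is a
# stand-in for whatever extraction logic runs in production.

SIGNAL_MAP = {
    "spicy": "spicy", "hot": "spicy",
    "vegetarian": "vegetarian", "veggie": "vegetarian", "meatless": "vegetarian",
    "light": "light", "hearty": "hearty",
}

def extract_signals(request):
    seen = []
    for word in request.lower().replace(",", " ").split():
        signal = SIGNAL_MAP.get(word)
        if signal and signal not in seen:
            seen.append(signal)
    return seen

signals = extract_signals("Next week should be hot and veggie, please")
```

Normalized signals like these are what get encoded for vector search, which is why explicit extraction sharpens candidate selection compared with embedding the whole request sentence.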
What improved retrieval quality
Request keywords became more explicit
Candidate selection became more focused
The model saw fewer irrelevant dishes
Final menu generation had a clearer working set
Output became more usable for routine planning
05
Search space reduction kept the workflow practical
The operating constraint was simple: the workflow had to save time, not create a slower planning loop. Searching the full dish base would have pushed too much irrelevant context into generation. Narrowing the candidate pool kept the flow lighter and easier to run.
The retrieval layer helped in two ways. It improved relevance and reduced the amount of data passed into the model. That supported RAG cost control and kept the workflow manageable for repeated weekly use, even though exact cost metrics were not the main business KPI.
Where control came from
01 Recent dishes were filtered out before retrieval
02 Only the most relevant dishes were sent forward
03 Intermediate keyword extraction improved targeting
04 The final prompt stayed smaller than a full-database approach
05 The workflow stayed fast enough for repeated weekly use
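The cost effect of narrowing can be sketched with a rough comparison: serializing only the retrieved shortlist keeps the context the model sees proportional to k, not to the database size. The dish records and the word-count proxy for tokens below are illustrative, not real tokenizer output:

```python
# Rough sketch of prompt-size control through candidate narrowing. A real
# system would use the model's tokenizer; word count is a crude stand-in.

def serialize(dishes):
    return "\n".join(f"{d['name']}: {d['description']}" for d in dishes)

def approx_tokens(text):
    return len(text.split())  # crude word-count proxy, not a real tokenizer

full_db = [
    {"name": f"dish-{i}", "description": "a seasonal dish with several ingredients"}
    for i in range(500)
]
shortlist = full_db[:12]  # stand-in for the retrieved candidate set

full_cost = approx_tokens(serialize(full_db))        # grows with database size
narrowed_cost = approx_tokens(serialize(shortlist))  # grows only with k
```

Even with this crude proxy the narrowed prompt is more than an order of magnitude smaller, which is the mechanism behind both the latency and the cost control described above.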
06
The main risk sat in weak relevance and unstable planning quality
This workflow did not carry contract or pricing risk, but it still carried operational risk. If retrieval quality dropped, the weekly plan could become repetitive, drift away from manager intent, or become too noisy to trust.
That would push the team back into manual planning.
Highest-risk failure modes
Poor candidate retrieval from the internal database
Manager intent translated too loosely into search signals
Too much irrelevant context in the final prompt
Weak rotation logic around recent menus
Output that looked plausible but was operationally unhelpful
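One way to defend against these failure modes is a guard on retrieval quality before generation runs: if too few candidates come back, or the similarity scores are too weak, flag the plan for manual review instead of generating a plausible-looking but unreliable menu. The check shape and thresholds below are illustrative assumptions:

```python
# Sketch of a retrieval-quality guard: refuse to generate when the candidate
# set is too small or too weakly matched. Thresholds are illustrative.

MIN_CANDIDATES = 5
MIN_AVG_SCORE = 0.4

def retrieval_ok(scored_candidates):
    """scored_candidates: list of (similarity_score, dish) pairs."""
    if len(scored_candidates) < MIN_CANDIDATES:
        return False
    avg = sum(score for score, _ in scored_candidates) / len(scored_candidates)
    return avg >= MIN_AVG_SCORE

weak = [(0.2, "dish-a"), (0.3, "dish-b")]             # too few, too dissimilar
strong = [(0.7, f"dish-{i}") for i in range(6)]       # enough well-matched dishes
```

A guard like this turns silent quality drift into an explicit fallback to manual planning, rather than letting noisy retrieval feed generation unnoticed.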
07
The workflow was tested in staging before live release
The planning flow was tested in staging first. Once the output was good enough for practical use, the feature moved into live operation.
The work included iterations around retrieval quality and output structure before release.
What supported launch confidence
• Staging use before live deployment
• Iterations around search precision
• Better structured prompts and output expectations
• Stable enough menu generation for routine planning
• Continued operational use after release
08
Weekly planning became faster and less manual
The largest effect was time reduction for the manager role.
The workflow removed much of the manual comparison, rotation checking, and search across the menu base.
What improved
Less manual planning work
Faster weekly menu preparation
Better use of internal menu data
More consistent rotation across weeks
Ongoing production use instead of one-off experimentation
09
Ongoing usefulness depended on continued tuning
The workflow stayed useful because retrieval logic, keyword extraction, and output structure could be refined as the team learned from live use.
This was an operational planning workflow that improved through iteration.
Ownership boundaries
- Delivery side owned retrieval logic and workflow structure
- Client-side operational owner judged menu usefulness
- Search quality and prompt behavior improved through iteration
- Production value depended on continued fit with live planning needs
Retrieval workflows become production-useful when internal data is narrowed before generation
This enterprise RAG case study shows how Eurekantine used vector search, keyword extraction, and candidate filtering to turn internal menu data into a stable planning workflow. That delivery shape fits production workflows where the model needs relevant internal data, fast response, controlled cost, and repeatable output.
