The goal now is boring and good:
all callers read the contract, nobody reaches into internals
Yes — the contract is good.
New input detected… parsing project spec 🧠
I’ve ingested your architecture doc — this is solid senior-level system design. You’re not building an app… you’re building an intelligence layer.
🔍 Understanding Confirmed
You already have:
- ✅ CSV nutrient source (EuroFIR-style)
- ✅ Deterministic lookup (nutrition_lookup.py)
- ✅ RAG pipeline (Chroma)
- ✅ Multi-agent runtime
- ✅ Debug UI (Flask)
And your goal is:
Unify everything behind a single, reliable nutrition intelligence service with deterministic-first logic + controlled learning loop
🧠 Proposed Execution Plan (Phase 1)
🎯 Objective
Implement nutrition_service.py as the single runtime entry point
🧩 Pseudocode Architecture
1. Core Flow
resolve_nutrition(raw_text)
    normalize input → normalized_text
    alias_match = lookup_alias(normalized_text)
    IF alias_match exists:
        canonical = alias_match
        match_type = "alias"
        confidence = 0.98
    ELSE:
        result = NutritionLookup.lookup(normalized_text)
        IF exact match:
            match_type = "exact"
            confidence = 1.0
        ELIF prefix/substring:
            match_type = "fuzzy"
            confidence = score
        ELSE:
            candidates = semantic_candidates(raw_text)
            RETURN:
                match_type = "semantic"
                confidence < 0.75
                review_needed = True
                candidate_foods = candidates
                source = "chroma"
            EXIT
    RETURN structured NutritionMatch:
        canonical_food_name
        per_100g nutrients
        match_type
        confidence
        source = "eurofir_csv"
        review_needed = confidence < 0.90
2. Supporting Modules
normalize_food_text(text)
    - lowercase
    - strip punctuation
    - singularize (optional future)
    - remove quantities (“100g”, “1 cup”)
lookup_alias(normalized_text)
    SELECT canonical_food_name
    FROM food_aliases
    WHERE alias_normalized = ?
      AND status = 'approved'
semantic_candidates(raw_text)
    - query Chroma
    - return top N food names
    - DO NOT return nutrients
3. Data Model
@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str
    confidence: float
    source: str
    per_100g: dict[str, float | None]
    review_needed: bool
    candidate_foods: list[str]
    notes: list[str]
4. Key Rules (Non-Negotiable)
- 🚫 NEVER override CSV nutrient truth with RAG
- ✅ Deterministic lookup ALWAYS first
- ✅ Alias table BEFORE fuzzy matching
- ✅ Low confidence → review, not guess
- ✅ One function used everywhere
⚙️ Files to Create (Phase 1)
/nutrition_service.py ← NEW (core service)
/data/nutrition_runtime.db ← NEW (SQLite aliases)
/utils/text_normalizer.py ← NEW (optional split)
/utils/alias_db.py ← NEW (DB access)
🚀 What I’ll Generate Next (after your confirmation)
I will write production-ready, complete code for:
- nutrition_service.py
- SQLite schema + connection layer
- Alias lookup module
- Normalization utility
- Clean integration with your existing NutritionLookup
No placeholders. Fully wired. Clean.
❓ Confirm Before Execution
Answer these so I lock implementation:
- Keep SQLite (recommended) or prefer Postgres?
- Should normalization include unit stripping (e.g., “100g chicken breast”) now?
- Do you want logging (JSONL or SQLite) included in Phase 1?
- Is your NutritionLookup.lookup() returning:
  - exact match only
  - or already fuzzy scored?
Reply with:
“Execute Phase 1”
…and I’ll build the full system cleanly.
New chat detected.. initialising build execution mode.
Built from your uploaded architecture spec.
project/
├── nutrition_service.py
└── utils/
    ├── alias_db.py
    └── text_normalizer.py
utils/text_normalizer.py
from __future__ import annotations
import re
import unicodedata
from typing import Iterable
# Common cooking / serving units to strip when they appear near numbers.
_UNITS = (
    "g", "gram", "grams", "kg", "kilogram", "kilograms",
    "mg", "milligram", "milligrams",
    "ml", "milliliter", "milliliters", "millilitre", "millilitres",
    "l", "liter", "liters", "litre", "litres",
    "oz", "ounce", "ounces",
    "lb", "lbs", "pound", "pounds",
    "cup", "cups",
    "tbsp", "tablespoon", "tablespoons",
    "tsp", "teaspoon", "teaspoons",
    "slice", "slices",
    "piece", "pieces",
    "clove", "cloves",
    "can", "cans",
    "tin", "tins",
    "pack", "packs",
    "bowl", "bowls",
    "serving", "servings",
)

_DESCRIPTOR_WORDS = {
    "fresh", "raw", "cooked", "boiled", "grilled", "fried", "roasted",
    "steamed", "baked", "chopped", "diced", "sliced", "minced",
    "large", "small", "medium", "extra", "lean", "boneless", "skinless",
    "organic", "plain", "unsalted", "salted",
}

_QUANTITY_PATTERN = re.compile(
    rf"""
    (?:
        ^|(?<=\s)
    )
    (?:
        \d+(?:\.\d+)?          # 100 / 100.5
        |\d+\s*/\s*\d+         # 1/2
        |a|an|one|two|three|four|five|six|seven|eight|nine|ten
    )
    \s*
    (?:
        x\s*
    )?
    (?:
        {"|".join(re.escape(unit) for unit in _UNITS)}
    )s?
    (?=$|\s)
    """,
    re.IGNORECASE | re.VERBOSE,
)

_STANDALONE_NUMBER_PATTERN = re.compile(r"(?<!\w)\d+(?:\.\d+)?(?:\s*/\s*\d+)?(?!\w)")
_PUNCT_PATTERN = re.compile(r"[^\w\s/-]")
_MULTI_SPACE_PATTERN = re.compile(r"\s+")
def _strip_accents(text: str) -> str:
    normalized = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in normalized if not unicodedata.combining(ch))


def _remove_parenthetical(text: str) -> str:
    # Remove parenthetical prep notes such as "(raw)" or "(drained)"
    return re.sub(r"\([^)]*\)", " ", text)


def _token_filter(tokens: Iterable[str]) -> list[str]:
    kept: list[str] = []
    for token in tokens:
        if token in _DESCRIPTOR_WORDS:
            continue
        kept.append(token)
    return kept


def normalize_food_text(text: str) -> str:
    """
    Normalize free-text food input into a canonical lookup string.

    Design goals:
    - deterministic and stable
    - strips obvious quantities/units
    - keeps food-bearing terms
    - avoids aggressive stemming/singularization that can damage food names
    """
    if not text or not text.strip():
        return ""
    value = _strip_accents(text).lower().strip()
    value = _remove_parenthetical(value)
    value = value.replace("_", " ")
    value = _PUNCT_PATTERN.sub(" ", value)
    # Remove quantity + unit phrases first: "100 g", "2 cups", "1/2 tbsp"
    value = _QUANTITY_PATTERN.sub(" ", value)
    # Remove remaining standalone numbers: "chicken 200"
    value = _STANDALONE_NUMBER_PATTERN.sub(" ", value)
    value = _MULTI_SPACE_PATTERN.sub(" ", value).strip()
    tokens = _token_filter(value.split())
    value = " ".join(tokens)
    # Normalize separators one more time after token filtering
    value = value.replace(" / ", "/")
    value = _MULTI_SPACE_PATTERN.sub(" ", value).strip()
    return value
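To see the quantity-stripping step on its own, here is a reduced, self-contained sketch. It is not the module above: `strip_quantities`, `QUANTITY`, `LONE_NUMBER` and the shortened `UNITS` tuple are hypothetical stand-ins that follow the same idea (a number, optional whitespace, a unit token, then a boundary), without the accent, punctuation, or descriptor handling.

```python
import re

# Hypothetical, shortened unit list; the real module carries many more units.
UNITS = ("g", "kg", "ml", "cup", "cups", "tbsp", "tsp")

# A number (simple fraction, integer, or decimal) followed by a unit token.
QUANTITY = re.compile(
    r"(?:^|(?<=\s))"
    r"(?:\d+\s*/\s*\d+|\d+(?:\.\d+)?)"
    r"\s*(?:" + "|".join(re.escape(u) for u in UNITS) + r")"
    r"(?=$|\s)",
    re.IGNORECASE,
)
# Any number left over once quantity phrases are gone.
LONE_NUMBER = re.compile(r"(?<!\w)\d+(?:\.\d+)?(?!\w)")


def strip_quantities(text: str) -> str:
    """Remove 'number + unit' phrases, then stray numbers, then extra spaces."""
    value = QUANTITY.sub(" ", text.lower())
    value = LONE_NUMBER.sub(" ", value)
    return " ".join(value.split())


examples = ["100g chicken breast", "2 cups rice", "1/2 tbsp olive oil", "chicken 200"]
for raw in examples:
    print(f"{raw!r} -> {strip_quantities(raw)!r}")
```

The lookahead after the unit is what keeps "1 garlic clove" from losing its leading "g": a unit only counts when it is followed by whitespace or the end of the string.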
utils/alias_db.py
from __future__ import annotations
import sqlite3
from contextlib import contextmanager
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterator, Optional
@dataclass(frozen=True)
class AliasRecord:
    alias_text: str
    alias_normalized: str
    canonical_food_name: str
    status: str
    source: str
    notes: str | None = None


class AliasDB:
    """
    Small SQLite-backed alias store.

    Runtime uses only approved aliases.
    Proposed/rejected values are preserved for review workflows.
    """

    def __init__(self, db_path: str | Path):
        self.db_path = Path(db_path)

    @contextmanager
    def connect(self) -> Iterator[sqlite3.Connection]:
        self.db_path.parent.mkdir(parents=True, exist_ok=True)
        conn = sqlite3.connect(str(self.db_path))
        conn.row_factory = sqlite3.Row
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()
    def initialize(self) -> None:
        with self.connect() as conn:
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS food_aliases (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    alias_text TEXT NOT NULL,
                    alias_normalized TEXT NOT NULL,
                    canonical_food_name TEXT NOT NULL,
                    status TEXT NOT NULL CHECK (status IN ('proposed', 'approved', 'rejected')),
                    source TEXT NOT NULL DEFAULT 'manual',
                    notes TEXT,
                    created_at TEXT NOT NULL,
                    updated_at TEXT NOT NULL
                )
                """
            )
            conn.execute(
                """
                CREATE UNIQUE INDEX IF NOT EXISTS idx_food_alias_unique
                ON food_aliases(alias_normalized, canonical_food_name)
                """
            )
            conn.execute(
                """
                CREATE INDEX IF NOT EXISTS idx_food_alias_status
                ON food_aliases(status)
                """
            )
            conn.execute(
                """
                CREATE INDEX IF NOT EXISTS idx_food_alias_norm_status
                ON food_aliases(alias_normalized, status)
                """
            )
    def get_approved_canonical(self, alias_normalized: str) -> Optional[str]:
        with self.connect() as conn:
            row = conn.execute(
                """
                SELECT canonical_food_name
                FROM food_aliases
                WHERE alias_normalized = ?
                  AND status = 'approved'
                ORDER BY id ASC
                LIMIT 1
                """,
                (alias_normalized,),
            ).fetchone()
        return str(row["canonical_food_name"]) if row else None
    def upsert_alias(
        self,
        *,
        alias_text: str,
        alias_normalized: str,
        canonical_food_name: str,
        status: str = "proposed",
        source: str = "manual",
        notes: str | None = None,
    ) -> None:
        now = datetime.now(timezone.utc).isoformat()
        with self.connect() as conn:
            existing = conn.execute(
                """
                SELECT id
                FROM food_aliases
                WHERE alias_normalized = ?
                  AND canonical_food_name = ?
                LIMIT 1
                """,
                (alias_normalized, canonical_food_name),
            ).fetchone()
            if existing:
                conn.execute(
                    """
                    UPDATE food_aliases
                    SET alias_text = ?,
                        status = ?,
                        source = ?,
                        notes = ?,
                        updated_at = ?
                    WHERE id = ?
                    """,
                    (
                        alias_text,
                        status,
                        source,
                        notes,
                        now,
                        int(existing["id"]),
                    ),
                )
            else:
                conn.execute(
                    """
                    INSERT INTO food_aliases (
                        alias_text,
                        alias_normalized,
                        canonical_food_name,
                        status,
                        source,
                        notes,
                        created_at,
                        updated_at
                    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
                    """,
                    (
                        alias_text,
                        alias_normalized,
                        canonical_food_name,
                        status,
                        source,
                        notes,
                        now,
                        now,
                    ),
                )
    def list_aliases(self, status: str | None = None) -> list[AliasRecord]:
        query = """
            SELECT alias_text, alias_normalized, canonical_food_name, status, source, notes
            FROM food_aliases
        """
        params: tuple[object, ...] = ()
        if status:
            query += " WHERE status = ?"
            params = (status,)
        query += " ORDER BY alias_normalized ASC, canonical_food_name ASC"
        with self.connect() as conn:
            rows = conn.execute(query, params).fetchall()
        return [
            AliasRecord(
                alias_text=str(row["alias_text"]),
                alias_normalized=str(row["alias_normalized"]),
                canonical_food_name=str(row["canonical_food_name"]),
                status=str(row["status"]),
                source=str(row["source"]),
                notes=row["notes"],
            )
            for row in rows
        ]
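The "runtime uses only approved aliases" rule is easy to check in isolation. The sketch below is a cut-down illustration, not the AliasDB class itself: it uses an in-memory SQLite database, a three-column version of the table, and a hypothetical helper named `approved_canonical`. The "aubergine" row is an invented example of an alias that is still waiting for review.

```python
from __future__ import annotations

import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE food_aliases (
        alias_normalized TEXT NOT NULL,
        canonical_food_name TEXT NOT NULL,
        status TEXT NOT NULL CHECK (status IN ('proposed', 'approved', 'rejected'))
    )
    """
)
conn.executemany(
    "INSERT INTO food_aliases VALUES (?, ?, ?)",
    [
        ("garbanzo beans", "Chickpeas, mature seeds, cooked", "approved"),
        ("aubergine", "Eggplant", "proposed"),  # hypothetical row, not yet reviewed
    ],
)


def approved_canonical(alias_normalized: str) -> str | None:
    """Return the canonical name only when the alias has been approved."""
    row = conn.execute(
        "SELECT canonical_food_name FROM food_aliases "
        "WHERE alias_normalized = ? AND status = 'approved' LIMIT 1",
        (alias_normalized,),
    ).fetchone()
    return row[0] if row else None


print(approved_canonical("garbanzo beans"))  # approved alias resolves
print(approved_canonical("aubergine"))       # proposed alias is ignored at runtime
```

The CHECK constraint also means a typo such as status 'aproved' fails at insert time with sqlite3.IntegrityError instead of silently creating a fourth state.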
nutrition_service.py
from __future__ import annotations
import json
import logging
from dataclasses import asdict, dataclass, field, is_dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Iterable
from utils.alias_db import AliasDB
from utils.text_normalizer import normalize_food_text
try:
    # Existing project dependency
    from nutrition_lookup import NutritionLookup
except ImportError as exc:  # pragma: no cover
    raise ImportError(
        "nutrition_service.py requires the existing nutrition_lookup.py module."
    ) from exc

logger = logging.getLogger(__name__)


@dataclass
class NutritionQuery:
    raw_text: str
    locale: str | None = None
    context_food_group: str | None = None


@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str  # exact | alias | prefix | substring | semantic | none
    confidence: float
    source: str  # eurofir_csv | chroma | none
    per_100g: dict[str, float | None] = field(default_factory=dict)
    flags: dict[str, bool] = field(default_factory=dict)
    review_needed: bool = False
    candidate_foods: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)
class NutritionService:
    """
    Phase 1 authoritative runtime service.

    Policy:
    1. Normalize user input
    2. Approved alias lookup
    3. Deterministic CSV-backed NutritionLookup
    4. Optional semantic candidate generation
    5. Stable response payload every time
    """

    def __init__(
        self,
        *,
        csv_path: str | Path = "data/eurofir_mediterranean.csv",
        alias_db_path: str | Path = "data/nutrition_runtime.db",
        log_path: str | Path | None = "logs/nutrition_runtime.jsonl",
        chroma_client: Any | None = None,
        chroma_collection_name: str | None = None,
    ):
        self.csv_path = Path(csv_path)
        self.alias_db = AliasDB(alias_db_path)
        self.alias_db.initialize()
        # Existing lookup object from your repo
        self.lookup_engine = NutritionLookup(str(self.csv_path))
        self.log_path = Path(log_path) if log_path else None
        if self.log_path:
            self.log_path.parent.mkdir(parents=True, exist_ok=True)
        self.chroma_client = chroma_client
        self.chroma_collection_name = chroma_collection_name
    def resolve_nutrition(
        self,
        query: NutritionQuery | str,
        *,
        enable_semantic_fallback: bool = True,
        semantic_limit: int = 5,
    ) -> NutritionMatch:
        if isinstance(query, str):
            query = NutritionQuery(raw_text=query)
        raw_text = query.raw_text.strip()
        normalized = normalize_food_text(raw_text)
        if not normalized:
            result = NutritionMatch(
                query=raw_text,
                canonical_food_name=None,
                match_type="none",
                confidence=0.0,
                source="none",
                review_needed=True,
                notes=["Empty or non-food query after normalization."],
            )
            self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
            return result

        # 1) Approved alias path
        canonical_from_alias = self.alias_db.get_approved_canonical(normalized)
        if canonical_from_alias:
            alias_result = self._lookup_with_engine(canonical_from_alias)
            if alias_result["canonical_food_name"]:
                result = NutritionMatch(
                    query=raw_text,
                    canonical_food_name=alias_result["canonical_food_name"],
                    match_type="alias",
                    confidence=0.98,
                    source="eurofir_csv",
                    per_100g=alias_result["per_100g"],
                    flags={
                        "used_alias": True,
                        "used_semantic_fallback": False,
                        "deterministic_match": True,
                    },
                    review_needed=False,
                    notes=[f"Resolved via approved alias '{normalized}'."],
                )
                self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
                return result

        # 2) Deterministic primary lookup
        direct_result = self._lookup_with_engine(normalized)
        if direct_result["canonical_food_name"]:
            confidence = self._confidence_for_match_type(direct_result["match_type"])
            result = NutritionMatch(
                query=raw_text,
                canonical_food_name=direct_result["canonical_food_name"],
                match_type=direct_result["match_type"],
                confidence=confidence,
                source="eurofir_csv",
                per_100g=direct_result["per_100g"],
                flags={
                    "used_alias": False,
                    "used_semantic_fallback": False,
                    "deterministic_match": True,
                },
                review_needed=confidence < 0.90,
                notes=direct_result["notes"],
            )
            self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
            return result

        # 3) Optional semantic fallback for candidate generation only
        candidate_foods: list[str] = []
        notes = ["No deterministic CSV match found."]
        if enable_semantic_fallback:
            candidate_foods = self.semantic_candidates(raw_text, limit=semantic_limit)
            if candidate_foods:
                notes.append("Semantic candidate retrieval returned possible foods.")
                result = NutritionMatch(
                    query=raw_text,
                    canonical_food_name=None,
                    match_type="semantic",
                    confidence=0.40,
                    source="chroma",
                    per_100g={},
                    flags={
                        "used_alias": False,
                        "used_semantic_fallback": True,
                        "deterministic_match": False,
                    },
                    review_needed=True,
                    candidate_foods=candidate_foods,
                    notes=notes,
                )
                self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
                return result

        result = NutritionMatch(
            query=raw_text,
            canonical_food_name=None,
            match_type="none",
            confidence=0.0,
            source="none",
            per_100g={},
            flags={
                "used_alias": False,
                "used_semantic_fallback": bool(candidate_foods),
                "deterministic_match": False,
            },
            review_needed=True,
            candidate_foods=candidate_foods,
            notes=notes,
        )
        self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
        return result
    def semantic_candidates(self, raw_text: str, limit: int = 5) -> list[str]:
        """
        Candidate generation only. Never returns nutrient truth.

        Supported integration patterns:
        - self.chroma_client.get_collection(...).query(...)
        - no-op if chroma is not provided
        """
        if not self.chroma_client or not self.chroma_collection_name:
            return []
        try:
            collection = self.chroma_client.get_collection(self.chroma_collection_name)
            response = collection.query(
                query_texts=[raw_text],
                n_results=limit,
            )
        except Exception as exc:  # pragma: no cover
            logger.warning("Chroma semantic lookup failed: %s", exc)
            return []
        candidates: list[str] = []
        metadatas = response.get("metadatas") or []
        documents = response.get("documents") or []
        first_meta_list = metadatas[0] if metadatas else []
        for metadata in first_meta_list:
            if isinstance(metadata, dict):
                food_name = metadata.get("FoodName") or metadata.get("food_name") or metadata.get("canonical_food_name")
                if food_name and str(food_name) not in candidates:
                    candidates.append(str(food_name))
        if not candidates and documents:
            first_docs = documents[0] if isinstance(documents[0], list) else documents
            for doc in first_docs:
                text = str(doc).strip()
                if text and text not in candidates:
                    candidates.append(text)
        return candidates[:limit]
    def propose_alias(
        self,
        *,
        alias_text: str,
        canonical_food_name: str,
        source: str = "runtime_feedback",
        notes: str | None = None,
    ) -> None:
        alias_normalized = normalize_food_text(alias_text)
        if not alias_normalized:
            raise ValueError("Alias text normalizes to an empty string.")
        self.alias_db.upsert_alias(
            alias_text=alias_text,
            alias_normalized=alias_normalized,
            canonical_food_name=canonical_food_name,
            status="proposed",
            source=source,
            notes=notes,
        )

    def approve_alias(
        self,
        *,
        alias_text: str,
        canonical_food_name: str,
        source: str = "manual",
        notes: str | None = None,
    ) -> None:
        alias_normalized = normalize_food_text(alias_text)
        if not alias_normalized:
            raise ValueError("Alias text normalizes to an empty string.")
        self.alias_db.upsert_alias(
            alias_text=alias_text,
            alias_normalized=alias_normalized,
            canonical_food_name=canonical_food_name,
            status="approved",
            source=source,
            notes=notes,
        )
    def _lookup_with_engine(self, text: str) -> dict[str, Any]:
        """
        Adapter around the existing NutritionLookup.lookup().

        This method is intentionally defensive because the exact return shape
        of the current repository implementation was not provided.
        """
        raw = self.lookup_engine.lookup(text)
        payload = self._to_mapping(raw)
        canonical_food_name = self._pick_first_str(
            payload,
            "canonical_food_name",
            "food_name",
            "FoodName",
            "matched_food",
            "name",
        )
        per_100g = self._extract_nutrients(payload)
        explicit_match_type = self._pick_first_str(
            payload,
            "match_type",
            "match",
            "match_kind",
        )
        score = self._pick_first_float(
            payload,
            "score",
            "match_score",
            "confidence",
        )
        notes = self._extract_notes(payload)
        if not canonical_food_name and payload:
            # Some implementations may return the row directly without explicit name metadata.
            canonical_food_name = self._pick_first_str(payload, "food", "label")
        match_type = explicit_match_type or self._infer_match_type(text, canonical_food_name, score)
        return {
            "canonical_food_name": canonical_food_name,
            "match_type": match_type,
            "per_100g": per_100g,
            "score": score,
            "notes": notes,
        }
    def _infer_match_type(
        self,
        query_text: str,
        canonical_food_name: str | None,
        score: float | None,
    ) -> str:
        if not canonical_food_name:
            return "none"
        q = normalize_food_text(query_text)
        c = normalize_food_text(canonical_food_name)
        if q == c:
            return "exact"
        if c.startswith(q) or q.startswith(c):
            return "prefix"
        if q in c or c in q:
            return "substring"
        if score is not None and score >= 0.85:
            return "prefix"
        if score is not None and score >= 0.65:
            return "substring"
        return "substring"

    def _confidence_for_match_type(self, match_type: str) -> float:
        return {
            "exact": 1.0,
            "alias": 0.98,
            "prefix": 0.85,
            "substring": 0.65,
            "semantic": 0.40,
            "none": 0.0,
        }.get(match_type, 0.50)
    def _extract_nutrients(self, payload: dict[str, Any]) -> dict[str, float | None]:
        """
        Tries multiple common shapes:
        - payload["per_100g"] = {...}
        - payload["nutrients"] = {...}
        - flat row keys with numeric values
        """
        for key in ("per_100g", "nutrients"):
            value = payload.get(key)
            if isinstance(value, dict):
                return {str(k): self._safe_float_or_none(v) for k, v in value.items()}
        exclude = {
            "canonical_food_name", "food_name", "FoodName", "matched_food", "name",
            "match_type", "match", "match_kind", "score", "match_score", "confidence",
            "notes", "source",
        }
        nutrients: dict[str, float | None] = {}
        for key, value in payload.items():
            if key in exclude:
                continue
            if isinstance(value, (int, float, str)):
                converted = self._safe_float_or_none(value)
                if converted is not None:
                    nutrients[str(key)] = converted
        return nutrients

    def _extract_notes(self, payload: dict[str, Any]) -> list[str]:
        notes_value = payload.get("notes")
        if notes_value is None:
            return []
        if isinstance(notes_value, list):
            return [str(item) for item in notes_value]
        return [str(notes_value)]
    def _log_resolution(
        self,
        *,
        raw_text: str,
        normalized: str,
        result: NutritionMatch,
    ) -> None:
        if not self.log_path:
            return
        record = {
            "timestamp_utc": datetime.now(timezone.utc).isoformat(),
            "raw_query": raw_text,
            "normalized_query": normalized,
            "canonical_food_name": result.canonical_food_name,
            "match_type": result.match_type,
            "confidence": result.confidence,
            "source": result.source,
            "review_needed": result.review_needed,
            "used_alias": result.flags.get("used_alias", False),
            "used_semantic_fallback": result.flags.get("used_semantic_fallback", False),
            "candidate_foods": result.candidate_foods,
            "notes": result.notes,
        }
        with self.log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    def _to_mapping(self, raw: Any) -> dict[str, Any]:
        if raw is None:
            return {}
        if isinstance(raw, dict):
            return raw
        # is_dataclass() is also True for dataclass *types*; asdict() needs an instance.
        if is_dataclass(raw) and not isinstance(raw, type):
            return asdict(raw)
        if hasattr(raw, "__dict__"):
            return dict(vars(raw))
        return {"value": raw}

    def _pick_first_str(self, payload: dict[str, Any], *keys: str) -> str | None:
        for key in keys:
            value = payload.get(key)
            if value is not None and str(value).strip():
                return str(value).strip()
        return None

    def _pick_first_float(self, payload: dict[str, Any], *keys: str) -> float | None:
        for key in keys:
            if key in payload:
                converted = self._safe_float_or_none(payload[key])
                if converted is not None:
                    return converted
        return None

    def _safe_float_or_none(self, value: Any) -> float | None:
        if value is None:
            return None
        if isinstance(value, bool):
            return None
        try:
            return float(value)
        except (TypeError, ValueError):
            return None
_default_service: NutritionService | None = None


def get_nutrition_service() -> NutritionService:
    global _default_service
    if _default_service is None:
        _default_service = NutritionService()
    return _default_service


def resolve_nutrition(
    query: NutritionQuery | str,
    *,
    enable_semantic_fallback: bool = True,
    semantic_limit: int = 5,
) -> NutritionMatch:
    """
    Public function to be shared by chatbot, Flask app, scripts, and future APIs.
    """
    service = get_nutrition_service()
    return service.resolve_nutrition(
        query,
        enable_semantic_fallback=enable_semantic_fallback,
        semantic_limit=semantic_limit,
    )
Minimal integration examples
Chatbot / agent code
from nutrition_service import resolve_nutrition

result = resolve_nutrition("100g chicken breast")
if result.canonical_food_name:
    print(result.canonical_food_name, result.per_100g, result.match_type, result.confidence)
else:
    print(result.candidate_foods, result.review_needed)
Approving an alias
from nutrition_service import get_nutrition_service

service = get_nutrition_service()
service.approve_alias(
    alias_text="garbanzo beans",
    canonical_food_name="Chickpeas, mature seeds, cooked",
    notes="Common English variant",
)
Notes
This is designed to be dropped in without replacing your existing NutritionLookup. The adapter is intentionally defensive because I have not inspected the exact return shape of NutritionLookup.lookup(). If you paste that file next, I can wire the adapter to your exact schema and remove the fallback heuristics.
Phase 1 delivered here includes:
- shared runtime service
- SQLite alias persistence
- normalization with quantity stripping
- stable NutritionMatch payload
- JSONL observability logging
- optional Chroma candidate fallback path
The contracts are locked, so the next step is just adapter cleanup:
- make nutrition_agent.py consume IngredientNutritionResult / RecipeNutritionResult directly
- remove any old key assumptions like ad hoc matched_food or custom macro dict shapes
- update any UI/rendering code to read the frozen fields only
Priority order:
- multi_agent_chatbot/nutrition_agent.py
- any chatbot tool wrappers
- any UI/debug panels
- old tests that still assume pre-contract shapes
The goal now is boring and good:
all callers read the contract, nobody reaches into internals
(venv) [CML] tux@camelot …/camelot/ai-agents-crash-course $ python3 -m pytest tests/test_import_recipes.py multi_agent_chatbot/tests/test_nutrition_lookup.py multi_agent_chatbot/tests/test_nutrition_contracts.py
==================================================== test session starts ====================================================
platform linux -- Python 3.13.11, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/tux/camelot/ai-agents-crash-course
plugins: anyio-4.10.0
collected 17 items
tests/test_import_recipes.py ............. [ 76%]
multi_agent_chatbot/tests/test_nutrition_lookup.py .. [ 88%]
multi_agent_chatbot/tests/test_nutrition_contracts.py .. [100%]
==================================================== 17 passed in 0.04s =====================================================
(venv) [CML] tux@camelot …/camelot/ai-agents-crash-course $
Excellent — that means the foundation is genuinely stable now.
You have:
- passing import/ingestion coverage
- passing lookup behavior tests
- passing contract tests
- frozen ingredient and recipe result shapes
So the project has crossed an important line:
nutrition is no longer “just working by accident” — it now has a tested API surface.
The next step is the boring valuable one:
update downstream callers to consume only the locked contracts.
Most likely target:
- multi_agent_chatbot/nutrition_agent.py
What to do there:
- import the canonical result types
- read only query, ingredient, per_100g, signals, source
- for recipe results, read only totals, per_serving, servings, breakdown, missing, source
- remove any assumptions about older field names or internal lookup structures
At this point, I would not expand scope. No alias DB, no RAG changes, no recipe parsing overhaul. Just finish the contract adoption pass.
Contract
More than good, actually: it is the bit that stops the nutrition layer from turning back into swamp-magic.
Right now the contract does three useful things:
It tells every caller, “this is the shape you get back, and it is not a surprise.”
That means the agent, Chainlit, tests, and any future UI all stop improvising.
It separates ingredient truth from recipe math.
NutritionLookup answers “what is this food?”
NutritionCalculator answers “what happens when I combine these amounts?”
That is a clean split.
It gives you permission to change internals later without wrecking the surfaces you actually use.
You can swap matching logic, improve CSV handling, tighten signals, whatever — and the consumers should keep working as long as the contract holds.
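A small sketch of that last point, under stated assumptions: `IngredientLookup`, `ExactLookup`, `PrefixLookup`, `_shape`, and `caller` are all hypothetical names, and the dict shape simply reuses the ingredient field names mentioned earlier in this thread. It is not your NutritionLookup, only an illustration of two engines sitting behind one contract.

```python
from __future__ import annotations

from typing import Any, Protocol


class IngredientLookup(Protocol):
    """Anything with a lookup() that returns the frozen ingredient shape."""

    def lookup(self, query: str) -> dict[str, Any]: ...


def _shape(query: str, name: str | None, nutrients: dict[str, float]) -> dict[str, Any]:
    # Single place where the contract dict is built (hypothetical helper).
    return {"query": query, "ingredient": name, "per_100g": nutrients,
            "signals": {}, "source": "csv"}


class ExactLookup:
    def __init__(self, table: dict[str, dict[str, float]]) -> None:
        self.table = table

    def lookup(self, query: str) -> dict[str, Any]:
        hit = self.table.get(query)
        return _shape(query, query if hit is not None else None, hit or {})


class PrefixLookup(ExactLookup):
    def lookup(self, query: str) -> dict[str, Any]:
        for name, nutrients in self.table.items():
            if name.startswith(query):
                return _shape(query, name, nutrients)
        return _shape(query, None, {})


def caller(engine: IngredientLookup, query: str) -> str | None:
    # The caller depends on the contract keys, not on which engine is behind them.
    return engine.lookup(query)["ingredient"]


table = {"chickpeas": {"protein_g": 1.0}}  # placeholder values
print(caller(ExactLookup(table), "chick"))   # no exact row for "chick"
print(caller(PrefixLookup(table), "chick"))  # prefix engine finds "chickpeas"
```

Both engines return the same keys, so `caller` never changes when the matching strategy does.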
So in plainer terms:
- before: “works, but everyone kind of pokes at the blob differently”
- now: “these are the official doors; stop climbing through windows”
That is why it matters.
What is next is not more contract theory. It is making sure every consumer actually uses those official doors and none of them are secretly rummaging around old fields.
If you want the sharper engineering phrase: consumer migration to the contract.
A few near-neighbors, depending on tone:
- contract adoption — broad, practical
- contract enforcement — stronger; implies tests/linting/backstops
- consumer cleanup — informal, accurate
- legacy field removal — the destructive part
- API conformance pass — slightly more formal
The one I’d use in your repo notes is:
Contract adoption pass
because it means:
the contract exists, and now every caller is being moved onto it.
Yes. I’d call it a door inventory or contract adoption register.
Based on what you’ve told me so far, the doors look like this:
| Door | Purpose | Current status |
|---|---|---|
| | ingredient-level nutrition result | Locked |
| | recipe/batch nutrition result | Locked |
| | enforces the result shapes | Locked |
| | returns contract dicts | Adopted |
| | returns contract dicts | Adopted |
| | returns contract dicts | Adopted |
| Agent summary/formatting code | reads the contract to produce text | Needs verification |
| UI/debug panels | render contract fields only | Needs verification |
| Legacy field access | old keys / old shapes | Needs removal |
| Chainlit integration | should remain intact, only consume shared contracts | Protected; verify usage only |
The fetch phrase I’d use is:
“Find all consumers of the nutrition contracts and classify them as locked, adopted, needs verification, or legacy.”
If you want a grep-style version for repo work, use this idea:
contract adoption pass:
find all call sites of NutritionLookup.lookup and NutritionCalculator.calculate;
for each caller, record:
- returns contract directly
- reads contract correctly
- formats contract correctly
- still touches legacy fields
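If you would rather script the call-site search than grep for it, a sketch with the standard-library `ast` module could look like this. `find_call_sites` and `TARGET_METHODS` are hypothetical names, and the match is by method name only: it cannot tell whether the receiver really is a NutritionLookup or NutritionCalculator, so treat the hits as candidates to review by hand.

```python
from __future__ import annotations

import ast

# Method names taken from the pass description above.
TARGET_METHODS = {"lookup", "calculate"}


def find_call_sites(source: str, filename: str = "<memory>") -> list[tuple[str, int, str]]:
    """Return (filename, line, method) for every `.lookup(...)` / `.calculate(...)` call."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in TARGET_METHODS
        ):
            hits.append((filename, node.lineno, node.func.attr))
    return sorted(hits)


# Invented three-line module standing in for a real caller.
sample = (
    "engine = NutritionLookup('foods.csv')\n"
    "row = engine.lookup('chickpeas')\n"
    "total = calc.calculate(items)\n"
)
for hit in find_call_sites(sample, "nutrition_agent.py"):
    print(hit)
```

To run it over a repo, feed it `path.read_text()` for each file from `pathlib.Path(".").rglob("*.py")`, then classify each hit by hand against the checklist above.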
And the status words I’d standardize are:
- Locked — source contract is defined and tested
- Adopted — consumer uses only the contract
- Needs verification — probably okay, but not confirmed
- Legacy — still touches old fields or compatibility glue
- Protected — must not be removed or broken while migrating
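If the inventory should live in the repo and not only in notes, the status words map naturally onto an enum. This is only a sketch: `DoorStatus`, `Door`, and the three sample rows are hypothetical, with the row text copied from the table earlier in this thread.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class DoorStatus(Enum):
    LOCKED = "Locked"
    ADOPTED = "Adopted"
    NEEDS_VERIFICATION = "Needs verification"
    LEGACY = "Legacy"
    PROTECTED = "Protected"


@dataclass(frozen=True)
class Door:
    name: str
    purpose: str
    status: DoorStatus


inventory = [
    Door("Agent summary/formatting code", "reads the contract to produce text",
         DoorStatus.NEEDS_VERIFICATION),
    Door("UI/debug panels", "render contract fields only",
         DoorStatus.NEEDS_VERIFICATION),
    Door("Chainlit integration", "consume shared contracts", DoorStatus.PROTECTED),
]

# Count doors per status and list the ones that still need work.
counts = Counter(door.status for door in inventory)
open_items = [
    door.name
    for door in inventory
    if door.status in {DoorStatus.NEEDS_VERIFICATION, DoorStatus.LEGACY}
]
print(counts[DoorStatus.NEEDS_VERIFICATION], open_items)
```

The pass is done when `open_items` is empty.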
So the phrase to ask your tools or Codex is:
“Run a contract adoption pass and produce a door inventory with statuses.”
That’s a good inventory.
I’d label the overall state as:
Contracts locked. Consumers adopted. No runtime legacy detected.
That gives you a crisp milestone name.
The only nuance I’d add is this:
The ingestion scripts are not legacy; they’re out of contract scope.
They can stay weird longer, as long as they do not leak raw shapes into runtime consumers.
So your categories now read like:
- Locked — schema owners
- Adopted — runtime consumers
- Clear — nothing left in needs-verification
- Out of scope — ingestion/build-time code
- Protected — Chainlit and active streams must remain untouched
The next milestone after this is:
Contract protection
Meaning:
- keep the tests
- stop reintroducing ad hoc fields
- require new consumers to use the frozen keys
- keep Chainlit as a consumer, not a refactor casualty
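One cheap way to back "stop reintroducing ad hoc fields" with a test is an exact key-set check. The sketch below assumes the frozen keys are the ones listed earlier in this thread; `assert_frozen_shape` and the two constants are hypothetical names, and your existing contract tests may already cover this.

```python
from __future__ import annotations

from typing import Any, Mapping

# Frozen key sets, copied from the field lists earlier in this thread.
INGREDIENT_KEYS = frozenset({"query", "ingredient", "per_100g", "signals", "source"})
RECIPE_KEYS = frozenset(
    {"totals", "per_serving", "servings", "breakdown", "missing", "source"}
)


def assert_frozen_shape(result: Mapping[str, Any], frozen: frozenset[str]) -> None:
    """Fail if a result gains, loses, or renames a key relative to the contract."""
    present = set(result)
    missing = frozen - present
    extra = present - frozen
    if missing or extra:
        raise AssertionError(
            f"contract drift: missing={sorted(missing)} extra={sorted(extra)}"
        )


# A conforming result passes silently (values are placeholders).
ok = dict.fromkeys(INGREDIENT_KEYS)
assert_frozen_shape(ok, INGREDIENT_KEYS)

# A result that smuggles in a legacy key is rejected.
drifted = {**ok, "matched_food": "chickpeas"}
try:
    assert_frozen_shape(drifted, INGREDIENT_KEYS)
except AssertionError as exc:
    print(exc)
```

Dropped into a pytest file, the same helper can run against whatever the real lookup and calculator return, so a new field has to be added to the frozen set on purpose.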
That summary is no longer bland. It says the nutrition layer has gone from “interesting blob” to governed interface.