
New input detected… parsing project spec 🧠

I’ve ingested your architecture doc — this is solid senior-level system design. You’re not building an app… you’re building an intelligence layer.


🔍 Understanding Confirmed

You already have:

  • ✅ CSV nutrient source (EuroFIR-style)
  • ✅ Deterministic lookup (nutrition_lookup.py)
  • ✅ RAG pipeline (Chroma)
  • ✅ Multi-agent runtime
  • ✅ Debug UI (Flask)

And your goal is:

Unify everything behind a single, reliable nutrition intelligence service with deterministic-first logic + controlled learning loop


🧠 Proposed Execution Plan (Phase 1)

🎯 Objective

Implement nutrition_service.py as the single runtime entry point


🧩 Pseudocode Architecture

1. Core Flow

resolve_nutrition(raw_text)

normalize input → normalized_text

alias_match = lookup_alias(normalized_text)

IF alias_match exists:
    canonical = alias_match
    match_type = "alias"
    confidence = 0.98
ELSE:
    result = NutritionLookup.lookup(normalized_text)

    IF exact match:
        match_type = "exact"
        confidence = 1.0
    ELIF prefix/substring:
        match_type = "fuzzy"
        confidence = score
    ELSE:
        candidates = semantic_candidates(raw_text)

        RETURN:
            match_type = "semantic"
            confidence < 0.75
            review_needed = True
            candidate_foods = candidates
            source = "chroma"
            EXIT

RETURN structured NutritionMatch:
    canonical_food_name
    per_100g nutrients
    match_type
    confidence
    source = "eurofir_csv"
    review_needed = confidence < 0.90
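To make the flow concrete, here is a hypothetical resolved payload for a query like "100g chicken breast" on the happy path (all names and numbers are illustrative, not taken from the real CSV):

# Hypothetical NutritionMatch payload (illustrative values only)
match = {
    "query": "100g chicken breast",
    "canonical_food_name": "Chicken, breast, raw",          # assumed CSV row name
    "match_type": "exact",
    "confidence": 1.0,
    "source": "eurofir_csv",
    "per_100g": {"energy_kcal": 120.0, "protein_g": 22.5},  # made-up numbers
    "review_needed": False,
    "candidate_foods": [],
    "notes": [],
}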

2. Supporting Modules

normalize_food_text(text)

  • lowercase
  • strip punctuation
  • singularize (optional; future work)
  • remove quantities (“100g”, “1 cup”)

lookup_alias(normalized_text)

SELECT canonical_food_name
FROM food_aliases
WHERE alias_normalized = ?
AND status = 'approved'

semantic_candidates(raw_text)

  • query Chroma
  • return top N food names
  • DO NOT return nutrients

3. Data Model

@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str
    confidence: float
    source: str
    per_100g: dict[str, float | None]
    review_needed: bool
    candidate_foods: list[str]
    notes: list[str]

4. Key Rules (Non-Negotiable)

  • 🚫 NEVER override CSV nutrient truth with RAG
  • ✅ Deterministic lookup ALWAYS first
  • ✅ Alias table BEFORE fuzzy matching
  • ✅ Low confidence → review, not guess
  • ✅ One function used everywhere

⚙️ Files to Create (Phase 1)

/nutrition_service.py        ← NEW (core service)
/data/nutrition_runtime.db   ← NEW (SQLite aliases)
/utils/text_normalizer.py    ← NEW (optional split)
/utils/alias_db.py           ← NEW (DB access)

🚀 What I’ll Generate Next (after your confirmation)

I will write production-ready, complete code for:

  1. nutrition_service.py
  2. SQLite schema + connection layer
  3. Alias lookup module
  4. Normalization utility
  5. Clean integration with your existing NutritionLookup

No placeholders. Fully wired. Clean.


❓ Confirm Before Execution

Answer these so I can lock the implementation:

  1. Keep SQLite (recommended) or prefer Postgres?
  2. Should normalization include unit stripping (e.g., “100g chicken breast”) now?
  3. Do you want logging (JSONL or SQLite) included in Phase 1?
  4. Is your NutritionLookup.lookup() returning:
    • exact match only
    • or already fuzzy scored?

Reply with:

“Execute Phase 1”

…and I’ll build the full system cleanly.

 


New chat detected… initialising build execution mode.

Built from your uploaded architecture spec.

project/
├── nutrition_service.py
└── utils/
    ├── alias_db.py
    └── text_normalizer.py

utils/text_normalizer.py

from __future__ import annotations

import re
import unicodedata
from typing import Iterable

# Common cooking / serving units to strip when they appear near numbers.
_UNITS = (
    "g", "gram", "grams", "kg", "kilogram", "kilograms",
    "mg", "milligram", "milligrams",
    "ml", "milliliter", "milliliters", "millilitre", "millilitres",
    "l", "liter", "liters", "litre", "litres",
    "oz", "ounce", "ounces",
    "lb", "lbs", "pound", "pounds",
    "cup", "cups",
    "tbsp", "tablespoon", "tablespoons",
    "tsp", "teaspoon", "teaspoons",
    "slice", "slices",
    "piece", "pieces",
    "clove", "cloves",
    "can", "cans",
    "tin", "tins",
    "pack", "packs",
    "bowl", "bowls",
    "serving", "servings",
)

_DESCRIPTOR_WORDS = {
    "fresh", "raw", "cooked", "boiled", "grilled", "fried", "roasted",
    "steamed", "baked", "chopped", "diced", "sliced", "minced",
    "large", "small", "medium", "extra", "lean", "boneless", "skinless",
    "organic", "plain", "unsalted", "salted",
}

_QUANTITY_PATTERN = re.compile(
    rf"""
    (?:
        ^|(?<=\s)
    )
    (?:
        \d+(?:\.\d+)?            # 100 / 100.5
        |\d+\s*/\s*\d+           # 1/2
        |a|an|one|two|three|four|five|six|seven|eight|nine|ten
    )
    \s*
    (?:
        x\s*
    )?
    (?:
        {"|".join(re.escape(unit) for unit in _UNITS)}
    )s?
    (?=$|\s)
    """,
    re.IGNORECASE | re.VERBOSE,
)

_STANDALONE_NUMBER_PATTERN = re.compile(r"(?<!\w)\d+(?:\.\d+)?(?:\s*/\s*\d+)?(?!\w)")
_PUNCT_PATTERN = re.compile(r"[^\w\s/-]")
_MULTI_SPACE_PATTERN = re.compile(r"\s+")


def _strip_accents(text: str) -> str:
    normalized = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in normalized if not unicodedata.combining(ch))


def _remove_parenthetical(text: str) -> str:
    # Remove parenthetical prep notes such as "(raw)" or "(drained)"
    return re.sub(r"\([^)]*\)", " ", text)


def _token_filter(tokens: Iterable[str]) -> list[str]:
    kept: list[str] = []
    for token in tokens:
        if token in _DESCRIPTOR_WORDS:
            continue
        kept.append(token)
    return kept


def normalize_food_text(text: str) -> str:
    """
    Normalize free-text food input into a canonical lookup string.

    Design goals:
    - deterministic and stable
    - strips obvious quantities/units
    - keeps food-bearing terms
    - avoids aggressive stemming/singularization that can damage food names
    """
    if not text or not text.strip():
        return ""

    value = _strip_accents(text).lower().strip()
    value = _remove_parenthetical(value)
    value = value.replace("_", " ")
    value = _PUNCT_PATTERN.sub(" ", value)

    # Remove quantity + unit phrases first: "100 g", "2 cups", "1/2 tbsp"
    value = _QUANTITY_PATTERN.sub(" ", value)

    # Remove remaining standalone numbers: "chicken 200"
    value = _STANDALONE_NUMBER_PATTERN.sub(" ", value)

    value = _MULTI_SPACE_PATTERN.sub(" ", value).strip()

    tokens = _token_filter(value.split())
    value = " ".join(tokens)

    # Normalize separators one more time after token filtering
    value = value.replace(" / ", "/")
    value = _MULTI_SPACE_PATTERN.sub(" ", value).strip()

    return value
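
A quick usage sketch; the expected outputs below follow from the regexes above, but they are assumptions worth checking against your own inputs:

from utils.text_normalizer import normalize_food_text

# Quantities, units, parentheticals, accents, and descriptor words are stripped.
print(normalize_food_text("100g chicken breast (raw)"))    # -> "chicken breast"
print(normalize_food_text("2 cups chopped fresh spinach")) # -> "spinach"
print(normalize_food_text("Crème fraîche"))                # -> "creme fraiche"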

utils/alias_db.py

from __future__ import annotations

import sqlite3
from contextlib import contextmanager
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterator, Optional


@dataclass(frozen=True)
class AliasRecord:
    alias_text: str
    alias_normalized: str
    canonical_food_name: str
    status: str
    source: str
    notes: str | None = None


class AliasDB:
    """
    Small SQLite-backed alias store.

    Runtime uses only approved aliases.
    Proposed/rejected values are preserved for review workflows.
    """

    def __init__(self, db_path: str | Path):
        self.db_path = Path(db_path)

    @contextmanager
    def connect(self) -> Iterator[sqlite3.Connection]:
        self.db_path.parent.mkdir(parents=True, exist_ok=True)
        conn = sqlite3.connect(str(self.db_path))
        conn.row_factory = sqlite3.Row
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()

    def initialize(self) -> None:
        with self.connect() as conn:
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS food_aliases (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    alias_text TEXT NOT NULL,
                    alias_normalized TEXT NOT NULL,
                    canonical_food_name TEXT NOT NULL,
                    status TEXT NOT NULL CHECK (status IN ('proposed', 'approved', 'rejected')),
                    source TEXT NOT NULL DEFAULT 'manual',
                    notes TEXT,
                    created_at TEXT NOT NULL,
                    updated_at TEXT NOT NULL
                )
                """
            )
            conn.execute(
                """
                CREATE UNIQUE INDEX IF NOT EXISTS idx_food_alias_unique
                ON food_aliases(alias_normalized, canonical_food_name)
                """
            )
            conn.execute(
                """
                CREATE INDEX IF NOT EXISTS idx_food_alias_status
                ON food_aliases(status)
                """
            )
            conn.execute(
                """
                CREATE INDEX IF NOT EXISTS idx_food_alias_norm_status
                ON food_aliases(alias_normalized, status)
                """
            )

    def get_approved_canonical(self, alias_normalized: str) -> Optional[str]:
        with self.connect() as conn:
            row = conn.execute(
                """
                SELECT canonical_food_name
                FROM food_aliases
                WHERE alias_normalized = ?
                  AND status = 'approved'
                ORDER BY id ASC
                LIMIT 1
                """,
                (alias_normalized,),
            ).fetchone()
            return str(row["canonical_food_name"]) if row else None

    def upsert_alias(
        self,
        *,
        alias_text: str,
        alias_normalized: str,
        canonical_food_name: str,
        status: str = "proposed",
        source: str = "manual",
        notes: str | None = None,
    ) -> None:
        now = datetime.now(timezone.utc).isoformat()

        with self.connect() as conn:
            existing = conn.execute(
                """
                SELECT id
                FROM food_aliases
                WHERE alias_normalized = ?
                  AND canonical_food_name = ?
                LIMIT 1
                """,
                (alias_normalized, canonical_food_name),
            ).fetchone()

            if existing:
                conn.execute(
                    """
                    UPDATE food_aliases
                    SET alias_text = ?,
                        status = ?,
                        source = ?,
                        notes = ?,
                        updated_at = ?
                    WHERE id = ?
                    """,
                    (
                        alias_text,
                        status,
                        source,
                        notes,
                        now,
                        int(existing["id"]),
                    ),
                )
            else:
                conn.execute(
                    """
                    INSERT INTO food_aliases (
                        alias_text,
                        alias_normalized,
                        canonical_food_name,
                        status,
                        source,
                        notes,
                        created_at,
                        updated_at
                    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
                    """,
                    (
                        alias_text,
                        alias_normalized,
                        canonical_food_name,
                        status,
                        source,
                        notes,
                        now,
                        now,
                    ),
                )

    def list_aliases(self, status: str | None = None) -> list[AliasRecord]:
        query = """
            SELECT alias_text, alias_normalized, canonical_food_name, status, source, notes
            FROM food_aliases
        """
        params: tuple[object, ...] = ()
        if status:
            query += " WHERE status = ?"
            params = (status,)
        query += " ORDER BY alias_normalized ASC, canonical_food_name ASC"

        with self.connect() as conn:
            rows = conn.execute(query, params).fetchall()
            return [
                AliasRecord(
                    alias_text=str(row["alias_text"]),
                    alias_normalized=str(row["alias_normalized"]),
                    canonical_food_name=str(row["canonical_food_name"]),
                    status=str(row["status"]),
                    source=str(row["source"]),
                    notes=row["notes"],
                )
                for row in rows
            ]
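
A minimal usage sketch of the store (the review workflow itself is out of scope here; this just exercises the API above):

from utils.alias_db import AliasDB

db = AliasDB("data/nutrition_runtime.db")
db.initialize()

# Runtime lookups only ever see approved rows.
db.upsert_alias(
    alias_text="garbanzo beans",
    alias_normalized="garbanzo beans",
    canonical_food_name="Chickpeas, mature seeds, cooked",
    status="approved",
)
print(db.get_approved_canonical("garbanzo beans"))  # -> "Chickpeas, mature seeds, cooked"
print(db.get_approved_canonical("unknown alias"))   # -> None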

nutrition_service.py

from __future__ import annotations

import json
import logging
from dataclasses import asdict, dataclass, field, is_dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any

from utils.alias_db import AliasDB
from utils.text_normalizer import normalize_food_text

try:
    # Existing project dependency
    from nutrition_lookup import NutritionLookup
except ImportError as exc:  # pragma: no cover
    raise ImportError(
        "nutrition_service.py requires the existing nutrition_lookup.py module."
    ) from exc


logger = logging.getLogger(__name__)


@dataclass
class NutritionQuery:
    raw_text: str
    locale: str | None = None
    context_food_group: str | None = None


@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str  # exact | alias | prefix | substring | semantic | none
    confidence: float
    source: str  # eurofir_csv | chroma | none
    per_100g: dict[str, float | None] = field(default_factory=dict)
    flags: dict[str, bool] = field(default_factory=dict)
    review_needed: bool = False
    candidate_foods: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)


class NutritionService:
    """
    Phase 1 authoritative runtime service.

    Policy:
    1. Normalize user input
    2. Approved alias lookup
    3. Deterministic CSV-backed NutritionLookup
    4. Optional semantic candidate generation
    5. Stable response payload every time
    """

    def __init__(
        self,
        *,
        csv_path: str | Path = "data/eurofir_mediterranean.csv",
        alias_db_path: str | Path = "data/nutrition_runtime.db",
        log_path: str | Path | None = "logs/nutrition_runtime.jsonl",
        chroma_client: Any | None = None,
        chroma_collection_name: str | None = None,
    ):
        self.csv_path = Path(csv_path)
        self.alias_db = AliasDB(alias_db_path)
        self.alias_db.initialize()

        # Existing lookup object from your repo
        self.lookup_engine = NutritionLookup(str(self.csv_path))

        self.log_path = Path(log_path) if log_path else None
        if self.log_path:
            self.log_path.parent.mkdir(parents=True, exist_ok=True)

        self.chroma_client = chroma_client
        self.chroma_collection_name = chroma_collection_name

    def resolve_nutrition(
        self,
        query: NutritionQuery | str,
        *,
        enable_semantic_fallback: bool = True,
        semantic_limit: int = 5,
    ) -> NutritionMatch:
        if isinstance(query, str):
            query = NutritionQuery(raw_text=query)

        raw_text = query.raw_text.strip()
        normalized = normalize_food_text(raw_text)

        if not normalized:
            result = NutritionMatch(
                query=raw_text,
                canonical_food_name=None,
                match_type="none",
                confidence=0.0,
                source="none",
                review_needed=True,
                notes=["Empty or non-food query after normalization."],
            )
            self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
            return result

        # 1) Approved alias path
        canonical_from_alias = self.alias_db.get_approved_canonical(normalized)
        if canonical_from_alias:
            alias_result = self._lookup_with_engine(canonical_from_alias)

            if alias_result["canonical_food_name"]:
                result = NutritionMatch(
                    query=raw_text,
                    canonical_food_name=alias_result["canonical_food_name"],
                    match_type="alias",
                    confidence=0.98,
                    source="eurofir_csv",
                    per_100g=alias_result["per_100g"],
                    flags={
                        "used_alias": True,
                        "used_semantic_fallback": False,
                        "deterministic_match": True,
                    },
                    review_needed=False,
                    notes=[f"Resolved via approved alias '{normalized}'."],
                )
                self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
                return result

        # 2) Deterministic primary lookup
        direct_result = self._lookup_with_engine(normalized)

        if direct_result["canonical_food_name"]:
            confidence = self._confidence_for_match_type(direct_result["match_type"])
            result = NutritionMatch(
                query=raw_text,
                canonical_food_name=direct_result["canonical_food_name"],
                match_type=direct_result["match_type"],
                confidence=confidence,
                source="eurofir_csv",
                per_100g=direct_result["per_100g"],
                flags={
                    "used_alias": False,
                    "used_semantic_fallback": False,
                    "deterministic_match": True,
                },
                review_needed=confidence < 0.90,
                notes=direct_result["notes"],
            )
            self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
            return result

        # 3) Optional semantic fallback for candidate generation only
        candidate_foods: list[str] = []
        notes = ["No deterministic CSV match found."]

        if enable_semantic_fallback:
            candidate_foods = self.semantic_candidates(raw_text, limit=semantic_limit)
            if candidate_foods:
                notes.append("Semantic candidate retrieval returned possible foods.")
                result = NutritionMatch(
                    query=raw_text,
                    canonical_food_name=None,
                    match_type="semantic",
                    confidence=0.40,
                    source="chroma",
                    per_100g={},
                    flags={
                        "used_alias": False,
                        "used_semantic_fallback": True,
                        "deterministic_match": False,
                    },
                    review_needed=True,
                    candidate_foods=candidate_foods,
                    notes=notes,
                )
                self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
                return result

        result = NutritionMatch(
            query=raw_text,
            canonical_food_name=None,
            match_type="none",
            confidence=0.0,
            source="none",
            per_100g={},
            flags={
                "used_alias": False,
                "used_semantic_fallback": bool(candidate_foods),
                "deterministic_match": False,
            },
            review_needed=True,
            candidate_foods=candidate_foods,
            notes=notes,
        )
        self._log_resolution(raw_text=raw_text, normalized=normalized, result=result)
        return result

    def semantic_candidates(self, raw_text: str, limit: int = 5) -> list[str]:
        """
        Candidate generation only. Never returns nutrient truth.

        Supported integration patterns:
        - self.chroma_client.get_collection(...).query(...)
        - no-op if chroma is not provided
        """
        if not self.chroma_client or not self.chroma_collection_name:
            return []

        try:
            collection = self.chroma_client.get_collection(self.chroma_collection_name)
            response = collection.query(
                query_texts=[raw_text],
                n_results=limit,
            )
        except Exception as exc:  # pragma: no cover
            logger.warning("Chroma semantic lookup failed: %s", exc)
            return []

        candidates: list[str] = []
        metadatas = response.get("metadatas") or []
        documents = response.get("documents") or []

        first_meta_list = metadatas[0] if metadatas else []
        for metadata in first_meta_list:
            if isinstance(metadata, dict):
                food_name = metadata.get("FoodName") or metadata.get("food_name") or metadata.get("canonical_food_name")
                if food_name and str(food_name) not in candidates:
                    candidates.append(str(food_name))

        if not candidates and documents:
            first_docs = documents[0] if isinstance(documents[0], list) else documents
            for doc in first_docs:
                text = str(doc).strip()
                if text and text not in candidates:
                    candidates.append(text)

        return candidates[:limit]

    def propose_alias(
        self,
        *,
        alias_text: str,
        canonical_food_name: str,
        source: str = "runtime_feedback",
        notes: str | None = None,
    ) -> None:
        alias_normalized = normalize_food_text(alias_text)
        if not alias_normalized:
            raise ValueError("Alias text normalizes to an empty string.")

        self.alias_db.upsert_alias(
            alias_text=alias_text,
            alias_normalized=alias_normalized,
            canonical_food_name=canonical_food_name,
            status="proposed",
            source=source,
            notes=notes,
        )

    def approve_alias(
        self,
        *,
        alias_text: str,
        canonical_food_name: str,
        source: str = "manual",
        notes: str | None = None,
    ) -> None:
        alias_normalized = normalize_food_text(alias_text)
        if not alias_normalized:
            raise ValueError("Alias text normalizes to an empty string.")

        self.alias_db.upsert_alias(
            alias_text=alias_text,
            alias_normalized=alias_normalized,
            canonical_food_name=canonical_food_name,
            status="approved",
            source=source,
            notes=notes,
        )

    def _lookup_with_engine(self, text: str) -> dict[str, Any]:
        """
        Adapter around the existing NutritionLookup.lookup().

        This method is intentionally defensive because the exact return shape
        of the current repository implementation was not provided.
        """
        raw = self.lookup_engine.lookup(text)

        payload = self._to_mapping(raw)
        canonical_food_name = self._pick_first_str(
            payload,
            "canonical_food_name",
            "food_name",
            "FoodName",
            "matched_food",
            "name",
        )

        per_100g = self._extract_nutrients(payload)
        explicit_match_type = self._pick_first_str(
            payload,
            "match_type",
            "match",
            "match_kind",
        )
        score = self._pick_first_float(
            payload,
            "score",
            "match_score",
            "confidence",
        )
        notes = self._extract_notes(payload)

        if not canonical_food_name and payload:
            # Some implementations may return the row directly without explicit name metadata.
            canonical_food_name = self._pick_first_str(payload, "food", "label")

        match_type = explicit_match_type or self._infer_match_type(text, canonical_food_name, score)

        return {
            "canonical_food_name": canonical_food_name,
            "match_type": match_type,
            "per_100g": per_100g,
            "score": score,
            "notes": notes,
        }

    def _infer_match_type(
        self,
        query_text: str,
        canonical_food_name: str | None,
        score: float | None,
    ) -> str:
        if not canonical_food_name:
            return "none"

        q = normalize_food_text(query_text)
        c = normalize_food_text(canonical_food_name)

        if q == c:
            return "exact"
        if c.startswith(q) or q.startswith(c):
            return "prefix"
        if q in c or c in q:
            return "substring"
        if score is not None and score >= 0.85:
            return "prefix"
        if score is not None and score >= 0.65:
            return "substring"
        return "substring"

    def _confidence_for_match_type(self, match_type: str) -> float:
        return {
            "exact": 1.0,
            "alias": 0.98,
            "prefix": 0.85,
            "substring": 0.65,
            "semantic": 0.40,
            "none": 0.0,
        }.get(match_type, 0.50)

    def _extract_nutrients(self, payload: dict[str, Any]) -> dict[str, float | None]:
        """
        Tries multiple common shapes:
        - payload["per_100g"] = {...}
        - payload["nutrients"] = {...}
        - flat row keys with numeric values
        """
        for key in ("per_100g", "nutrients"):
            value = payload.get(key)
            if isinstance(value, dict):
                return {str(k): self._safe_float_or_none(v) for k, v in value.items()}

        exclude = {
            "canonical_food_name", "food_name", "FoodName", "matched_food", "name",
            "match_type", "match", "match_kind", "score", "match_score", "confidence",
            "notes", "source",
        }

        nutrients: dict[str, float | None] = {}
        for key, value in payload.items():
            if key in exclude:
                continue
            if isinstance(value, (int, float, str)):
                converted = self._safe_float_or_none(value)
                if converted is not None:
                    nutrients[str(key)] = converted

        return nutrients

    def _extract_notes(self, payload: dict[str, Any]) -> list[str]:
        notes_value = payload.get("notes")
        if notes_value is None:
            return []
        if isinstance(notes_value, list):
            return [str(item) for item in notes_value]
        return [str(notes_value)]

    def _log_resolution(
        self,
        *,
        raw_text: str,
        normalized: str,
        result: NutritionMatch,
    ) -> None:
        if not self.log_path:
            return

        record = {
            "timestamp_utc": datetime.now(timezone.utc).isoformat(),
            "raw_query": raw_text,
            "normalized_query": normalized,
            "canonical_food_name": result.canonical_food_name,
            "match_type": result.match_type,
            "confidence": result.confidence,
            "source": result.source,
            "review_needed": result.review_needed,
            "used_alias": result.flags.get("used_alias", False),
            "used_semantic_fallback": result.flags.get("used_semantic_fallback", False),
            "candidate_foods": result.candidate_foods,
            "notes": result.notes,
        }

        with self.log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")

    def _to_mapping(self, raw: Any) -> dict[str, Any]:
        if raw is None:
            return {}

        if isinstance(raw, dict):
            return raw

        if is_dataclass(raw):
            return asdict(raw)

        if hasattr(raw, "__dict__"):
            return dict(vars(raw))

        return {"value": raw}

    def _pick_first_str(self, payload: dict[str, Any], *keys: str) -> str | None:
        for key in keys:
            value = payload.get(key)
            if value is not None and str(value).strip():
                return str(value).strip()
        return None

    def _pick_first_float(self, payload: dict[str, Any], *keys: str) -> float | None:
        for key in keys:
            if key in payload:
                converted = self._safe_float_or_none(payload[key])
                if converted is not None:
                    return converted
        return None

    def _safe_float_or_none(self, value: Any) -> float | None:
        if value is None:
            return None
        if isinstance(value, bool):
            return None
        try:
            return float(value)
        except (TypeError, ValueError):
            return None


_default_service: NutritionService | None = None


def get_nutrition_service() -> NutritionService:
    global _default_service
    if _default_service is None:
        _default_service = NutritionService()
    return _default_service


def resolve_nutrition(
    query: NutritionQuery | str,
    *,
    enable_semantic_fallback: bool = True,
    semantic_limit: int = 5,
) -> NutritionMatch:
    """
    Public function to be shared by chatbot, Flask app, scripts, and future APIs.
    """
    service = get_nutrition_service()
    return service.resolve_nutrition(
        query,
        enable_semantic_fallback=enable_semantic_fallback,
        semantic_limit=semantic_limit,
    )

Minimal integration examples

Chatbot / agent code

from nutrition_service import resolve_nutrition

result = resolve_nutrition("100g chicken breast")

if result.canonical_food_name:
    print(result.canonical_food_name, result.per_100g, result.match_type, result.confidence)
else:
    print(result.candidate_foods, result.review_needed)

Approving an alias

from nutrition_service import get_nutrition_service

service = get_nutrition_service()
service.approve_alias(
    alias_text="garbanzo beans",
    canonical_food_name="Chickpeas, mature seeds, cooked",
    notes="Common English variant",
)

Notes

This is designed to be dropped in without replacing your existing NutritionLookup. The adapter is intentionally defensive because I have not inspected the exact return shape of NutritionLookup.lookup(). If you paste that file next, I can wire the adapter to your exact schema and remove the fallback heuristics.

Phase 1 delivered here includes:

  • shared runtime service
  • SQLite alias persistence
  • normalization with quantity stripping
  • stable NutritionMatch payload
  • JSONL observability logging
  • optional Chroma candidate fallback path
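
For that last item, a wiring sketch. It assumes a standard chromadb client; the path and the collection name "foods" are placeholders, not values from your repo:

import chromadb

from nutrition_service import NutritionService

# Chroma supplies candidate names only; nutrient truth still comes from the CSV.
service = NutritionService(
    chroma_client=chromadb.PersistentClient(path="data/chroma"),  # path assumed
    chroma_collection_name="foods",                               # placeholder name
)
result = service.resolve_nutrition("grilled halloumi wrap")
print(result.match_type, result.candidate_foods)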

The contracts are locked, so the next step is just adapter cleanup:

  • make nutrition_agent.py consume IngredientNutritionResult / RecipeNutritionResult directly
  • remove any old key assumptions like ad hoc matched_food or custom macro dict shapes
  • update any UI/rendering code to read the frozen fields only

Priority order:

  1. multi_agent_chatbot/nutrition_agent.py
  2. any chatbot tool wrappers
  3. any UI/debug panels
  4. old tests that still assume pre-contract shapes

The goal now is boring and good:

all callers read the contract, nobody reaches into internals

 

(venv) [CML] tux@camelot …/camelot/ai-agents-crash-course $ python3 -m pytest tests/test_import_recipes.py multi_agent_chatbot/tests/test_nutrition_lookup.py multi_agent_chatbot/tests/test_nutrition_contracts.py
==================================================== test session starts ====================================================
platform linux -- Python 3.13.11, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/tux/camelot/ai-agents-crash-course
plugins: anyio-4.10.0
collected 17 items                                                                                                          

tests/test_import_recipes.py .............                                                                            [ 76%]
multi_agent_chatbot/tests/test_nutrition_lookup.py ..                                                                 [ 88%]
multi_agent_chatbot/tests/test_nutrition_contracts.py ..                                                              [100%]

==================================================== 17 passed in 0.04s =====================================================
(venv) [CML] tux@camelot …/camelot/ai-agents-crash-course $ 

Excellent — that means the foundation is genuinely stable now.

You have:

  • passing import/ingestion coverage
  • passing lookup behavior tests
  • passing contract tests
  • frozen ingredient and recipe result shapes

So the project has crossed an important line:

nutrition is no longer “just working by accident” — it now has a tested API surface.

The next step is the boring, valuable one:
update downstream callers to consume only the locked contracts.

Most likely target:

  • multi_agent_chatbot/nutrition_agent.py

What to do there:

  • import the canonical result types
  • read only query, ingredient, per_100g, signals, source
  • for recipe results, read only totals, per_serving, servings, breakdown, missing, source
  • remove any assumptions about older field names or internal lookup structures

At this point, I would not expand scope. No alias DB, no RAG changes, no recipe parsing overhaul. Just finish the contract adoption pass.
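
A minimal consumer sketch for that pass, assuming the contract results are dict-shaped with exactly the keys listed above (if they are dataclasses, swap indexing for attribute access; the import path is also an assumption):

# Hypothetical adapter inside multi_agent_chatbot/nutrition_agent.py.
# Reads only the frozen contract fields; no legacy keys, no internals.
from nutrition_lookup import NutritionLookup  # import path assumed

def describe_ingredient(lookup: NutritionLookup, text: str) -> str:
    result = lookup.lookup(text)  # contract: query, ingredient, per_100g, signals, source
    if result["ingredient"] is None:
        return f"No match for {result['query']!r}."
    protein = result["per_100g"].get("protein_g")  # nutrient key assumed
    return f"{result['ingredient']}: {protein} g protein per 100 g ({result['source']})."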


Contract


More than good, actually: it is the bit that stops the nutrition layer from turning back into swamp-magic.

Right now the contract does three useful things:

It tells every caller, “this is the shape you get back, and it is not a surprise.”
That means the agent, Chainlit, tests, and any future UI all stop improvising.

It separates ingredient truth from recipe math.
NutritionLookup answers “what is this food?”
NutritionCalculator answers “what happens when I combine these amounts?”
That is a clean split.

It gives you permission to change internals later without wrecking the surfaces you actually use.
You can swap matching logic, improve CSV handling, tighten signals, whatever — and the consumers should keep working as long as the contract holds.

So in plainer terms:

  • before: “works, but everyone kind of pokes at the blob differently”
  • now: “these are the official doors; stop climbing through windows”

That is why it matters.

What is next is not more contract theory. It is making sure every consumer actually uses those official doors and none of them are secretly rummaging around old fields.

If you want the sharper engineering phrase: consumer migration to the contract.

A few near-neighbors, depending on tone:

  • contract adoption — broad, practical
  • contract enforcement — stronger; implies tests/linting/backstops
  • consumer cleanup — informal, accurate
  • legacy field removal — the destructive part
  • API conformance pass — slightly more formal

The one I’d use in your repo notes is:

Contract adoption pass

because it means:
the contract exists, and now every caller is being moved onto it.

Yes. I’d call it a door inventory or contract adoption register.

Based on what you’ve told me so far, the doors look like this:

| Door | Purpose | Current status |
| --- | --- | --- |
| NutritionLookup.lookup() | ingredient-level nutrition result | Locked |
| NutritionCalculator.calculate() | recipe/batch nutrition result | Locked |
| multi_agent_chatbot/tests/test_nutrition_contracts.py | enforces the result shapes | Locked |
| multi_agent_chatbot/nutrition_agent.py tool-return path | returns contract dicts | Adopted |
| chatbot/nutrition_agent.py | returns contract dicts | Adopted |
| chatbot_complete/nutrition_agent.py | returns contract dicts | Adopted |
| Agent summary/formatting code | reads the contract to produce text | Needs verification |
| UI/debug panels | render contract fields only | Needs verification |
| Legacy field access | old keys / old shapes | Needs removal |
| Chainlit integration | should remain intact, only consume shared contracts | Protected; verify usage only |

The fetch phrase I’d use is:

“Find all consumers of the nutrition contracts and classify them as locked, adopted, needs verification, or legacy.”

If you want a grep-style version for repo work, use this idea:

contract adoption pass:
find all call sites of NutritionLookup.lookup and NutritionCalculator.calculate;
for each caller, record:
- returns contract directly
- reads contract correctly
- formats contract correctly
- still touches legacy fields
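
If you want that checklist as a runnable starting point rather than a prompt, here is a rough stdlib sketch (the patterns and the repo root are assumptions; it only finds call sites, the classification stays manual):

# Rough call-site inventory for the contract adoption pass.
import re
from pathlib import Path

PATTERNS = [
    r"NutritionLookup\s*\(",
    r"\.lookup\s*\(",
    r"NutritionCalculator\s*\(",
    r"\.calculate\s*\(",
]
REPO = Path(".")  # adjust to your repo root

for path in sorted(REPO.rglob("*.py")):
    if "venv" in path.parts:
        continue  # skip the virtualenv
    text = path.read_text(encoding="utf-8", errors="ignore")
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(re.search(p, line) for p in PATTERNS):
            print(f"{path}:{lineno}: {line.strip()}")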

And the status words I’d standardize are:

  • Locked — source contract is defined and tested
  • Adopted — consumer uses only the contract
  • Needs verification — probably okay, but not confirmed
  • Legacy — still touches old fields or compatibility glue
  • Protected — must not be removed or broken while migrating

So the phrase to ask your tools or Codex is:

“Run a contract adoption pass and produce a door inventory with statuses.”

That’s a good inventory.

I’d label the overall state as:

Contracts locked. Consumers adopted. No runtime legacy detected.

That gives you a crisp milestone name.

The only nuance I’d add is this:

The ingestion scripts are not legacy; they’re out of contract scope.
They can stay weird longer, as long as they do not leak raw shapes into runtime consumers.

So your categories now read like:

  • Locked — schema owners
  • Adopted — runtime consumers
  • Clear — nothing left in needs-verification
  • Out of scope — ingestion/build-time code
  • Protected — Chainlit and active streams must remain untouched

The next milestone after this is:

Contract protection

Meaning:

  • keep the tests
  • stop reintroducing ad hoc fields
  • require new consumers to use the frozen keys
  • keep Chainlit as a consumer, not a refactor casualty
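
A sketch of what "keep the tests" can look like in practice: a guard that fails the suite if the frozen key sets drift. The key names mirror the contract fields mentioned earlier, and the fixtures are assumed to produce one result of each shape:

# Hypothetical guard tests; the key sets must mirror the frozen contracts.
FROZEN_INGREDIENT_KEYS = {"query", "ingredient", "per_100g", "signals", "source"}
FROZEN_RECIPE_KEYS = {"totals", "per_serving", "servings", "breakdown", "missing", "source"}

def test_ingredient_contract_keys_are_frozen(ingredient_result: dict):
    assert set(ingredient_result) == FROZEN_INGREDIENT_KEYS

def test_recipe_contract_keys_are_frozen(recipe_result: dict):
    assert set(recipe_result) == FROZEN_RECIPE_KEYS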

That summary is no longer bland. It says the nutrition layer has gone from "interesting blob" to a governed interface.