
LLM practical notes: prompting, tools, eval and production

2025-09-04

This note summarizes recent production learnings on prompting, tool orchestration, evaluation and feedback loops, latency and cost, and safety.

Prompting

  • Define the output schema up front, add few-shot examples, and state how errors should be handled (e.g., fill missing fields with nulls).
  • Explicitly constrain what the model should attend to (for example, only the text inside delimiters you provide).
  • Separate the system prompt from the task prompt so each stays concise; see the sketch after this list.
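
A minimal sketch combining these three points. The schema, few-shot example, and prompt wording here are illustrative, not a fixed format; the resulting messages list can be passed to any chat-completion-style client.

```python
import json

# System prompt: role and global constraints only, kept short.
SYSTEM = (
    "You are an information extractor. Reply with JSON only, "
    "matching the schema exactly. Use null for any field you "
    "cannot determine; never invent values."
)

# Output schema stated explicitly in the task prompt.
SCHEMA = {"name": "string|null", "email": "string|null", "company": "string|null"}

# Few-shot examples pin down the format and the null-filling behavior.
FEW_SHOT = [
    ("Contact Jane Doe at jane@acme.io.",
     {"name": "Jane Doe", "email": "jane@acme.io", "company": None}),
]

def build_messages(text: str) -> list[dict]:
    """Assemble system vs. task prompts; the task prompt carries the
    schema, the examples, and an explicit attention constraint."""
    task = [
        f"Schema: {json.dumps(SCHEMA)}",
        "Use ONLY the text between <input> tags; ignore anything else.",
    ]
    for src, out in FEW_SHOT:
        task.append(f"Example input: {src}\nExample output: {json.dumps(out)}")
    task.append(f"<input>{text}</input>")
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "\n\n".join(task)},
    ]

messages = build_messages("Ping Bob Li (bob@widgets.example) from Widgets Inc.")
```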

Tools & orchestration

  • Let the LLM handle decisions and routing; push precise parsing down to regex, ASTs, or rule-based code.
  • Integrate retrieval, parsing, and external APIs with observable intermediate results, as in the sketch below.
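
A sketch of this split. `llm_classify`, `lookup_invoice`, and the intent labels are stand-ins for whatever model call and backend you actually use; the point is that the model only picks the path, while a regex does the exact extraction and every intermediate is logged.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orchestrator")

# Precise extraction lives in deterministic code, not in the model.
INVOICE_ID = re.compile(r"\bINV-\d{6}\b")

def route(text: str, llm_classify) -> str:
    """The LLM decides *which* path to take; regex does the parsing.
    Intermediates are logged so failures are observable."""
    intent = llm_classify(text)  # e.g. "invoice_lookup" | "other"
    log.info("routing intent=%s", intent)
    if intent == "invoice_lookup":
        m = INVOICE_ID.search(text)
        log.info("parsed invoice_id=%s", m.group(0) if m else None)
        if m:
            return lookup_invoice(m.group(0))
    return "fallback: hand off to a human"

def lookup_invoice(invoice_id: str) -> str:
    # Stand-in for a real external API call.
    return f"status of {invoice_id}: paid"

print(route("What's the status of INV-004217?", lambda t: "invoice_lookup"))
```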

Eval & feedback loop

  • Build a golden set spanning modalities, languages and edge cases.
  • Wire online feedback back into the datasets; enforce guardrails and fallbacks. A minimal harness follows this list.
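
A minimal harness sketch. The JSONL layout, the `tags` field marking language/modality/edge-case categories, and the file path are assumptions, not a fixed format; the key property is that online feedback is appended into the same dataset the next eval run reads.

```python
import json
from pathlib import Path

GOLDEN_PATH = Path("golden_set.jsonl")  # illustrative path

def load_golden(path: Path) -> list[dict]:
    """Each line: {"input": ..., "expected": ..., "tags": [...]}."""
    return [json.loads(line) for line in path.read_text().splitlines() if line]

def evaluate(predict, cases: list[dict]) -> dict:
    """Run a predictor over the golden set and report accuracy per tag,
    so regressions on specific languages or edge cases stay visible."""
    per_tag: dict[str, list[bool]] = {}
    for case in cases:
        ok = predict(case["input"]) == case["expected"]
        for tag in case.get("tags", ["untagged"]):
            per_tag.setdefault(tag, []).append(ok)
    return {tag: sum(oks) / len(oks) for tag, oks in per_tag.items()}

def record_feedback(case: dict, path: Path = GOLDEN_PATH) -> None:
    """Online feedback (e.g., a corrected label) flows straight back
    into the dataset used for the next evaluation run."""
    with path.open("a") as f:
        f.write(json.dumps(case) + "\n")
```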

Latency & cost

  • Try smaller models first under tight output constraints; add caching, truncation, and graceful degradation.
  • Parallelize independent calls and keep only the serial steps that are truly necessary; see the sketch below.
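
A sketch of these tactics combined. The model calls, confidence rule, and truncation budget are illustrative stand-ins: cache hits skip the model entirely, the cheap model is tried first with escalation only on low confidence, and independent prompts run in parallel.

```python
import asyncio
from functools import lru_cache

MAX_PROMPT_CHARS = 4_000  # truncation budget, illustrative

@lru_cache(maxsize=10_000)
def cached_answer(prompt: str) -> str:
    """Cache hits skip the model; prompts are truncated before calling."""
    prompt = prompt[:MAX_PROMPT_CHARS]
    answer = call_small_model(prompt)   # cheap model first
    if answer is None:                  # degrade to the big model only when needed
        answer = call_large_model(prompt)
    return answer

def call_small_model(prompt: str) -> str | None:
    # Stand-in: return None when the small model is not confident.
    return "small-model answer" if len(prompt) < 200 else None

def call_large_model(prompt: str) -> str:
    return "large-model answer"

async def answer_batch(prompts: list[str]) -> list[str]:
    """Independent prompts run in parallel; only truly dependent
    steps should be serialized."""
    return await asyncio.gather(
        *(asyncio.to_thread(cached_answer, p) for p in prompts)
    )

print(asyncio.run(answer_batch(["short question", "x" * 500])))
```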

Safety

  • Mask secrets and PII in prompts and the knowledge base; post-filter model outputs; keep audit logs. A masking sketch follows.
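
A sketch of masking on the way in, post-filtering on the way out, and auditing both. The two regex patterns, the log file name, and `guarded_call` are simplified illustrations; real deployments need far broader pattern coverage.

```python
import logging
import re

logging.basicConfig(filename="llm_audit.log", level=logging.INFO)  # illustrative sink
audit = logging.getLogger("audit")

# Simple illustrative patterns; extend for your own secret/PII types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "API_KEY": re.compile(r"\b(sk|key)-[A-Za-z0-9]{16,}\b"),
}

def mask(text: str) -> str:
    """Replace secrets/PII with typed placeholders before the text
    reaches the prompt or the knowledge base."""
    for label, pat in PATTERNS.items():
        text = pat.sub(f"[{label}]", text)
    return text

def post_filter(output: str) -> str:
    """Defense in depth: mask anything that leaked into the output."""
    return mask(output)

def guarded_call(user_text: str, llm) -> str:
    masked_in = mask(user_text)
    raw_out = llm(masked_in)
    safe_out = post_filter(raw_out)
    audit.info("in=%r out=%r", masked_in, safe_out)  # audit log entry
    return safe_out

print(guarded_call("email me at bob@acme.io, key sk-ABCDEF1234567890XYZ", lambda t: t))
```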