Summary #
Firebase Genkit is the best recommendation for "purely local" tracing in a TypeScript/Google GenAI project. Unlike other tools that require running Docker containers (LangFuse) or Python servers (Arize Phoenix), Genkit provides a rich Developer UI that runs directly in your Node.js environment via npx genkit start.
While Genkit is a full framework, you can use it minimally as a "wrapper" around your agent to gain instant access to trace visualization, input/output inspection, and dataset curation tools—all locally[1].
Philosophy & Mental Model #
Genkit models your application as a set of Flows. A Flow is just a function with a schema.
- Observability by Default: Anything inside a Flow is automatically traced.
- Local-First Developer Experience: The genkit start tool is the center of the universe. It acts as a debugger, trace viewer, and playground.
- Reflection: The UI inspects your code (via TypeScript reflection/schema) to generate test interfaces.
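To make the "function with a schema, traced by default" idea concrete, here is a dependency-free toy in TypeScript. It is a sketch of the concept only, not Genkit's implementation: defineToyFlow, parsePrompt, shoutFlow, and the in-memory traces array are hypothetical names invented for illustration, and the hand-written validator merely stands in for a real schema.

```typescript
// Toy model of "a Flow is a function with a schema" plus tracing by default.
// Hypothetical names throughout; this is not Genkit's API or implementation.

type InputValidator<T> = (value: unknown) => T;

interface TraceEntry {
  flow: string;
  input: unknown;
  output?: unknown;
  error?: string;
  durationMs: number;
}

// In-memory stand-in for the trace store a UI would read from.
const traces: TraceEntry[] = [];

function defineToyFlow<I, O>(
  name: string,
  parseInput: InputValidator<I>,
  handler: (input: I) => Promise<O>,
): (raw: unknown) => Promise<O> {
  return async (raw: unknown): Promise<O> => {
    const started = Date.now();
    try {
      const input = parseInput(raw); // schema check runs before the handler
      const output = await handler(input);
      traces.push({ flow: name, input: raw, output, durationMs: Date.now() - started });
      return output;
    } catch (err) {
      // Failures are traced too, then rethrown to the caller.
      traces.push({ flow: name, input: raw, error: String(err), durationMs: Date.now() - started });
      throw err;
    }
  };
}

// Hand-written validator standing in for a schema such as { prompt: string }.
const parsePrompt: InputValidator<{ prompt: string }> = (value) => {
  const candidate = value as { prompt?: unknown } | null;
  if (typeof candidate === 'object' && candidate !== null && typeof candidate.prompt === 'string') {
    return { prompt: candidate.prompt };
  }
  throw new TypeError('expected an object of the form { prompt: string }');
};

const shoutFlow = defineToyFlow('shout', parsePrompt, async ({ prompt }) => prompt.toUpperCase());
```

Because the wrapper owns both validation and recording, the handler itself contains no logging code. That is the property the bullets above describe.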
Setup #
Install the Genkit CLI and core packages:
npm install -g genkit-cli
npm install genkit @genkit-ai/googleai
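Initialize a minimal config (no Firebase Cloud project is needed for local use). Newer CLI releases may not ship an init command; in that case the genkit({...}) call in your code is the whole configuration and this step can be skipped.
# Initialize in the current directory
genkit init
# Select "Local" mode when prompted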
Core Usage Patterns #
Pattern 1: The "Wrapper" Flow #
To get tracing without rewriting your entire agent, wrap your entry point in a defineFlow. This captures the top-level inputs/outputs and any internal steps.
import { genkit, z } from 'genkit';
import { googleAI, gemini15Flash } from '@genkit-ai/googleai';

const ai = genkit({
  plugins: [googleAI()],
  model: gemini15Flash, // Sets default model
});

// Wrap your existing agent logic
export const runAgent = ai.defineFlow(
  {
    name: 'runAgent',
    inputSchema: z.object({ prompt: z.string() }),
    outputSchema: z.string(),
  },
  async (input) => {
    // Your existing @google/genai code can go here
    // Or use the Genkit 'generate' helper for auto-instrumentation
    const result = await ai.generate({
      prompt: input.prompt,
    });

    return result.text;
  }
);
Pattern 2: The Local Developer UI #
This is where the "Agent Tuning" magic happens. You don't read logs; you inspect traces.
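npx genkit start -- npx tsx --watch src/index.ts
The command after the -- separator is what starts your app. Here src/index.ts stands in for whichever file defines your flows, and tsx is one common way to run TypeScript with file watching. This launches the Developer UI at http://localhost:4000 by default. You can: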
- Run your agent with various inputs.
- View Traces: See every step (retrieval, tool call, generation).
- Rate & Curate: Mark runs as "positive/negative" (stored locally).
- Export: Save high-quality traces to JSON for fine-tuning.
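As a rough illustration of the last two bullets, the sketch below turns a set of rated runs into a JSONL file body. Everything about the data shape is an assumption: CuratedRun, its rating field, and the prompt/completion output keys are hypothetical, and the actual JSON exported by the Developer UI may look different, so treat this as a starting point to adapt rather than a description of Genkit's format.

```typescript
// Hypothetical shape for a locally curated run; the real export format may differ.
interface CuratedRun {
  input: string;
  output: string;
  rating: 'positive' | 'negative';
}

// Keep only positively rated runs and emit one JSON object per line (JSONL).
// The { prompt, completion } keys are an assumed target format.
function toFineTuningJsonl(runs: CuratedRun[]): string {
  return runs
    .filter((run) => run.rating === 'positive')
    .map((run) => JSON.stringify({ prompt: run.input, completion: run.output }))
    .join('\n');
}

const sampleRuns: CuratedRun[] = [
  { input: 'Summarize the release notes', output: 'Three bug fixes and one new flag.', rating: 'positive' },
  { input: 'Summarize the release notes', output: 'I cannot help with that.', rating: 'negative' },
];

// Only the positively rated run survives, so this is a single JSONL line.
const jsonl = toFineTuningJsonl(sampleRuns);
```

Keeping the filter as a pure function over plain objects means it can be re-pointed at whatever fields the exported traces really contain.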
Pattern 3: Manual Steps (Spans) #
If you have complex logic (e.g., a loop) inside the flow, use run to create sub-spans in the trace.
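// Inside the flow: wrap a unit of work in a named step so it shows up as its own span.
// myVectorStore is a placeholder for your own retrieval client.
const context = await ai.run('retrieve-context', async () => {
  return await myVectorStore.search(input.prompt);
});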
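For the loop case mentioned above, the parent/child relationship between spans can be pictured with a small stand-in that does not call Genkit at all. runStep, spans, activeStack, and agentLoop are hypothetical names, and the stack-based parent tracking is a simplification that only holds when steps are awaited one after another (not run in parallel).

```typescript
// Toy stand-in for named sub-spans; hypothetical names, not Genkit's implementation.
interface ToySpan {
  name: string;
  parent: string | null;
}

const spans: ToySpan[] = [];
const activeStack: string[] = []; // names of the steps currently in progress

async function runStep<T>(name: string, fn: () => Promise<T>): Promise<T> {
  // The innermost active step, if any, becomes this span's parent.
  const parent = activeStack[activeStack.length - 1] ?? null;
  spans.push({ name, parent });
  activeStack.push(name);
  try {
    return await fn();
  } finally {
    activeStack.pop();
  }
}

// Each loop iteration gets its own named span under the enclosing 'agent-loop' span.
async function agentLoop(prompts: string[]): Promise<string[]> {
  return runStep('agent-loop', async () => {
    const results: string[] = [];
    let iteration = 0;
    for (const prompt of prompts) {
      const cleaned = await runStep(`iteration-${iteration}`, async () => prompt.trim());
      results.push(cleaned);
      iteration += 1;
    }
    return results;
  });
}
```

Giving each iteration a distinct name (iteration-0, iteration-1, ...) is what makes a slow or failing pass easy to spot in a trace tree.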
Anti-Patterns & Pitfalls #
❌ Don't: Use Production Tracing Locally #
Don't configure the Google Cloud Trace exporter when running locally. It adds latency and requires credentials. Stick to the default local trace store (filesystem-based).
✅ Instead: Use the Dev UI #
Rely on genkit start for all local debugging. It reads from the .genkit/ directory where traces are stored.
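One way to keep the production exporter out of local runs is to gate it on the environment. The sketch below shows only the decision logic. shouldExportTraces and describeTracing are hypothetical helpers, and using NODE_ENV === 'production' as the signal is an assumption about how this project marks deployed builds.

```typescript
// The environment is passed in as a plain object so the sketch stays dependency-free.
type Env = Record<string, string | undefined>;

// Assumption: NODE_ENV === 'production' marks a deployed build in this project.
function shouldExportTraces(env: Env): boolean {
  return env['NODE_ENV'] === 'production';
}

function describeTracing(env: Env): string {
  return shouldExportTraces(env)
    ? 'production exporter enabled'
    : 'local trace store only (inspect with the Dev UI)';
}

// In a real entry point you might call shouldExportTraces(process.env) and only
// then enable whichever production exporter you deploy with.
```

With this guard, a plain local run never touches cloud credentials, and the exporter setup lives in a single branch.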
❌ Don't: Mix @google/genai and Genkit indiscriminately #
While mixing them is possible, you get better traces if you use ai.generate() (Genkit's API) instead of calling the @google/genai SDK's models.generateContent() directly. Genkit's wrapper automatically logs token usage, model parameters, and safety settings to the trace.
Caveats #
- Framework Buy-in: Genkit is opinionated. You must use defineFlow to get the benefits. It's not just a "logger" you drop in.
- Node.js Only: The best tooling is currently for Node.js (perfect for this project).
- Storage: Local traces live in a file-based store on your machine (the .genkit/ directory), which is not designed for long-term history. For that you eventually need a backend, but for "Agent Tuning" sessions local storage is fine.
References #
[1] Firebase Genkit Documentation - Official docs
[2] Genkit Local Observability - Details on the Developer UI
[3] @genkit-ai/googleai - NPM package