# Fireship “Latent Program” (Actor–Critic Distillation)
This is a static copy of `docs/fireship-latent/README.md` so Studio can render it reliably in production.
If you update the doc, update this file too.
Goal: capture a creator’s *recognizable* style as a reusable **prompt program** + **rubric**, not as a rigid beat table.
This is the new Fireship pipeline “workspace”. It is intentionally clean and self-contained.
---
## Core idea
Instead of encoding Fireship into a small, quantized beatmap, we treat “style latent space” as:
1) a **prompt program** (system + instructions + constraints + knobs), and
2) an **actor–critic loop** that rejects drift and pushes drafts into the right basin of attraction.
The beatmap/timestamps are optional diagnostics *later* (for pacing polish), not the generator’s skeleton.
---
## Style invariants (the “contract”)
These are the things that must remain true for output to be recognizably “Code Report”:
### A) Editorial voice + vibe
- **Tea-first**: it feels like gossip about incentives and power, not a tutorial.
- **No influencer filler**: no “in today’s video…” or generic hype; every line earns its spot.
- **Confidence with hedges**: spicy claims are explicitly attributed (“reports say…”, “according to…”) or marked uncertain.
### B) The drop must land (EDM payoff)
- A single, quotable **thesis line** appears early enough and is revisited.
- The thesis is **proven**, not merely stated (cause → mechanism → impact).
- The “why is this sexy?” is answered plainly for a technical viewer.
### C) Receipts discipline
- Strong claims are anchored to **on-screenable receipts** (issue threads, docs, headlines, product UI).
- No long stretches of pure explanation without a reset (receipt/joke/visual pivot).
- If we can’t source it, we either **omit** it or label it as **unverified**.
### D) Compression + density
- High information density: every sentence is a joke, a claim, a reason, or a receipt.
- Minimal scene-setting; fast to the first compelling thing.
- Avoid “rigid segmentation artifacts” (scripts that feel like they were written to a grid).
### E) Rhythm (qualitative, not a hard template)
- Frequent **buttons** (short lines that land) and micro-pauses for punchlines.
- Varied cadence (fast bursts + intentional landings).
- Avoid breathless WPM spikes that make it sound like a legal disclaimer.
### F) Practical ending
- Ends with a **do-this** takeaway: BYOK keys, routing, fallbacks, etc.
- Outro is short, sharp, and on-brand.
---
## Artifacts (what we store)
### 1) `program` (the latent prompt program)
- System prompt (voice)
- Generation prompt (task + constraints)
- Negative prompt (what not to do)
- Knobs (spice level, hedge strictness, receipts density, etc.)
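One way a versioned `program` asset could look as a minimal sketch. All field names and values here are illustrative assumptions, not the repo's actual schema:

```python
# Hypothetical shape for a versioned `program` asset.
# Field names and knob semantics are illustrative, not the real schema.
program = {
    "version": "v1",
    "system_prompt": "You write in a dense, tea-first tech-news voice...",
    "generation_prompt": "Write a short script on {topic} using only the allowed claims...",
    "negative_prompt": "No 'in today's video', no generic hype, no unsupported spicy claims.",
    "knobs": {
        "spice_level": 0.7,       # 0..1: how pointed the editorial takes are
        "hedge_strictness": 0.9,  # 0..1: how aggressively claims must be attributed
        "receipts_density": 0.5,  # target fraction of lines anchored to a receipt
    },
}
```

Keeping the knobs numeric (rather than prose) makes diffs between program versions meaningful.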
### 2) `rubric` (what “recognizable Fireship” means)
- Scored categories (hook/goss, drop/payoff, clarity, receipts discipline, cadence, non-cheese)
- Auto-fails (e.g., “premise is an error message”, no explicit drop, no receipts)
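A minimal sketch of how the rubric could be applied, assuming the critic returns per-category scores as a dict and failure flags as a set; the auto-fail check short-circuits scoring. Flag names are hypothetical:

```python
# Hypothetical auto-fail flags; the real list lives in the versioned rubric asset.
AUTO_FAILS = {"premise_is_error_message", "no_explicit_drop", "no_receipts"}

def score_draft(category_scores: dict, flags: set) -> float:
    """Average the rubric categories, but return 0.0 if any auto-fail flag fired."""
    if flags & AUTO_FAILS:
        return 0.0
    return sum(category_scores.values()) / len(category_scores)
```

Auto-fails as hard zeros keep the loop from optimizing a draft that violates the contract outright.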
### 3) `drafts` (a distillation trace)
- `draft_v0.md` (actor)
- `critique_v0.json` (critic)
- `draft_v1.md` (actor revision)
- …
- `best.md` + `best_score.json`
Optional diagnostics later:
- Audio timing stats (WPM curve, pause density)
- Shotlist JSON (derived from actual audio)
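The draft/critique naming convention above can be generated mechanically; a small helper (hypothetical, mirroring the filenames listed):

```python
def trace_paths(iteration: int):
    """File names for one actor-critic round, matching the draft/critique convention."""
    return (f"draft_v{iteration}.md", f"critique_v{iteration}.json")
```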
---
## Actor–Critic loop (GAN-ish, but LLM-native)
Key difference from a classifier: the critic is a **thought partner** that must produce *actionable edits*.
### Loop sketch
```
TOPIC + RECEIPTS
|
v
Evidence Pack (tight research summary + receipt list + "allowed claims")
|
v
ACTOR (draft) ------------------------------+
| |
v |
CRITIC (score + failure modes + rewrites) |
| |
v |
ACTOR (revise using critique) |
| |
+------------- repeat N --------------+
|
v
BEST DRAFT + TRACE
```
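The diagram above can be sketched as a plain loop. `actor` and `critic` are placeholders for LLM calls (assumptions, not an existing API); the critic is assumed to return a dict with a numeric `"score"` plus actionable `"edits"`:

```python
def distill(topic, receipts, actor, critic, rounds=3):
    """Run the actor-critic loop: draft, critique, revise, keep the best.

    `actor(topic, receipts, critique)` and `critic(draft)` stand in for LLM calls.
    Returns the best draft, its score, and the full trace for inspection.
    """
    trace = []
    best_draft, best_score = None, float("-inf")
    critique = None  # first round: the actor drafts with no critique
    for _ in range(rounds):
        draft = actor(topic, receipts, critique)
        critique = critic(draft)
        trace.append((draft, critique))
        if critique["score"] > best_score:
            best_draft, best_score = draft, critique["score"]
    return best_draft, best_score, trace
```

Note the loop keeps the best-scoring draft, not the last one, since revisions can regress.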
### How the critic improves over time
We update the critic’s rubric/prompt with:
- the recurring failure patterns (e.g. “cold-open on error”, “no drop”, “unsupported spicy claim”)
- high-quality positive exemplars (short excerpt-level patterns, not full transcripts)
- “red flag” phrasing to auto-detect cheesy drift
Important: we do **not** hard-template the structure; we hard-template the **failure modes**.
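The "red flag" phrasing check is the most mechanizable piece of the critic. A sketch with a hypothetical phrase list; the real list would be grown from recurring critic findings:

```python
import re

# Hypothetical red-flag patterns for cheesy drift; illustrative only.
RED_FLAGS = [r"in today's video", r"smash that like button", r"game[- ]changer"]

def red_flags(script: str) -> list:
    """Return which cheesy-drift patterns appear in the draft (case-insensitive)."""
    return [p for p in RED_FLAGS if re.search(p, script, re.IGNORECASE)]
```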
---
## Implementation plan (in this repo)
Phase 1 (text-only distillation; no video required):
1) Define `program` + `rubric` as versioned JSON/MD assets.
2) Build a CLI script that runs the actor–critic loop on a topic + receipts pack.
3) Publish artifacts (drafts + critiques + best) to Blob and show them in Studio.
Phase 2 (optional audio diagnostics):
4) Generate VO from the best script; compute pacing stats.
5) Use stats as another critic input (not as generation skeleton).
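Pacing stats of this kind could be computed from word-level timestamps, assuming the TTS engine or a forced aligner emits `(word, start_sec, end_sec)` triples; the threshold value is an assumption:

```python
def pacing_stats(word_times, pause_threshold=0.35):
    """Compute overall WPM and pause density from (word, start_sec, end_sec) triples.

    Pause density = gaps longer than `pause_threshold` seconds, per minute of audio.
    `word_times` is assumed to come from TTS timestamps or forced alignment.
    """
    minutes = (word_times[-1][2] - word_times[0][1]) / 60
    pauses = sum(
        1
        for (_, _, end), (_, next_start, _) in zip(word_times, word_times[1:])
        if next_start - end > pause_threshold
    )
    return {"wpm": len(word_times) / minutes, "pauses_per_min": pauses / minutes}
```

A WPM curve (windowed rather than overall) would feed the "breathless spikes" check in the rhythm invariant.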
---
## Why this is an upgrade
- Stops over-quantizing style into a rigid beat grid.
- Makes “recognizable Fireship” a **contract + rejection loop**, not a template.
- Produces a full trace, so quality is inspectable and improvable over time.