Add docgen scene-gen: auto-generate Manim scenes from narration markdown #25

Open
jmjava wants to merge 1 commit into main from cursor/resolve-issue-1-scene-gen-96fc

jmjava commented Apr 16, 2026

Summary

This PR addresses the core design flaw described in issue #1: Manim scene authoring was a fully manual gap in the pipeline. Now docgen scene-gen can automatically generate renderable Manim scenes from narration markdown, eliminating the ??? step.

Problem

The pipeline was: narration → tts → timestamps → ??? → manim → compose

The ??? required manually writing Python Manim scene classes, reading timing.json to place animations at the right timestamps. This was undocumented, time-consuming (~30+ min per segment), and fragile when TTS durations changed.

Solution: docgen scene-gen

New CLI command that parses narration markdown and generates Manim scene code:

docgen scene-gen                     # generate all segments
docgen scene-gen --segment 01        # single segment
docgen scene-gen --dry-run           # preview without writing
docgen scene-gen --force             # overwrite existing scenes
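
For readers who want to see how these three flags fit together, here is a minimal sketch of the command surface using the standard library's argparse. It is an illustration only: the PR does not show cli.py, so the parser layout, the `build_parser` name, and the help strings are assumptions, and the real command may be built with a different CLI library.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the PR description."""
    parser = argparse.ArgumentParser(prog="docgen")
    sub = parser.add_subparsers(dest="command", required=True)

    scene_gen = sub.add_parser(
        "scene-gen", help="generate Manim scenes from narration markdown"
    )
    # No --segment means "all segments", matching the bare `docgen scene-gen` call.
    scene_gen.add_argument("--segment", help="only generate this segment, e.g. 01")
    scene_gen.add_argument(
        "--dry-run", action="store_true", help="preview without writing"
    )
    scene_gen.add_argument(
        "--force", action="store_true", help="overwrite existing scenes"
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Keeping `--segment` as a string rather than an integer preserves zero-padded identifiers such as `01`.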

The pipeline now becomes: narration → tts → timestamps → scene-gen → manim → compose

How it works

Narration parsing extracts visual beats from markdown structure:

  • # Heading → title card (font_size=36, FadeIn/FadeOut)
  • - Item / 1. Item → bullet list (VGroup.arrange, sequential reveals)
  • Plain paragraphs → centered body text (font_size=20)
  • --- horizontal rules → visual transitions
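
The mapping above can be sketched as a small line-based parser. This is a hedged illustration, not the code in scene_gen.py: the `Beat` dataclass, the `parse_beats` name, and the regular expressions are hypothetical, and treating every heading level (not just `#`) as a title card is an assumption.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Beat:
    """One visual beat; kind is "title", "bullets", "text", or "transition"."""
    kind: str
    text: str = ""
    items: list[str] = field(default_factory=list)


_HEADING = re.compile(r"^#{1,6}\s+(.*)$")       # "# Heading"
_BULLET = re.compile(r"^(?:[-*]|\d+\.)\s+(.*)$")  # "- Item" or "1. Item"
_RULE = re.compile(r"^-{3,}$")                   # "---"


def parse_beats(markdown: str) -> list[Beat]:
    beats: list[Beat] = []
    paragraph: list[str] = []

    def flush_paragraph() -> None:
        # Consecutive plain lines are joined into a single body-text beat.
        if paragraph:
            beats.append(Beat("text", text=" ".join(paragraph)))
            paragraph.clear()

    for raw in markdown.splitlines():
        line = raw.strip()
        if not line:
            flush_paragraph()
            continue
        if _RULE.match(line):
            flush_paragraph()
            beats.append(Beat("transition"))
        elif m := _HEADING.match(line):
            flush_paragraph()
            beats.append(Beat("title", text=m.group(1)))
        elif m := _BULLET.match(line):
            flush_paragraph()
            # Adjacent list items accumulate into one bullet-list beat.
            if beats and beats[-1].kind == "bullets":
                beats[-1].items.append(m.group(1))
            else:
                beats.append(Beat("bullets", items=[m.group(1)]))
        else:
            paragraph.append(line)
    flush_paragraph()
    return beats
```

Checking the horizontal rule before the bullet pattern is a defensive ordering choice; with these patterns `---` cannot match as a list item anyway, because a bullet requires whitespace after the marker.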

Timing integration reads animations/timing.json (from docgen timestamps) to distribute beats evenly across the audio duration with buffer between sections.
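
As a rough illustration of "distribute beats evenly with a buffer", the sketch below splits a known audio duration into equal windows separated by a fixed gap. The PR does not show the schema of timing.json or the buffer length, so the function takes the duration as a plain number, and `distribute_beats` and the 0.5 s default are assumptions.

```python
def distribute_beats(
    n_beats: int, duration: float, buffer: float = 0.5
) -> list[tuple[float, float]]:
    """Return (start, end) windows in seconds, one per beat.

    The windows have equal length and are separated by `buffer` seconds,
    so the last window ends exactly at `duration`.
    """
    if n_beats <= 0:
        return []
    total_gap = buffer * (n_beats - 1)
    slot = (duration - total_gap) / n_beats
    if slot <= 0:
        # Audio too short for the requested gaps: drop the buffer entirely.
        buffer, slot = 0.0, duration / n_beats

    windows: list[tuple[float, float]] = []
    start = 0.0
    for _ in range(n_beats):
        windows.append((start, start + slot))
        start += slot + buffer
    return windows
```

Because the windows are derived from the duration alone, re-running the generator after the TTS audio gets longer or shorter simply rescales every window.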

Generated code follows all lessons from issue #3:

  • Text.set_default(font="Liberation Sans") before any Text()
  • Uses arrange(DOWN), center(), to_edge() — never absolute coordinates
  • Never uses weight=BOLD (Pango font substitution prevention)
  • Replaces unsafe unicode (→, ›, —, etc.) with ASCII equivalents
  • Minimum font sizes: title=36, bullets=18, body=20
  • Dark background (#1e1e2e) with WHITE text

Manual scenes remain supported: generated files are per-segment (scene_01.py) and won't overwrite existing files without --force.
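
The overwrite guard can be sketched as follows. This is an assumption about the behavior rather than the code in scene_gen.py: `write_scene` and its return values are hypothetical, and the sketch guesses that an existing file is reported as skipped even during a dry run.

```python
from pathlib import Path


def write_scene(
    path: Path, source: str, force: bool = False, dry_run: bool = False
) -> str:
    """Write a generated scene file; return "skipped", "dry-run", or "written"."""
    if path.exists() and not force:
        # A hand-written or previously generated scene is left untouched.
        return "skipped"
    if dry_run:
        # Preview mode: report what would happen without touching disk.
        return "dry-run"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source, encoding="utf-8")
    return "written"
```

Checking for an existing file before anything else means a manually authored scene_01.py survives every run unless the caller explicitly passes `force=True`.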

New files

| File | Purpose |
| --- | --- |
| src/docgen/scene_gen.py | Scene generator module: parsing, timing, code generation |
| tests/test_scene_gen.py | 29 tests covering all components |

Changes to existing files

| File | Change |
| --- | --- |
| src/docgen/cli.py | Added scene-gen command |
| src/docgen/config.py | Added manim_font property |

Acceptance criteria from issue #1

| Criterion | Status |
| --- | --- |
| docgen init scaffolds projects that produce watchable videos without hand-written Manim | Addressed: scene-gen generates renderable scenes from narration |
| Regenerating TTS (longer/shorter audio) automatically adjusts visuals | Addressed: scene-gen reads timing.json; re-running after timestamps updates timing |
| Manual Manim scenes remain supported as an upgrade path | Done: per-segment files, no overwrite without --force |
| Documentation clearly states the visual authoring workflow | Done: CLI help, docstrings, generated file headers |

Testing

  • 125 total tests pass (pytest tests/ --ignore=tests/e2e)
  • 29 new tests for scene_gen module
  • ruff check src/ tests/ — all checks passed

Closes #1

Implements the core of issue #1 (Option A + Option C hybrid):

- New module: scene_gen.py — parses narration markdown to extract visual
  beats (titles, bullets, text, transitions) and generates Manim scene
  Python code with proper timing from timing.json

- New CLI command: docgen scene-gen [--segment] [--force] [--dry-run]
  Auto-generates scene files per segment from narration structure

- Narration parsing extracts:
  - Headings → title cards with FadeIn/FadeOut
  - Bullet lists → sequential text reveals with VGroup.arrange()
  - Plain text → centered body text
  - Horizontal rules → visual transitions

- Generated code follows all font/layout lessons from issue #3:
  - Text.set_default(font=...) called before any Text()
  - Uses arrange(DOWN) and center(), never absolute coordinates
  - Never uses weight=BOLD
  - Replaces unsafe unicode with ASCII equivalents
  - Font sizes ≥ 14pt (titles=36, bullets=18, body=20)
  - Dark background (#1e1e2e) with WHITE text

- Timing integration: reads duration from animations/timing.json to
  distribute beats evenly across the audio duration

- Manual scenes remain supported: generated files are per-segment
  (scene_01.py, etc.), won't overwrite without --force

- Config: adds manim.font property to config.py

- 29 new tests covering parsing, timing, code generation, and generator
  integration (125 total tests passing)

Closes #1

Co-authored-by: John Menke <jmjava@gmail.com>
@cursor cursor bot force-pushed the cursor/resolve-issue-1-scene-gen-96fc branch from ea090d4 to a73c69b Compare April 16, 2026 14:22

Development

Successfully merging this pull request may close these issues.

Design flaw: Manim scene authoring is a manual gap in the pipeline

2 participants