Vector Atlas — Primer
Origins
Vector Atlas is an interactive map of over 1,200 AI-generated poems from The Magic Porridge Poet, plotted in two-dimensional vector space. Poems that are semantically similar appear close together; dissimilar poems are far apart. The axes have no intrinsic label — they are emergent dimensions of meaning.
The project makes the abstract concept of high-dimensional semantic space tangible: you can pan, zoom, and click through a landscape of meaning. It builds on dimensionality reduction techniques explored in the earlier Vector Transmissions project and connects to the poem search capabilities from Words Fail, Send Love.
Structure
The project has several interconnected components:
- Embedding pipeline: Poems are embedded as 1,536-dimensional vectors using OpenAI's text-embedding-3-small model, stored in a pgvector-enabled Supabase database
- Dimensionality reduction: scripts/compute-vector-atlas.py uses UMAP (Uniform Manifold Approximation and Projection) to reduce 1,536 dimensions to 2, preserving local semantic structure
- Pre-computed dataset: lib/data/vector-atlas.json, containing ~3,870 points (each poem produces 3 chunks: poem text, author's note, insight)
- Interactive map: Built with Leaflet using a flat coordinate system (CRS.Simple: no geography, no tiles, just a blank plane of meaning)
- Layer system: Three toggleable layers — poems (blue), notes (yellow), insights (aqua) — each rendered as Canvas-backed CircleMarkers for performance
- Poem detail panel: Clicking a marker fetches the full poem content from Sanity CMS and renders it below the map using PortableText
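The reduction and export steps above can be sketched roughly as follows. This is an illustration only, not the contents of scripts/compute-vector-atlas.py: it assumes the third-party umap-learn and numpy packages, and the function names, UMAP parameter values, and JSON field names (id, type, x, y) are hypothetical.

```python
import json
from typing import Sequence

CHUNK_TYPES = ("poem", "note", "insight")  # the three layers described above


def reduce_to_2d(embeddings: Sequence[Sequence[float]]) -> list:
    """Project embeddings (e.g. 1,536-dimensional) down to 2D with UMAP.

    Assumes the third-party umap-learn and numpy packages are installed;
    the metric and random_state values are illustrative guesses.
    """
    import numpy as np  # imported lazily so to_points() works without them
    import umap

    reducer = umap.UMAP(n_components=2, metric="cosine", random_state=42)
    return reducer.fit_transform(np.asarray(embeddings, dtype=np.float32)).tolist()


def to_points(coords, chunks) -> list:
    """Pair each 2D coordinate with its (poem_id, chunk_type) metadata."""
    if len(coords) != len(chunks):
        raise ValueError("coords and chunks must have the same length")
    points = []
    for (x, y), (poem_id, chunk_type) in zip(coords, chunks):
        if chunk_type not in CHUNK_TYPES:
            raise ValueError(f"unknown chunk type: {chunk_type!r}")
        points.append({"id": poem_id, "type": chunk_type, "x": x, "y": y})
    return points


if __name__ == "__main__":
    # Hand-written coordinates stand in for real UMAP output here.
    coords = [[1.25, -0.5], [1.3, -0.45], [-2.0, 3.1]]
    chunks = [("poem-1", "poem"), ("poem-1", "note"), ("poem-1", "insight")]
    print(json.dumps(to_points(coords, chunks), indent=2))
```

Keeping the reduction separate from the export means the expensive UMAP step runs once, offline, and the map only ever reads the resulting static file.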
Key Concepts
- Vector embedding: A numerical representation of text in high-dimensional space; semantically similar texts have similar vectors
- UMAP: Uniform Manifold Approximation and Projection — an algorithm that reduces high-dimensional data to 2D while preserving local neighbourhood structure
- Semantic similarity: Poems about related themes cluster together in vector space; the map reveals these invisible relationships
- Chunk types: Each poem produces three embeddable chunks (poem, note, insight) that cluster differently, revealing how the same work occupies different regions of meaning space
- CRS.Simple: Leaflet's coordinate reference system for non-geographic maps; coordinates map directly to x and y on a flat plane
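To make "similar vectors" concrete, here is a minimal sketch of cosine similarity, one common way to compare embeddings. The four-dimensional vectors and their names are made up for illustration; real embeddings here have 1,536 dimensions, and the primer does not state which similarity measure the project itself uses.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings (hypothetical values, not real model output).
sea_poem = [0.9, 0.1, 0.0, 0.2]
tide_poem = [0.8, 0.2, 0.1, 0.3]
ledger_poem = [0.0, 0.9, 0.4, 0.0]

# The two sea-themed vectors point in nearly the same direction,
# so they score higher than the unrelated pair.
print(cosine_similarity(sea_poem, tide_poem) > cosine_similarity(sea_poem, ledger_poem))
```

A score near 1 means the vectors point the same way; a score near 0 means they are close to orthogonal, which is roughly what "dissimilar" means in this space.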
Thematic Clusters
- Emergent Meaning: The map reveals thematic clusters that were never explicitly defined — they emerge from the mathematics of embedding
- Three Perspectives: Poem text, author's note, and insight occupy different regions, showing how the same work means different things through different lenses
- Navigable Abstraction: Taking something invisible (1,536-dimensional space) and making it explorable by human senses
- Poetry and Mathematics: The intersection of creative writing and computational geometry
- Live Content: Coordinates are a static snapshot, but poem content is always fetched live from Sanity CMS
Conceptual Vocabulary
- Vector space: The mathematical space in which poems are represented as points; proximity = semantic similarity
- CircleMarker: Leaflet's circular vector marker; rendered here on Canvas rather than Leaflet's default SVG renderer, for performance across ~3,870 points
- Layer toggle: UI control to show/hide poem, note, and insight layers independently
- pgvector: PostgreSQL extension for vector similarity search; stores the raw 1,536-dimensional embeddings
Related Projects
- The Magic Porridge Poet: The source corpus — all 1,200+ poems visualised here come from this project
- Vector Transmissions: Earlier exploration of dimensionality reduction and embedding visualisation
- Words Fail, Send Love: Uses the same poem embeddings for semantic search (vector similarity queries via pgvector)
- Reconstructed War Memorials: Shares the Leaflet map interaction pattern (though for geographic rather than semantic space)
Agent Guidance
When discussing Vector Atlas:
- Explain the core metaphor: This is a map of meaning, not geography. Poems that are about similar themes sit close together; the axes have no labels because they're emergent dimensions
- Mention the three layers: Poems, notes, and insights cluster differently — toggling layers reveals how the same work occupies different regions of meaning
- Technical depth is welcome: This audience appreciates hearing about UMAP, pgvector, 1,536 dimensions → 2, and Canvas-rendered markers for performance
- Connect to the poetry project: Vector Atlas doesn't exist without The Magic Porridge Poet — it's a lens onto that corpus
- Common questions: "What do the axes mean?" (nothing specific — they're emergent dimensions from UMAP reduction), "Why do some poems cluster together?" (semantic similarity in the embedding space), "Can I search?" (click any marker to see the full poem from Sanity CMS), "How many points?" (~3,870 — three chunks per poem × 1,200+ poems)
- Avoid: Implying the 2D projection is a perfect representation of the full 1,536-dimensional space; UMAP preserves local structure but distorts global distances
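The caveat in the last point can be made concrete by checking how many of each point's nearest neighbours survive a projection. The sketch below is a hypothetical diagnostic, not part of the project: it uses Euclidean distance, toy data, and a deliberately lossy "projection" (dropping one coordinate) in place of UMAP.

```python
import math


def nearest_neighbours(points, index, k):
    """Return the indices of the k points closest to points[index]."""
    distances = [
        (math.dist(points[index], p), j)
        for j, p in enumerate(points)
        if j != index
    ]
    return {j for _, j in sorted(distances)[:k]}


def neighbour_overlap(high_dim, low_dim, k):
    """Mean fraction of each point's k nearest neighbours kept by the projection."""
    if len(high_dim) != len(low_dim):
        raise ValueError("both layouts must contain the same points")
    if not 1 <= k < len(high_dim):
        raise ValueError("k must be between 1 and len(points) - 1")
    total = 0.0
    for i in range(len(high_dim)):
        kept = nearest_neighbours(high_dim, i, k) & nearest_neighbours(low_dim, i, k)
        total += len(kept) / k
    return total / len(high_dim)


# Two pairs of neighbours, separated along the third axis.
high = [(0, 0, 0), (1, 0, 0), (0, 0, 5), (1, 0, 5)]
# Dropping the third axis collapses the pairs onto each other.
flat = [(x, y) for x, y, _ in high]

print(neighbour_overlap(high, high, 1))  # identical layouts keep every neighbour
print(neighbour_overlap(high, flat, 1))  # the lossy projection keeps none here
```

A score close to 1 for small k suggests local neighbourhoods are well preserved; it says nothing about whether distances between far-apart clusters are meaningful.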