The Geometry of Thought Part 1 - Embeddings Are Scary

August 18, 2025

If words can be embedded… what else can be embedded?


Table of Contents

Chapter 0: Intro

Chapter 1: Embeddings Are Scary

One of the first things I did after downloading a small dataset was to tokenize the text, build a vocabulary using Byte Pair Encoding, convert tokens into IDs, and finally transform them into embeddings.
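
Here's a minimal sketch of that pipeline in plain Python and NumPy. The vocabulary below is hand-written rather than learned with BPE, and the embedding matrix is random instead of trained; it's purely to show the shapes involved:

```python
import numpy as np

# Toy vocabulary standing in for one learned with Byte Pair Encoding.
# Real BPE builds its vocab by repeatedly merging the most frequent
# adjacent symbol pairs; here we just hard-code a few subwords.
vocab = {"em": 0, "bed": 1, "ding": 2, "s": 3, "are": 4, "scary": 5}

def tokenize(text):
    """Greedy longest-match lookup against the toy vocab."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            i += 1  # skip characters the vocab can't cover (e.g. spaces)
    return tokens

token_ids = [vocab[t] for t in tokenize("embeddings are scary")]

# The embedding table: one row of floats per vocab entry. In a trained
# model these rows are learned; here they're random.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 8))  # 8 dims for display

embeddings = embedding_table[token_ids]
print(token_ids)         # [0, 1, 2, 3, 4, 5]
print(embeddings.shape)  # (6, 8)
```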

On the surface, embeddings look simple: a (sub)word turned into a long list of floating-point numbers. Just vectors in a high-dimensional space.

But take a step back. That random-looking sequence of numbers carries something extraordinary: the meaning of words, and by extension the structure of human language, and by extension human thought.

That realization shook me.

Because embeddings reveal two insights:

  1. Meaning is relational. Words don’t carry meaning by themselves. They mean what they mean because of the words around them.
  2. Structure emerges from scale. With enough data, geometry itself begins to reveal hidden order.

You can see this everywhere: from word2vec’s skip-gram and BERT’s masked language modeling to GPT’s next-token prediction and T5’s span corruption. Different objectives, same principle: meaning through relationships, structure through scale.
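
To see "meaning through relationships" in miniature, here is a toy skip-gram with one negative sample per pair, in NumPy. The corpus, dimensions, and learning rate are invented for illustration and are nothing like word2vec's real training setup:

```python
import numpy as np

rng = np.random.default_rng(42)

# Tiny corpus: which words co-occur is the only training signal.
corpus = "the cat sat on the mat the dog sat on the rug".split()
words = sorted(set(corpus))
idx = {w: i for i, w in enumerate(words)}

dim, window, lr = 16, 2, 0.05
W_in = rng.normal(scale=0.1, size=(len(words), dim))   # center-word vectors
W_out = rng.normal(scale=0.1, size=(len(words), dim))  # context-word vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(300):
    for pos, center in enumerate(corpus):
        c = idx[center]
        for off in range(-window, window + 1):
            if off == 0 or not (0 <= pos + off < len(corpus)):
                continue
            o = idx[corpus[pos + off]]
            # One positive pair (a real context word) and one negative
            # sample (a random word): pull the first together, push the
            # second apart.
            for target, label in ((o, 1.0), (rng.integers(len(words)), 0.0)):
                grad = sigmoid(W_in[c] @ W_out[target]) - label
                g_in = grad * W_out[target]
                W_out[target] -= lr * grad * W_in[c]
                W_in[c] -= lr * g_in

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# "cat" and "dog" appear in near-identical contexts, so their vectors
# should drift closer together than, say, "cat" and "on".
print(cos(W_in[idx["cat"]], W_in[idx["dog"]]))
print(cos(W_in[idx["cat"]], W_in[idx["on"]]))
```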

And this geometry isn’t one of triangles or circles. It’s the shape of relationships, where distances map to similarity, directions map to transformations, and clusters form neighborhoods of thought.
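
You can poke at that geometry directly with pretrained vectors, assuming gensim and its one-time model download are available (glove-wiki-gigaword-50 is a real gensim-data identifier; treat the rest as a sketch):

```python
# Requires: pip install gensim (plus a one-time model download).
import gensim.downloader

# Pretrained 50-dimensional GloVe word vectors.
glove = gensim.downloader.load("glove-wiki-gigaword-50")

# Distance maps to similarity:
print(glove.similarity("cat", "dog"))        # high
print(glove.similarity("cat", "democracy"))  # low

# Direction maps to transformation: the classic analogy
# vector("king") - vector("man") + vector("woman") ~ vector("queen")
print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```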

And then the question hit me: if words can be embedded… what else can be embedded?

The answer is: almost anything. Faces are already represented as embeddings in facial recognition systems. Body language can be embedded, powering models that translate text into motion. Even something as elusive as smell has been mapped into vectors, letting researchers predict how an odor will be perceived.

Social media embed people. Not by modeling “friendship” directly, but by following the traces of it: likes, messages, tags, follows. Out of those discrete signals, a continuous geometry of “you” begins to emerge.

Personality, identity, even relationships can all be approximated as vectors — not because they are inherently discrete, but because our actions and choices generate discrete signals that can be embedded.
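
Here's a toy sketch of how a "geometry of you" could fall out of discrete signals. The interaction matrix is invented, and real platforms use far richer signals and learned models rather than a plain SVD:

```python
import numpy as np

# Hypothetical interaction matrix: rows are users, columns are items
# (posts, accounts, songs...), entries are discrete signals (1 = liked).
interactions = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
], dtype=float)

# A truncated SVD turns those discrete clicks into continuous user
# vectors: one classic way a geometry of behavior emerges.
U, S, Vt = np.linalg.svd(interactions, full_matrices=False)
user_vecs = U[:, :2] * S[:2]  # keep 2 latent dimensions

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(user_vecs[0], user_vecs[1]))  # users 0 and 1 behave alike: high
print(cos(user_vecs[0], user_vecs[2]))  # users 0 and 2 don't: low
```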

So why stop there? Could memory itself be embedded? Could grief, love, identity be flattened into vectors?

Think about your first heartbreak. If we could capture every trace of brain activity — every signal, every pattern — then by the same logic we’ve seen with language, meaning through relationships and structure through scale, wouldn’t that memory be representable as a vector too?

The only barrier seems to be access: we don’t yet know how to capture those inner states at high enough fidelity.

But imagine if we could.

Imagine sending a childhood trauma to your therapist the way we now pass embeddings between models. Imagine advertising not as a billboard on a screen, but as a prompt injected directly into your stream of thought.

If memories become portable, they become copyable. If they’re copyable, they’re ownable. But by whom? Your memories will become commodities that can be shared, sold, modified and even stolen.

And here’s an even scarier thought: embeddings are compressions. They preserve relationships, but they don’t capture full detail. They can reveal patterns we couldn’t see otherwise, but they can never carry the full meaning of the original.

Your first heartbreak could collapse into the same neighborhood as a thousand other heartbreaks. Just another dot in that latent space. In that averaging, something unique — something you — is lost.

So we come to the most interesting question of all:

If everything about you can be embedded, what remains that is uniquely you?

Chapter 2: Attention Is All You Have
