Researchers from McGill University, ServiceNow Research, and Mila have introduced LLM2Vec-Gen, an approach that fundamentally shifts how embeddings are derived from large language models. Published on arXiv on March 11, 2026, the method learns to represent the model's potential response rather than encoding the input's semantic content, achieving a 9.3% improvement over the best unsupervised baselines on the Massive Text Embedding Benchmark (MTEB).
Paradigm Shift From Input Encoding to Response Representation
Traditional embedding methods encode the semantic content of input text. LLM2Vec-Gen takes a fundamentally different approach by learning to represent what the language model would generate in response to that input. The researchers explain that "rather than encoding the input, we learn to represent the model's potential response" by adding trainable special tokens to the LLM's vocabulary and optimizing them to capture the model's completion for any given query.
This formulation helps bridge the input-output gap in embedding tasks and transfers LLM capabilities such as safety alignment and reasoning directly to the embedding space. The approach requires only unlabeled queries for training, making it fully self-supervised.
Technical Implementation With Frozen LLM Backbone
The LLM2Vec-Gen method appends trainable special tokens to input queries and optimizes these tokens to represent the LLM's response as a fixed-length sequence. Critically, the LLM backbone itself remains frozen throughout training, eliminating the need for expensive fine-tuning of billion-parameter models.
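The core mechanics can be illustrated with a minimal PyTorch sketch. The paper does not publish reference code here, so everything below is an assumption: the backbone is a toy stand-in for a real LLM, and names such as `ResponseEmbedder` and `num_special` are illustrative rather than taken from the method.

```python
import torch
import torch.nn as nn

class ResponseEmbedder(nn.Module):
    """Illustrative sketch of the LLM2Vec-Gen setup: a frozen backbone
    plus a small set of trainable special-token embeddings appended to
    each query. Only the special tokens receive gradient updates."""

    def __init__(self, backbone: nn.Module, d_model: int, num_special: int = 8):
        super().__init__()
        self.backbone = backbone
        # Freeze every backbone parameter -- no LLM fine-tuning.
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Trainable special-token embeddings (the only new parameters).
        self.special = nn.Parameter(torch.randn(num_special, d_model) * 0.02)

    def forward(self, query_emb: torch.Tensor) -> torch.Tensor:
        # query_emb: (batch, seq, d_model) token embeddings of the query.
        batch = query_emb.size(0)
        tokens = self.special.unsqueeze(0).expand(batch, -1, -1)
        x = torch.cat([query_emb, tokens], dim=1)  # append special tokens
        h = self.backbone(x)                       # frozen forward pass
        # The hidden states at the special-token positions form the
        # fixed-length representation of the model's potential response.
        return h[:, -self.special.size(0):, :]
```

In this sketch the optimizer would be given only `model.special`, so training cost is independent of the backbone's parameter count; a real implementation would instead register the special tokens in the LLM's vocabulary, as the paper describes.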
Training is guided by two complementary signals:
- The LLM's own completion for each query
- An unsupervised embedding teacher providing distillation targets
This dual-signal approach enables the embeddings to capture both the generative capabilities of the LLM and the semantic structure learned by traditional embedding models.
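One plausible way to combine the two signals is a weighted sum of a completion-matching term and a distillation term. The exact losses and weighting are not specified in the text above, so the sketch below is an assumption: MSE against hidden states of the LLM's own completion, cosine distance against the teacher embedding, and a hypothetical mixing weight `alpha`.

```python
import torch
import torch.nn.functional as F

def dual_signal_loss(
    response_repr: torch.Tensor,      # (batch, k, d): special-token states
    completion_states: torch.Tensor,  # (batch, k, d): states from the LLM's own completion
    teacher_embedding: torch.Tensor,  # (batch, d): unsupervised teacher embedding
    alpha: float = 0.5,               # assumed mixing weight between the two signals
) -> torch.Tensor:
    # Signal 1: match the representation of the LLM's own completion.
    gen_loss = F.mse_loss(response_repr, completion_states)
    # Signal 2: distill toward the teacher via cosine distance on a
    # mean-pooled view of the special-token states.
    pooled = response_repr.mean(dim=1)
    distill_loss = 1.0 - F.cosine_similarity(pooled, teacher_embedding, dim=-1).mean()
    return alpha * gen_loss + (1.0 - alpha) * distill_loss
```

Both terms are non-negative, so the combined loss bottoms out at zero only when the embedding matches the completion states and aligns with the teacher direction simultaneously.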
State-of-the-Art Performance on MTEB With Safety Benefits
LLM2Vec-Gen achieves state-of-the-art self-supervised performance on the Massive Text Embedding Benchmark, showing a 9.3% improvement over the best unsupervised embedding teacher. Beyond benchmark performance, the method demonstrates significant practical advantages:
- 43.2% reduction in harmful content retrieval compared to traditional embeddings
- 29.3% improvement in reasoning capabilities for embedding tasks
- Interpretable embeddings that can be decoded back into text to reveal semantic content
The substantial reduction in harmful content retrieval suggests that the safety alignment present in modern LLMs successfully transfers to the embedding space, addressing a critical concern for production retrieval systems.
Implications for Retrieval-Augmented Generation Systems
The ability to transfer LLM capabilities like reasoning and safety alignment to embeddings has direct implications for retrieval-augmented generation (RAG) pipelines. Traditional embeddings optimize for semantic similarity without considering whether retrieved content aligns with the safety constraints or reasoning patterns of the downstream LLM. LLM2Vec-Gen embeddings naturally encode these properties because they represent how the model would respond.
The interpretability of these embeddings—achieved by decoding them back into text—also provides a new debugging and analysis tool for understanding what information retrieval systems prioritize.
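The decoding step could be as simple as passing the special-token states through the frozen model's output head and reading off tokens. The article does not describe the actual decoding procedure, so the following is a toy, position-wise greedy readout, not the paper's method; a real system would more likely run the LLM's generation loop with the embedding as a prefix.

```python
import torch
import torch.nn as nn

def decode_embedding(embedding: torch.Tensor, lm_head: nn.Module, vocab: list) -> str:
    """Toy interpretability readout: map each special-token hidden state
    (shape (k, d)) through a frozen output projection (d -> |vocab|) and
    greedily pick the most likely token at each position."""
    logits = lm_head(embedding)       # (k, |vocab|)
    ids = logits.argmax(dim=-1)       # greedy token choice per position
    return " ".join(vocab[i] for i in ids.tolist())
```

Reading the decoded text alongside retrieval results would make it possible to inspect what a given embedding actually encodes, which is the debugging use case described above.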
Key Takeaways
- LLM2Vec-Gen learns to represent an LLM's potential response rather than encoding input content, achieving 9.3% improvement over best unsupervised baselines on MTEB
- The method requires no labeled data and keeps the LLM backbone frozen, using only trainable special tokens optimized with guidance from the LLM's completions and an embedding teacher
- Embeddings show 43.2% reduction in harmful content retrieval and 29.3% improvement in reasoning capabilities, transferring LLM safety alignment to the embedding space
- The approach bridges the input-output gap in embedding tasks, making embeddings better suited for retrieval-augmented generation pipelines
- Embeddings are interpretable and can be decoded back into text to reveal what semantic content they capture