9 min read
Why QTT for vector search (and what that sentence actually means)
Quantized Tensor Train (QTT) decomposition is a mathematical idea older than modern vector search. Here is what it compresses, what it doesn't, and why that matters when 100 million embeddings have to fit on a single GPU node.