Paper Reading | Latent Thought Models

LTM spends extra inference-time compute on explicit per-example posterior optimization over latent thought vectors, rather than on more token-space decoding.
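To make "per-example posterior optimization" concrete, here is a minimal sketch of the idea on a toy linear-Gaussian model (a hypothetical stand-in; the paper's decoder is a Transformer LM and the latents condition its layers). For each observation, gradient ascent is run on the latent vector itself, maximizing log p(x|z) + log p(z) while the model weights stay fixed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generative model: z ~ N(0, I), x ~ N(W z, sigma^2 I).
# This is an illustrative stand-in, not the paper's architecture.
d_z, d_x, sigma = 4, 16, 0.1
W = rng.normal(size=(d_x, d_z))

# One "example": a ground-truth latent and its noisy observation.
z_true = rng.normal(size=d_z)
x = W @ z_true + sigma * rng.normal(size=d_x)

def log_joint_grad(z):
    """Gradient of log p(x|z) + log p(z) with respect to z."""
    residual = x - W @ z
    return W.T @ residual / sigma**2 - z  # likelihood term + prior term

# Inference-time posterior optimization: update the latent, not the weights.
# More gradient steps here = more inference-time compute spent per example.
z = np.zeros(d_z)
lr = 1e-4
for _ in range(5000):
    z = z + lr * log_joint_grad(z)

print(np.linalg.norm(z - z_true))  # inferred latent ends up close to the true one
```

The step count and learning rate are the knobs that trade compute for posterior quality, which is the sense in which extra inference compute buys better latent thoughts rather than longer token sequences.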