How well can Transformers build World Models?
Large Language Models (LLMs) are statistical in nature. By learning from enormous corpora, do they actually learn the "world models" latent behind the text? Can next-token prediction really capture the principles governing the environment it operates in?