Bingyue P.’s Post

Head of Monetization GenAI at ByteDance

Introducing Infinity, the latest text-to-image foundation model, which substantially raises the upper limit of autoregressive models in visual generation. Infinity is built on top of our previous research, VAR (NeurIPS 2024 Best Paper), and redefines the visual autoregressive model under a bitwise token prediction framework with an infinite-vocabulary tokenizer and classifier and a bitwise self-correction mechanism, markedly improving generation capacity and detail. Without extra optimization, Infinity generates a high-quality 1024×1024 image in 0.8 seconds, making it the fastest text-to-image model without distillation. Infinity achieves a very high win rate against other top autoregressive models, and matches or beats leading diffusion models. Watch our GitHub: we are going to open-source both models and weights in the next couple of weeks: https://lnkd.in/g9phQ6F7
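For readers wondering why predicting bits lets the vocabulary grow so large, here is a minimal sketch of the general idea. It is not Infinity's actual code. It assumes a token is one of 2^d codes, that each bit gets a single binary logit, and the function names, the hidden size of 2048, and d = 16 are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only: names and sizes are assumptions, not Infinity's implementation.

def index_to_bits(index: int, d: int) -> list[int]:
    """Split a token index in [0, 2**d) into d binary labels, most significant bit first."""
    if not 0 <= index < 2 ** d:
        raise ValueError("index out of range for a d-bit token")
    return [(index >> (d - 1 - i)) & 1 for i in range(d)]


def bits_to_index(bits: list[int]) -> int:
    """Reassemble the token index from its bits (inverse of index_to_bits)."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


def classifier_weights(hidden: int, d: int) -> tuple[int, int]:
    """Weight counts, ignoring biases, for two kinds of output head.

    First value: one softmax over all 2**d tokens (hidden * 2**d weights).
    Second value: d independent binary heads, one logit per bit (hidden * d weights).
    """
    return hidden * 2 ** d, hidden * d


if __name__ == "__main__":
    bits = index_to_bits(5, 4)
    print(bits, bits_to_index(bits))
    print(classifier_weights(2048, 16))
```

Under these assumptions the index-based head grows exponentially with the number of bits while the bitwise head grows linearly, which is what makes very large vocabularies affordable.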
