ModernBERT, BERT revisited in the age of LLMs and Generative AI! LightOn and Answer.ai modernized BERT with an improved architecture: an 8,192-token context length, Flash Attention, and training on 2T tokens. ModernBERT outperforms both BERT and RoBERTa! 👀

TL;DR:
2️⃣ Comes in 2 sizes: base (139M) and large (395M)
🚀 Better performance across all metrics than the original BERT
📏 8,192-token context length (16x longer than BERT)
⚡ Modern architecture with Flash Attention 2, RoPE embeddings, and alternating attention
📚 Trained on 2 trillion tokens, primarily English and code
💨 2-4x faster than other models with mixed-length inputs
🔓 Released under Apache 2.0
🤗 Available on Hugging Face and in Transformers (main)

Models: https://lnkd.in/ethiJ2xh
Blog: https://lnkd.in/ebiEzb4P
Paper: https://lnkd.in/ezR8MUBF
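Since the post says the models are available in Transformers (main), here is a minimal sketch of trying ModernBERT for masked-language-modeling. It assumes a recent `transformers` release with ModernBERT support installed, and uses the `answerdotai/ModernBERT-base` model id from the Hugging Face hub.

```python
# Minimal sketch: fill-mask with ModernBERT via the transformers pipeline.
# Assumes a transformers version that includes ModernBERT support and
# network access to download "answerdotai/ModernBERT-base" from the hub.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# ModernBERT uses the standard [MASK] token for masked-LM prediction.
preds = fill_mask("The capital of France is [MASK].")
for p in preds[:3]:
    print(p["token_str"], round(p["score"], 3))
```

Like the original BERT, the base checkpoint is a raw encoder: for classification or retrieval you would fine-tune it on your task rather than use the MLM head directly.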