BERT was completely forgotten by the media & AI hype but was probably the most influential model of all time (thanks Google)! Excited about ModernBERT by Jeremy Howard, Answer.AI and LightOn! You don't want big models for 99% of the AI tasks that can be done better, faster, cheaper with smaller models!
I just wrote a paper on BERT, and I have the same feeling. We have never expected our physicians to fix our cars, process our loan applications, and tutor our children. So why should we have a gigantic model that takes millions of dollars to train? Smaller models are cheaper, faster, and solve the data privacy problem. Let's go from LLMs to SLMs.
I had such wonderful results using BERT. I am excited to replicate the same project, but with ModernBERT this time.
BERT models were technologically ahead of their time but were simply victims of the discovered scaling laws. Pure decoder GPT models were better suited to these laws. With the end of the scaling laws, the architectural components might make a comeback. When I look at the multimodal models that stack encoders and decoders together, it feels very close to the original concepts of the BERT models.
Thanks for sharing. What kind of AI tasks do you mean?
Couldn't agree more, Clem. Trying my hand at it right away.
Small models 💪
Efficiency is key; larger versions aren't necessarily better, particularly when smaller models can provide quicker, more affordable alternatives.
Most influential of all time. I take issue with that statement. We move through time. Time value of value. Those older models that were cherry-picked and refined are more valuable/influential the further you go backwards into history. But yeah, BERT was awesome.
Clem, what I love about these smaller models is how accessible they are for people who want to join the movement without shelling out for big servers or expensive GPUs. They’re also super convenient for building pipelines with multiple models working together. Definitely planning to give ModernBERT a spin — and I gotta take a moment to thank everyone making this tech accessible for developers on a budget. Your work is driving real progress 💖
Co-founder & CEO at Hugging Face
More info here: https://huggingface.co/blog/modernbert!