Clem Delangue 🤗’s Post

Clem Delangue 🤗

Co-founder & CEO at Hugging Face

BERT was completely forgotten by the media & AI hype but was probably the most influential model of all time (thanks Google)! Excited about ModernBERT by Jeremy Howard, Answer.AI and LightOn! You don't want big models for the 99% of AI tasks that can be done better, faster, and cheaper with smaller models!

Khurram R.

Founder @ Neighbors.ai | MIT MBA | AI @ UT Austin | ex-Samsung

14h

I just wrote a paper on BERT, and I have the same feeling. We have never expected our physicians to fix our cars, process our loan applications, and tutor our children. So why should we have a gigantic model that takes millions of dollars to train? Smaller models are cheaper, faster, and solve the data privacy problem. Let's move from LLMs to SLMs.

Mariangel Reyes

Data Analyst @ Mercantil Bank | Python | SQL | Data Visualization | Machine Learning | Physicist

3m

I had such wonderful results using BERT. I am excited to replicate the same project, this time with ModernBERT.

Michael Welsch

Artificial Intelligence Developer | Entrepreneur | Engineer | Founder | Lecturer

3h

BERT models were technologically ahead of their time but simply fell victim to the scaling laws that were discovered later. Decoder-only GPT models were better suited to those laws. As the scaling laws run out, BERT's architectural components might make a comeback. When I look at the multimodal models that stack encoders and decoders together, they feel very close to the original concepts of the BERT models.

B Kumar

Neugence Technology Pvt. Ltd.

10h

Thanks for sharing. What kind of AI tasks do you mean?

Aditya Shankar

Leader in AI, Machine Learning, Advanced Analytics and Data

1m

Couldn't agree more, Clem. Trying my hand at it right away.

Igor Kasianenko

Partner Engineer, GenAI at Meta

4m

Small models 💪

Dhilip Subramanian

Helping Small & Medium Businesses to Build Analytics Platforms and Transform Data into Insights | Data Consultant

6h

Efficiency is key; larger versions aren't necessarily better, particularly when smaller models can provide quicker, more affordable alternatives.

Don Vetal 3rd

Data Scientist | AI Architect

13h

Most influential of all time? I take issue with that statement. We move through time. Time value of value. Those older models that were cherry-picked and refined are more valuable/influential the further you go backwards into history. But yeah, BERT was awesome.

Tim Khthondev

EdTech & FinTech Design Engineer | Founder of Valkyra Labs | Empowering User-Centric Tech Solutions

14h

Clem, what I love about these smaller models is how accessible they are for people who want to join the movement without shelling out for big servers or expensive GPUs. They’re also super convenient for building pipelines with multiple models working together. Definitely planning to give ModernBERT a spin — and I gotta take a moment to thank everyone making this tech accessible for developers on a budget. Your work is driving real progress 💖
