Jennifer Davis, Ph.D.’s Post


Accomplished Data Scientist and AI Expert | Transforming Industries with Strategic Use of Artificial Intelligence | Innovation Leader | Team Development Accelerator

Impressive to see the release of Mistral NeMo, a 12B-parameter model built in collaboration with NVIDIA! While its parameter count may not be as impressive as larger models like Yandex's YaLM 100B or Llama 2, the 128k-token context window is a standout and potentially game-changing feature that could significantly enhance long-document reasoning and coding accuracy. The model was also trained with quantization awareness, enabling FP8 inference without accuracy loss, a smart move for reducing compute cost and environmental impact. Still, we should remember that quantized models can sometimes lose accuracy in practice. Excited to see how Mistral NeMo will be adopted in various applications!
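For anyone who wants to try it, here is a minimal sketch of loading the model for quantized inference with Hugging Face transformers. The model id is an assumption about the published checkpoint, and bitsandbytes int8 is used only as a readily available stand-in: the FP8 path described for Mistral NeMo runs through NVIDIA's own inference stack (e.g., TensorRT-LLM), not through this flag.

```python
# Hypothetical sketch: quantized inference with Mistral NeMo via transformers.
# NOTE: int8 via bitsandbytes is only a stand-in for the FP8 inference
# mentioned above, which requires NVIDIA's dedicated stack (e.g., TensorRT-LLM).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-Nemo-Instruct-2407"  # assumed Hugging Face model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",                                          # spread across available GPUs
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # int8 stand-in, not FP8
)

prompt = "Summarize the advantages of a 128k-token context window."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```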

Andriy Burkov

PhD in AI, ML at TalentNeuron, author of 📖 The Hundred-Page Machine Learning Book and 📖 the Machine Learning Engineering book

This is probably the most important model released since Mistral 7B: 128k context size, multilingual support, an Apache 2.0 license, and a size that allows inference at around $0.2 per million tokens. Together with the release of GPT-4o mini, what a day!
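To make the $0.2-per-million-tokens figure concrete, here is a quick back-of-the-envelope check (illustrative arithmetic only; actual provider pricing varies and often differs for input vs. output tokens):

```python
# Rough cost estimate at the quoted $0.2 per million tokens.
PRICE_PER_MILLION_USD = 0.20  # rate quoted in the post

def inference_cost(num_tokens: int) -> float:
    """Cost in USD to process num_tokens at the quoted rate."""
    return num_tokens / 1_000_000 * PRICE_PER_MILLION_USD

# Filling the entire 128k-token context window once:
print(f"${inference_cost(128_000):.4f}")  # -> $0.0256
```

So even a prompt that saturates the full context window costs only a few cents at that rate.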
