Rafael Brown’s Post

CEO & Founder at Symbol Zero // Microsoft Regional Director

5mo Edited

Highlighting: "A single 8xSohu server is said to equal the performance of 160 H100 GPUs, meaning data processing centers can save both on initial and operational costs if the Sohu meets expectations." ----- Etched comes at Nvidia creatively by focusing on transformer models. Could the Sohu chip reduce the need for Nvidia A100 and H100 chips? ----- Tom's Hardware: "Sohu AI chip claimed to run models 20x faster and cheaper than Nvidia H100 GPUs. Startup Etched has created this LLM-tuned transformer ASIC." (Jowi Morales) (June 26, 2024) "Etched, a startup that builds transformer-focused chips, just announced Sohu, an application-specific integrated circuit (ASIC) that claims to beat Nvidia’s H100 in terms of AI LLM inference. A single 8xSohu server is said to equal the performance of 160 H100 GPUs, meaning data processing centers can save both on initial and operational costs if the Sohu meets expectations. According to the company, current AI accelerators, whether CPUs or GPUs, are designed to work with different AI architectures. These differing frameworks and designs mean hardware must be able to support various models, like convolutional neural networks, long short-term memory networks, state space models, and so on. Because these models are tuned to different architectures, most current AI chips allocate a large portion of their computing power to programmability. Most large language models (LLMs) use matrix multiplication for the majority of their compute tasks, and Etched estimated that Nvidia’s H100 GPUs only use 3.3% of their transistors for this key task. This means that the remaining 96.7% of the silicon is used for other tasks, which are still essential for general-purpose AI chips. Etched made a huge bet on transformers a couple of years ago when it started the Sohu project. This chip bakes the transformer architecture into the hardware, thus allowing it to allocate more transistors to AI compute. 
We can liken this to processors and graphics cards. Let’s say current AI chips are CPUs, which can do many different things, and the transformer model is like the graphics demands of a game title. Sure, the CPU can still process these graphics demands, but it won’t do it as fast or as efficiently as a GPU. A GPU that’s specialized in processing visuals will make graphics rendering faster and more efficient. This is what Etched did with Sohu. Instead of making a chip that can accommodate every single AI architecture, it built one that only works with transformer models. The company’s gamble now looks like it is about to pay off, big time. Sohu’s launch could threaten Nvidia’s leadership in the AI space, especially if companies that exclusively use transformer models move to Sohu. After all, efficiency is the key to winning the AI race, and anyone who can run these models on the fastest, most affordable hardware will take the lead." Tom's Hardware: https://lnkd.in/g2ZGiU-z #ai #cloud #aicloud #cloudai #cloudgpu #genai #transformermodel
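The headline figures in the quoted article reduce to simple arithmetic. Here is a minimal Python sketch that uses only the numbers quoted above; the variable names are illustrative, and the figures are Etched's claims rather than independent measurements.

```python
# Figures as claimed by Etched in the quoted article; not independently verified.
SOHU_CHIPS_PER_SERVER = 8
EQUIVALENT_H100_GPUS = 160
H100_MATMUL_TRANSISTOR_SHARE = 0.033  # Etched's estimate for the H100

# Per-chip ratio implied by "one 8xSohu server equals 160 H100 GPUs".
h100_per_sohu = EQUIVALENT_H100_GPUS / SOHU_CHIPS_PER_SERVER

# Share of H100 transistors left for everything other than matrix multiplication.
other_share = 1 - H100_MATMUL_TRANSISTOR_SHARE

print(f"One Sohu chip ~ {h100_per_sohu:.0f} H100 GPUs (claimed)")
print(f"H100 transistors not used for matmul: {other_share:.1%}")
```

The 20x in the article's headline is therefore the same claim as the 8-versus-160 server comparison, just stated per chip.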

Sohu AI chip claimed to run models 20x faster and cheaper than Nvidia H100 GPUs

tomshardware.com


More Relevant Posts

Astor Perkins

1,073 followers
7mo
Forget Intel and AMD - Nvidia's next big competitor might be a company you've never heard of As reported by The Economist, there have been developments in the GPU field outside of the best graphics cards made by Nvidia and AMD for AI computing. That's because some of today's large language models run across many setups featuring interconnected GPUs and memory, such as with Cerebras' hardware. Cerebras Systems Inc. was founded just nine years ago but seems to benefit massively from the recent AI computing boom. It's innovated in ways that appear to put the current-gen H100 and the upcoming GB200 die to shame with a "single, enormous chip" capable of up to 900,000 GPU cores - such as with its CS-3 chip. The Cerebras CS-3 chip absolutely dwarfs the double die size of the huge GB200, and is the size of a steering wheel, requiring two hands to hold. It's been described by the manufacturer as the "world's fastest and most scalable AI accelerator" which is purpose-built to "train the world's most advanced AI models". However, it's not just Cerebras that is making moves here, as new start-up company Groq is also developing hardware for AI computing. Instead of going larger than its competition, it has developed what it calls dedicated LPUs (language processing units), which are built to run large language models effectively and quickly. In the company's own words, the Groq LPU Inference Engine is an "end-to-end inference acceleration system, to deliver substantial performance, efficiency, and precision in a simple design". It's currently running Llama-2 70B, a large-scale generative language and text model, at 300 tokens per second per user. Whether the likes of Cerebras and Groq, or even smaller companies such as MatX, have a chance here remains to be seen. However, as AI computing is still largely in its infancy, now is the time we'll be seeing the most experimentation with how the hardware can cater to the end user. 
Some will scale up, others will work smarter. https://lnkd.in/erF5VsxP

Forget Intel and AMD - Nvidia's next big competitor might be a company you've never heard of

techradar.com
Anton Dubov

Results-Driven Production Engineer @ Rychiger AG | Lean Six Sigma
7mo
**Exciting News in AI and Computing: NVIDIA's Blackwell Architecture Unveiled!** We're witnessing a monumental leap in AI capabilities with NVIDIA's introduction of the Blackwell architecture. This cutting-edge technology is set to revolutionize generative AI and accelerated computing. Here's what you need to know: 🚀 **Blackwell GPUs**: At the heart of the architecture are the Blackwell GPUs, each packed with **208 billion transistors** and built using a custom **TSMC 4NP process**. These GPUs feature **two reticle-limited dies** connected by a **10 TB/s chip-to-chip interconnect**, functioning as a unified GPU. 💡 **GB200 NVL72**: The flagship model, the GB200 NVL72, is a marvel of engineering. It connects **36 dual-GPU Grace Blackwell "superchips"** (totaling **72 GPUs**) and **36 Grace CPUs** in a liquid-cooled, rack-scale design. This configuration acts as a single massive GPU, boasting **30X faster real-time inference** for trillion-parameter LLMs and supporting **13.5 TB of HBM3e memory**. 🧠 **Second-Generation Transformer Engine**: The Transformer Engine is enhanced with custom Blackwell Tensor Cores, enabling **4-bit floating point (FP4) AI** and fine-grain **micro-tensor scaling**. Coupled with NVIDIA TensorRT-LLM and NeMo Framework, it accelerates both inference and training for LLMs and MoE models. 🔒 **Security**: NVIDIA Confidential Computing ensures robust hardware-based security for sensitive data and AI models. 🔗 **NVLink and NVLink Switch**: The fifth-generation NVLink interconnect can scale up to **576 GPUs**, with the NVL72 configuration featuring a **72-GPU NVLink domain**. The NVLink Switch Chip provides an astounding **130TB/s of GPU bandwidth**, supporting NVIDIA SHARP™ FP8. 💲 **Pricing**: The GB200 NVL72 cabinet, with its 72 chips, is priced at a cool **$3 million**. This reflects the unparalleled performance and advanced technology that NVIDIA brings to the table. 
NVIDIA's Blackwell architecture is a game-changer for the AI and computing industries, offering unprecedented performance and efficiency. It's a testament to NVIDIA's commitment to innovation and leadership in the field. #NVIDIA #BlackwellArchitecture #AI #Computing #Innovation #Technology
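The GB200 NVL72 figures quoted above can be cross-checked with a little arithmetic. This is a rough Python sketch using only the numbers from the post; the per-GPU values are derived here for illustration and are not official specifications. In particular, dividing the rack price by 72 ignores the CPUs, switches and cooling that the price also covers.

```python
# Figures quoted in the post above; variable names are illustrative.
SUPERCHIPS = 36
GPUS_PER_SUPERCHIP = 2
TOTAL_HBM3E_TB = 13.5
RACK_PRICE_USD = 3_000_000

# 36 dual-GPU superchips should give the 72 GPUs the post mentions.
total_gpus = SUPERCHIPS * GPUS_PER_SUPERCHIP

# Memory per GPU, converting decimal terabytes to gigabytes.
hbm_per_gpu_gb = TOTAL_HBM3E_TB * 1000 / total_gpus

# Naive price per GPU: the rack price also covers CPUs, networking and cooling.
price_per_gpu_usd = RACK_PRICE_USD / total_gpus

print(f"GPUs per rack: {total_gpus}")
print(f"HBM3e per GPU: {hbm_per_gpu_gb} GB")
print(f"Naive price per GPU: ${price_per_gpu_usd:,.0f}")
```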

Harsh Shandilya

I tell stories.
6mo
day 5/90 When all you see are people buying shovels, sell the technology that guides every strike of the shovel👇 Nvidia has had a staggering growth of over 200 percent since last year, and everyone credits it to them pioneering GPUs (graphics processing units) and holding a near-monopoly in the chip industry boom created by AI. There is also a second, lesser-known reason behind the staggering growth of Nvidia. The software: CUDA. With GPUs booming, there was a need for software to control and configure GPUs beyond traditional graphics rendering, and so Nvidia solved its very own problem by creating a parallel computing platform. It helps in leveraging the massive parallel processing capabilities of GPUs to accelerate computationally intensive applications. The CUDA programming model is built around the concept of a hierarchy of threads. Threads are grouped into blocks, and blocks are grouped into a grid. This structure allows for efficient management of parallel computation. Bla bla bla, but how does it actually help in AI? Take the example of image processing for Midjourney: the model had to be trained on millions and millions of pictures. How can that be done quickly through CUDA? There are multiple blocks of threads, where each block processes a small part of the image and the threads in each block break that small part down further, so all of the threads in all of the blocks process the image at the same time, significantly speeding up the process. AMD and Intel also provide GPUs, but the differentiating factor in it all has been CUDA, the software which enables the shovel to dig the gold. So will this arguable monopoly ever end? As one might say about every situation in life, “this too shall pass”. How? Startups like Cerebras, Groq, Hailo and many others have started building hardware designed specifically to increase AI performance. 
One piece of hardware that’s slowly gaining traction is the LPU, or language processing unit, which is custom-built to handle the workload requirements of AI. You have to keep in mind that GPUs are not custom-built for AI, and so they will always lose the performance battle against the newer hardware. As these startups build hardware to increase performance, the software needed to control it will be pivotal in competing with Nvidia. Whatever happens, the advancement of AI is not going to slow down; it’s only going to get faster and cheaper, enabling true innovation in every field. Personally, very excited to see who becomes the first true competitor of Nvidia in the shovel business and how the market shifts to accommodate it. Godspeed.
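The grid, block and thread hierarchy described in this post can be imitated in plain Python. This is only a sequential simulation of the indexing scheme, not real CUDA code, and the function names and the brightening operation are illustrative assumptions. In CUDA, each thread would compute its own global index in the same way (`blockIdx.x * blockDim.x + threadIdx.x`) and the threads would run in parallel rather than in nested loops.

```python
def brighten_kernel(block_idx, thread_idx, block_dim, pixels, out):
    """Work done by one simulated thread: brighten a single pixel."""
    i = block_idx * block_dim + thread_idx  # global index of this thread
    if i < len(pixels):  # guard: threads in the last block may have no pixel
        out[i] = min(pixels[i] + 50, 255)


def launch(pixels, block_dim=4):
    """Run the kernel for every (block, thread) pair in the grid, one at a time."""
    grid_dim = (len(pixels) + block_dim - 1) // block_dim  # blocks needed, rounded up
    out = [0] * len(pixels)
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            brighten_kernel(block_idx, thread_idx, block_dim, pixels, out)
    return out


image_row = [0, 10, 100, 200, 230, 250, 255, 30, 60, 90]
result = launch(image_row)
print(result)
```

With ten pixels and four threads per block, the grid needs three blocks, and the last two threads of the third block fall outside the data, which is why the bounds check is there.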
José Fortes

Innovation, Entrepreneurship & Strategy || CEO @ LAUREON || VC || Board & Advisor
2mo
It's always the use case, either because you find it, or because someone else does, and you were there to take advantage of it. NVIDIA, from general purpose GPU to AI GPU. A startup, a researcher and anyone who wants to turn an idea, technology or knowledge into a product that solves a real problem in the market has to find the use case. What is it for? What problem does it solve? GPUs have been in the market for a long time and they used to fill a niche where their special design made them solve a very specific type of problem. A GPU differs from a CPU, in essence, in that it is a vector processor that allows the same operation to be applied at the same time to a set of values, to a vector, rather than to a single value. CPUs are general purpose processors and can be used to solve any problem. How efficiently they can do so is another matter. GPUs are not naturally general purpose, since they need problems where their vector architecture makes sense. Well, an image on a screen is a set of rows of values indicating the color of each pixel. That's why the first major successful use case for NVIDIA was the huge growth of the video game market. A video game has to be constantly rendering the screen to display the game graphics. The GPU handles that much faster than a CPU. NVIDIA has always tried to be seen as a possibility to solve many more problems, and they have always communicated this, creating tools to be applied to different use cases. NVIDIA did not know this, nor did they create the use case, but being positioned in the right place, the mega killer use case appeared: Generative AI with Large Language Models (LLM) used by ChatGPT and others. An LLM is, in essence, a set of rows of numeric weights, i.e. vectors and matrices. And who was there to solve vector computing better than anyone else? That's right, NVIDIA. From niche use cases and non-negligible success in video games to being today the third largest technology company in the world. 
NVIDIA is, in fact, the company capturing the most value in the world with AI by far. This has led to the reasonable and expected change: NVIDIA is no longer so interested in being perceived as a solution applicable to various problems, but now the “general purpose GPU and you can do many things” has become in their own conferences “GPU for AI, faster, better”. Why? Because they have finally found the indisputable use case. That's how important the use case is. I hope it helps you to focus your value proposition, startup or transfer. Find below one of the slides I use to explain to researchers the cycle from science to market, with the use case as the first step. #entrepreneurship #innovation #AI
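The point that GPUs suit problems made of rows of values can be illustrated with a matrix-vector product, a core operation in running an LLM. This is a minimal sketch in plain Python; the function names and the numbers are illustrative assumptions, not taken from the post. Each output element depends on only one row, so a vector processor could work on all rows at once, whereas this code walks through them one by one.

```python
def dot(row, vec):
    """Multiply matching elements of a row and a vector, then sum them."""
    return sum(r * v for r, v in zip(row, vec))


def matvec(matrix, vec):
    """Matrix-vector product: one independent dot product per row."""
    # No row needs the result of any other row, which is what makes
    # this kind of workload a natural fit for parallel hardware.
    return [dot(row, vec) for row in matrix]


weights = [
    [1, 2, 3],
    [4, 5, 6],
]
x = [1, 0, -1]

y = matvec(weights, x)
print(y)
```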
Nadia Arian

Electrical & Electronics Engineer
8mo Edited
Choosing the right graphics processing unit (GPU) for your AI server? 🤓 HERE IS HOW!
Choosing the right GPU for your AI server is crucial for maximizing performance and efficiency. GPUs act as accelerators, aiding CPUs in processing requests to neural networks rapidly and effectively. NVIDIA's tensor cores, for instance, offer exceptional performance at the numeric precisions that matter for AI and high-performance computing (HPC), such as FP8, TF32, and FP16. Here's a concise guide to help you make the best GPU choice:
Consider Workload Changes: Will your AI server's workload evolve over time? Modern GPUs are task-specific, with architectures tailored for certain AI areas. New hardware and software advancements can quickly render older GPUs obsolete.
Training vs. Inference Focus: Determine if your emphasis is on training AI models or inference. Training involves processing vast data with numerous parameters to refine algorithms, while inference applies trained models to real-world data. Both require significant computational resources.
GPU Options:
NVIDIA H100: Designed for deep learning model training, with tensor cores and a Transformer Engine; an eight-GPU DGX H100 system offers up to 32 petaflops of FP8 performance.
AMD Instinct™ MI300X: Known for high memory capacity and data bandwidth, ideal for inferencing-based AI applications like large language models (LLMs).
Budget and Dataset Size: Consider budget constraints and dataset size. For smaller budgets or datasets, consumer-grade options such as NVIDIA's RTX 4090 or RTX 3090 can handle inferencing tasks.
Long-term Stability: For stable long-term calculations and model training, NVIDIA's RTX A4000 or A5000 cards are recommended, providing optimal performance for certain tasks.
Exotic Inference Solutions: Explore advanced inference solutions like AMD Alveo™ V70, NVIDIA A2/L4 Tensor Core, and Qualcomm® Cloud AI 100 for specialized needs.
Software Optimization: Consider software optimization for HPC and AI. Servers with Intel Xeon or AMD Epyc processors paired with NVIDIA GPUs are recommended for optimal performance.
In summary, tailor your GPU choice based on workload dynamics, budget, and performance requirements. For training and multi-modal neural networks, invest in higher-end GPUs, ranging from the RTX 4090 to the A100/H100, while for inferencing tasks, consumer-grade options or the RTX A4000/A5000 can suffice. Prioritize stability, performance, and software optimization for a successful AI server setup.
Julio Wilder

Electronic Engineer. Sr. Project Manager(PMP), Energy Leader @ CACME (WEC) & Postgraduate Diploma in Hydrogen Economy @ UTN (FRBA)
2mo
Challengers Are Coming for Nvidia’s Crown. In AI’s game of thrones, don’t count out the upstarts. By Matthew S. Smith It’s hard to overstate Nvidia’s AI dominance. Founded in 1993, Nvidia first made its mark in the then-new field of graphics processing units (GPUs) for personal computers. But it’s the company’s AI chips, not PC graphics hardware, that vaulted Nvidia into the ranks of the world’s most valuable companies. It turns out that Nvidia’s GPUs are also excellent for AI. As a result, its stock is more than 15 times as valuable as it was at the start of 2020; revenues have ballooned from roughly US $12 billion in its 2019 fiscal year to $60 billion in 2024; and the AI powerhouse’s leading-edge chips are as scarce, and desired, as water in a desert. Access to GPUs “has become so much of a worry for AI researchers, that the researchers think about this on a day-to-day basis. Because otherwise they can’t have fun, even if they have the best model,” says Jennifer Prendki, head of AI data at Google DeepMind. Prendki is less reliant on Nvidia than most, as Google has its own homespun AI infrastructure. But other tech giants, like Microsoft and Amazon, are among Nvidia’s biggest customers, and continue to buy its GPUs as quickly as they’re produced. Exactly who gets them and why is the subject of an antitrust investigation by the U.S. Department of Justice, according to press reports. Nvidia’s AI dominance, like the explosion of machine learning itself, is a recent turn of events. But it’s rooted in the company’s decades-long effort to establish GPUs as general computing hardware that’s useful for many tasks besides rendering graphics. That effort spans not only the company’s GPU architecture, which evolved to include “tensor cores” adept at accelerating AI workloads, but also, critically, its software platform, called Cuda, to help developers take advantage of the hardware. 
“They made sure every computer-science major coming out of university is trained up and knows how to program CUDA,” says Matt Kimball, principal data-center analyst at Moor Insights & Strategy. “They provide the tooling and the training, and they spend a lot of money on research.” Released in 2006, CUDA helps developers use an Nvidia GPU’s many cores. That’s proved essential for accelerating highly parallelized compute tasks, including modern generative AI. Nvidia’s success in building the CUDA ecosystem makes its hardware the path of least resistance for AI development. Nvidia chips might be in short supply, but the only thing more difficult to find than AI hardware is experienced AI developers—and many are familiar with CUDA. https://lnkd.in/dDsaRcxA
Grace Chng

Tech Editor | Author | Curator of SG100 Women in Tech Awards
9mo
Nvidia is on fire. The tech titan has smashed through the stratosphere again, blasting its fiscal quarterly revenue to a staggering US$22.1 billion. We’re talking a mind-boggling 265% year-on-year growth. Its market cap reached US$1.977 trillion, cementing its position as the world’s third most valuable tech company after Microsoft (US$3.03 trillion) and Apple (US$2.80 trillion). Not bad for a maker of gaming graphics chips. I first heard of Nvidia more than two decades ago when the magazine I was editing reviewed gaming PCs using its graphics cards. Fast forward to 2024: Nvidia has leveraged its expertise in gaming GPUs for the AI sector. It is at the forefront of the GenAI revolution, providing the computing power that fuels this transformative technology. Can Nvidia maintain its incredible growth momentum? Nvidia's GPUs are ubiquitous in data centres. Software developers are increasingly tailoring their applications to leverage this robust technology, creating a virtuous network effect. What is Nvidia's secret? The cornerstone of its success lies in its tech stack, built over the past few decades, which forms a competitive moat. Its GPU products and systems like the DGX H100, DGX A100 and GH200 have become the de facto standard for AI training and inference. The hardware, integrated with its software suite, AI Enterprise and CUDA, forms a potent one-two punch. CUDA empowers developers to harness the immense processing power of GPUs for scientific and engineering computing applications. Additionally, its compatibility and tight integration with Nvidia GPUs guarantee seamless performance, further solidifying its dominance. Where is future growth coming from? One area is the enterprise. The company's AI Enterprise platform offers toolkits for different industries from automotive to pharmaceutical. Nvidia can monetise software services via a pay-per-use, per-GPU model. 
Another significant growth area is inferencing, which is the process of utilising trained models to make predictions based on real-time data. AI inferencing operates at great speed and scale, making it an invaluable tool for tasks that demand rapid and accurate decision making. Massive GPU compute power, Nvidia’s stronghold, will be needed for inferencing. While Nvidia's current leadership in GPUs and AI is undeniable, rivals like Intel and AMD are building new chips and platforms to steal market share. Big tech companies like OpenAI, Microsoft, Meta and Amazon are investing in new AI chips to reduce over-reliance on Nvidia. The tech world is well known for its inherent unpredictability. The ever-present threat of disruption underlines the need for constant vigilance and innovation. Moreover, the GenAI landscape is still in its nascent stages, and unforeseen challenges may emerge. Can Nvidia continue to maintain its momentum? Its track record inspires confidence but sustained dominance is never guaranteed. #artificialintelligence #ai #nvidia #aitraining #inference #gpus
Ann-Cathrin Hoel

Head of Business Development , IP and Sustainability, Senior Partner at Onsagers AS
8mo
Read this very interesting article on aligning IP strategy with a specific business model, here applied to one of the world's most intriguing companies at present: Nvidia.
Prof. Dr. Alexander J. Wurzer

Director IP Management Training CEIPI | Chairman DIN77006 | Director Research Programms IP Business Academy
8mo

How to 𝐩𝐫𝐨𝐭𝐞𝐜𝐭 access to 𝐛𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐦𝐨𝐝𝐞𝐥𝐬 and an entire 𝐛𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐞𝐜𝐨𝐬𝐲𝐬𝐭𝐞𝐦 𝐰𝐢𝐭𝐡 𝐈𝐏 for the AI era 🤔? The case study of NVIDIA 🧐. NVIDIA is much more than a former graphics card manufacturer that has now become the central chip producer for AI semiconductors. I'm betting that NVIDIA will become the most valuable company in the world this year - not least because of an 𝐞𝐱𝐭𝐫𝐞𝐦𝐞𝐥𝐲 𝐜𝐥𝐞𝐯𝐞𝐫 𝐈𝐏 𝐬𝐭𝐫𝐚𝐭𝐞𝐠𝐲. To get into the exclusive club of $1 trillion companies alongside Apple, Microsoft and Google, you have to do a lot right - 𝐢𝐧𝐜𝐥𝐮𝐝𝐢𝐧𝐠 𝐰𝐡𝐞𝐧 𝐢𝐭 𝐜𝐨𝐦𝐞𝐬 𝐭𝐨 𝐈𝐏. NVIDIA's central role in the current AI boom comes from the production of graphics cards (GPUs) that are designed to process large amounts of data in parallel to build and use generative AI models. The GPUs are specifically optimized for AI and are significantly different from general-purpose CPUs such as those from Intel Corporation. NVIDIA has a platform strategy to make itself indispensable in hardware and software for AI applications. NVIDIA holds patents in a variety of AI areas. The GPUs are now the infrastructure for all kinds of AI applications, and users like Apple, Meta, Amazon, Google and ByteDance (TikTok) are lining up to get the chips. Data cloud users such as Netflix, Spotify, YouTube etc. depend on this. NVIDIA not only produces the hardware but also a complete technology stack with software, algorithms and libraries, as well as software development kits and API frameworks for the smooth operation of deep and machine learning models. NVIDIA's over 15,000 patent publications serve to protect its own competitive advantage in a wide variety of AI applications. In 2023 alone, 2,589 patent publications from NVIDIA were revealed. 
The IPC classes include: 📌 Arrangements for program control, 📌 Computer systems based on biological models 📌 3D image rendering 📌 General purpose image data processing 📌 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; 📌 Output arrangements for transferring data from processing unit to output unit, 📌 Control arrangements or circuits for visual indicators 📌 Accessing, addressing or allocating within memory systems or architectures 📌 Image analysis 📌 Interconnection and transfer of information between, memories, input/output devices or central processing units 📢 𝐓𝐡𝐢𝐬 𝐦𝐞𝐚𝐧𝐬 𝐭𝐡𝐚𝐭 𝐍𝐕𝐈𝐃𝐈𝐀 𝐝𝐨𝐞𝐬 𝐧𝐨𝐭 𝐬𝐞𝐞 𝐢𝐭𝐬𝐞𝐥𝐟 𝐚𝐬 𝐚 𝐆𝐏𝐔 𝐦𝐚𝐧𝐮𝐟𝐚𝐜𝐭𝐮𝐫𝐞𝐫 𝐛𝐮𝐭 𝐫𝐚𝐭𝐡𝐞𝐫 𝐚𝐬 𝐚𝐧 𝐞𝐜𝐨𝐬𝐲𝐬𝐭𝐞𝐦 𝐨𝐫𝐜𝐡𝐞𝐬𝐭𝐫𝐚𝐭𝐨𝐫 𝐚𝐧𝐝 𝐟𝐨𝐫 𝐭𝐡𝐢𝐬 𝐲𝐨𝐮 𝐧𝐞𝐞𝐝 𝐚 𝐬𝐮𝐢𝐭𝐚𝐛𝐥𝐞 𝐩𝐚𝐭𝐞𝐧𝐭 𝐩𝐨𝐫𝐭𝐟𝐨𝐥𝐢𝐨. 𝐍𝐕𝐈𝐃𝐈𝐀 𝐡𝐚𝐬 𝐜𝐨𝐦𝐩𝐥𝐞𝐭𝐞𝐝 𝐭𝐡𝐞 𝐡𝐨𝐦𝐞𝐰𝐨𝐫𝐤 𝐢𝐧 𝐚𝐧 𝐞𝐱𝐞𝐦𝐩𝐥𝐚𝐫𝐲 𝐦𝐚𝐧𝐧𝐞𝐫. Many thanks to Andrew Klein for the patent analysis, where the image comes from 👇. I'll give you the link in the comments.
Fernando Cormenzana

Independent ICT Consultant: AI, Emergent Technologies & e-Government. CTO at Australis_UY
8mo
Nvidia’s latest chip promises to boost AI’s speed and energy efficiency. What’s new: The market leader in AI chips announced the B100 and B200 graphics processing units (GPUs) designed to eclipse its in-demand H100 and H200 chips. The company will also offer systems that integrate two, eight, and 72 chips. How it works: The new chips are based on Blackwell, an updated chip architecture specialized for training and inferencing transformer models. Compared to Nvidia’s earlier Hopper architecture, used by H-series chips, Blackwell features hardware and firmware upgrades intended to cut the energy required for model training and inference. Training a 1.8-trillion-parameter model (the estimated size of OpenAI’s GPT-4 and Beijing Academy of Artificial Intelligence’s WuDao) would require 2,000 Blackwell GPUs using 4 megawatts of electricity, compared to 8,000 Hopper GPUs using 15 megawatts, the company said. Blackwell includes a second-generation Transformer Engine. While the first generation used 8 bits to process each neuron in a neural network, the new version can use as few as 4 bits, potentially doubling compute bandwidth. A dedicated engine devoted to reliability, availability, and serviceability monitors the chip to identify potential faults. Nvidia hopes the engine can reduce compute times by minimizing chip downtime. Nvidia doesn’t make it easy to compare the B200 with rival AMD’s top offering, the MI300X. Price and availability: The B200 will cost between $30,000 and $40,000, similar to the going rate for H100s today, Nvidia CEO Jensen Huang told CNBC. Nvidia did not specify when the chip would be available. Google, Amazon, and Microsoft stated intentions to offer Blackwell GPUs to their cloud customers. Behind the news: Demand for the H100 chip has been so intense that the chip has been difficult to find, driving some users to adopt alternatives such as AMD’s MI300X. Moreover, in 2022, the U.S. restricted the export of H100s and other advanced chips to China. 
The B200 also falls under the ban. Why it matters: Nvidia holds about 80 percent of the market for specialized AI chips. The new chips are primed to enable developers to continue pushing AI’s boundaries, training multi-trillion-parameter models and running more instances at once. We’re thinking: Cathie Wood, author of ARK Invest’s “Big Ideas 2024” report, estimated that training costs are falling at a very rapid 75 percent annually, around half due to algorithmic improvements and half due to compute hardware improvements. Nvidia’s progress paints an optimistic picture of further gains. It also signals the difficulty of trying to use model training to build a moat around a business. It’s not easy to maintain a lead if you spend $100 million on training and next year a competitor can replicate the effort for $25 million. Andrew Ng, [email protected]
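The comparisons in this post are easy to work through by hand. Here is a small Python sketch that uses only the figures quoted above; the dictionary layout and the `cost_after` helper are illustrative assumptions. The per-GPU power value simply divides the quoted total by the GPU count, so it includes whatever system overhead Nvidia counted.

```python
# Nvidia's quoted figures for training a 1.8-trillion-parameter model.
hopper = {"gpus": 8000, "megawatts": 15}
blackwell = {"gpus": 2000, "megawatts": 4}

# How many times fewer GPUs, and how much less power, Blackwell is said to need.
gpu_ratio = hopper["gpus"] / blackwell["gpus"]
power_ratio = hopper["megawatts"] / blackwell["megawatts"]

# Quoted total power divided by GPU count, in kilowatts.
hopper_kw_per_gpu = hopper["megawatts"] * 1000 / hopper["gpus"]
blackwell_kw_per_gpu = blackwell["megawatts"] * 1000 / blackwell["gpus"]

# Halving the bits per value doubles how many values fit in the same bandwidth.
values_per_bandwidth_gain = 8 / 4


def cost_after(initial_usd, years, annual_decline=0.75):
    """Training cost after a number of years of a constant annual decline."""
    return initial_usd * (1 - annual_decline) ** years


print(f"GPUs: {gpu_ratio}x fewer, power: {power_ratio}x less")
print(f"kW per GPU: Hopper {hopper_kw_per_gpu}, Blackwell {blackwell_kw_per_gpu}")
print(f"FP8 -> FP4 gain: {values_per_bandwidth_gain}x")
print(f"$100M of training a year later: ${cost_after(100_000_000, 1):,.0f}")
```

By these numbers the saving comes from needing fewer GPUs rather than from each GPU drawing less power, and a 75 percent annual decline is what turns the $100 million effort into a $25 million one after a year.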
Deepak gupta

Technical Author at TechShali
1mo
If we talk about artificial intelligence, we inevitably have to talk about NVIDIA, the largest manufacturer of AI accelerators, with apologies to AMD, whose AMD Instinct range is carving out a significant place for itself in many data centers used to train large language models (LLMs). NVIDIA’s presence in artificial intelligence is limited to graphics hardware for the home market and servers. Outside of this field the company has no presence; however, that does not mean it is not working to expand its scope to processors. A rumor has been circulating for several months pointing to a partnership between NVIDIA and MediaTek to bring a processor with ARM architecture to market. The two companies have been working together for several years, and the latest fruit of this collaboration is a chip for monitors compatible with NVIDIA G-Sync. Neither company has confirmed or denied that they are working on a new processor for PCs. Then again, they don’t really need to confirm it, since NVIDIA has taken care of that through the new communication channel it has created, called NVIDIA AI PC. Hey there! 👋 Welcome to the new NVIDIA AI PC channel! 💻 Whether you are deep into AI or just a little curious, we’re here to explore the power of AI on your local PC. Follow along for the latest news, tech, and AI inspiration. 🚀 https://t.co/Fb4rlmMKLG November 18, 2024 • 18:00 NVIDIA ARM processors Beyond the Tegra chip in the Nintendo Switch, NVIDIA’s presence in the processor market is non-existent. With this new communication channel, the company only confirms that it is preparing to launch a processor with ARM architecture. And we say ARM architecture and not x86 because of its association with MediaTek, Qualcomm’s most direct rival in the segment of processors with this architecture. 
At this time, Qualcomm is the only manufacturer that already has equipment on the market, specifically laptops, with ARM processors, through the Snapdragon X Elite and Snapdragon X Plus. The next generation of these, Snapdragon Elite, without the X, will mean an important change in terms of performance and graphics power with respect to the current generation, so NVIDIA and MediaTek will have to work very well hand in hand to launch a product that is truly a quality alternative to what Qualcomm already offers. The term AI PC is already being used by Qualcomm, Intel and AMD and refers to the Neural Processing Unit found inside, better known as the NPU. The NPU is responsible for carrying out artificial intelligence tasks locally, without the need to use the graphics card, so the power the graphics card can offer is, initially, irrelevant. But if the performance of the NPU is combined with that of the graphics card, the performance in AI tasks run locally on the computer increases considerably. This type of

Now yes: NVIDIA prepares the launch of its AI PC processors
