Rafael Brown’s Post

CEO & Founder at Symbol Zero // Microsoft Regional Director

Highlighting: "A single 8xSohu server is said to equal the performance of 160 H100 GPUs, meaning data processing centers can save both on initial and operational costs if the Sohu meets expectations."

-----

Etched comes at Nvidia creatively by focusing on transformer models. Could the Sohu chip reduce the need for Nvidia A100 and H100 chips?

-----

Tom’s Hardware: "Sohu AI chip claimed to run models 20x faster and cheaper than Nvidia H100 GPUs. Startup Etched has created this LLM-tuned transformer ASIC." (Jowi Morales) (June 26, 2024)

"Etched, a startup that builds transformer-focused chips, just announced Sohu, an application-specific integrated circuit (ASIC) that claims to beat Nvidia’s H100 in terms of AI LLM inference. A single 8xSohu server is said to equal the performance of 160 H100 GPUs, meaning data processing centers can save both on initial and operational costs if the Sohu meets expectations.

According to the company, current AI accelerators, whether CPUs or GPUs, are designed to work with different AI architectures. These differing frameworks and designs mean hardware must be able to support various models, like convolutional neural networks, long short-term memory networks, state space models, and so on. Because these models are tuned to different architectures, most current AI chips allocate a large portion of their computing power to programmability. Most large language models (LLMs) use matrix multiplication for the majority of their compute tasks, and Etched estimated that Nvidia’s H100 GPUs only use 3.3% of their transistors for this key task. This means that the remaining 96.7% of the silicon is used for other tasks, which are still essential for general-purpose AI chips.

Etched made a huge bet on transformers a couple of years ago when it started the Sohu project. This chip bakes the transformer architecture into the hardware, thus allowing it to allocate more transistors to AI compute.

We can liken this to processors and graphics cards: let’s say current AI chips are CPUs, which can do many different things, and then the transformer model is like the graphics demands of a game title. Sure, the CPU can still process these graphics demands, but it won’t do it as fast or as efficiently as a GPU. A GPU that’s specialized in processing visuals will make graphics rendering faster and more efficient. This is what Etched did with Sohu. Instead of making a chip that can accommodate every single AI architecture, it built one that only works with transformer models.

The company’s gamble now looks like it is about to pay off, big time. Sohu’s launch could threaten Nvidia’s leadership in the AI space, especially if companies that exclusively use transformer models move to Sohu. After all, efficiency is the key to winning the AI race, and anyone who can run these models on the fastest, most affordable hardware will take the lead."

Tom’s Hardware: https://lnkd.in/g2ZGiU-z

#ai #cloud #aicloud #cloudai #cloudgpu #genai #transformermodel
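
A quick sanity check on the quoted figures, as a minimal Python sketch. It only restates the vendor's claims (8 Sohu chips per server, 160 H100 equivalents, 3.3% of transistors on matrix multiplication) and shows how they line up with the "20x" headline and the 96.7% remainder. The variable names are my own, and none of this is a measurement.

```python
# Figures as quoted in the post (Etched's claims, not independent benchmarks).
sohu_chips_per_server = 8
equivalent_h100_gpus = 160
matmul_transistor_share_pct = 3.3

# Implied per-chip ratio: 160 H100s spread over 8 Sohu chips.
per_chip_ratio = equivalent_h100_gpus / sohu_chips_per_server

# Share of H100 transistors left for everything other than matrix multiplication.
other_share_pct = round(100 - matmul_transistor_share_pct, 1)

print(f"Implied per-chip ratio: {per_chip_ratio:.0f}x")
print(f"Transistor share for other tasks: {other_share_pct}%")
```

So the "20x" in the headline is simply the per-server claim divided by eight, assuming performance scales evenly across the chips in a server.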

Sohu AI chip claimed to run models 20x faster and cheaper than Nvidia H100 GPUs

tomshardware.com
