Last week at Google I/O, Google unveiled its most powerful AI chip yet: the Tensor Processing Unit (TPU) v6, codenamed Trillium. This cutting-edge technology is turning heads everywhere, and here's why:

Faster AI Processing: Trillium delivers 4.7x the peak compute performance per chip of TPU v5e. Imagine getting your AI tasks done in a fraction of the time!

Double the Bandwidth: Trillium doubles the high-bandwidth memory capacity, memory bandwidth, and chip-to-chip interconnect speed. Swoosh, that's fast.

Reduced Costs: Trillium is 67% more energy-efficient than its predecessor, which translates to lower operating costs for your AI workloads.

Bonus technical tip from our AI Infra team: Google Cloud TPUs use the bfloat16 numeric format, which provides the same dynamic range as a 32-bit IEEE float at reduced precision. This lets TPUs achieve higher matrix multiplication throughput with minimal impact on model accuracy. When using TPUs, make sure your model weights and inputs are cast to bfloat16.

At CloudWerx, we're experts in building AI infrastructure at scale, including training and inference on TPUs, and we're committed to staying at the forefront of cutting-edge tech. If your organization is looking to capitalize on the price/performance of TPUs, or if you simply have questions, reach out to us!

We want to talk tech -- what do you think of this news? Let us know in the comments!

Read more: Google Announces Sixth-generation AI Chip, a TPU Called Trillium: https://lnkd.in/d8GqVkNy #AIinfrastructure
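To make that bfloat16 tip concrete, here is a minimal JAX sketch (one common framework for Google Cloud TPUs) of casting weights and inputs to bfloat16 before a matrix multiply. The parameter names and shapes below are illustrative only, not taken from any particular workload:

```python
import jax
import jax.numpy as jnp

def predict(params, x):
    # Cast weights and inputs to bfloat16 so the TPU matrix units can use
    # their fast path; on TPU the accumulation still happens in float32.
    w = params["w"].astype(jnp.bfloat16)
    b = params["b"].astype(jnp.bfloat16)
    x = x.astype(jnp.bfloat16)
    return x @ w + b

key = jax.random.PRNGKey(0)
params = {
    "w": jax.random.normal(key, (512, 128), dtype=jnp.float32),
    "b": jnp.zeros((128,), dtype=jnp.float32),
}
x = jax.random.normal(key, (32, 512), dtype=jnp.float32)

y = jax.jit(predict)(params, x)
print(y.dtype)  # bfloat16
```

Many teams keep the master copy of the weights in float32 and cast only for the forward pass, as above; higher-level mixed-precision utilities in JAX and TensorFlow can also handle this casting for you.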
CWX’s Post
More Relevant Posts
-
Massive news for AI enthusiasts (and anyone looking to give their business a serious edge)! Did you hear about Google's new Tensor Processing Unit (TPU) v6, codenamed Trillium? This is a game-changer for AI processing power. Ready to supercharge your AI efforts? ⚡️ Reach out to the team at CloudWerx today, and let's chat about how TPUs can revolutionize your business!
Google Announces Sixth-generation AI Chip, a TPU Called Trillium
hpcwire.com
-
Beyond the compute, the networking, and the storage, we will ultimately need to start considering efficient ways to protect modern MLOps workflows. This new approach must account for the iterative nature of modern data science activities, which are a fundamental component of modern AI factories. In addition, and perhaps not evident at the moment, model compliance and observability requirements may become a necessity once these models are applied to assist in critical decisions.
🚀 We agree with NVIDIA that new PowerScale advances bring "massive" #AI storage opportunities for our partners.🌐💾 #Storage Read the full details in this CRN article: https://dell.to/3KGSqHh
Dell And Nvidia Say GenAI’s Data Byproducts Give Partners 'Massive' Storage Opportunity
crn.com
-
Google's Own AI Chip ⭐️ As the race in AI technology heats up, Google is stepping up its game with a groundbreaking move! The tech giant is launching its own custom Arm-based CPU, named Axion, aimed at enhancing AI capabilities within data centers. This announcement not only brings a more powerful iteration of Google's Tensor Processing Unit (TPU) AI chips but also marks Google's entry into a competitive domain currently led by tech behemoths Microsoft and Amazon.

Key Highlights:
1. Google's Axion CPU - Designed to support AI workloads, Axion is already powering critical Google services like YouTube ads and Google Earth Engine. Set to be available to Google Cloud business customers later this year, Axion promises a performance boost of 30% over general-purpose Arm chips and 50% over Intel's offerings. This chip embodies Google's commitment to enhancing cloud services with top-tier AI performance.
2. TPU v5p Update - Google is also enhancing its TPU AI chips, which are vital for AI acceleration tasks and an alternative to Nvidia's GPUs. The new TPU v5p is a powerhouse: a single pod contains 8,960 chips, more than double its predecessor, and is built specifically to train large and complex generative AI models.

Google's strategic move to develop in-house chips like Axion and the TPU v5p not only reduces dependency on external suppliers like Intel and Nvidia but also intensifies competition in the AI market. This initiative underscores Google's ambition to lead in both hardware innovation and cloud services, highlighting a significant shift toward custom silicon in cloud and AI operations. Stay tuned for more updates on how this development reshapes the AI landscape! #AI #GoogleCloud #Innovation https://lnkd.in/eB6QkHkH
-
Microsoft and OpenAI bet $100 billion to free themselves from the shackles of overreliance on the world's most profitable semiconductor chip brand for AI chips.

Microsoft and OpenAI have reportedly invested over $100 billion in a new project, Stargate. It's a data center that will help support both companies' AI advances by meeting their high demand for GPUs. The project is expected to launch in 2028 and will reduce the overreliance on NVIDIA for AI chips.

NVIDIA is undoubtedly cashing in as we leap toward the most significant technology revolution yet with AI. Due to the rising demand for GPUs, the company was ranked the world's most profitable semiconductor chip brand for Q3 2023, with $18.12 billion in revenue and a $10.42 billion profit. https://lnkd.in/dqg_PB-J By Kevin Okemwa
Microsoft and OpenAI bet $100 billion to free themselves from the shackles and overreliance on the world's most profitable semiconductor chip brand for AI chips
windowscentral.com
-
Follow the AI trend lines... Local "AI" compute is increasing radically, following the introduction of the Apple M4 yesterday, which is reportedly six times faster than Intel's AI PC. I think we will see dedicated "AI cards" similar to Google's TPU. There is a strong possibility of consumer variations of Nvidia's GPUs configured for AI workloads with more video memory, although GPUs are a more general-purpose type of parallel compute than is needed for inference use cases alone. I would love to see a consumer version of Groq's specialist processor (LPU), which simply blows everything else away when it comes to inference workloads.

Turning to open-source models, with recent releases such as Llama 3 and Phi-3 we're seeing more capable, local GPT-4-like performance with fewer parameters, able to run on a smaller on-prem server or local device. They pursue the route of higher-quality training data, combined with a greater number of training runs, to achieve similar, if not better, results than the much larger current models that can only run in data centres. The prevailing thought is that more time spent training on the same data could yield even better performance... but they had to stop somewhere.

Finally, the excellent research on structured prompting approaches (chains, trees, and graphs of thought) enables the model to "think" about its answers across a number of possibilities, again enhancing results. Add in this past year's advances around retrieval-augmented generation and "agentic" workflows, and the results can be very impressive with today's models.

Bringing these elements together, before long (this year, I would venture) we will have GPT-4-like reasoning and beyond, able to run at the edge on any device (smartphone, tablet, laptop, PC), using your own hardware and power, without a subscription. As the cost per token drops, both in terms of compute and power consumption, and as the responsiveness and speed of inference increase to hundreds, thousands, tens of thousands of tokens per second and beyond, the quality of the results and the associated possibilities start to become very interesting indeed.

All the while, the very large models running in ever larger data centres, with continuously expanding capabilities, keep being rolled out: GPT 4.x/5, the very large Llama 3 400-billion-parameter model, and Google's largest Gemini model will become widely available. We will start to see interesting real-world distributed architectures coordinating inference workflows/workloads between local edge devices and cloud compute as we optimise solutions across various dimensions, such as capabilities, quality of results, compute, latency, responsiveness, cost, data storage, data security, data privacy, interpretability, bias, ownership, copyright, etc.

2024 is turning out to be another huge year for AI, and it is not slowing down... #artificialintelligence #ai #aistrategy #aitransformation
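For readers wondering what those structured prompting approaches look like mechanically, here is a hypothetical Python sketch of a tree-of-thoughts-style search; `generate` and `score` are placeholder callables you would back with your own model calls, not a specific vendor API.

```python
from typing import Callable, List

def tree_of_thought(question: str,
                    generate: Callable[[str, int], List[str]],
                    score: Callable[[str], float],
                    branches: int = 3,
                    depth: int = 2) -> str:
    # Start from the bare question and iteratively expand reasoning paths.
    frontier = [question]
    for _ in range(depth):
        candidates = []
        for partial in frontier:
            # Ask the model for several alternative next reasoning steps.
            for step in generate(partial, branches):
                candidates.append(partial + "\n" + step)
        # Keep only the most promising partial answers for the next round.
        frontier = sorted(candidates, key=score, reverse=True)[:branches]
    return frontier[0]
```

With branches=1 the loop collapses to a plain chain of thought; either way, the point from the post stands: faster, cheaper inference lets the model spend extra tokens exploring alternatives before committing to an answer.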
-
"Google was built for this moment. We've been pioneering TPUs for more than a decade," said Google CEO Sundar Pichai.

Introducing Trillium, Google's sixth-generation Tensor Processing Unit (TPU). Trillium was designed specifically to accelerate the development and deployment of advanced AI models, with significant improvements in compute power, memory, and networking capabilities.

Key advancements in Trillium include:
➡ Enhanced compute performance: a 4.7x increase per chip, enabling faster model training and serving.
➡ Increased memory and bandwidth: doubled High Bandwidth Memory (HBM) capacity and bandwidth, along with doubled Interchip Interconnect (ICI) bandwidth, for handling larger models and datasets.
➡ SparseCore accelerator: a third-generation SparseCore, specialized for processing the ultra-large embeddings common in advanced AI workloads.
➡ Sustainability: 67% more energy-efficient than the previous generation.
👉 👉👉👉 https://lnkd.in/dyASZGWy #AI #ML
-
29 Mar 2024 — As is often the case with the latest tech craze, communications service providers (CSPs) are rushing to get a piece of the action. But let me strip out all the glitz and glamour. My prediction: in a few years' time, GPUaaS will become a commodity offered by many CSPs. Fast forward, ... how then does a CSP differentiate itself? Can CSPs compete head-to-head with hyperscalers or IT companies that boast better skillsets and resources? More non-CSPs will enter the fray as well.

There are many other strategic pieces that need to fall into place: the AI models, good-quality datasets, security, and sustainability (Nvidia AI chips are energy guzzlers), among other things. I expect many enterprises will trip over themselves trying to get onto the AI bandwagon (just because their board and the financial market expect them to do so) while lacking the deep understanding and resources to extract benefits from AI. After a few years, many will likely claim that the AI promise is a chimera. AI is still a long way off from being a "plug and play" solution. Here, do CSPs have the know-how to help enterprises bridge the knowledge/execution gap? This is crucial! If CSPs stumble here, their GPUaaS vision will fail alongside.

This is not Kevin Costner's Field of Dreams, where one "builds it and they will come" (buyers have alternatives to CSP offerings). A CSP also needs to know how to sell the service. Is its sales force well trained and properly incentivised for the task at hand? A holistic go-to-market approach (i.e., well-coordinated product, marketing, sales, and customer support initiatives) is needed, and this is where I fear many CSPs will come up short. I do not expect many CSPs to succeed.

I am betting that the one who will laugh all the way to the bank is Nvidia with its "picks and shovels" strategy. In a gold rush, the prospectors seldom make money ... it's the guy selling the picks and shovels that does! But even here, how long can the "picks and shovels" guy continue to reap the gains? Competition hates a monopoly. And Nvidia's upcoming Blackwell GPU will be priced from $30,000 to $40,000 (a not insignificant increase over the previous-generation chip). This will invite the development of cheaper, competitive chips, and then portability and interoperability concerns will emerge alongside.

In short, there is a lot for CSPs to ponder. #AI #GPUaaS #nvidia
Singtel targets enterprises with GPUaaS offering
https://www.mobileworldlive.com
-
There are no advancements in #AI, #ML, or #Analytics without #HPC (high-performance computing). In 1938, Germany's Z1 performed at a speed of 1 instruction per second (#IPS). Roughly 30 years later, in 1969, Lawrence Livermore National Laboratory's CDC 7600 got us up to 36 #Megaflops, or 36,000,000 floating-point operations per second! Forward roughly another 30 years to 1996, and the University of Tsukuba's Hitachi CP-PACS pushed us to 386.20 #Gigaflops, or 386,200,000,000 #flops. Fast forward to Oak Ridge National Laboratory in 2023, and the Hewlett Packard Enterprise - Cray Inc. #Frontier system, the first to break the #exascale barrier, clocks in at 1.194 #exaflops, or 1,194,000,000,000,000,000 #flops! Where will #Stargate take us in 2028? https://lnkd.in/eMhmtTiC
Microsoft Stargate: The Next AI Platform Will Be An Entire Cloud Region
https://www.nextplatform.com
-
#Google has introduced its sixth-generation #TPU, #Trillium, aimed at powering large-scale #AI workloads. Trillium significantly surpasses its predecessor, TPU v5e, with a 4x boost in #training performance and a 3x increase in inference throughput. Enhanced with double the #HBM capacity and twice the Interchip Interconnect (#ICI) bandwidth, Trillium is optimized for large #language models like #Gemma 2 and #Llama, as well as high-demand inference tasks, including diffusion models. Trillium also boasts a 67% improvement in #energy efficiency and excels in #benchmark tests, achieving 4x faster training speeds on #models like Gemma 2-27b and Llama2-70B. Its #scalability allows up to 256 #chips in a high-bandwidth pod, expandable to thousands across Google’s #Jupiter data centers, maintaining high performance with Google’s #Multislice software. This release is part of Google #Cloud’s AI #Hypercomputer, which combines #TPUs, #GPUs, and #CPUs to support advanced generative AI demands. https://lnkd.in/gEaizgTB
Google puts Nvidia on high alert as it showcases Trillium, its rival AI chip, while promising to bring H200 Tensor Core GPUs within days
techradar.com
-
#PowerScale is on track to become the FIRST Ethernet-connected storage solution validated on NVIDIA DGX SuperPOD. See what TechTarget has to say: https://dell.to/3HcnLzX #AI #GenerativeAI #UnstructuredData
Dell PowerScale updates, new partnerships point to AI stack | TechTarget
techtarget.com