"Building on its legacy in compiler technology, Skymizer has been dedicatedly developing its upcoming Large Language Model (LLM) accelerator, aptly named EdgeThought. This software-hardware co-design IP exemplifies the company’s commitment to optimizing edge inferencing systems in terms of computation, memory usage, power efficiency, and cost. EdgeThought leverages Skymizer’s advanced compiler technologies to tailor AI performance features specifically to edge computing environments, which are often constrained by power and space. With EdgeThought, Skymizer introduces a cost-effective solution that allows for the deployment of sophisticated AI capabilities without the need for heavy infrastructure investments. Skymizer’s move is a clear indication of the evolving landscape in AI chip design, where the integration of software and hardware through compiler-first strategies is becoming increasingly crucial. The company is expected to showcase EdgeThought and its various applications at the upcoming tech conference, Computex Taipei, where it will delve deeper into the technical aspects and benefits of its innovative co-design approach." The above content is excerpted from Skymizer's official website. Come meet us at Computex 2024!
Joe Jheng’s Post
-
#nvidia Spectrum-X Platform is generally available NOW through our major infrastructure partners! Spectrum-X is NVIDIA's full-stack, hardware + software, Ethernet-based fabric for GPU compute and GenAI infrastructure. Spectrum-X is the culmination of two years of hard work by many NVIDIAns: a fully integrated platform with extreme scale, built on the foundation of Mellanox high-performance Ethernet switch ASICs and systems, operated and orchestrated with the Cumulus Linux NOS, and designed and simulated with the Cumulus AIR fabric "digital twin." NVIDIA BlueField-3 SuperNICs deliver incredible throughput at low latency to every GPU, with a rail-optimized design for fabric-wide optimized communications. The bottom line: delivering extreme performance on every level, this platform is optimized for GenAI and the most intense training, with innovations like enhanced adaptive routing, enhanced RoCE and programmable congestion control, and granular telemetry and instrumentation. Check it out: https://lnkd.in/ekM7K9cX Building GenAI infrastructure? Come to GTC - let's talk! https://lnkd.in/ewsCUJMb Amit Katz Bill Webb Barak Gafni Ranga Maddipudi
NVIDIA Spectrum-X Networking Platform
nvidia.com
-
It’s great to see our clients using and testing our products in the field 😍. This time, #Intel #Gaudi was tested by SqueezeBits and published thanks to Taesu Kim 👌. What's even more important is that it's not just yet another blog post but a document from R&D that explains the nuances of the Gaudi's architecture and how they influence AI pipelines for vLLM workloads. Long story short, you’ll find methodology, tests, analysis with some pros and cons versus the competition, and in the end, it’s very valuable feedback for us. Go and check the article 👇👇. Lastly, subscribe https://lnkd.in/eVvxBFZk because there will be more 💪💪 #IamIntel
[Intel Gaudi] #2. Graph Compiler and Overall Performance Evaluation - SqueezeBits
blog.squeezebits.com
-
We can dive deep into the realm of innovation with foundation models like Meta's Llama and Google's FLAN-T5, unleashing creativity while saving time and resources. These models, brimming with billions of parameters, are trained over weeks or months on vast datasets using distributed CPU and GPU clusters. Learn more about Llama (https://llama.meta.com) and FLAN-T5 (https://lnkd.in/dhhskekc). 🚀 #Llama #FLAN-T5
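As a rough back-of-envelope illustration of why models with billions of parameters demand distributed GPU clusters, here is a small Python sketch estimating the memory needed just to hold model weights at common precisions. The 7e9 parameter count is illustrative (loosely in the range of a Llama-2-7B-class model), and the figures cover weights only, not activations or optimizer state:

```python
def model_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory (GiB) needed just to store the model weights."""
    return num_params * bytes_per_param / 1024**3

# Illustrative 7-billion-parameter model at common precisions.
params = 7e9
for label, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{label}: ~{model_memory_gb(params, nbytes):.1f} GiB")
```

Even before activations, gradients, and optimizer state — which multiply this figure several times over during training — a 7B-parameter model at fp16 already occupies roughly 13 GiB, which is why training runs are spread across many accelerators.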
Meta Llama
llama.meta.com
-
POWERED BY DDN Storage You can still register for the #GTC session on March 19th, 3:00 PM - 3:25 PM PDT (11:00 PM - 11:25 PM CET): Join PNY Technologies and Scaleway for an exclusive and insightful session as we delve into the success story behind the deployment of Europe's largest NVIDIA DGX H100 SuperPOD. This testimonial session brings together industry leaders, technologists, and innovators to share their firsthand experiences, challenges, and triumphs in implementing cutting-edge artificial intelligence at an unprecedented scale. https://lnkd.in/e9hB3vhh For every $1 spent on @DDN Storage, gain $2 in infrastructure and performance efficiency. #DataStorage #exascaler #infinia #multi-tenancy #hpc #artificialintelligence #ai #machinelearning #nlp #llm #mlops #digitaltransformation #science #gpt3 #gpt4 #NoNFS #gpucomputing #tech #research #innovation #AIinfrastructure #AIadoption #AItechnology #generativeai #dgx #superpod
NVIDIA #GTC2024 Conference Session Catalog
nvidia.com
-
Unveiling the Spheron Whitepaper: A New Era in GPU Resource Allocation With our whitepaper, we introduced a blueprint for a future where AI and ML breakthroughs are no longer gated by access to computational power. We challenge the status quo of GPU resource allocation, setting the stage for global innovation in technology. Let's dive in: Presently, GPU accessibility is plagued by high costs and gatekeeping enacted by large corporations. The combination of these factors only helps to stifle innovation. Our decentralized vision aims to democratize access, leveling the playing field for innovators everywhere. Via the decentralization of GPU resources, we can: - Sidestep the risks of centralized control, - Effectively reduce costs and increase the availability of GPU resources to devs and researchers worldwide, - Propagate a tech landscape that actively breeds innovation. At Spheron, we propose a new model. Our Decentralized Compute Network (DCN) will leverage Ethereum's smart contract capabilities to manage resource transactions in a transparent manner. Through this, we can cultivate an environment that is trustless, fair, and secure. Integral to our vision for equitable GPU resource allocation is $SPHN. Our native token is essential for transaction management, incentivizing protocol contributions, and maintaining a healthy supply and demand balance within the Spheron ecosystem. Our mission is to bring about a new paradigm - one that pushes the boundaries of what is possible in the realm of AI. More equitable global access to GPU resources leads to a global acceleration of innovation for machine learning models. Spheron acts as the first domino. Enterprises and individuals alike can leverage Spheron to provide GPU resources and monetize their contributions to the network. Earn financial rewards by providing GPU resources to a global pool, and play a part in driving forward technological advancement on a global scale. 
The release of our whitepaper is only the first step in our mission to advance AI worldwide. As we continue to move forward, we intend to incorporate more underutilized computing resources into our network, furthering our vision to make processing power available to all. Innovation shouldn't be monopolized by entities with deep pockets. Innovation is the fundamental product of human collaboration. At Spheron, we aim to foster a future where technological advancements are driven by global collaboration thanks to decentralization. Read our whitepaper now: https://lnkd.in/eKsPj9PX #DePIN #AI #Crypto
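The transaction flow described in the post — a renter locks tokens, a provider supplies GPU time, and payment settles on completion — can be illustrated with a toy in-memory ledger. This is purely a sketch of the escrow pattern, not Spheron's actual smart contract; all names (`Ledger`, `open_job`, `settle_job`) are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Ledger:
    """Toy in-memory stand-in for an on-chain escrow contract."""
    balances: dict = field(default_factory=dict)  # user -> token balance
    escrow: dict = field(default_factory=dict)    # job_id -> (renter, price)

    def deposit(self, user: str, amount: int) -> None:
        self.balances[user] = self.balances.get(user, 0) + amount

    def open_job(self, job_id: str, renter: str, price: int) -> None:
        # Lock the renter's tokens in escrow before any GPU time is used.
        if self.balances.get(renter, 0) < price:
            raise ValueError("insufficient balance")
        self.balances[renter] -= price
        self.escrow[job_id] = (renter, price)

    def settle_job(self, job_id: str, provider: str) -> None:
        # On completion, escrowed tokens are released to the GPU provider.
        renter, price = self.escrow.pop(job_id)
        self.balances[provider] = self.balances.get(provider, 0) + price

ledger = Ledger()
ledger.deposit("renter", 100)
ledger.open_job("job-1", "renter", 60)
ledger.settle_job("job-1", "gpu-provider")
print(ledger.balances)  # renter keeps 40 tokens, provider earns 60
```

On a real chain this logic would live in a smart contract so that neither party has to trust the other: the escrow state and settlement rules are enforced by the network rather than by a central operator.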
Dive into Spheron's Whitepaper V1
blog.spheron.network
-
Getting started with Tiny Llama2 with Intel NPU offload and a full conversational bot loop with AI for PC. Check how fast it compiles and runs inference. Shriram Vasudevan (FIE, FIETE, SMIEEE) Anish Kumar Sachin Kelkar Pooja Baraskar Dmitriy Pastushenkov Sriram Ramkrishna #iamintel #ai #npu #research #coding #development #llm
Tiny Llama Intel NPU implementation
https://www.youtube.com/
-
A few things here on the topic of GenAI: - Digital twins are a focus for many across various industries, from manufacturing to medicine. They have the potential to bring greater safety and precision, and I believe we will see the capacity of this technology increase greatly this year. - As Sead Fadilpasic says here, we're going to see some switching between CPU and GPU in the coming years. - With clustered systems and distributed computing, the number of nodes and machines we use is growing alongside the need for increased computing capacity. #GenAI #GenerativeAI #DigitalTwins #GPU #AI
Generative AI in software engineering: trends and innovations - SiliconANGLE
siliconangle.com
-
this is alarming. designers, makers and fablabbers really need to start taking low-carbon computing and the invisible impacts of AI, the interwebs and our digital infrastructures seriously. niche and emerging, but there are compelling discourses and experiments going on in the world of "low-tech labs" (growing in France), "permacomputing", smallweb, and so on. (since the amazing Anna Puchalska introduced me to the concept of permacomputing when she was doing her MFA thesis, it has become the term of choice in the fediverse for low-energy digital awareness.) https://lnkd.in/d7bPrSwd
I watched Nvidia's Computex 2024 keynote and it made my blood run cold
techradar.com
-
Can generative AI improve the security of the digitization and tokenization of commodities and RWA? Nvidia CEO Jensen Huang's generative AI presentation is astounding. We have rewatched it several times. Let us know what you think.
NVIDIA CEO Jensen Huang Keynote at COMPUTEX 2024
https://www.youtube.com/
-
In a compelling keynote in Taiwan, Jen-Hsun Huang, CEO of Nvidia, laid out a future where artificial intelligence reshapes every corner of our lives. Huang's vision, steeped in the advancements of generative AI, proposes a profound shift from traditional applications to a new era where AI's capabilities are both broad and deeply integrated into the fabric of daily and industrial operations. One of the standout concepts introduced was that of the AI factories. This idea extends beyond using AI for specific tasks; it envisions entire manufacturing ecosystems powered by AI at every step. Such integration promises to revolutionize efficiency and innovation, potentially transforming production processes in ways we've only begun to imagine. Equally transformative is the "Earth 2" project, a digital twin of our planet designed to simulate and predict environmental changes and natural phenomena. The ability to forecast climate impacts with high accuracy offers unprecedented opportunities for disaster preparedness and environmental management. This project could become a cornerstone in our strategy to combat and adapt to climate change, providing crucial data to policymakers and scientists alike. Huang also highlighted significant advancements in Nvidia’s CUDA and its libraries, which have become critical in handling the vast data requirements of modern AI systems. This foundational technology supports the increased demand for deep learning and physical simulations, facilitating a new level of AI application that is more dynamic and capable than ever before. Perhaps the most profound element of Huang's presentation was his reflection on the broad impact of generative AI. This technology is set to redefine our interactions with digital systems, moving from static databases to interactive, predictive models that learn and evolve. 
The implications for industries like telecommunications, manufacturing, healthcare, and many others are staggering, promising to enhance the quality of services and the efficiency of systems across the globe. #AI #Nvidia #GenerativeAI #DigitalTransformation #Sustainability #TechForGood #FutureOfWork
Nvidia's 2024 Computex Keynote: Everything Revealed in 15 Minutes
https://www.youtube.com/