George Z. Lin’s Post


Modeling long-range dependencies in sequences has driven notable architectural advances, with state space models (SSMs) emerging as a significant alternative to Transformers. Researchers from Tel Aviv University and IBM question the conventional benchmarking practice of training models from scratch with random initialization, arguing that it may overestimate the differences between architectures.

The researchers propose pretraining models with standard denoising objectives on the downstream task data itself, a method they term self-pretraining (SPT). This approach significantly narrows the performance gap between Transformers and SSMs: self-pretrained vanilla Transformers can match advanced SSMs like S4 on benchmarks such as the Long Range Arena (LRA), and SPT improved the best reported SSM result on the PathX-256 task by 20 points. (A minimal sketch of the SPT recipe follows the paper link below.)

Key findings from the study:

1. Transformers vs. SSMs: Properly pretrained vanilla Transformers can achieve performance comparable to S4 on LRA tasks, challenging the notion that Transformers are less capable of modeling long-range dependencies.

2. Redundancy of structured parameterizations: With data-driven initialization through pretraining, the structured parameterizations in SSMs become largely redundant, suggesting that simpler models can match more complex architectures.

3. Effectiveness across data scales: SPT is especially beneficial when training data is scarce, with relative gains most pronounced on smaller datasets.

4. Adaptability of convolution kernels: Data-driven kernels learned via SPT adapt to the specific task distribution, improving performance on long-sequence tasks.

The study emphasizes the importance of incorporating a pretraining stage in model evaluation to ensure accurate performance estimates and to simplify architecture design. This not only enables a fair comparison between architectures but also highlights how efficiently pretraining leverages the task data.

Arxiv: https://lnkd.in/enaH3mhu
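
For readers who want to see what SPT looks like in practice, here is a minimal PyTorch sketch of the two-phase recipe: a denoising (masked-token) pretraining pass over the downstream task's own sequences, followed by supervised fine-tuning. The model, masking ratio, and training loop are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of self-pretraining (SPT): denoise the task's own sequences first,
# then fine-tune on labels. All hyperparameters here are assumptions.
import torch
import torch.nn as nn

class TinyTransformer(nn.Module):
    def __init__(self, vocab_size=256, d_model=128, n_layers=4, n_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size + 1, d_model)  # +1 slot for [MASK]
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.denoise_head = nn.Linear(d_model, vocab_size)  # used during SPT
        self.cls_head = nn.Linear(d_model, n_classes)       # used for fine-tuning

    def forward(self, x):
        return self.encoder(self.embed(x))

def spt_step(model, tokens, mask_id, mask_prob=0.15):
    """One denoising step: mask random tokens, predict the originals."""
    mask = torch.rand_like(tokens, dtype=torch.float) < mask_prob
    corrupted = tokens.masked_fill(mask, mask_id)
    logits = model.denoise_head(model(corrupted))
    return nn.functional.cross_entropy(logits[mask], tokens[mask])

def finetune_step(model, tokens, labels):
    """Supervised step: mean-pool encoder states, classify the sequence."""
    logits = model.cls_head(model(tokens).mean(dim=1))
    return nn.functional.cross_entropy(logits, labels)

model = TinyTransformer()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
tokens = torch.randint(0, 256, (8, 512))   # stand-in for downstream task sequences
labels = torch.randint(0, 10, (8,))        # stand-in for task labels

loss = spt_step(model, tokens, mask_id=256)    # phase 1: self-pretraining, no labels
loss.backward(); opt.step(); opt.zero_grad()

loss = finetune_step(model, tokens, labels)    # phase 2: supervised fine-tuning
loss.backward(); opt.step(); opt.zero_grad()
```

The point of the sketch is only the ordering: the same denoising objective and the same task data are used to initialize the weights before any labels are touched, which is what replaces random initialization in the paper's comparison.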

