Today, we’re announcing Scale has closed $1B of financing at a $13.8B valuation, led by existing investor Accel. For 8 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including autonomous vehicles, defense applications, and generative AI. With today’s funding, we’re moving into the next phase of our journey: accelerating the abundance of frontier data to pave the road to Artificial General Intelligence (AGI).

“Our vision is one of data abundance, where we have the means of production to continue scaling frontier LLMs many more orders of magnitude. We should not be data-constrained in getting to GPT-10.” - Alexandr Wang, CEO and founder of Scale AI

This new funding also enables Scale to build upon our prior model evaluation work with enterprise customers, the U.S. Department of Defense, and the White House to deepen our capabilities and offerings for both public and private evaluations.

There’s a lot left to do. If this challenge excites you, join us: https://scale.com/careers

Read the full announcement: https://lnkd.in/gVBhaPZ5
Scale AI
Software Development
San Francisco, California · 191,509 followers
The Data Engine that powers the most advanced AI models.
About us
At Scale, our mission is to accelerate the development of AI applications. We believe that to make the best models, you need the best data. The Scale Generative AI Platform leverages your enterprise data to customize powerful base generative models to safely unlock the value of AI. The Scale Data Engine consists of all the tools and features you need to collect, curate, and annotate high-quality data, in addition to robust tools to evaluate and optimize your models. Scale powers the most advanced LLMs and generative models in the world through world-class RLHF, data generation, model evaluation, safety, and alignment. Scale is trusted by leading technology companies like Microsoft and Meta, enterprises like Fox and Accenture, generative AI companies like OpenAI and Cohere, U.S. government agencies like the U.S. Army and the U.S. Air Force, and startups like Brex and OpenSea.
- Website: https://scale.com
- Industry: Software Development
- Company size: 501-1,000 employees
- Headquarters: San Francisco, California
- Type: Privately Held
- Founded: 2016
- Specialties: Computer Vision, Data Annotation, Sensor Fusion, Machine Learning, Autonomous Driving, APIs, Ground Truth Data, Training Data, Deep Learning, Robotics, Drones, NLP, and Document Processing
Locations
- Primary: 303 2nd St, South Tower, 5th FL, San Francisco, California 94107, US
Updates
-
Introducing Multimodal and Multilingual LLM Leaderboards from SEAL, Scale’s Safety, Evaluations, and Alignment Lab 👇

Today we’re launching FIVE new leaderboards: Visual-Language Understanding and Multilingual (Japanese, Korean, Arabic, Chinese, and Spanish).

👉 Visual-Language Understanding: To assess models’ visual reasoning capabilities, we’re introducing VISTA, a novel multimodal benchmark. This rubric-based visual task assessment benchmark pushes beyond simple Q&A to evaluate complex visual-language understanding.

Key results:
- Gemini-2.0-flash-exp leads the rubric-based evaluation at ~39.7% average task accuracy
- Claude models are strong performers, taking the second and third positions (38.4% and 37.9%)
- Scores below 40% indicate that multimodal reasoning remains a tough challenge

✅ Learn more and view the leaderboard: https://lnkd.in/g43qgC_R

👉 Multilingual: We are also launching 4 new multilingual LLM leaderboards (Arabic, Chinese, Japanese, and Korean), PLUS we’ve updated the Spanish rankings. We created a Multilingual Prompts Dataset, composed of 1,000 prompts per language and tailored to evaluate models’ interaction capabilities across multiple languages.

Key results:
- Gemini 1.5 Pro (gemini-exp-1121) leads in Arabic
- Gemini 1.5 Pro (gemini-1.5-pro-exp-0827) takes the top spot in Chinese
- o1-preview dominates Japanese and Korean

Leaderboard results 👇
✅ Japanese: https://lnkd.in/gE7MyAz9
✅ Korean: https://lnkd.in/gKB8d-Xp
✅ Arabic: https://lnkd.in/gPdHYrMJ
✅ Chinese: https://lnkd.in/gFYW-HGt
✅ Spanish (updated): https://lnkd.in/gzeCUf6q
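For readers unfamiliar with rubric-based scoring, here is a minimal sketch of how an "average task accuracy" metric like the one above can be computed. The rubric structure and aggregation below are assumptions for illustration; the post does not specify VISTA’s exact methodology.

```python
# Hypothetical sketch of rubric-based task scoring (not VISTA's actual code).
# Each task is graded against several rubric criteria; the leaderboard score
# is the mean per-task fraction of criteria satisfied.

def task_accuracy(criteria_passed: list[bool]) -> float:
    """Fraction of rubric criteria a model's response satisfies for one task."""
    return sum(criteria_passed) / len(criteria_passed)

def leaderboard_score(per_task_results: list[list[bool]]) -> float:
    """Average task accuracy across all benchmark tasks."""
    return sum(task_accuracy(task) for task in per_task_results) / len(per_task_results)

# Two toy tasks, graded against 4 and 3 rubric criteria respectively.
results = [[True, True, False, False], [True, False, False]]
print(f"{leaderboard_score(results):.1%}")  # 41.7% on this toy data
```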
-
Scale is proud to support the bipartisan House AI Task Force report, a culmination of rigorous collaboration and thoughtful analysis. We thank Chairman Jay Obernolte, Co-Chair Ted Lieu, and all of the task force Members for their dedication to ensuring that the United States leads the world in AI development. 🚀 Scale looks forward to continuing to work with the next Congress and the Trump Administration to advance these recommendations. 🇺🇸 https://lnkd.in/egfWxDck
-
It’s NeurIPS day 3! Here’s what’s on deck today, Dec 12:
✅ Poster | Learning Goal-Conditioned Representations for Language Reward Models | Poster Session 3 East | 11AM to 2PM | East Exhibit Hall A-C #2711

Thank you to everyone who dropped by the booth or attended yesterday’s spotlight poster presentation: A Careful Examination of Large Language Model Performance on Grade School Arithmetic (https://lnkd.in/gZTqUDMA)

If you haven’t yet, swing by booth #115 in Hall A to grab your custom NeurIPS swag and chat with one of our experts. Missed yesterday’s poster presentation? See open research roles, as well as all original research from Scale, here: https://scale.com/research
-
We’ve partnered with TIME to launch the first GenAI experience for Person of the Year. This interactive platform was built through a strategic partnership in which we brought together TIME’s proprietary data, the base LLM’s latent knowledge, data about relevant current events, and custom guardrails to keep the model on-topic and safe. Together with TIME, we are pioneering a future where media is not just consumed, but experienced. Learn more about our work with TIME here: https://lnkd.in/gCatWiDj
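As an illustration of the pattern this post describes (grounding a base LLM in proprietary and current-events context behind a topical guardrail), here is a minimal sketch. Every name here, and the naive keyword guardrail itself, is a hypothetical stand-in; the post does not disclose the actual implementation.

```python
# Hypothetical sketch of guardrailed, context-grounded prompt assembly.
# Production guardrails typically use trained classifiers, not keyword lists.

ON_TOPIC_KEYWORDS = {"person of the year", "time", "profile", "interview"}

def is_on_topic(user_message: str) -> bool:
    """Naive keyword guardrail: allow only questions about the feature."""
    text = user_message.lower()
    return any(keyword in text for keyword in ON_TOPIC_KEYWORDS)

def build_prompt(user_message: str, archive_context: str, events_context: str) -> str:
    """Combine proprietary archive data and current events into one grounded prompt."""
    if not is_on_topic(user_message):
        return "Politely decline and steer the user back to Person of the Year."
    return (
        "Answer using only the context below.\n"
        f"Archive context:\n{archive_context}\n"
        f"Recent events:\n{events_context}\n"
        f"User question: {user_message}"
    )
```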
-
We’ve heard your feedback about Outlier and have been hard at work making continuous updates to improve the contributor experience. Today, we’re happy to announce that, as a result, Outlier’s Q3 NPS score has climbed to 44 (scores range from -100 to 100)! This score puts Outlier, a part of Scale’s family of products and services, in the top half of all companies, according to global benchmarks. We’re nowhere close to done, but with Outlier’s new General Manager and the rollout of new features based on user feedback over the past months, Outlier is building a best-in-class platform powering the future of flexible work. Read the full story here: https://lnkd.in/e8ebu_S8
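For context on the metric itself: NPS is the percentage of promoters (scores of 9-10) minus the percentage of detractors (scores of 0-6), which is why it ranges from -100 to 100. A quick sketch with made-up survey data:

```python
# Standard NPS formula; the ratings below are invented for illustration.

def net_promoter_score(ratings: list[int]) -> float:
    """NPS = % promoters (9-10) minus % detractors (0-6), on a -100..100 scale."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# 58% promoters, 28% passives, 14% detractors -> NPS of 44.
ratings = [10] * 58 + [8] * 28 + [3] * 14
print(net_promoter_score(ratings))  # 44.0
```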
-
We are officially set up and going at NeurIPS! Here’s what’s on deck for tomorrow, Dec 11 👇
✅ Spotlight Poster | A Careful Examination of Large Language Model Performance on Grade School Arithmetic | Poster Session 1 West | 11AM to 2PM | West Ballroom A-D #6902

Don’t forget to stop by booth #115 in Hall A to grab your custom NeurIPS swag and chat with one of our experts! Interested in doing research at Scale? See open research roles as well as all original research from Scale here: https://scale.com/research
-
We’re thrilled to be one of ten inaugural partners in Anduril Industries’ Lattice Partner Program, which launches today. Combining Scale Donovan with Lattice, Anduril’s Command and Control software, creates a groundbreaking end-to-end solution that delivers a decision advantage through software- and AI-enabled joint staff planning. The integration will enable users to access Lattice’s orchestration, planning, and simulation capabilities, and extend Donovan’s AI capabilities to ensure true machine-speed command and control. Learn more: https://lnkd.in/g3v9Hj3p
-
Scale is at NeurIPS 2024 in Vancouver this week! Here’s what we’re up to 👇

We’re presenting the following posters:
✅ Learning Goal-Conditioned Representations for Language Reward Models: https://lnkd.in/gTXcZJh4
✅ A Careful Examination of Large Language Model Performance on Grade School Arithmetic: https://lnkd.in/gZTqUDMA

We’re presenting the following papers at workshops:
✅ Planning In Natural Language Improves LLM Search For Code Generation: https://lnkd.in/gkNbvVMC
✅ LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet: https://lnkd.in/gUxjS9PT
✅ Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs: https://lnkd.in/gbsHfgjd

In addition to the posters and workshops, we’re also talking through our research highlights, LLM leaderboards, and frontier capabilities (navigating the era of AI agents and reasoning) at our booth. Swing by booth #115 in Hall A to say hi and pick up some custom NeurIPS swag! Stay tuned for more NeurIPS updates and grab our full schedule here: https://lnkd.in/gsHFR7Bu
-
The latest research from Scale investigates synthetic data generation strategies for fine-tuning LLMs under different constraints, providing a robust framework for enterprises to identify the most effective and cost-efficient data strategies for fine-tuning. Our paper on this work will be presented at NeurIPS 2024 next week. 👇

Fine-tuning LLMs requires high-quality datasets, which are expensive and time-consuming to create. Our researchers explored methods to generate these datasets synthetically while optimizing for different resource constraints and task types. They found the optimal data generation strategy depends on the query budget ratio: the number of LLM queries divided by the initial seed instruction count. Specifically, when you have few initial training examples and a small budget, it’s most effective to add more detail to existing answers; as your budget grows, generating new questions is better.

Learn more about the research: https://lnkd.in/gB46msrn

Our paper on this work, “Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs,” will be presented at the main track of NeurIPS 2024 next week. Paper authors: Jerry Chan, George Pu, Apaar Shanker, Parth Suresh, Penn J., John Heyer, Sam Denton
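Here is a minimal sketch of that heuristic. The decision threshold is a hypothetical placeholder; the paper derives the actual crossover behavior from its experiments.

```python
# Sketch of the query-budget-ratio heuristic described above.
# The threshold value is an assumption, not a number from the paper.

def query_budget_ratio(num_llm_queries: int, num_seed_instructions: int) -> float:
    """Available LLM queries divided by the initial seed instruction count."""
    return num_llm_queries / num_seed_instructions

def choose_strategy(num_llm_queries: int, num_seed_instructions: int,
                    threshold: float = 10.0) -> str:
    """Small budget ratio: enrich existing answers; large ratio: write new questions."""
    if query_budget_ratio(num_llm_queries, num_seed_instructions) < threshold:
        return "add more detail to existing answers"
    return "generate new questions"

print(choose_strategy(num_llm_queries=500, num_seed_instructions=100))
# -> "add more detail to existing answers"
```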