Sharing the catch-up weekly paper roundup for the week of October 28, 2024. The spotlight paper should be skimmed by engineers building document parsing/understanding tech. Other noteworthy themes:
- A new coding benchmark contradicts the general vibe that Sonnet 3.5 is the best
- Web agents/computer use and AI/data scientist exploration continue their strong momentum
- Can we teach foundation models to understand ECG data?
- The fine line between rote memorization and deep understanding
- Mechanistic interpretability for diffusion models
https://lnkd.in/gwibWHSh #ai2incubator #harmonious
AI2 Incubator
We help entrepreneurs create AI-first startups through world-leading AI support and funding.
About us
We help entrepreneurs form new teams and build AI-first startups. Apply on our website for a $500k investment, up to $1M in free AI compute, and a network of AI founders and researchers. We bring together world-class engineers, researchers, and entrepreneurs to create new companies from scratch. From ideation to execution, we help generate ideas, find co-founders, secure pilot customers, integrate cutting-edge AI, and more. Our hands-on support guides founders through building, scaling, and raising millions in venture funding.
In 6 years, we've backed 40+ companies now collectively valued at $1B+ that have raised $220M+ and created 700+ jobs. Our startups are transforming industries: improving immigrant communication (Yoodli), accelerating cancer research (Ozette and Modulus), enhancing legal efficiency (Lexion, acquired by DocuSign), turning smartphones into medical devices (PreemptiveAI), and so much more.
Any VC can write a check, but what truly sets us apart is our unparalleled technical and AI expertise. With our heritage at the Allen Institute for AI (founded by Microsoft co-founder Paul Allen), we've been singularly focused on commercializing AI since long before it became a buzzword. AI2 has 200+ PhDs, researchers, engineers, professors, and support staff, and is well known for its contributions to the AI research community, including 600+ papers, 20+ best paper awards, and numerous products and open-source offerings such as Semantic Scholar, AllenNLP, and more. Our core team includes pioneers like Oren Etzioni, a leader in AI for over three decades, and our deep technical expertise means we don't just back you financially; we help you build breakthrough products, navigate complex challenges, and accelerate your company's growth.
We're soon launching AI House in Seattle. In partnership with the Mayor of Seattle and supported by Washington state, this space will serve as a central hub for the region's AI community.
- Website: https://www.ai2incubator.com
- Industry: Software Development
- Company size: 11-50 employees
- Headquarters: Seattle, WA
- Type: Privately Held
- Founded: 2016
- Specialties: artificial intelligence, software development, startups, entrepreneurship, venture capital, funding, seattle, machine learning, deep learning, computer vision, NLP, natural language processing, neural networks, speech, TTS, STT, and innovation
Locations
- Primary: 2101 North 34th Street, Suite 195, Seattle, WA 98103, US
Updates
-
How are you designing the UI/UX for GenAI applications? If you're curious about the various patterns in this rapidly evolving space, check out this week's spotlight paper at Harmonious, courtesy of Adobe Research. Other noteworthy papers:
- A fully open (according to Ai2's definition), frontier code LLM from INF and M-A-P. If you have not heard of these two organizations, you are not alone.
- Google developed a technique that takes your video clip and generates all sorts of variants (zoom/pan/tilt/orbit), unleashing the inner cinematographer in you. Caveat: code/tool not released yet.
- Agent K, an LLM agent that reached Kaggle Grandmaster level (although the Internet challenged that claim).
- LLM computer use is progressing nicely (and not just with Anthropic's latest Claude update).
- o1 almost made Medprompt obsolete (paper by Microsoft and OpenAI).
As discussed previously, there's an avalanche of benchmark papers. The formula seems to be: a) identify a potential weakness of current AI, b) create a benchmark to highlight this weakness, aiming to maximize the gap between humans and AI, c) invite the research community to work on closing the gap, d) profit (in number of citations). I personally would like to see more work advancing AI performance on existing benchmarks, to balance things out a bit. Details are here: https://lnkd.in/g2-EDxu3 #ai2incubator #harmonious
Weekly paper roundup: Survey of User Interface Design for GenAI (11/4/2024)
harmonious.ai
-
Have you ever struggled with how to segment/chunk documents in your RAG applications? Harmonious' spotlight paper this past week dives into this problem and provides a few ideas. Benchmarking continues to be the center of attention. Other LLM-related topics include knowledge editing, introspection, math reasoning, cognitive behavioral therapy, design-to-HTML, and web agents. Finally, there's a paper describing AutoTrain, an open-source library authored by the world's first Kaggle Quadruple Grandmaster. https://lnkd.in/gHC8JkZZ #ai2incubator #harmonious
Weekly paper roundup: Meta-Chunking (10/21/2024)
harmonious.ai
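For readers who have not implemented chunking before, here is a minimal sketch of the fixed-size, overlapping-window baseline that approaches like Meta-Chunking aim to improve on. The chunk size and overlap values are illustrative assumptions, not recommendations from the paper.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows with overlap.

    A deliberately naive baseline: real chunkers usually try to respect
    sentence or semantic boundaries, which is what Meta-Chunking explores.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
    return chunks

# Each chunk would then be embedded and indexed for retrieval.
chunks = chunk_text("A long annual report about quarterly earnings ... " * 50)
```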
-
📌 This week we have two spotlight papers at Harmonious.ai, one of which is a massive technical report from perhaps the most formidable AI big tech company today, claiming SOTA performance and beating everyone from OpenAI's Sora to hot startups such as Runway, Luma, Pika, and ElevenLabs.
📌 The number one research topic at the moment is hands down benchmarking and evaluation (2 out of 3 of the papers I reviewed this week).
📌 OpenAI's o1 models absolutely dominate in one of the benchmarks (JudgeBench).
📌 There's the introduction of the first truly multimodal, open-weight model from one of China's AI Tigers.
📌 A new concept to watch is Flow Matching, which was introduced just under two years ago.
📌 What's cooler than LLM-as-a-Judge? Agent-as-a-Judge!
📌 What's cooler than RAG? A RAG that can see: VisRAG!
Read more at https://lnkd.in/gQpA_tvA
Every week I read about 100 papers shared by the fine folks at Hugging Face, provide short commentaries for a couple dozen of them, and select one or two as spotlights. Sign up at https://harmonious.ai to get the scoop delivered weekly to your mailbox. #ai2incubator #harmonious
Weekly paper roundup: Movie Gen (10/14/2024)
harmonious.ai
-
Are you building RAG and interested in inference-time scaling (o1, etc.)? Then Harmonious' spotlight paper from Google DeepMind is a must-read. Another interesting paper I read is from a duo of MIT alums who appear to be co-founders of an obscure startup called BitEnergy, with jaw-dropping claims of making transformers 5X to 20X more energy efficient (it was even covered by Yahoo!). Don't want to miss my weekly AI paper roundups? Take 10 seconds to sign up at https://harmonious.ai and have them delivered to your mailbox! https://lnkd.in/gV6UC9za #ai2incubator #harmonious
Weekly paper roundup: Inference Scaling for Long-Context RAG (10/7/2024)
harmonious.ai
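To make "inference-time scaling for RAG" concrete, here is a rough sketch of the general idea of treating retrieval depth and the number of retrieve-then-generate rounds as compute knobs. This is not the DeepMind paper's actual method; `retrieve` and `generate` are hypothetical placeholders for your own retriever and LLM calls.

```python
from typing import Callable

def iterative_rag_answer(
    question: str,
    retrieve: Callable[[str, int], list[str]],  # hypothetical: returns top-k passages
    generate: Callable[[str], str],             # hypothetical: wraps an LLM call
    docs_per_round: int = 20,
    num_rounds: int = 3,
) -> str:
    """Spend more inference compute by retrieving more passages and
    interleaving retrieval with generation; raising docs_per_round and
    num_rounds is the scaling knob."""
    context: list[str] = []
    query = question
    answer = ""
    for _ in range(num_rounds):
        context.extend(retrieve(query, docs_per_round))
        prompt = (
            "Context:\n" + "\n".join(context)
            + f"\n\nQuestion: {question}\nAnswer, then state what is still missing."
        )
        answer = generate(prompt)
        # Use the model's own notion of what is missing to refine the next retrieval.
        query = question + " " + answer
    return answer
```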
-
How do startups building GenAI/LLM-based products interview candidates for data scientist, AI engineer, and applied researcher roles? My quick googling turned up guides suggesting questions about SQL, Spark, PyTorch, gradient boosting, L1 vs. L2 penalties, etc. I believe those skills may have been relevant pre-ChatGPT, but they are increasingly insufficient in today's world of RAG, GenAI UX, agents, prompt management, AI API orchestration, guardrails, etc. Founders, please share your experiences/perspectives. I'll share mine in an upcoming post. PC: DALL-E 3 via ChatGPT.
-
Harmonious' weekly paper roundup:
📌 The spotlight paper is from Meta's Llama team; its title is inspired by a game show
📌 Multimodality is still very active, with speech and video making progress
📌 AI-generated synthetic data is pervasive
📌 GPTs still have the upper hand on new (hence lesser-known) benchmarks. OpenAI is still the best in the business at avoiding overfitting
📌 ByteDance may have come up with a nice improvement on the residual connection: the hyper connection (see the baseline sketch after this post). ResNet 👉 HyperNet?
📌 The frightening pace of AI research from Chinese institutions continues
📌 Apple is publishing a lot more
Read more at: https://lnkd.in/dxPbYD7M #ai2incubator #harmonious
Weekly paper roundup: Law of the Weakest Link (9/30/2024)
harmonious.ai
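As context for the hyper-connection bullet above, here is a minimal sketch of the plain residual connection that hyper connections generalize. This shows only the standard ResNet/Transformer-style baseline, not ByteDance's hyper-connection formulation; the block sizes are arbitrary.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Standard residual connection: output = x + F(x).

    Hyper connections reportedly replace this single fixed skip path with
    several learnable connections; this class is only the familiar baseline.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.f = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.f(x)  # the skip path carries x through unchanged

block = ResidualBlock(dim=64)
out = block(torch.randn(2, 16, 64))  # (batch, tokens, dim)
```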
-
At the beginning of this year, I wrote in Insight #13:
"""
2024 Prediction: VoiceGPTs
We wrap up Insight #13 with our prediction for 2024. Similar to many other 2024 predictions, we anticipate multimodal models to take center stage. We are particularly excited about models that combine text and speech modalities, enabling seamless end-to-end conversations that are voice-based. This is in contrast to the current pipelined approach of sandwiching an LLM with a pair of speech-to-text and text-to-speech models, which results in a highly stilted, walkie-talkie-like experience. Multimodal text and speech models, which we refer to as VoiceGPTs, will elevate the popular ChatGPT experience beyond the confines of the keyboard. Imagine having a natural conversation about any topic with a VoiceGPT on your Alexa, Siri, or Home device. This is a highly non-trivial technical challenge. We will only see a preview of such technology in 2024.
"""
Fast-forward 9 months, and I was proven to be a bit too cautious. First, in May OpenAI announced GPT-4o, with sci-fi-like demos evocative of the movie Her (and drawing the ire of Scarlett Johansson). GPT-4o's voice mode was rolled out to users earlier this month. OpenAI continues to show the way, leaving the rest of the industry (Anthropic, Google, Meta, etc.) scrambling to catch up.
The company closest to catching up here is, however, a French AI research lab "with a $330 million budget that will make everything open source". It is called Kyutai. Last week they shared Moshi: the model, weights for Moshi and its Mimi codec, streaming inference code in PyTorch, Rust, and MLX, and a fantastic technical report. Amazing!
Moshi's technical report is our pick for this past week's spotlight paper at Harmonious (see the pipeline sketch after this post for the contrast with the old approach). https://lnkd.in/gFEVEdTq #ai2incubator #harmonious
Weekly paper roundup: Moshi (9/16/2024)
harmonious.ai
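As an illustration of the pipelined approach described in the quoted prediction above (the walkie-talkie-style baseline that end-to-end speech models such as GPT-4o's voice mode and Moshi replace), here is a rough sketch. `transcribe`, `chat`, and `synthesize` are hypothetical stand-ins for whatever STT, LLM, and TTS services you use; they are not APIs from the Moshi report.

```python
from typing import Callable

def pipelined_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],  # hypothetical speech-to-text call
    chat: Callable[[str], str],          # hypothetical LLM call
    synthesize: Callable[[str], bytes],  # hypothetical text-to-speech call
) -> bytes:
    """One conversational turn in the STT -> LLM -> TTS sandwich.

    Each stage must finish before the next begins, which is why the
    experience feels stilted; end-to-end models stream speech directly.
    """
    text_in = transcribe(audio_in)   # wait for the full utterance
    text_out = chat(text_in)         # wait for the full text reply
    return synthesize(text_out)      # only now can playback start
```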
-
Harmonious' spotlight paper this week: OLMoE: Open Mixture-of-Experts Language Models
Authors: Allen Institute for AI; Contextual AI; University of Washington; Princeton University
This paper presents OLMoE, an innovative language model built on a sparse Mixture-of-Experts architecture, which achieves remarkable efficiency and performance with its 7 billion total parameters (only about 1 billion of which are active per input token). I found the emphasis on key design choices and the detailed analysis of MoE training particularly insightful. The open-source nature of the work fosters transparency and collaboration in the AI community. However, the high computational resources required for pretraining may limit accessibility for many academic institutions. Lastly, I am curious whether the outcomes observed in smaller models will hold true in significantly larger models. A minimal sketch of sparse MoE routing follows below. #harmonious #ai2incubator
Weekly paper roundup: OLMoE (9/2/2024)
harmonious.ai
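For readers new to sparse Mixture-of-Experts, here is a minimal sketch of generic top-k expert routing, the mechanism that lets an MoE model keep most of its parameters inactive for any given token. This is an illustration of the general technique under assumed sizes (8 experts, top-2 routing), not OLMoE's exact router or hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Generic top-k MoE feed-forward layer: each token is routed to only
    k experts, so only a fraction of the parameters is used per token."""
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim); pick each token's top-k experts and mix their outputs
        scores = self.router(x)                          # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # both (tokens, top_k)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = SparseMoELayer(dim=64)
y = layer(torch.randn(10, 64))  # 10 tokens, each processed by only 2 of the 8 experts
```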
-
Harmonious' weekly paper roundup for the week of August 26, 2024. The reviewed papers cover advances in AI models, particularly multimodality, vision-language models, and inference strategies. Several papers enhance Large Language Models (LLMs) through techniques such as improved inference patterns for long contexts, mixed encoders, and energy-efficient on-device processing (WiM, Eagle, Dolphin). Multimodality is a recurring theme, with in-depth studies on cross-modal alignment and real-time interaction in complex environments (Law of Vision Representation, GameNGen, CogVLM2). Further contributions span text-to-image diffusion models, audio language modeling, and AI-generated music (SwiftBrush v2, WavTokenizer, Foundation Models for Music). Rounding things out are efforts to make benchmarks and operational pipelines more useful and accessible for robust real-world performance (SWE-bench-java, LlamaDuo, MME-RealWorld). https://lnkd.in/gw7za8-X #ai2incubator #harmonious
Weekly paper roundup: Writing in the Margins (8/26/2024)
harmonious.ai