✨ ✨ Come join us for a guest lecture by Hyung Won Chung from OpenAI on "Shaping the Future of AI from the History of Transformer".
The talk will be on Monday October 14 from 1:45 PM to 3:15 PM ET, as part of my course on Large Language Models at the University of Pennsylvania.
Hyung Won is a research scientist at OpenAI who worked most recently on their 🍓 o1 model. His other notable works include contributions to Flan-T5, Flan-PaLM, T5X, and the PaLM language model at Google.
➡️ Registration link: https://lu.ma/q0gghdtc
Title: Shaping the Future of AI from the History of Transformer
Abstract: AI is developing at such an overwhelming pace that it is hard to keep up. Instead of spending all our energy catching up with the latest developments, I argue that we should study the change itself. The first step is to identify and understand the driving force behind the change. For AI, it is the exponentially cheaper compute and the associated scaling. I will provide a highly opinionated view of the early history of Transformer architectures, focusing on what motivated each development and how each became less relevant with more compute. This analysis will help us connect the past and present in a unified perspective, which in turn makes it more manageable to project where the field is heading.
Bio: Hyung Won Chung is a research scientist at OpenAI. His recent work focuses on o1. He has worked on various aspects of Large Language Models: pre-training, instruction fine-tuning, reinforcement learning from human feedback, reasoning, multilinguality, parallelism strategies, etc. His notable work includes the Flan scaling papers (Flan-T5, Flan-PaLM) and T5X, the training framework used to train the PaLM language model. Before OpenAI, he was at Google Brain, and before that he received a PhD from MIT.
Supplementary Readings:
➡️ Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. (https://lnkd.in/eEA6MSJx).
➡️ Fast Transformer Decoding: One Write-Head is All You Need. (https://lnkd.in/eusN_AHm).
#ai #machinelearning #largelanguagemodels #llm #upenn #turing