H Company might've just created the best AI agent yet. After raising $200M, they just introduced an agent that can execute any task from a prompt. Their "Runner H" can basically turn instructions into action with human-like precision.

Features:
▸ Navigates web interfaces with pixel-level precision.
▸ Interprets pixels and text to understand screens and elements.
▸ Automates workflows for web testing, onboarding, and e-commerce.
▸ Adapts automatically to UI changes.
▸ Achieves a 67% success rate on WebVoyager, outperforming competitors.

Architecture:
▸ Powered by a 2B-parameter LLM for function calling and coding.
▸ Includes a 3B-parameter VLM for understanding graphical and text elements.

You can sign up for the private beta here: https://lnkd.in/gdrK6u6A
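The post doesn't say how the two models cooperate, but the architecture it sketches (a function-calling LLM that plans actions, a VLM that grounds them to pixel coordinates on a screenshot) is a common GUI-agent loop. Below is a minimal, hypothetical sketch of that loop; Runner H's internals and API are private, so both model calls here are invented stubs for illustration only.

```python
"""Hypothetical sketch of a two-model GUI agent loop: a planner LLM
decides the next action, a VLM grounds it to screen coordinates.
plan_next_action and ground_element are stand-in stubs, NOT Runner H."""
from dataclasses import dataclass

@dataclass
class Action:
    kind: str        # "click", "type", or "done"
    target: str      # natural-language description of the UI element
    text: str = ""   # text to type, if any

def plan_next_action(goal: str, history: list[Action]) -> Action:
    # Stub for the function-calling LLM: given the goal and the steps
    # taken so far, pick the next step. A real agent would query a model.
    if not history:
        return Action("click", "search box")
    if history[-1].kind == "click":
        return Action("type", "search box", text=goal)
    return Action("done", "")

def ground_element(screenshot: bytes, description: str) -> tuple[int, int]:
    # Stub for the VLM: map an element description to (x, y) pixel
    # coordinates on the screenshot. Fixed point for this sketch.
    return (640, 80)

def run_agent(goal: str, max_steps: int = 10) -> None:
    history: list[Action] = []
    for _ in range(max_steps):
        action = plan_next_action(goal, history)
        if action.kind == "done":
            break
        screenshot = b""  # a real agent would capture the screen here
        x, y = ground_element(screenshot, action.target)
        print(f"{action.kind} at ({x}, {y}): {action.target} {action.text}")
        history.append(action)

run_agent("weather in Paris")
```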
How is this different from function calling? Also, someone else posted this with basically the same wording yesterday, so it makes me question the authenticity and incentives here...
67% may be better than the competition, but in reality it’s not that useful. Do you really want a tool that fails 1/3 of the time?
The marketing team is usually better than the engineering one. How many videos like this one have we seen in the last year?
Thoughts on how this will impact web design? Sanjog Bora Vikhyat Kaushik
QA testers are finished :D
Is this legit? How do we know this isn't just a Llama wrapper or some fine-tuning on Selenium using LangChain?
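For readers wondering what a "Selenium wrapper" agent would even look like, here's roughly the off-the-shelf pattern the commenter is alluding to: Selenium drives the browser, and LangChain exposes browser actions as tools for any tool-calling LLM. This is a generic sketch of that pattern, not a claim about what Runner H actually is.

```python
# Generic "browser tools" agent: Selenium actions wrapped as LangChain tools.
from langchain_core.tools import tool
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # launches a local Chrome session

@tool
def open_url(url: str) -> str:
    """Navigate the browser to a URL."""
    driver.get(url)
    return f"opened {url}"

@tool
def click(css_selector: str) -> str:
    """Click the first element matching a CSS selector."""
    driver.find_element(By.CSS_SELECTOR, css_selector).click()
    return f"clicked {css_selector}"

# Any tool-calling chat model can then drive these tools, e.g.:
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(model="gpt-4o").bind_tools([open_url, click])
```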
F**k there goes my billion dollar idea 🥲
Eric Vyacheslav This is groundbreaking—AI agents are getting closer to true task automation! For those curious about the implications of advancements like this, we recently had Advitya, a Machine Learning Engineer 2 at Microsoft specializing in Responsible AI, on my YouTube channel Ready Set Do. He shared insights on navigating the challenges of AI development, ensuring responsible deployment, and the future of human-AI collaboration. Definitely worth checking out for anyone passionate about AI! https://youtu.be/OJzpyENIomE
Pixel-level precision for navigating web interfaces is a game-changer for tasks like e-commerce automation and web testing. It’s fascinating how 'Runner H' combines visual and textual understanding for adaptability. How does the system handle highly dynamic or non-standard UIs—are there any constraints or failure modes you’ve identified?
CEO and Co-Founder at AskUI | What can be said can be solved.
Here's an open source alternative :) https://huggingface.co/AskUI/PTA-1 https://github.com/askui/vision-agent
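PTA-1 is a small Florence-2-based model for locating UI elements from a text description. The sketch below assumes it keeps the standard Florence-2 interface of its base model (trust_remote_code loading, task-prefixed prompt, post_process_generation); the exact task prompt and file path are assumptions, so check the model card before relying on them.

```python
# Locating a UI element with AskUI/PTA-1, assuming the standard
# Florence-2-style interface. Prompt format and paths are assumptions.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model = AutoModelForCausalLM.from_pretrained("AskUI/PTA-1", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("AskUI/PTA-1", trust_remote_code=True)

task = "<OPEN_VOCABULARY_DETECTION>"
prompt = task + "search field at the top of the page"  # element description
image = Image.open("screenshot.png")  # hypothetical screenshot path

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3,
)
text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
# Parses the raw output into coordinates for the described element.
result = processor.post_process_generation(
    text, task=task, image_size=(image.width, image.height)
)
print(result)
```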