arXiv · 6d ago · Research
AI browsing agents hit a new milestone on WebArena benchmark
A team at DeepMind has published a web agent that scores 78% on WebArena, up from 52% for last year's leader. The paper breaks down the changes in exploration policy.