Anthropic Unveils Major Funding Program for Advanced AI Benchmarks

Anthropic has launched a new initiative to fund the development of advanced benchmarks for evaluating AI models, including its own generative model, Claude. Announced on Monday, this program will provide funding to third-party organizations that can effectively measure advanced AI capabilities. Applications for funding are open on a rolling basis.

“Our aim with these evaluations is to enhance AI safety across the industry, providing valuable tools for the entire ecosystem. Creating high-quality, safety-relevant evaluations is challenging, and the demand for these evaluations is growing faster than the supply,” Anthropic stated on its blog.

AI currently faces significant benchmarking issues, with existing tests often failing to reflect real-world use cases. Many benchmarks, especially those developed before modern generative AI, may no longer accurately measure what they were designed to measure. Anthropic's solution is to develop challenging benchmarks focusing on AI security and societal impacts through innovative tools, infrastructure, and methods.

Anthropic emphasizes the need for tests that assess a model's ability to carry out tasks such as conducting cyberattacks, enhancing weapons of mass destruction, and manipulating or deceiving people through deepfakes or misinformation. The company also aims to create an "early warning system" for identifying and assessing AI risks related to national security and defense, although it did not disclose specific details of this system.

More research into benchmarks and tasks

The program also plans to support research into benchmarks and tasks that explore AI’s potential in scientific research, multilingual communication, bias mitigation, and self-censoring toxicity. Anthropic envisions new platforms allowing subject-matter experts to develop evaluations and large-scale model trials involving thousands of users. The company has hired a full-time coordinator for this program and may invest in or expand promising projects.

While Anthropic’s effort to improve AI benchmarks is commendable, its commercial interests in the AI industry may raise concerns about impartiality. The company has stated that it wants funded evaluations to align with its AI safety classifications, which could push applicants to accept definitions of "safe" or "risky" AI that they do not agree with.

Concerns?

Some in the AI community may also be skeptical of Anthropic’s focus on "catastrophic" and "deceptive" AI risks, like those involving nuclear weapons, as many experts believe such scenarios are unlikely in the near future. Critics argue that such claims distract from current regulatory issues like AI’s tendency to produce misleading information.

Despite these concerns, Anthropic hopes its program will promote comprehensive AI evaluation as an industry standard, a goal shared by many independent efforts to improve AI benchmarks. Last year, Anthropic announced that it would receive $100 million in funding from South Korea's largest telco, SK Telecom.