New features in the Anthropic Console should make it much easier for developers to work with Anthropic's Claude 3 LLMs. The company now offers functionality that tests automatically generated prompts and evaluates them for effectiveness.
Crafting the right prompts or queries is the best way to get the most value out of LLMs. Anthropic wants to help users in this regard by testing automatically generated prompts and evaluating their effectiveness.
For its Claude 3.5 Sonnet LLM, the company now offers new functionality for this purpose in the Anthropic Console interface. Over time, developers can refine their inputs based on this feedback and thereby improve Claude's responses for specialized tasks.
Usability
More specifically, the new environment helps users test and evaluate prompts created by the built-in prompt generator, which was introduced in May of this year, and assess how effective those prompts are in different scenarios. The feature can be found under the Evaluate tab in the interface.
In addition, users can upload their own examples to the test environment or ask the Claude 3.5 Sonnet LLM to produce a series of AI-generated test cases. In this way, developers can compare the effectiveness of different prompts side by side and rate the results on a scale of one to five.
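For developers who prefer to script such comparisons rather than use the Console UI, the sketch below shows roughly how two prompt variants could be run against the same test case via Anthropic's Messages API in Python. The model ID, prompt templates, and test cases are illustrative assumptions and are not part of the Console feature itself.

```python
# Minimal sketch: comparing two prompt variants side by side via the
# Anthropic Messages API (Python SDK). The Console's Evaluate tab offers this
# in the browser; this script only illustrates the same idea programmatically.
# Assumes ANTHROPIC_API_KEY is set in the environment; the model ID, prompt
# templates, and test cases below are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

MODEL = "claude-3-5-sonnet-20240620"  # assumed model ID; adjust as needed

prompt_variants = {
    "variant_a": "Summarize the following support ticket in one sentence:\n\n{ticket}",
    "variant_b": "You are a support lead. Give a one-sentence summary of this ticket:\n\n{ticket}",
}

test_cases = [  # hypothetical test cases; the Console can also generate these
    {"ticket": "Customer cannot reset their password after the latest update."},
]

for case in test_cases:
    for name, template in prompt_variants.items():
        response = client.messages.create(
            model=MODEL,
            max_tokens=200,
            messages=[{"role": "user", "content": template.format(**case)}],
        )
        # Print the responses side by side so they can be rated manually,
        # for example on a one-to-five scale as in the Console.
        print(f"--- {name} ---\n{response.content[0].text}\n")
```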
The test environment is available immediately to all Anthropic Console users.
Also read: Anthropic launches initiative to develop better benchmarks for LLMs