Anthropic has published a new constitution for its AI model Claude. In the document, the company describes the values, behavioral principles, and considerations that should guide the model when it responds to user questions.
The constitution has been made publicly available under a Creative Commons CC0 license, allowing the content to be used freely without permission.
Anthropic published the first version of this constitution in May 2023. According to the company, the earlier version proved to have limitations, including Claude’s difficulty in correctly applying safety and behavioral guidelines in new or unforeseen situations. When instructions did not provide explicit guidance for a specific prompt, the model could still generate undesirable or incorrect responses, reports SiliconANGLE.
The new constitution therefore contains not only instructions but also detailed explanations of the reasoning behind the desired behavior. Anthropic argues that these explanations make it easier for Claude to apply the guidelines to unfamiliar tasks or contexts.
Four core principles for Claude’s behavior
Unlike previous versions, the document is not a list of standalone rules but a coherent description of priorities and context, structured around four core principles that guide Claude’s behavior. Among other things, it states that the model must be helpful by tailoring responses to users’ explicit wishes. For example, Anthropic states that Claude should not generate code in a programming language other than the one requested by the user, according to SiliconANGLE.
In addition, the document describes what Anthropic understands by “broadly safe” behavior. This includes, among other things, that Claude may not perform actions a user has explicitly prohibited, and that the model must be transparent about how it reaches decisions. The constitution also contains guidelines for ethical conduct and for complying with additional, more specific instructions from Anthropic. These additional guidelines cover, among other things, resisting jailbreak attempts and interacting with external applications and tools.
Anthropic states that the constitution plays a direct role in training Claude: the document is part of the training data, and the models also use it to generate synthetic training data, for example by simulating conversations in which the constitution’s guidelines apply.
According to the company, the constitution also serves customers and users. Organizations that deploy Claude can consult the document to assess whether the model’s output aligns with the established principles, and provide feedback to Anthropic if it does not.
Other AI providers are also publishing behavioral frameworks
The publication is part of a broader trend in which AI developers explicitly lay down their principles and behavioral frameworks. Other parties in the sector have published similar documents: OpenAI Group PBC, for example, releases its own behavioral framework, the Model Spec, under a CC0 license, and that document is part of the training data for GPT-5.
Anthropic emphasizes that the constitution is not a static document. The company expects its content to be adapted as AI systems continue to develop.