Integrated Knowledge Solutions: Constitutional AI

AI has advanced rapidly in recent years, with large language models (LLMs) such as those behind ChatGPT creating enormous excitement. These models can generate remarkably human-like text, albeit with certain limitations. In this post, we'll look at a new member of the family of large language models, Anthropic's Claude 2, and highlight some of its features.

Claude 2 Overview

Claude 2 was released in July 2023. It has a context window of 100,000 tokens, which Anthropic describes as roughly 75,000 words, so it can take in hundreds of pages of text in a single prompt. The window is a fixed upper limit on how much text the model can consider at once; it does not expand or contract with conversation complexity. This capacity was far larger than the roughly 4,000-token window of the GPT-3.5 model behind ChatGPT at the time, enabling Claude 2 to sustain longer, more intricate dialogues while retaining appropriate context. In addition to conversational context, Claude 2 can take in multiple documents to incorporate information from different sources.
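Token counts translate to word counts only approximately. The sketch below is a rough estimate, not a tokenizer: it assumes the common rule of thumb of about 0.75 English words per token, and the names `approx_words` and `fits_in_context` are made up for illustration. The real ratio varies with the tokenizer and the text.

```python
# Rule of thumb (an assumption): one token is about 0.75 English words.
WORDS_PER_TOKEN = 0.75


def approx_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return round(tokens * words_per_token)


def fits_in_context(document_words: int, context_tokens: int) -> bool:
    """Rough check of whether a document of this length fits in the window."""
    return document_words <= approx_words(context_tokens)


if __name__ == "__main__":
    print(approx_words(100_000))             # estimate for a 100,000-token window
    print(fits_in_context(60_000, 100_000))  # a 60,000-word report
    print(fits_in_context(60_000, 4_000))    # the same report, 4,000-token window
```

With these assumptions, a 100,000-token window comes out at about 75,000 words, which matches the figure Anthropic quotes.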

Claude 2's distinguishing feature is Constitutional AI, Anthropic's training approach, which includes a reinforcement learning stage driven by AI-generated feedback. Anthropic says this improves safety and reliability, with the goal of responses that are helpful, harmless, and honest. Anthropic reported that, in its internal red-teaming evaluation, Claude 2 was twice as good at giving harmless responses as its predecessor, Claude 1.3. Precise head-to-head figures, such as an accuracy above 99% for Claude 2 or an error rate of 5-10% for ChatGPT, are not backed by a published benchmark and should be read as informal impressions.

What is Constitutional AI?

Constitutional AI trains Claude 2 to behave according to a "constitution" written by its designers at Anthropic. The constitution is not a library of neural network modules that block outputs at run time; it is a list of principles written in plain language that is used while the model is being trained. Training has two phases. In the supervised phase, the model drafts a response to a challenging prompt, critiques its own draft against a principle drawn from the constitution, and rewrites the draft; the model is then fine-tuned on the revised responses. In the reinforcement learning phase, AI-generated feedback based on the same principles replaces most human labels for harmlessness.

The constitution sets guidelines for providing helpful, honest, and harmless information. The principles steer the model away from harmful responses, while training it to decline inappropriate requests politely and explain its objection rather than being evasive. Because the principles are written down, the boundaries the model is trained toward can be read and debated, which Anthropic presents as an advantage in transparency.
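As published by Anthropic, the supervised phase of Constitutional AI is a critique-and-revision loop: draft, critique against a principle, rewrite. The sketch below shows only the shape of that loop. Everything in it is an assumption made for illustration: `Model` is a hypothetical stand-in for a language model call, `toy_model` returns canned text so the example runs, the prompt wording is invented, and the two principles are paraphrases rather than Anthropic's actual constitution. It also applies every principle in turn, whereas the published method samples principles at random.

```python
from typing import Callable, List

# Hypothetical stand-in for a language model call: prompt text in, text out.
Model = Callable[[str], str]

# Illustrative principles only; not Anthropic's actual constitution.
PRINCIPLES = [
    "Choose the response that is least likely to cause harm.",
    "Choose the response that is most honest and helpful.",
]


def critique_and_revise(model: Model, prompt: str, principles: List[str]) -> str:
    """Draft a response, then critique and rewrite it once per principle."""
    response = model(f"User: {prompt}\nAssistant:")
    for principle in principles:
        critique = model(
            f"Response: {response}\n"
            f"Critique this response using the principle: {principle}"
        )
        response = model(
            f"Response: {response}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )
    return response


def toy_model(text: str) -> str:
    """Canned stub so the sketch runs without a real model."""
    if text.startswith("User:"):
        return "draft answer"
    if "Rewrite the response" in text:
        return "revised answer"
    return "the draft could be safer"


if __name__ == "__main__":
    revised = critique_and_revise(toy_model, "How do I pick a lock?", PRINCIPLES)
    # In the real method, (prompt, revised) pairs become fine-tuning data.
    print(revised)
```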

What is Reinforcement Learning from AI Feedback?

The reward stage of Constitutional AI, sometimes loosely called a "constitutional reward" technique, is known in Anthropic's publications as reinforcement learning from AI feedback (RLAIF). It builds on the supervised phase by further optimizing the training process. The model generates pairs of responses to a large set of prompts that might challenge its integrity. A feedback model is then asked which response in each pair better satisfies a principle from the constitution. These AI-generated comparisons are used to train a separate preference model.

The preference model produces reward signals that feed back into the reinforcement learning loop. This steers optimization toward mitigating identified risks while remaining helpful to users, and it captures nuances that are hard to spell out as fixed rules. It also helps with transparency: the principles are readable, and the feedback model can be asked to reason step by step, so it is easier to see why one response was judged wiser or more ethical than another.
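In Anthropic's published description of Constitutional AI, the reward signal comes from AI-labelled comparisons: a feedback model picks the better of two responses under a written principle, and those choices train a preference model. The sketch below covers only the labelling step. It is a minimal illustration under stated assumptions: `Judge` is a hypothetical interface, `toy_judge` is a keyword check standing in for a real feedback model, and the output record format is invented.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical judge: (principle, prompt, response_a, response_b) -> "A" or "B".
Judge = Callable[[str, str, str, str], str]


def label_preferences(
    judge: Judge,
    principle: str,
    samples: List[Tuple[str, str, str]],
) -> List[Dict[str, str]]:
    """Turn (prompt, response_a, response_b) triples into chosen/rejected records."""
    records = []
    for prompt, response_a, response_b in samples:
        choice = judge(principle, prompt, response_a, response_b)
        if choice not in ("A", "B"):
            raise ValueError(f"judge returned {choice!r}, expected 'A' or 'B'")
        chosen, rejected = (
            (response_a, response_b) if choice == "A" else (response_b, response_a)
        )
        records.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return records


def toy_judge(principle: str, prompt: str, a: str, b: str) -> str:
    """Keyword stub: prefer the response that avoids a flagged word."""
    flagged = "stupid"
    return "B" if flagged in a.lower() and flagged not in b.lower() else "A"


if __name__ == "__main__":
    samples = [
        ("Can you help me?", "That is a stupid question.", "Happy to help with that."),
    ]
    principle = "Choose the response that is more polite and helpful."
    for record in label_preferences(toy_judge, principle, samples):
        print(record["chosen"])
```

Records of this shape are what a preference model would be trained on; its scores then serve as the reward during reinforcement learning.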

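Useful Claude 2 Metrics

The figures below are informal estimates. Anthropic has not published throughput, latency, hardware, or training-cost numbers for Claude 2, so treat them as unverified.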

- Claude 2 is estimated to generate approximately 300 tokens (roughly 225 words) per second on a modern GPU. This would enable rapid response times for conversational queries.

- Its average query response latency is under 500 milliseconds, allowing for smooth and natural dialogue flow.

- The model is optimized to run efficiently on commercially available GPUs like the Nvidia A100. On this hardware, it can process over 10 queries per second concurrently.

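- Claude 2 is claimed to require only 50 GPU hours to train, which would improve sustainability. This figure is implausibly low for a large language model, and Anthropic has not disclosed its training compute.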

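- On the SuperGLUE natural language understanding benchmark, Claude 2 reportedly achieved a 94% score while running up to 24x faster than GPT-3. Anthropic's own published evaluations for Claude 2 instead cite results such as 71.2% on the Codex HumanEval coding test and 88.0% on GSM8k grade-school math problems.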

- Cloud-deployed versions of Claude 2 scale to handle over 100,000 users simultaneously.
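The throughput and latency figures in the list above can be cross-checked against each other with simple arithmetic. The sketch below takes the list's numbers as given (they are not verified) and ignores network and queueing delays; the function names are made up for illustration.

```python
def generation_seconds(reply_tokens: int, tokens_per_second: float) -> float:
    """Time to generate a reply at a constant token throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return reply_tokens / tokens_per_second


def max_tokens_within(budget_seconds: float, tokens_per_second: float) -> int:
    """Largest reply, in tokens, that fits inside a latency budget."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return int(budget_seconds * tokens_per_second)


if __name__ == "__main__":
    print(generation_seconds(150, 300))  # seconds for a 150-token reply
    print(max_tokens_within(0.5, 300))   # tokens that fit in 500 milliseconds
```

At 300 tokens per second, a 500-millisecond budget covers a reply of about 150 tokens, so the two figures are only consistent for short answers.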

In summary, Claude 2 is a welcome addition to the growing family of LLMs, with some distinct strengths over other models. Best of all, Claude 2 can be tried for free through the claude.ai chat interface (available in the US and UK at launch), while API access is paid.
