This guide covers what you actually need to get started with the Claude API. It assumes you have Python 3.9+ or Node.js 18+ installed and can install packages. It does not assume prior experience with LLM APIs.

Prerequisites

  1. An Anthropic account with API access — create one at console.anthropic.com
  2. An API key from the Anthropic console
  3. Python 3.9+ or Node.js 18+

Installation

Python:

pip install anthropic

TypeScript / Node.js:

npm install @anthropic-ai/sdk

Authentication

Keep your API key in an environment variable. Never hardcode it, never commit it to source control.

export ANTHROPIC_API_KEY="sk-ant-..."

Or use a .env file with python-dotenv (Python) or dotenv (Node.js).
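Under the hood, these loaders just read KEY=VALUE pairs from a file into the process environment. A minimal stdlib sketch of that behavior (the parsing rules here are deliberately simplified; use python-dotenv in practice):

```python
import os

def load_env_file(path=".env"):
    """Simplified sketch of what python-dotenv does: read KEY=VALUE
    lines from a file into os.environ, skipping comments and blanks."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Existing environment variables win, matching dotenv's default
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

With this in place (or with python-dotenv's load_dotenv()), the SDK clients below pick up ANTHROPIC_API_KEY automatically.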

First Call: Basic Completion

Python:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from environment

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what a transformer model is in two paragraphs."}
    ]
)

print(message.content[0].text)
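Note that message.content is a list of content blocks, not a string; indexing [0] and reading .text assumes the first block is a text block, which holds for simple text-only requests like this one.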

TypeScript:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from environment

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Explain what a transformer model is in two paragraphs." },
  ],
});

// content blocks are a typed union, so narrow before reading .text
const firstBlock = message.content[0];
if (firstBlock.type === "text") {
  console.log(firstBlock.text);
}

The response object contains message.content (array of content blocks), message.usage (input and output token counts), and message.stop_reason (why generation stopped).

System Prompts

The system parameter sets context for the conversation. Use it to define the model's role, constraints, and output format.

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a technical writer. Respond concisely and use code examples where relevant. Do not use marketing language.",
    messages=[
        {"role": "user", "content": "What is a vector embedding?"}
    ]
)

Multi-Turn Conversations

The messages array holds the conversation history. The API is stateless — each call must include the full history you want the model to consider.

messages = [
    {"role": "user", "content": "What is RAG in the context of LLMs?"},
]

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=messages
)

# Add the reply to history and continue
messages.append({"role": "assistant", "content": response.content[0].text})
messages.append({"role": "user", "content": "What are the main failure modes?"})

response2 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=messages
)

Streaming Responses

For interactive applications, streaming delivers tokens as they are generated.

Python:

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain attention mechanisms."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Getting Structured JSON Output

import json

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a data extraction assistant. Always respond with valid JSON only. No explanation, no markdown fencing.",
    messages=[
        {
            "role": "user",
            "content": "Extract the key claims from this text as JSON with keys: claims (array), confidence (high/medium/low).\n\nText: The model achieves 94.2% accuracy on MMLU, outperforming the previous generation by 8 percentage points."
        }
    ]
)

data = json.loads(message.content[0].text)
print(data)
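Even with a strict system prompt, models occasionally wrap JSON in markdown fences. A defensive parser costs little; this is a sketch, and parse_json_reply is a name invented for illustration:

```python
import json

def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating accidental ```json fences."""
    text = text.strip()
    if text.startswith("```"):
        text = text.split("\n", 1)[1]    # drop the opening fence line
        text = text.rsplit("```", 1)[0]  # drop the closing fence
    return json.loads(text)
```

Replacing the bare json.loads(message.content[0].text) with parse_json_reply(message.content[0].text) makes the extraction step tolerant of this common formatting slip.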

Error Handling

try:
    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except anthropic.APIConnectionError as e:
    print(f"Connection failed: {e}")
except anthropic.RateLimitError as e:
    print(f"Rate limit hit, retry with backoff: {e}")
except anthropic.APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")

For production, implement retry logic with exponential backoff for rate limit errors (429) and transient server errors (5xx).
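That advice can be sketched as a small wrapper. The retryable exception tuple below is a placeholder; in real code you would pass the SDK's rate-limit and server-error exception classes instead of Exception:

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0,
                 retryable=(Exception,)):
    """Retry `call` with exponential backoff plus jitter.
    `retryable` is a placeholder; pass e.g. anthropic.RateLimitError
    in real code so non-transient errors fail fast."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # delays of base, 2x base, 4x base, ... plus random jitter
            time.sleep(base_delay * 2 ** attempt +
                       random.uniform(0, base_delay))
```

A call site would wrap the API call in a lambda, for example with_retries(lambda: client.messages.create(...)).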

Model Selection

Model                        Context       Best for
claude-opus-4-7              200K tokens   Complex reasoning, long-form analysis
claude-sonnet-4-6            200K tokens   Balanced capability and speed
claude-haiku-4-5-20251001    200K tokens   Fast, cost-efficient tasks

Use Sonnet for most production workloads. Use Haiku where latency and cost matter more than maximum capability. Use Opus for tasks requiring the highest reasoning capability.

Next Steps

Once you have basic completions working, explore: tool use (function calling), which lets Claude invoke functions in your code; prompt caching, which reduces latency and cost on repeated calls; and the Batch API, which processes large volumes asynchronously at reduced cost.

The official documentation at docs.anthropic.com covers all of these in detail. This guide is a starting point; the documentation is the authoritative reference.