Perplexity API Setup Guide: API Key, Sonar Models & Python/Node.js Integration

Perplexity API Setup Guide for Developers

Perplexity AI offers a powerful API that combines large language models with real-time web search, enabling developers to build research automation workflows with grounded, citation-backed responses. This guide walks you through API key generation, model selection, search parameter configuration, and full integration into Python and Node.js projects.

Step 1: Create Your Perplexity API Account

  • Visit perplexity.ai and sign in or create an account.
  • Navigate to Settings → API or go directly to the API settings page at perplexity.ai/settings/api.
  • Add a payment method. The Perplexity API uses a pay-per-use billing model separate from Perplexity Pro subscriptions.
  • Click Generate API Key. Copy and store the key securely — it will only be shown once.

Set the key as an environment variable to keep it out of your source code:

```bash
# Linux / macOS
export PERPLEXITY_API_KEY="YOUR_API_KEY"
```

Windows PowerShell:

```powershell
$env:PERPLEXITY_API_KEY="YOUR_API_KEY"
```
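Before making any API calls, it helps to fail fast if the variable is missing. A minimal sketch (the helper name `load_api_key` is ours, not part of any SDK):

```python
import os

def load_api_key(env_var="PERPLEXITY_API_KEY"):
    """Read the API key from the environment, failing fast if it is absent."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key
```

Calling this once at startup gives a clear error message instead of a confusing 401 later.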

Step 2: Understand Sonar Model Options

Perplexity offers several models under the **Sonar** family. Each is optimized for different use cases:

| Model ID | Context Window | Best For | Search |
|---|---|---|---|
| sonar | 128k tokens | General research queries, summarization | Yes |
| sonar-pro | 200k tokens | Complex multi-step research, deep analysis | Yes |
| sonar-reasoning | 128k tokens | Chain-of-thought reasoning with citations | Yes |
| sonar-reasoning-pro | 128k tokens | Advanced reasoning tasks requiring web data | Yes |
| sonar-deep-research | 128k tokens | Exhaustive multi-query research reports | Yes |
All Sonar models automatically perform web searches and return citations. Choose sonar for speed and cost-efficiency, or sonar-pro when you need higher accuracy on complex queries.
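The choice between models can be captured in a small helper. This is a hypothetical heuristic of ours (only the model IDs come from the table above), a sketch you would tune to your own cost and accuracy needs:

```python
def pick_model(needs_reasoning=False, deep_report=False, complex_query=False):
    """Pick a Sonar model by task profile (illustrative heuristic only)."""
    if deep_report:
        return "sonar-deep-research"   # exhaustive multi-query reports
    if needs_reasoning:
        return "sonar-reasoning"       # chain-of-thought with citations
    if complex_query:
        return "sonar-pro"             # higher accuracy, larger context
    return "sonar"                     # fast and cost-efficient default
```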

Step 3: Configure Search Context Parameters

The Perplexity API is compatible with the OpenAI Chat Completions format but adds search-specific parameters:

  • search_domain_filter — Restrict results to specific domains (e.g., ["wikipedia.org", "arxiv.org"]). Up to 3 domains.
  • search_recency_filter — Limit results by time: day, week, month, or year.
  • return_related_questions — Return suggested follow-up questions (true/false).
  • search_context — Provide your own context or documents for the model to reference alongside web results.
  • return_citations — Include inline citation references in the response (true by default for Sonar models).
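Put together, a request body combining these parameters might look like the sketch below. The field names match the list above; the values are illustrative only:

```python
# Illustrative request body with search-specific parameters.
payload = {
    "model": "sonar",
    "messages": [
        {"role": "user", "content": "Summarize recent small language model papers"}
    ],
    "search_domain_filter": ["arxiv.org", "huggingface.co"],  # at most 3 domains
    "search_recency_filter": "week",   # day, week, month, or year
    "return_related_questions": True,
}
```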

Step 4: Python Integration

Install the OpenAI Python SDK, which is fully compatible with Perplexity's API:

```bash
pip install openai
```

Create a research automation script:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("PERPLEXITY_API_KEY"),
    base_url="https://api.perplexity.ai"
)

def research_topic(query, model="sonar", recency=None, domains=None):
    """Run a grounded research query with optional filters."""
    extra_body = {}
    if recency:
        extra_body["search_recency_filter"] = recency
    if domains:
        extra_body["search_domain_filter"] = domains

    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a research assistant. Provide concise, well-cited answers."
            },
            {
                "role": "user",
                "content": query
            }
        ],
        extra_body=extra_body
    )
    return response.choices[0].message.content
```

Example: research recent AI developments from specific sources:

```python
result = research_topic(
    query="What are the latest advances in small language models?",
    model="sonar-pro",
    recency="week",
    domains=["arxiv.org", "huggingface.co"]
)
print(result)
```

Step 5: Node.js Integration

Install the OpenAI Node.js SDK:

```bash
npm install openai
```

Build a reusable research module:

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

async function researchTopic(query, options = {}) {
  const { model = "sonar", recency = null, domains = null } = options;

  const body = {
    model,
    messages: [
      { role: "system", content: "You are a research assistant. Cite your sources." },
      { role: "user", content: query },
    ],
  };

  if (recency) body.search_recency_filter = recency;
  if (domains) body.search_domain_filter = domains;

  const response = await client.chat.completions.create(body);
  return response.choices[0].message.content;
}
```

```javascript
// Example: batch research automation
const topics = [
  "Current market size of AI coding assistants",
  "Top open-source RAG frameworks in 2026",
];

for (const topic of topics) {
  const result = await researchTopic(topic, {
    model: "sonar-pro",
    recency: "month",
  });
  console.log(`\n--- ${topic} ---\n${result}`);
}
```

Step 6: Streaming Responses

For long research outputs, use streaming to display results incrementally:

```python
# Python streaming example
stream = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Summarize the latest EU AI Act updates"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Pro Tips

  • Use system prompts strategically — Guide the model's output format (e.g., "Answer in bullet points with inline citations") to get structured, parseable responses.
  • Chain sonar with sonar-reasoning — Use sonar for initial fact-gathering, then pass results to sonar-reasoning for synthesis and analysis in a two-step pipeline.
  • Domain filtering for accuracy — When researching technical topics, restrict to authoritative sources like arxiv.org, docs.python.org, or developer.mozilla.org to reduce noise.
  • Cache responses — Perplexity charges per request. Cache results locally (e.g., SQLite, Redis) for queries that don't require real-time freshness.
  • Monitor usage — Check your token consumption regularly in the Perplexity API dashboard to avoid unexpected charges. Set billing alerts.
  • Use sonar-deep-research for reports — This model autonomously performs multiple search passes and produces comprehensive research documents. Ideal for automated report generation, but slower and more expensive.
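The caching tip above can be sketched with Python's built-in sqlite3 module. Everything here is an assumption of ours (the table layout, the `run_query` stand-in for a real API call); it only illustrates the keyed-cache pattern:

```python
import hashlib
import sqlite3

# In-memory cache for demonstration; use a file path in practice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, answer TEXT)")

def cached_research(query, run_query):
    """Return a cached answer if present; otherwise call run_query and store it."""
    key = hashlib.sha256(query.encode()).hexdigest()
    row = conn.execute("SELECT answer FROM cache WHERE key = ?", (key,)).fetchone()
    if row:
        return row[0]
    answer = run_query(query)  # swap in your actual API call here
    conn.execute("INSERT INTO cache (key, answer) VALUES (?, ?)", (key, answer))
    return answer
```

Repeated identical queries then hit the local database instead of the paid API.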

Troubleshooting

| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Verify the key is correct and the environment variable is set. Regenerate the key if needed. |
| 429 Too Many Requests | Rate limit exceeded | Implement exponential backoff. Default limits are 50 requests/minute for most plans. |
| 400 Bad Request (model error) | Invalid model ID | Double-check the model name. Use sonar, not sonar-small or other deprecated names. |
| search_domain_filter ignored | More than 3 domains specified | Reduce the domain list to a maximum of 3 entries. |
| Empty or generic responses | Overly vague query | Add specificity to your prompt. Include time context, scope, and desired format. |
| Timeout errors | Complex queries on deep-research model | Increase the client timeout. sonar-deep-research can take 30-60+ seconds; set the timeout to at least 120 seconds. |
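The exponential-backoff fix for 429s can be sketched as a small wrapper. The broad `except Exception` is a placeholder; in real code you would catch only your client's rate-limit exception:

```python
import random
import time

def with_backoff(call, retries=5, base=1.0):
    """Retry a callable with exponential backoff plus jitter (illustrative)."""
    for attempt in range(retries):
        try:
            return call()
        except Exception:  # narrow this to your client's rate-limit error
            if attempt == retries - 1:
                raise
            delay = base * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Usage would look like `with_backoff(lambda: research_topic("..."))`, doubling the wait after each 429 before giving up.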
Frequently Asked Questions

Is the Perplexity API the same as a Perplexity Pro subscription?

No. The API and Pro subscription are billed separately. A Pro subscription gives you enhanced access to the Perplexity web and mobile apps, while the API is a pay-per-token developer service. You need to add a payment method and purchase API credits independently, even if you already have a Pro plan.

Can I use the Perplexity API with existing OpenAI SDK code?

Yes. Perplexity’s API follows the OpenAI Chat Completions format. You only need to change the base_url (or baseURL in Node.js) to https://api.perplexity.ai and swap in your Perplexity API key. Search-specific parameters like search_recency_filter are passed as additional body fields.

How do citations work in Perplexity API responses?

Sonar models return inline numbered references (e.g., [1], [2]) within the response text. The corresponding source URLs are included in the response metadata under the citations field. You can parse these programmatically to build footnotes, link lists, or verify sources in your application.
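A minimal sketch of that parsing step, assuming the response text uses `[n]` markers and you have the citations list as an ordered array of URLs (the pairing logic here is ours, not an SDK feature):

```python
import re

def map_citations(text, citations):
    """Map inline [n] markers in a response to their source URLs."""
    refs = sorted({int(n) for n in re.findall(r"\[(\d+)\]", text)})
    # Citation numbering is 1-based; ignore markers beyond the list length.
    return {n: citations[n - 1] for n in refs if n <= len(citations)}
```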
