Perplexity API Setup Guide: API Key, Sonar Models & Python/Node.js Integration

Perplexity API Setup Guide for Developers

Perplexity AI offers a powerful API that combines large language models with real-time web search, enabling developers to build research automation workflows with grounded, citation-backed responses. This guide walks you through API key generation, model selection, search parameter configuration, and full integration into Python and Node.js projects.

Step 1: Create Your Perplexity API Account

  • Visit perplexity.ai and sign in or create an account.
  • Navigate to Settings → API or go directly to the API settings page at perplexity.ai/settings/api.
  • Add a payment method. The Perplexity API uses a pay-per-use billing model separate from Perplexity Pro subscriptions.
  • Click Generate API Key. Copy and store the key securely — it will only be shown once.

Set the key as an environment variable to keep it out of your source code:

```bash
# Linux / macOS
export PERPLEXITY_API_KEY="YOUR_API_KEY"
```

Windows PowerShell:

```powershell
$env:PERPLEXITY_API_KEY="YOUR_API_KEY"
```
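Before making any API calls, it helps to fail fast if the variable is missing. A minimal sketch (the helper name `load_api_key` is ours, not part of any SDK):

```python
import os

def load_api_key(env_var="PERPLEXITY_API_KEY"):
    """Read the API key from the environment, failing fast if it is absent."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key
```

Calling this once at startup gives a clear error message instead of a confusing 401 later.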

Step 2: Understand Sonar Model Options

Perplexity offers several models under the **Sonar** family. Each is optimized for different use cases:

| Model ID | Context Window | Best For | Search |
|---|---|---|---|
| sonar | 128k tokens | General research queries, summarization | Yes |
| sonar-pro | 200k tokens | Complex multi-step research, deep analysis | Yes |
| sonar-reasoning | 128k tokens | Chain-of-thought reasoning with citations | Yes |
| sonar-reasoning-pro | 128k tokens | Advanced reasoning tasks requiring web data | Yes |
| sonar-deep-research | 128k tokens | Exhaustive multi-query research reports | Yes |
All Sonar models automatically perform web searches and return citations. Choose sonar for speed and cost-efficiency, or sonar-pro when you need higher accuracy on complex queries.
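The choice between models can be captured in a small helper. This is a hypothetical heuristic of ours (only the model IDs come from the table above), a sketch you would tune to your own cost and accuracy needs:

```python
def pick_model(needs_reasoning=False, deep_report=False, complex_query=False):
    """Pick a Sonar model by task profile (illustrative heuristic only)."""
    if deep_report:
        return "sonar-deep-research"   # exhaustive multi-query reports
    if needs_reasoning:
        return "sonar-reasoning"       # chain-of-thought with citations
    if complex_query:
        return "sonar-pro"             # higher accuracy, larger context
    return "sonar"                     # fast and cost-efficient default
```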

Step 3: Configure Search Context Parameters

The Perplexity API is compatible with the OpenAI Chat Completions format but adds search-specific parameters:

  • search_domain_filter — Restrict results to specific domains (e.g., ["wikipedia.org", "arxiv.org"]). Up to 3 domains.
  • search_recency_filter — Limit results by time: day, week, month, or year.
  • return_related_questions — Return suggested follow-up questions (true/false).
  • search_context — Provide your own context or documents for the model to reference alongside web results.
  • return_citations — Include inline citation references in the response (true by default for Sonar models).
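Put together, a request body combining these parameters might look like the sketch below. The field names match the list above; the values are illustrative only:

```python
# Illustrative request body with search-specific parameters.
payload = {
    "model": "sonar",
    "messages": [
        {"role": "user", "content": "Summarize recent small language model papers"}
    ],
    "search_domain_filter": ["arxiv.org", "huggingface.co"],  # at most 3 domains
    "search_recency_filter": "week",   # day, week, month, or year
    "return_related_questions": True,
}
```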

Step 4: Python Integration

Install the OpenAI Python SDK, which is fully compatible with Perplexity's API:

```bash
pip install openai
```

Create a research automation script:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("PERPLEXITY_API_KEY"),
    base_url="https://api.perplexity.ai"
)

def research_topic(query, model="sonar", recency=None, domains=None):
    """Run a grounded research query with optional filters."""
    extra_body = {}
    if recency:
        extra_body["search_recency_filter"] = recency
    if domains:
        extra_body["search_domain_filter"] = domains

    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a research assistant. Provide concise, well-cited answers."
            },
            {
                "role": "user",
                "content": query
            }
        ],
        extra_body=extra_body
    )
    return response.choices[0].message.content
```

Example: research recent AI developments from specific sources:

```python
result = research_topic(
    query="What are the latest advances in small language models?",
    model="sonar-pro",
    recency="week",
    domains=["arxiv.org", "huggingface.co"]
)
print(result)
```

Step 5: Node.js Integration

Install the OpenAI Node.js SDK:

```bash
npm install openai
```

Build a reusable research module:

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

async function researchTopic(query, options = {}) {
  const { model = "sonar", recency = null, domains = null } = options;

  const body = {
    model,
    messages: [
      { role: "system", content: "You are a research assistant. Cite your sources." },
      { role: "user", content: query },
    ],
  };

  if (recency) body.search_recency_filter = recency;
  if (domains) body.search_domain_filter = domains;

  const response = await client.chat.completions.create(body);
  return response.choices[0].message.content;
}
```

```javascript
// Example: batch research automation
const topics = [
  "Current market size of AI coding assistants",
  "Top open-source RAG frameworks in 2026",
];

for (const topic of topics) {
  const result = await researchTopic(topic, {
    model: "sonar-pro",
    recency: "month",
  });
  console.log(`\n--- ${topic} ---\n${result}`);
}
```

Step 6: Streaming Responses

For long research outputs, use streaming to display results incrementally:

```python
# Python streaming example
stream = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Summarize the latest EU AI Act updates"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Pro Tips

  • Use system prompts strategically — Guide the model's output format (e.g., "Answer in bullet points with inline citations") to get structured, parseable responses.
  • Chain sonar with sonar-reasoning — Use sonar for initial fact-gathering, then pass results to sonar-reasoning for synthesis and analysis in a two-step pipeline.
  • Domain filtering for accuracy — When researching technical topics, restrict to authoritative sources like arxiv.org, docs.python.org, or developer.mozilla.org to reduce noise.
  • Cache responses — Perplexity charges per request. Cache results locally (e.g., SQLite, Redis) for queries that don't require real-time freshness.
  • Monitor usage — Check your token consumption regularly in the Perplexity API dashboard to avoid unexpected charges. Set billing alerts.
  • Use sonar-deep-research for reports — This model autonomously performs multiple search passes and produces comprehensive research documents. Ideal for automated report generation, but slower and more expensive.
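The caching tip above can be sketched with Python's built-in sqlite3 module. Everything here is an assumption of ours (the table layout, the `run_query` stand-in for a real API call); it only illustrates the keyed-cache pattern:

```python
import hashlib
import sqlite3

# In-memory cache for demonstration; use a file path in practice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, answer TEXT)")

def cached_research(query, run_query):
    """Return a cached answer if present; otherwise call run_query and store it."""
    key = hashlib.sha256(query.encode()).hexdigest()
    row = conn.execute("SELECT answer FROM cache WHERE key = ?", (key,)).fetchone()
    if row:
        return row[0]
    answer = run_query(query)  # swap in your actual API call here
    conn.execute("INSERT INTO cache (key, answer) VALUES (?, ?)", (key, answer))
    return answer
```

Repeated identical queries then hit the local database instead of the paid API.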

Troubleshooting

| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Verify the key is correct and the environment variable is set. Regenerate the key if needed. |
| 429 Too Many Requests | Rate limit exceeded | Implement exponential backoff. Default limits are 50 requests/minute for most plans. |
| 400 Bad Request (model error) | Invalid model ID | Double-check the model name. Use sonar, not sonar-small or other deprecated names. |
| search_domain_filter ignored | More than 3 domains specified | Reduce the domain list to a maximum of 3 entries. |
| Empty or generic responses | Overly vague query | Add specificity to your prompt. Include time context, scope, and desired format. |
| Timeout errors | Complex queries on deep-research model | Increase the client timeout. sonar-deep-research can take 30-60+ seconds; set the timeout to at least 120 seconds. |
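The exponential-backoff fix for 429s can be sketched as a small wrapper. The broad `except Exception` is a placeholder; in real code you would catch only your client's rate-limit exception:

```python
import random
import time

def with_backoff(call, retries=5, base=1.0):
    """Retry a callable with exponential backoff plus jitter (illustrative)."""
    for attempt in range(retries):
        try:
            return call()
        except Exception:  # narrow this to your client's rate-limit error
            if attempt == retries - 1:
                raise
            delay = base * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Usage would look like `with_backoff(lambda: research_topic("..."))`, doubling the wait after each 429 before giving up.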
Frequently Asked Questions

Is the Perplexity API the same as a Perplexity Pro subscription?

No. The API and Pro subscription are billed separately. A Pro subscription gives you enhanced access to the Perplexity web and mobile apps, while the API is a pay-per-token developer service. You need to add a payment method and purchase API credits independently, even if you already have a Pro plan.

Can I use the Perplexity API with existing OpenAI SDK code?

Yes. Perplexity’s API follows the OpenAI Chat Completions format. You only need to change the base_url (or baseURL in Node.js) to https://api.perplexity.ai and swap in your Perplexity API key. Search-specific parameters like search_recency_filter are passed as additional body fields.

How do citations work in Perplexity API responses?

Sonar models return inline numbered references (e.g., [1], [2]) within the response text. The corresponding source URLs are included in the response metadata under the citations field. You can parse these programmatically to build footnotes, link lists, or verify sources in your application.
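A minimal sketch of that parsing step, assuming the response text uses `[n]` markers and you have the citations list as an ordered array of URLs (the pairing logic here is ours, not an SDK feature):

```python
import re

def map_citations(text, citations):
    """Map inline [n] markers in a response to their source URLs."""
    refs = sorted({int(n) for n in re.findall(r"\[(\d+)\]", text)})
    # Citation numbering is 1-based; ignore markers beyond the list length.
    return {n: citations[n - 1] for n in refs if n <= len(citations)}
```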
