Perplexity API Setup Guide: API Key, Sonar Models & Python/Node.js Integration

Perplexity AI offers a powerful API that combines large language models with real-time web search, enabling developers to build research automation workflows with grounded, citation-backed responses. This guide walks you through API key generation, model selection, search parameter configuration, and full integration into Python and Node.js projects.

Step 1: Create Your Perplexity API Account

  • Visit perplexity.ai and sign in or create an account.
  • Navigate to Settings → API, or go directly to the API settings page at perplexity.ai/settings/api.
  • Add a payment method. The Perplexity API uses pay-per-use billing, separate from Perplexity Pro subscriptions.
  • Click Generate API Key. Copy and store the key securely; it is shown only once.

Set the key as an environment variable to keep it out of your source code:

```bash
# Linux / macOS
export PERPLEXITY_API_KEY="YOUR_API_KEY"
```

```powershell
# Windows PowerShell
$env:PERPLEXITY_API_KEY="YOUR_API_KEY"
```
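Before writing any API calls, it can save debugging time to confirm the key is actually visible to your process (an unset variable is the most common cause of the 401 errors covered in Troubleshooting below). This is a plain-Python sanity check, not part of any SDK:

```python
import os

# Check that the key is visible to this process before making API calls.
key = os.environ.get("PERPLEXITY_API_KEY")
if key:
    print(f"PERPLEXITY_API_KEY loaded ({len(key)} characters)")
else:
    print("PERPLEXITY_API_KEY is not set; export it before calling the API")
```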

Step 2: Understand Sonar Model Options

Perplexity offers several models under the **Sonar** family. Each is optimized for different use cases:

| Model ID | Context Window | Best For | Search |
| --- | --- | --- | --- |
| sonar | 128k tokens | General research queries, summarization | Yes |
| sonar-pro | 200k tokens | Complex multi-step research, deep analysis | Yes |
| sonar-reasoning | 128k tokens | Chain-of-thought reasoning with citations | Yes |
| sonar-reasoning-pro | 128k tokens | Advanced reasoning tasks requiring web data | Yes |
| sonar-deep-research | 128k tokens | Exhaustive multi-query research reports | Yes |

All Sonar models automatically perform web searches and return citations. Choose sonar for speed and cost-efficiency, or sonar-pro when you need higher accuracy on complex queries.
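The model table above can be encoded as a small lookup helper so the rest of your code never hard-codes model IDs. The task labels here are our own illustrative categories, not Perplexity API concepts:

```python
# Hypothetical helper mapping task categories to Sonar model IDs,
# following the model table above. Labels are illustrative only.
def pick_sonar_model(task: str) -> str:
    mapping = {
        "general": "sonar",
        "complex": "sonar-pro",
        "reasoning": "sonar-reasoning",
        "deep_report": "sonar-deep-research",
    }
    # Default to the cheapest, fastest model for unknown task types.
    return mapping.get(task, "sonar")
```

For example, `pick_sonar_model("deep_report")` returns `"sonar-deep-research"`, and any unrecognized label falls back to `"sonar"`.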

Step 3: Configure Search Context Parameters

The Perplexity API is compatible with the OpenAI Chat Completions format but adds search-specific parameters:

  • search_domain_filter — Restrict results to specific domains (e.g., ["wikipedia.org", "arxiv.org"]). Up to 3 domains.
  • search_recency_filter — Limit results by time: day, week, month, or year.
  • return_related_questions — Return suggested follow-up questions (true/false).
  • search_context — Provide your own context or documents for the model to reference alongside web results.
  • return_citations — Include inline citation references in the response (true by default for Sonar models).
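Because the API silently ignores an over-long domain list, it can help to validate these fields before sending a request. This helper is a sketch based on the limits listed above; the function name is ours, not part of any SDK:

```python
# Assemble Perplexity-specific body fields for a Chat Completions call,
# validating them against the documented limits above. Illustrative helper.
def build_search_options(recency=None, domains=None, related=False):
    opts = {}
    if recency:
        if recency not in {"day", "week", "month", "year"}:
            raise ValueError(f"invalid recency filter: {recency}")
        opts["search_recency_filter"] = recency
    if domains:
        if len(domains) > 3:
            raise ValueError("search_domain_filter accepts at most 3 domains")
        opts["search_domain_filter"] = list(domains)
    if related:
        opts["return_related_questions"] = True
    return opts
```

The returned dict can be passed straight through as the `extra_body` argument shown in the Python integration below.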

Step 4: Python Integration

Install the OpenAI Python SDK, which is fully compatible with Perplexity's API:

```bash
pip install openai
```

Create a research automation script:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("PERPLEXITY_API_KEY"),
    base_url="https://api.perplexity.ai"
)

def research_topic(query, model="sonar", recency=None, domains=None):
    """Run a grounded research query with optional filters."""
    extra_body = {}
    if recency:
        extra_body["search_recency_filter"] = recency
    if domains:
        extra_body["search_domain_filter"] = domains

    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a research assistant. Provide concise, well-cited answers."
            },
            {
                "role": "user",
                "content": query
            }
        ],
        extra_body=extra_body
    )
    return response.choices[0].message.content
```

Example: research recent AI developments from specific sources:

```python
result = research_topic(
    query="What are the latest advances in small language models?",
    model="sonar-pro",
    recency="week",
    domains=["arxiv.org", "huggingface.co"]
)
print(result)
```

Step 5: Node.js Integration

Install the OpenAI Node.js SDK:

```bash
npm install openai
```

Build a reusable research module:

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

async function researchTopic(query, options = {}) {
  const { model = "sonar", recency = null, domains = null } = options;

  const body = {
    model,
    messages: [
      { role: "system", content: "You are a research assistant. Cite your sources." },
      { role: "user", content: query },
    ],
  };

  if (recency) body.search_recency_filter = recency;
  if (domains) body.search_domain_filter = domains;

  const response = await client.chat.completions.create(body);
  return response.choices[0].message.content;
}
```

```javascript
// Example: batch research automation
const topics = [
  "Current market size of AI coding assistants",
  "Top open-source RAG frameworks in 2026",
];

for (const topic of topics) {
  const result = await researchTopic(topic, {
    model: "sonar-pro",
    recency: "month",
  });
  console.log(`\n--- ${topic} ---\n${result}`);
}
```

Step 6: Streaming Responses

For long research outputs, use streaming to display results incrementally:

```python
# Python streaming example
stream = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Summarize the latest EU AI Act updates"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Pro Tips

  • Use system prompts strategically — Guide the model's output format (e.g., "Answer in bullet points with inline citations") to get structured, parseable responses.
  • Chain sonar with sonar-reasoning — Use sonar for initial fact-gathering, then pass results to sonar-reasoning for synthesis and analysis in a two-step pipeline.
  • Domain filtering for accuracy — When researching technical topics, restrict to authoritative sources like arxiv.org, docs.python.org, or developer.mozilla.org to reduce noise.
  • Cache responses — Perplexity charges per request. Cache results locally (e.g., SQLite, Redis) for queries that don't require real-time freshness.
  • Monitor usage — Check token consumption regularly in the Perplexity API dashboard and set billing alerts to avoid unexpected charges.
  • Use sonar-deep-research for reports — This model autonomously performs multiple search passes and produces comprehensive research documents. Ideal for automated report generation, but slower and more expensive.
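The caching tip above can be sketched with the standard-library sqlite3 module. The class name, schema, key scheme, and 24-hour TTL are illustrative choices, not anything prescribed by the API:

```python
import hashlib
import sqlite3
import time

class ResearchCache:
    """Minimal local cache for research results to avoid re-billing
    identical queries. Illustrative sketch using only the stdlib."""

    def __init__(self, path=":memory:", ttl_seconds=86400):
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT, ts REAL)"
        )

    def _key(self, query, model):
        # Key on model + query so the same question against different
        # Sonar models is cached separately.
        return hashlib.sha256(f"{model}:{query}".encode()).hexdigest()

    def get(self, query, model):
        row = self.db.execute(
            "SELECT value, ts FROM cache WHERE key = ?", (self._key(query, model),)
        ).fetchone()
        if row and time.time() - row[1] < self.ttl:
            return row[0]
        return None  # miss or expired

    def put(self, query, model, result):
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (self._key(query, model), result, time.time()),
        )
        self.db.commit()
```

In a wrapper around `research_topic`, call `get` first and only hit the API (then `put`) on a cache miss; pass a file path instead of `:memory:` to persist across runs.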

Troubleshooting

| Error | Cause | Solution |
| --- | --- | --- |
| 401 Unauthorized | Invalid or missing API key | Verify the key is correct and the environment variable is set. Regenerate the key if needed. |
| 429 Too Many Requests | Rate limit exceeded | Implement exponential backoff. Default limits are 50 requests/minute for most plans. |
| 400 Bad Request (model error) | Invalid model ID | Double-check the model name. Use sonar, not sonar-small or other deprecated names. |
| search_domain_filter ignored | More than 3 domains specified | Reduce the domain list to a maximum of 3 entries. |
| Empty or generic responses | Overly vague query | Add specificity to your prompt: include time context, scope, and desired format. |
| Timeout errors | Complex queries on the deep-research model | Increase the client timeout. sonar-deep-research can take 30-60+ seconds; set the timeout to at least 120 seconds. |

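For the 429 and timeout rows above, a generic retry wrapper with exponential backoff and jitter looks like the sketch below. Catching bare Exception is for illustration only; in real code you would catch your HTTP client's specific rate-limit and timeout exception types:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on failure with exponential backoff plus jitter.
    Illustrative sketch: narrow the except clause to your client's
    rate-limit/timeout exceptions in production."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

Usage: `with_backoff(lambda: research_topic("..."), max_retries=5)` retries up to four times before re-raising the last error.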
Frequently Asked Questions

Is the Perplexity API the same as a Perplexity Pro subscription?

No. The API and Pro subscription are billed separately. A Pro subscription gives you enhanced access to the Perplexity web and mobile apps, while the API is a pay-per-token developer service. You need to add a payment method and purchase API credits independently, even if you already have a Pro plan.

Can I use the Perplexity API with existing OpenAI SDK code?

Yes. Perplexity’s API follows the OpenAI Chat Completions format. You only need to change the base_url (or baseURL in Node.js) to https://api.perplexity.ai and swap in your Perplexity API key. Search-specific parameters like search_recency_filter are passed as additional body fields.

How do citations work in Perplexity API responses?

Sonar models return inline numbered references (e.g., [1], [2]) within the response text. The corresponding source URLs are included in the response metadata under the citations field. You can parse these programmatically to build footnotes, link lists, or verify sources in your application.
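Assuming the response shape described above (inline [n] markers in the text plus an ordered list of source URLs), a small parser can pair them into footnotes. The function name and output format are our own:

```python
import re

def footnotes(text, citations):
    """Pair inline [n] markers in a response with the ordered citations
    URL list, returning one footnote line per cited source. Sketch only."""
    used = sorted({int(m) for m in re.findall(r"\[(\d+)\]", text)})
    notes = []
    for n in used:
        if 1 <= n <= len(citations):  # ignore markers with no matching URL
            notes.append(f"[{n}] {citations[n - 1]}")
    return notes
```

For example, a response ending "…models are improving rapidly [1][2]." with two citation URLs yields a two-line footnote list suitable for appending below the answer.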
