Perplexity API Setup Guide: API Key, Sonar Models & Python/Node.js Integration
Perplexity AI offers a powerful API that combines large language models with real-time web search, enabling developers to build research automation workflows with grounded, citation-backed responses. This guide walks you through API key generation, model selection, search parameter configuration, and full integration into Python and Node.js projects.
Step 1: Create Your Perplexity API Account
- Visit perplexity.ai and sign in or create an account.
- Navigate to Settings → API, or go directly to the API settings page at perplexity.ai/settings/api.
- Add a payment method. The Perplexity API uses a pay-per-use billing model separate from Perplexity Pro subscriptions.
- Click Generate API Key. Copy and store the key securely; it will only be shown once.

Set the key as an environment variable to keep it out of your source code:

```bash
# Linux / macOS
export PERPLEXITY_API_KEY="YOUR_API_KEY"
```

```powershell
# Windows PowerShell
$env:PERPLEXITY_API_KEY="YOUR_API_KEY"
```
Step 2: Understand Sonar Model Options
Perplexity offers several models under the **Sonar** family. Each is optimized for different use cases:
| Model ID | Context Window | Best For | Search |
|---|---|---|---|
| `sonar` | 128k tokens | General research queries, summarization | Yes |
| `sonar-pro` | 200k tokens | Complex multi-step research, deep analysis | Yes |
| `sonar-reasoning` | 128k tokens | Chain-of-thought reasoning with citations | Yes |
| `sonar-reasoning-pro` | 128k tokens | Advanced reasoning tasks requiring web data | Yes |
| `sonar-deep-research` | 128k tokens | Exhaustive multi-query research reports | Yes |
Use `sonar` for speed and cost-efficiency, or `sonar-pro` when you need higher accuracy on complex queries.
Step 3: Configure Search Context Parameters
The Perplexity API is compatible with the OpenAI Chat Completions format but adds search-specific parameters:
- `search_domain_filter` — Restrict results to specific domains (e.g., `["wikipedia.org", "arxiv.org"]`). Up to 3 domains.
- `search_recency_filter` — Limit results by time: `day`, `week`, `month`, or `year`.
- `return_related_questions` — Return suggested follow-up questions (`true`/`false`).
- `search_context` — Provide your own context or documents for the model to reference alongside web results.
- `return_citations` — Include inline citation references in the response (`true` by default for Sonar models).
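Because these are plain top-level fields in the request body, you can exercise them without any SDK. The sketch below, using only Python's standard library, assembles a Chat Completions payload with the search parameters above; the `build_payload` and `run_query` helper names are illustrative, not part of any official client:

```python
import json
import os
import urllib.request

def build_payload(query, domains=None, recency=None, related=False):
    """Assemble a Chat Completions body with Perplexity's search fields."""
    payload = {
        "model": "sonar",
        "messages": [{"role": "user", "content": query}],
    }
    if domains:
        payload["search_domain_filter"] = domains[:3]  # the API accepts at most 3
    if recency:
        payload["search_recency_filter"] = recency
    if related:
        payload["return_related_questions"] = True
    return payload

def run_query(payload):
    """POST the payload to the Perplexity endpoint and return the answer text."""
    req = urllib.request.Request(
        "https://api.perplexity.ai/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

For example, `run_query(build_payload("What changed in Python 3.13?", domains=["docs.python.org"], recency="month"))` restricts a recency-filtered query to the official Python docs.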
Step 4: Python Integration
Install the OpenAI Python SDK, which is fully compatible with Perplexity’s API:
```bash
pip install openai
```
Create a research automation script:
```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("PERPLEXITY_API_KEY"),
    base_url="https://api.perplexity.ai"
)

def research_topic(query, model="sonar", recency=None, domains=None):
    """Run a grounded research query with optional filters."""
    extra_body = {}
    if recency:
        extra_body["search_recency_filter"] = recency
    if domains:
        extra_body["search_domain_filter"] = domains

    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a research assistant. Provide concise, well-cited answers."
            },
            {
                "role": "user",
                "content": query
            }
        ],
        extra_body=extra_body
    )
    return response.choices[0].message.content

# Example: Research recent AI developments from specific sources
result = research_topic(
    query="What are the latest advances in small language models?",
    model="sonar-pro",
    recency="week",
    domains=["arxiv.org", "huggingface.co"]
)
print(result)
```
Step 5: Node.js Integration
Install the OpenAI Node.js SDK:
```bash
npm install openai
```
Build a reusable research module:
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

async function researchTopic(query, options = {}) {
  const { model = "sonar", recency = null, domains = null } = options;
  const body = {
    model,
    messages: [
      { role: "system", content: "You are a research assistant. Cite your sources." },
      { role: "user", content: query },
    ],
  };
  if (recency) body.search_recency_filter = recency;
  if (domains) body.search_domain_filter = domains;

  const response = await client.chat.completions.create(body);
  return response.choices[0].message.content;
}

// Example: Batch research automation
const topics = [
  "Current market size of AI coding assistants",
  "Top open-source RAG frameworks in 2026",
];

for (const topic of topics) {
  const result = await researchTopic(topic, {
    model: "sonar-pro",
    recency: "month",
  });
  console.log(`\n--- ${topic} ---\n${result}`);
}
```
Step 6: Streaming Responses
For long research outputs, use streaming to display results incrementally:
```python
# Python streaming example
stream = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Summarize the latest EU AI Act updates"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
Pro Tips
- **Use system prompts strategically** — Guide the model's output format (e.g., "Answer in bullet points with inline citations") to get structured, parseable responses.
- **Chain `sonar` with `sonar-reasoning`** — Use `sonar` for initial fact-gathering, then pass results to `sonar-reasoning` for synthesis and analysis in a two-step pipeline.
- **Domain filtering for accuracy** — When researching technical topics, restrict to authoritative sources like `arxiv.org`, `docs.python.org`, or `developer.mozilla.org` to reduce noise.
- **Cache responses** — Perplexity charges per request. Cache results locally (e.g., SQLite, Redis) for queries that don't require real-time freshness.
- **Monitor usage** — Check your token consumption regularly in the Perplexity API dashboard to avoid unexpected charges. Set billing alerts.
- **Use `sonar-deep-research` for reports** — This model autonomously performs multiple search passes and produces comprehensive research documents. Ideal for automated report generation, but slower and more expensive.
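The caching tip can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration: the `cached_research` helper is hypothetical, and `fetch` stands in for any API-backed callable, such as the `research_topic` function from Step 4:

```python
import hashlib
import sqlite3

# Minimal response cache keyed by a hash of (model, query),
# so repeated queries skip a paid API call.
conn = sqlite3.connect("research_cache.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, answer TEXT)"
)

def cache_key(model, query):
    """Stable cache key for a (model, query) pair."""
    return hashlib.sha256(f"{model}:{query}".encode()).hexdigest()

def cached_research(query, fetch, model="sonar"):
    """Return a cached answer, or call `fetch(query, model)` and store the result."""
    key = cache_key(model, query)
    row = conn.execute("SELECT answer FROM cache WHERE key = ?", (key,)).fetchone()
    if row:
        return row[0]
    answer = fetch(query, model)
    conn.execute("INSERT INTO cache (key, answer) VALUES (?, ?)", (key, answer))
    conn.commit()
    return answer
```

Usage might look like `cached_research("Top RAG frameworks", lambda q, m: research_topic(q, model=m))`; only the first call is billed. Skip the cache for queries where freshness matters.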
Troubleshooting
| Error | Cause | Solution |
|---|---|---|
| `401 Unauthorized` | Invalid or missing API key | Verify the key is correct and the environment variable is set. Regenerate the key if needed. |
| `429 Too Many Requests` | Rate limit exceeded | Implement exponential backoff. Default limits are 50 requests/minute for most plans. |
| `400 Bad Request` with a model error | Invalid model ID | Double-check the model name. Use `sonar`, not `sonar-small` or other deprecated names. |
| `search_domain_filter` ignored | More than 3 domains specified | Reduce the domain list to a maximum of 3 entries. |
| Empty or generic responses | Overly vague query | Add specificity to your prompt. Include time context, scope, and desired format. |
| Timeout errors | Complex queries on the deep-research model | Increase the client timeout. `sonar-deep-research` can take 30-60+ seconds. Set the timeout to at least 120 seconds. |
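The exponential-backoff advice for `429` errors can be sketched as a small retry wrapper. This is an illustrative helper, not part of any SDK; it wraps any callable that raises on a rate-limited request:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call()` with exponential backoff plus jitter.

    Sleeps base_delay, 2*base_delay, 4*base_delay, ... between attempts,
    and re-raises the last exception once max_retries is exhausted.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Jitter spreads out retries from concurrent clients.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

Usage: `with_backoff(lambda: research_topic("..."))`. In production, catch the SDK's rate-limit exception class specifically rather than bare `Exception`, so genuine errors like `401` fail fast instead of being retried.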
Is the Perplexity API the same as a Perplexity Pro subscription?
No. The API and Pro subscription are billed separately. A Pro subscription gives you enhanced access to the Perplexity web and mobile apps, while the API is a pay-per-token developer service. You need to add a payment method and purchase API credits independently, even if you already have a Pro plan.
Can I use the Perplexity API with existing OpenAI SDK code?
Yes. Perplexity’s API follows the OpenAI Chat Completions format. You only need to change the base_url (or baseURL in Node.js) to https://api.perplexity.ai and swap in your Perplexity API key. Search-specific parameters like search_recency_filter are passed as additional body fields.
How do citations work in Perplexity API responses?
Sonar models return inline numbered references (e.g., [1], [2]) within the response text. The corresponding source URLs are included in the response metadata under the citations field. You can parse these programmatically to build footnotes, link lists, or verify sources in your application.
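As a sketch, those numbered markers can be turned into footnotes programmatically. The snippet below assumes you have the answer text and the list of source URLs from the response's `citations` metadata described above; the `format_footnotes` helper is illustrative:

```python
import re

def format_footnotes(text, citations):
    """Append a numbered source list for the [n] markers found in `text`.

    `citations` is the list of source URLs from the response metadata;
    marker [1] maps to citations[0], and so on.
    """
    used = sorted({int(n) for n in re.findall(r"\[(\d+)\]", text)})
    lines = [text, "", "Sources:"]
    for n in used:
        if 0 < n <= len(citations):
            lines.append(f"[{n}] {citations[n - 1]}")
    return "\n".join(lines)
```

For example, `format_footnotes("SLMs are improving [1][2].", urls)` yields the answer followed by a `Sources:` list mapping each marker to its URL, ready for rendering as footnotes or a link list.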