ElevenLabs API Case Study: How an Indie Game Studio Generated 200+ NPC Dialogue Lines in 48 Hours

From Casting Calls to API Calls: Replacing Traditional Voice Acting Pipelines

For indie game studios, voice acting is one of the most expensive and time-consuming production bottlenecks. Casting agencies, recording sessions, retakes, and post-processing can consume weeks of calendar time and thousands of dollars — even for a modest RPG with a handful of NPCs. This case study documents how a fictional but representative indie studio, Ironpine Games, used the ElevenLabs API to generate over 200 fully voiced NPC dialogue lines in just 48 hours. The workflow leveraged three core ElevenLabs features: Projects API, Voice Design presets, and Pronunciation Dictionaries — replacing what would have traditionally required a 3-week casting and recording pipeline.

The Challenge

- 212 dialogue lines across 14 unique NPCs for a fantasy RPG vertical slice
- Budget constraint: under $500 total voice production cost
- Timeline: 48 hours before a publisher demo
- Each NPC needed a distinct, consistent voice with correct pronunciation of 30+ fictional proper nouns (place names, spells, lore terms)

Step 1: Environment Setup and Installation

Install the ElevenLabs Python SDK

pip install elevenlabs

Configure Your API Key

# Set your API key as an environment variable
export ELEVENLABS_API_KEY=YOUR_API_KEY

# Python initialization: read the key from the environment instead of hard-coding it
import os
from elevenlabs.client import ElevenLabs
client = ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"])

Step 2: Design Unique NPC Voices with Voice Design

Instead of auditioning voice actors, Ironpine used the Voice Design API to generate distinct voice profiles for each NPC archetype: grizzled blacksmith, young apprentice, ancient oracle, and so on.

# Design a grizzled blacksmith voice
blacksmith_preview = client.text_to_voice.create_previews(
    voice_description="A gruff, deep-voiced male blacksmith in his 50s with a slight rasp",
    text="Aye, that blade will cost ye three gold crowns. No less."
)

# Listen to the generated previews, then save the best one as a persistent voice
blacksmith_voice = client.text_to_voice.create_voice_from_preview(
    voice_name="Blacksmith_Gorath",
    voice_description="Gruff male blacksmith NPC",
    generated_voice_id=blacksmith_preview.previews[0].generated_voice_id
)

print(f"Created voice: {blacksmith_voice.voice_id}")

The team repeated this for all 14 NPCs, generating 2-3 preview variations per character and selecting the best fit — a process that took roughly 3 hours compared to weeks of casting calls.
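
Repeating the design step for every NPC can be wrapped in a small helper. The following is a minimal sketch, not Ironpine's actual script: `design_voices`, `npc_specs`, and the `pick` hook are hypothetical names, and the two SDK calls are simply the ones used in the walkthrough above (method names may differ between SDK versions).

```python
def design_voices(client, npc_specs, pick=lambda previews: previews[0]):
    """Create one saved voice per NPC and return {npc_name: voice_id}.

    npc_specs maps npc_name -> (voice_description, sample_text).
    pick chooses one preview from the generated list; the default takes the
    first, but in practice a person would listen and choose.
    """
    voice_ids = {}
    for name, (description, sample_text) in npc_specs.items():
        # Same preview call as in the blacksmith example above
        result = client.text_to_voice.create_previews(
            voice_description=description,
            text=sample_text,
        )
        chosen = pick(result.previews)
        # Persist the chosen preview so it gets a stable voice_id
        voice = client.text_to_voice.create_voice_from_preview(
            voice_name=name,
            voice_description=description,
            generated_voice_id=chosen.generated_voice_id,
        )
        voice_ids[name] = voice.voice_id
    return voice_ids
```

Passing the client in as a parameter keeps the helper easy to dry-run with a stub before any credits are spent.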

Step 3: Create a Pronunciation Dictionary for Lore Terms

Fantasy games are full of invented words. Without a pronunciation dictionary, the TTS engine has to guess, and it often guesses wrong. ElevenLabs Pronunciation Dictionaries let you pin the pronunciation of each term.

# Create a pronunciation dictionary from a lexicon file.
# pronunciation_lexicon.pls is a PLS (Pronunciation Lexicon Specification) XML file.
with open("pronunciation_lexicon.pls", "rb") as f:
    dictionary = client.pronunciation_dictionary.add_from_file(
        file=f,
        name="ironpine_rpg_lore",
        description="Pronunciation rules for all fantasy proper nouns"
    )

print(f"Dictionary ID: {dictionary.id}")
print(f"Version ID: {dictionary.version_id}")

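Example PLS Lexicon File

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Valdrethar</grapheme>
    <phoneme>vɑːl.drɛ.θɑːr</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Kythira</grapheme>
    <phoneme>kɪ.θaɪ.rə</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Aethermancy</grapheme>
    <phoneme>iː.θɜːr.mæn.si</phoneme>
  </lexeme>
</lexicon>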
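
With 30+ lore terms, writing the lexicon by hand gets error-prone. One option is to keep the terms in a plain mapping and generate the PLS file from it. This is a sketch under assumptions: `build_pls` is a hypothetical helper, and the `en-US` language tag and IPA alphabet are illustrative defaults rather than values confirmed by the studio.

```python
from xml.sax.saxutils import escape

PLS_NAMESPACE = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_pls(terms, lang="en-US"):
    """Return PLS XML text for a {grapheme: ipa_pronunciation} mapping."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<lexicon version="1.0" xmlns="{PLS_NAMESPACE}" '
        f'alphabet="ipa" xml:lang="{lang}">',
    ]
    for grapheme, ipa in terms.items():
        # escape() protects against &, < and > in lore terms
        lines += [
            "  <lexeme>",
            f"    <grapheme>{escape(grapheme)}</grapheme>",
            f"    <phoneme>{escape(ipa)}</phoneme>",
            "  </lexeme>",
        ]
    lines.append("</lexicon>")
    return "\n".join(lines) + "\n"


lore_terms = {
    "Valdrethar": "vɑːl.drɛ.θɑːr",
    "Kythira": "kɪ.θaɪ.rə",
    "Aethermancy": "iː.θɜːr.mæn.si",
}
pls_text = build_pls(lore_terms)
```

Writing `pls_text` to `pronunciation_lexicon.pls` with UTF-8 encoding produces a file in the shape shown above, and the mapping itself can live in version control next to the dialogue script.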

Step 4: Batch Generate All Dialogue with the Projects API

The Projects API is where the entire pipeline comes together. It lets you organize chapters, assign voices per character, attach pronunciation dictionaries, and batch-convert an entire script.

# Create a project for the RPG vertical slice
project = client.projects.add(
    name="Ironpine RPG - Vertical Slice",
    default_model_id="eleven_multilingual_v2",
    pronunciation_dictionary_versions_locators=[
        {"pronunciation_dictionary_id": dictionary.id, "version_id": dictionary.version_id}
    ],
    default_paragraph_voice_id=blacksmith_voice.voice_id
)

print(f"Project created: {project.project_id}")

# Add a chapter for each game area or quest
chapter = client.projects.add_chapter(
    project_id=project.project_id,
    name="Chapter 1 - Village of Valdrethar"
)
print(f"Chapter ID: {chapter.chapter_id}")

Bulk Generate Dialogue Lines via Script

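import csv
import os
import time

npc_voices = {
    "Gorath": "voice_id_blacksmith",
    "Lyra": "voice_id_apprentice",
    "Elder Morvyn": "voice_id_oracle",
    # ... 11 more NPC voice mappings
}

os.makedirs("audio", exist_ok=True)  # make sure the output folder exists

with open("dialogue_script.csv", "r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)  # columns: npc_name, line_id, text
    for row in reader:
        voice_id = npc_voices.get(row["npc_name"])
        if not voice_id:
            # Report unmapped NPCs instead of silently dropping their lines
            print(f"Skipped {row['line_id']}: no voice mapped for {row['npc_name']}")
            continue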

        audio = client.text_to_speech.convert(
            voice_id=voice_id,
            text=row["text"],
            model_id="eleven_multilingual_v2",
            pronunciation_dictionary_locators=[
                {"pronunciation_dictionary_id": dictionary.id, "version_id": dictionary.version_id}
            ]
        )

        filename = f"audio/{row['line_id']}.mp3"
        with open(filename, "wb") as out:
            for chunk in audio:
                out.write(chunk)

        print(f"Generated: {filename}")
        time.sleep(0.5)  # respect rate limits
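
The loop above is sequential, so 212 lines are generated one after another. A bounded-concurrency variant might look like the sketch below. It is an illustration only: `generate_all` and `synthesize` are hypothetical names, `synthesize` stands in for a blocking wrapper around the text-to-speech call, and the sketch uses threads via `asyncio.to_thread` (Python 3.9+) rather than an async HTTP client.

```python
import asyncio


async def generate_all(lines, synthesize, max_concurrency=4):
    """Run synthesize(line) for every line, at most max_concurrency at a time.

    Results are returned in the same order as the input lines.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(line):
        async with semaphore:
            # Run the blocking call in a thread so the event loop stays free
            return await asyncio.to_thread(synthesize, line)

    return await asyncio.gather(*(worker(line) for line in lines))


# Usage with a stand-in for the real synthesis function
demo_lines = [{"line_id": "gorath_001"}, {"line_id": "lyra_001"}]
demo_results = asyncio.run(
    generate_all(demo_lines, lambda line: f"audio/{line['line_id']}.mp3", max_concurrency=2)
)
```

Keep `max_concurrency` at or below the concurrency limit of your plan tier; the semaphore is what enforces that ceiling.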

Results Summary

| Metric | Traditional Pipeline | ElevenLabs API Pipeline |
| --- | --- | --- |
| Casting & auditions | 5-7 days | 3 hours (Voice Design) |
| Recording sessions | 3-5 days | 0 (API batch generation) |
| Pronunciation retakes | 1-2 days | 0 (dictionary-driven) |
| Post-processing | 2-3 days | Minimal normalization |
| Total elapsed time | ~15-20 days | ~48 hours |
| Cost (212 lines) | $3,000-$8,000+ | ~$80-$150 API credits |

## Pro Tips for Power Users

- **Version your pronunciation dictionaries.** As your lore evolves during development, update the dictionary and re-generate only the affected lines. The version ID system makes this traceable.
- **Use voice settings for emotional variation.** Adjust stability (lower = more expressive) and similarity_boost per line to convey anger, whispers, or excitement without needing separate voice profiles.
- **Parallelize with async requests.** Use asyncio and httpx to generate multiple lines concurrently. Respect the concurrency limits on your plan tier.
- **Export a voice map JSON.** Keep a single source-of-truth mapping npc_name → voice_id in version control so your entire team references the same voices.
- **Tag lines with SSML-style markers.** Insert a break tag such as `<break time="0.5s" />` in dialogue text for natural pauses between sentences. This is especially useful for dramatic NPC monologues.

## Troubleshooting Common Issues

| Error / Symptom | Cause | Fix |
| --- | --- | --- |
| `401 Unauthorized` | Invalid or expired API key | Regenerate your key at elevenlabs.io dashboard and update the environment variable |
| `429 Too Many Requests` | Rate limit exceeded | Add exponential backoff or `time.sleep(1)` between calls; upgrade plan tier if persistent |
| Pronunciation dictionary not applied | Missing or incorrect `version_id` | Always pass both `pronunciation_dictionary_id` and `version_id` in the locator object |
| Voice sounds inconsistent between lines | Stability set too low | Increase `stability` to 0.6-0.75 for NPC dialogue; reserve low values for emotional peaks |
| Generated audio has clipping | Text contains unusual punctuation or symbols | Sanitize input text; remove stray unicode characters and excessive exclamation marks |
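
The exponential backoff suggested for `429 Too Many Requests` can be sketched as a small retry wrapper. The names here (`with_backoff`, `is_rate_limited`) are hypothetical, and how a rate-limit error surfaces depends on your SDK version; the usage example assumes the raised exception carries a `status_code` attribute.

```python
import random
import time


def with_backoff(call, is_rate_limited, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call call() and retry with exponential backoff while rate limited.

    is_rate_limited(exc) decides whether an exception is worth retrying.
    Any other exception, or the final failed attempt, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt, plus jitter to avoid synchronized retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


def looks_like_429(exc):
    # Assumption: the SDK error exposes the HTTP status as `status_code`
    return getattr(exc, "status_code", None) == 429
```

A call in the generation loop would then be wrapped as `with_backoff(lambda: client.text_to_speech.convert(...), looks_like_429)`, which leaves the rest of the script unchanged.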

## Frequently Asked Questions

Can I use ElevenLabs-generated voices commercially in a shipped game?

Yes. ElevenLabs allows commercial usage of generated audio on paid plans. The voices created through Voice Design are fully owned synthetic voices with no likeness rights concerns, making them ideal for indie game distribution on Steam, itch.io, or console storefronts. Always review the current terms of service for your specific plan tier.

How do I maintain voice consistency when generating hundreds of lines over multiple sessions?

Once you save a designed voice via create_voice_from_preview, it receives a persistent voice_id. All subsequent TTS calls using that ID produce consistent output. Keep stability at 0.5 or higher and use the same model_id across all generations. Avoid regenerating the voice profile mid-production.
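
One way to enforce these consistency rules across sessions is to load the voice map from a JSON file and check it before generating anything. The sketch below assumes a hypothetical schema (one entry per NPC with `voice_id`, `model_id`, and `stability`); neither the file layout nor `load_voice_map` comes from the ElevenLabs SDK.

```python
import json


def load_voice_map(path, min_stability=0.5):
    """Load a voice map and check it against the consistency rules above.

    Assumed (hypothetical) schema:
    {"Gorath": {"voice_id": "...", "model_id": "...", "stability": 0.6}, ...}
    """
    with open(path, encoding="utf-8") as f:
        voice_map = json.load(f)

    # Every NPC should be generated with the same model
    model_ids = {entry["model_id"] for entry in voice_map.values()}
    if len(model_ids) > 1:
        raise ValueError(f"Mixed model_id values: {sorted(model_ids)}")

    # Flag voices whose stability is below the recommended floor
    too_low = [npc for npc, entry in voice_map.items() if entry["stability"] < min_stability]
    if too_low:
        raise ValueError(f"Stability below {min_stability} for: {too_low}")

    return voice_map
```

Failing fast here is cheaper than discovering a mismatched model or a drifting voice after a full batch has been generated.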

What happens if I need to add new dialogue lines after the initial batch?

Simply run the same script with additional CSV rows. The voice IDs, pronunciation dictionary, and model settings remain unchanged. New lines will sound consistent with previously generated audio. For large additions, consider using the Projects API to organize new content into separate chapters for easier management.
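
Re-running the full script would regenerate, and re-bill, lines that already exist. A minimal sketch of an incremental pass is shown below; `pending_rows` is a hypothetical helper that assumes the same CSV columns and `audio/<line_id>.mp3` naming used in the bulk script.

```python
import csv
from pathlib import Path


def pending_rows(csv_path, audio_dir):
    """Yield CSV rows whose audio file has not been generated yet."""
    audio_dir = Path(audio_dir)
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):  # columns: npc_name, line_id, text
            if not (audio_dir / f"{row['line_id']}.mp3").exists():
                yield row
```

Iterating over `pending_rows("dialogue_script.csv", "audio")` instead of the raw reader limits generation to the new lines. To force a retake of an existing line, delete its audio file first.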
