Dria Batch Inference API
Overview
Dria’s Batch Inference API is an open-source, crowdsourced batch AI inference API optimized for massive AI workloads. It’s ideal for:
- Processing large amounts of data
- Optimizing inference costs
- Large-scale offline evaluations
- Experimenting with offline workflows
Key benefits:
- Low-cost inference for large jobs
- High throughput
- Asynchronous processing
Getting Started
1. Prepare Your Batch File
Create a `.jsonl` file where each line is a JSON object representing a single request. Each request must include a unique `custom_id` (UUID recommended), a `type`, a `version`, and a `body` with the model and input parameters.
Request fields:
- custom_id: Unique identifier for each request (UUID recommended)
- type: Task type (e.g., “completions”)
- version: API version (e.g., “v1”)
- body: Model and input parameters (see supported models below)
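As an illustration, the snippet below builds one such request line in Python and writes it to a batch file. The model name and `body` fields are placeholders, not a confirmed schema; consult the supported-models list for valid values.

```python
import json
import uuid

# Hypothetical request body -- the exact fields depend on the chosen model;
# the model name below is only a placeholder.
request = {
    "custom_id": str(uuid.uuid4()),  # unique per request
    "type": "completions",
    "version": "v1",
    "body": {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello, Dria!"}],
    },
}

# Each request is one JSON object per line of the .jsonl file.
with open("batch.jsonl", "w") as f:
    f.write(json.dumps(request) + "\n")
```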
2. Upload Your Batch File (Python Example)
Dria uses a three-step upload process:
- Acquire a pre-signed URL for the upload.
- Upload your file to the provided URL.
- Complete the upload to begin processing.
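The three steps above can be sketched with only the standard library, as below. The endpoint paths, header name, and response fields are assumptions for illustration only; the real URLs and authentication scheme come from the Dria Batch Inference documentation and web interface.

```python
import json
import urllib.request

# NOTE: BASE_URL and all paths/headers below are hypothetical placeholders.
BASE_URL = "https://example.invalid/batch"

def get_presigned_url(api_key: str) -> dict:
    """Step 1: request a pre-signed upload URL."""
    req = urllib.request.Request(
        f"{BASE_URL}/upload-url", headers={"x-api-key": api_key}
    )
    with urllib.request.urlopen(req) as resp:
        # Assumed to contain the pre-signed URL and an upload identifier.
        return json.load(resp)

def upload_file(presigned_url: str, path: str) -> None:
    """Step 2: PUT the .jsonl file to the pre-signed URL."""
    with open(path, "rb") as f:
        req = urllib.request.Request(presigned_url, data=f.read(), method="PUT")
        urllib.request.urlopen(req)

def complete_upload(api_key: str, upload_id: str) -> None:
    """Step 3: signal completion so processing can begin."""
    body = json.dumps({"upload_id": upload_id}).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/complete",
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)
```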
3. Monitor and Retrieve Results
Once the upload completes, your batch job is processed asynchronously. While it runs, you can:
- Use the web interface to check batch status and download results when they are ready.
Example Output File
Each line in the output `.jsonl` file corresponds to a request and includes the model, result, and token usage.
- Use `custom_id` to match input and output lines.
- Failed requests will include error information.
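To match outputs back to inputs, results can be indexed by `custom_id`, as in this sketch. The field names checked in the test (`result`, `error`) are assumptions based on the description above; verify them against an actual output file.

```python
import json

def index_results(path: str) -> dict:
    """Map custom_id -> output entry so results can be matched to inputs."""
    results = {}
    with open(path) as f:
        for line in f:
            entry = json.loads(line)
            results[entry["custom_id"]] = entry
    return results
```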
Limits and Plans
- Tasks per file: Up to 100,000
- Free plan: 3 files per day
- Developer plan: 10 files per day
- Enterprise: Custom limits (reach out to inference@dria.co)
Supported Models
Dria Batch API supports a wide range of models, including:
- Claude 3.7 Sonnet
- Claude 3.5 Sonnet
- Gemini 2.5 Pro Experimental
- Gemini 2.0 Flash
- Gemma 3 4B, 12B, 27B
- GPT-4o-mini, GPT-4o
- Llama 3.3 70B Instruct, 3.1 8B Instruct, 3.2 1B Instruct
- Mistral Nemo
View the Dria Batch Inference page for the latest list.
Web Interface
You can use the Dria Batch Inference Web Interface to:
- Upload `.jsonl` files
- Check batch status
- Download results
- Obtain your API key
Example Error Handling (Python)
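As one practical approach, the sketch below validates a batch file locally before upload, catching the malformed-line and duplicate-ID problems that would otherwise surface as failed requests. The required fields and the 100,000-task limit are taken from the format and limits described above.

```python
import json

def validate_batch_file(path: str, max_tasks: int = 100_000) -> list:
    """Pre-flight checks: valid JSON per line, required fields present,
    unique custom_id values, task count within the per-file limit."""
    errors = []
    seen = set()
    with open(path) as f:
        for n, line in enumerate(f, 1):
            if n > max_tasks:
                errors.append(f"line {n}: exceeds {max_tasks}-task limit")
                break
            try:
                req = json.loads(line)
            except json.JSONDecodeError:
                errors.append(f"line {n}: invalid JSON")
                continue
            for field in ("custom_id", "type", "version", "body"):
                if field not in req:
                    errors.append(f"line {n}: missing {field}")
            cid = req.get("custom_id")
            if cid in seen:
                errors.append(f"line {n}: duplicate custom_id {cid}")
            seen.add(cid)
    return errors
```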
FAQ
- How do I check the status of my batch job?
- Use the web interface.
- How are results delivered?
- As a downloadable `.jsonl` file after processing.
- What if my upload fails?
- Check your API key, file size, and network connection. Review error messages for details.
For more details or support, visit the Dria Batch Inference page.