Batch Caption Processing Guide

Overview

PhotoSwipe Pro with AI SEO now supports batch caption generation, which processes 1-50 images in a single API request. For galleries with many images, this replaces dozens of separate round trips with one request and shortens total processing time considerably.

API Key Ownership - IMPORTANT

Who needs API keys?

  • Server owner (you, the PhotoSwipe Pro license holder) - Provides ONE API key via environment variables
  • End users (website visitors) - Do NOT need their own API keys
  • Client applications - Only need PhotoSwipe Pro license key

How it works:

  1. You (server owner) set up ONE Gemini or OpenRouter API key in your .env file
  2. Your server proxies all AI requests
  3. Clients authenticate with their PhotoSwipe Pro license key
  4. Your server validates the license and forwards requests to the AI provider
  5. You pay for the AI API costs as part of your Pro service

This architecture protects your API keys and provides a seamless experience for customers.
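
The five steps above can be sketched as a single request handler. This is a minimal illustration, not the actual PhotoSwipe Pro server code: `handleCaptionRequest`, `validateLicense`, and `callProvider` are hypothetical names, and the error codes are borrowed from the Common Errors table later in this guide.

```javascript
// Hypothetical sketch of the proxy flow: validate the license, then forward.
// `validateLicense` and `callProvider` are injected stand-ins, not real
// PhotoSwipe Pro APIs.
async function handleCaptionRequest(body, { validateLicense, callProvider }) {
  const { url, context = {}, licenseKey } = body ?? {};

  if (typeof url !== 'string' || url.length === 0) {
    return { status: 400, body: { error: 'invalid_input' } };
  }
  if (!(await validateLicense(licenseKey))) {
    return { status: 402, body: { error: 'license_invalid' } };
  }

  try {
    // The provider API key stays on the server (for example in process.env);
    // the client only ever sends its license key.
    const { alt, caption } = await callProvider({ url, context });
    return { status: 200, body: { alt, caption } };
  } catch {
    return { status: 502, body: { error: 'provider_error' } };
  }
}
```

Injecting the two dependencies keeps the flow testable without a real license service or AI provider.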


Supported Providers

1. OpenRouter (GPT-4o Vision)

  • Pros: Multiple model options, reliable, excellent vision capabilities
  • Cons: Higher cost per image
  • Setup: Get API key from https://openrouter.ai/
  • Model: openai/gpt-4o (default)

2. Gemini (Google AI)

  • Pros: Lower cost, fast, good vision capabilities
  • Cons: Requires base64 image encoding (higher bandwidth)
  • Setup: Get API key from https://aistudio.google.com/app/apikey
  • Model: gemini-1.5-flash (default)

3. Mock (Testing)

  • Pros: Free, instant, no API key needed
  • Cons: Returns fake data
  • Usage: Development and testing only

Environment Setup

Option 1: OpenRouter

# .env file
AI_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key-here
AI_MODEL=openai/gpt-4o
AI_TIMEOUT_MS=15000

Option 2: Gemini

# .env file
AI_PROVIDER=gemini
GEMINI_API_KEY=your-gemini-api-key-here
GEMINI_MODEL=gemini-1.5-flash
AI_TIMEOUT_MS=15000

Batch Configuration

# Maximum images per batch request
BATCH_MAX_SIZE=50

# Timeout for entire batch (60 seconds)
BATCH_TIMEOUT_MS=60000

# Rate limiting (20 requests per minute per IP)
AI_RATE_LIMIT_WINDOW_MS=60000
AI_RATE_LIMIT_MAX=20

# License validation
REQUIRE_LICENSE=true # Set to false for demo mode
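
As an illustration of how a server might consume these variables, the sketch below reads them with fallbacks. The function names are hypothetical, and treating the values shown above as defaults is an assumption; the real server may parse them differently.

```javascript
// Sketch: read the batch-related variables, falling back to the values
// shown in this guide when a variable is unset or not a positive integer.
function readPositiveInt(value, fallback) {
  const parsed = Number.parseInt(value ?? '', 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadBatchConfig(env = process.env) {
  return {
    batchMaxSize: readPositiveInt(env.BATCH_MAX_SIZE, 50),
    batchTimeoutMs: readPositiveInt(env.BATCH_TIMEOUT_MS, 60000),
    rateLimitWindowMs: readPositiveInt(env.AI_RATE_LIMIT_WINDOW_MS, 60000),
    rateLimitMax: readPositiveInt(env.AI_RATE_LIMIT_MAX, 20),
    // Assumption: only the literal string "false" disables license checks.
    requireLicense: env.REQUIRE_LICENSE !== 'false',
  };
}
```

Calling `loadBatchConfig({})` yields the fallback values, which makes the function easy to check in isolation.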

API Endpoints

Single Image

POST /api/ai/caption

{
  "url": "https://example.com/photo.jpg",
  "context": { "title": "Product Name" },
  "licenseKey": "your-photoswipe-pro-license-key"
}

Response:

{
  "alt": "A red bicycle leaning against a brick wall",
  "caption": "Vintage red bicycle with basket against urban brick wall"
}

Batch Processing

POST /api/ai/caption/batch

{
  "images": [
    { "url": "https://example.com/photo1.jpg", "context": { "title": "Product 1" } },
    { "url": "https://example.com/photo2.jpg", "context": { "title": "Product 2" } },
    { "url": "https://example.com/photo3.jpg", "context": { "title": "Product 3" } }
  ],
  "licenseKey": "your-photoswipe-pro-license-key"
}

Response:

{
  "results": [
    {
      "url": "https://example.com/photo1.jpg",
      "alt": "Description of photo 1",
      "caption": "Engaging caption for photo 1"
    },
    {
      "url": "https://example.com/photo2.jpg",
      "alt": "Description of photo 2",
      "caption": "Engaging caption for photo 2"
    },
    {
      "url": "https://example.com/photo3.jpg",
      "error": "processing_failed"
    }
  ],
  "summary": {
    "total": 3,
    "success": 2,
    "failed": 1
  }
}
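
Before sending a batch, it can help to check the request against the documented limits so that obvious mistakes never reach the server. The sketch below is a hypothetical client-side check (`validateBatchRequest` is not part of the PhotoSwipe Pro API). It returns the error codes listed under Common Errors, and the HTTPS requirement is taken from that table.

```javascript
const MAX_BATCH_SIZE = 50; // mirrors BATCH_MAX_SIZE

// Returns an error code string, or null when the body looks well-formed.
function validateBatchRequest(body) {
  const images = body?.images;
  if (!Array.isArray(images) || images.length === 0) return 'invalid_input';
  if (images.length > MAX_BATCH_SIZE) return 'batch_too_large';

  for (const image of images) {
    try {
      // new URL() throws on malformed or missing input.
      if (new URL(image?.url).protocol !== 'https:') return 'invalid_input';
    } catch {
      return 'invalid_input';
    }
  }
  return null;
}
```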

Client-Side Usage

Basic Batch Processing

import { CaptionProvider } from 'photoswipe-pro/ai';

const provider = new CaptionProvider({ baseUrl: '/api/ai' });

// Process 10 images at once
const images = [
  { url: 'https://example.com/photo1.jpg', context: { title: 'Product 1' } },
  { url: 'https://example.com/photo2.jpg', context: { title: 'Product 2' } },
  // ... up to 50 images
];

const result = await provider.generateBatch({
  images,
  licenseKey: 'your-license-key'
});

// Handle results
result.results.forEach(item => {
  if (item.error) {
    console.error(`Failed to process ${item.url}: ${item.error}`);
  } else {
    console.log(`${item.url}: ${item.alt}`);
    // Update your UI with item.alt and item.caption
  }
});

console.log(`Processed ${result.summary.success}/${result.summary.total} images`);

Auto-Batching Helper

For large galleries, use the generateForUrls helper, which automatically splits the URL list into batches of 50:

import { CaptionProvider } from 'photoswipe-pro/ai';

const provider = new CaptionProvider({ baseUrl: '/api/ai' });

// Process 200 images (automatically batched into 4 requests of 50)
const urls = [
  'https://example.com/photo1.jpg',
  'https://example.com/photo2.jpg',
  // ... 200 photos
];

const captionsMap = await provider.generateForUrls({
  urls,
  context: { category: 'products' },
  licenseKey: 'your-license-key'
});

// Results stored in a Map
urls.forEach(url => {
  const result = captionsMap.get(url);
  if (result) {
    console.log(`${url}: ${result.alt}`);
    // Update your image with result.alt and result.caption
  }
});
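
The internals of generateForUrls are not shown in this guide. As a rough mental model only, chunking and merging could look like the sketch below, where `chunk`, `captionInChunks`, and the `sendBatch` callback are hypothetical names.

```javascript
// Split a list into consecutive groups of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Send each group through `sendBatch` (a stand-in for one batch request)
// and collect the successful results into a Map keyed by URL.
async function captionInChunks(urls, sendBatch, size = 50) {
  const captions = new Map();
  for (const group of chunk(urls, size)) {
    const { results } = await sendBatch(group.map((url) => ({ url })));
    for (const item of results) {
      if (!item.error) {
        captions.set(item.url, { alt: item.alt, caption: item.caption });
      }
    }
  }
  return captions;
}
```

With the provider from the example above, `sendBatch` could be `(images) => provider.generateBatch({ images, licenseKey })`. Failed images are simply absent from the Map, which matches the `if (result)` check in the usage example.
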

PhotoSwipe Integration

import PhotoSwipeLightbox from 'photoswipe/lightbox';
import { CaptionProvider } from 'photoswipe-pro/ai';

const provider = new CaptionProvider({ baseUrl: '/api/ai' });

// Initialize PhotoSwipe
const lightbox = new PhotoSwipeLightbox({
  gallery: '#my-gallery',
  children: 'a',
  pswpModule: () => import('photoswipe')
});

// Generate captions for all images on page load
const links = Array.from(document.querySelectorAll('#my-gallery a'));
const images = links.map(el => ({
  url: el.href,
  // Optional chaining guards against links that have no <img> child
  context: { title: el.querySelector('img')?.alt || '' }
}));

const result = await provider.generateBatch({
  images,
  licenseKey: 'your-license-key'
});

// Update DOM with AI-generated captions
result.results.forEach((item, index) => {
  // Results are assumed to come back in request order, so index i maps to links[i]
  const img = links[index]?.querySelector('img');
  if (!item.error && img) {
    img.alt = item.alt;
    img.dataset.caption = item.caption;
  }
});

lightbox.init();

Performance Considerations

Concurrency

The server processes 5 images in parallel per batch to optimize throughput while respecting AI provider rate limits.
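
One common way to cap parallelism like this is a small worker pool. The sketch below is an assumption about the general technique, not the server's actual code. `mapWithConcurrency` is a hypothetical helper that records a failing item as `processing_failed`, the error value shown in the batch response example.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`; a thrown error becomes an error entry.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until none remain.
  async function runLane() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = await worker(items[index], index);
      } catch {
        results[index] = { error: 'processing_failed' };
      }
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

Calling `mapWithConcurrency(images, 5, captionOne)` would process a 50-image batch in lanes of five, with one slow image delaying only its own lane.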

Timeouts

  • Single image: 15 seconds (configurable via AI_TIMEOUT_MS)
  • Batch request: 60 seconds (configurable via BATCH_TIMEOUT_MS)
  • Image fetch (Gemini only): 10 seconds

Rate Limiting

Default limits (per IP address):

  • 20 requests per minute
  • Applies to both single and batch endpoints
  • Batch of 50 images counts as 1 request
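
To make the counting rule concrete, here is a minimal fixed-window limiter keyed by IP address. Whether the server uses a fixed or sliding window is not stated in this guide, so the fixed window is an assumption, and `createRateLimiter` is a hypothetical name.

```javascript
// Fixed-window limiter: at most `max` requests per `windowMs` per IP.
// `now` is injectable so the behaviour can be checked without waiting.
function createRateLimiter({ windowMs = 60000, max = 20, now = Date.now } = {}) {
  const windows = new Map(); // ip -> { start, count }

  return function allow(ip) {
    const t = now();
    const current = windows.get(ip);
    if (!current || t - current.start >= windowMs) {
      // First request from this IP, or the previous window has expired.
      windows.set(ip, { start: t, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= max;
  };
}
```

Each call to `allow(ip)` counts one request regardless of how many images the request carries, which is why a batch of 50 consumes a single slot.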

Cost Optimization

For 100 images:

Method   API Calls   Approx. Time   Cost (GPT-4o)
Single   100 calls   ~25 minutes    $0.50-$1.00
Batch    2 calls     ~2 minutes     $0.50-$1.00

Cost is the same, but batch is 12× faster!
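
The call counts follow directly from the batch size: 100 images at 50 per request is 2 requests. The times in the table also line up with the configured timeouts (100 × 15 s = 25 minutes, 2 × 60 s = 2 minutes). Reading them as worst-case bounds is an interpretation rather than something this guide states. A small sketch of that arithmetic, with a hypothetical function name:

```javascript
// Worst-case estimate, assuming every request runs to its timeout and
// requests are issued one after another.
function estimateBatchPlan(
  imageCount,
  { batchSize = 50, singleTimeoutMs = 15000, batchTimeoutMs = 60000 } = {}
) {
  const batchRequests = Math.ceil(imageCount / batchSize);
  return {
    singleRequests: imageCount,
    batchRequests,
    singleWorstCaseMinutes: (imageCount * singleTimeoutMs) / 60000,
    batchWorstCaseMinutes: (batchRequests * batchTimeoutMs) / 60000,
  };
}

console.log(estimateBatchPlan(100));
// { singleRequests: 100, batchRequests: 2,
//   singleWorstCaseMinutes: 25, batchWorstCaseMinutes: 2 }
```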


Error Handling

Partial Failures

Batch processing continues even if some images fail. Check the error field in results:

const result = await provider.generateBatch({ images, licenseKey });

const failed = result.results.filter(r => r.error);
if (failed.length > 0) {
  console.warn(`${failed.length} images failed:`, failed);

  // Retry failed images
  const retryImages = failed.map(r => ({ url: r.url }));
  const retryResult = await provider.generateBatch({
    images: retryImages,
    licenseKey
  });
}
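
Note that the retry above rebuilds each image from its URL only, so any `context` sent with the first attempt is dropped. If the context matters for caption quality, one option is to select the original input objects instead. `selectFailedInputs` below is a hypothetical helper, not part of the library.

```javascript
// Return the original input objects (including context) for every
// result that carries an error, matched by URL.
function selectFailedInputs(images, results) {
  const failedUrls = new Set(
    results.filter((r) => r.error).map((r) => r.url)
  );
  return images.filter((image) => failedUrls.has(image.url));
}
```

The retry call then becomes `provider.generateBatch({ images: selectFailedInputs(images, result.results), licenseKey })`.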

Common Errors

Error Code            Description                  Solution
400 invalid_input     Missing or malformed URL     Check image URLs are valid HTTPS
400 batch_too_large   More than 50 images          Split into smaller batches
402 license_invalid   Invalid or expired license   Check license key
429 rate_limited      Too many requests            Implement exponential backoff
502 provider_error    AI provider failed           Check API key and provider status

Retry Logic

async function generateWithRetry(provider, input, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await provider.generateBatch(input);
    } catch (error) {
      // Retry only on rate limiting, and only while attempts remain
      if (error.message === 'rate_limited' && i < maxRetries - 1) {
        // Exponential backoff: 2s, then 4s (with the default of 3 attempts)
        await new Promise(resolve => setTimeout(resolve, 2000 * Math.pow(2, i)));
        continue;
      }
      throw error;
    }
  }
}
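
The helper above hard-codes both the retry condition and the delay. If you want to tune them separately, the policy can be pulled out into two small functions. This is a sketch with hypothetical names. Treating `provider_error` as retryable and capping the delay at 30 seconds are assumptions, not documented behaviour.

```javascript
// Assumption: rate limiting and upstream provider failures are transient;
// input, batch-size, and license errors are not worth retrying.
const RETRYABLE_CODES = new Set(['rate_limited', 'provider_error']);

// `attempt` is zero-based, matching the loop index in generateWithRetry.
function shouldRetry(code, attempt, maxRetries = 3) {
  return RETRYABLE_CODES.has(code) && attempt < maxRetries - 1;
}

// Doubles the delay on each attempt, up to a ceiling.
function backoffDelayMs(attempt, { baseMs = 2000, capMs = 30000 } = {}) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Inside the catch block, the condition becomes `shouldRetry(error.message, i, maxRetries)` and the delay becomes `backoffDelayMs(i)`.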

Static Site Generation (SSG)

For static sites, pre-generate captions at build time:

// build-captions.js
import { CaptionProvider } from 'photoswipe-pro/ai';
import fs from 'fs';

const provider = new CaptionProvider({ baseUrl: 'http://localhost:4001/api/ai' });

// Read image URLs from your static site
// images.json is expected to contain a JSON array of URL strings
const imageUrls = JSON.parse(fs.readFileSync('images.json', 'utf8'));

// Generate captions
const captionsMap = await provider.generateForUrls({
  urls: imageUrls,
  licenseKey: process.env.PHOTOSWIPE_LICENSE_KEY
});

// Save to JSON file
const captions = Object.fromEntries(captionsMap);
fs.writeFileSync('captions.json', JSON.stringify(captions, null, 2));

console.log(`Generated captions for ${captionsMap.size} images`);

Then in your build process:

# Start local server
npm run server &

# Generate captions
node build-captions.js

# Build static site
npm run build

# Kill server
kill %1

Next.js Integration

// pages/api/generate-captions.js
import { CaptionProvider } from 'photoswipe-pro/ai';

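export default async function handler(req, res) {
  if (req.method !== 'POST') {
    return res.status(405).json({ error: 'Method not allowed' });
  }

  const { images } = req.body ?? {};
  // Reject malformed bodies before calling the caption server
  if (!Array.isArray(images) || images.length === 0) {
    return res.status(400).json({ error: 'invalid_input' });
  }

  const provider = new CaptionProvider({ baseUrl: process.env.AI_API_URL });

  try {
    const result = await provider.generateBatch({
      images,
      // The license key stays on the server and is never sent to the browser
      licenseKey: process.env.PHOTOSWIPE_LICENSE_KEY
    });
    res.json(result);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
}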

Monitoring and Analytics

Track batch processing performance:

const startTime = Date.now();
const result = await provider.generateBatch({ images, licenseKey });
const duration = Date.now() - startTime;

console.log('Batch Processing Stats:', {
  total: result.summary.total,
  success: result.summary.success,
  failed: result.summary.failed,
  successRate: `${(result.summary.success / result.summary.total * 100).toFixed(1)}%`,
  duration: `${duration}ms`,
  avgPerImage: `${(duration / result.summary.total).toFixed(0)}ms`
});

Best Practices

  1. Use batch processing for 3+ images - Single requests are fine for 1-2 images
  2. Implement caching - Store results to avoid re-processing
  3. Handle partial failures gracefully - Don't fail entire batch if one image fails
  4. Respect rate limits - Implement exponential backoff
  5. Monitor costs - Track API usage, especially with GPT-4o Vision
  6. Test with mock provider first - Validate integration before using real API
  7. Pre-generate for static sites - Build-time generation saves runtime costs
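
For the caching advice in item 2, one simple approach is to ask the server only for URLs that have no stored result. The sketch below uses an in-memory Map as the cache; `captionWithCache` and the `fetchCaptions` callback are hypothetical, and a real deployment might persist the cache to disk or a database instead.

```javascript
// Look up each URL in `cache` and fetch only the ones that are missing.
// `fetchCaptions(urls)` must resolve to a Map of url -> caption data.
async function captionWithCache(urls, cache, fetchCaptions) {
  const missing = urls.filter((url) => !cache.has(url));
  if (missing.length > 0) {
    const fresh = await fetchCaptions(missing);
    for (const [url, value] of fresh) {
      cache.set(url, value);
    }
  }
  // URLs that failed to generate stay out of both the cache and the result.
  return new Map(
    urls.filter((url) => cache.has(url)).map((url) => [url, cache.get(url)])
  );
}
```

With the client from this guide, `fetchCaptions` could be `(missing) => provider.generateForUrls({ urls: missing, licenseKey })`.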

Cost Comparison

Gemini vs OpenRouter

Provider     Model              Cost/Image   Speed      Quality
Gemini       gemini-1.5-flash   $0.001       Fast       Good
OpenRouter   gpt-4o             $0.01        Moderate   Excellent
OpenRouter   claude-3-haiku     $0.0025      Fast       Good

Recommendation: Use Gemini for development/high-volume, GPT-4o for production quality.


Troubleshooting

Images timing out

Increase timeout:

AI_TIMEOUT_MS=30000  # 30 seconds
BATCH_TIMEOUT_MS=120000 # 2 minutes

Rate limits too restrictive

Adjust rate limits:

AI_RATE_LIMIT_MAX=50  # 50 requests per minute

Gemini API errors

Check API key and quota at https://console.cloud.google.com/

OpenRouter API errors

Check API key and credits at https://openrouter.ai/account


Summary

  • ✅ Process 1-50 images per batch request
  • ✅ 12× faster than individual requests
  • ✅ Per-image error reporting, with retry patterns for failed images
  • ✅ Support for Gemini and OpenRouter
  • ✅ Server owner provides ONE API key
  • ✅ End users only need PhotoSwipe Pro license
  • ✅ Works with static sites and SSR frameworks

Get started: Copy environment variables from docs/ENV-VARIABLES-TEMPLATE.md and start batch processing!