In-depth case study
Case study: Async AI generation pipeline
Designing a resilient integration layer for a third-party image generation API under real production constraints.
Problem
The product needed on-demand image generation with acceptable latency for end users, while upstream providers imposed quotas, variable response times, and intermittent failures.
A naive synchronous integration would couple request threads to provider latency, amplify timeouts under load, and make retries dangerous without idempotency guarantees.
Architecture
The API accepted a creation request, persisted an intent record with deterministic idempotency metadata, and enqueued work. Workers polled provider status, persisted final artifacts to object storage, and transitioned domain state through explicit steps.
The control plane exposed operational visibility: stalled jobs, retry policies, and safe manual intervention points for support workflows.
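The "persist an intent, then enqueue" step can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's implementation: `Intent`, `InMemoryIntentStore`, and `GenerationEnqueuer` are hypothetical names, and a real system would presumably back the lookup with a unique index on the idempotency key rather than a PHP array.

```php
<?php
declare(strict_types=1);

// Hypothetical names throughout; an in-memory stand-in for a database table.
final class Intent
{
    public function __construct(
        public readonly string $publicId,
        public readonly string $idempotencyKey,
        public readonly array $payload,
        public string $status = 'pending',
    ) {}
}

final class InMemoryIntentStore
{
    /** @var array<string, Intent> keyed by idempotency key */
    private array $byKey = [];

    public function findByKey(string $key): ?Intent
    {
        return $this->byKey[$key] ?? null;
    }

    public function save(Intent $intent): void
    {
        $this->byKey[$intent->idempotencyKey] = $intent;
    }
}

final class GenerationEnqueuer
{
    /** @var list<string> public IDs waiting for a worker */
    public array $queue = [];

    public function __construct(private InMemoryIntentStore $store) {}

    public function enqueue(string $idempotencyKey, array $payload): Intent
    {
        // A retried request with the same key gets the stored intent back
        // instead of triggering a second generation.
        $existing = $this->store->findByKey($idempotencyKey);
        if ($existing !== null) {
            return $existing;
        }

        $intent = new Intent(bin2hex(random_bytes(8)), $idempotencyKey, $payload);
        $this->store->save($intent);        // persist first
        $this->queue[] = $intent->publicId; // then hand off to workers
        return $intent;
    }
}
```

Calling `enqueue()` twice with the same key returns the same `Intent` object and leaves a single entry in the queue, which is the property that makes client retries safe.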
Trade-offs
Async-first UX: users receive fast acknowledgment while generation completes in the background; the UI must model pending and failure states honestly.
Complexity vs scalability: more moving parts than a single synchronous call, but far better tail latency behavior under burst traffic.
Vendor coupling: the design isolated provider-specific quirks behind adapters so contract changes did not ripple unpredictably.
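One way to picture the adapter boundary is a small interface that translates vendor vocabulary into domain vocabulary. The sketch below is illustrative only: `ImageProvider`, `ExampleVendorAdapter`, `ProviderStatus`, and the `state` field with its values are invented for the example and do not describe any real provider's API.

```php
<?php
declare(strict_types=1);

// All names and the vendor response shape are hypothetical.
enum ProviderStatus: string
{
    case Pending = 'pending';
    case Succeeded = 'succeeded';
    case Failed = 'failed';
}

interface ImageProvider
{
    /** Normalizes a raw vendor status payload into the domain vocabulary. */
    public function normalizeStatus(array $vendorResponse): ProviderStatus;
}

final class ExampleVendorAdapter implements ImageProvider
{
    public function normalizeStatus(array $vendorResponse): ProviderStatus
    {
        $state = $vendorResponse['state'] ?? null;

        // Vendor-specific spellings stay inside the adapter.
        return match ($state) {
            'queued', 'processing' => ProviderStatus::Pending,
            'done' => ProviderStatus::Succeeded,
            'error', 'rejected' => ProviderStatus::Failed,
            // An unknown value surfaces immediately instead of leaking inward.
            default => throw new UnexpectedValueException(
                'Unrecognized vendor state: ' . var_export($state, true)
            ),
        };
    }
}
```

With this shape, a provider renaming or adding a status value changes one `match` arm in one class, and the rest of the pipeline keeps working against `ProviderStatus`.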
Design decisions
Idempotency keys for create operations to prevent duplicate generation under retries.
Backoff and jittered retries with caps, to avoid thundering herds when the provider degrades.
Strict state machine for job lifecycle; illegal transitions fail loudly in logs and metrics.
Checksum / metadata checks before marking assets as available to clients.
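The capped, jittered backoff mentioned above could look something like this. It is a sketch using the common "full jitter" variant, which the source does not confirm; the function name and the default base and cap values are assumptions chosen for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Capped exponential backoff with full jitter: the delay is drawn
 * uniformly from [0, min(cap, base * 2^attempt)] milliseconds.
 * Name and defaults are illustrative assumptions.
 */
function retryDelayMs(int $attempt, int $baseMs = 500, int $capMs = 30_000): int
{
    if ($attempt < 0) {
        throw new InvalidArgumentException('attempt must be >= 0');
    }

    // Clamp the exponent so the shift stays small for very high attempt counts.
    $ceiling = min($capMs, $baseMs * (1 << min($attempt, 20)));

    // Randomizing the whole interval spreads retries out, so workers that
    // failed together do not all come back at the same instant.
    return random_int(0, $ceiling);
}
```

A worker would sleep for `retryDelayMs($attempt)` before re-polling and give up after a fixed number of attempts; that attempt limit is a second cap and is left out here for brevity.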
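A strict job lifecycle can be expressed compactly with a backed enum that knows its own legal successors. The state names and the allowed transitions below are assumptions, since the source does not enumerate the real lifecycle; the point is only that an illegal transition throws rather than being silently accepted.

```php
<?php
declare(strict_types=1);

// State names and transitions are assumed for illustration.
enum JobState: string
{
    case Queued = 'queued';
    case Processing = 'processing';
    case Succeeded = 'succeeded';
    case Failed = 'failed';

    /** @return list<JobState> states reachable from this one */
    public function allowedNext(): array
    {
        return match ($this) {
            self::Queued => [self::Processing, self::Failed],
            // Processing may return to Queued when a retry is scheduled.
            self::Processing => [self::Queued, self::Succeeded, self::Failed],
            self::Succeeded, self::Failed => [], // terminal states
        };
    }

    public function transitionTo(JobState $next): JobState
    {
        if (!in_array($next, $this->allowedNext(), true)) {
            // Fail loudly; a real system would also log and emit a metric here.
            throw new DomainException(
                "Illegal job transition: {$this->value} -> {$next->value}"
            );
        }
        return $next;
    }
}
```

For example, `JobState::Queued->transitionTo(JobState::Processing)` returns `JobState::Processing`, while `JobState::Succeeded->transitionTo(JobState::Queued)` throws a `DomainException`.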
Performance and scalability thinking
The hottest paths for the mobile client, paginated lists and single-asset fetches, were backed by intentional indexes, so each request was served by an index lookup rather than a scan and its cost stayed effectively constant as the tables grew.
Worker concurrency was tunable to match provider limits; the system degraded gracefully by extending queue wait time rather than failing unpredictably.
Large payloads never lived in the primary OLTP tables: metadata went to MySQL, bytes to object storage.
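Because the bytes and the metadata live in different stores, the checksum and metadata check from the design decisions is what ties them together before an asset is marked available. A minimal sketch, assuming SHA-256 and a byte-size check (the source only says "checksum / metadata checks"; the function and parameter names are hypothetical):

```php
<?php
declare(strict_types=1);

/**
 * Returns true only when the stored bytes match what the worker expected.
 * SHA-256 is an assumed choice; names are hypothetical.
 */
function assetIsIntact(string $bytes, string $expectedSha256, int $expectedSize): bool
{
    // Cheap metadata check first: strlen() counts bytes for PHP strings.
    if (strlen($bytes) !== $expectedSize) {
        return false;
    }

    // hash() returns lowercase hex, so normalize the expected value too.
    return hash_equals(strtolower($expectedSha256), hash('sha256', $bytes));
}
```

A worker would call this after uploading to object storage and move the job to its success state only when it returns `true`. For large files, streaming the digest with `hash_file()` would avoid holding the whole payload in memory.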
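<?php

namespace App\Http\Controllers\Api\V1;

// The two App\... import paths and the GenerationService type are assumed for illustration.
use App\Http\Requests\Api\V1\StoreGenerationRequest;
use App\Services\GenerationService;
use Illuminate\Http\JsonResponse;
use Symfony\Component\HttpFoundation\Response;

final class GenerationController
{
    public function __construct(private readonly GenerationService $generator) {}

    public function store(StoreGenerationRequest $request): JsonResponse
    {
        // Persist the intent and enqueue work; the idempotency key makes retried requests safe.
        $intent = $this->generator->enqueue(
            idempotencyKey: $request->idempotencyKey(),
            payload: $request->validated(),
        );

        // 202 Accepted: generation completes in the background.
        return response()->json([
            'data' => ['id' => $intent->publicId(), 'status' => $intent->status()],
        ], Response::HTTP_ACCEPTED);
    }
}

Interested in how this maps to your roadmap?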
Share constraints honestly—traffic shape, team size, and risk tolerance—and we can anchor the conversation in real engineering trade-offs.
Contact Md. Banjir Ahammad →