[Field-Proven Complete Guide] Laravel Queue Design and Async Processing — Jobs/Queues/Horizon, Retries & Idempotency, Delays & Priorities, Failure Isolation, External API Integrations, User Notifications, and an Accessible Progress UI

greeden

1 day ago

What you’ll learn (key takeaways)

How to decide what should be queued vs. what can remain synchronous
Core Laravel Jobs/Queues architecture and how to choose Redis vs. DB drivers
Retries (tries/backoff), timeouts, and “dead-letter”-like operational patterns for isolating failures
How to prevent double execution with idempotency, and safely integrate external APIs
How to use priority/queue splitting, delayed jobs, batches, and job chains in real projects
Monitoring/alerts and worker operations in Horizon (restarts, scaling)
An accessible UI for “waiting time” in async flows (role="status", aria-live, non-color cues, retry paths)

Intended audience (who benefits?)

Laravel beginner–intermediate engineers: want to queue emails/exports, but fear failures and double execution
Tech leads / ops owners: want to catch job delays and failures early and respond to incidents quickly
PM/CS: want clear progress/completion notifications to reduce tickets and frustration
Designers/QA/accessibility roles: want a consistent, “anyone-can-understand” system for waiting/completion/failure states

Accessibility level: ★★★★★

Async work inherently introduces “waiting,” so accessibility impact is large. This guide standardizes progress/result messaging with concrete patterns so flows can be completed via screen readers and keyboard navigation.

1. Introduction: Queues Aren’t Only About “Speed” — They’re About “Resilience”

In Laravel, queues aren’t just for making pages feel faster. The real value is decoupling heavy or unstable work from the request lifecycle and reshaping it into something that can retry, be isolated on failure, and be operated calmly.

Email delivery, PDF generation, CSV exports, external API calls, search indexing, and aggregation are often less stable when done synchronously: users wait, requests time out, and failure causes are harder to see. Solid queue design increases success rates, speeds up triage, and makes operations feel predictable.

2. When to Queue (If You’re Unsure, Start Here)

Consider queueing if any of the following apply:

The task is heavy (seconds+, CPU/memory intensive)
It depends on networks (external APIs, email) and fails intermittently
It’s a bulk job (large exports, batch updates)
It can finish later without breaking UX (tens of seconds to minutes)
You want retry + failure isolation (a “recoverable” design)

What should remain synchronous:

Minimum required writes for the page/action to complete (e.g., creating the order itself)
Operations where users must instantly see the result (but extra work behind the scenes can still be queued)

A realistic pattern is: “core write is synchronous; notifications/aggregation/integrations are async.”
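
As a minimal sketch of that split, assuming a hypothetical Order model (with fillable attributes), a hypothetical SendOrderConfirmation job, and a hypothetical orders.show route, a controller action might look roughly like this:

```php
<?php

namespace App\Http\Controllers;

use App\Jobs\SendOrderConfirmation; // hypothetical job
use App\Models\Order;               // hypothetical model
use Illuminate\Http\RedirectResponse;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\DB;

class OrderController extends Controller
{
    public function store(Request $request): RedirectResponse
    {
        $data = $request->validate([
            'product_id' => ['required', 'integer'],
            'quantity'   => ['required', 'integer', 'min:1'],
        ]);

        // Synchronous: the order itself must exist before we respond.
        $order = DB::transaction(fn () => Order::create(
            $data + ['user_id' => $request->user()->id]
        ));

        // Asynchronous: the confirmation email can finish later and be retried.
        // Dispatching after the transaction returns means the row is committed.
        SendOrderConfirmation::dispatch($order->id);

        return redirect()
            ->route('orders.show', $order) // hypothetical route name
            ->with('status', 'Order placed. A confirmation email is on its way.');
    }
}
```

The user gets an immediate answer about the order, while a slow or failing mail server only affects the queued job.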

3. The Foundation: Driver Choice and Basic Setup

3.1 How to Think About Drivers

Redis: fast and common; excellent with Horizon
Database: easy to start; may not scale as smoothly under heavy load
Cloud queues (SQS, etc.): ops simplicity, but needs careful design, cost, and observability planning

For small-to-mid SaaS, Redis + Horizon is often the most practical start: low barrier, and delays become visible quickly.

3.2 Basic Setup (Example)

.env: QUEUE_CONNECTION=redis
config/queue.php: define queue names, retry intervals, failed-job retention policy
Prepare failed jobs store (failed_jobs), depending on your chosen setup
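
For orientation, here is a sketch of the relevant parts of config/queue.php. The keys follow Laravel's stock configuration file; the concrete values are illustrative assumptions and should be tuned per project.

```php
<?php

// config/queue.php (excerpt; values are illustrative assumptions)
return [
    'default' => env('QUEUE_CONNECTION', 'redis'),

    'connections' => [
        'redis' => [
            'driver' => 'redis',
            'connection' => 'default',
            'queue' => env('REDIS_QUEUE', 'default'),
            // Seconds before a reserved job is treated as lost and handed to
            // another worker. Keep this longer than the longest job timeout.
            'retry_after' => 90,
            'block_for' => null,
        ],
    ],

    'failed' => [
        'driver' => env('QUEUE_FAILED_DRIVER', 'database-uuids'),
        'database' => env('DB_CONNECTION', 'mysql'),
        'table' => 'failed_jobs',
    ],
];
```

Retention of failed jobs is usually handled by a scheduled cleanup, for example `php artisan queue:prune-failed --hours=48`; the 48-hour window is only an example.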

4. Minimal Job Implementation: Split into Manageable Units

Keeping jobs close to “one job = one purpose” makes retries and failures easier to interpret.

// app/Jobs/SendWelcomeMail.php
namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Mail;
use App\Models\User;
use App\Mail\WelcomeMail;

class SendWelcomeMail implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

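    // Up to 5 attempts in total (1 initial run + 4 retries).
    public $tries = 5;

    // Seconds a single attempt may run before the worker times it out.
    public $timeout = 60;

    public function __construct(public int $userId) {}

    // Seconds to wait before each retry; with 5 tries only the first 4 values are used.
    public function backoff(): array
    {
        return [10, 30, 60, 120, 300];
    }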

    public function handle(): void
    {
        $user = User::findOrFail($this->userId);
        Mail::to($user->email)->send(new WelcomeMail($user));
    }
}

Notes

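Passing an ID keeps the payload small and makes the job reload current data, so the job decides explicitly what happens if the record has changed or been deleted since dispatch.
Making tries/timeout/backoff explicit makes retry behavior predictable instead of depending on whatever defaults the worker was started with.
Don’t swallow exceptions: let the job fail so retries and monitoring can do their work.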
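
If you would rather derive the delays than hand-write them, a capped exponential schedule is one common choice. The helper below is a sketch; the name backoffSchedule and its defaults (10-second base, 300-second cap) are assumptions, not part of Laravel.

```php
<?php

/**
 * Build a capped exponential backoff schedule in seconds.
 * One entry per retry: base, base*2, base*4, ... never above $capSeconds.
 */
function backoffSchedule(int $retries, int $baseSeconds = 10, int $capSeconds = 300): array
{
    $schedule = [];
    for ($i = 0; $i < $retries; $i++) {
        $schedule[] = min($capSeconds, $baseSeconds * (2 ** $i));
    }

    return $schedule;
}
```

Inside a job, `backoff()` could then return `backoffSchedule($this->tries - 1)`, since a job with N tries has at most N - 1 retries.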

5. Retry Design: Stronger When You Classify Failures

Not all failures are equal. In practice, splitting them into two types makes decisions faster:

Transient failures (network wobble, temporary external API issues)
- retries often succeed
Permanent failures (invalid input, deleted target, permission issues)
- retries are usually pointless; isolate for humans

5.1 Fail Fast for Permanent Failure Patterns

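If an external API returns a 4xx, retries usually won’t help; the main exceptions are 408 (request timeout) and 429 (rate limited), which are worth retrying after a pause. Branch by exception type/status and “give up early” rather than burning worker time.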
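
One way to make that branch explicit is a small classifier. This is a sketch under a stated assumption: 408, 429, and all 5xx responses are treated as transient, and every other 4xx as permanent. The function name isRetryableStatus is hypothetical.

```php
<?php

/**
 * Decide whether an HTTP status from an external API is worth retrying.
 * Assumption: 408 (timeout), 429 (rate limit), and 5xx are transient;
 * every other status is not retried.
 */
function isRetryableStatus(int $status): bool
{
    if ($status === 408 || $status === 429) {
        return true;
    }

    return $status >= 500 && $status <= 599;
}
```

In a job that uses the InteractsWithQueue trait, a non-retryable response can be failed immediately with `$this->fail($exception)` followed by `return`, while retryable ones simply throw and let tries/backoff take over.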

5.2 Timeouts: The Practical Rule

Too short → lower success rate
Too long → workers clog and delays cascade

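A safe starting point is about 2× the job’s normal p95 duration, then adjust based on observed delay. Keep the job timeout a few seconds shorter than the connection’s retry_after, so a slow job is not handed to a second worker while the first is still running it.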
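
To turn that rule of thumb into a number, you can compute it from observed durations. The sketch below uses the nearest-rank method for p95; both function names are hypothetical and the input is assumed to be a list of durations in seconds.

```php
<?php

/** Return the p95 of observed job durations (seconds), nearest-rank method. */
function p95(array $durations): float
{
    if ($durations === []) {
        throw new InvalidArgumentException('No samples.');
    }

    sort($durations);
    // ceil(0.95 * n) in integer arithmetic, as a 1-based rank.
    $rank = intdiv(95 * count($durations) + 99, 100);

    return (float) $durations[$rank - 1];
}

/** Suggest a starting timeout of roughly 2x p95, rounded up to whole seconds. */
function suggestedTimeout(array $durations): int
{
    return (int) ceil(2 * p95($durations));
}
```

The result is only a starting value for `$timeout`; it still has to stay below retry_after for the connection.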

6. Idempotency: Prevent Double Execution “By Design”

Queued jobs can run more than once due to:

retries
worker restarts
re-enqueue after timeouts
network flakiness

If you don’t design for this, you’ll see real incidents: “two emails,” “double charge,” “double points.”

6.1 Common Idempotency Patterns

Assign a per-job idempotency_key
Prevent concurrent/double execution via Cache::lock() or DB unique constraints
If already completed, do nothing and exit

Example: prevent sending the same invoice email twice

public function handle(): void
{
    $key = "invoice_mail:{$this->invoiceId}";
    $lock = cache()->lock($key, 120);

    if (!$lock->get()) {
        return; // another worker currently holds the lock for this invoice; skip this run
    }

    try {
        $invoice = Invoice::findOrFail($this->invoiceId);

        if ($invoice->mail_sent_at) {
            return; // already sent → no-op (idempotent)
        }

        Mail::to($invoice->user->email)->send(new InvoiceMail($invoice));

        $invoice->forceFill(['mail_sent_at' => now()])->save();
    } finally {
        $lock->release();
    }
}

Key points

Persisting a “sent” flag in DB makes you resilient against retries and replays.
Cache locks are great, but the final authority should be durable state (DB) when possible.
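
The DB-unique-constraint variant can be sketched as follows. Everything named here is an assumption for illustration: the GrantPoints job, the job_receipts table (with a UNIQUE index on idempotency_key), and the points column on users.

```php
<?php

namespace App\Jobs;

use App\Models\User;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\DB;

// Hypothetical job: grants points at most once per idempotency key.
class GrantPoints implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public function __construct(
        public int $userId,
        public int $points,
        public string $idempotencyKey, // e.g. "order:123:points", chosen by the caller
    ) {}

    public function handle(): void
    {
        DB::transaction(function () {
            // insertOrIgnore returns the number of rows actually inserted.
            $inserted = DB::table('job_receipts')->insertOrIgnore([
                'idempotency_key' => $this->idempotencyKey,
                'created_at' => now(),
            ]);

            if ($inserted === 0) {
                return; // key already recorded by an earlier run: do nothing
            }

            // Same transaction as the receipt: both commit or both roll back.
            User::whereKey($this->userId)->increment('points', $this->points);
        });
    }
}
```

This works cleanly when the side effect lives in the same database. For effects that cannot be rolled back (emails, payments), the usual complement is to pass the same key to the external provider if it supports idempotency keys.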

7. Queue Splitting and Priority: Stop Delay Cascades

With a single queue, a heavy job can block lightweight jobs. A field-proven approach is splitting queues by purpose:

high: close to user actions (notifications, lightweight integrations, urgent)
default: normal work
low: heavy work (aggregation, search indexing, exports)

SendWelcomeMail::dispatch($user->id)->onQueue('high');
RecalcDailyUsage::dispatch($tenantId)->onQueue('low');

Run workers per queue to localize delays and make bottlenecks easier to see.
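
With Horizon, "workers per queue" is expressed as supervisors. The excerpt below is a sketch of config/horizon.php; the supervisor names, process counts, tries, and timeouts are assumptions to adapt, not recommendations.

```php
<?php

// config/horizon.php (excerpt; names and numbers are assumptions)
return [
    'environments' => [
        'production' => [
            // Fast lane: user-facing and normal work.
            // 'auto' shifts workers toward whichever of these queues is busier.
            'supervisor-fast' => [
                'connection' => 'redis',
                'queue' => ['high', 'default'],
                'balance' => 'auto',
                'minProcesses' => 1,
                'maxProcesses' => 8,
                'tries' => 3,
                'timeout' => 60,
            ],
            // Slow lane: heavy jobs get their own workers so they cannot
            // occupy the processes that serve the fast lane.
            'supervisor-slow' => [
                'connection' => 'redis',
                'queue' => ['low'],
                'balance' => 'auto',
                'minProcesses' => 1,
                'maxProcesses' => 2,
                'tries' => 1,
                'timeout' => 600,
            ],
        ],
    ],
];
```

Without Horizon, the same idea is two worker processes, for example `php artisan queue:work redis --queue=high,default` and `php artisan queue:work redis --queue=low`; within one worker, queues are checked in the order listed. A 600-second timeout also requires a retry_after above 600 on that connection.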

8. Delays, Chains, and Batches: When to Use Which

8.1 Delayed Jobs (`delay`)

Good for “retry later” or “follow-up notification later.”

SendFollowUpMail::dispatch($user->id)->delay(now()->addMinutes(10));

8.2 Chains (`chain`)

Good when order matters (generate → upload → notify).

Bus::chain([
  new GenerateReport($reportId),
  new UploadReport($reportId),
  new NotifyReportReady($reportId),
])->dispatch();

8.3 Batches (`batch`)

Best for large job sets where you want overall progress and aggregated failure handling—especially exports and bulk updates.
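
A batch dispatch might look roughly like this. ExportChunk is a hypothetical job (it must use the Batchable trait), and $exportId and $chunks are assumed inputs; batches also need the job_batches table that Laravel provides a migration for.

```php
<?php

use App\Jobs\ExportChunk; // hypothetical job; must use Illuminate\Bus\Batchable
use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;
use Illuminate\Support\Facades\Log;

// Assumption: $exportId identifies the export, $chunks is a list of ID ranges
// such as [['from' => 1, 'to' => 1000], ['from' => 1001, 'to' => 2000]].
$jobs = array_map(
    fn (array $range) => new ExportChunk($exportId, $range['from'], $range['to']),
    $chunks
);

$batch = Bus::batch($jobs)
    ->name("export:{$exportId}")
    ->onQueue('low')
    ->then(function (Batch $batch) use ($exportId) {
        // Runs once, after every job in the batch has succeeded.
        Log::info('export finished', ['export_id' => $exportId]);
    })
    ->catch(function (Batch $batch, Throwable $e) use ($exportId) {
        // Runs on the first job failure in the batch.
        Log::error('export failed', [
            'export_id' => $exportId,
            'error' => $e->getMessage(),
        ]);
    })
    ->dispatch();

// Store $batch->id alongside the export so a status endpoint can look it up:
// Bus::findBatch($batchId)?->progress() returns an integer percentage.
```

The stored batch ID is what connects the backend to the progress UI described later in this guide.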

9. External API Integration Pattern: Timeouts, Retries, and Fallbacks

External APIs fail. That’s exactly why they pair well with queues.

define a short-ish timeout
set retry count and spacing
isolate failures + provide a manual recovery path
use fallback when possible (cached value, retry later)

Laravel HTTP client works well inside jobs:

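// Up to 3 attempts, 200 ms apart, each limited to 10 seconds.
// Note: by default retry() throws a RequestException once the final attempt fails,
// so the failed() check below is a safety net for setups where that is disabled.
$res = \Illuminate\Support\Facades\Http::timeout(10)
  ->retry(3, 200)
  ->post($url, $payload);

if ($res->failed()) {
  throw new \RuntimeException('external api failed: HTTP '.$res->status());
}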

“Throw on failure” becomes simpler in a job context because retries and isolation are the queue’s job.
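
The "fallback when possible" item can be sketched as a last-known-good cache. The URL, the rate field, the cache key, and the one-day lifetime are all hypothetical; whether a stale value is acceptable depends on the feature.

```php
<?php

use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Http;

/**
 * Fetch a value from an external API, falling back to the last known good value.
 * Endpoint, response shape, and cache key are placeholders.
 */
function fetchExchangeRate(string $currency): float
{
    $cacheKey = "fx_rate:{$currency}";

    try {
        $rate = Http::timeout(5)
            ->get('https://api.example.test/rates', ['currency' => $currency])
            ->throw()          // turn 4xx/5xx responses into exceptions
            ->json('rate');

        if (! is_numeric($rate)) {
            throw new UnexpectedValueException('rate missing from response');
        }

        Cache::put($cacheKey, (float) $rate, now()->addDay()); // remember last good value

        return (float) $rate;
    } catch (Throwable $e) {
        $cached = Cache::get($cacheKey);

        if ($cached === null) {
            throw $e; // nothing to fall back to: let the queue retry or fail the job
        }

        return (float) $cached;
    }
}
```

When no cached value exists the exception still propagates, so retries and failure isolation continue to work as described above.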

10. Failed Job Operations: Isolate and Make It Visible

Queues are less about “zero failures” and more about “recoverable failures.” Key operational needs:

notice growth in failed jobs (alerts)
trace failure reasons (exception, job name, target IDs, trace ID)
have a defined replay procedure (when/how to retry; when to fix data)
turn permanent failures into product fixes or user messaging

Include at least these fields in failure logs:

target IDs (userId, orderId, etc.)
tenant ID (for multi-tenant)
trace_id (if request-originated)
external API response summary (mask sensitive info)
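
One place to emit those fields is the job's failed() method, which Laravel calls once the job has failed for good. The job below is hypothetical and shows only the logging part; the field names mirror the list above.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\Log;
use Throwable;

// Hypothetical job showing only the failure-logging part.
class SyncOrderToCrm implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public function __construct(
        public int $orderId,
        public int $tenantId,
        public ?string $traceId = null, // passed in from the originating request, if any
    ) {}

    public function handle(): void
    {
        // ... call the external API; throw on failure ...
    }

    // Called after retries are exhausted or fail() was called explicitly.
    public function failed(?Throwable $exception): void
    {
        Log::error('job failed', [
            'job' => static::class,
            'order_id' => $this->orderId,
            'tenant_id' => $this->tenantId,
            'trace_id' => $this->traceId,
            'exception' => $exception ? get_class($exception) : null,
            'message' => $exception?->getMessage(),
        ]);
    }
}
```

For replay, `php artisan queue:failed` lists failed jobs and `php artisan queue:retry {id}` (or `queue:retry all`) pushes them back onto the queue once the cause is fixed.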

11. Monitoring with Horizon: Visibility Alone Buys Peace of Mind

With Horizon (Redis-based), you can see:

throughput per queue
failures
wait time (delay)
worker state

Operationally useful indicators:

queue wait time (rising means user impact risk)
failure rate spikes (external outages or deployment issues)
job duration increases (early signal of clogging)

Start alerts small:

sudden failure spike
wait time above threshold
worker down

That’s usually enough to keep operations manageable.
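
Of these three, the wait-time alert is the one Horizon covers out of the box. A sketch, assuming a placeholder mail address and illustrative thresholds:

```php
<?php

namespace App\Providers;

use Laravel\Horizon\Horizon;
use Laravel\Horizon\HorizonApplicationServiceProvider;

class HorizonServiceProvider extends HorizonApplicationServiceProvider
{
    public function boot(): void
    {
        parent::boot();

        // Where Horizon sends its long-wait notifications (placeholder address).
        Horizon::routeMailNotificationsTo('ops@example.test');
    }

    // gate() for dashboard access omitted for brevity.
}

// config/horizon.php (excerpt): wait thresholds in seconds per "connection:queue".
// 'waits' => [
//     'redis:high' => 30,
//     'redis:default' => 60,
//     'redis:low' => 600,
// ],
```

Failure spikes and worker-down detection typically come from your log/metrics stack or process supervisor rather than from this setting.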

12. User Notifications and an Accessible Progress UI: Remove Async “Confusion”

From the user’s view, async often looks like “I clicked, nothing happened.”
Fixing this improves UX and reduces support volume.

12.1 Minimum 3 States

Start: acknowledged
In progress: processing (if needed)
Done/Failed: result + next action

12.2 Standard UI Pattern (Start → Done Messaging)

@if(session('status'))
  <div role="status" aria-live="polite" class="border p-3 mb-4">
    {{ session('status') }}
  </div>
@endif

@if(session('error'))
  <div role="alert" class="border p-3 mb-4">
    {{ session('error') }}
  </div>
@endif

Start message example (export)

“Export started. We’ll notify you when it’s complete.”
If possible, add an expectation like “Usually completes within a few minutes,” but avoid over-promising.
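
On the server side, the start message is just a flash value set when the job is dispatched. The controller below is a sketch: the ExportCsv constructor arguments and the allowed range values are assumptions.

```php
<?php

namespace App\Http\Controllers;

use App\Jobs\ExportCsv;
use Illuminate\Http\RedirectResponse;
use Illuminate\Http\Request;

class ExportController extends Controller
{
    public function store(Request $request): RedirectResponse
    {
        $data = $request->validate([
            // Allowed values are assumptions for this sketch.
            'range' => ['required', 'in:last_7_days,last_30_days'],
        ]);

        // Constructor arguments of ExportCsv are assumed here.
        ExportCsv::dispatch($request->user()->id, $data['range'])->onQueue('low');

        // Picked up by the role="status" block in the Blade template.
        return back()->with('status', 'Export started. We’ll notify you when it’s complete.');
    }
}
```

The redirect plus session flash is what makes the acknowledgement appear in the status region on the next page load.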

12.3 Progress (Polling/Events): Practical Accessibility Rules

show numeric progress in text (e.g., “40%”)
use aria-live="polite" and avoid overly frequent updates
don’t rely on spinner color alone
keep keyboard paths for “Cancel” or “Back”

Example (concept):

<div id="progress" role="status" aria-live="polite">Preparing…</div>
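
To keep announcements infrequent no matter how often the client polls, the server can round progress before turning it into text. The helper below is a sketch; the function name, the wording, and the 10% step are assumptions.

```php
<?php

/**
 * Build the text for a role="status" region.
 * Progress is rounded down to a multiple of $step, so the announced text
 * changes at most (100 / $step) times however often the client polls.
 */
function progressMessage(int $done, int $total, int $step = 10): string
{
    if ($total <= 0) {
        return 'Preparing…';
    }

    // Integer arithmetic avoids floating-point rounding surprises.
    $percent = intdiv(max(0, min($done, $total)) * 100, $total);

    if ($percent >= 100) {
        return 'Export complete.';
    }

    $announced = intdiv($percent, $step) * $step;

    return "Exporting: {$announced}% complete.";
}
```

A polling endpoint can return this string and the client only replaces the region's text when it differs from what is already shown, so screen readers are not interrupted by identical updates.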

12.4 Failure UX (Critical)

If the UI only says “Error,” the user is stuck. Provide:

retry (button/link)
alternatives (smaller file, narrower date range)
a support/reference ID (trace_id, etc.)

Together, these three give the user a way forward instead of a dead end, and give support something concrete to look up.

13. Testing: Treat Queues as “Specifications” with Fakes

Instead of running queues in tests, stabilize behavior with Queue::fake() and assert dispatch.

use Illuminate\Support\Facades\Queue;

public function test_export_dispatches_job()
{
    Queue::fake();

    $user = User::factory()->create();
    $this->actingAs($user);

    $this->post('/export', ['range' => 'last_30_days'])
        ->assertRedirect()
        ->assertSessionHas('status');

    Queue::assertPushed(\App\Jobs\ExportCsv::class);
}

For external APIs, use Http::fake() to model success/failure and prevent regressions in retry policy.
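
A sketch of such tests follows. SyncOrderToCrm is a hypothetical job, assumed to POST to a placeholder host without retry() and to throw a RuntimeException when the response has failed, in the style of the snippet in section 9.

```php
<?php

namespace Tests\Feature;

use App\Jobs\SyncOrderToCrm; // hypothetical job
use Illuminate\Http\Client\Request;
use Illuminate\Support\Facades\Http;
use RuntimeException;
use Tests\TestCase;

class SyncOrderToCrmTest extends TestCase
{
    public function test_job_calls_the_api_on_success(): void
    {
        Http::fake([
            'api.example.test/*' => Http::response(['id' => 'crm_1'], 200),
        ]);

        (new SyncOrderToCrm(orderId: 1, tenantId: 1))->handle();

        // The job is expected to have sent a request to the placeholder host.
        Http::assertSent(fn (Request $request) =>
            str_starts_with($request->url(), 'https://api.example.test/'));
    }

    public function test_job_throws_when_api_is_down(): void
    {
        // Every request to the placeholder host returns 503.
        Http::fake([
            'api.example.test/*' => Http::response(['error' => 'unavailable'], 503),
        ]);

        $this->expectException(RuntimeException::class);

        (new SyncOrderToCrm(orderId: 1, tenantId: 1))->handle();
    }
}
```

Calling handle() directly keeps the test focused on the job's own behavior; whether the queue then retries is covered by the tries/backoff settings rather than by this test.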

14. Common Pitfalls and How to Avoid Them

Workers clog because jobs are too heavy
- Fix: split queues, split job granularity, adjust timeouts, precompute (“materialize”) aggregates instead of recalculating them per job
Double execution duplicates email/payments
- Fix: idempotency keys, “sent” flags, locks
Failures are invisible until they pile up
- Fix: Horizon + alerts + structured logs
External API instability causes retry storms
- Fix: cap retries, backoff, classify permanent failures, introduce circuit-breaker-like damping gradually
Users don’t know what happened
- Fix: start/done/fail messaging, role="status", retry paths

15. Checklist (Shareable)

Design

[ ] Queued workloads are identified (heavy/unstable/delay-tolerant)
[ ] Job granularity is appropriate (1 job ≈ 1 purpose)
[ ] Idempotency exists (sent/completed checks)
[ ] Queue splitting (high/default/low) localizes delays

Operations

[ ] tries/backoff/timeout are explicit
[ ] Failed jobs are visible (monitoring/alerts/logging)
[ ] Replay procedure is documented (conditions, owner)
[ ] External APIs have timeout/retry policy

UX/Accessibility

[ ] Start/done/fail are explained in text
[ ] role="status"/aria-live used where appropriate
[ ] Failure provides retry/alternatives/support ID
[ ] Progress indicators don’t rely on color alone

Testing

[ ] Queue::fake() locks down dispatch behavior
[ ] External APIs use Http::fake() for success/failure
[ ] Critical jobs test failure behavior (isolation/notifications)

16. Wrap-Up

Laravel queues are a powerful way to get both “speed” and “resilience” through async execution. The key is not relying on retries alone: prevent duplicates via idempotency, localize delay with queue splitting, and gain calm operations through Horizon and alerting. And because async introduces waiting, you must communicate clearly to users—start/done/fail states and accessible progress patterns that work with screen readers and keyboards. Start by queueing one workflow carefully—export, email, or an external API integration—and build from there.

