Webhook retry logic

When a webhook delivery fails (network timeout, 5xx response, DNS error, connection refused), Hook0 retries with increasing delays. Each failed attempt creates a new request attempt scheduled for later, until the retry limit is reached.

Why retries matter

Most webhook delivery failures are transient. The receiving server was restarting, a load balancer was draining connections, or a brief network partition occurred. A retry a few seconds later usually succeeds.

Without retries, every transient failure becomes a lost event. With naive retries (fixed interval, no limit), you risk overwhelming a recovering server. Hook0 uses a two-phase retry schedule that balances fast recovery with patience.

Two-phase retry schedule

Hook0 uses a configurable two-phase approach instead of a single fixed schedule:

Fast retries -- frequent attempts with increasing delays, to recover from brief outages quickly
Slow retries -- spaced-out attempts at fixed intervals, to handle longer outages without overwhelming the endpoint

Default retry configuration

Every application gets these defaults. Most users never need to change them:

Parameter	Default	Range	Description
`max_fast_retries`	30	0-100	Number of fast-phase retry attempts
`max_slow_retries`	30	0-100	Number of slow-phase retry attempts
`fast_retry_delay_seconds`	5s	1-3600s	Initial delay between fast retries
`max_fast_retry_delay_seconds`	300s (5min)	1-86400s	Maximum delay between fast retries
`slow_retry_delay_seconds`	3600s (1h)	60-604800s	Fixed delay between slow retries

Per-subscription overrides

Each subscription can override any of these parameters. When a subscription does not specify a retry configuration, it inherits the application-level defaults. This means you can:

Set sane defaults for the whole application
Customize retry behavior for specific subscriptions that need it (e.g., a critical integration that needs more aggressive retries, or a low-priority endpoint that can tolerate longer delays)

What happens on failure

When a delivery attempt fails, Hook0 follows this decision process:

Non-retryable errors

Some errors are never retried because retrying would produce the same result:

Invalid header: the webhook signature could not be constructed (e.g., event type contains characters that are invalid in HTTP headers).

Subscription and application checks

Before scheduling a retry, Hook0 checks that the subscription is still enabled, has not been soft-deleted, and that the parent application still exists. If any of these fail, the retry is skipped.

Delivery status flow

Each webhook delivery attempt goes through these states:

More precisely, Hook0 tracks five statuses:

Status	Meaning
Waiting	Scheduled for future delivery (`delay_until` has not elapsed yet)
Pending	Ready to be picked up by a worker
In Progress	Currently being delivered (picked by a worker)
Successful	Delivery succeeded (2xx HTTP response)
Failed	Delivery failed

The request_attempt table stores every attempt with timestamps (created_at, picked_at, succeeded_at, failed_at, delay_until), so you can calculate:

Time to first delivery: picked_at - created_at
Delivery latency: succeeded_at - picked_at
Total time to success: succeeded_at - created_at (including retries)

Each retry creates a new row in the request_attempt table with an incremented retry_count and a delay_until set to the scheduled retry time.

When all retries are exhausted

When the maximum number of retries is reached (or the retry window expires), Hook0 does not create another attempt. The last attempt stays in failed status.

Failed deliveries are not lost. You can:

Inspect all delivery attempts and their responses via the API or dashboard
Replay the event via the API to re-trigger delivery to all matching subscriptions

Replaying an event resets its dispatched_at field. The dispatch trigger then creates new request attempts for all active subscriptions that match the event's type and labels.

Idempotency

Every event in Hook0 has a unique event_id. Consumers should use this as an idempotency key to handle duplicate deliveries.

Duplicates happen when:

The consumer processed the event but returned a non-2xx response (e.g., crashed after processing but before responding)
Network issues caused the response to be lost
Manual replay of an event

Example implementation

-- PostgreSQL example
CREATE TABLE processed_webhooks (
    event_id UUID PRIMARY KEY,
    processed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Before processing:
INSERT INTO processed_webhooks (event_id)
VALUES ($1)
ON CONFLICT (event_id) DO NOTHING
RETURNING event_id;

-- If no row returned, event was already processed -- skip it.

Configuration

The output worker's retry and delivery behavior is configured via environment variables:

Parameter	Default	Description
`MAX_RETRIES`	25	Maximum delivery attempts before giving up
`MAX_RETRY_WINDOW`	8 days	Maximum time window for retries
`CONNECT_TIMEOUT`	5 seconds	Timeout for establishing a TCP connection
`TIMEOUT`	15 seconds	Total HTTP request timeout (including connect)
`CONCURRENT`	1	Number of request attempts handled concurrently

Error types

When a delivery fails, Hook0 records one of these error codes:

Error code	Meaning
`E_TIMEOUT`	The HTTP request timed out
`E_CONNECTION`	Could not establish a connection to the target
`E_HTTP`	The server responded with a non-2xx status code
`E_INVALID_TARGET`	The target URL is invalid or resolves to a forbidden IP
`E_INVALID_HEADER`	A required header value could not be constructed (non-retryable)
`E_UNKNOWN`	An unexpected error occurred

SSRF protection

Hook0 blocks webhook deliveries to private/internal IP addresses by default (loopback, RFC 1918, link-local, etc.). This prevents Server-Side Request Forgery attacks. This check can be disabled with the DISABLE_TARGET_IP_CHECK flag for development environments.

Why retries matter​

Two-phase retry schedule​

Default retry configuration​

Per-subscription overrides​

What happens on failure​

Non-retryable errors​

Subscription and application checks​

Delivery status flow​

When all retries are exhausted​

Idempotency​

Example implementation​

Configuration​

Error types​

SSRF protection​

Further reading​