feat: Add Prometheus counters for token exchange #28453
Open
BGZStephen wants to merge 3 commits into master
…ogin
Exposes six counters under the /metrics endpoint when N8N_METRICS=true:
- n8n_token_exchange_requests_total{result} — success/failure rate
- n8n_token_exchange_failures_total{reason} — failure breakdown by reason
- n8n_embed_login_requests_total{result} — embed login success/failure rate
- n8n_embed_login_failures_total{reason} — embed login failure breakdown
- n8n_token_exchange_jit_provisioning_total — JIT-provisioned user count
- n8n_token_exchange_identity_linked_total — identity-linking event count
Failure reasons are normalised to stable labels (invalid_signature,
unknown_key, token_replay, etc.) so dashboards are not broken by future
error message changes.
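
A minimal sketch of how such normalisation could work, assuming a pattern table keyed on message fragments. Only the mappings named in this PR are shown ("Unknown key id" to unknown_key, the three role messages to role_not_allowed, and the fallback to other); the `RULES` table and the function signature are illustrative assumptions, not the code in this PR.

```typescript
// Hypothetical sketch: the rule table and signature are assumptions for illustration.
type FailureReason = 'unknown_key' | 'role_not_allowed' | 'other';

const RULES: ReadonlyArray<{ pattern: RegExp; reason: FailureReason }> = [
	{ pattern: /unknown key id/i, reason: 'unknown_key' },
	{ pattern: /not allowed|unrecognized role|cannot provision/i, reason: 'role_not_allowed' },
];

function normalizeFailureReason(message: string | undefined): FailureReason {
	if (!message) return 'other';
	for (const { pattern, reason } of RULES) {
		if (pattern.test(message)) return reason;
	}
	// Anything unmatched collapses to one label so label cardinality stays bounded.
	return 'other';
}
```

Because the label set is a closed union, a new error message can only ever land in an existing label or in other, which is what keeps dashboards stable.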
Also closes the embed-login failure monitoring blind spot: adds an
embed-login-failed event that is emitted before re-throwing, wired into
the log-streaming relay and audit event registry alongside the existing
token exchange events.
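
The emit-before-re-throw pattern described above can be sketched as follows. `EventSink` and `handleLoginWithFailureEvent` are hypothetical stand-ins for the real EventService and controller method, and the empty success payload is an assumption; only the event names and the failureReason field come from this PR.

```typescript
// Hypothetical minimal shapes; the real controller and EventService differ.
interface EventSink {
	emit(name: string, payload: Record<string, unknown>): void;
}

async function handleLoginWithFailureEvent<T>(
	events: EventSink,
	handleLogin: () => Promise<T>,
): Promise<T> {
	let result: T;
	try {
		result = await handleLogin();
	} catch (error) {
		// Emit first so metrics and audit listeners see the failure,
		// then re-throw so the caller's error handling is unchanged.
		const failureReason = error instanceof Error ? error.message : String(error);
		events.emit('embed-login-failed', { failureReason });
		throw error;
	}
	events.emit('embed-login', {});
	return result;
}
```

Keeping the success emit outside the try block means a throwing success listener is not misreported as a login failure.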
Extract private counter accesses to named consts so the directive immediately precedes the suppressed line, as required by TypeScript.
1 issue found across 7 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/cli/src/metrics/prometheus-metrics.service.ts">
<violation number="1" location="packages/cli/src/metrics/prometheus-metrics.service.ts:715">
P2: Token-exchange metric listeners are re-registered on every `init()` call, which can double-count events after reinitialization.</violation>
</file>
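
One possible way to address the double-registration issue is a once-only guard around listener setup. This is a sketch under assumptions: `TinyEventBus` and `MetricsSketch` are hypothetical stand-ins, and the real `init()` may do other work that legitimately needs to run again, in which case only the listener registration would sit behind the guard.

```typescript
// Hypothetical stand-ins for EventService and PrometheusMetricsService.
type Handler = () => void;

class TinyEventBus {
	private readonly handlers = new Map<string, Handler[]>();

	on(name: string, handler: Handler): void {
		const list = this.handlers.get(name) ?? [];
		list.push(handler);
		this.handlers.set(name, list);
	}

	emit(name: string): void {
		for (const handler of this.handlers.get(name) ?? []) handler();
	}
}

class MetricsSketch {
	private readonly bus: TinyEventBus;
	private listenersRegistered = false;
	succeeded = 0;

	constructor(bus: TinyEventBus) {
		this.bus = bus;
	}

	init(): void {
		// Guard: attach listeners once, even if init() is called again.
		if (this.listenersRegistered) return;
		this.listenersRegistered = true;
		this.bus.on('token-exchange-succeeded', () => {
			this.succeeded += 1;
		});
	}
}
```

With the guard in place, calling `init()` twice and emitting one event increments the counter once instead of twice.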
Architecture diagram
sequenceDiagram
participant Client as Client Browser
participant Ctrl as EmbedAuthController
participant Svc as TokenExchangeService
participant Events as EventService (Local)
participant Metrics as PrometheusMetricsService
participant Audit as LogStreamingEventRelay
participant Prom as Prometheus Server
Note over Metrics,Events: Initialization (N8N_METRICS=true)
Metrics->>Events: Register listeners for token-exchange and embed-login events
Metrics->>Metrics: NEW: Initialize 6 counters (pre-seed success/failure labels at 0)
Note over Client,Audit: Request Flow: Embed Login
Client->>Ctrl: GET /embed/login?token=...
Ctrl->>Svc: embedLogin(subjectToken)
alt Success Path
Svc-->>Ctrl: User Identity
Ctrl->>Events: emit('embed-login')
Events-->>Metrics: Trigger handler
Metrics->>Metrics: inc(n8n_embed_login_requests_total{result:success})
Ctrl-->>Client: 302 Redirect + Auth Cookie
else CHANGED: Failure Path
Svc-->>Ctrl: Throw Error (e.g. "Unknown key id")
Ctrl->>Events: NEW: emit('embed-login-failed', { failureReason })
par Async Metrics Update
Events-->>Metrics: Trigger handler
Metrics->>Metrics: inc(n8n_embed_login_requests_total{result:failure})
Metrics->>Metrics: NEW: normalizeFailureReason(reason)
Metrics->>Metrics: inc(n8n_embed_login_failures_total{reason:unknown_key})
and Async Audit Log
Events-->>Audit: Trigger relay
Audit->>Audit: NEW: embedLoginFailed()
Note right of Audit: Emits n8n.audit.token-exchange.embed-login-failed
end
Ctrl-->>Client: 500 / Error Response
end
Note over Client,Metrics: Background: Metric Scraping
Prom->>Metrics: GET /metrics
Metrics-->>Prom: Return counters (Requests, Failures, JIT, Linked)
Summary
Adds Prometheus counters for token exchange and embed login operations so operators can monitor the health of these authentication flows — detecting key misconfiguration, replay attacks, and JIT provisioning bursts without digging through logs.
New metrics (always registered when N8N_METRICS=true)
- n8n_token_exchange_requests_total (label: result: success|failure)
- n8n_token_exchange_failures_total (label: reason: <code>)
- n8n_embed_login_requests_total (label: result: success|failure)
- n8n_embed_login_failures_total (label: reason: <code>)
- n8n_token_exchange_jit_provisioning_total
- n8n_token_exchange_identity_linked_total
Failure reason labels are stable codes normalised from error messages (invalid_signature, unknown_key, token_replay, token_too_long, token_near_expiry, invalid_format, missing_kid, missing_iss, invalid_claims, internal_error, role_not_allowed, other), so dashboards won't break if error message text changes.
Also: embed login failure visibility
The embed auth controller previously let errors propagate silently (no event emitted, no metric). This PR wraps handleLogin() in a try/catch that emits a new embed-login-failed event before re-throwing, closing the monitoring blind spot. The event is also wired into the log-streaming relay and audit event registry alongside the existing token exchange events.
How to test
Related Linear tickets, Github issues, and Community forum posts
https://linear.app/n8n/issue/IAM-475
Tests
Unit tests added in packages/cli/src/metrics/__tests__/prometheus-metrics.service.test.ts covering:
- counter registration in init() (unconditional, no config flag required)
- result label combos (success/failure) are pre-seeded at 0 on startup
- token-exchange-succeeded → increments success counter
- token-exchange-failed → increments failure counter + maps error message to normalized reason label
- unmatched error messages map to 'other' (cardinality safety)
- role errors ('not allowed', 'Unrecognized role', 'Cannot provision') map to 'role_not_allowed'
- embed-login → increments embed login success counter
- embed-login-failed → increments embed login failure counter + normalizes reason
- token-exchange-user-provisioned → increments JIT provisioning counter
- token-exchange-identity-linked → increments identity-linked counter
Embed controller test updated: failure path now asserts embed-login-failed is emitted and embed-login (success event) is not.
Review / Merge checklist
Backport to Beta, Backport to Stable, or Backport to v1 (if the PR is an urgent fix that needs to be backported)