fix(editor): Resolve nodes stuck on loading after execution in instance-ai preview #28450
Open
Remove type icons from tab labels, make tabs fill full header height, and replace loading spinners with larger loader-circle icon (80px). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…g (no-changelog) Adds a Playwright e2e test that captures the bug where the last node in the instance AI workflow preview stays in "running" state (spinning border) after execution completes. The test sends a specific prompt to build and execute a 3-node workflow, then asserts that no canvas nodes remain with the .running CSS class. Includes InstanceAiPage page object, navigation helper, and test fixtures with N8N_ENABLED_MODULES=instance-ai. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ionFinished (no-changelog) The event relay watcher only forwarded the last event in the log, so when Vue coalesced multiple ref updates into one callback, intermediate events (e.g. nodeExecuteAfter for the last node) were silently dropped. This left the iframe's executing-node queue with a stale entry, keeping the last node in spinning/running state after the workflow finished. - Track relayed event count so every new event is forwarded, even when the watcher fires once for multiple log additions. - Keep the eventLog intact when executionFinished arrives (instead of clearing it immediately) so the relay can forward pending events before sending the synthetic executionFinished. - Add clearEventLog() to useExecutionPushEvents, called by the relay after all pending events have been forwarded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
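The count-based relay described in this commit can be illustrated with a minimal sketch. `EventRelay`, `PushEvent`, and `onLogChanged` are hypothetical names chosen for illustration, not the actual `useEventRelay` code, and the reset-on-shorter-log rule is an assumption about how a cleared log might be handled.

```typescript
// Hypothetical event shape; the real push event types are not shown in this PR text.
interface PushEvent {
  type: string;
  nodeName?: string;
}

// Forwards every event appended since the previous call, not only the newest one.
class EventRelay {
  private relayedCount = 0;

  constructor(private readonly forward: (event: PushEvent) => void) {}

  // Safe to call once for several appended events, which is what happens
  // when a watcher coalesces multiple updates into a single callback.
  onLogChanged(eventLog: readonly PushEvent[]): void {
    // Assumption: a log shorter than the cursor was cleared or replaced.
    if (eventLog.length < this.relayedCount) this.relayedCount = 0;
    for (const event of eventLog.slice(this.relayedCount)) {
      this.forward(event);
    }
    this.relayedCount = eventLog.length;
  }
}
```

Forwarding only the last element of the log would drop the intermediate `nodeExecuteAfter` in the coalesced case; keeping a cursor avoids that without forwarding anything twice.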
Add 16 Playwright e2e tests across 6 spec files covering instance AI workflow preview, artifacts, timeline, sidebar, confirmations, and chat basics. Wire up proxy-aware fetch in the AI SDK model creation so MockServer can intercept Anthropic API calls for recording/replay. - Expand InstanceAiPage page object with 30+ locators - Add InstanceAiSidebar component page object - Add data-test-id to preview close button - Add getProxyFetch() to model-factory.ts and instance-ai.service.ts so @ai-sdk/anthropic respects HTTP_PROXY in e2e containers - Rewrite fixtures with proxy service recording support - Replace single execution-state test with comprehensive suite Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hangelog) Add two-tier trace replay system that records tool I/O during e2e test recording and replays with bidirectional ID remapping in CI. This enables deterministic replay of complex multi-step agent tests where tool execution produces dynamic IDs. - New trace-replay.ts: IdRemapper (ID-field-aware), TraceIndex (per-role cursors), TraceWriter, JSONL I/O helpers, PURE_REPLAY_TOOLS set - Modified langsmith-tracing.ts: replayWrapTool (Tier 1: real execution + ID remap), pureReplayWrapTool (Tier 2: pure replay for external deps), recordWrapTool, createTraceReplayOnlyContext stub for non-LangSmith envs - New test-only controller endpoints: POST/GET/DELETE /test/tool-trace with slug-scoped storage for parallel test isolation - Updated fixture: records trace.jsonl during recording, loads for replay, slug-scoped activate/retrieve/clear lifecycle - 23 unit tests for IdRemapper and TraceIndex - Recorded trace.jsonl files for all 15 instance AI test expectations Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
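As a rough illustration of bidirectional, ID-field-aware remapping, the sketch below keeps two maps and rewrites only values stored under keys that look like ID fields. The class body, the `remapIds` helper, and the `isIdKey` rule (key is `id` or ends in `Id`) are assumptions made for this example; the real `IdRemapper` in `trace-replay.ts` may decide which fields count as IDs differently.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Simplified stand-in for the PR's IdRemapper: recorded ID <-> live ID.
class IdRemapper {
  private readonly recordedToLive = new Map<string, string>();
  private readonly liveToRecorded = new Map<string, string>();

  learn(recordedId: string, liveId: string): void {
    this.recordedToLive.set(recordedId, liveId);
    this.liveToRecorded.set(liveId, recordedId);
  }

  // Unknown IDs pass through unchanged.
  toLive(id: string): string {
    return this.recordedToLive.get(id) ?? id;
  }

  toRecorded(id: string): string {
    return this.liveToRecorded.get(id) ?? id;
  }
}

// Assumed heuristic for what counts as an ID field.
const isIdKey = (key: string): boolean => key === 'id' || key.endsWith('Id');

// Recursively rewrites string values that sit under ID-like keys.
function remapIds(value: Json, translate: (id: string) => string, key = ''): Json {
  if (typeof value === 'string') return isIdKey(key) ? translate(value) : value;
  if (Array.isArray(value)) return value.map((item) => remapIds(item, translate, key));
  if (value !== null && typeof value === 'object') {
    const out: { [key: string]: Json } = {};
    for (const [k, v] of Object.entries(value)) out[k] = remapIds(v, translate, k);
    return out;
  }
  return value;
}
```

Restricting the rewrite to ID-like keys keeps free text (names, prompts) untouched even when it happens to contain a recorded ID.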
…log) Add subString body matching on the system prompt to disambiguate LLM call types (title generation vs orchestrator vs sub-agent) during proxy replay. Without this, sequential expectations could be served to the wrong call when the call order differs between recording and replay. Re-record all expectations with the body matcher and remove debug logging from trace replay wrappers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When re-recording with a real API key, always use record mode (never load old trace events into the backend). Previously, existing trace files would cause the backend to enter replay mode during re-recording, resulting in trace.jsonl files with only a header and no tool calls. Re-record all trace.jsonl files with proper tool call events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Re-record all proxy expectations after fixing the recording mode logic. Expectations now have subString body matchers on the system prompt and trace.jsonl files have proper tool call events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The proxy's sequential mode sets the last expectation as unlimited (fallback for extra agent turns). Previously this applied to the last file alphabetically which could be a community_nodes GET. Now it finds the last /v1/messages POST expectation specifically. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
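A minimal sketch of the selection rule, assuming a simplified expectation shape with `httpRequest.method`, `httpRequest.path`, and a `times` object. The `Expectation` interface and `markFallback` function are hypothetical; the fixture's actual code is not shown here.

```typescript
// Simplified, assumed shape of a recorded proxy expectation.
interface Expectation {
  httpRequest: { method: string; path: string };
  times?: { remainingTimes?: number; unlimited?: boolean };
}

// Marks the last POST /v1/messages expectation as unlimited so it can serve
// as the fallback for extra agent turns; every other expectation is untouched.
function markFallback(expectations: readonly Expectation[]): Expectation[] {
  let lastLlmIndex = -1;
  expectations.forEach((expectation, index) => {
    const { method, path } = expectation.httpRequest;
    if (method === 'POST' && path === '/v1/messages') lastLlmIndex = index;
  });
  return expectations.map((expectation, index) =>
    index === lastLlmIndex ? { ...expectation, times: { unlimited: true } } : expectation,
  );
}
```

Selecting by method and path rather than by position means a trailing non-LLM file (such as a community nodes GET) no longer becomes the fallback.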
…elog) Background task completion triggers `startInternalFollowUpRun`, which creates a new trace context. Previously each context got a fresh TraceIndex with cursor at 0, so the follow-up run's first tool call (e.g. list-workflows) would mismatch the first trace event (build-workflow-with-agent) and throw. Fix: store a shared TraceIndex/IdRemapper per test slug on the service. All runs within the same slug reuse the same instances, preserving cursor state across the initial run and any follow-up runs. This fixes the two confirmation e2e tests that rely on suspend/resume. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
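The cursor-sharing idea can be sketched as a per-slug registry. `TraceCursor` and `TraceReplayRegistry` are hypothetical, heavily simplified stand-ins for the PR's `TraceIndex` and the per-slug state stored on the service; the trace is reduced to a list of tool names for illustration.

```typescript
// Hypothetical stand-in for TraceIndex: a cursor over recorded tool names.
class TraceCursor {
  private position = 0;

  constructor(private readonly toolNames: readonly string[]) {}

  // Advances the cursor, throwing if the live call does not match the recording.
  expect(toolName: string): void {
    const recorded = this.toolNames[this.position];
    if (recorded !== toolName) {
      throw new Error(
        `Trace mismatch at ${this.position}: recorded ${recorded ?? '<end of trace>'}, got ${toolName}`,
      );
    }
    this.position += 1;
  }
}

// One cursor per test slug, reused by the initial run and any follow-up runs.
class TraceReplayRegistry {
  private readonly cursors = new Map<string, TraceCursor>();

  constructor(private readonly loadTrace: (slug: string) => readonly string[]) {}

  forSlug(slug: string): TraceCursor {
    let cursor = this.cursors.get(slug);
    if (!cursor) {
      cursor = new TraceCursor(this.loadTrace(slug));
      this.cursors.set(slug, cursor);
    }
    return cursor;
  }

  clear(slug: string): void {
    this.cursors.delete(slug);
  }
}
```

With this shape, a follow-up run that asks for the same slug continues from where the initial run stopped instead of restarting at the first recorded event.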
…(no-changelog) waitForAssistantResponse only waited for the first message element to appear (streaming start), not for the agent to finish. Sidebar operations then raced against the still-running agent. New waitForResponseComplete waits for the send button to reappear, which only renders when isStreaming becomes false. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s between tests (no-changelog) Two preview tests failed because their recorded proxy expectations contained stale LLM responses from previous tests' background task follow-ups. The fixture now cancels leftover background tasks before each test via a new test-only endpoint, preventing future cross-test contamination. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ET (no-changelog) MockServer proxy connections intermittently reset when 4 parallel workers load expectations simultaneously. Add withRetry helper with exponential backoff (3 retries, 500ms base) and re-throw on failure instead of silently swallowing the error. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
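A minimal sketch of such a helper follows. The commit gives only the name `withRetry` and the parameters (3 retries, 500ms base); the signature, the injectable `sleep`, the doubling factor, and the reading of "3 retries" as one initial attempt plus up to three more are assumptions.

```typescript
interface RetryOptions {
  retries?: number;
  baseDelayMs?: number;
  // Injectable so tests do not have to wait for real timers.
  sleep?: (ms: number) => Promise<void>;
}

const defaultSleep = async (ms: number): Promise<void> => {
  await new Promise<void>((resolve) => setTimeout(resolve, ms));
};

// Retries a failing async operation with exponential backoff and re-throws
// the last error once the retries are exhausted.
async function withRetry<T>(operation: () => Promise<T>, options: RetryOptions = {}): Promise<T> {
  const { retries = 3, baseDelayMs = 500, sleep = defaultSleep } = options;
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Delay doubles per retry: 500 ms, 1000 ms, 2000 ms with the defaults.
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

Re-throwing the final error matters here: a swallowed failure while loading expectations would surface later as a confusing replay mismatch rather than as a connection error.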
…ation (no-changelog) Positional selectors (.last()) break when parallel tests create threads in shared containers. Switch to getThreadByTitle() with LLM-generated titles from recordings. Also handle missing expectations directories gracefully. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ure (no-changelog) Covers the record/replay architecture, ID remapping problem and solution, two-tier tool wrapping strategy, trace format, and troubleshooting guide. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ant proxy expectations (no-changelog) - Add unit tests for TraceWriter, parseTraceJsonl, model-factory proxy fetch, clearEventLog, and useEventRelay coalesced event handling - Extract test-only trace replay endpoints into InstanceAiTestController, conditionally registered when N8N_INSTANCE_AI_TRACE_REPLAY is set - Extract trace replay state from InstanceAiService into TraceReplayState class - Remove 83 irrelevant api-staging community nodes expectation files - Fix stale test that expected eventLog cleared on executionFinished Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tance AI e2e tests (no-changelog) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…laims (no-changelog) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename restricted `err` identifier to `error` and mark withRetry callbacks async to comply with @typescript-eslint/promise-function-async.
Use nullish coalescing for proxy env vars, move the undici type annotation to a top-level import type, and mark the returned fetch wrapper async so its Promise return type is explicit.
…hangelog) Iterate all workflow executions so background workflows have their buffered event logs cleaned up when they finish, preventing stale replay if the user later switches tabs. Track the last-seen executionId per workflow to detect when useExecutionPushEvents issues a fresh eventLog on a re-execution and reset the relay cursor, which otherwise would skip the new run's events.
…elog) Replace import() type annotation with top-level import, quote 'string' property keys to avoid id-denylist, and move the e2e tests doc under the instance-ai package docs folder.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gelog) clearDir was skipped when recordedExpectations was empty, leaving stale files that subsequent replays would consume as outdated data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-tabs # Conflicts: # packages/cli/src/modules/instance-ai/instance-ai.service.ts
Replace the implicit `jsonParse<TraceEvent>` cast in `parseTraceJsonl` with a real type guard. Each line must be an object with a known `kind` discriminator (`header`, `tool-call`, `tool-suspend`, `tool-resume`); otherwise the parse throws with the offending line number so a malformed expectation file fails loudly instead of corrupting downstream replay. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
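The validation described above might look roughly like the following. The four `kind` values come from the commit message; `isTraceEvent`, `parseTraceLines`, the error wording, and the use of plain `JSON.parse` in place of `jsonParse` are assumptions for this sketch.

```typescript
const TRACE_KINDS = ['header', 'tool-call', 'tool-suspend', 'tool-resume'] as const;
type TraceKind = (typeof TRACE_KINDS)[number];

// Only the discriminator is modelled; the remaining fields are not shown in this PR text.
interface TraceEvent {
  kind: TraceKind;
  [key: string]: unknown;
}

// Type guard: an object whose `kind` is one of the known discriminators.
function isTraceEvent(value: unknown): value is TraceEvent {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) return false;
  const kind = (value as { kind?: unknown }).kind;
  return typeof kind === 'string' && (TRACE_KINDS as readonly string[]).includes(kind);
}

// Parses JSONL content, failing loudly with the 1-based line number.
function parseTraceLines(content: string): TraceEvent[] {
  const events: TraceEvent[] = [];
  content.split('\n').forEach((line, index) => {
    if (line.trim() === '') return; // tolerate blank lines and a trailing newline
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      throw new Error(`Invalid JSON on trace line ${index + 1}`);
    }
    if (!isTraceEvent(parsed)) {
      throw new Error(`Invalid trace event on line ${index + 1}`);
    }
    events.push(parsed);
  });
  return events;
}
```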
…iew test Adds two layered Playwright runners and a failing test that exercises them: - `pnpm test:local:isolated` — generic local runner with a random port, throwaway `N8N_USER_FOLDER` under the OS temp dir, full `@capability:*` inclusion, and process-group cleanup. Extracted so other modules can reuse it via `N8N_TEST_ENV`. - `pnpm test:local:instance-ai` — thin wrapper that pre-fills the four instance-ai env vars (`N8N_ENABLED_MODULES`, model, key, local-gateway-disabled) over the generic runner. - New `should mark all nodes as success after execution completes` test in the workflow preview spec — currently failing because terminal nodes after a Wait stay in the running/waiting state instead of flipping to success. - `instanceAiProxySetup` fixture now no-ops when there's no `n8nContainer`, so the local runner can hit the real Anthropic API without a proxy stack. - README.md additions cover both runners, env-var levers (`PLAYWRIGHT_ALLOW_CONTAINER_ONLY`, `PLAYWRIGHT_SKIP_WEBSERVER`), and the instance-ai test workflow end-to-end. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a third subsection under "Running Tests" in e2e-tests.md covering the new local-build mode (no docker, real Anthropic key) via `pnpm test:local:instance-ai`, alongside a "when to use which mode" guide. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ce-ai preview Three root causes fixed: 1. Execution polling for fire-and-forget tool — the execute_workflow tool returns immediately before completion. The preview fetches execution data via API but it may still be running (Wait node pending). Added polling in InstanceAiWorkflowPreview to reload the iframe when execution finishes. 2. Relay desync on Wait resume — when a Wait node resumes, the server sends a second executionStarted with the same execution ID. The event handler was resetting the eventLog, desyncing the relay cursor. Now appends to the existing log when the same execution ID is seen. 3. Wait node permanently showing 'waiting' — the Wait node's executionStatus in run data stays 'waiting' even after the overall execution succeeds. Now promoted to 'success' when the execution completed successfully. Also fixes pre-existing stylelint error (invalid CSS variable name). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-exec-preview Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bundle Report: Changes will increase total bundle size by 44.8kB (0.1%) ⬆️. This is within the configured threshold ✅. Affected bundle: editor-ui-esm.
The test was lost during merge conflict resolution (master's version was taken for the spec file). Adds back the test that verifies all nodes show success after a workflow with a Wait node completes execution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Regenerated recordings for all 4 workflow preview tests including the new "should mark all nodes as success after execution completes" test. Also excludes expectations/ from biome checks (recorded JSON fixtures can exceed the 1MB file size limit). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…expectations These external API calls to api-staging.n8n.io are not relevant to the instance-ai test recordings and add unnecessary bulk. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…es proxy noise Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Wait node already has executionStatus='success' in the database after the execution completes. Only the Set node (after the Wait) was missing data because the execution hadn't finished when the iframe first fetched. The polling fix handles that case — this promotion was not needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When running locally (no Docker container), n8nContainer is null. The fixture now short-circuits proxy setup in that case, matching the behavior before the master merge overwrote it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…re expectations Poll indefinitely while the agent is streaming instead of a fixed 40-attempt cap. Once streaming stops, allow a short grace window (~7.5s) before giving up. Also revert expectation recordings to match master. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
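The polling rule can be sketched as a small decision function. `ExecutionPoller`, the 1500 ms interval, and the 5 grace attempts (5 × 1500 ms = 7.5 s) are assumptions chosen to match the "~7.5s" figure; the commit does not state how the window is split, nor whether renewed streaming resets it.

```typescript
// Assumed values: the commit only states a grace window of roughly 7.5 s.
const POLL_INTERVAL_MS = 1500;
const GRACE_ATTEMPTS = 5;

type PollDecision = 'reload' | 'continue' | 'give-up';

// Decides what to do after each poll of the execution status.
class ExecutionPoller {
  private graceAttemptsUsed = 0;

  next(executionFinished: boolean, isStreaming: boolean): PollDecision {
    if (executionFinished) return 'reload'; // reload the preview iframe
    if (isStreaming) {
      // No cap while the agent is still streaming.
      // Assumption: renewed streaming resets the grace window.
      this.graceAttemptsUsed = 0;
      return 'continue';
    }
    this.graceAttemptsUsed += 1;
    return this.graceAttemptsUsed > GRACE_ATTEMPTS ? 'give-up' : 'continue';
  }
}

// Total grace time implied by the assumed constants, in milliseconds.
const GRACE_WINDOW_MS = POLL_INTERVAL_MS * GRACE_ATTEMPTS;
```

Separating the decision from the timer keeps the rule easy to unit test without fake timers.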
…havior Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Fixes nodes staying stuck in loading/waiting state after workflow execution completes in the instance-ai workflow preview panel.
Root causes identified and fixed:
1. Execution polling for fire-and-forget tool: The execute_workflow MCP tool returns immediately, before the execution completes. The preview iframe fetches execution data via the API, but when nodes such as Wait are involved, the execution may still be running at fetch time. Added polling in InstanceAiWorkflowPreview that detects incomplete executions and reloads the iframe when they finish.
2. Relay desync on Wait node resume: When a Wait node resumes, the server sends a second executionStarted with the same execution ID. useExecutionPushEvents was creating a fresh eventLog, which desynced the relay cursor. It now appends to the existing log when the same execution ID is seen again.
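The resume handling can be sketched as follows. `ExecutionLog`, `PushEvent`, and `handleExecutionStarted` are hypothetical names; the real logic lives in `useExecutionPushEvents` and is not shown here.

```typescript
interface PushEvent {
  type: string;
}

interface ExecutionLog {
  executionId: string;
  events: PushEvent[];
}

// Handles an executionStarted push event for one workflow.
function handleExecutionStarted(
  current: ExecutionLog | undefined,
  executionId: string,
): ExecutionLog {
  const started: PushEvent = { type: 'executionStarted' };
  if (current && current.executionId === executionId) {
    // Same execution resuming (for example after a Wait node): keep the log
    // so a relay cursor that indexes into it stays valid.
    current.events.push(started);
    return current;
  }
  // A different execution ID means a genuinely new run: start a fresh log.
  return { executionId, events: [started] };
}
```

Returning the same object on resume means any consumer that tracks a position in `events` keeps pointing at the right place.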
How to test
Review / Merge checklist