[WIP] Improve acceptance test performance and reliability#3230
[WIP] Improve acceptance test performance and reliability#3230st3penta wants to merge 19 commits intoconforma:mainfrom
Conversation
Profiling showed all 285 containers accumulated during the test run with zero termination, wasting memory and causing Docker daemon overhead. Each container type (registry, git, wiremock) now stores the testcontainers.Container reference and calls Terminate(ctx) in the sc.After hook, respecting the persist flag. Ref: EC-1710 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each scenario now writes to its own temp file, eliminating log interleaving across parallel goroutines. Removes shouldSuppress() which incorrectly filtered Error-level messages, and removes /dev/tty writes that failed silently in CI. Replaces os.Exit(1) with t.Fatalf() so TestMain cleanup runs properly. Failed scenario log file paths are printed in the test summary. Ref: EC-1710 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sequential per-version task bundle build loop was the largest CI regression at 4.9x slower than local (2m12s vs 27s). Each version produces an independent bundle image, so the builds are safe to run concurrently. Uses errgroup to propagate the first error and cancel remaining builds via context. Ref: EC-1710 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Important Review skippedIgnore keyword(s) in the title. ⛔ Ignored keywords (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds per-scenario file logging, a runtime profiling package with CI artifact upload, container lifecycle tracking for testcontainers, CLI build caching and concurrent task-bundle builds for kind clusters, and acceptance-test failure/log tracking integrated into scenario lifecycle. Changes
Sequence Diagram(s)sequenceDiagram
participant TestRunner
participant Profile
participant ScenarioLogger
participant ContainerManager
TestRunner->>Profile: ScenarioStart(ctx)
TestRunner->>ScenarioLogger: LoggerFor(ctx) -> create temp log file
Note right of ScenarioLogger: per-scenario file opened
TestRunner->>ContainerManager: BeginContainer("type")
ContainerManager->>Profile: record container begin
TestRunner->>TestRunner: execute scenario steps
TestRunner->>Profile: ScenarioEnd(ctx, name, feature)
Profile->>ScenarioLogger: request LogFile()
ScenarioLogger-->>TestRunner: return logFile path
TestRunner->>ContainerManager: terminate container (if not persisted)
ContainerManager-->>TestRunner: stop container (warn on error)
TestRunner->>ScenarioLogger: Close() or os.Remove(logFile) on success
TestRunner->>Profile: Report()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review Summary by QodoImprove acceptance test performance and reliability with container cleanup and logging
WalkthroughsDescription• Implement container termination in After hooks to prevent accumulation • Replace testing.T log delegation with per-scenario file-based logging • Parallelize task bundle builds using errgroup for concurrent execution • Add content-hash caching for CLI image builds to skip redundant builds Diagramflowchart LR
A["Acceptance Tests"] --> B["Container Termination"]
A --> C["File-Based Logging"]
A --> D["Parallel Builds"]
A --> E["Image Caching"]
B --> B1["Git Container"]
B --> B2["Registry Container"]
B --> B3["WireMock Container"]
C --> C1["Per-Scenario Temp Files"]
C --> C2["Log File Paths in Summary"]
D --> D1["Task Bundle Builds"]
E --> E1["CLI Image Hash Cache"]
File Changes1. acceptance/acceptance_test.go
|
Code Review by Qodo
|
| g, gCtx := errgroup.WithContext(ctx) | ||
| for version, tasks := range taskBundles { | ||
| tasksPath := strings.Join(tasks, ",") | ||
| cmd := exec.CommandContext(ctx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */ | ||
| if out, err := cmd.CombinedOutput(); err != nil { | ||
| fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err) | ||
| fmt.Print(string(out)) | ||
| return err | ||
| } | ||
| g.Go(func() error { | ||
| tasksPath := strings.Join(tasks, ",") | ||
| cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */ | ||
| if out, err := cmd.CombinedOutput(); err != nil { | ||
| fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err) | ||
| fmt.Print(string(out)) | ||
| return err | ||
| } | ||
| return nil | ||
| }) |
There was a problem hiding this comment.
1. Loop vars captured 🐞 Bug ≡ Correctness
buildTaskBundleImage launches goroutines that close over the range variables version and tasks; because those variables are reused across iterations, goroutines can run with the wrong TASK_TAG/TASKS and push incorrect bundle images.
Agent Prompt
## Issue description
`buildTaskBundleImage` starts goroutines inside a `for version, tasks := range taskBundles` loop, but the goroutine closure captures the loop variables. This can cause all goroutines to use the wrong `version`/`tasks` values, producing incorrect `TASK_TAG` and `TASKS` when running concurrently.
## Issue Context
This is a classic Go `for range` closure-capture bug; the fix is to create per-iteration copies (shadow variables) before launching the goroutine.
## Fix Focus Areas
- acceptance/kubernetes/kind/image.go[229-240]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
acceptance/registry/registry.go (1)
112-127:⚠️ Potential issue | 🟠 MajorStore the registry container handle before
MappedPort.If
GenericContainersucceeds and Line 121 fails,state.Containeris still nil, so the newAfterhook cannot terminate the leaked registry container.Suggested fix
registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ ContainerRequest: req, Started: true, Logger: logger, }) if err != nil { return ctx, err } + state.Container = registry port, err := registry.MappedPort(ctx, "5000/tcp") if err != nil { return ctx, err } state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int()) - state.Container = registry🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/registry/registry.go` around lines 112 - 127, After successfully creating the container with testcontainers.GenericContainer (the registry variable), immediately store the handle into state.Container before calling registry.MappedPort so that if MappedPort or any subsequent step fails the After hook can still terminate the container; update the code around GenericContainer and state.Container assignment (and ensure state.Container is set prior to calling registry.MappedPort) so the registry handle is never left nil on error.acceptance/git/git.go (1)
192-199:⚠️ Potential issue | 🟠 MajorPartially started git containers can still leak.
state.Containeris only recorded afterMappedPortsucceeds, and Line 330 skips cleanup whenstate.Up()is false. If the container starts but startup fails before both state fields are populated, this path still leaves it running.Suggested fix
git, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ ContainerRequest: req, Started: true, Logger: logger, }) if err != nil { return ctx, err } + state.Container = git port, err := git.MappedPort(ctx, "443/tcp") if err != nil { return ctx, err } state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int()) state.RepositoriesDir = repositories - state.Container = git return ctx, nil- if !state.Up() { - return ctx, nil - } - if state.Container != nil { if err := state.Container.Terminate(ctx); err != nil { logger, _ := log.LoggerFor(ctx) logger.Warnf("failed to terminate git container: %v", err) } } + + if state.RepositoriesDir == "" { + return ctx, nil + } os.RemoveAll(state.RepositoriesDir)Also applies to: 330-339
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/git/git.go` around lines 192 - 199, The container can leak because state.Container is only set after git.MappedPort succeeds; set state.Container as soon as the container is created (assign state.Container = git immediately after creation) so the cleanup path that checks state.Up() will see the running container even if MappedPort fails, and ensure HostAndPort/RepositoriesDir are only set after MappedPort succeeds; alternatively, register a deferred cleanup that references the created container on error. Update the logic around git (the created container), git.MappedPort, state.Container, state.HostAndPort, state.RepositoriesDir and the state.Up() check so partial startups still get cleaned up.acceptance/acceptance_test.go (1)
149-157:⚠️ Potential issue | 🟠 MajorKeep the log file until
Persistsucceeds.When
scenarioErris nil, Line 153 deletes the log beforetestenv.Persist(ctx)runs. If persistence then fails, the scenario becomes failed but its log is already gone and never shows up in the failure summary.Suggested fix
- if scenarioErr != nil { - tracker.addFailure(scenario.Name, scenario.Uri, logFile, scenarioErr) - } else { - // Clean up log files for passing scenarios - os.Remove(logFile) - } - - _, err := testenv.Persist(ctx) - return ctx, err + _, err := testenv.Persist(ctx) + if err != nil { + tracker.addFailure(scenario.Name, scenario.Uri, logFile, err) + return ctx, err + } + + if scenarioErr != nil { + tracker.addFailure(scenario.Name, scenario.Uri, logFile, scenarioErr) + return ctx, nil + } + + // Clean up log files for passing scenarios only after all scenario teardown succeeded. + _ = os.Remove(logFile) + return ctx, nil🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/acceptance_test.go` around lines 149 - 157, The code currently removes the scenario log (os.Remove(logFile)) when scenarioErr is nil before calling testenv.Persist(ctx), which can lose logs if Persist fails; change the flow so you only delete the log file after Persist succeeds: remove the os.Remove(logFile) from the branch where scenarioErr == nil and instead call os.Remove(logFile) only after testenv.Persist(ctx) returns no error (and if Persist returns an error, ensure the log is retained and that tracker.addFailure or equivalent is invoked so the failure summary can include the log); reference functions/variables: scenarioErr, logFile, tracker.addFailure, and testenv.Persist.acceptance/wiremock/wiremock.go (1)
220-236:⚠️ Potential issue | 🟠 MajorStore the WireMock container handle before port lookup.
If
GenericContainersucceeds and Line 229 fails,state.Containeris never set, so the new scenario cleanup cannot terminate the leaked WireMock container.Suggested fix
w, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ ContainerRequest: req, Started: true, Logger: logger, }) if err != nil { return ctx, fmt.Errorf("unable to run GenericContainer: %v", err) } + state.Container = w port, err := w.MappedPort(ctx, "8080/tcp") if err != nil { return ctx, err } url := fmt.Sprintf("http://localhost:%d", port.Int()) state.URL = url - state.Container = w🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/wiremock/wiremock.go` around lines 220 - 236, The container handle returned by testcontainers.GenericContainer (variable w) must be stored to state.Container immediately after creation so cleanup can always find and terminate the container even if the subsequent port lookup (w.MappedPort) fails; modify the block in wiremock.go to set state.Container = w right after checking err from GenericContainer (before calling w.MappedPort or fmt.Sprintf) and keep the existing error returns for port lookup so leaked containers are avoidable by the cleanup routine.acceptance/kubernetes/kind/image.go (1)
53-76:⚠️ Potential issue | 🟠 MajorAdd image existence validation before accepting cache hit.
Lines 60-62 skip the image build based only on source-hash comparison against a local marker file. The code never verifies that
localhost:<registryPort>/clistill exists in the registry behind that port. If the registry container restarts, gets cleared, or the port is reused with a fresh registry state, a cache hit will skipmake push-imagewhile the expected image is missing, causing later pulls to fail.Add a registry image existence check (e.g.,
remote.Head()) before the cache-hit early return at line 62.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/kubernetes/kind/image.go` around lines 53 - 76, The cache-hit logic in computeSourceHash()/image build flow relies only on a local cacheFile and may skip building a missing registry image; before returning early where it currently prints "[INFO] CLI image cache hit, skipping build", perform an existence check against the registry (e.g., call a registry head/get method such as remote.Head() or equivalent HTTP HEAD to localhost:<k.registryPort>/v2/cli/manifests/latest) for the image repo "localhost:<registryPort>/cli" and only treat it as a true cache hit if the registry confirms the image exists; if the head request fails or returns 404, fall back to running the existing build path (exec.CommandContext(... "make", "push-image"...)) and only write cacheFile after that successful push.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acceptance/kubernetes/kind/image.go`:
- Around line 229-243: The current loop spawns an unbounded errgroup goroutine
per version (using errgroup.WithContext and g.Go) which can overwhelm CI;
instead limit concurrency by creating a bounded worker pool or semaphore and
acquire before launching each goroutine (release when done) so at most N
concurrent make task-bundle invocations run; update the block that iterates over
taskBundles (the loop using g.Go, tasksPath, exec.CommandContext and
cmd.CombinedOutput) to acquire the semaphore, spawn the goroutine that defers
release, and use the same errgroup to collect errors, exposing the concurrency
limit as a constant or field (e.g., maxConcurrentTaskBundles) so it can be tuned
for CI.
---
Outside diff comments:
In `@acceptance/acceptance_test.go`:
- Around line 149-157: The code currently removes the scenario log
(os.Remove(logFile)) when scenarioErr is nil before calling
testenv.Persist(ctx), which can lose logs if Persist fails; change the flow so
you only delete the log file after Persist succeeds: remove the
os.Remove(logFile) from the branch where scenarioErr == nil and instead call
os.Remove(logFile) only after testenv.Persist(ctx) returns no error (and if
Persist returns an error, ensure the log is retained and that tracker.addFailure
or equivalent is invoked so the failure summary can include the log); reference
functions/variables: scenarioErr, logFile, tracker.addFailure, and
testenv.Persist.
In `@acceptance/git/git.go`:
- Around line 192-199: The container can leak because state.Container is only
set after git.MappedPort succeeds; set state.Container as soon as the container
is created (assign state.Container = git immediately after creation) so the
cleanup path that checks state.Up() will see the running container even if
MappedPort fails, and ensure HostAndPort/RepositoriesDir are only set after
MappedPort succeeds; alternatively, register a deferred cleanup that references
the created container on error. Update the logic around git (the created
container), git.MappedPort, state.Container, state.HostAndPort,
state.RepositoriesDir and the state.Up() check so partial startups still get
cleaned up.
In `@acceptance/kubernetes/kind/image.go`:
- Around line 53-76: The cache-hit logic in computeSourceHash()/image build flow
relies only on a local cacheFile and may skip building a missing registry image;
before returning early where it currently prints "[INFO] CLI image cache hit,
skipping build", perform an existence check against the registry (e.g., call a
registry head/get method such as remote.Head() or equivalent HTTP HEAD to
localhost:<k.registryPort>/v2/cli/manifests/latest) for the image repo
"localhost:<registryPort>/cli" and only treat it as a true cache hit if the
registry confirms the image exists; if the head request fails or returns 404,
fall back to running the existing build path (exec.CommandContext(... "make",
"push-image"...)) and only write cacheFile after that successful push.
In `@acceptance/registry/registry.go`:
- Around line 112-127: After successfully creating the container with
testcontainers.GenericContainer (the registry variable), immediately store the
handle into state.Container before calling registry.MappedPort so that if
MappedPort or any subsequent step fails the After hook can still terminate the
container; update the code around GenericContainer and state.Container
assignment (and ensure state.Container is set prior to calling
registry.MappedPort) so the registry handle is never left nil on error.
In `@acceptance/wiremock/wiremock.go`:
- Around line 220-236: The container handle returned by
testcontainers.GenericContainer (variable w) must be stored to state.Container
immediately after creation so cleanup can always find and terminate the
container even if the subsequent port lookup (w.MappedPort) fails; modify the
block in wiremock.go to set state.Container = w right after checking err from
GenericContainer (before calling w.MappedPort or fmt.Sprintf) and keep the
existing error returns for port lookup so leaked containers are avoidable by the
cleanup routine.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: a4add62f-86ca-444e-970b-ae1ac872bc1a
📒 Files selected for processing (8)
acceptance/acceptance_test.goacceptance/git/git.goacceptance/go.modacceptance/kubernetes/kind/image.goacceptance/log/log.goacceptance/log/log_test.goacceptance/registry/registry.goacceptance/wiremock/wiremock.go
| g, gCtx := errgroup.WithContext(ctx) | ||
| for version, tasks := range taskBundles { | ||
| tasksPath := strings.Join(tasks, ",") | ||
| cmd := exec.CommandContext(ctx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */ | ||
| if out, err := cmd.CombinedOutput(); err != nil { | ||
| fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err) | ||
| fmt.Print(string(out)) | ||
| return err | ||
| } | ||
| g.Go(func() error { | ||
| tasksPath := strings.Join(tasks, ",") | ||
| cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */ | ||
| if out, err := cmd.CombinedOutput(); err != nil { | ||
| fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err) | ||
| fmt.Print(string(out)) | ||
| return err | ||
| } | ||
| return nil | ||
| }) | ||
| } | ||
|
|
||
| return nil | ||
| return g.Wait() |
There was a problem hiding this comment.
Cap the task-bundle fan-out.
This launches one make task-bundle per version with no limit. On CI, that can swamp the image builder/registry and reduce reliability instead of improving it.
Suggested fix
import (
"archive/tar"
"compress/gzip"
"context"
"crypto/sha256"
"encoding/hex"
"errors"
"fmt"
"io"
"os"
"os/exec"
"path/filepath"
+ "runtime"
"strings"
@@
g, gCtx := errgroup.WithContext(ctx)
+ g.SetLimit(runtime.NumCPU())
for version, tasks := range taskBundles {
g.Go(func() error {
tasksPath := strings.Join(tasks, ",")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| g, gCtx := errgroup.WithContext(ctx) | |
| for version, tasks := range taskBundles { | |
| tasksPath := strings.Join(tasks, ",") | |
| cmd := exec.CommandContext(ctx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */ | |
| if out, err := cmd.CombinedOutput(); err != nil { | |
| fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err) | |
| fmt.Print(string(out)) | |
| return err | |
| } | |
| g.Go(func() error { | |
| tasksPath := strings.Join(tasks, ",") | |
| cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */ | |
| if out, err := cmd.CombinedOutput(); err != nil { | |
| fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err) | |
| fmt.Print(string(out)) | |
| return err | |
| } | |
| return nil | |
| }) | |
| } | |
| return nil | |
| return g.Wait() | |
| g, gCtx := errgroup.WithContext(ctx) | |
| g.SetLimit(runtime.NumCPU()) | |
| for version, tasks := range taskBundles { | |
| g.Go(func() error { | |
| tasksPath := strings.Join(tasks, ",") | |
| cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* `#nosec` */ | |
| if out, err := cmd.CombinedOutput(); err != nil { | |
| fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err) | |
| fmt.Print(string(out)) | |
| return err | |
| } | |
| return nil | |
| }) | |
| } | |
| return g.Wait() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/kubernetes/kind/image.go` around lines 229 - 243, The current loop
spawns an unbounded errgroup goroutine per version (using errgroup.WithContext
and g.Go) which can overwhelm CI; instead limit concurrency by creating a
bounded worker pool or semaphore and acquire before launching each goroutine
(release when done) so at most N concurrent make task-bundle invocations run;
update the block that iterates over taskBundles (the loop using g.Go, tasksPath,
exec.CommandContext and cmd.CombinedOutput) to acquire the semaphore, spawn the
goroutine that defers release, and use the same errgroup to collect errors,
exposing the concurrency limit as a constant or field (e.g.,
maxConcurrentTaskBundles) so it can be tuned for CI.
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
acceptance/profile/profile.go (1)
36-36: Consider makingenabledconfigurable at runtime.The
enabledflag is hardcoded totrue. Based on the workflow,E2E_INSTRUMENTATION=trueis set, suggesting profiling should be optional. Consider reading from an environment variable:var enabled = os.Getenv("E2E_INSTRUMENTATION") == "true"This would allow profiling to be disabled by default and only enabled when explicitly requested.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/profile/profile.go` at line 36, The hardcoded package-level variable enabled should be made runtime-configurable: change the initialization of enabled to read the E2E_INSTRUMENTATION environment variable (e.g., set enabled = os.Getenv("E2E_INSTRUMENTATION") == "true") and add the necessary import for the os package; update any tests or callers of enabled in this package (functions in profile.go referencing enabled) if they assume a constant value so they still behave correctly when toggled at runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acceptance/profile/profile.go`:
- Around line 148-156: The activeContainers counter is only incremented
(activeContainers.Add(1)) when a container finishes starting and never
decremented, so it reports total started rather than current active; update the
logic in the container lifecycle (e.g., move the increment into BeginContainer
before the closure that runs the container and add a corresponding
activeContainers.Add(-1) when the container terminates/cleanup runs), ensure
peakContainers is updated against the concurrent value, and update any uses of
activeContainers/“Final active containers” in logEvent or rename the metric to
totalContainersStarted if you intend to track starts instead of concurrency.
---
Nitpick comments:
In `@acceptance/profile/profile.go`:
- Line 36: The hardcoded package-level variable enabled should be made
runtime-configurable: change the initialization of enabled to read the
E2E_INSTRUMENTATION environment variable (e.g., set enabled =
os.Getenv("E2E_INSTRUMENTATION") == "true") and add the necessary import for the
os package; update any tests or callers of enabled in this package (functions in
profile.go referencing enabled) if they assume a constant value so they still
behave correctly when toggled at runtime.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 24056f50-fc6e-4eb4-952f-8fbc57b0cf50
📒 Files selected for processing (7)
.github/workflows/checks-codecov.yamlacceptance/acceptance_test.goacceptance/git/git.goacceptance/kubernetes/kind/kind.goacceptance/profile/profile.goacceptance/registry/registry.goacceptance/wiremock/wiremock.go
🚧 Files skipped from review as they are similar to previous changes (2)
- acceptance/git/git.go
- acceptance/registry/registry.go
| current := activeContainers.Add(1) | ||
| mu.Lock() | ||
| if current > peakContainers { | ||
| peakContainers = current | ||
| } | ||
| mu.Unlock() | ||
|
|
||
| logEvent(map[string]any{ | ||
| "event": "container", "type": containerType, |
There was a problem hiding this comment.
activeContainers metric appears misleading—never decremented.
The activeContainers counter is incremented when a container finishes starting but is never decremented when containers terminate. This means "Final active containers" in the report will always equal the total containers started, not the actual active count.
If tracking concurrent container startups is the intent, the increment should occur before the container starts (in BeginContainer before the closure), and the decrement should occur when containers terminate. Otherwise, consider renaming to totalContainersStarted to reflect actual behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/profile/profile.go` around lines 148 - 156, The activeContainers
counter is only incremented (activeContainers.Add(1)) when a container finishes
starting and never decremented, so it reports total started rather than current
active; update the logic in the container lifecycle (e.g., move the increment
into BeginContainer before the closure that runs the container and add a
corresponding activeContainers.Add(-1) when the container terminates/cleanup
runs), ensure peakContainers is updated against the concurrent value, and update
any uses of activeContainers/“Final active containers” in logEvent or rename the
metric to totalContainersStarted if you intend to track starts instead of
concurrency.
The make push-image step takes 1m41s locally and 3m26s in CI even when source hasn't changed. A SHA-256 hash of all build inputs (Go source, go.mod, go.sum, Dockerfile, build.sh, Makefile, and hack/reduce-snapshot.sh) is now computed and compared against a per-registry-port cache marker file. When the hash matches, the build is skipped entirely with a cache hit log message. Ref: EC-1710 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (1)
acceptance/kubernetes/kind/image.go (1)
229-243:⚠️ Potential issue | 🟠 MajorCap the task-bundle fan-out.
The errgroup launches one goroutine per version without a concurrency limit. On CI with many task versions, this can saturate the image builder and registry, potentially reducing reliability rather than improving it.
Suggested fix
+import "runtime" + // ... in buildTaskBundleImage: g, gCtx := errgroup.WithContext(ctx) + g.SetLimit(runtime.NumCPU()) for version, tasks := range taskBundles { g.Go(func() error {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acceptance/kubernetes/kind/image.go` around lines 229 - 243, The loop spawns an unbounded goroutine per entry in taskBundles via errgroup.WithContext and g.Go, which can saturate resources; limit concurrency by introducing a bounded semaphore (e.g., sem := make(chan struct{}, N)) and acquire before starting each work item and release after it finishes, then wrap the build/push logic currently inside g.Go with a closure that captures the loop variables (copy version, tasks into locals) to avoid loop-variable capture bugs; replace the current g.Go calls to first do sem <- struct{}{} then g.Go(func() error { defer func(){ <-sem }(); ... }) and return g.Wait() as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acceptance/git/git.go`:
- Around line 338-344: The code currently calls profile.ContainerTerminated()
unconditionally after attempting to terminate state.Container, which masks
failures; change the logic in the block handling state.Container to call
profile.ContainerTerminated() only when state.Container.Terminate(ctx) returns
nil (i.e., success), otherwise log the error as is and do not mark the profile
as terminated—locate the state.Container conditional and the call to
Terminate(ctx) and move or guard profile.ContainerTerminated() so it executes
only on successful termination.
- Around line 184-203: The profile.BeginContainer("git") call is marked as
started even when subsequent steps fail and the container may leak; change the
flow so you only call the returned endContainer() after successful startup and
registration: call git.MappedPort(ctx) and on success set state.HostAndPort,
state.RepositoriesDir and state.Container = git, then call endContainer(); if
MappedPort or any step after container creation fails, ensure you terminate the
created container (e.g., call git.Terminate(ctx) or otherwise store it so the
existing cleanup can run) instead of leaving it running while skipping
endContainer().
In `@acceptance/registry/registry.go`:
- Around line 342-349: The call to profile.ContainerTerminated() is executed
unconditionally after attempting to terminate the container, which incorrectly
updates profiling metrics even when state.Container.Terminate(ctx) fails; modify
the code in the registry teardown (where testenv.FetchState[registryState](ctx)
and state.Container.Terminate(ctx) are used) so that
profile.ContainerTerminated() is only invoked when Terminate returns nil (i.e.,
on the success path), leaving the error-handling/logging path to log the failure
without updating the profile counts.
- Around line 113-131: The profiling call BeginContainer("registry") is opened
too early and endContainer() is invoked even when setup fails, causing
active-container inflation and leaks; instead, call
profile.BeginContainer("registry") only after successful port resolution and
assignment to state.Container: create the container with
testcontainers.GenericContainer, if that returns an error just return; then call
registry.MappedPort(ctx, "5000/tcp"), and if MappedPort fails call
registry.Terminate(ctx) (or registry.Stop/Remove equivalent) and return; only
after MappedPort succeeds set state.HostAndPort and state.Container and then
call profile.BeginContainer("registry") to get endContainer for later cleanup.
In `@acceptance/wiremock/wiremock.go`:
- Around line 331-338: The code currently calls profile.ContainerTerminated()
unconditionally after attempting to terminate the container; change it so
profile.ContainerTerminated() is only invoked when
state.Container.Terminate(ctx) returns nil (i.e., termination succeeded).
Specifically, inside the block that checks state.Container != nil, call
state.Container.Terminate(ctx), log any error with logger.Warnf("failed to
terminate wiremock container: %v", err) as before, and only call
profile.ContainerTerminated() in the branch where err == nil; do not call it
when termination fails or when state.Container is nil.
- Around line 222-241: The profiling endContainer() call is being invoked on
failure paths too early and a started container w can be orphaned if MappedPort
fails; update the success/failure handling so endContainer() is only called
after both testcontainers.GenericContainer (w) and w.MappedPort succeed, assign
state.Container = w before calling endContainer(), and if MappedPort returns an
error terminate w explicitly (call w.Terminate(ctx)) before returning the error;
locate and adjust logic around the GenericContainer call, the MappedPort call,
the endContainer() invocation, and the assignments to state.Container to
implement this flow.
---
Duplicate comments:
In `@acceptance/kubernetes/kind/image.go`:
- Around line 229-243: The loop spawns an unbounded goroutine per entry in
taskBundles via errgroup.WithContext and g.Go, which can saturate resources;
limit concurrency by introducing a bounded semaphore (e.g., sem := make(chan
struct{}, N)) and acquire before starting each work item and release after it
finishes, then wrap the build/push logic currently inside g.Go with a closure
that captures the loop variables (copy version, tasks into locals) to avoid
loop-variable capture bugs; replace the current g.Go calls to first do sem <-
struct{}{} then g.Go(func() error { defer func(){ <-sem }(); ... }) and return
g.Wait() as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: b44b02d0-624c-4ff9-935d-2c257350e413
📒 Files selected for processing (10)
.github/workflows/checks-codecov.yamlacceptance/acceptance_test.goacceptance/git/git.goacceptance/go.modacceptance/kubernetes/kind/image.goacceptance/kubernetes/kind/kind.goacceptance/log/log_test.goacceptance/profile/profile.goacceptance/registry/registry.goacceptance/wiremock/wiremock.go
✅ Files skipped from review due to trivial changes (2)
- acceptance/kubernetes/kind/kind.go
- acceptance/profile/profile.go
🚧 Files skipped from review as they are similar to previous changes (2)
- acceptance/go.mod
- acceptance/log/log_test.go
| endContainer := profile.BeginContainer("git") | ||
| git, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ | ||
| ContainerRequest: req, | ||
| Started: true, | ||
| Logger: logger, | ||
| }) | ||
| if err != nil { | ||
| endContainer() | ||
| return ctx, err | ||
| } | ||
|
|
||
| port, err := git.MappedPort(ctx, "443/tcp") | ||
| endContainer() | ||
| if err != nil { | ||
| return ctx, err | ||
| } | ||
|
|
||
| state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int()) | ||
| state.RepositoriesDir = repositories | ||
| state.Container = git |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's check the actual code at the specified lines
cat -n acceptance/git/git.go | sed -n '184,203p'Repository: conforma/cli
Length of output: 693
🏁 Script executed:
# Also get more context around this section to understand the flow
cat -n acceptance/git/git.go | sed -n '170,210p'Repository: conforma/cli
Length of output: 1453
🏁 Script executed:
# Check if profile.BeginContainer is used elsewhere in the file to understand the pattern
rg "profile\.BeginContainer" acceptance/git/git.go -A 15Repository: conforma/cli
Length of output: 430
🏁 Script executed:
# Let's also check the cleanup hook mentioned in the scratchpad
rg "sc\.After|profile\.ContainerTerminated" acceptance/git/git.go -B 2 -A 5Repository: conforma/cli
Length of output: 488
🏁 Script executed:
# Check for other instances of profile.BeginContainer in this file and similar patterns
rg "profile\.BeginContainer" acceptance/git/git.go -B 1 -A 20Repository: conforma/cli
Length of output: 555
🏁 Script executed:
# Check for similar patterns in other container setup files
fd "\.go$" acceptance/ --type f | xargs rg "profile\.BeginContainer" -lRepository: conforma/cli
Length of output: 141
🏁 Script executed:
# Let's also verify the structure of the testcontainers.GenericContainer and MappedPort calls
rg "testcontainers\.GenericContainer|\.MappedPort" acceptance/git/git.go -B 2 -A 3Repository: conforma/cli
Length of output: 375
🏁 Script executed:
# Check wiremock.go for the same pattern
cat -n acceptance/wiremock/wiremock.go | grep -A 25 "profile\.BeginContainer"Repository: conforma/cli
Length of output: 917
🏁 Script executed:
# Check registry.go for the same pattern
cat -n acceptance/registry/registry.go | grep -A 25 "profile\.BeginContainer"Repository: conforma/cli
Length of output: 961
Only mark the git container as started on the success path.
Lines 191 and 196 call profile.BeginContainer() on failure paths, causing failed starts to appear active in profiling. Additionally, if git.MappedPort() fails, the container is not stored in state.Container and leaks because the cleanup hook cannot terminate it.
Suggested fix
endContainer := profile.BeginContainer("git")
git, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
Logger: logger,
})
if err != nil {
- endContainer()
return ctx, err
}
port, err := git.MappedPort(ctx, "443/tcp")
-endContainer()
if err != nil {
- return ctx, err
+ if termErr := git.Terminate(ctx); termErr != nil {
+ return ctx, fmt.Errorf("map git port: %w; terminate container: %v", err, termErr)
+ }
+ return ctx, err
}
+endContainer()
state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
state.RepositoriesDir = repositories
state.Container = git🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/git/git.go` around lines 184 - 203, The
profile.BeginContainer("git") call is marked as started even when subsequent
steps fail and the container may leak; change the flow so you only call the
returned endContainer() after successful startup and registration: call
git.MappedPort(ctx) and on success set state.HostAndPort, state.RepositoriesDir
and state.Container = git, then call endContainer(); if MappedPort or any step
after container creation fails, ensure you terminate the created container
(e.g., call git.Terminate(ctx) or otherwise store it so the existing cleanup can
run) instead of leaving it running while skipping endContainer().
| if state.Container != nil { | ||
| if err := state.Container.Terminate(ctx); err != nil { | ||
| logger, _ := log.LoggerFor(ctx) | ||
| logger.Warnf("failed to terminate git container: %v", err) | ||
| } | ||
| profile.ContainerTerminated() | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n acceptance/git/git.go | sed -n '335,350p'Repository: conforma/cli
Length of output: 484
Report container termination success only when it actually succeeds.
Line 343 unconditionally calls profile.ContainerTerminated() after the termination attempt, even when the preceding error branch indicates termination failed. This masks cleanup problems in profiling reports.
Suggested fix
if state.Container != nil {
if err := state.Container.Terminate(ctx); err != nil {
logger, _ := log.LoggerFor(ctx)
logger.Warnf("failed to terminate git container: %v", err)
- }
- profile.ContainerTerminated()
+ } else {
+ profile.ContainerTerminated()
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if state.Container != nil { | |
| if err := state.Container.Terminate(ctx); err != nil { | |
| logger, _ := log.LoggerFor(ctx) | |
| logger.Warnf("failed to terminate git container: %v", err) | |
| } | |
| profile.ContainerTerminated() | |
| } | |
| if state.Container != nil { | |
| if err := state.Container.Terminate(ctx); err != nil { | |
| logger, _ := log.LoggerFor(ctx) | |
| logger.Warnf("failed to terminate git container: %v", err) | |
| } else { | |
| profile.ContainerTerminated() | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/git/git.go` around lines 338 - 344, The code currently calls
profile.ContainerTerminated() unconditionally after attempting to terminate
state.Container, which masks failures; change the logic in the block handling
state.Container to call profile.ContainerTerminated() only when
state.Container.Terminate(ctx) returns nil (i.e., success), otherwise log the
error as is and do not mark the profile as terminated—locate the state.Container
conditional and the call to Terminate(ctx) and move or guard
profile.ContainerTerminated() so it executes only on successful termination.
| endContainer := profile.BeginContainer("registry") | ||
| registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ | ||
| ContainerRequest: req, | ||
| Started: true, | ||
| Logger: logger, | ||
| }) | ||
| if err != nil { | ||
| endContainer() | ||
| return ctx, err | ||
| } | ||
|
|
||
| port, err := registry.MappedPort(ctx, "5000/tcp") | ||
| endContainer() | ||
| if err != nil { | ||
| return ctx, err | ||
| } | ||
|
|
||
| state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int()) | ||
| state.Container = registry |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n acceptance/registry/registry.go | sed -n '110,135p'Repository: conforma/cli
Length of output: 776
🏁 Script executed:
# Check the profile package to understand BeginContainer contract
rg -t go "func.*BeginContainer" -A 10 acceptance/Repository: conforma/cli
Length of output: 643
🏁 Script executed:
cat -n acceptance/profile/profile.go | head -60Repository: conforma/cli
Length of output: 1821
🏁 Script executed:
cat -n acceptance/profile/profile.go | sed -n '60,130p'Repository: conforma/cli
Length of output: 2170
🏁 Script executed:
cat -n acceptance/profile/profile.go | sed -n '125,160p'Repository: conforma/cli
Length of output: 1135
🏁 Script executed:
cat -n acceptance/profile/profile.go | sed -n '125,165p'Repository: conforma/cli
Length of output: 1281
🏁 Script executed:
cat -n acceptance/profile/profile.go | sed -n '163,180p'Repository: conforma/cli
Length of output: 635
🏁 Script executed:
# Check if there's an sc.After hook mentioned in the scratchpad for cleanup
rg -t go "sc\.After\|registry\.Terminate" acceptance/registry/registry.go -B 2 -A 2Repository: conforma/cli
Length of output: 38
🏁 Script executed:
cat -n acceptance/registry/registry.go | wc -lRepository: conforma/cli
Length of output: 59
🏁 Script executed:
# Check the full startStubRegistry function to see complete context
cat -n acceptance/registry/registry.go | sed -n '100,140p'Repository: conforma/cli
Length of output: 1430
🏁 Script executed:
# Check if there's a cleanup hook for the registry container
grep -n "ContainerTerminated\|Terminate" acceptance/registry/registry.goRepository: conforma/cli
Length of output: 154
🏁 Script executed:
cat -n acceptance/registry/registry.go | sed -n '340,353p'Repository: conforma/cli
Length of output: 467
🏁 Script executed:
# Check wiremock file mentioned in scratchpad for comparison of similar pattern
find acceptance -name "*wiremock*" -type fRepository: conforma/cli
Length of output: 87
🏁 Script executed:
# Check the wiremock implementation for similar pattern
cat -n acceptance/wiremock/wiremock.go | grep -A 20 "BeginContainer"Repository: conforma/cli
Length of output: 724
🏁 Script executed:
# Check wiremock cleanup hook for comparison
grep -n "ContainerTerminated\|Terminate" acceptance/wiremock/wiremock.goRepository: conforma/cli
Length of output: 154
Only mark the registry container as started after port resolution succeeds.
Calling endContainer() at lines 120 and 125 inflates the active-container profile: line 120 counts failed GenericContainer calls as active, and line 125 counts as active even if MappedPort fails. When MappedPort fails, the container is leaked because state.Container is assigned only after port resolution succeeds, leaving no cleanup path.
Suggested fix
endContainer := profile.BeginContainer("registry")
registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
Logger: logger,
})
if err != nil {
- endContainer()
return ctx, err
}
port, err := registry.MappedPort(ctx, "5000/tcp")
-endContainer()
if err != nil {
- return ctx, err
+ if termErr := registry.Terminate(ctx); termErr != nil {
+ return ctx, fmt.Errorf("map registry port: %w; terminate container: %v", err, termErr)
+ }
+ return ctx, err
}
+endContainer()
state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
state.Container = registry📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| endContainer := profile.BeginContainer("registry") | |
| registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ | |
| ContainerRequest: req, | |
| Started: true, | |
| Logger: logger, | |
| }) | |
| if err != nil { | |
| endContainer() | |
| return ctx, err | |
| } | |
| port, err := registry.MappedPort(ctx, "5000/tcp") | |
| endContainer() | |
| if err != nil { | |
| return ctx, err | |
| } | |
| state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int()) | |
| state.Container = registry | |
| endContainer := profile.BeginContainer("registry") | |
| registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ | |
| ContainerRequest: req, | |
| Started: true, | |
| Logger: logger, | |
| }) | |
| if err != nil { | |
| return ctx, err | |
| } | |
| port, err := registry.MappedPort(ctx, "5000/tcp") | |
| if err != nil { | |
| if termErr := registry.Terminate(ctx); termErr != nil { | |
| return ctx, fmt.Errorf("map registry port: %w; terminate container: %v", err, termErr) | |
| } | |
| return ctx, err | |
| } | |
| endContainer() | |
| state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int()) | |
| state.Container = registry |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/registry/registry.go` around lines 113 - 131, The profiling call
BeginContainer("registry") is opened too early and endContainer() is invoked
even when setup fails, causing active-container inflation and leaks; instead,
call profile.BeginContainer("registry") only after successful port resolution
and assignment to state.Container: create the container with
testcontainers.GenericContainer, if that returns an error just return; then call
registry.MappedPort(ctx, "5000/tcp"), and if MappedPort fails call
registry.Terminate(ctx) (or registry.Stop/Remove equivalent) and return; only
after MappedPort succeeds set state.HostAndPort and state.Container and then
call profile.BeginContainer("registry") to get endContainer for later cleanup.
| state := testenv.FetchState[registryState](ctx) | ||
| if state.Container != nil { | ||
| if err := state.Container.Terminate(ctx); err != nil { | ||
| logger, _ := log.LoggerFor(ctx) | ||
| logger.Warnf("failed to terminate registry container: %v", err) | ||
| } | ||
| profile.ContainerTerminated() | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n acceptance/registry/registry.go | sed -n '340,355p'Repository: conforma/cli
Length of output: 467
🏁 Script executed:
rg -n "ContainerTerminated" acceptance/registry/registry.goRepository: conforma/cli
Length of output: 92
🏁 Script executed:
rg -n "profile\s*:=" acceptance/registry/registry.go | head -20Repository: conforma/cli
Length of output: 38
🏁 Script executed:
rg -n "ContainerTerminated" -A 3 --max-count=10Repository: conforma/cli
Length of output: 934
🏁 Script executed:
cat -n acceptance/profile/profile.go | sed -n '160,180p'Repository: conforma/cli
Length of output: 662
🏁 Script executed:
rg -B 5 "profile.ContainerTerminated" acceptance/wiremock/wiremock.go | head -20Repository: conforma/cli
Length of output: 285
🏁 Script executed:
rg -B 5 "profile.ContainerTerminated" acceptance/git/git.go | head -20Repository: conforma/cli
Length of output: 280
Call profile.ContainerTerminated() only after successful termination.
Line 348 calls profile.ContainerTerminated() unconditionally, causing incorrect profiling metrics even when Container.Terminate() fails. The function decrements the active container count and increments the terminated count, so it should only be called on the success path.
Suggested fix
if state.Container != nil {
if err := state.Container.Terminate(ctx); err != nil {
logger, _ := log.LoggerFor(ctx)
logger.Warnf("failed to terminate registry container: %v", err)
- }
- profile.ContainerTerminated()
+ } else {
+ profile.ContainerTerminated()
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| state := testenv.FetchState[registryState](ctx) | |
| if state.Container != nil { | |
| if err := state.Container.Terminate(ctx); err != nil { | |
| logger, _ := log.LoggerFor(ctx) | |
| logger.Warnf("failed to terminate registry container: %v", err) | |
| } | |
| profile.ContainerTerminated() | |
| } | |
| state := testenv.FetchState[registryState](ctx) | |
| if state.Container != nil { | |
| if err := state.Container.Terminate(ctx); err != nil { | |
| logger, _ := log.LoggerFor(ctx) | |
| logger.Warnf("failed to terminate registry container: %v", err) | |
| } else { | |
| profile.ContainerTerminated() | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/registry/registry.go` around lines 342 - 349, The call to
profile.ContainerTerminated() is executed unconditionally after attempting to
terminate the container, which incorrectly updates profiling metrics even when
state.Container.Terminate(ctx) fails; modify the code in the registry teardown
(where testenv.FetchState[registryState](ctx) and state.Container.Terminate(ctx)
are used) so that profile.ContainerTerminated() is only invoked when Terminate
returns nil (i.e., on the success path), leaving the error-handling/logging path
to log the failure without updating the profile counts.
| endContainer := profile.BeginContainer("wiremock") | ||
| w, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{ | ||
| ContainerRequest: req, | ||
| Started: true, | ||
| Logger: logger, | ||
| }) | ||
| if err != nil { | ||
| endContainer() | ||
| return ctx, fmt.Errorf("unable to run GenericContainer: %v", err) | ||
| } | ||
|
|
||
| port, err := w.MappedPort(ctx, "8080/tcp") | ||
| endContainer() | ||
| if err != nil { | ||
| return ctx, err | ||
| } | ||
|
|
||
| url := fmt.Sprintf("http://localhost:%d", port.Int()) | ||
| state.URL = url | ||
| state.Container = w |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd acceptance/wiremock && cat -n wiremock.go | sed -n '220,245p'Repository: conforma/cli
Length of output: 815
🏁 Script executed:
cat -n acceptance/profile/profile.goRepository: conforma/cli
Length of output: 11021
🏁 Script executed:
git log --oneline acceptance/wiremock/wiremock.go | head -5Repository: conforma/cli
Length of output: 122
Only mark WireMock as started on the success path.
Lines 229 and 234 call endContainer() on failure paths, which incorrectly increments profiling counters for failed startups. Additionally, if MappedPort() fails at line 235, the container w remains running but is never stored in state.Container, leaving it orphaned since cleanup only terminates state.Container. Move endContainer() to after both GenericContainer() and MappedPort() succeed, and explicitly terminate w on MappedPort() failure.
Suggested fix
endContainer := profile.BeginContainer("wiremock")
w, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
Logger: logger,
})
if err != nil {
- endContainer()
return ctx, fmt.Errorf("unable to run GenericContainer: %v", err)
}
port, err := w.MappedPort(ctx, "8080/tcp")
-endContainer()
if err != nil {
- return ctx, err
+ if termErr := w.Terminate(ctx); termErr != nil {
+ return ctx, fmt.Errorf("map wiremock port: %w; terminate container: %v", err, termErr)
+ }
+ return ctx, err
}
+endContainer()
url := fmt.Sprintf("http://localhost:%d", port.Int())
state.URL = url
state.Container = w🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/wiremock/wiremock.go` around lines 222 - 241, The profiling
endContainer() call is being invoked on failure paths too early and a started
container w can be orphaned if MappedPort fails; update the success/failure
handling so endContainer() is only called after both
testcontainers.GenericContainer (w) and w.MappedPort succeed, assign
state.Container = w before calling endContainer(), and if MappedPort returns an
error terminate w explicitly (call w.Terminate(ctx)) before returning the error;
locate and adjust logic around the GenericContainer call, the MappedPort call,
the endContainer() invocation, and the assignments to state.Container to
implement this flow.
| state := testenv.FetchState[wiremockState](ctx) | ||
| if state.Container != nil { | ||
| if err := state.Container.Terminate(ctx); err != nil { | ||
| logger, _ := log.LoggerFor(ctx) | ||
| logger.Warnf("failed to terminate wiremock container: %v", err) | ||
| } | ||
| profile.ContainerTerminated() | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n acceptance/wiremock/wiremock.go | sed -n '325,345p'Repository: conforma/cli
Length of output: 691
🏁 Script executed:
rg "ContainerTerminated" --type go -B 3 -A 3Repository: conforma/cli
Length of output: 1542
🏁 Script executed:
cat -n acceptance/profile/profile.go | head -50Repository: conforma/cli
Length of output: 1631
🏁 Script executed:
rg "ContainerTerminated" -A 10 acceptance/profile/profile.goRepository: conforma/cli
Length of output: 383
🏁 Script executed:
rg -A 5 "if err := state.Container.Terminate" acceptance/registry/registry.go acceptance/git/git.goRepository: conforma/cli
Length of output: 785
Only call profile.ContainerTerminated() when termination succeeds.
Line 337 calls profile.ContainerTerminated() unconditionally, even when state.Container.Terminate(ctx) fails on line 333. This reports the container as terminated in the profiling metrics when it may still be running.
Suggested fix
if state.Container != nil {
if err := state.Container.Terminate(ctx); err != nil {
logger, _ := log.LoggerFor(ctx)
logger.Warnf("failed to terminate wiremock container: %v", err)
- }
- profile.ContainerTerminated()
+ } else {
+ profile.ContainerTerminated()
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| state := testenv.FetchState[wiremockState](ctx) | |
| if state.Container != nil { | |
| if err := state.Container.Terminate(ctx); err != nil { | |
| logger, _ := log.LoggerFor(ctx) | |
| logger.Warnf("failed to terminate wiremock container: %v", err) | |
| } | |
| profile.ContainerTerminated() | |
| } | |
| state := testenv.FetchState[wiremockState](ctx) | |
| if state.Container != nil { | |
| if err := state.Container.Terminate(ctx); err != nil { | |
| logger, _ := log.LoggerFor(ctx) | |
| logger.Warnf("failed to terminate wiremock container: %v", err) | |
| } else { | |
| profile.ContainerTerminated() | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acceptance/wiremock/wiremock.go` around lines 331 - 338, The code currently
calls profile.ContainerTerminated() unconditionally after attempting to
terminate the container; change it so profile.ContainerTerminated() is only
invoked when state.Container.Terminate(ctx) returns nil (i.e., termination
succeeded). Specifically, inside the block that checks state.Container != nil,
call state.Container.Terminate(ctx), log any error with logger.Warnf("failed to
terminate wiremock container: %v", err) as before, and only call
profile.ContainerTerminated() in the branch where err == nil; do not call it
when termination fails or when state.Container is nil.
Adds a profile package to the acceptance tests that tracks per-scenario wall-clock timing, container startup timing by type (registry, git, wiremock), Kind cluster creation sub-phases, and peak container concurrency. The profiling report is printed to stderr at the end of the test run, making it visible in GitHub Actions logs. This is temporary instrumentation for the EC-1710 performance spike. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of running make push-image (full multi-stage Dockerfile that compiles Go inside the container), build the ec and kubectl binaries locally using the host's Go cache, then inject them into a minimal ubi-minimal base image via a new Dockerfile.acceptance. This avoids redundant Go compilation inside the container and leverages the host's build cache. Expected CI savings: ~2-3min (3m24s -> ~30s). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After applying cluster resources, wait only for the in-cluster registry before starting image builds. The CLI image build and task bundle build now run concurrently with each other and with the Tekton Pipelines deployment, since they only need the registry to push images. Expected CI savings: ~1m40s (builds overlap with Tekton deployment, and with each other). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This improves UX by giving the user clear indications about when the test suite started, so he can estimate when it will finish. It also outputs the total elapsed time at the end, for informational purposes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Ref: https://redhat.atlassian.net/browse/EC-1710
The acceptance job's Go build cache was broken: it used actions/cache/restore (read-only) with path '**' and key 'main', which didn't match the actual cache paths (~/.cache/go-build, ~/go/pkg/mod). Every CI run compiled from scratch, causing kind-build-cli-image (~4m41s) and kind-build-task-bundle (~4m16s) to dominate wall-clock time. Fix by switching to read-write actions/cache targeting the correct directories with a content-based key derived from go.sum files. Also download a pre-built tkn binary (version extracted from tools/go.mod) instead of compiling the Tekton CLI from source on every run. The Makefile task-bundle target now uses tkn from PATH when available, falling back to go run for local development. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The conforma/github-workflows/golang-version-check action scans all files matching Dockerfile* for a Go builder stage. Our minimal acceptance Dockerfile has no build stage (binaries are pre-built on the host), causing the check to fail. Rename to acceptance.Dockerfile so it doesn't match the Dockerfile* glob. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three changes to make test failures immediately visible: - Gate logExecution on scenario failure so passing scenarios are silent - Gate conftest debug output on command failure - Switch godog formatter from "pretty" to "progress" (override with -format flag or EC_ACCEPTANCE_FORMAT env var) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Route snapshot artifact debug prints (file descriptor, manifest, repo, digest) through the per-scenario file logger instead of stdout - Suppress k8s client-side throttling warnings by redirecting klog output to io.Discard Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
klog defaults to logtostderr=true, which bypasses SetOutput entirely. Must call LogToStderr(false) first so SetOutput(io.Discard) takes effect. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The shell runs with -e (set in .SHELLFLAGS), so a go test failure aborts before the timing echo. Capture the exit code and re-propagate it after printing duration and collecting coverage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove duplicated error from failed scenarios summary since go test already prints the full error via --- FAIL: lines. Move summary and profiling report to TestMain so they appear after all go test output. Gate verbose execution details behind -verbose flag, use diff for stderr assertions, and suppress progress formatter output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instrument key scenario steps (create-namespace, create-policy, create-snapshot, run-task, await-task-done, build-snapshot-artifact) and extract retrospective Tekton TaskRun timing data (scheduling overhead, per-step execution, overall execution time) to identify performance bottlenecks in Kind scenarios. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strip CPU resource requests from task bundle steps in acceptance tests to eliminate Tekton scheduling waterfall. Each TaskRun pod requested 2600m CPU, limiting concurrent pods to 1-2 on CI (3.5 allocatable CPUs), effectively serializing 26 Kind scenarios. Removing requests allows all concurrent pods to schedule immediately. Also set imagePullPolicy: IfNotPresent on CLI image steps as a defensive measure, and parallelize namespaced resource application in applyResources by applying cluster-scoped resources first then namespaced resources concurrently via errgroup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Download the official kubectl binary from dl.k8s.io in the CI workflow, version-matched to tools/kubectl/go.mod, avoiding compilation of the entire k8s.io/kubernetes dependency tree. In image.go, prefer a PATH-available kubectl over go build, falling back to source build for local development without kubectl installed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The require directive is a single-line format where awk $2 gives the module path, not the version. Use grep -oE to extract the version pattern directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the top-priority improvements identified by the EC-1710 spike
investigation, which profiled acceptance tests in both local (22-core,
6m56s) and CI (4 vCPU, 17m34s) environments. All changes are scoped to
the acceptance test infrastructure -- no production code is affected.
Four independent improvements:
Container termination in After hooks (WI-3): Store testcontainers
references in scenario state and call Terminate() in sc.After hooks,
respecting the persist flag. Previously all 285 containers accumulated
with zero cleanup during test runs.
Per-scenario file-based logging (WI-4): Replace testing.T log
delegation with per-scenario temp files, eliminating log interleaving
across parallel goroutines. Remove shouldSuppress() and /dev/tty
workarounds. Replace os.Exit(1) with t.Fatalf for proper test cleanup.
Failed scenario log file paths now appear in the summary.
Parallelize task bundle builds (WI-1): Convert sequential
buildTaskBundleImage loop to use errgroup for concurrent execution.
Each version produces an independent image tag. Target: reduce
kind-build-task-bundle CI time by 50%+ (from 2m12s baseline).
CLI image caching (WI-2): Compute a SHA-256 content hash of Go source
files, Dockerfile, and build scripts before invoking make push-image.
Skip the build when the hash matches a cached marker file. Cache is
per-registry-port for concurrent run safety.
Profiling baselines from the spike exist in both local and CI
environments. Post-merge profiling comparison is planned to validate
the improvements.
Ref: https://issues.redhat.com/browse/EC-1710
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com