
[WIP] Improve acceptance test performance and reliability#3230

Open
st3penta wants to merge 19 commits into conforma:main from st3penta:EC-1710-impl

Conversation

@st3penta
Contributor

Implements the top-priority improvements identified by the EC-1710 spike
investigation, which profiled acceptance tests in both local (22-core,
6m56s) and CI (4 vCPU, 17m34s) environments. All changes are scoped to
the acceptance test infrastructure -- no production code is affected.

Four independent improvements:

  1. Container termination in After hooks (WI-3): Store testcontainers
    references in scenario state and call Terminate() in sc.After hooks,
    respecting the persist flag. Previously all 285 containers accumulated
    with zero cleanup during test runs.

  2. Per-scenario file-based logging (WI-4): Replace testing.T log
    delegation with per-scenario temp files, eliminating log interleaving
    across parallel goroutines. Remove shouldSuppress() and /dev/tty
    workarounds. Replace os.Exit(1) with t.Fatalf for proper test cleanup.
    Failed scenario log file paths now appear in the summary.

  3. Parallelize task bundle builds (WI-1): Convert sequential
    buildTaskBundleImage loop to use errgroup for concurrent execution.
    Each version produces an independent image tag. Target: reduce
    kind-build-task-bundle CI time by 50%+ (from 2m12s baseline).

  4. CLI image caching (WI-2): Compute a SHA-256 content hash of Go source
    files, Dockerfile, and build scripts before invoking make push-image.
    Skip the build when the hash matches a cached marker file. Cache is
    per-registry-port for concurrent run safety.

Profiling baselines from the spike exist in both local and CI
environments. Post-merge profiling comparison is planned to validate
the improvements.

Ref: https://issues.redhat.com/browse/EC-1710

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

st3penta and others added 3 commits April 13, 2026 18:13
Profiling showed all 285 containers accumulated during the test run
with zero termination, wasting memory and causing Docker daemon
overhead. Each container type (registry, git, wiremock) now stores
the testcontainers.Container reference and calls Terminate(ctx) in
the sc.After hook, respecting the persist flag.

Ref: EC-1710

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each scenario now writes to its own temp file, eliminating log
interleaving across parallel goroutines. Removes shouldSuppress()
which incorrectly filtered Error-level messages, and removes
/dev/tty writes that failed silently in CI. Replaces os.Exit(1)
with t.Fatalf() so TestMain cleanup runs properly. Failed scenario
log file paths are printed in the test summary.

Ref: EC-1710

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sequential per-version task bundle build loop was the largest CI
regression at 4.9x slower than local (2m12s vs 27s). Each version
produces an independent bundle image, so the builds are safe to run
concurrently. Uses errgroup to propagate the first error and cancel
remaining builds via context.

Ref: EC-1710

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Apr 13, 2026

Important

Review skipped

Ignore keyword(s) in the title.

⛔ Ignored keywords (1)
  • WIP

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 3a719229-21a7-45b2-a753-6dbcb5dd061a

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Adds per-scenario file logging, a runtime profiling package with CI artifact upload, container lifecycle tracking for testcontainers, CLI build caching and concurrent task-bundle builds for kind clusters, and acceptance-test failure/log tracking integrated into scenario lifecycle.

Changes

Cohort / File(s): Summary

• Logging System (acceptance/log/log.go, acceptance/log/log_test.go): Replaced testing.T-backed delegation with a per-scenario file-based logger. Logger now exposes Close() and LogFile(). Removed suppression logic; tests rewritten to assert on log files, caching, uniqueness, and isolation.
• Profiling Package & CI (acceptance/profile/profile.go, .github/workflows/checks-codecov.yaml): Added profile package (Init/Begin/BeginContainer/ScenarioStart/ScenarioEnd/Report) that emits JSONL events and a summary report; CI workflow now collects /tmp/ec-profile and report.txt as artifacts.
• Acceptance Test Harness (acceptance/acceptance_test.go): Extended failedScenario with LogFile; integrated profile.ScenarioStart/ScenarioEnd into scenario hooks; uses the scenario logger's LogFile() and Close() on teardown; records or removes per-scenario logs; a non-zero godog exit now fails the test via t.Fatalf.
• Container Lifecycle Management (acceptance/git/git.go, acceptance/registry/registry.go, acceptance/wiremock/wiremock.go): State structs now hold the running testcontainers.Container. Container startup is wrapped in profile spans; started container handles are saved; sc.After hooks terminate containers for non-persisted runs and signal profile.ContainerTerminated() (warnings logged on termination errors).
• Kubernetes / kind: Build & Profiling (acceptance/kubernetes/kind/image.go, acceptance/kubernetes/kind/kind.go): Added SHA-256 hashing of CLI build inputs and a marker-file cache to skip redundant make push-image; buildTaskBundleImage now runs builds concurrently with errgroup; cluster start flow instrumented with profiling spans and explicit span endings.
• Module Dependencies (acceptance/go.mod): Promoted golang.org/x/sync v0.19.0 to a direct require; removed indirect golang.org/x/sync and github.com/stretchr/objx entries.
• Other tests / CI (acceptance/acceptance_test.go, acceptance/log/log_test.go): Updated tests and the test harness to use file-based logs and integrate profiling lifecycle calls.

Sequence Diagram(s)

sequenceDiagram
    participant TestRunner
    participant Profile
    participant ScenarioLogger
    participant ContainerManager

    TestRunner->>Profile: ScenarioStart(ctx)
    TestRunner->>ScenarioLogger: LoggerFor(ctx) -> create temp log file
    Note right of ScenarioLogger: per-scenario file opened
    TestRunner->>ContainerManager: BeginContainer("type")
    ContainerManager->>Profile: record container begin
    TestRunner->>TestRunner: execute scenario steps
    TestRunner->>Profile: ScenarioEnd(ctx, name, feature)
    Profile->>ScenarioLogger: request LogFile()
    ScenarioLogger-->>TestRunner: return logFile path
    TestRunner->>ContainerManager: terminate container (if not persisted)
    ContainerManager-->>TestRunner: stop container (warn on error)
    TestRunner->>ScenarioLogger: Close() or os.Remove(logFile) on success
    TestRunner->>Profile: Report()

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

• Docstring Coverage ⚠️ Warning: Docstring coverage is 67.86%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

• Description check ✅ Passed: The description comprehensively details all four improvements (container termination, per-scenario logging, parallelized task bundle builds, and CLI image caching), provides context from the EC-1710 spike profiling, and explicitly states that only acceptance test infrastructure is affected with no production code changes.
• Title check ✅ Passed: The title clearly summarizes the main objective: improving acceptance test performance and reliability through infrastructure optimizations.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@qodo-code-review
Contributor

Review Summary by Qodo

Improve acceptance test performance and reliability with container cleanup and logging

✨ Enhancement


Walkthroughs

Description
• Implement container termination in After hooks to prevent accumulation
• Replace testing.T log delegation with per-scenario file-based logging
• Parallelize task bundle builds using errgroup for concurrent execution
• Add content-hash caching for CLI image builds to skip redundant builds
Diagram
flowchart LR
  A["Acceptance Tests"] --> B["Container Termination"]
  A --> C["File-Based Logging"]
  A --> D["Parallel Builds"]
  A --> E["Image Caching"]
  B --> B1["Git Container"]
  B --> B2["Registry Container"]
  B --> B3["WireMock Container"]
  C --> C1["Per-Scenario Temp Files"]
  C --> C2["Log File Paths in Summary"]
  D --> D1["Task Bundle Builds"]
  E --> E1["CLI Image Hash Cache"]


File Changes

1. acceptance/acceptance_test.go ✨ Enhancement +15/-21

Track and report scenario log files in test summary

• Add LogFile field to failedScenario struct to track per-scenario log files
• Update addFailure method to accept and store log file path
• Print log file paths in failure summary output
• Replace os.Exit(1) with t.Fatalf() for proper test cleanup
• Remove /dev/tty workaround and file-based logging in After hook
• Clean up log files for passing scenarios

acceptance/acceptance_test.go


2. acceptance/git/git.go ✨ Enhancement +10/-1

Terminate git container in After hook

• Add Container field to gitState struct to store testcontainers reference
• Store container reference in startStubGitServer function
• Add container termination logic in After hook respecting persist flag
• Log warnings if container termination fails

acceptance/git/git.go


3. acceptance/kubernetes/kind/image.go ✨ Enhancement +100/-9

Add CLI image caching and parallelize task bundle builds

• Add SHA-256 content hash computation for build inputs (Go source, go.mod, go.sum, Dockerfile,
 build.sh, Makefile, hack/reduce-snapshot.sh)
• Implement cache marker file per registry port for concurrent safety
• Skip CLI image build when hash matches cached value
• Write cache hash after successful build
• Convert sequential task bundle builds to concurrent execution using errgroup
• Use context cancellation to propagate errors across parallel builds

acceptance/kubernetes/kind/image.go


View more (5)
4. acceptance/log/log.go ✨ Enhancement +58/-69

Implement per-scenario file-based logging system

• Replace testing.T log delegation with per-scenario temporary file logging
• Implement fileLogger struct for file-based output with mutex protection
• Remove shouldSuppress() function that filtered log messages incorrectly
• Add Close() and LogFile() methods to Logger interface
• Create unique temp file per scenario in LoggerFor() function
• Remove dependency on testenv.TestingT context value

acceptance/log/log.go


5. acceptance/log/log_test.go 🧪 Tests +86/-25

Rewrite tests for file-based logging implementation

• Replace mock-based tests with file-based integration tests
• Add test for logger writing to file with various log levels
• Add test for logger caching behavior with same context
• Add test for logger uniqueness across different contexts
• Add test for log isolation between scenarios
• Add test for temporary file creation and naming

acceptance/log/log_test.go


6. acceptance/registry/registry.go ✨ Enhancement +22/-0

Terminate registry container in After hook

• Add Container field to registryState struct to store testcontainers reference
• Store container reference in startStubRegistry function
• Add container termination logic in After hook respecting persist flag
• Log warnings if container termination fails

acceptance/registry/registry.go


7. acceptance/wiremock/wiremock.go ✨ Enhancement +26/-4

Terminate wiremock container in After hook

• Add Container field to wiremockState struct to store testcontainers reference
• Store container reference in StartWiremock function
• Add container termination logic in separate After hook respecting persist flag
• Log warnings if container termination fails
• Fix typo in comment from "nay" to "any"

acceptance/wiremock/wiremock.go


8. acceptance/go.mod Dependencies +1/-2

Add golang.org/x/sync dependency for errgroup

• Add golang.org/x/sync v0.19.0 as direct dependency for errgroup usage
• Remove github.com/stretchr/objx v0.5.2 from indirect dependencies (no longer needed)
• Move golang.org/x/sync v0.19.0 from indirect to direct dependencies

acceptance/go.mod



@qodo-code-review
Contributor

qodo-code-review bot commented Apr 13, 2026

Code Review by Qodo

🐞 Bugs (2)   📘 Rule violations (0)   📎 Requirement gaps (0)
≡ Correctness (1) · ☼ Reliability (1)



Action required

1. Loop vars captured 🐞
Description
buildTaskBundleImage launches goroutines that close over the range variables version and tasks;
because those variables are reused across iterations, goroutines can run with the wrong
TASK_TAG/TASKS and push incorrect bundle images.
Code

acceptance/kubernetes/kind/image.go[R229-240]

+	g, gCtx := errgroup.WithContext(ctx)
for version, tasks := range taskBundles {
-		tasksPath := strings.Join(tasks, ",")
-		cmd := exec.CommandContext(ctx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */
-		if out, err := cmd.CombinedOutput(); err != nil {
-			fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err)
-			fmt.Print(string(out))
-			return err
-		}
+		g.Go(func() error {
+			tasksPath := strings.Join(tasks, ",")
+			cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */
+			if out, err := cmd.CombinedOutput(); err != nil {
+				fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err)
+				fmt.Print(string(out))
+				return err
+			}
+			return nil
+		})
Evidence
In Go, the iteration variables in a for range loop are reused; the closure passed to g.Go
references version and tasks directly, so concurrent execution can observe the final (or
otherwise incorrect) values instead of the intended per-iteration values.

acceptance/kubernetes/kind/image.go[229-240]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`buildTaskBundleImage` starts goroutines inside a `for version, tasks := range taskBundles` loop, but the goroutine closure captures the loop variables. This can cause all goroutines to use the wrong `version`/`tasks` values, producing incorrect `TASK_TAG` and `TASKS` when running concurrently.
## Issue Context
This is a classic Go `for range` closure-capture bug; the fix is to create per-iteration copies (shadow variables) before launching the goroutine.
## Fix Focus Areas
- acceptance/kubernetes/kind/image.go[229-240]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
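For reference, the per-iteration copy the reviewer describes looks like the following standalone illustration (the data here is made up; note also that Go 1.22+ already scopes range variables per iteration, so the shadow copy matters on older toolchains):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// spawnAll launches one goroutine per version, each capturing its own
// copy of the loop variable, and returns the sorted values observed.
func spawnAll(versions []string) []string {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		seen []string
	)
	for _, v := range versions {
		v := v // per-iteration shadow copy (required before Go 1.22)
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			seen = append(seen, v)
			mu.Unlock()
		}()
	}
	wg.Wait()
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(spawnAll([]string{"0.1", "0.2", "0.3"})) // [0.1 0.2 0.3]
}
```

Without the shadow copy (on pre-1.22 toolchains), several goroutines could observe the same final value of `v`, which is exactly the wrong-TASK_TAG failure mode described above.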



Remediation recommended

2. Cache hash incomplete 🐞
Description
computeSourceHash only hashes .go files plus a small hardcoded file list, but the CLI Docker build
context copies the whole repository and the binary embeds/depends on non-.go files and additional
scripts, so the cache can incorrectly skip rebuilding when those inputs change.
Code

acceptance/kubernetes/kind/image.go[R82-129]

+// computeSourceHash computes a SHA-256 hash of all build inputs for the CLI
+// image: Go source files, go.mod, go.sum, Dockerfile, build.sh, Makefile, and
+// hack/reduce-snapshot.sh. Returns a hex-encoded digest string.
+func computeSourceHash() (string, error) {
+	h := sha256.New()
+
+	// Hash individual build files
+	buildFiles := []string{
+		"go.mod",
+		"go.sum",
+		"Dockerfile",
+		"build.sh",
+		"Makefile",
+		"hack/reduce-snapshot.sh",
+	}
+	for _, f := range buildFiles {
+		if err := hashFile(h, f); err != nil {
+			if errors.Is(err, os.ErrNotExist) {
+				continue
+			}
+			return "", fmt.Errorf("hashing %s: %w", f, err)
+		}
+	}
+
+	// Hash all .go source files
+	if err := filepath.WalkDir(".", func(path string, d os.DirEntry, err error) error {
+		if err != nil {
+			return err
+		}
+
+		// Skip vendor, .git, and acceptance test directories
+		if d.IsDir() && (d.Name() == "vendor" || d.Name() == ".git" || d.Name() == "acceptance") {
+			return filepath.SkipDir
+		}
+
+		if !d.IsDir() && strings.HasSuffix(path, ".go") {
+			if err := hashFile(h, path); err != nil {
+				return err
+			}
+		}
+
+		return nil
+	}); err != nil {
+		return "", fmt.Errorf("walking source tree: %w", err)
+	}
+
+	return hex.EncodeToString(h.Sum(nil)), nil
+}
Evidence
The cache decision is based on computeSourceHash, which hashes only .go files and a few named
files. However, the Docker build copies the entire repo (COPY . .) and the CLI build incorporates
non-.go inputs (e.g., go:embed templates/JSON) and scripts like hack/derive-version.sh
(invoked by build.sh), so changes to these inputs won’t change the computed hash and can lead to a
stale CLI image being reused.

acceptance/kubernetes/kind/image.go[82-129]
Dockerfile[31-42]
build.sh[43-47]
internal/applicationsnapshot/report.go[324-326]
pkg/schema/schema.go[19-30]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`computeSourceHash()` is used to decide whether to skip rebuilding the CLI image, but it does not cover all build inputs included in the Docker build context (and thus can skip when the resulting image would change).
## Issue Context
- Dockerfile copies the entire repository (`COPY . .`).
- The codebase embeds non-`.go` assets via `go:embed` (templates/JSON).
- `build.sh` relies on additional scripts (`hack/derive-version.sh`, etc.).
A safe cache key needs to include all files that can affect the built image/binary (or an equivalent proxy like a git tree hash of tracked files).
## Fix Focus Areas
- acceptance/kubernetes/kind/image.go[82-129]
- Dockerfile[31-42]
- build.sh[43-47]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Suite logger leaks files 🐞
Description
LoggerFor now creates a temp log file whenever the context lacks a logger, but suite-level code
paths (e.g., AfterSuite calling kind.Destroy) call LoggerFor without closing/removing the created
file, leaving extra temp files (and an open fd until process exit).
Code

acceptance/log/log.go[R158-179]

 func LoggerFor(ctx context.Context) (Logger, context.Context) {
 	if logger, ok := ctx.Value(loggerKey).(Logger); ok {
 		return logger, ctx
 	}
-	delegate, ok := ctx.Value(testenv.TestingT).(DelegateLogger)
-	if !ok {
-		panic("No testing.T found in context")
+	id := counter.Add(1)
+
+	f, err := os.CreateTemp("", fmt.Sprintf("scenario-%010d-*.log", id))
+	if err != nil {
+		panic(fmt.Sprintf("failed to create scenario log file: %v", err))
 	}
+	delegate := &fileLogger{file: f}
+
 	logger := logger{
 		t:    delegate,
-		id:   counter.Add(1),
+		id:   id,
 		name: "*",
+		path: f.Name(),
 	}
 	return &logger, context.WithValue(ctx, loggerKey, &logger)
Evidence
LoggerFor unconditionally allocates a temp file on first use for a given context and returns a
logger that must be closed to release the file handle. The suite teardown calls kind.Destroy(ctx)
from AfterSuite using the suite context (not a scenario context), and Destroy obtains a logger
via LoggerFor but never closes it, so this path creates a temp file that is not cleaned up by the
per-scenario After hook.

acceptance/log/log.go[156-180]
acceptance/kubernetes/kind/kind.go[412-438]
acceptance/kubernetes/kubernetes.go[551-556]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`LoggerFor` now creates a temp file-backed logger for any context without an existing logger, but some non-scenario contexts (notably suite teardown) call `LoggerFor` and never `Close()` / remove the file. This leaves extra temp files behind and keeps file descriptors open until process exit.
## Issue Context
- Scenario contexts are cleaned up by `acceptance_test.go`’s `sc.After`, but suite context used by `AfterSuite` is not.
- `kind.Destroy` calls `log.LoggerFor(ctx)` and does not call `Close()`.
## Fix Focus Areas
- acceptance/log/log.go[156-180]
- acceptance/kubernetes/kind/kind.go[412-438]
- acceptance/kubernetes/kubernetes.go[551-556]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools





Comment on lines +229 to +240

+	g, gCtx := errgroup.WithContext(ctx)
 	for version, tasks := range taskBundles {
-		tasksPath := strings.Join(tasks, ",")
-		cmd := exec.CommandContext(ctx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */
-		if out, err := cmd.CombinedOutput(); err != nil {
-			fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err)
-			fmt.Print(string(out))
-			return err
-		}
+		g.Go(func() error {
+			tasksPath := strings.Join(tasks, ",")
+			cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */
+			if out, err := cmd.CombinedOutput(); err != nil {
+				fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err)
+				fmt.Print(string(out))
+				return err
+			}
+			return nil
+		})

Action required

1. Loop vars captured 🐞 Bug ≡ Correctness

buildTaskBundleImage launches goroutines that close over the range variables version and tasks;
because those variables are reused across iterations, goroutines can run with the wrong
TASK_TAG/TASKS and push incorrect bundle images.
Agent Prompt
## Issue description
`buildTaskBundleImage` starts goroutines inside a `for version, tasks := range taskBundles` loop, but the goroutine closure captures the loop variables. This can cause all goroutines to use the wrong `version`/`tasks` values, producing incorrect `TASK_TAG` and `TASKS` when running concurrently.

## Issue Context
This is a classic Go `for range` closure-capture bug; the fix is to create per-iteration copies (shadow variables) before launching the goroutine.

## Fix Focus Areas
- acceptance/kubernetes/kind/image.go[229-240]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
acceptance/registry/registry.go (1)

112-127: ⚠️ Potential issue | 🟠 Major

Store the registry container handle before MappedPort.

If GenericContainer succeeds and Line 121 fails, state.Container is still nil, so the new After hook cannot terminate the leaked registry container.

Suggested fix
 	registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
 		ContainerRequest: req,
 		Started:          true,
 		Logger:           logger,
 	})
 	if err != nil {
 		return ctx, err
 	}
+	state.Container = registry

 	port, err := registry.MappedPort(ctx, "5000/tcp")
 	if err != nil {
 		return ctx, err
 	}

 	state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
-	state.Container = registry
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/registry/registry.go` around lines 112 - 127, After successfully
creating the container with testcontainers.GenericContainer (the registry
variable), immediately store the handle into state.Container before calling
registry.MappedPort so that if MappedPort or any subsequent step fails the After
hook can still terminate the container; update the code around GenericContainer
and state.Container assignment (and ensure state.Container is set prior to
calling registry.MappedPort) so the registry handle is never left nil on error.
acceptance/git/git.go (1)

192-199: ⚠️ Potential issue | 🟠 Major

Partially started git containers can still leak.

state.Container is only recorded after MappedPort succeeds, and Line 330 skips cleanup when state.Up() is false. If the container starts but startup fails before both state fields are populated, this path still leaves it running.

Suggested fix
  git, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
  	ContainerRequest: req,
  	Started:          true,
  	Logger:           logger,
  })
  if err != nil {
  	return ctx, err
  }
+	state.Container = git

  port, err := git.MappedPort(ctx, "443/tcp")
  if err != nil {
  	return ctx, err
  }

  state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
  state.RepositoriesDir = repositories
-	state.Container = git

  return ctx, nil
-		if !state.Up() {
-			return ctx, nil
-		}
-
 		if state.Container != nil {
 			if err := state.Container.Terminate(ctx); err != nil {
 				logger, _ := log.LoggerFor(ctx)
 				logger.Warnf("failed to terminate git container: %v", err)
 			}
 		}
+
+		if state.RepositoriesDir == "" {
+			return ctx, nil
+		}

 		os.RemoveAll(state.RepositoriesDir)

Also applies to: 330-339

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/git/git.go` around lines 192 - 199, The container can leak because
state.Container is only set after git.MappedPort succeeds; set state.Container
as soon as the container is created (assign state.Container = git immediately
after creation) so the cleanup path that checks state.Up() will see the running
container even if MappedPort fails, and ensure HostAndPort/RepositoriesDir are
only set after MappedPort succeeds; alternatively, register a deferred cleanup
that references the created container on error. Update the logic around git (the
created container), git.MappedPort, state.Container, state.HostAndPort,
state.RepositoriesDir and the state.Up() check so partial startups still get
cleaned up.
acceptance/acceptance_test.go (1)

149-157: ⚠️ Potential issue | 🟠 Major

Keep the log file until Persist succeeds.

When scenarioErr is nil, Line 153 deletes the log before testenv.Persist(ctx) runs. If persistence then fails, the scenario becomes failed but its log is already gone and never shows up in the failure summary.

Suggested fix
-		if scenarioErr != nil {
-			tracker.addFailure(scenario.Name, scenario.Uri, logFile, scenarioErr)
-		} else {
-			// Clean up log files for passing scenarios
-			os.Remove(logFile)
-		}
-
-		_, err := testenv.Persist(ctx)
-		return ctx, err
+		_, err := testenv.Persist(ctx)
+		if err != nil {
+			tracker.addFailure(scenario.Name, scenario.Uri, logFile, err)
+			return ctx, err
+		}
+
+		if scenarioErr != nil {
+			tracker.addFailure(scenario.Name, scenario.Uri, logFile, scenarioErr)
+			return ctx, nil
+		}
+
+		// Clean up log files for passing scenarios only after all scenario teardown succeeded.
+		_ = os.Remove(logFile)
+		return ctx, nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/acceptance_test.go` around lines 149 - 157, The code currently
removes the scenario log (os.Remove(logFile)) when scenarioErr is nil before
calling testenv.Persist(ctx), which can lose logs if Persist fails; change the
flow so you only delete the log file after Persist succeeds: remove the
os.Remove(logFile) from the branch where scenarioErr == nil and instead call
os.Remove(logFile) only after testenv.Persist(ctx) returns no error (and if
Persist returns an error, ensure the log is retained and that tracker.addFailure
or equivalent is invoked so the failure summary can include the log); reference
functions/variables: scenarioErr, logFile, tracker.addFailure, and
testenv.Persist.
acceptance/wiremock/wiremock.go (1)

220-236: ⚠️ Potential issue | 🟠 Major

Store the WireMock container handle before port lookup.

If GenericContainer succeeds and Line 229 fails, state.Container is never set, so the new scenario cleanup cannot terminate the leaked WireMock container.

Suggested fix
 	w, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
 		ContainerRequest: req,
 		Started:          true,
 		Logger:           logger,
 	})
 	if err != nil {
 		return ctx, fmt.Errorf("unable to run GenericContainer: %v", err)
 	}
+	state.Container = w

 	port, err := w.MappedPort(ctx, "8080/tcp")
 	if err != nil {
 		return ctx, err
 	}

 	url := fmt.Sprintf("http://localhost:%d", port.Int())
 	state.URL = url
-	state.Container = w
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/wiremock/wiremock.go` around lines 220 - 236, The container handle
returned by testcontainers.GenericContainer (variable w) must be stored to
state.Container immediately after creation so cleanup can always find and
terminate the container even if the subsequent port lookup (w.MappedPort) fails;
modify the block in wiremock.go to set state.Container = w right after checking
err from GenericContainer (before calling w.MappedPort or fmt.Sprintf) and keep
the existing error returns for port lookup so leaked containers are avoidable by
the cleanup routine.
acceptance/kubernetes/kind/image.go (1)

53-76: ⚠️ Potential issue | 🟠 Major

Add image existence validation before accepting cache hit.

Lines 60-62 skip the image build based only on source-hash comparison against a local marker file. The code never verifies that localhost:<registryPort>/cli still exists in the registry behind that port. If the registry container restarts, gets cleared, or the port is reused with a fresh registry state, a cache hit will skip make push-image while the expected image is missing, causing later pulls to fail.

Add a registry image existence check (e.g., remote.Head()) before the cache-hit early return at line 62.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/kubernetes/kind/image.go` around lines 53 - 76, The cache-hit
logic in computeSourceHash()/image build flow relies only on a local cacheFile
and may skip building a missing registry image; before returning early where it
currently prints "[INFO] CLI image cache hit, skipping build", perform an
existence check against the registry (e.g., call a registry head/get method such
as remote.Head() or equivalent HTTP HEAD to
localhost:<k.registryPort>/v2/cli/manifests/latest) for the image repo
"localhost:<registryPort>/cli" and only treat it as a true cache hit if the
registry confirms the image exists; if the head request fails or returns 404,
fall back to running the existing build path (exec.CommandContext(... "make",
"push-image"...)) and only write cacheFile after that successful push.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@acceptance/kubernetes/kind/image.go`:
- Around line 229-243: The current loop spawns an unbounded errgroup goroutine
per version (using errgroup.WithContext and g.Go) which can overwhelm CI;
instead limit concurrency by creating a bounded worker pool or semaphore and
acquire before launching each goroutine (release when done) so at most N
concurrent make task-bundle invocations run; update the block that iterates over
taskBundles (the loop using g.Go, tasksPath, exec.CommandContext and
cmd.CombinedOutput) to acquire the semaphore, spawn the goroutine that defers
release, and use the same errgroup to collect errors, exposing the concurrency
limit as a constant or field (e.g., maxConcurrentTaskBundles) so it can be tuned
for CI.

---

Outside diff comments:
In `@acceptance/acceptance_test.go`:
- Around line 149-157: The code currently removes the scenario log
(os.Remove(logFile)) when scenarioErr is nil before calling
testenv.Persist(ctx), which can lose logs if Persist fails; change the flow so
you only delete the log file after Persist succeeds: remove the
os.Remove(logFile) from the branch where scenarioErr == nil and instead call
os.Remove(logFile) only after testenv.Persist(ctx) returns no error (and if
Persist returns an error, ensure the log is retained and that tracker.addFailure
or equivalent is invoked so the failure summary can include the log); reference
functions/variables: scenarioErr, logFile, tracker.addFailure, and
testenv.Persist.

In `@acceptance/git/git.go`:
- Around line 192-199: The container can leak because state.Container is only
set after git.MappedPort succeeds; set state.Container as soon as the container
is created (assign state.Container = git immediately after creation) so the
cleanup path that checks state.Up() will see the running container even if
MappedPort fails, and ensure HostAndPort/RepositoriesDir are only set after
MappedPort succeeds; alternatively, register a deferred cleanup that references
the created container on error. Update the logic around git (the created
container), git.MappedPort, state.Container, state.HostAndPort,
state.RepositoriesDir and the state.Up() check so partial startups still get
cleaned up.

In `@acceptance/kubernetes/kind/image.go`:
- Around line 53-76: The cache-hit logic in computeSourceHash()/image build flow
relies only on a local cacheFile and may skip building a missing registry image;
before returning early where it currently prints "[INFO] CLI image cache hit,
skipping build", perform an existence check against the registry (e.g., call a
registry head/get method such as remote.Head() or equivalent HTTP HEAD to
localhost:<k.registryPort>/v2/cli/manifests/latest) for the image repo
"localhost:<registryPort>/cli" and only treat it as a true cache hit if the
registry confirms the image exists; if the head request fails or returns 404,
fall back to running the existing build path (exec.CommandContext(... "make",
"push-image"...)) and only write cacheFile after that successful push.

In `@acceptance/registry/registry.go`:
- Around line 112-127: After successfully creating the container with
testcontainers.GenericContainer (the registry variable), immediately store the
handle into state.Container before calling registry.MappedPort so that if
MappedPort or any subsequent step fails the After hook can still terminate the
container; update the code around GenericContainer and state.Container
assignment (and ensure state.Container is set prior to calling
registry.MappedPort) so the registry handle is never left nil on error.

In `@acceptance/wiremock/wiremock.go`:
- Around line 220-236: The container handle returned by
testcontainers.GenericContainer (variable w) must be stored to state.Container
immediately after creation so cleanup can always find and terminate the
container even if the subsequent port lookup (w.MappedPort) fails; modify the
block in wiremock.go to set state.Container = w right after checking err from
GenericContainer (before calling w.MappedPort or fmt.Sprintf) and keep the
existing error returns for port lookup so leaked containers are avoidable by the
cleanup routine.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a4add62f-86ca-444e-970b-ae1ac872bc1a

📥 Commits

Reviewing files that changed from the base of the PR and between b1ede77 and c6cfb67.

📒 Files selected for processing (8)
  • acceptance/acceptance_test.go
  • acceptance/git/git.go
  • acceptance/go.mod
  • acceptance/kubernetes/kind/image.go
  • acceptance/log/log.go
  • acceptance/log/log_test.go
  • acceptance/registry/registry.go
  • acceptance/wiremock/wiremock.go

Comment on lines +229 to +243
+	g, gCtx := errgroup.WithContext(ctx)
 	for version, tasks := range taskBundles {
-		tasksPath := strings.Join(tasks, ",")
-		cmd := exec.CommandContext(ctx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */
-		if out, err := cmd.CombinedOutput(); err != nil {
-			fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err)
-			fmt.Print(string(out))
-			return err
-		}
+		g.Go(func() error {
+			tasksPath := strings.Join(tasks, ",")
+			cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */
+			if out, err := cmd.CombinedOutput(); err != nil {
+				fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err)
+				fmt.Print(string(out))
+				return err
+			}
+			return nil
+		})
 	}

-	return nil
+	return g.Wait()

⚠️ Potential issue | 🟠 Major

Cap the task-bundle fan-out.

This launches one make task-bundle per version with no limit. On CI, that can swamp the image builder/registry and reduce reliability instead of improving it.

Suggested fix
 import (
 	"archive/tar"
 	"compress/gzip"
 	"context"
 	"crypto/sha256"
 	"encoding/hex"
 	"errors"
 	"fmt"
 	"io"
 	"os"
 	"os/exec"
 	"path/filepath"
+	"runtime"
 	"strings"
@@
 	g, gCtx := errgroup.WithContext(ctx)
+	g.SetLimit(runtime.NumCPU())
 	for version, tasks := range taskBundles {
 		g.Go(func() error {
 			tasksPath := strings.Join(tasks, ",")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
	g, gCtx := errgroup.WithContext(ctx)
	g.SetLimit(runtime.NumCPU())
	for version, tasks := range taskBundles {
		g.Go(func() error {
			tasksPath := strings.Join(tasks, ",")
			cmd := exec.CommandContext(gCtx, "make", "task-bundle", fmt.Sprintf("TASK_REPO=localhost:%d/ec-task-bundle", k.registryPort), fmt.Sprintf("TASKS=%s", tasksPath), fmt.Sprintf("TASK_TAG=%s", version)) /* #nosec */
			if out, err := cmd.CombinedOutput(); err != nil {
				fmt.Printf("[ERROR] Unable to build and push the Task bundle image, %q returned an error: %v\nCommand output:\n", cmd, err)
				fmt.Print(string(out))
				return err
			}
			return nil
		})
	}
	return g.Wait()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/kubernetes/kind/image.go` around lines 229 - 243, The current loop
spawns an unbounded errgroup goroutine per version (using errgroup.WithContext
and g.Go) which can overwhelm CI; instead limit concurrency by creating a
bounded worker pool or semaphore and acquire before launching each goroutine
(release when done) so at most N concurrent make task-bundle invocations run;
update the block that iterates over taskBundles (the loop using g.Go, tasksPath,
exec.CommandContext and cmd.CombinedOutput) to acquire the semaphore, spawn the
goroutine that defers release, and use the same errgroup to collect errors,
exposing the concurrency limit as a constant or field (e.g.,
maxConcurrentTaskBundles) so it can be tuned for CI.

@codecov

codecov bot commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag Coverage Δ
acceptance 55.18% <ø> (+<0.01%) ⬆️
generative 17.90% <ø> (ø)
integration 26.65% <ø> (ø)
unit 69.01% <ø> (ø)

Flags with carried forward coverage won't be shown.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
acceptance/profile/profile.go (1)

36-36: Consider making enabled configurable at runtime.

The enabled flag is hardcoded to true, yet the CI workflow sets E2E_INSTRUMENTATION=true, which suggests profiling is meant to be opt-in. Consider reading the flag from that environment variable instead:

var enabled = os.Getenv("E2E_INSTRUMENTATION") == "true"

This would allow profiling to be disabled by default and only enabled when explicitly requested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/profile/profile.go` at line 36, The hardcoded package-level
variable enabled should be made runtime-configurable: change the initialization
of enabled to read the E2E_INSTRUMENTATION environment variable (e.g., set
enabled = os.Getenv("E2E_INSTRUMENTATION") == "true") and add the necessary
import for the os package; update any tests or callers of enabled in this
package (functions in profile.go referencing enabled) if they assume a constant
value so they still behave correctly when toggled at runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@acceptance/profile/profile.go`:
- Around line 148-156: The activeContainers counter is only incremented
(activeContainers.Add(1)) when a container finishes starting and never
decremented, so it reports total started rather than current active; update the
logic in the container lifecycle (e.g., move the increment into BeginContainer
before the closure that runs the container and add a corresponding
activeContainers.Add(-1) when the container terminates/cleanup runs), ensure
peakContainers is updated against the concurrent value, and update any uses of
activeContainers/“Final active containers” in logEvent or rename the metric to
totalContainersStarted if you intend to track starts instead of concurrency.

---

Nitpick comments:
In `@acceptance/profile/profile.go`:
- Line 36: The hardcoded package-level variable enabled should be made
runtime-configurable: change the initialization of enabled to read the
E2E_INSTRUMENTATION environment variable (e.g., set enabled =
os.Getenv("E2E_INSTRUMENTATION") == "true") and add the necessary import for the
os package; update any tests or callers of enabled in this package (functions in
profile.go referencing enabled) if they assume a constant value so they still
behave correctly when toggled at runtime.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 24056f50-fc6e-4eb4-952f-8fbc57b0cf50

📥 Commits

Reviewing files that changed from the base of the PR and between c6cfb67 and 9bf9bbf.

📒 Files selected for processing (7)
  • .github/workflows/checks-codecov.yaml
  • acceptance/acceptance_test.go
  • acceptance/git/git.go
  • acceptance/kubernetes/kind/kind.go
  • acceptance/profile/profile.go
  • acceptance/registry/registry.go
  • acceptance/wiremock/wiremock.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • acceptance/git/git.go
  • acceptance/registry/registry.go

Comment on lines +148 to +156
current := activeContainers.Add(1)
mu.Lock()
if current > peakContainers {
peakContainers = current
}
mu.Unlock()

logEvent(map[string]any{
"event": "container", "type": containerType,

⚠️ Potential issue | 🟡 Minor

activeContainers metric appears misleading—never decremented.

The activeContainers counter is incremented when a container finishes starting but is never decremented when containers terminate. This means "Final active containers" in the report will always equal the total containers started, not the actual active count.

If tracking concurrent container startups is the intent, the increment should occur before the container starts (in BeginContainer before the closure), and the decrement should occur when containers terminate. Otherwise, consider renaming to totalContainersStarted to reflect actual behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/profile/profile.go` around lines 148 - 156, The activeContainers
counter is only incremented (activeContainers.Add(1)) when a container finishes
starting and never decremented, so it reports total started rather than current
active; update the logic in the container lifecycle (e.g., move the increment
into BeginContainer before the closure that runs the container and add a
corresponding activeContainers.Add(-1) when the container terminates/cleanup
runs), ensure peakContainers is updated against the concurrent value, and update
any uses of activeContainers/“Final active containers” in logEvent or rename the
metric to totalContainersStarted if you intend to track starts instead of
concurrency.

The make push-image step takes 1m41s locally and 3m26s in CI even
when source hasn't changed. A SHA-256 hash of all build inputs
(Go source, go.mod, go.sum, Dockerfile, build.sh, Makefile, and
hack/reduce-snapshot.sh) is now computed and compared against a
per-registry-port cache marker file. When the hash matches, the
build is skipped entirely with a cache hit log message.

Ref: EC-1710

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

♻️ Duplicate comments (1)
acceptance/kubernetes/kind/image.go (1)

229-243: ⚠️ Potential issue | 🟠 Major

Cap the task-bundle fan-out.

The errgroup launches one goroutine per version without a concurrency limit. On CI with many task versions, this can saturate the image builder and registry, potentially reducing reliability rather than improving it.

Suggested fix
+import "runtime"
+
 // ... in buildTaskBundleImage:
 	g, gCtx := errgroup.WithContext(ctx)
+	g.SetLimit(runtime.NumCPU())
 	for version, tasks := range taskBundles {
 		g.Go(func() error {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/kubernetes/kind/image.go` around lines 229 - 243, The loop spawns
an unbounded goroutine per entry in taskBundles via errgroup.WithContext and
g.Go, which can saturate resources; limit concurrency by introducing a bounded
semaphore (e.g., sem := make(chan struct{}, N)) and acquire before starting each
work item and release after it finishes, then wrap the build/push logic
currently inside g.Go with a closure that captures the loop variables (copy
version, tasks into locals) to avoid loop-variable capture bugs; replace the
current g.Go calls to first do sem <- struct{}{} then g.Go(func() error { defer
func(){ <-sem }(); ... }) and return g.Wait() as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@acceptance/git/git.go`:
- Around line 338-344: The code currently calls profile.ContainerTerminated()
unconditionally after attempting to terminate state.Container, which masks
failures; change the logic in the block handling state.Container to call
profile.ContainerTerminated() only when state.Container.Terminate(ctx) returns
nil (i.e., success), otherwise log the error as is and do not mark the profile
as terminated—locate the state.Container conditional and the call to
Terminate(ctx) and move or guard profile.ContainerTerminated() so it executes
only on successful termination.
- Around line 184-203: The profile.BeginContainer("git") call is marked as
started even when subsequent steps fail and the container may leak; change the
flow so you only call the returned endContainer() after successful startup and
registration: call git.MappedPort(ctx) and on success set state.HostAndPort,
state.RepositoriesDir and state.Container = git, then call endContainer(); if
MappedPort or any step after container creation fails, ensure you terminate the
created container (e.g., call git.Terminate(ctx) or otherwise store it so the
existing cleanup can run) instead of leaving it running while skipping
endContainer().

In `@acceptance/registry/registry.go`:
- Around line 342-349: The call to profile.ContainerTerminated() is executed
unconditionally after attempting to terminate the container, which incorrectly
updates profiling metrics even when state.Container.Terminate(ctx) fails; modify
the code in the registry teardown (where testenv.FetchState[registryState](ctx)
and state.Container.Terminate(ctx) are used) so that
profile.ContainerTerminated() is only invoked when Terminate returns nil (i.e.,
on the success path), leaving the error-handling/logging path to log the failure
without updating the profile counts.
- Around line 113-131: The profiling call BeginContainer("registry") is opened
too early and endContainer() is invoked even when setup fails, causing
active-container inflation and leaks; instead, call
profile.BeginContainer("registry") only after successful port resolution and
assignment to state.Container: create the container with
testcontainers.GenericContainer, if that returns an error just return; then call
registry.MappedPort(ctx, "5000/tcp"), and if MappedPort fails call
registry.Terminate(ctx) (or registry.Stop/Remove equivalent) and return; only
after MappedPort succeeds set state.HostAndPort and state.Container and then
call profile.BeginContainer("registry") to get endContainer for later cleanup.

In `@acceptance/wiremock/wiremock.go`:
- Around line 331-338: The code currently calls profile.ContainerTerminated()
unconditionally after attempting to terminate the container; change it so
profile.ContainerTerminated() is only invoked when
state.Container.Terminate(ctx) returns nil (i.e., termination succeeded).
Specifically, inside the block that checks state.Container != nil, call
state.Container.Terminate(ctx), log any error with logger.Warnf("failed to
terminate wiremock container: %v", err) as before, and only call
profile.ContainerTerminated() in the branch where err == nil; do not call it
when termination fails or when state.Container is nil.
- Around line 222-241: The profiling endContainer() call is being invoked on
failure paths too early and a started container w can be orphaned if MappedPort
fails; update the success/failure handling so endContainer() is only called
after both testcontainers.GenericContainer (w) and w.MappedPort succeed, assign
state.Container = w before calling endContainer(), and if MappedPort returns an
error terminate w explicitly (call w.Terminate(ctx)) before returning the error;
locate and adjust logic around the GenericContainer call, the MappedPort call,
the endContainer() invocation, and the assignments to state.Container to
implement this flow.

---

Duplicate comments:
In `@acceptance/kubernetes/kind/image.go`:
- Around line 229-243: The loop spawns an unbounded goroutine per entry in
taskBundles via errgroup.WithContext and g.Go, which can saturate resources;
limit concurrency by introducing a bounded semaphore (e.g., sem := make(chan
struct{}, N)) and acquire before starting each work item and release after it
finishes, then wrap the build/push logic currently inside g.Go with a closure
that captures the loop variables (copy version, tasks into locals) to avoid
loop-variable capture bugs; replace the current g.Go calls to first do sem <-
struct{}{} then g.Go(func() error { defer func(){ <-sem }(); ... }) and return
g.Wait() as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b44b02d0-624c-4ff9-935d-2c257350e413

📥 Commits

Reviewing files that changed from the base of the PR and between 9bf9bbf and 59b26c8.

📒 Files selected for processing (10)
  • .github/workflows/checks-codecov.yaml
  • acceptance/acceptance_test.go
  • acceptance/git/git.go
  • acceptance/go.mod
  • acceptance/kubernetes/kind/image.go
  • acceptance/kubernetes/kind/kind.go
  • acceptance/log/log_test.go
  • acceptance/profile/profile.go
  • acceptance/registry/registry.go
  • acceptance/wiremock/wiremock.go
✅ Files skipped from review due to trivial changes (2)
  • acceptance/kubernetes/kind/kind.go
  • acceptance/profile/profile.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • acceptance/go.mod
  • acceptance/log/log_test.go

Comment thread acceptance/git/git.go
Comment on lines +184 to +203
endContainer := profile.BeginContainer("git")
git, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
Logger: logger,
})
if err != nil {
endContainer()
return ctx, err
}

port, err := git.MappedPort(ctx, "443/tcp")
endContainer()
if err != nil {
return ctx, err
}

state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
state.RepositoriesDir = repositories
state.Container = git

⚠️ Potential issue | 🟠 Major


Only mark the git container as started on the success path.

Lines 191 and 196 call profile.BeginContainer() on failure paths, causing failed starts to appear active in profiling. Additionally, if git.MappedPort() fails, the container is not stored in state.Container and leaks because the cleanup hook cannot terminate it.

Suggested fix
 endContainer := profile.BeginContainer("git")
 git, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
 	ContainerRequest: req,
 	Started:          true,
 	Logger:           logger,
 })
 if err != nil {
-	endContainer()
 	return ctx, err
 }
 
 port, err := git.MappedPort(ctx, "443/tcp")
-endContainer()
 if err != nil {
-	return ctx, err
+	if termErr := git.Terminate(ctx); termErr != nil {
+		return ctx, fmt.Errorf("map git port: %w; terminate container: %v", err, termErr)
+	}
+	return ctx, err
 }
+endContainer()
 
 state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
 state.RepositoriesDir = repositories
 state.Container = git
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/git/git.go` around lines 184 - 203, The
profile.BeginContainer("git") call is marked as started even when subsequent
steps fail and the container may leak; change the flow so you only call the
returned endContainer() after successful startup and registration: call
git.MappedPort(ctx) and on success set state.HostAndPort, state.RepositoriesDir
and state.Container = git, then call endContainer(); if MappedPort or any step
after container creation fails, ensure you terminate the created container
(e.g., call git.Terminate(ctx) or otherwise store it so the existing cleanup can
run) instead of leaving it running while skipping endContainer().

Comment thread acceptance/git/git.go
Comment on lines +338 to +344
if state.Container != nil {
if err := state.Container.Terminate(ctx); err != nil {
logger, _ := log.LoggerFor(ctx)
logger.Warnf("failed to terminate git container: %v", err)
}
profile.ContainerTerminated()
}

⚠️ Potential issue | 🟡 Minor


Report container termination success only when it actually succeeds.

Line 343 unconditionally calls profile.ContainerTerminated() after the termination attempt, even when the preceding error branch indicates termination failed. This masks cleanup problems in profiling reports.

Suggested fix
 if state.Container != nil {
 	if err := state.Container.Terminate(ctx); err != nil {
 		logger, _ := log.LoggerFor(ctx)
 		logger.Warnf("failed to terminate git container: %v", err)
-	}
-	profile.ContainerTerminated()
+	} else {
+		profile.ContainerTerminated()
+	}
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
	if state.Container != nil {
		if err := state.Container.Terminate(ctx); err != nil {
			logger, _ := log.LoggerFor(ctx)
			logger.Warnf("failed to terminate git container: %v", err)
		} else {
			profile.ContainerTerminated()
		}
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/git/git.go` around lines 338 - 344, The code currently calls
profile.ContainerTerminated() unconditionally after attempting to terminate
state.Container, which masks failures; change the logic in the block handling
state.Container to call profile.ContainerTerminated() only when
state.Container.Terminate(ctx) returns nil (i.e., success), otherwise log the
error as is and do not mark the profile as terminated—locate the state.Container
conditional and the call to Terminate(ctx) and move or guard
profile.ContainerTerminated() so it executes only on successful termination.

Comment on lines +113 to +131
endContainer := profile.BeginContainer("registry")
registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
Logger: logger,
})
if err != nil {
endContainer()
return ctx, err
}

port, err := registry.MappedPort(ctx, "5000/tcp")
endContainer()
if err != nil {
return ctx, err
}

state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
state.Container = registry

⚠️ Potential issue | 🟠 Major


Only mark the registry container as started after port resolution succeeds.

Calling endContainer() at lines 120 and 125 inflates the active-container profile: line 120 counts a failed GenericContainer call as an active container, and line 125 marks the container active even when MappedPort fails. In the latter case the container is also leaked, because state.Container is assigned only after port resolution succeeds, leaving no cleanup path.

Suggested fix
 endContainer := profile.BeginContainer("registry")
 registry, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
 	ContainerRequest: req,
 	Started:          true,
 	Logger:           logger,
 })
 if err != nil {
-	endContainer()
 	return ctx, err
 }
 
 port, err := registry.MappedPort(ctx, "5000/tcp")
-endContainer()
 if err != nil {
-	return ctx, err
+	if termErr := registry.Terminate(ctx); termErr != nil {
+		return ctx, fmt.Errorf("map registry port: %w; terminate container: %v", err, termErr)
+	}
+	return ctx, err
 }
+endContainer()
 
 state.HostAndPort = fmt.Sprintf("localhost:%d", port.Int())
 state.Container = registry
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/registry/registry.go` around lines 113 - 131, endContainer() is
invoked on failure paths, so failed startups inflate the active-container
profile, and the container started by testcontainers.GenericContainer leaks when
MappedPort fails because state.Container is never assigned. Remove the
endContainer() calls from the error paths, terminate the container with
registry.Terminate(ctx) when MappedPort fails before returning the error, and
call endContainer() only after port resolution succeeds.

Comment on lines +342 to +349
state := testenv.FetchState[registryState](ctx)
if state.Container != nil {
	if err := state.Container.Terminate(ctx); err != nil {
		logger, _ := log.LoggerFor(ctx)
		logger.Warnf("failed to terminate registry container: %v", err)
	}
	profile.ContainerTerminated()
}

⚠️ Potential issue | 🟡 Minor

Call profile.ContainerTerminated() only after successful termination.

Line 348 calls profile.ContainerTerminated() unconditionally, causing incorrect profiling metrics even when Container.Terminate() fails. The function decrements the active container count and increments the terminated count, so it should only be called on the success path.

Suggested fix
 if state.Container != nil {
 	if err := state.Container.Terminate(ctx); err != nil {
 		logger, _ := log.LoggerFor(ctx)
 		logger.Warnf("failed to terminate registry container: %v", err)
-	}
-	profile.ContainerTerminated()
+	} else {
+		profile.ContainerTerminated()
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/registry/registry.go` around lines 342 - 349, The call to
profile.ContainerTerminated() is executed unconditionally after attempting to
terminate the container, which incorrectly updates profiling metrics even when
state.Container.Terminate(ctx) fails; modify the code in the registry teardown
(where testenv.FetchState[registryState](ctx) and state.Container.Terminate(ctx)
are used) so that profile.ContainerTerminated() is only invoked when Terminate
returns nil (i.e., on the success path), leaving the error-handling/logging path
to log the failure without updating the profile counts.

Comment on lines +222 to +241
endContainer := profile.BeginContainer("wiremock")
w, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
	ContainerRequest: req,
	Started:          true,
	Logger:           logger,
})
if err != nil {
	endContainer()
	return ctx, fmt.Errorf("unable to run GenericContainer: %v", err)
}

port, err := w.MappedPort(ctx, "8080/tcp")
endContainer()
if err != nil {
	return ctx, err
}

url := fmt.Sprintf("http://localhost:%d", port.Int())
state.URL = url
state.Container = w

⚠️ Potential issue | 🟠 Major

Only mark WireMock as started on the success path.

Lines 229 and 234 call endContainer() on failure paths, which incorrectly increments profiling counters for failed startups. Additionally, if MappedPort() fails at line 235, the container w remains running but is never stored in state.Container, leaving it orphaned since cleanup only terminates state.Container. Move endContainer() to after both GenericContainer() and MappedPort() succeed, and explicitly terminate w on MappedPort() failure.

Suggested fix
 endContainer := profile.BeginContainer("wiremock")
 w, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
 	ContainerRequest: req,
 	Started:          true,
 	Logger:           logger,
 })
 if err != nil {
-	endContainer()
 	return ctx, fmt.Errorf("unable to run GenericContainer: %v", err)
 }
 
 port, err := w.MappedPort(ctx, "8080/tcp")
-endContainer()
 if err != nil {
-	return ctx, err
+	if termErr := w.Terminate(ctx); termErr != nil {
+		return ctx, fmt.Errorf("map wiremock port: %w; terminate container: %v", err, termErr)
+	}
+	return ctx, err
 }
+endContainer()
 
 url := fmt.Sprintf("http://localhost:%d", port.Int())
 state.URL = url
 state.Container = w
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/wiremock/wiremock.go` around lines 222 - 241, endContainer() is
invoked on failure paths and a started container w is orphaned if MappedPort
fails. Call endContainer() only after both testcontainers.GenericContainer and
w.MappedPort succeed, and if MappedPort returns an error, terminate w explicitly
with w.Terminate(ctx) before returning, so the profiling counters reflect only
successful startups and no container is leaked.

Comment on lines +331 to +338
state := testenv.FetchState[wiremockState](ctx)
if state.Container != nil {
	if err := state.Container.Terminate(ctx); err != nil {
		logger, _ := log.LoggerFor(ctx)
		logger.Warnf("failed to terminate wiremock container: %v", err)
	}
	profile.ContainerTerminated()
}

⚠️ Potential issue | 🟡 Minor

Only call profile.ContainerTerminated() when termination succeeds.

Line 337 calls profile.ContainerTerminated() unconditionally, even when state.Container.Terminate(ctx) fails on line 333. This reports the container as terminated in the profiling metrics when it may still be running.

Suggested fix
 if state.Container != nil {
 	if err := state.Container.Terminate(ctx); err != nil {
 		logger, _ := log.LoggerFor(ctx)
 		logger.Warnf("failed to terminate wiremock container: %v", err)
-	}
-	profile.ContainerTerminated()
+	} else {
+		profile.ContainerTerminated()
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acceptance/wiremock/wiremock.go` around lines 331 - 338, The code currently
calls profile.ContainerTerminated() unconditionally after attempting to
terminate the container; change it so profile.ContainerTerminated() is only
invoked when state.Container.Terminate(ctx) returns nil (i.e., termination
succeeded). Specifically, inside the block that checks state.Container != nil,
call state.Container.Terminate(ctx), log any error with logger.Warnf("failed to
terminate wiremock container: %v", err) as before, and only call
profile.ContainerTerminated() in the branch where err == nil; do not call it
when termination fails or when state.Container is nil.

Adds a profile package to the acceptance tests that tracks per-scenario
wall-clock timing, container startup timing by type (registry, git,
wiremock), Kind cluster creation sub-phases, and peak container
concurrency. The profiling report is printed to stderr at the end of
the test run, making it visible in GitHub Actions logs.

This is temporary instrumentation for the EC-1710 performance spike.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@st3penta st3penta changed the title Improve acceptance test performance and reliability [WIP] Improve acceptance test performance and reliability Apr 14, 2026
st3penta and others added 2 commits April 14, 2026 17:18
Instead of running make push-image (full multi-stage Dockerfile that
compiles Go inside the container), build the ec and kubectl binaries
locally using the host's Go cache, then inject them into a minimal
ubi-minimal base image via a new Dockerfile.acceptance.

This avoids redundant Go compilation inside the container and leverages
the host's build cache. Expected CI savings: ~2-3min (3m24s -> ~30s).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After applying cluster resources, wait only for the in-cluster registry
before starting image builds. The CLI image build and task bundle build
now run concurrently with each other and with the Tekton Pipelines
deployment, since they only need the registry to push images.

Expected CI savings: ~1m40s (builds overlap with Tekton deployment,
and with each other).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
st3penta and others added 8 commits April 16, 2026 11:41
This improves UX by giving the user a clear indication of when the
test suite started, so they can estimate when it will finish.
It also outputs the total elapsed time at the end, for informational
purposes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ref: https://redhat.atlassian.net/browse/EC-1710
The acceptance job's Go build cache was broken: it used
actions/cache/restore (read-only) with path '**' and key 'main',
which didn't match the actual cache paths (~/.cache/go-build,
~/go/pkg/mod). Every CI run compiled from scratch, causing
kind-build-cli-image (~4m41s) and kind-build-task-bundle (~4m16s)
to dominate wall-clock time.

Fix by switching to read-write actions/cache targeting the correct
directories with a content-based key derived from go.sum files.

Also download a pre-built tkn binary (version extracted from
tools/go.mod) instead of compiling the Tekton CLI from source on
every run. The Makefile task-bundle target now uses tkn from PATH
when available, falling back to go run for local development.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The conforma/github-workflows/golang-version-check action scans all
files matching Dockerfile* for a Go builder stage. Our minimal
acceptance Dockerfile has no build stage (binaries are pre-built on
the host), causing the check to fail. Rename to acceptance.Dockerfile
so it doesn't match the Dockerfile* glob.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three changes to make test failures immediately visible:
- Gate logExecution on scenario failure so passing scenarios are silent
- Gate conftest debug output on command failure
- Switch godog formatter from "pretty" to "progress" (override with
  -format flag or EC_ACCEPTANCE_FORMAT env var)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Route snapshot artifact debug prints (file descriptor, manifest,
  repo, digest) through the per-scenario file logger instead of stdout
- Suppress k8s client-side throttling warnings by redirecting klog
  output to io.Discard

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
klog defaults to logtostderr=true, which bypasses SetOutput entirely.
Must call LogToStderr(false) first so SetOutput(io.Discard) takes effect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The shell runs with -e (set in .SHELLFLAGS), so a go test failure
aborts before the timing echo. Capture the exit code and re-propagate
it after printing duration and collecting coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove duplicated error from failed scenarios summary since go test
already prints the full error via --- FAIL: lines. Move summary and
profiling report to TestMain so they appear after all go test output.
Gate verbose execution details behind -verbose flag, use diff for stderr
assertions, and suppress progress formatter output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
st3penta and others added 4 commits April 16, 2026 17:21
Instrument key scenario steps (create-namespace, create-policy,
create-snapshot, run-task, await-task-done, build-snapshot-artifact) and
extract retrospective Tekton TaskRun timing data (scheduling overhead,
per-step execution, overall execution time) to identify performance
bottlenecks in Kind scenarios.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strip CPU resource requests from task bundle steps in acceptance tests
to eliminate Tekton scheduling waterfall. Each TaskRun pod requested
2600m CPU, limiting concurrent pods to 1-2 on CI (3.5 allocatable CPUs),
effectively serializing 26 Kind scenarios. Removing requests allows all
concurrent pods to schedule immediately.

Also set imagePullPolicy: IfNotPresent on CLI image steps as a
defensive measure, and parallelize namespaced resource application in
applyResources by applying cluster-scoped resources first then
namespaced resources concurrently via errgroup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Download the official kubectl binary from dl.k8s.io in the CI workflow,
version-matched to tools/kubectl/go.mod, avoiding compilation of the
entire k8s.io/kubernetes dependency tree. In image.go, prefer a
PATH-available kubectl over go build, falling back to source build for
local development without kubectl installed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The require directive is a single-line format where awk $2 gives the
module path, not the version. Use grep -oE to extract the version
pattern directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>