Cancelling a job on a Linux/macOS host runner can leave the spawned process
tree running and hang the runner — the same failure mode fixed for Windows in
#1011, just on the other platforms.
Steps are launched as process-group leaders (`Setpgid`, or `Setsid` for the PTY
path), but the default `exec.CommandContext` cancellation only kills the
**direct child**. When a step launches a shell that starts a child which in turn
spawns further background processes, cancelling the job leaves the descendants
running. Because those orphans inherited the step's stdout/stderr pipe, the read
end never hits EOF and `cmd.Wait()` blocks forever.
Because the step executor never returns:
- the orphaned processes keep running (the cancelled work is not actually
stopped), and
- end-of-job cleanup is never reached, so the runner appears to go offline / stop
picking up jobs.
## Fix
Apply the same tree-kill approach as Windows, using the Unix counterpart of a Job
Object: the **process group**.
- Add a Unix `processKiller` (`process_unix.go`) that captures the step's PGID
(== PID, since the step is launched as a group leader) and sends `SIGKILL` to
the whole group on cancellation. This also closes the inherited pipe handles so
`cmd.Wait()` can return. `ESRCH` (group already gone) is not treated as an error.
- Restrict the previous no-op stub (`process_other.go`) to `plan9` and have it
fall back to a single-process kill, preserving plan9's prior behaviour.
- Wire `cmd.Cancel` (tree kill) and `cmd.WaitDelay` (10s) **unconditionally** in
`exec()` instead of Windows-only. `WaitDelay` also covers a step that
backgrounds a process holding the pipe open after the main process exits.
Reviewed-on: https://gitea.com/gitea/runner/pulls/1025
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
## Problem
Several runner code paths could drop the **tail** of a step's log output, so a
failing (or cancelled) step would show output that is missing its last line(s).
This was observed in practice and traced to four independent issues.
## Root causes & fixes
### 1. Trailing line without a newline was never flushed
`common.lineWriter` buffers output until it sees a `\n`. A final line **without**
a trailing newline (e.g. an error message printed right before a process exits,
a panic, `printf` without `\n`) stayed in the internal buffer and was never
emitted — the writer exposed no flush at all.
- Added `lineWriter.Flush()` (idempotent), a `Flusher` interface, and a
`FlushWriter(io.Writer)` helper.
- Flush at every stream EOF: the exec copy goroutine, the container `attach()`
streaming goroutine, and at step end (`useStepLogger`).
### 2. Cancellation/timeout truncated output
`waitForCommand` returned immediately on `ctx.Done()` and abandoned the
output-copy goroutine, losing output the command had already produced. It now
drains with a bounded grace period before returning. The response channel is
buffered so the goroutine can't leak if the drain times out.
### 3. `attach()` raced the final bytes
Container output was streamed in a fire-and-forget goroutine that `wait()` did
not synchronize with, so the step could proceed before the last bytes were
written. `wait()` now blocks on the streaming goroutine (bounded) so output is
fully drained and flushed first.
### 4. `::stop-commands::` silently dropped lines from the step log
Lines between `::stop-commands::<token>` and its end token were echoed without
the `raw_output` field **and** short-circuited the handler chain (`return false`),
so they never reached the step log (non-raw entries aren't appended while a step
is running). Now returns `true` so they are still captured.
Reviewed-on: https://gitea.com/gitea/runner/pulls/1028
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
Fixes#1023.
## Problem
In Windows host mode, a single stalled delete syscall (AV/EDR filter driver, unresponsive mount, dying disk) wedged the job forever at `Cleaning up container`. `HostEnvironment.Remove()` bounds every teardown phase (`terminateRunningProcesses`, both `removePathWithRetry` calls) except the `CleanUp` callback — an unbounded `os.RemoveAll(miscpath)` assigned in `startHostEnvironment`. The runner then held its capacity slot indefinitely, the task was reaped as a zombie, and there were no diagnostics.
## Fix
- **Bound the cleanup (availability):** `Remove()` now runs `CleanUp` under `hostCleanupTimeout` (30s) via `runWithTimeout`; on timeout it logs a warning and continues job completion. The stuck goroutine is left to finish (a delete syscall can't be interrupted). Added debug logs around the phase.
- **Reclaim the leak (disk hygiene):** a timed-out cleanup can leave a scratch dir behind, so the existing idle stale-dir sweep is extended to also remove orphaned host-mode scratch dirs (16-hex names) under `Host.WorkdirParent`, leaving the shared `tool_cache` and operator data untouched. The `bind_workdir` gate is dropped from `shouldRunIdleCleanup` so host-mode runners run the sweep.
Reviewed-on: https://gitea.com/gitea/runner/pulls/1024
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Updates `golang.org/x/crypto` from `v0.50.0` to `v0.52.0` (and `golang.org/x/net` from `v0.53.0` to `v0.54.0` as a transitive bump).
## Why
`make security-check` (govulncheck) reported **7 vulnerabilities**, all in `golang.org/x/crypto/ssh` at `v0.50.0`, reachable through the git action cache fetch path (`act/runner/action_cache.go` → `git.Remote.FetchContext`):
| ID | Issue |
| --- | --- |
| GO-2026-5013 | Byte arithmetic underflow/panic in `ssh` |
| GO-2026-5015 | Server panic during `CheckHostKey`/`Authenticate` |
| GO-2026-5017 | Client can cause server deadlock on unexpected responses |
| GO-2026-5018 | Pathological RSA/DSA parameters may cause DoS |
| GO-2026-5019 | Bypass of FIDO/U2F physical interaction |
| GO-2026-5020 | Infinite loop on large channel writes |
| GO-2026-5021 | Auth bypass via unenforced `@revoked` status in `knownhosts` |
All are fixed in `v0.52.0`.
Reviewed-on: https://gitea.com/gitea/runner/pulls/1027
Reviewed-by: techknowlogick <9+techknowlogick@noreply.gitea.com>
Completes the runner side of the cancellation flow, superseding #825. Two parts:
### 1. Report cancellations correctly (`fix`)
When `Reporter.Close` ran with the state still `UNSPECIFIED` and the reporter's
context had been cancelled, the synthesised final state attributed the job to
`RESULT_FAILURE` with an "Early termination" log row — misreporting a
cancellation as a generic failure. `Close` now detects the cancelled context
and finalizes the task as `RESULT_CANCELLED`.
### 2. Advertise the `cancelling` capability (`feat`)
[actions-proto-go v0.6.0](https://gitea.com/gitea/actions-proto-go) adds a
`capabilities` field to `RegisterRequest`/`DeclareRequest`, so the runner can
now tell the server it understands the transitional cancelling state:
- Bumps `gitea.dev/actions-proto-go` to `v0.6.0`.
- Adds a single `RunnerCapabilities()` source of truth exposing
`CapabilityCancelling`.
- Sends `Capabilities` on both register and declare.
With this the server records `HasCancellingSupport` and can rely on the runner
running post-step cleanup before a task is finalized as `RESULT_CANCELLED`.
## Compatibility
Wire-compatible against older servers: the new field uses a previously unused
field number (8 on `RegisterRequest`, 3 on `DeclareRequest`) and the client uses
the binary protobuf codec, so a server predating the field silently ignores it —
registration and declaration succeed and the feature simply stays off. It
activates only once both runner and server are on v0.6.0.
## Server side
The matching Gitea change (read `GetCapabilities()`, persist
`HasCancellingSupport`) is a separate PR against `gitea/gitea`.
Supersedes #825.
Reviewed-on: https://gitea.com/gitea/runner/pulls/1016
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
Reviewed-by: wxiaoguang <29147+wxiaoguang@noreply.gitea.com>
## Background
`DOCKER_USERNAME` and `DOCKER_PASSWORD` are commonly used by workflows as ordinary secrets for logging in to a private registry and pushing images. However, the runner also treated these secret names as implicit Docker pull credentials.
These credentials carry no registry information, but they were attached to every pull unconditionally. As a result, a user who configured `DOCKER_USERNAME` / `DOCKER_PASSWORD` secrets for their private registry (e.g. to push images) would have those same credentials sent to Docker Hub when pulling a public image, causing the pull to fail with authentication failure.
## Changes
- Stop using `DOCKER_USERNAME` and `DOCKER_PASSWORD` as implicit pull credentials for job containers.
- Stop injecting `DOCKER_USERNAME` and `DOCKER_PASSWORD` as pull credentials for step containers.
## ⚠️ BREAKING ⚠️
This is a breaking change.
Workflows or runner setups that previously relied on `DOCKER_USERNAME` and `DOCKER_PASSWORD` being implicitly used for Docker image pulls must migrate to an explicit authentication mechanism.
Migration options:
- For private job container images, use `container.credentials`:
```yaml
jobs:
build:
container:
image: registry.example.com/image:tag
credentials:
username: ${{ secrets.REGISTRY_USERNAME }}
password: ${{ secrets.REGISTRY_PASSWORD }}
```
- For private service container images, use service `credentials`.
- For private `uses: docker://...` or private Docker actions, configure Docker authentication in the runner environment before the job starts. For example, run `docker login` on the runner host.
`DOCKER_USERNAME` and `DOCKER_PASSWORD` can still be used as ordinary workflow secrets, for example with `docker/login-action` before pushing images.
---
Related:
- Fixes#386
---------
Co-authored-by: Nicolas <bircni@icloud.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/runner/pulls/1007
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <39446+zettat123@noreply.gitea.com>
Co-committed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
- Add GitHub-style Actions **job summaries** support (writes to `GITHUB_STEP_SUMMARY` / `workflow/SUMMARY.md`) and render them in the run UI.
- Gitea stores summaries internally (DB) and serves them in the run view payload.
- `act_runner` uploads the summary **only when Gitea advertises support** (`X-Gitea-Actions-Capabilities: job-summary`), and warns on upload failures without failing the job.
## Compatibility
- New Gitea + old runner: no upload → no summary shown (no behavior change)
- New runner + old Gitea: capability not advertised → runner skips upload (no behavior change)
## Issue
- Fixesgo-gitea/gitea#23721
Reviewed-on: https://gitea.com/gitea/runner/pulls/917
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
## Background
Remote action cache directories can be keyed by the raw `uses` string. When Gitea's `DEFAULT_ACTIONS_URL` changes, the raw `uses` value may stay the same while the resolved clone URL changes.
In that case, an existing cached clone can still point to the old `origin` URL. Reusing it may fetch from the wrong remote with credentials for the new resolved URL, causing action clone failures until the user manually clears `~/.cache/act`.
## Changes
- Verify the cached clone's `origin` URL before reusing it in `CloneIfRequired`.
- Remove the cached clone and re-clone when the existing `origin` is different from the requested URL.
## Related
- Fixes#1010
Reviewed-on: https://gitea.com/gitea/runner/pulls/1014
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <39446+zettat123@noreply.gitea.com>
Co-committed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
## Problem
Cancelling a job on a Windows host runner can leave the spawned process
tree running and hang the runner. When a step launches a shell that
starts a child which in turn spawns further GUI/background processes,
cancelling the job kills only the direct child (the default
`exec.CommandContext` behaviour). The surviving descendants inherited
the step's stdout/stderr pipe, so the read end never hit EOF and
`cmd.Wait()` blocked forever.
Because the step executor never returned:
- the orphaned processes kept running (the cancelled work was not
actually stopped), and
- end-of-job cleanup (`Remove` → `terminateRunningProcesses`) was never
reached, so the runner appeared to go offline / stop picking up jobs.
`CREATE_NEW_PROCESS_GROUP` does not help here — it affects Ctrl-C signal
delivery, not handle inheritance or tree termination.
## Fix
- Assign each Windows step process to a **Job Object** immediately after
`cmd.Start()`. Descendants created afterwards are automatically part
of the job.
- Override `cmd.Cancel` to `TerminateJobObject`, so cancellation kills
the **entire descendant tree** atomically. This also closes the
inherited pipe handles, so `cmd.Wait()` can return.
- Set `cmd.WaitDelay` (10s) as a safety net: once the process has
exited, Wait force-closes the pipes and returns rather than blocking
forever — covering the case where the job-object setup fails (e.g.
nested-job restrictions), in which we fall back to the previous
single-process kill.
- The Job Object is created **without** `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`,
so closing the handle on normal completion does not kill legitimate
background processes; the tree is only torn down on explicit cancel.
Implemented behind `runtime.GOOS == "windows"` with a Windows-only
`processKiller` (Job Object) and no-op stubs elsewhere, so non-Windows
behaviour (default cancellation + `Setpgid`) is unchanged.
## Changes
- `act/container/process_windows.go` — Job Object `processKiller`
(create / assign / terminate).
- `act/container/process_other.go` — no-op stubs (`//go:build !windows`).
- `act/container/host_environment.go` — wire `cmd.Cancel` (tree kill)
and `cmd.WaitDelay` into `exec()`.
- `go.mod` / `go.sum` — promote `golang.org/x/sys` to a direct
dependency.
## Testing
I fully tested it already
## Notes
Follow-up to the Windows leftover-process reaping in #996: that sweep
now actually runs on cancellation because the step no longer hangs
before reaching it.
Reviewed-on: https://gitea.com/gitea/runner/pulls/1011
Reviewed-by: techknowlogick <9+techknowlogick@noreply.gitea.com>
`containerReference.id` was cached from `Create()` and never re-validated, so a container torn down out-of-band (AutoRemove on an unexpected exit, daemon-side cleanup, sibling-job race in a parallel matrix) left a stale id behind. The next `Copy`/`Exec` then hit the daemon with that dead id and failed the otherwise-successful job with `Could not find the file /var/run/act/ in container <id>`.
`find()` now `ContainerInspect`s the cached id and clears it only on a definitive `NotFound`; transient errors trust the cache so cleanup pipelines don't abort on a daemon blip. Operations that need a live container (`copyContent`/`copyDir`/`CopyTarStream`/`exec`/`GetContainerArchive`) fail fast with a clear `container "<name>" does not exist` instead of the daemon's generic empty-id error.
---
This PR was written with the help of Claude Opus 4.7
---------
Co-authored-by: Nicolas <bircni@icloud.com>
Reviewed-on: https://gitea.com/gitea/runner/pulls/1003
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <2021+silverwind@noreply.gitea.com>
Co-committed-by: silverwind <2021+silverwind@noreply.gitea.com>
`TestGetImagePullOptions` left docker/cli's process-global config dir pointed at `testdata/docker-pull-options` (which ships dummy `username:password` creds) via `config.SetDir`, without restoring it. Because that override is process-global, every later docker-gated test in the package then pulled with those creds — `TestDockerCopyToSymlinkPath`'s `alpine:latest` pull failed with `incorrect username or password` and broke CI. The workflow's `DOCKER_CONFIG` override can't mask this, since `SetDir` wins in-process.
Restore `config.Dir()` with `t.Cleanup`, and isolate the socket tests' leaks of the exported `CommonSocketLocations` and `DOCKER_HOST` behind an `isolateSocketEnv` helper.
Refs https://gitea.com/gitea/gitea.com/issues/83
---
This PR was written with the help of Claude Opus 4.8
Reviewed-on: https://gitea.com/gitea/runner/pulls/1004
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
Running the full suite under `-race` (dropping `-short`) exposed pre-existing data races in parallel matrix-job execution, fixed by not sharing mutable state across combinations:
- `containerDaemonSocket()`/`validVolumes()` derive per-job values instead of mutating shared `Config`
- `getWorkflowSecrets` builds a fresh map, `rc.steps()` clones each step, and go-git workdir access is serialized
- every write to a shared `Job`'s result/outputs runs under a per-`Job` lock, each combo interpolating outputs from a pristine snapshot (last wins, as on GitHub)
### Test suite
- capability gates (docker / network / host-tools / Linux) replace the `-short` skips, and the suite runs offline via local fixtures (the artifact flow uses an in-process loopback server, only the docker-action force-pull needs the network)
- drops redundant tests, adds a regression test for https://gitea.com/gitea/runner/issues/981 and a docker-in-docker harness (`make test-dind`)
---
This PR was written with the help of Claude Opus 4.7
Reviewed-on: https://gitea.com/gitea/runner/pulls/994
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
## Summary
- When Gitea cancels a job, the reporter cancels its own task context; the final Close() flush then aborted on that same cancelled context and Gitea never received the runner's acknowledgement (missing tail logs and final state).
- On Windows the cancelled context also neutralised terminateRunningProcesses, leaving step grandchildren alive in the workspace, holding file handles, so the runner could no longer clean up and pick up new work.
- Reporter.Close() now flushes on a detached, bounded context via a new rpcCtx() helper and configurable Runner.ReportCloseTimeout (default 10s).
- terminateRunningProcesses now PowerShell-enumerates Win32_Process and taskkill /T /F's every process whose ExecutablePath or CommandLine references the job's workspace directories, on a detached context.
- The daemon heartbeat loop still exits on <-r.ctx.Done(): the runner is intentionally seen as offline by Gitea during cleanup so it isn't handed a new task overlapping the in-progress teardown.
## Test plan
- [x] go test ./internal/pkg/report/... ./act/container/ -run 'TestReporter_ServerCancelStillFlushesFinal|TestBuildWindowsWorkspaceKillScript'
- [x] make fmt && make lint-go - 0 issues
- [x] GOOS=windows go build ./... - clean
- [x] Manual on a Windows runner: trigger a long-running workflow, cancel from Gitea UI; verify (a) the job ends with tail logs + cancelled state in Gitea, (b) workspace cleans up, (c) the runner picks up a new job without restart.
Authored-by: bircni
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: silverwind <me@silverwind.io>
Reviewed-on: https://gitea.com/gitea/runner/pulls/996
Reviewed-by: silverwind <2021+silverwind@noreply.gitea.com>
## Summary
- Non-raw_output log entries above the globally configured `log.level` are no longer forwarded to the Gitea job log output
- Step output (`raw_output=true`) is always forwarded regardless of level — it is actual job stdout/stderr, not runner internals
- State-machine fields (`stepResult`, `jobResult`) are always processed regardless of level, preserving correct tracking for skipped steps (whose `stepResult` is emitted at `DebugLevel` in `step.go`)
- Extracts a `shouldAppendLogRow` helper to avoid repeating the combined `!duringSteps() && entry.Level <= log.GetLevel()` guard in three places
## Why not the approach in #677
PR #677 adds `if entry.Level != log.GetLevel() { return nil }` at the top of `Fire()`. That has two bugs:
1. Uses `!=` instead of `>`, so `Error`/`Fatal` entries are dropped when the configured level is `Warn`
2. Returns early before processing `stepResult`/`jobResult` state fields — skipped steps (whose `stepResult` is logged at `DebugLevel`) would never be marked complete
This fix instead applies the level guard only at the `r.logRows` append sites, leaving state tracking unconditional.
Relates to #409.
Reviewed-on: https://gitea.com/gitea/runner/pulls/989
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Pin floating image tags:
- `golang` → `1.26-alpine3.23`
- `docker` dind variants → `29.5.2`
- `alpine` (basic stage + test fixture) → `3.23`
`ubuntu:24.04` and `scratch` left unchanged (no more-specific tag).
---
This PR was written with the help of Claude Opus 4.7
Reviewed-on: https://gitea.com/gitea/runner/pulls/992
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
The `TestRunEvent*` integration tests are skipped in CI (`make test` runs `-short`), which hid several breakages that make them fail when run locally:
- `runTest` built the runner `Config` without `ContainerMaxLifetime`, so the job container ran `/bin/sleep 0` and exited immediately — every step failed with "container is not running". Set it to 1h.
- The root `.gitignore`'s unscoped `.env` and `dist` rules shadowed fixtures under `testdata/`. Anchored `dist` → `/dist` (the goreleaser output) and un-ignored `testdata/secrets/.env`.
- Added the missing `testdata/secrets/.env` fixture for `TestRunEventSecrets`.
- The `node24` local action referenced a `dist/index.js` bundle that was never committed (and was gitignored). Made the fixture self-contained (dependency-free ESM, `main: index.js`) so it runs without an `ncc` build. If you'd rather keep the `@actions/core`-based action and commit the built bundle instead, happy to switch.
Network-dependent subtests (remote `uses:`/composite actions) are out of scope.
---
This PR was written with the help of Claude Opus 4.7
---------
Co-authored-by: Nicolas <bircni@icloud.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-on: https://gitea.com/gitea/runner/pulls/987
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
The teardown that removes a job's per-job network and container runs as a `Finally` on the step pipeline in `newJobExecutor`, which only executes after a successful start. When the start itself fails (e.g. a `docker cp` error from a buggy daemon), that `Finally` is skipped, so the network and container leak until Docker's address pool is exhausted and later jobs can no longer create networks.
This tears them down in `startContainer` when the start returns an error, reusing the existing `cleanUpJobContainer` teardown.
Exposed by the daemon regression in https://gitea.com/gitea/runner/issues/981, where every failed `docker cp` leaked a per-job network.
---
This PR was written with the help of Claude Opus 4.7
Reviewed-on: https://gitea.com/gitea/runner/pulls/986
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
The runner already enforces Gitea v1.21+ (see the `connect.CodeUnimplemented` check in `daemon.go`), so several shims kept for v1.20 compatibility have been dead since 2023:
- `compatibleWithOldEnvs` — the `GITEA_DEBUG`, `GITEA_TRACE`, `GITEA_RUNNER_CAPACITY`, `GITEA_RUNNER_FILE`, `GITEA_RUNNER_ENVIRON`, `GITEA_RUNNER_ENV_FILE` env vars (superseded by the config file)
- `VersionHeader` (`x-runner-version`) and the `version` param of `client.New`
- `AgentLabels` field in `RegisterRequest` (replaced by `Labels`)
Also replaces a verbose `strings.TrimRightFunc` closure with `strings.TrimRight(s, "\r\n")` in the log row parser.
---
This PR was written with the help of Claude Opus 4.7
---------
Co-authored-by: Nicolas <bircni@icloud.com>
Reviewed-on: https://gitea.com/gitea/runner/pulls/978
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
Adds `org.opencontainers.image.source` and `org.opencontainers.image.version` labels to all three image variants (`basic`, `dind`, `dind-rootless`).
- `source` lets tools like renovate retrieve release notes from the source repo.
- `version` exposes the build version on the image itself.
Both `release-tag` and `release-nightly` workflows pass `VERSION` as a build arg so the label reflects the actual git tag (or `git describe` output for nightly).
---
This PR was written with the help of Claude Opus 4.7
---------
Reviewed-on: https://gitea.com/gitea/runner/pulls/975
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
My builds kept flaking out with errors like `invalid format delimiter 'ghadelimiter_...' not found before end of file` or just strange failures in the complete job. After some digging I found an issue in `parseEnvFile` and have tested this fix against the test case presented.
- `parseEnvFile` reads `$GITHUB_ENV` / `$GITHUB_OUTPUT` with a `bufio.Scanner` using the default 64 KiB token size, and never checks `s.Err()`.
- Any action that writes a multi-line value with a single line >64 KiB silently aborts the scan with `bufio.ErrTooLong`, which surfaces as the misleading `"invalid format delimiter
'ghadelimiter_…' not found before end of file"`.
- Real-world trigger: `docker/build-push-action`'s `metadata` output embeds the full `GITHUB_EVENT_PATH` payload via buildx provenance; a long PR description (e.g. a Renovate dependency
table) puts the body field on one JSON-escaped line well past 64 KiB.
- Raise the scanner buffer to 1 MiB so realistic outputs parse.
### Reproduction
Test this in an action. This removes the `docker/build-push-action` aspect and reproduces it directly.
```yaml
jobs:
repro:
runs-on: ubuntu-latest
steps:
- id: big
run: |
{
echo 'value<<EOF'
head -c 70000 /dev/urandom | base64 -w0
echo
echo 'EOF'
} >> "$GITHUB_OUTPUT"
```
---------
Co-authored-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <me@silverwind.io>
Reviewed-on: https://gitea.com/gitea/runner/pulls/974
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Reviewed-by: silverwind <2021+silverwind@noreply.gitea.com>
Co-authored-by: Jacob Alberty <jacob.alberty@gmail.com>
Co-committed-by: Jacob Alberty <jacob.alberty@gmail.com>
Removes code that whole-program reachability analysis (`deadcode` from `golang.org/x/tools`) confirmed unreachable, plus the `act/workflowpattern` package which no file outside its own directory imports.
- `act/common/draw.go` — CLI box-drawing helpers left over from nektos/act's dropped CLI
- `act/common/file.go` — `CopyFile`/`CopyDir` package-level helpers (container types have their own `CopyDir` methods, kept)
- `act/common/executor.go` — `Warning` type and `Warningf`. The `case Warning:` arm in `(Executor).Then`'s type switch was dead too (no code ever constructed a `Warning`); the switch is replaced with `if err != nil { return err }`
- `act/lookpath/env.go` — `LookPath` no-arg wrapper and `defaultEnv` struct. Only `LookPath2(file, env)` was used externally; the `Env` interface is kept
- `act/runner/action_cache_offline_mode.go` — `GoGitActionCacheOfflineMode` wrapper, never instantiated
- `act/workflowpattern/` — entire package, never imported
Net `-943` lines.
---
This PR was written with the help of Claude Opus 4.7
---------
Co-authored-by: Nicolas <bircni@icloud.com>
Reviewed-on: https://gitea.com/gitea/runner/pulls/971
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>
It displayed an unused log and start an unused go routine. We should check the executors number before continue.
```
INFO[2026-05-12T21:01:04-07:00] Running job with maxParallel=1 for 1 matrix combinations
INFO[2026-05-12T21:01:04-07:00] NewParallelExecutor: Creating 1 workers for 1 executors
INFO[2026-05-12T21:01:04-07:00] NewParallelExecutor: Creating 1 workers for 0 executors
INFO[2026-05-12T21:01:04-07:00] NewParallelExecutor: Creating 1 workers for 0 executors
```
Reviewed-on: https://gitea.com/gitea/runner/pulls/960
Reviewed-by: silverwind <2021+silverwind@noreply.gitea.com>
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-committed-by: Lunny Xiao <xiaolunwen@gmail.com>
Fixes#956.
Pseudo-TTY allocation is now an explicit, runner-wide opt-in via `runner.allocate_pty`, applied to both host and docker backends. Default is off, matching GitHub `actions/runner`.
```yaml
runner:
allocate_pty: false # default
```
**Before:** the host backend hardcoded `if true /* allocate Terminal */` and the docker backend used `term.IsTerminal(os.Stdout.Fd())`. As a result, `docker build` (and other TTY-aware tools) saw a TTY and emitted cursor-control redraw frames that flooded captured logs with thousands of duplicate-looking progress lines — only on host-mode runners in production, and on docker-mode runners when the daemon happened to be launched from a shell rather than a service.
**After:** both backends consult `Config.AllocatePTY`. The `term.IsTerminal` heuristic is gone, so behavior no longer depends on whether the daemon has a controlling terminal.
**Reproduction:** running `docker build` through `HostEnvironment.Exec` with output captured to a buffer:
| | Before (`if true`) | After (`AllocatePTY=false`) |
|---|---:|---:|
| bytes captured | 18,167 | 1,048 |
| ANSI CSI sequences | 556 | 0 |
| cursor-up `\e[1A` | 181 | 0 |
**Side fix:** `ptyWriter.AutoStop` is now `atomic.Bool`. The field is written from the exec goroutine after `cmd.Wait()` and read from the `copyPtyOutput` goroutine via `ptyWriter.Write`; existing tests never tripped the race detector because their commands produced no output before exit. The new host-mode test does.
---
This PR was written with the help of Claude Opus 4.7
---------
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Nicolas <bircni@icloud.com>
Reviewed-on: https://gitea.com/gitea/runner/pulls/961
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: silverwind <2021+silverwind@noreply.gitea.com>
Co-committed-by: silverwind <2021+silverwind@noreply.gitea.com>
Bumps `github.com/avast/retry-go` v4.7.0 -> v5.0.0, `golangci-lint` v2.11.4 -> v2.12.2 (aligns with gitea/gitea), and pins `govulncheck` to v1.3.0.
- `retry-go` v5 replaces the package-level `retry.Do(fn, opts...)` with a builder API `retry.New(opts...).Do(fn)`. The single call site in `internal/pkg/report/reporter.go` was migrated.
- `golangci-lint` v2.12.2 surfaces three new findings in `act/` (modernize/slicesbackward, govet/inline): one backward loop now uses `slices.Backward`, and the deprecated `reflect.Ptr` alias is replaced with `reflect.Pointer`.
- `go.mod`: the two direct-`require` blocks are merged into one, and a stray `gopkg.in/yaml.v3 // indirect` is moved into the indirect block. Purely cosmetic; `go.sum` is unchanged.
---
This PR was written with the help of Claude Opus 4.7
Reviewed-on: https://gitea.com/gitea/runner/pulls/965
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>