7 Commits

Author SHA1 Message Date
Renovate Bot
966d35564a fix(deps): update module go.etcd.io/bbolt to v1.5.0 2026-06-22 00:13:32 +00:00
Nicolas
6bdcb54828 feat: Enable jobs.<job_id>.timeout-minutes and jobs.<job_id>.continue-on-error (#1032)
Two `jobs.<job_id>` workflow syntax fields were parsed from YAML but silently ignored. This PR implements both:

- **`jobs.<job_id>.timeout-minutes`** — applies a context deadline around the entire job execution (container start, pre-steps, main steps, post-steps). Mirrors the existing step-level `evaluateStepTimeout`. Supports expression interpolation (e.g. `${{ env.MY_TIMEOUT }}`).

- **`jobs.<job_id>.continue-on-error`** — evaluates the expression when a job fails. If all failing matrix combinations had `continue-on-error: true`, the job does not cause the workflow run to fail (`handleFailure` skips it), and the tolerated failure reports `success` to dependent jobs through the `needs` context so jobs gated on the default `if: success()` still run (matching GitHub). The "any firm failure wins" rule is serialised under the existing per-job lock, so parallel matrix combinations are safe.
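
The job-level timeout from the first bullet can be sketched roughly as follows. This is a minimal illustration, not the runner's code: `parseTimeoutMinutes`, `withJobTimeout`, and `simulateTimedOutJob` are hypothetical names, and the input is assumed to be already interpolated. It also shows why post-steps need a fresh context once the job deadline has expired.

```go
package main

import (
	"context"
	"fmt"
	"strconv"
	"time"
)

// parseTimeoutMinutes (hypothetical helper) converts an already-interpolated
// timeout-minutes value into a duration. ok is false for empty or
// non-integer input, in which case no deadline should be applied.
func parseTimeoutMinutes(v string) (d time.Duration, ok bool) {
	m, err := strconv.ParseInt(v, 10, 64)
	if err != nil {
		return 0, false
	}
	return time.Duration(m) * time.Minute, true
}

// withJobTimeout wraps ctx in a deadline when the value parses, and returns
// ctx unchanged (with a no-op cancel) otherwise.
func withJobTimeout(ctx context.Context, v string) (context.Context, context.CancelFunc) {
	if d, ok := parseTimeoutMinutes(v); ok {
		return context.WithTimeout(ctx, d)
	}
	return ctx, func() {}
}

// simulateTimedOutJob expires a job context immediately, then derives a
// cleanup context from it. WithoutCancel keeps the parent's values but drops
// its cancellation and deadline, so only the new 5-minute limit applies.
func simulateTimedOutJob() (jobErr, postErr error) {
	jobCtx, cancelJob := context.WithTimeout(context.Background(), 0)
	defer cancelJob()
	<-jobCtx.Done()

	postCtx, cancelPost := context.WithTimeout(context.WithoutCancel(jobCtx), 5*time.Minute)
	defer cancelPost()
	return jobCtx.Err(), postCtx.Err()
}

func main() {
	ctx, cancel := withJobTimeout(context.Background(), "30")
	defer cancel()
	_, hasDeadline := ctx.Deadline()
	fmt.Println("deadline set:", hasDeadline)

	jobErr, postErr := simulateTimedOutJob()
	fmt.Println("job context:", jobErr, "| post-step context:", postErr)
}
```

An unparsable value is treated here as "no timeout", which matches the silent-degradation behaviour described for these fields; whether the runner should warn instead is a separate design question.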

Both features follow the same patterns already used at the step level (`evaluateStepTimeout` / `isContinueOnError` in `act/runner/step.go`).
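
The "any firm failure wins" rule for matrix jobs can be illustrated with a small, self-contained sketch. The `jobState` type and its methods are hypothetical stand-ins for the runner's job model, and the mutex stands in for the per-job lock mentioned above.

```go
package main

import (
	"fmt"
	"sync"
)

// jobState is a hypothetical stand-in for the runner's job model.
type jobState struct {
	mu              sync.Mutex
	result          string
	continueOnError bool // true while every failure so far was tolerated
	hasFirmFailure  bool // true once any combination failed without continue-on-error
}

// recordFailure notes one failed matrix combination. A firm failure clears
// the tolerated flag permanently; a tolerated failure only sets it if no
// firm failure has been seen, so the outcome does not depend on ordering.
func (j *jobState) recordFailure(tolerated bool) {
	j.mu.Lock()
	defer j.mu.Unlock()
	j.result = "failure"
	if !tolerated {
		j.hasFirmFailure = true
		j.continueOnError = false
		return
	}
	if !j.hasFirmFailure {
		j.continueOnError = true
	}
}

// needsResult is what dependent jobs would see through the needs context.
func (j *jobState) needsResult() string {
	j.mu.Lock()
	defer j.mu.Unlock()
	if j.result == "failure" && j.continueOnError {
		return "success"
	}
	return j.result
}

// resultFor records the given failures concurrently, one goroutine per
// matrix combination, and returns the result exposed to dependents.
func resultFor(tolerated ...bool) string {
	j := &jobState{}
	var wg sync.WaitGroup
	for _, t := range tolerated {
		wg.Add(1)
		go func(t bool) {
			defer wg.Done()
			j.recordFailure(t)
		}(t)
	}
	wg.Wait()
	return j.needsResult()
}

func main() {
	fmt.Println(resultFor(true, true))        // success
	fmt.Println(resultFor(true, false, true)) // failure
}
```

Because the firm-failure flag is sticky, the result is the same regardless of which combination finishes first.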

## Version compatibility

These changes are backward compatible. With mismatched versions the feature degrades silently to the previous behaviour (field ignored) — no errors on either side.

- `timeout-minutes`: runner-only, no server dependency.
- `continue-on-error`: requires both this runner PR and the matching Gitea server PR to take full effect. With only one side updated, the field continues to be ignored.

Related Gitea server PR: [go-gitea/gitea#38100](https://github.com/go-gitea/gitea/pull/38100)
---------

Co-authored-by: silverwind <2021+silverwind@noreply.gitea.com>
Co-authored-by: silverwind <me@silverwind.io>
Reviewed-on: https://gitea.com/gitea/runner/pulls/1032
Reviewed-by: silverwind <2021+silverwind@noreply.gitea.com>
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
2026-06-21 17:05:36 +00:00
Nicolas
007717956a feat: Add optional runner.post_task_script hook after task cleanup (#1026)
- Adds `runner.post_task_script` and `runner.post_task_script_timeout` (default `5m`) to run a host executable after each task’s built-in cleanup (post-steps, container teardown, bind-workdir removal).
- Stops task heartbeats via `Reporter.StopHeartbeats()` while the script runs so Gitea won’t assign overlapping work; the final task acknowledgement still happens in `reporter.Close()`.
- Script output goes to the runner process log; non-zero exits are warned only and do not change the job result.
- Documents lifecycle, offline behavior, timeouts, and Windows limits (`.ps1` not supported yet) in `docs/post-task-script.md`.
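
A rough sketch of the "bounded, warn-only" behaviour described above, using only the Go standard library. `runPostTaskScript` and `finishTask` are hypothetical names, heartbeat handling is not modelled, and the script path in `main` is a placeholder that is assumed not to exist.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// runPostTaskScript (hypothetical) runs a host executable with a deadline.
// Output goes to this process's stderr, standing in for the runner log.
func runPostTaskScript(path string, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, path)
	cmd.Stdout = os.Stderr
	cmd.Stderr = os.Stderr
	err := cmd.Run()
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return fmt.Errorf("timed out after %s: %w", timeout, err)
	}
	return err
}

// finishTask returns the job result unchanged: a failing or missing script
// only produces a warning.
func finishTask(jobResult, scriptPath string, timeout time.Duration) string {
	if scriptPath != "" {
		if err := runPostTaskScript(scriptPath, timeout); err != nil {
			fmt.Fprintf(os.Stderr, "warning: post-task script failed: %v\n", err)
		}
	}
	return jobResult
}

func main() {
	// Placeholder path; the warning is printed and the result stays "success".
	fmt.Println(finishTask("success", "/nonexistent/post-task-script", 5*time.Minute))
}
```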

Reviewed-on: https://gitea.com/gitea/runner/pulls/1026
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
2026-06-19 19:28:10 +00:00
Nicolas
df0370f8bf fix: Interpolate job container.volume (#1036)
Interpolate job container.volumes in GetBindsAndMounts(), matching service container volumes and other container fields (image, options).

Fixes expressions like `${{ secrets.MAME }}:/path:ro` being passed through literally and rejected as invalid bind mounts.
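
To illustrate why the order matters, here is a simplified sketch: the raw volume string is not a valid bind source until the expression is replaced. The regular expression, the `interpolate` helper, the validity check, and the secret name `DATA_DIR` are all assumptions made for this example; the runner's real expression evaluator and mount parsing are more general.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// secretExpr matches only the simple form ${{ secrets.NAME }}.
var secretExpr = regexp.MustCompile(`\$\{\{\s*secrets\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}`)

// volumeName is a simplified pattern for named volumes.
var volumeName = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9_.-]*$`)

// interpolate replaces each secrets expression with its value (empty if unset).
func interpolate(s string, secrets map[string]string) string {
	return secretExpr.ReplaceAllStringFunc(s, func(m string) string {
		return secrets[secretExpr.FindStringSubmatch(m)[1]]
	})
}

// bindSource returns the part of "source:dest[:mode]" before the first colon.
func bindSource(volume string) string {
	src, _, _ := strings.Cut(volume, ":")
	return src
}

// validSource accepts an absolute Unix path or a plain volume name.
func validSource(src string) bool {
	return strings.HasPrefix(src, "/") || volumeName.MatchString(src)
}

func main() {
	secrets := map[string]string{"DATA_DIR": "/srv/data"}
	raw := "${{ secrets.DATA_DIR }}:/path:ro"

	fmt.Println(validSource(bindSource(raw))) // false
	resolved := interpolate(raw, secrets)
	fmt.Println(resolved)                          // /srv/data:/path:ro
	fmt.Println(validSource(bindSource(resolved))) // true
}
```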

Reviewed-on: https://gitea.com/gitea/runner/pulls/1036
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
2026-06-18 02:55:30 +00:00
Nicolas
5f0636faad feat: Support ssh:// action URLs (#1035)
Adds `ssh://` to the list of recognized URL schemes in `newRemoteAction`, so a
step can reference an action over SSH, e.g.:

```yaml
uses: ssh://git@gitea.example.com/actions/checkout@v4
```

Previously only `https://` / `http://` prefixes were parsed; an `ssh://` URL
fell through to the bare `org/repo` parser and failed.
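
A simplified sketch of scheme-aware parsing is shown below. It is not the runner's `newRemoteAction`; `parseUses` and the `remoteAction` fields are hypothetical, and subpaths inside the repository are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// remoteAction is a hypothetical, reduced view of a parsed `uses:` value.
type remoteAction struct {
	URL  string // scheme plus host (and optional user@), empty for bare org/repo
	Org  string
	Repo string
	Ref  string
}

// schemes lists the URL prefixes that are recognised before falling back to
// the bare org/repo@ref form.
var schemes = []string{"https://", "http://", "ssh://"}

// parseUses splits a `uses:` value into its parts. ok is false when the value
// is malformed (missing host, org, repo, or ref).
func parseUses(uses string) (a remoteAction, ok bool) {
	rest := uses
	for _, s := range schemes {
		if strings.HasPrefix(uses, s) {
			host, path, found := strings.Cut(strings.TrimPrefix(uses, s), "/")
			if !found || host == "" {
				return remoteAction{}, false
			}
			a.URL = s + host
			rest = path
			break
		}
	}
	// The ref follows the last '@', so a user@ part of the host is unaffected.
	i := strings.LastIndex(rest, "@")
	if i < 0 {
		return remoteAction{}, false
	}
	org, repo, found := strings.Cut(rest[:i], "/")
	ref := rest[i+1:]
	if !found || org == "" || repo == "" || ref == "" {
		return remoteAction{}, false
	}
	a.Org, a.Repo, a.Ref = org, repo, ref
	return a, true
}

func main() {
	a, ok := parseUses("ssh://git@gitea.example.com/actions/checkout@v4")
	fmt.Printf("%+v %v\n", a, ok)
}
```

Stripping the scheme and host first matters because the SSH username introduces a second `@`, which a naive `org/repo@ref` split would misread.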

### How auth works

SSH auth is delegated entirely to go-git's defaults — the runner configures no
SSH-specific options:

- **Which key?** go-git falls back to the host's **ssh-agent** (`$SSH_AUTH_SOCK`).
  There is no key-file fallback, so the agent must hold a usable key. The SSH
  **username** comes from the URL, so use `ssh://git@host/...` (a bare
  `ssh://host/...` authenticates as an empty user and most servers reject it).
- **Host key trust?** Established out-of-band via the host's `known_hosts`
  (`$SSH_KNOWN_HOSTS`, `~/.ssh/known_hosts`, `/etc/ssh/ssh_known_hosts`). The
  runner host must already trust the remote; there is no accept-on-first-use.
- **Host key changes?** The clone fails with a host-key-mismatch error and stays
  failed until `known_hosts` is updated on the host. Note `InsecureSkipTLS` does
  **not** apply to SSH.

### Caching

The action cache path is derived from `{org}/{repo}` only (scheme/host are not
part of the key), so an `ssh://` action shares cache storage with the same
`org/repo` fetched over HTTP. This is unchanged by this PR and works in practice
(fetches resolve by SHA), but is worth noting.
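
The sharing described above follows directly from how the key is built. In this sketch, `actionRef` and `cacheDir` are hypothetical and the layout is an assumption that only mirrors the `{org}/{repo}` rule stated here; the real cache layout may differ.

```go
package main

import (
	"fmt"
	"path"
)

// actionRef is a hypothetical parsed action reference.
type actionRef struct {
	Scheme string
	Host   string
	Org    string
	Repo   string
}

// cacheDir builds the cache location from org and repo only; scheme and host
// are deliberately ignored, as described in the text.
func cacheDir(root string, a actionRef) string {
	return path.Join(root, a.Org, a.Repo)
}

func main() {
	overSSH := actionRef{Scheme: "ssh", Host: "gitea.example.com", Org: "actions", Repo: "checkout"}
	overHTTPS := actionRef{Scheme: "https", Host: "gitea.example.com", Org: "actions", Repo: "checkout"}

	fmt.Println(cacheDir("/cache", overSSH))
	fmt.Println(cacheDir("/cache", overSSH) == cacheDir("/cache", overHTTPS)) // true
}
```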

### Tests

Adds `ssh://` cases to `Test_newRemoteAction` covering the scheme prefix, the
`git@` username placement, and a malformed-URL rejection. The agent/known_hosts
behavior lives in go-git and is not unit-tested here.

Fixes #841

Reviewed-on: https://gitea.com/gitea/runner/pulls/1035
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
2026-06-17 20:28:40 +00:00
Nicolas
4997f33b5f docs: Improve the documentation for cache (#1034)
Reviewed-on: https://gitea.com/gitea/runner/pulls/1034
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Nicolas <bircni@icloud.com>
Co-committed-by: Nicolas <bircni@icloud.com>
2026-06-15 21:50:42 +00:00
StarAurryon
2963716953 feat: ipv6 options for network container creation (#1029)
Final proposal for enabling IPv6 on the temporary network created by the Gitea runner.

---------

Co-authored-by: Nicolas <bircni@icloud.com>
Co-authored-by: Nicolas Schwartz <9308314+StarAurryon@users.noreply.github.com>
Reviewed-on: https://gitea.com/gitea/runner/pulls/1029
Reviewed-by: Nicolas <bircni@icloud.com>
Co-authored-by: StarAurryon <206206+staraurryon@noreply.gitea.com>
Co-committed-by: StarAurryon <206206+staraurryon@noreply.gitea.com>
2026-06-15 05:05:20 +00:00
43 changed files with 1545 additions and 383 deletions

View File

@@ -160,9 +160,42 @@ Prefer a YAML file for all settings.
If `runner.labels` is set in the YAML file, those labels are used during `register` and the `--labels` CLI flag is ignored.
-#### External cache (`actions/cache`)
+#### Caching (`actions/cache`)
-If `cache.external_server` is set, you must set `cache.external_secret` to the same value on this runner and on the standalone cache server. Run the server with `gitea-runner cache-server` using a config that defines `cache.external_secret` (and matching `cache.dir` / host / port as needed). Flags `--dir`, `--host`, and `--port` on `cache-server` override the file.
+Each runner starts its own cache server automatically. Cache entries are local to that runner — runners do not share a cache by default.
**Shared cache across multiple runners**
Run one dedicated `gitea-runner cache-server` that all runners point at.
1. Create a config file for the cache server host:
```yaml
cache:
  dir: /data/actcache
  port: 8088
  external_secret: "replace-with-a-strong-random-secret"
```
2. Start the server:
```bash
gitea-runner -c cache-server-config.yaml cache-server
```
3. On every runner:
```yaml
cache:
  external_server: "http://<cache-server-host>:8088/"
  external_secret: "replace-with-a-strong-random-secret" # must match the server
```
Alternatively, mount the same NFS/CIFS share on every runner and point `cache.dir` at it — simpler, but with weaker isolation between repositories.
**S3 / MinIO** — mount object storage as a FUSE filesystem (e.g. [s3fs](https://github.com/s3fs-fuse/s3fs-fuse) or [goofys](https://github.com/kahing/goofys)) and set `cache.dir` to the mount point.
Flags `--dir`, `--host`, and `--port` on `cache-server` override the corresponding `cache.*` YAML keys; all other settings, including `external_secret`, require the config file.
#### Official Docker image
@@ -176,6 +209,16 @@ When `container.bind_workdir` is enabled, stale task workspace directories can b
- only purely numeric subdirectories under `container.workdir_parent` are treated as task workspaces and may be removed
- cleanup assumes `container.workdir_parent` is not shared across multiple runners
#### Post-task script (`runner.post_task_script`)
Optional host script that runs **after** each task's built-in cleanup (post-steps, container teardown, bind-workdir removal). Use it for extra machine housekeeping — Docker pruning, disk cleanup, and similar.
**While the script runs, the runner stops task heartbeats and stays offline from Gitea's perspective until the script exits (or hits `runner.post_task_script_timeout`, default `5m`).** A script that blocks without exiting keeps the runner from taking new work for up to that timeout. Script output goes to the runner log, not the job log; a non-zero exit is warned but does not change the job result.
On Windows, use `.exe`, `.bat`, or `.cmd` paths; **PowerShell (`.ps1`) is not supported yet** as the configured path — wrap commands in a `.cmd` file instead.
See **[docs/post-task-script.md](docs/post-task-script.md)** for lifecycle details, environment variables, timeout interaction, and platform notes.
### Example Deployments
Check out the [examples](examples) directory for sample deployment types.

View File

@@ -84,6 +84,12 @@ type NewDockerBuildExecutorInput struct {
 	Platform string
 }

+// NewDockerNetworkCreateExecutorInput the input for the NewDockerNetworkCreateExecutor function
+type NewDockerNetworkCreateExecutorInput struct {
+	EnableIPv4 *bool
+	EnableIPv6 *bool
+}
+
 // NewDockerPullExecutorInput the input for the NewDockerPullExecutor function
 type NewDockerPullExecutorInput struct {
 	Image string

View File

@@ -14,7 +14,7 @@ import (
 	"github.com/moby/moby/client"
 )

-func NewDockerNetworkCreateExecutor(name string) common.Executor {
+func NewDockerNetworkCreateExecutor(name string, opts NewDockerNetworkCreateExecutorInput) common.Executor {
 	return func(ctx context.Context) error {
 		cli, err := GetDockerClient(ctx)
 		if err != nil {
@@ -37,8 +37,10 @@ func NewDockerNetworkCreateExecutor(name string) common.Executor {
 		}

 		_, err = cli.NetworkCreate(ctx, name, client.NetworkCreateOptions{
-			Driver: "bridge",
-			Scope:  "local",
+			Driver:     "bridge",
+			Scope:      "local",
+			EnableIPv4: opts.EnableIPv4,
+			EnableIPv6: opts.EnableIPv6,
 		})
 		if err != nil {
 			return err

View File

@@ -61,7 +61,7 @@ func NewDockerVolumeRemoveExecutor(volume string, force bool) common.Executor {
 	}
 }

-func NewDockerNetworkCreateExecutor(name string) common.Executor {
+func NewDockerNetworkCreateExecutor(name string, opts NewDockerNetworkCreateExecutorInput) common.Executor {
 	return func(ctx context.Context) error {
 		return nil
 	}

View File

@@ -23,6 +23,7 @@ import (
"gitea.com/gitea/runner/act/common" "gitea.com/gitea/runner/act/common"
"gitea.com/gitea/runner/act/filecollector" "gitea.com/gitea/runner/act/filecollector"
"gitea.com/gitea/runner/act/lookpath" "gitea.com/gitea/runner/act/lookpath"
"gitea.com/gitea/runner/internal/pkg/process"
"github.com/go-git/go-billy/v5/helper/polyfill" "github.com/go-git/go-billy/v5/helper/polyfill"
"github.com/go-git/go-billy/v5/osfs" "github.com/go-git/go-billy/v5/osfs"
@@ -261,7 +262,7 @@ func setupPty(cmd *exec.Cmd, cmdline string) (*os.File, *os.File, error) {
cmd.Stdin = tty cmd.Stdin = tty
cmd.Stdout = tty cmd.Stdout = tty
cmd.Stderr = tty cmd.Stderr = tty
cmd.SysProcAttr = getSysProcAttr(cmdline, true) cmd.SysProcAttr = process.SysProcAttr(cmdline, true)
return ppty, tty, nil return ppty, tty, nil
} }
@@ -321,30 +322,14 @@ func (e *HostEnvironment) exec(ctx context.Context, command []string, cmdline st
cmd.Env = envList cmd.Env = envList
cmd.Stderr = e.StdOut cmd.Stderr = e.StdOut
cmd.Dir = wd cmd.Dir = wd
cmd.SysProcAttr = getSysProcAttr(cmdline, false) cmd.SysProcAttr = process.SysProcAttr(cmdline, false)
// A step often launches a process tree (a shell that starts a child which // Kill the step's whole process tree on cancellation (a step often launches a
// spawns further background or GUI processes). The default context // shell that spawns further background or GUI children) and bound the post-exit
// cancellation only kills the direct child, leaving the rest of the tree // I/O wait, so an orphan inheriting cmd's stdout/stderr pipe can never hang
// running; and because the orphans inherit cmd's stdout/stderr pipe, // cmd.Wait() and the runner. See process.TreeKill. The PTY path below may
// cmd.Wait() would block forever, hanging the runner. Kill the whole tree on // override SysProcAttr, but never touches Cancel/WaitDelay.
// cancellation — via a Job Object on Windows and the process group on Unix treeKill := process.NewTreeKill(cmd)
// (see processKiller) — and bound the wait so a leftover pipe writer can
// never hang Wait indefinitely.
var killer atomic.Pointer[processKiller]
cmd.Cancel = func() error {
if k := killer.Load(); k != nil {
return k.Kill()
}
if cmd.Process != nil {
return cmd.Process.Kill()
}
return nil
}
// Once the step process has exited, give its I/O pipes at most this long to
// drain before Wait force-closes them and returns (Go's WaitDelay). This
// also covers a step that backgrounds a process holding the pipe open.
cmd.WaitDelay = 10 * time.Second
var ppty *os.File var ppty *os.File
var tty *os.File var tty *os.File
@@ -375,15 +360,9 @@ func (e *HostEnvironment) exec(ctx context.Context, command []string, cmdline st
if err := cmd.Start(); err != nil { if err := cmd.Start(); err != nil {
return err return err
} }
// Capture the started process for tree-kill on cancellation: a Job Object on if k, kerr := treeKill.Capture(cmd.Process); kerr != nil {
// Windows (children spawned afterwards are auto-included) and the process
// group on Unix. On failure (e.g. Windows nested-job restrictions) we fall
// back to the default single-process kill; WaitDelay + end-of-job cleanup
// still apply.
if k, kerr := newProcessKiller(cmd.Process); kerr != nil {
common.Logger(ctx).Warnf("process tree kill setup failed, falling back to single-process kill: %v", kerr) common.Logger(ctx).Warnf("process tree kill setup failed, falling back to single-process kill: %v", kerr)
} else { } else {
killer.Store(k)
defer k.Close() defer k.Close()
} }
err = cmd.Wait() err = cmd.Wait()

View File

@@ -1,29 +0,0 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT
//go:build plan9
package container
import "os"
// processKiller falls back to single-process termination on platforms without
// a process-group / Job Object tree-kill. The Job Object (Windows) and process
// group (Unix) based tree-kills live in process_windows.go / process_unix.go;
// here we just kill the direct child, matching the previous default behaviour.
type processKiller struct {
p *os.Process
}
func newProcessKiller(p *os.Process) (*processKiller, error) {
return &processKiller{p: p}, nil
}
func (k *processKiller) Kill() error {
if k == nil || k.p == nil {
return nil
}
return k.p.Kill()
}
func (k *processKiller) Close() error { return nil }

View File

@@ -1,56 +0,0 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT
//go:build !windows && !plan9
package container
import (
"errors"
"os"
"syscall"
)
// processKiller terminates a step process together with its whole process
// group, which is the Unix counterpart of the Windows Job Object tree-kill.
//
// Background: a step often launches a process tree (a shell that starts a child
// which in turn spawns further background processes). The default
// exec.CommandContext cancellation only kills the direct child, so cancelling a
// job left the rest of the tree running. Because those orphans inherited the
// step's stdout/stderr pipe, cmd.Wait() also blocked forever and the runner
// hung.
//
// Steps are started with Setpgid (or Setsid for the PTY path, see
// getSysProcAttr), which makes the step process the leader of a new process
// group whose ID equals its PID. Signalling the negative PID delivers to every
// process still in that group, so we can tear down the whole tree atomically on
// cancellation, which also closes the inherited pipe handles so cmd.Wait() can
// return.
type processKiller struct {
pgid int
}
// newProcessKiller captures the process group of p (an already-started
// process). Because the step is launched with Setpgid/Setsid, p is a group
// leader and its PGID equals its PID; children spawned afterwards stay in the
// same group unless they explicitly create their own.
func newProcessKiller(p *os.Process) (*processKiller, error) {
return &processKiller{pgid: p.Pid}, nil
}
// Kill sends SIGKILL to the entire process group (the step process and every
// descendant that stayed in the group). A missing group (ESRCH) means the
// processes already exited and is not treated as an error.
func (k *processKiller) Kill() error {
if k == nil || k.pgid <= 0 {
return nil
}
if err := syscall.Kill(-k.pgid, syscall.SIGKILL); err != nil && !errors.Is(err, syscall.ESRCH) {
return err
}
return nil
}
// Close is a no-op on Unix; there is no job handle to release.
func (k *processKiller) Close() error { return nil }

View File

@@ -1,71 +0,0 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT
package container
import (
"os"
"golang.org/x/sys/windows"
)
// processKiller terminates a step process together with its entire descendant
// tree via a Windows Job Object.
//
// Background: a step often launches a process tree (a shell that starts a
// child which in turn spawns further GUI or background processes). The default
// exec.CommandContext cancellation only kills the direct child, so cancelling a
// job left the rest of the tree running. Because those orphans inherited the
// step's stdout/stderr pipe, cmd.Wait() also blocked forever and the runner hung.
//
// Assigning the step process to a Job Object lets us kill the whole tree
// atomically on cancellation (TerminateJobObject), which also closes the
// inherited pipe handles so cmd.Wait() can return.
type processKiller struct {
job windows.Handle
}
// newProcessKiller creates a Job Object and assigns p (an already-started
// process) to it. Children spawned by p afterwards are automatically part of
// the job. The job does NOT use JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE, so closing
// the handle on normal completion does not kill legitimate background
// processes; the tree is only torn down by an explicit Kill (cancellation).
func newProcessKiller(p *os.Process) (*processKiller, error) {
job, err := windows.CreateJobObject(nil, nil)
if err != nil {
return nil, err
}
h, err := windows.OpenProcess(windows.PROCESS_SET_QUOTA|windows.PROCESS_TERMINATE, false, uint32(p.Pid))
if err != nil {
windows.CloseHandle(job)
return nil, err
}
defer windows.CloseHandle(h)
if err := windows.AssignProcessToJobObject(job, h); err != nil {
windows.CloseHandle(job)
return nil, err
}
return &processKiller{job: job}, nil
}
// Kill terminates every process currently assigned to the job (the step process
// and all of its descendants).
func (k *processKiller) Kill() error {
if k == nil || k.job == 0 {
return nil
}
return windows.TerminateJobObject(k.job, 1)
}
// Close releases the job handle. It does not terminate the processes.
func (k *processKiller) Close() error {
if k == nil || k.job == 0 {
return nil
}
h := k.job
k.job = 0
return windows.CloseHandle(h)
}

View File

@@ -8,23 +8,10 @@ package container
import ( import (
"os" "os"
"syscall"
"github.com/creack/pty" "github.com/creack/pty"
) )
func getSysProcAttr(_ string, tty bool) *syscall.SysProcAttr {
if tty {
return &syscall.SysProcAttr{
Setsid: true,
Setctty: true,
}
}
return &syscall.SysProcAttr{
Setpgid: true,
}
}
func openPty() (*os.File, *os.File, error) { func openPty() (*os.File, *os.File, error) {
return pty.Open() return pty.Open()
} }

View File

@@ -7,15 +7,8 @@ package container
import ( import (
"errors" "errors"
"os" "os"
"syscall"
) )
func getSysProcAttr(cmdLine string, tty bool) *syscall.SysProcAttr {
return &syscall.SysProcAttr{
Setpgid: true,
}
}
func openPty() (*os.File, *os.File, error) { func openPty() (*os.File, *os.File, error) {
return nil, nil, errors.New("Unsupported") return nil, nil, errors.New("Unsupported")
} }

View File

@@ -7,15 +7,8 @@ package container
import ( import (
"errors" "errors"
"os" "os"
"syscall"
) )
func getSysProcAttr(cmdLine string, tty bool) *syscall.SysProcAttr {
return &syscall.SysProcAttr{
Rfork: syscall.RFNOTEG,
}
}
func openPty() (*os.File, *os.File, error) { func openPty() (*os.File, *os.File, error) {
return nil, nil, errors.New("Unsupported") return nil, nil, errors.New("Unsupported")
} }

View File

@@ -7,13 +7,8 @@ package container
import ( import (
"errors" "errors"
"os" "os"
"syscall"
) )
func getSysProcAttr(cmdLine string, tty bool) *syscall.SysProcAttr {
return &syscall.SysProcAttr{CmdLine: cmdLine, CreationFlags: syscall.CREATE_NEW_PROCESS_GROUP}
}
func openPty() (*os.File, *os.File, error) { func openPty() (*os.File, *os.File, error) {
return nil, nil, errors.New("Unsupported") return nil, nil, errors.New("Unsupported")
} }

View File

@@ -266,7 +266,7 @@ func (impl *interperterImpl) jobSuccess() (bool, error) { //nolint:unparam // pr
 	jobNeeds := impl.getNeedsTransitive(impl.config.Run.Job())
 	for _, needs := range jobNeeds {
-		if jobs[needs].Result != "success" {
+		if jobs[needs].NeedsResult() != "success" {
 			return false, nil
 		}
 	}
@@ -283,7 +283,7 @@ func (impl *interperterImpl) jobFailure() (bool, error) { //nolint:unparam // pr
 	jobNeeds := impl.getNeedsTransitive(impl.config.Run.Job())
 	for _, needs := range jobNeeds {
-		if jobs[needs].Result == "failure" {
+		if jobs[needs].NeedsResult() == "failure" {
 			return true, nil
 		}
 	}
} }

View File

@@ -190,23 +190,52 @@ func (w *Workflow) WorkflowCallConfig() *WorkflowCall {
// Job is the structure of one job in a workflow // Job is the structure of one job in a workflow
type Job struct { type Job struct {
Name string `yaml:"name"` Name string `yaml:"name"`
RawNeeds yaml.Node `yaml:"needs"` RawNeeds yaml.Node `yaml:"needs"`
RawRunsOn yaml.Node `yaml:"runs-on"` RawRunsOn yaml.Node `yaml:"runs-on"`
Env yaml.Node `yaml:"env"` Env yaml.Node `yaml:"env"`
If yaml.Node `yaml:"if"` If yaml.Node `yaml:"if"`
Steps []*Step `yaml:"steps"` Steps []*Step `yaml:"steps"`
TimeoutMinutes string `yaml:"timeout-minutes"` TimeoutMinutes string `yaml:"timeout-minutes"`
Services map[string]*ContainerSpec `yaml:"services"` RawContinueOnError string `yaml:"continue-on-error"`
Strategy *Strategy `yaml:"strategy"` Services map[string]*ContainerSpec `yaml:"services"`
RawContainer yaml.Node `yaml:"container"` Strategy *Strategy `yaml:"strategy"`
Defaults Defaults `yaml:"defaults"` RawContainer yaml.Node `yaml:"container"`
Outputs map[string]string `yaml:"outputs"` Defaults Defaults `yaml:"defaults"`
Uses string `yaml:"uses"` Outputs map[string]string `yaml:"outputs"`
With map[string]any `yaml:"with"` Uses string `yaml:"uses"`
RawSecrets yaml.Node `yaml:"secrets"` With map[string]any `yaml:"with"`
RawPermissions yaml.Node `yaml:"permissions"` RawSecrets yaml.Node `yaml:"secrets"`
Result string RawPermissions yaml.Node `yaml:"permissions"`
Result string
// Runtime fields set during execution (not from YAML):
ContinueOnError bool // true when all failing matrix combinations had continue-on-error=true
hasFirmFailure bool // true once any combination failed without continue-on-error
}
// SetContinueOnError records whether this combination's failure should not fail the workflow.
// Must be called under the job lock. Safe across parallel matrix combinations.
func (j *Job) SetContinueOnError(continueOnErr bool) {
if continueOnErr {
if !j.hasFirmFailure {
j.ContinueOnError = true
}
} else {
j.hasFirmFailure = true
j.ContinueOnError = false
}
}
// NeedsResult returns the job result as seen by dependent jobs through the
// `needs` context. A job that failed but was tolerated via continue-on-error
// reports "success" to its dependents, matching GitHub: such a failure must not
// block jobs gated on the default `if: success()`, even though the overall
// workflow run is still marked as failed.
func (j *Job) NeedsResult() string {
if j.Result == "failure" && j.ContinueOnError {
return "success"
}
return j.Result
} }
// Strategy for the job // Strategy for the job

View File

@@ -32,6 +32,32 @@ func TestStepCloneIsolatesMutableFields(t *testing.T) {
assert.Equal(t, "original", orig.With["arg"], "With map must not be shared with the clone") assert.Equal(t, "original", orig.With["arg"], "With map must not be shared with the clone")
} }
// TestJobNeedsResult guards the continue-on-error semantics exposed to dependent
// jobs through the `needs` context: a failed-but-tolerated job reports "success"
// so it does not block dependents gated on the default `if: success()`, matching
// GitHub. A firm failure and any non-failure result are reported verbatim.
func TestJobNeedsResult(t *testing.T) {
cases := []struct {
name string
result string
continueOnError bool
want string
}{
{"tolerated failure reports success", "failure", true, "success"},
{"firm failure reports failure", "failure", false, "failure"},
{"success is unchanged", "success", false, "success"},
{"success with continue-on-error is unchanged", "success", true, "success"},
{"empty result is unchanged", "", true, ""},
{"skipped is unchanged", "skipped", true, "skipped"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
j := &Job{Result: tc.result, ContinueOnError: tc.continueOnError}
assert.Equal(t, tc.want, j.NeedsResult())
})
}
}
func TestReadWorkflow_ScheduleEvent(t *testing.T) { func TestReadWorkflow_ScheduleEvent(t *testing.T) {
yaml := ` yaml := `
name: local-action-docker-url name: local-action-docker-url

View File

@@ -56,7 +56,7 @@ func (rc *RunContext) NewExpressionEvaluatorWithEnv(ctx context.Context, env map
 	for _, needs := range jobNeeds {
 		using[needs] = exprparser.Needs{
 			Outputs: jobs[needs].Outputs,
-			Result:  jobs[needs].Result,
+			Result:  jobs[needs].NeedsResult(),
 		}
 	}
@@ -127,7 +127,7 @@ func (rc *RunContext) NewStepExpressionEvaluator(ctx context.Context, step step)
 	for _, needs := range jobNeeds {
 		using[needs] = exprparser.Needs{
 			Outputs: jobs[needs].Outputs,
-			Result:  jobs[needs].Result,
+			Result:  jobs[needs].NeedsResult(),
 		}
 	}
} }

View File

@@ -22,6 +22,7 @@ import (
 	"gitea.com/gitea/runner/act/common"
 	"gitea.com/gitea/runner/act/container"
+	"gitea.com/gitea/runner/act/exprparser"
 	"gitea.com/gitea/runner/act/model"
 )
@@ -204,11 +205,21 @@ func newJobExecutor(info jobInfo, sf stepFactory, rc *RunContext) common.Executo
 	return common.NewPipelineExecutor(info.startContainer(), common.NewPipelineExecutor(pipeline...).
 		Finally(func(ctx context.Context) error {
 			var cancel context.CancelFunc
-			if ctx.Err() == context.Canceled {
+			switch ctx.Err() {
+			case context.Canceled:
 				// in case of an aborted run, we still should execute the
 				// post steps to allow cleanup.
 				ctx, cancel = context.WithTimeout(common.WithLogger(context.Background(), common.Logger(ctx)), 5*time.Minute)
 				defer cancel()
+			case context.DeadlineExceeded:
+				// The job hit its timeout-minutes. Without a fresh context the post
+				// steps would run against the already-expired context and be skipped,
+				// so cleanup post-hooks (e.g. actions/checkout post, cache save) would
+				// not run. Derive the context with WithoutCancel so the new deadline
+				// applies but the job error state is preserved: the job is still
+				// reported as failed and container teardown matches a normal failure.
+				ctx, cancel = context.WithTimeout(context.WithoutCancel(ctx), 5*time.Minute)
+				defer cancel()
 			}
 			return postExecutor(ctx)
 		}).
@@ -223,6 +234,12 @@ func setJobResult(ctx context.Context, info jobInfo, rc *RunContext, success boo
// read-modify-write of the job result so a failing combination is not lost-updated by a // read-modify-write of the job result so a failing combination is not lost-updated by a
// concurrent succeeding one. // concurrent succeeding one.
job := rc.Run.Job() job := rc.Run.Job()
var continueOnError bool
if !success {
// Use a fresh context so an expired job timeout cannot block expression evaluation.
evalCtx := common.WithLogger(context.Background(), common.Logger(ctx))
continueOnError = evaluateJobContinueOnError(evalCtx, rc, job)
}
jobResult := func() string { jobResult := func() string {
defer lockJob(job)() defer lockJob(job)()
result := "success" result := "success"
@@ -233,6 +250,7 @@ func setJobResult(ctx context.Context, info jobInfo, rc *RunContext, success boo
} }
if !success { if !success {
result = "failure" result = "failure"
job.SetContinueOnError(continueOnError)
} }
info.result(result) info.result(result)
return result return result
@@ -271,6 +289,32 @@ func setJobOutputs(ctx context.Context, rc *RunContext) {
} }
} }
// applyJobTimeout applies the job-level timeout-minutes to ctx, mirroring the
// step-level evaluateStepTimeout in step.go.
func applyJobTimeout(ctx context.Context, rc *RunContext, job *model.Job) (context.Context, context.CancelFunc) {
timeout := rc.ExprEval.Interpolate(ctx, job.TimeoutMinutes)
if timeout != "" {
if timeoutMinutes, err := strconv.ParseInt(timeout, 10, 64); err == nil {
return context.WithTimeout(ctx, time.Duration(timeoutMinutes)*time.Minute)
}
}
return ctx, func() {}
}
// evaluateJobContinueOnError evaluates the job-level continue-on-error expression.
func evaluateJobContinueOnError(ctx context.Context, rc *RunContext, job *model.Job) bool {
expr := strings.TrimSpace(job.RawContinueOnError)
if expr == "" {
return false
}
continueOnError, err := EvalBool(ctx, rc.NewExpressionEvaluator(ctx), expr, exprparser.DefaultStatusCheckNone)
if err != nil {
common.Logger(ctx).Warnf("continue-on-error expression %q evaluation failed: %v", expr, err)
return false
}
return continueOnError
}
func tryUploadJobSummary(ctx context.Context, rc *RunContext) {
if rc == nil || rc.JobContainer == nil || rc.Config == nil {
return


@@ -28,6 +28,7 @@ import (
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/mock"
"github.com/stretchr/testify/require"
yaml "go.yaml.in/yaml/v4"
)
func TestJobExecutor(t *testing.T) {
@@ -347,6 +348,133 @@ func TestNewJobExecutor(t *testing.T) {
}
}
// TestNewJobExecutorRunsPostStepsAfterTimeout guards the timeout-minutes cleanup
// path: when a job exceeds its timeout the job context is DeadlineExceeded, but
// the post steps (cleanup hooks like actions/checkout post and cache save) must
// still run against a fresh, non-expired context, and the job must still be
// reported as failed.
func TestNewJobExecutorRunsPostStepsAfterTimeout(t *testing.T) {
ctx := common.WithJobErrorContainer(context.Background())
// The timeout is generous so the main step (which blocks on ctx.Done below) is
// always reached before the deadline fires; otherwise the pipeline would
// short-circuit before the step runs and the job error would never be set.
ctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
defer cancel()
jim := &jobInfoMock{}
sfm := &stepFactoryMock{}
rc := &RunContext{
JobContainer: &jobContainerMock{},
Run: &model.Run{
JobID: "test",
Workflow: &model.Workflow{
Jobs: map[string]*model.Job{
"test": {},
},
},
},
Config: &Config{},
}
rc.ExprEval = rc.NewExpressionEvaluator(ctx)
stepModel := &model.Step{ID: "1"}
jim.On("steps").Return([]*model.Step{stepModel})
jim.On("matrix").Return(map[string]any{})
jim.On("startContainer").Return(func(ctx context.Context) error { return nil })
jim.On("interpolateOutputs").Return(func(ctx context.Context) error { return nil })
jim.On("closeContainer").Return(func(ctx context.Context) error { return nil })
// The job timed out, so it must be reported as failed. stopContainer is left
// unexpected on purpose: a timed-out (failed) job preserves its error state, so
// the graceful stop is skipped exactly like any other failure without AutoRemove.
jim.On("result", "failure")
sm := &stepMock{}
sfm.On("newStep", stepModel, rc).Return(sm, nil)
sm.On("pre").Return(func(ctx context.Context) error { return nil })
// The main step runs past the job timeout: it blocks until the job context is
// done, mirroring a step that overruns timeout-minutes.
sm.On("main").Return(func(ctx context.Context) error {
<-ctx.Done()
return ctx.Err()
})
var postRan bool
var postCtxErr error
sm.On("post").Return(func(ctx context.Context) error {
postRan = true
postCtxErr = ctx.Err()
return nil
})
executor := newJobExecutor(jim, sfm, rc)
// The executor itself returns nil on timeout: the failure is surfaced through
// the job result ("failure", asserted via the result mock below), not the
// return value.
require.NoError(t, executor(ctx))
assert.True(t, postRan, "post step must run after a job timeout")
require.NoError(t, postCtxErr, "post step must run against a fresh, non-expired context")
jim.AssertExpectations(t)
sfm.AssertExpectations(t)
sm.AssertExpectations(t)
}
// TestSetJobResultMatrixContinueOnError exercises the parallel-matrix path
// end-to-end: two combinations share one *model.Job and continue-on-error is
// keyed on matrix.experimental, so one combination tolerates its failure and the
// other does not. The job is reported as continue-on-error only when EVERY failing
// combination was tolerated; a single firm failure makes the whole job firm, and
// handleFailure then fails the run.
func TestSetJobResultMatrixContinueOnError(t *testing.T) {
const jobYAML = "continue-on-error: ${{ matrix.experimental }}\nruns-on: ubuntu-latest"
newSharedJob := func(t *testing.T) (*model.Job, *model.Workflow) {
t.Helper()
var job *model.Job
require.NoError(t, yaml.Unmarshal([]byte(jobYAML), &job))
return job, &model.Workflow{
Name: "workflow1",
Jobs: map[string]*model.Job{"job1": job},
}
}
planFor := func(wf *model.Workflow) *model.Plan {
return &model.Plan{Stages: []*model.Stage{{Runs: []*model.Run{{Workflow: wf, JobID: "job1"}}}}}
}
ctx := context.Background()
// fail drives a single matrix combination through the failure path; each
// RunContext is its own jobInfo (rc implements jobInfo) and shares the job.
fail := func(wf *model.Workflow, experimental bool) {
rc := newTestRC(wf, map[string]any{"experimental": experimental})
setJobResult(ctx, rc, rc, false)
}
t.Run("one tolerated and one firm failure fails the run", func(t *testing.T) {
job, wf := newSharedJob(t)
// Order is intentional: the tolerated combination finishes first, then the
// firm one. The firm-failure latch must still win regardless of order.
fail(wf, true)
fail(wf, false)
assert.Equal(t, "failure", job.Result)
assert.False(t, job.ContinueOnError, "a single firm failure must make the whole job firm")
assert.Error(t, handleFailure(planFor(wf))(ctx))
})
t.Run("all tolerated failures do not fail the run", func(t *testing.T) {
job, wf := newSharedJob(t)
fail(wf, true)
fail(wf, true)
assert.Equal(t, "failure", job.Result)
assert.True(t, job.ContinueOnError, "every failing combination was tolerated")
assert.NoError(t, handleFailure(planFor(wf))(ctx))
})
}
func TestHasJobSummaryCapability(t *testing.T) {
assert.True(t, hasJobSummaryCapability("cache,job-summary artifacts"))
assert.True(t, hasJobSummaryCapability("cache,\njob-summary\tartifacts"))
@@ -674,3 +802,104 @@ func tarArchive(t *testing.T, entries ...tarEntry) []byte {
require.NoError(t, tw.Close())
return buf.Bytes()
}
func newTestRC(wf *model.Workflow, matrix map[string]any) *RunContext {
return &RunContext{
Config: &Config{
Workdir: ".",
Platforms: map[string]string{
"ubuntu-latest": "ubuntu-latest",
},
},
StepResults: map[string]*model.StepResult{},
Env: map[string]string{},
Matrix: matrix,
Run: &model.Run{JobID: "job1", Workflow: wf},
}
}
func makeTestRC(t *testing.T, jobYAML string) *RunContext {
t.Helper()
var job *model.Job
require.NoError(t, yaml.Unmarshal([]byte(jobYAML), &job))
rc := newTestRC(&model.Workflow{
Name: "workflow1",
Jobs: map[string]*model.Job{"job1": job},
}, nil)
rc.ExprEval = rc.NewExpressionEvaluator(context.Background())
return rc
}
func TestApplyJobTimeout(t *testing.T) {
cases := []struct {
name string
yaml string
wantTimeout bool
}{
{"empty", "runs-on: ubuntu-latest", false},
{"integer", "timeout-minutes: 5\nruns-on: ubuntu-latest", true},
{"non-numeric ignored", "timeout-minutes: abc\nruns-on: ubuntu-latest", false},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
rc := makeTestRC(t, tc.yaml)
ctx := context.Background()
newCtx, cancel := applyJobTimeout(ctx, rc, rc.Run.Job())
defer cancel()
_, hasDeadline := newCtx.Deadline()
assert.Equal(t, tc.wantTimeout, hasDeadline)
})
}
}
func TestEvaluateJobContinueOnError(t *testing.T) {
cases := []struct {
name string
yaml string
want bool
}{
{"absent", "runs-on: ubuntu-latest", false},
{"true", "continue-on-error: true\nruns-on: ubuntu-latest", true},
{"false", "continue-on-error: false\nruns-on: ubuntu-latest", false},
{"expression true", "continue-on-error: ${{ 'x' == 'x' }}\nruns-on: ubuntu-latest", true},
{"expression false", "continue-on-error: ${{ 'x' != 'x' }}\nruns-on: ubuntu-latest", false},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
rc := makeTestRC(t, tc.yaml)
got := evaluateJobContinueOnError(context.Background(), rc, rc.Run.Job())
assert.Equal(t, tc.want, got)
})
}
}
func TestJobSetContinueOnError(t *testing.T) {
t.Run("first call true", func(t *testing.T) {
j := &model.Job{}
j.SetContinueOnError(true)
assert.True(t, j.ContinueOnError)
})
t.Run("first call false", func(t *testing.T) {
j := &model.Job{}
j.SetContinueOnError(false)
assert.False(t, j.ContinueOnError)
})
t.Run("true then false locks to false", func(t *testing.T) {
j := &model.Job{}
j.SetContinueOnError(true)
j.SetContinueOnError(false)
assert.False(t, j.ContinueOnError)
})
t.Run("false then true stays false", func(t *testing.T) {
j := &model.Job{}
j.SetContinueOnError(false)
j.SetContinueOnError(true)
assert.False(t, j.ContinueOnError)
})
t.Run("true then true stays true", func(t *testing.T) {
j := &model.Job{}
j.SetContinueOnError(true)
j.SetContinueOnError(true)
assert.True(t, j.ContinueOnError)
})
}
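The latch behaviour exercised by the tests above can be sketched in isolation. This is a minimal illustration, not the runner's implementation: `jobState` and `setContinueOnError` are hypothetical names, and the real `model.Job.SetContinueOnError` is not shown in this diff. The sketch assumes the rule described in the test comments: the first failing combination sets the flag, and any later firm failure (`false`) permanently clears it.

```go
package main

import "fmt"

// jobState is a hypothetical stand-in for the shared job record.
type jobState struct {
	continueOnError bool
	decided         bool // true once at least one failure has been recorded
}

// setContinueOnError records one failing matrix combination.
// The job stays tolerated only while every recorded failure was tolerated.
// It is not safe for concurrent use; callers are assumed to hold a per-job lock.
func (j *jobState) setContinueOnError(tolerated bool) {
	if !j.decided {
		j.continueOnError = tolerated
		j.decided = true
		return
	}
	j.continueOnError = j.continueOnError && tolerated
}

func main() {
	for _, seq := range [][]bool{{true}, {true, false}, {false, true}, {true, true}} {
		var j jobState
		for _, v := range seq {
			j.setContinueOnError(v)
		}
		fmt.Println(seq, "->", j.continueOnError)
	}
}
```

Because the flag is combined with a logical AND, the order in which combinations finish does not change the outcome.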


@@ -189,6 +189,9 @@ func (rc *RunContext) GetBindsAndMounts() ([]string, map[string]string) {
if job := rc.Run.Job(); job != nil {
if container := job.Container(); container != nil {
for _, v := range container.Volumes {
if rc.ExprEval != nil {
v = rc.ExprEval.Interpolate(context.Background(), v)
}
if !strings.Contains(v, ":") || filepath.IsAbs(v) {
// Bind anonymous volume or host file.
binds = append(binds, v)
@@ -471,7 +474,8 @@ func (rc *RunContext) startJobContainer() common.Executor {
rc.pullServicesImages(rc.Config.ForcePull),
rc.JobContainer.Pull(rc.Config.ForcePull),
rc.stopJobContainer(),
- container.NewDockerNetworkCreateExecutor(networkName).IfBool(createAndDeleteNetwork),
+ container.NewDockerNetworkCreateExecutor(networkName, rc.Config.ContainerNetworkCreateOptions).
+ IfBool(createAndDeleteNetwork),
rc.startServiceContainers(networkName),
rc.JobContainer.Create(rc.Config.ContainerCapAdd, rc.Config.ContainerCapDrop),
rc.JobContainer.Start(false),


@@ -276,6 +276,37 @@ func TestRunContext_GetBindsAndMounts(t *testing.T) {
{"MountExistingVolume", []string{"volume-id:/volume"}, "", map[string]string{"volume-id": "/volume"}},
}
t.Run("InterpolatedContainerVolumes", func(t *testing.T) {
job := &model.Job{}
err := job.RawContainer.Encode(map[string][]string{
"volumes": {"${{ secrets.MAME }}:/root/.mame/roms:ro"},
})
require.NoError(t, err)
rc := &RunContext{
Name: "TestRCName",
Run: &model.Run{
Workflow: &model.Workflow{
Name: "TestWorkflowName",
},
},
Config: &Config{
BindWorkdir: false,
Secrets: map[string]string{
"MAME": "/host/mame/roms",
},
},
}
rc.Run.JobID = "job1"
rc.Run.Workflow.Jobs = map[string]*model.Job{"job1": job}
rc.ExprEval = rc.NewExpressionEvaluator(context.Background())
gotbind, gotmount := rc.GetBindsAndMounts()
assert.Contains(t, gotbind, "/host/mame/roms:/root/.mame/roms:ro")
assert.NotContains(t, gotbind, "${{ secrets.MAME }}")
assert.NotContains(t, gotmount, "${{ secrets.MAME }}")
})
for _, testcase := range tests {
t.Run(testcase.name, func(t *testing.T) {
job := &model.Job{}


@@ -15,6 +15,7 @@ import (
"time"
"gitea.com/gitea/runner/act/common"
"gitea.com/gitea/runner/act/container"
"gitea.com/gitea/runner/act/model"
docker_container "github.com/moby/moby/api/types/container"
@@ -28,47 +29,48 @@ type Runner interface {
// Config contains the config for a new runner
type Config struct {
Actor string // the user that triggered the event
Workdir string // path to working directory
ActionCacheDir string // path used for caching action contents
ActionOfflineMode bool // when offline, use cached action contents
BindWorkdir bool // bind the workdir to the job container
EventName string // name of event to run
EventPath string // path to JSON file to use for event.json in containers
DefaultBranch string // name of the main branch for this repository
ReuseContainers bool // reuse containers to maintain state
ForcePull bool // force pulling of the image, even if already present
ForceRebuild bool // force rebuilding local docker image action
LogOutput bool // log the output from docker run
JSONLogger bool // use json or text logger
LogPrefixJobID bool // switches from the full job name to the job id
Env map[string]string // env for containers
Inputs map[string]string // manually passed action inputs
Secrets map[string]string // list of secrets
Vars map[string]string // list of vars
Token string // GitHub token
InsecureSecrets bool // switch hiding output when printing to terminal
Platforms map[string]string // list of platforms
Privileged bool // use privileged mode
UsernsMode string // user namespace to use
ContainerArchitecture string // Desired OS/architecture platform for running containers
ContainerDaemonSocket string // Path to Docker daemon socket
ContainerOptions string // Options for the job container
UseGitIgnore bool // controls if paths in .gitignore should not be copied into container, default true
GitHubInstance string // GitHub instance to use, default "github.com"
ContainerCapAdd []string // list of kernel capabilities to add to the containers
ContainerCapDrop []string // list of kernel capabilities to remove from the containers
AutoRemove bool // controls if the container is automatically removed upon workflow completion
ArtifactServerPath string // the path where the artifact server stores uploads
ArtifactServerAddr string // the address the artifact server binds to
ArtifactServerPort string // the port the artifact server binds to
NoSkipCheckout bool // do not skip actions/checkout
RemoteName string // remote name in local git repo config
ReplaceGheActionWithGithubCom []string // Use actions from GitHub Enterprise instance to GitHub
ReplaceGheActionTokenWithGithubCom string // Token of private action repo on GitHub.
Matrix map[string]map[string]bool // Matrix config to run
ContainerNetworkMode docker_container.NetworkMode // the network mode of job containers (the value of --network)
ContainerNetworkCreateOptions container.NewDockerNetworkCreateExecutorInput // the default network create options
ActionCache ActionCache // Use a custom ActionCache Implementation
PresetGitHubContext *model.GithubContext // the preset github context, overrides some fields like DefaultBranch, Env, Secrets etc.
EventJSON string // the content of JSON file to use for event.json in containers, overrides EventPath
@@ -248,7 +250,10 @@ func (runner *runnerImpl) NewPlanExecutor(plan *model.Plan) common.Executor {
return err
}
- return executor(common.WithJobErrorContainer(WithJobLogger(ctx, rc.Run.JobID, jobName, rc.Config, &rc.Masks, matrix)))
+ jobCtx := common.WithJobErrorContainer(WithJobLogger(ctx, rc.Run.JobID, jobName, rc.Config, &rc.Masks, matrix))
+ jobCtx, cancelTimeout := applyJobTimeout(jobCtx, rc, job)
+ defer cancelTimeout()
+ return executor(jobCtx)
})
}
// Run all matrix combinations of this job, then drop its aggregation mutex: the
@@ -303,7 +308,7 @@ func handleFailure(plan *model.Plan) common.Executor {
return func(ctx context.Context) error {
for _, stage := range plan.Stages {
for _, run := range stage.Runs {
- if run.Job().Result == "failure" {
+ if run.Job().Result == "failure" && !run.Job().ContinueOnError {
return fmt.Errorf("Job '%s' failed", run.String())
}
}


@@ -312,7 +312,7 @@ func (ra *remoteAction) IsCheckout() bool {
func newRemoteAction(action string) *remoteAction {
// support http(s)://host/owner/repo@v3
- for _, schema := range []string{"https://", "http://"} {
+ for _, schema := range []string{"https://", "http://", "ssh://"} {
if after, ok := strings.CutPrefix(action, schema); ok {
splits := strings.SplitN(after, "/", 2)
if len(splits) != 2 {


@@ -778,6 +778,32 @@ func Test_newRemoteAction(t *testing.T) {
},
wantCloneURL: "http://gitea.com/actions/aws",
},
{
action: "ssh://git@gitea.com/actions/heroku@main", // it's invalid for GitHub, but gitea supports it
want: &remoteAction{
URL: "ssh://git@gitea.com",
Org: "actions",
Repo: "heroku",
Path: "",
Ref: "main",
},
wantCloneURL: "ssh://git@gitea.com/actions/heroku",
},
{
action: "ssh://git@gitea.com/actions/aws/ec2@main", // the ssh user is kept as part of the host segment
want: &remoteAction{
URL: "ssh://git@gitea.com",
Org: "actions",
Repo: "aws",
Path: "ec2",
Ref: "main",
},
wantCloneURL: "ssh://git@gitea.com/actions/aws",
},
{
action: "ssh://gitea.com/onlyonesegment@main", // missing org/repo after the host
want: nil,
},
}
for _, tt := range tests {
t.Run(tt.action, func(t *testing.T) {
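The three `ssh://` test cases above suggest how a scheme-prefixed action reference splits into its parts. The following is a standalone sketch under stated assumptions, not the body of `newRemoteAction`: `remoteRef` and `parseRemote` are hypothetical names, and the sketch assumes a trailing `@ref` is required. The ssh user stays in the host segment because only the remainder after the first `/` is searched for the ref separator.

```go
package main

import (
	"fmt"
	"strings"
)

// remoteRef is a hypothetical stand-in for the runner's remoteAction type.
type remoteRef struct {
	URL, Org, Repo, Path, Ref string
}

// parseRemote splits "<scheme>host/org/repo[/path]@ref" into its parts.
// It returns nil when the input does not match that shape.
func parseRemote(action string) *remoteRef {
	for _, scheme := range []string{"https://", "http://", "ssh://"} {
		after, ok := strings.CutPrefix(action, scheme)
		if !ok {
			continue
		}
		// The host segment may carry an ssh user such as "git@gitea.com".
		host, rest, found := strings.Cut(after, "/")
		if !found {
			return nil
		}
		// The ref follows the last "@" of the remainder (assumed mandatory here).
		at := strings.LastIndex(rest, "@")
		if at < 0 {
			return nil
		}
		spec, ref := rest[:at], rest[at+1:]
		parts := strings.SplitN(spec, "/", 3)
		if len(parts) < 2 || parts[0] == "" || parts[1] == "" || ref == "" {
			return nil // missing org/repo after the host
		}
		r := &remoteRef{URL: scheme + host, Org: parts[0], Repo: parts[1], Ref: ref}
		if len(parts) == 3 {
			r.Path = parts[2]
		}
		return r
	}
	return nil
}

func main() {
	for _, a := range []string{
		"ssh://git@gitea.com/actions/heroku@main",
		"ssh://git@gitea.com/actions/aws/ec2@main",
		"ssh://gitea.com/onlyonesegment@main",
	} {
		fmt.Printf("%s -> %+v\n", a, parseRemote(a))
	}
}
```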

docs/post-task-script.md Normal file

@@ -0,0 +1,155 @@
# Post-task script
The post-task script is an optional host hook that runs **once after every task**, after the runner has already finished its normal per-task cleanup. Typical uses include pruning Docker images, vacuuming ephemeral disks, or resetting VM state between jobs.
It is configured under `runner.post_task_script` in the runner YAML config (see [config.example.yaml](../internal/pkg/config/config.example.yaml)).
## When it runs
For each task, execution order is:
1. Workflow runs (steps, actions, containers).
2. In-job cleanup (action `post:` steps, container stop/remove).
3. Job outputs are reported to Gitea.
4. Bind-workdir workspace removal, when `container.bind_workdir` is enabled.
5. **Post-task script** (this hook).
6. Final task acknowledgement to Gitea (`reporter.Close()`).
The script is **additive**: it does not replace any built-in cleanup. When `container.bind_workdir` is enabled, the task workspace directory has usually already been deleted before the script starts. `GITEA_WORKSPACE` is still set to the path the job used, for reference.
## Runner stays offline until the script finishes
This is the most important operational detail.
When the post-task script starts, the runner **stops sending task heartbeats** to Gitea (the same mechanism used during cancel/cleanup). From Gitea's perspective, the runner is **not available for new work** until:
1. The script exits (success or failure), **and**
2. The runner sends the final task flush to Gitea.
While the script runs:
- **Gitea will not assign another task** to this runner for the current job slot (heartbeats are stopped).
- **The runner capacity slot stays occupied** locally — with `capacity: 1`, the poller will not start another task until the script completes.
- **Runner shutdown** (`shutdown_timeout`) counts this phase as part of the in-flight task; a long or stuck script delays graceful shutdown.
If the script **never exits**, the runner remains in this state until `runner.post_task_script_timeout` elapses (default **5 minutes** when a script is configured). The runner then kills the script process and proceeds to the final acknowledgement. Until that timeout fires, **the runner effectively stays offline**.
Set `post_task_script_timeout` to a value that matches how long your housekeeping is allowed to take — not how long you wish it could take. Prefer short, bounded scripts.
### Recommendations
- Keep scripts **fast and bounded** (seconds, not minutes).
- Avoid interactive prompts, blocking network calls without timeouts, or waiting on user input.
- Use **idempotent** operations (the script may run after success, failure, or cancellation).
- Test failure modes: hung script, non-zero exit, missing executable.
- Watch the **runner process log** for script output (it is not written to the Gitea job log).
- On shutdown, ensure scripts respond to process termination within `post_task_script_timeout`.
## Configuration
```yaml
runner:
# Path to an executable on the host. Empty or omitted disables the hook.
post_task_script: /usr/local/bin/gitea-post-task.sh
# Hard limit on script runtime. Default when post_task_script is set: 5m.
# If the script exceeds this, it is killed and the runner continues.
post_task_script_timeout: 2m
```
| Option | Default | Description |
| --- | --- | --- |
| `runner.post_task_script` | *(disabled)* | Host path to the script or binary. Relative paths are resolved from the runner process working directory. |
| `runner.post_task_script_timeout` | `5m` (only when script is set) | Maximum time the script may run before the runner kills it and moves on. |
The script must be **executable** on the host (shebang on Linux/macOS, or a native `.exe` / `.bat` / `.cmd` on Windows). **PowerShell (`.ps1`) is not supported yet** as the value of `post_task_script`; the runner executes the configured path directly and does not invoke `powershell.exe` for you.
`gitea-runner exec` does **not** load runner YAML and will not run this hook.
## Environment variables
The script receives `runner.envs` / `runner.env_file` values plus:
| Variable | Description |
| --- | --- |
| `GITEA_TASK_ID` | Numeric task ID. |
| `GITEA_RUN_ID` | Workflow run ID, when provided by the server. |
| `GITEA_REPOSITORY` | Repository slug (`owner/name`). |
| `GITEA_WORKSPACE` | Workspace path used for the job (may already be deleted). |
| `GITEA_JOB_RESULT` | `success`, `failure`, `cancelled`, `skipped`, or `unknown`. |
The script environment is **not** a full copy of the job container environment. System variables such as `PATH` are only present if you define them in `runner.envs` or `runner.env_file`.
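
Because the script only sees what the runner passes explicitly, it can help to picture the environment as a plain map that is flattened into `KEY=value` entries. The sketch below is an assumption-laden illustration: `envList` is a hypothetical helper, the values are made up, and sorting is added only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// envList flattens an environment map into sorted KEY=value entries,
// the form expected by os/exec's Cmd.Env field.
func envList(env map[string]string) []string {
	list := make([]string, 0, len(env))
	for k, v := range env {
		list = append(list, k+"="+v)
	}
	sort.Strings(list)
	return list
}

func main() {
	env := map[string]string{
		// PATH is present only because it is set explicitly (illustrative value).
		"PATH":             "/usr/local/bin:/usr/bin:/bin",
		"GITEA_TASK_ID":    "42",
		"GITEA_JOB_RESULT": "failure",
	}
	for _, kv := range envList(env) {
		fmt.Println(kv)
	}
}
```

If a script needs `docker` or other tools, one option is to define `PATH` in `runner.envs`; another is to use absolute paths inside the script.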
## Output and errors
- **Stdout/stderr** are written to the **runner process log** (logrus), prefixed with `post-task script stdout:` / `post-task script stderr:`.
- **Non-zero exit codes** are logged as warnings only. They do **not** change the job result already reported to Gitea.
- **Timeouts and start failures** are logged as warnings; the runner still completes the task acknowledgement.
## Interaction with other timeouts
| Timeout | Effect on post-task script |
| --- | --- |
| `runner.post_task_script_timeout` | Kills the script if it runs too long. This is the **only** timeout that bounds the script. |
| `runner.timeout` | Caps the task **up to** the script. The script detaches from the task deadline, so a job near the runner timeout limit does **not** cut the script short — it still gets its full `post_task_script_timeout`. |
| `runner.shutdown_timeout` | On SIGINT/SIGTERM, bounds how long the runner waits for the **task** to finish. The post-task script detaches from cancellation, so it is **not** interrupted by this window and may extend shutdown until its own `post_task_script_timeout` elapses. |
## Examples
### Linux — prune dangling Docker resources
`/usr/local/bin/gitea-post-task.sh`:
```sh
#!/bin/sh
set -eu
docker image prune -f
docker builder prune -f --filter 'until=24h'
```
`config.yaml`:
```yaml
runner:
post_task_script: /usr/local/bin/gitea-post-task.sh
post_task_script_timeout: 3m
```
### Windows — batch file (`.cmd`)
Use a `.cmd` or `.bat` file. PowerShell scripts are **not supported yet** as `post_task_script`; if you need PowerShell, call it from a batch wrapper, as in the workaround at the end of this section.
`C:\gitea-runner\scripts\post-task.cmd`:
```bat
@echo off
docker image prune -f
```
```yaml
runner:
post_task_script: C:\gitea-runner\scripts\post-task.cmd
post_task_script_timeout: 3m
```
PowerShell workaround until native `.ps1` support exists:
`C:\gitea-runner\scripts\post-task.cmd`:
```bat
@echo off
powershell.exe -NoProfile -NonInteractive -ExecutionPolicy Bypass -File "%~dp0post-task.ps1"
```
## Windows notes
- Supported as `post_task_script`: `.exe`, `.bat`, `.cmd`.
- **Not supported yet:** `.ps1` as the configured path (use a `.cmd` wrapper; see above).
- `.sh` files require a Unix shell on the PATH unless you point `post_task_script` at the interpreter.
- Use backslashes or forward slashes in YAML paths; both work in Go on Windows.
## See also
- [Configuration](../README.md#configuration) — generating and loading `config.yaml`
- [config.example.yaml](../internal/pkg/config/config.example.yaml) — all runner options
- Bind-workdir idle cleanup (`runner.workdir_cleanup_age`) — separate from this hook; runs only when the runner is idle

go.mod

@@ -35,7 +35,7 @@ require (
github.com/spf13/pflag v1.0.10
github.com/stretchr/testify v1.11.1
github.com/timshannon/bolthold v0.0.0-20240314194003-30aac6950928
- go.etcd.io/bbolt v1.4.3
+ go.etcd.io/bbolt v1.5.0
go.yaml.in/yaml/v4 v4.0.0-rc.3
golang.org/x/sys v0.46.0
golang.org/x/term v0.44.0

go.sum

@@ -209,6 +209,8 @@ github.com/xeipuuv/gojsonschema v1.2.0/go.mod h1:anYRn/JVcOK2ZgGU+IjEV4nwlhoK5sQ
go.etcd.io/bbolt v1.3.8/go.mod h1:N9Mkw9X8x5fupy0IKsmuqVtoGDyxsaDlbk4Rd05IAQw=
go.etcd.io/bbolt v1.4.3 h1:dEadXpI6G79deX5prL3QRNP6JB8UxVkqo4UPnHaNXJo=
go.etcd.io/bbolt v1.4.3/go.mod h1:tKQlpPaYCVFctUIgFKFnAlvbmB3tpy1vkTnDWohtc0E=
go.etcd.io/bbolt v1.5.0 h1:S7GAl7Fxv12yohbwFfIbQCGDWbQbtDGPET4P/bD4lxU=
go.etcd.io/bbolt v1.5.0/go.mod h1:mkltfYE5aUHQxUct9N9V+Kp7aSjFqjgrhcXIS70Lrdk=
go.etcd.io/gofail v0.1.0/go.mod h1:VZBCXYGZhHAinaBiiqYvuDynvahNsAyLFwB3kEHKz1M=
go.opentelemetry.io/auto/sdk v1.2.1 h1:jXsnJ4Lmnqd11kwkBV2LgLoFMZKizbCi5fNZ/ipaZ64=
go.opentelemetry.io/auto/sdk v1.2.1/go.mod h1:KRTj+aOaElaLi+wW1kO/DZRXwkF4C5xPbEe3ZiIhN7Y=
@@ -253,8 +255,6 @@ golang.org/x/sys v0.15.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.46.0 h1:noSf2Fq6F8DBgS+LysIkx7rIExoNHJsxOAtPp4rthXw=
golang.org/x/sys v0.46.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.43.0 h1:S4RLU2sB31O/NCl+zFN9Aru9A/Cq2aqKpTZJ6B+DwT4=
golang.org/x/term v0.43.0/go.mod h1:lrhlHNdQJHO+1qVYiHfFKVuVioJIheAc3fBSMFYEIsk=
golang.org/x/term v0.44.0 h1:0rLvDRCtNj0gZkyIXhCyOb2OAzEhLVqc4B+hrsBhrmc=
golang.org/x/term v0.44.0/go.mod h1:7ze4MdzUzLXpSAoFP1H0bOI9aXDqveSvatT5vKcFh2Y=
golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=


@@ -0,0 +1,132 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT
package run
import (
"context"
"errors"
"fmt"
"io"
"os/exec"
"strconv"
"strings"
"time"
"gitea.com/gitea/runner/act/common"
"gitea.com/gitea/runner/internal/pkg/config"
"gitea.com/gitea/runner/internal/pkg/metrics"
"gitea.com/gitea/runner/internal/pkg/process"
"gitea.com/gitea/runner/internal/pkg/report"
runnerv1 "gitea.dev/actions-proto-go/runner/v1"
log "github.com/sirupsen/logrus"
)
func (r *Runner) runPostTaskScript(ctx context.Context, reporter *report.Reporter, task *runnerv1.Task, workdir string) {
script := r.cfg.Runner.PostTaskScript
if script == "" {
return
}
timeout := r.cfg.Runner.PostTaskScriptTimeout
if timeout <= 0 {
timeout = config.DefaultPostTaskScriptTimeout
}
scriptCtx, cancel := postTaskScriptContext(ctx, timeout)
defer cancel()
env := r.postTaskScriptEnv(reporter, task, workdir)
log.Infof("running post-task script %q for task %d", script, task.Id)
cmd := exec.CommandContext(scriptCtx, script)
cmd.Env = envListFromMap(env)
cmd.SysProcAttr = process.SysProcAttr(script, false)
stdout := postTaskScriptLogWriter("stdout")
stderr := postTaskScriptLogWriter("stderr")
cmd.Stdout = stdout
cmd.Stderr = stderr
// Kill the script's whole process tree on cancellation and bound the post-exit
// I/O wait, so a backgrounded child inheriting cmd's stdout/stderr pipe can
// never hang cmd.Wait() and the runner. See process.TreeKill.
treeKill := process.NewTreeKill(cmd)
if err := cmd.Start(); err != nil {
log.Warnf("post-task script %q for task %d: %v", script, task.Id, err)
return
}
if k, kerr := treeKill.Capture(cmd.Process); kerr != nil {
log.Warnf("post-task script %q for task %d: process tree kill setup failed, falling back to single-process kill: %v", script, task.Id, kerr)
} else {
defer k.Close()
}
err := cmd.Wait()
// Flush any trailing, not-yet-newline-terminated output now that the I/O
// copiers have finished (cmd.Wait, bounded by WaitDelay above, guarantees it).
common.FlushWriter(stdout)
common.FlushWriter(stderr)
if err != nil {
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
log.Warnf("post-task script %q for task %d: %v", script, task.Id, err)
return
}
var exitErr *exec.ExitError
if errors.As(err, &exitErr) {
log.Warnf("post-task script %q for task %d exited with code %d", script, task.Id, exitErr.ExitCode())
return
}
log.Warnf("post-task script %q for task %d: %v", script, task.Id, err)
}
}
func postTaskScriptContext(ctx context.Context, timeout time.Duration) (context.Context, context.CancelFunc) {
// Detach from the task context's deadline and cancellation: the task has
// already finished by the time the post-task script runs, so the script must
// get its full configured timeout. Inheriting the task deadline would silently
// truncate that budget when the job completed close to its own timeout (and an
// already-cancelled task context would skip the script entirely).
// context.WithoutCancel keeps the context values while dropping the deadline.
return context.WithTimeout(context.WithoutCancel(ctx), timeout)
}
func (r *Runner) postTaskScriptEnv(reporter *report.Reporter, task *runnerv1.Task, workdir string) map[string]string {
env := r.cloneEnvs()
env["GITEA_TASK_ID"] = strconv.FormatInt(task.Id, 10)
env["GITEA_WORKSPACE"] = workdir
// GITEA_JOB_RESULT shares the runner's canonical result vocabulary
// (success/failure/cancelled/skipped/unknown), the same strings the reporter
// parses and the metrics labels use.
env["GITEA_JOB_RESULT"] = metrics.ResultToStatusLabel(reporter.Result())
if v := task.Context.Fields["run_id"].GetStringValue(); v != "" {
env["GITEA_RUN_ID"] = v
}
if v := task.Context.Fields["repository"].GetStringValue(); v != "" {
env["GITEA_REPOSITORY"] = v
}
return env
}
func envListFromMap(env map[string]string) []string {
envList := make([]string, 0, len(env))
for k, v := range env {
envList = append(envList, fmt.Sprintf("%s=%s", k, v))
}
return envList
}
// postTaskScriptLogWriter returns an io.Writer that logs the script's output one
// line at a time, tagged with the stream name. It is passed as cmd.Stdout/Stderr
// (rather than a StdoutPipe) so that cmd.WaitDelay governs the copying goroutine:
// a backgrounded process holding the pipe open can never block cmd.Wait()
// indefinitely. Flush any trailing partial line with common.FlushWriter after
// cmd.Wait() returns.
func postTaskScriptLogWriter(stream string) io.Writer {
return common.NewLineWriter(func(line string) bool {
log.Infof("post-task script %s: %s", stream, strings.TrimRight(line, "\r\n"))
return true
})
}
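
Review note: the detachment done by `postTaskScriptContext` can be tried in isolation. The following is a minimal sketch, assuming Go 1.21 or later (where `context.WithoutCancel` was introduced). The names `detachedTimeout`, `demo` and `ctxKey` are hypothetical and exist only for this illustration; they are not part of the runner.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// ctxKey is a hypothetical key type, used only to show that values survive.
type ctxKey string

// detachedTimeout has the same shape as postTaskScriptContext: it drops the
// parent's cancellation and deadline, keeps its values, and applies a fresh
// timeout.
func detachedTimeout(parent context.Context, timeout time.Duration) (context.Context, context.CancelFunc) {
	return context.WithTimeout(context.WithoutCancel(parent), timeout)
}

// demo derives a child from an already-cancelled parent and reports whether
// the child is still live, whether it has more than a minute of budget left,
// and which value it inherited.
func demo() (alive, fullBudget bool, value string) {
	base := context.WithValue(context.Background(), ctxKey("task"), "42")
	parent, cancelParent := context.WithCancel(base)
	cancelParent() // simulate a task context that is already cancelled

	child, cancel := detachedTimeout(parent, 5*time.Minute)
	defer cancel()

	deadline, ok := child.Deadline()
	v, _ := child.Value(ctxKey("task")).(string)
	return child.Err() == nil, ok && time.Until(deadline) > time.Minute, v
}

func main() {
	alive, fullBudget, value := demo()
	fmt.Println(alive, fullBudget, value) // prints "true true 42"
}
```

The child stays usable even though its parent is cancelled, which is the property `TestPostTaskScriptContextUsesFullTimeout` checks.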


@@ -0,0 +1,157 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT
package run
import (
"context"
"os"
"path/filepath"
"testing"
"time"
"gitea.com/gitea/runner/internal/pkg/config"
"gitea.com/gitea/runner/internal/pkg/metrics"
"gitea.com/gitea/runner/internal/pkg/report"
runnerv1 "gitea.dev/actions-proto-go/runner/v1"
log "github.com/sirupsen/logrus"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"google.golang.org/protobuf/types/known/structpb"
)
func TestRunPostTaskScriptSkippedWhenEmpty(t *testing.T) {
r := &Runner{
cfg: &config.Config{},
}
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
taskCtx, err := structpb.NewStruct(map[string]any{})
require.NoError(t, err)
task := &runnerv1.Task{Id: 1, Context: taskCtx}
reporter := report.NewReporter(ctx, cancel, nil, task, r.cfg)
require.NotPanics(t, func() {
r.runPostTaskScript(ctx, reporter, task, "/workspace/owner/repo")
})
}
func TestRunPostTaskScriptNonZeroExitDoesNotPanic(t *testing.T) {
dir := t.TempDir()
scriptPath := filepath.Join(dir, "fail.sh")
require.NoError(t, os.WriteFile(scriptPath, []byte("#!/bin/sh\nexit 2\n"), 0o700))
cfg, err := config.LoadDefault("")
require.NoError(t, err)
cfg.Runner.PostTaskScript = scriptPath
r := &Runner{cfg: cfg}
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
taskCtx, err := structpb.NewStruct(map[string]any{})
require.NoError(t, err)
task := &runnerv1.Task{Id: 1, Context: taskCtx}
reporter := report.NewReporter(ctx, cancel, nil, task, cfg)
require.NotPanics(t, func() {
r.runPostTaskScript(ctx, reporter, task, "/workspace/owner/repo")
})
}
func TestPostTaskScriptContextUsesFullTimeout(t *testing.T) {
const timeout = 5 * time.Minute
// A task context that finished close to its own deadline must not truncate the
// script's budget: the script should still get its full configured timeout.
near, cancelNear := context.WithTimeout(context.Background(), time.Second)
defer cancelNear()
scriptCtx, cancel := postTaskScriptContext(near, timeout)
defer cancel()
deadline, ok := scriptCtx.Deadline()
require.True(t, ok)
assert.Greater(t, time.Until(deadline), time.Minute, "script timeout truncated to task deadline")
// An already-cancelled task context must not cancel the script either.
cancelledCtx, cancelIt := context.WithCancel(context.Background())
cancelIt()
scriptCtx2, cancel2 := postTaskScriptContext(cancelledCtx, timeout)
defer cancel2()
assert.NoError(t, scriptCtx2.Err(), "script context inherited the cancelled task context")
}
func TestPostTaskScriptEnv(t *testing.T) {
cfg, err := config.LoadDefault("")
require.NoError(t, err)
r := &Runner{
cfg: cfg,
envs: map[string]string{"BASE": "1"},
}
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
taskCtx, err := structpb.NewStruct(map[string]any{
"run_id": "99",
"repository": "acme/widget",
})
require.NoError(t, err)
task := &runnerv1.Task{Id: 3, Context: taskCtx}
reporter := report.NewReporter(ctx, cancel, nil, task, cfg)
setReporterJobResult(t, reporter, runnerv1.Result_RESULT_FAILURE)
env := r.postTaskScriptEnv(reporter, task, "/tmp/workspace")
assert.Equal(t, "1", env["BASE"])
assert.Equal(t, "3", env["GITEA_TASK_ID"])
assert.Equal(t, "99", env["GITEA_RUN_ID"])
assert.Equal(t, "acme/widget", env["GITEA_REPOSITORY"])
assert.Equal(t, "/tmp/workspace", env["GITEA_WORKSPACE"])
assert.Equal(t, "failure", env["GITEA_JOB_RESULT"])
}
func TestRunPostTaskScriptIntegration(t *testing.T) {
dir := t.TempDir()
outFile := filepath.Join(dir, "out.txt")
scriptPath := filepath.Join(dir, "post-task.sh")
script := "#!/bin/sh\nprintf '%s %s %s' \"$GITEA_TASK_ID\" \"$GITEA_JOB_RESULT\" \"$CUSTOM\" > \"" + outFile + "\"\n"
require.NoError(t, os.WriteFile(scriptPath, []byte(script), 0o700))
cfg, err := config.LoadDefault("")
require.NoError(t, err)
cfg.Runner.PostTaskScript = scriptPath
r := &Runner{
cfg: cfg,
envs: map[string]string{"CUSTOM": "runner-env"},
}
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
taskCtx, err := structpb.NewStruct(map[string]any{})
require.NoError(t, err)
task := &runnerv1.Task{Id: 11, Context: taskCtx}
reporter := report.NewReporter(ctx, cancel, nil, task, cfg)
setReporterJobResult(t, reporter, runnerv1.Result_RESULT_SUCCESS)
r.runPostTaskScript(ctx, reporter, task, "/workspace/acme/repo")
content, err := os.ReadFile(outFile)
require.NoError(t, err)
assert.Equal(t, "11 success runner-env", string(content))
}
func setReporterJobResult(t *testing.T, reporter *report.Reporter, result runnerv1.Result) {
t.Helper()
require.NoError(t, reporter.Fire(&log.Entry{
Time: time.Now(),
Message: "job finished",
Data: log.Fields{
"stage": "Post",
"jobResult": metrics.ResultToStatusLabel(result),
},
}))
}
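
Review note: the environment contract exercised by `TestPostTaskScriptEnv` and `TestRunPostTaskScriptIntegration` could be consumed by a hook along these lines. This is only a sketch of a hypothetical hook executable written in Go (the tests above use shell scripts). The variable names come from `postTaskScriptEnv`; the function `summarize` and the output format are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// summarize builds a one-line summary from the variables that
// postTaskScriptEnv exports. lookup is injected so the function can be
// exercised without touching the real process environment.
func summarize(lookup func(string) string) string {
	// GITEA_REPOSITORY is only set when the task context carries a
	// "repository" field, so an empty value has to be handled.
	repo := lookup("GITEA_REPOSITORY")
	if repo == "" {
		repo = "(unknown repository)"
	}
	return fmt.Sprintf("task %s in %s finished with %s (workspace %s)",
		lookup("GITEA_TASK_ID"), repo, lookup("GITEA_JOB_RESULT"), lookup("GITEA_WORKSPACE"))
}

func main() {
	// Per the config comments in this diff, hook output ends up in the
	// runner process log, not in the job log.
	fmt.Println(summarize(os.Getenv))
}
```

Keeping the hook this small matters because, per the config comments in this diff, the runner stops task heartbeats while the script runs.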


@@ -22,6 +22,7 @@ import (
"gitea.com/gitea/runner/act/artifactcache"
"gitea.com/gitea/runner/act/common"
+"gitea.com/gitea/runner/act/container"
"gitea.com/gitea/runner/act/model"
"gitea.com/gitea/runner/act/runner"
"gitea.com/gitea/runner/internal/pkg/client"
@@ -33,7 +34,7 @@ import (
"connectrpc.com/connect"
runnerv1 "gitea.dev/actions-proto-go/runner/v1"
-"github.com/moby/moby/api/types/container"
+docker_container "github.com/moby/moby/api/types/container"
log "github.com/sirupsen/logrus"
)
@@ -418,22 +419,26 @@ func (r *Runner) run(ctx context.Context, task *runnerv1.Task, reporter *report.
AllocatePTY: r.cfg.Runner.AllocatePTY,
ActionOfflineMode: r.cfg.Cache.OfflineMode,
ReuseContainers: false,
ForcePull: r.cfg.Container.ForcePull,
ForceRebuild: r.cfg.Container.ForceRebuild,
LogOutput: true,
JSONLogger: false,
Env: envs,
Secrets: task.Secrets,
GitHubInstance: strings.TrimSuffix(r.client.Address(), "/"),
AutoRemove: true,
NoSkipCheckout: true,
PresetGitHubContext: preset,
EventJSON: string(eventJSON),
ContainerNamePrefix: fmt.Sprintf("GITEA-ACTIONS-TASK-%d", task.Id),
ContainerMaxLifetime: maxLifetime,
CleanWorkdir: true,
-ContainerNetworkMode: container.NetworkMode(r.cfg.Container.Network),
+ContainerNetworkMode: docker_container.NetworkMode(r.cfg.Container.Network),
+ContainerNetworkCreateOptions: container.NewDockerNetworkCreateExecutorInput{
+EnableIPv4: r.cfg.Container.NetworkCreateOptions.EnableIPv4,
+EnableIPv6: r.cfg.Container.NetworkCreateOptions.EnableIPv6,
+},
ContainerOptions: r.cfg.Container.Options,
ContainerDaemonSocket: r.cfg.Container.DockerHost,
Privileged: r.cfg.Container.Privileged,
@@ -470,6 +475,9 @@ func (r *Runner) run(ctx context.Context, task *runnerv1.Task, reporter *report.
}
}
+reporter.StopHeartbeats()
+r.runPostTaskScript(ctx, reporter, task, workdir)
return execErr
}


@@ -83,31 +83,47 @@ runner:
# terminal; tools like `docker build` emit redrawing progress frames into the captured log
# when a TTY is present.
allocate_pty: false
+# Optional executable on the host, run once after each task's built-in cleanup
+# (post-steps, container teardown, bind-workdir removal). Additive only.
+#
+# IMPORTANT: While this script runs the runner stops task heartbeats and stays
+# offline from Gitea's perspective until the script exits. A script that never
+# returns blocks new work until post_task_script_timeout kills it (default 5m).
+# Keep scripts short; set post_task_script_timeout to a safe upper bound.
+#
+# Output -> runner process log (not the job log). Non-zero exit -> warning only.
+# Windows: use .exe, .bat, or .cmd. PowerShell (.ps1) is not supported yet as
+# the configured path; wrap PowerShell commands in a .cmd file instead.
+# Full guide: docs/post-task-script.md
+post_task_script: ''
+# Hard limit on post_task_script runtime. Default if omitted: 5m.
+post_task_script_timeout: 5m
cache:
-# Enable cache server to use actions/cache.
+# Enable the built-in cache server (used by actions/cache and similar actions).
enabled: true
-# The directory to store the cache data.
-# If it's empty, the cache data will be stored in $HOME/.cache/actcache.
+# Directory where cache blobs are stored on disk. Default: $HOME/.cache/actcache
+# Ignored when external_server is set.
dir: ""
-# The host of the cache server.
-# It's not for the address to listen, but the address to connect from job containers.
-# So 0.0.0.0 is a bad choice, leave it empty to detect automatically.
+# Outbound IP or hostname that job containers use to reach this runner's cache server.
+# Leave empty to detect automatically. 0.0.0.0 is not valid here.
+# Ignored when external_server is set.
host: ""
-# The port of the cache server.
-# 0 means to use a random available port.
+# Port for the built-in cache server. 0 picks a random free port.
+# Ignored when external_server is set.
port: 0
-# The external cache server URL. Valid only when enable is true.
-# If it's specified, runner will use this URL as the ACTIONS_CACHE_URL rather than start a server by itself.
-# The URL should generally end with "/".
-# Requires external_secret below to be set to the same value on both this runner and the cache-server.
+# URL of a shared `gitea-runner cache-server` to use instead of starting a local one.
+# Set on every runner that should share a cache pool. Must end with "/".
+# Example: "http://cache-host:8088/"
+# Requires external_secret (below) to match the value on the cache-server.
external_server: ""
-# Shared secret between this runner and the external `gitea-runner cache-server`. Required when external_server
-# (or `gitea-runner cache-server`) is in use: the runner pre-registers each job's ACTIONS_RUNTIME_TOKEN with the
-# cache-server, and the cache-server enforces bearer auth + per-repo cache isolation.
+# Shared secret between this runner and the external cache-server.
+# Required when external_server is set. Must be identical on every runner and the cache-server.
+# Generate with: openssl rand -hex 32
external_secret: ""
-# When true, reuse a cached action instead of fetching from the remote on every job. Note: a moved tag
-# (e.g. a re-tagged "v6") or an updated branch stays at the cached commit until its cache entry is removed.
+# When true, reuse a cached action instead of fetching from the remote on every job.
+# A moved tag (e.g. a re-tagged "v6") or an updated branch stays at the cached commit
+# until its cache entry expires or is manually removed.
offline_mode: false
container:
@@ -116,6 +132,13 @@ container:
# If it's empty, runner will create a network automatically.
# Deprecated: `network_mode` is still accepted for old configs; use `network` instead.
network: ""
+# network_create_options only apply when `network` is left empty and the runner
+# auto-creates a per-job network that does not already exist. They have no effect
+# when a custom `network` name is set, because that network is used as-is and never
+# created by the runner. Omit the entire block to use Docker's defaults.
+network_create_options:
+enable_ipv4: true # Omit to use Docker's default (IPv4 enabled). Set false to disable IPv4.
+enable_ipv6: false # Omit to use Docker's default (IPv6 disabled). Enabling it requires dockerd started with --ipv6.
# Whether to use privileged mode or not when launching task containers (privileged mode is required for Docker-in-Docker).
privileged: false
# Any other options to be used when the container is started (e.g., --add-host=my.gitea.url:host-gateway).


@@ -16,6 +16,12 @@ import (
"go.yaml.in/yaml/v4"
)
+// DefaultPostTaskScriptTimeout is the fallback cap on how long the post-task
+// script may run when post_task_script is set without an explicit timeout. It is
+// applied both at config load (for a configured script) and at the point of use
+// (so a programmatically built config still gets a sane bound).
+const DefaultPostTaskScriptTimeout = 5 * time.Minute
// Log represents the configuration for logging.
type Log struct {
Level string `yaml:"level"` // Level indicates the logging level.
@@ -23,26 +29,28 @@ type Log struct {
// Runner represents the configuration for the runner.
type Runner struct {
File string `yaml:"file"` // File specifies the file path for the runner.
Capacity int `yaml:"capacity"` // Capacity specifies the capacity of the runner.
Envs map[string]string `yaml:"envs"` // Envs stores environment variables for the runner.
EnvFile string `yaml:"env_file"` // EnvFile specifies the path to the file containing environment variables for the runner.
Timeout time.Duration `yaml:"timeout"` // Timeout specifies the duration for runner timeout.
ShutdownTimeout time.Duration `yaml:"shutdown_timeout"` // ShutdownTimeout specifies the duration to wait for running jobs to complete during a shutdown of the runner.
Insecure bool `yaml:"insecure"` // Insecure indicates whether the runner operates in an insecure mode.
FetchTimeout time.Duration `yaml:"fetch_timeout"` // FetchTimeout specifies the timeout duration for fetching resources.
FetchInterval time.Duration `yaml:"fetch_interval"` // FetchInterval specifies the interval duration for fetching resources.
FetchIntervalMax time.Duration `yaml:"fetch_interval_max"` // FetchIntervalMax specifies the maximum backoff interval when idle.
WorkdirCleanupAge time.Duration `yaml:"workdir_cleanup_age"` // WorkdirCleanupAge removes stale bind-workdir task directories and orphaned host-mode scratch dirs older than this duration during idle cleanup.
IdleCleanupInterval time.Duration `yaml:"idle_cleanup_interval"` // IdleCleanupInterval runs stale-directory cleanup periodically while the runner is idle. Set to 0 to disable cleanup cadence.
LogReportInterval time.Duration `yaml:"log_report_interval"` // LogReportInterval specifies the base interval for periodic log flush.
LogReportMaxLatency time.Duration `yaml:"log_report_max_latency"` // LogReportMaxLatency specifies the max time a log row can wait before being sent.
LogReportBatchSize int `yaml:"log_report_batch_size"` // LogReportBatchSize triggers immediate log flush when buffer reaches this size.
StateReportInterval time.Duration `yaml:"state_report_interval"` // StateReportInterval specifies the interval for state reporting.
ReportCloseTimeout time.Duration `yaml:"report_close_timeout"` // ReportCloseTimeout caps each RPC attempt when flushing the final logs and task state at job completion, on a detached context so a server cancel can't block the acknowledgement.
Labels []string `yaml:"labels"` // Labels specify the labels of the runner. Labels are declared on each startup
GithubMirror string `yaml:"github_mirror"` // GithubMirror defines what mirrors should be used when using github
AllocatePTY bool `yaml:"allocate_pty"` // AllocatePTY allocates a pseudo-TTY for each step's process. Default is false, matching GitHub's actions/runner. Enable only for jobs that need an interactive terminal; tools like docker build emit redrawing progress frames into the captured log when a TTY is present. Applies to both host and docker backends.
+PostTaskScript string `yaml:"post_task_script"` // PostTaskScript is the path to an executable script run on the host after each task's cleanup completes. Empty disables the hook. On Windows use .exe/.bat/.cmd; PowerShell (.ps1) is not supported yet as the configured path.
+PostTaskScriptTimeout time.Duration `yaml:"post_task_script_timeout"` // PostTaskScriptTimeout caps how long the post-task script may run. Default is 5m when post_task_script is set.
}
// Cache represents the configuration for caching. // Cache represents the configuration for caching.
@@ -58,18 +66,24 @@ type Cache struct {
// Container represents the configuration for the container.
type Container struct {
Network string `yaml:"network"` // Network specifies the network for the container.
+NetworkCreateOptions ContainerNetworkCreateOptions `yaml:"network_create_options"` // Add options when the network need to be created by the runner
NetworkMode string `yaml:"network_mode"` // Deprecated: use Network instead. Could be removed after Gitea 1.20
Privileged bool `yaml:"privileged"` // Privileged indicates whether the container runs in privileged mode.
Options string `yaml:"options"` // Options specifies additional options for the container.
WorkdirParent string `yaml:"workdir_parent"` // WorkdirParent specifies the parent directory for the container's working directory.
ValidVolumes []string `yaml:"valid_volumes"` // ValidVolumes specifies the volumes (including bind mounts) can be mounted to containers.
DockerHost string `yaml:"docker_host"` // DockerHost specifies the Docker host. It overrides the value specified in environment variable DOCKER_HOST.
ForcePull bool `yaml:"force_pull"` // Pull docker image(s) even if already present
ForceRebuild bool `yaml:"force_rebuild"` // Rebuild docker image(s) even if already present
RequireDocker bool `yaml:"require_docker"` // Always require a reachable docker daemon, even if not required by runner
DockerTimeout time.Duration `yaml:"docker_timeout"` // Timeout to wait for the docker daemon to be reachable, if docker is required by require_docker or runner
BindWorkdir bool `yaml:"bind_workdir"` // BindWorkdir binds the workspace to the host filesystem instead of using Docker volumes. Required for DinD when jobs use docker compose with bind mounts.
}
+type ContainerNetworkCreateOptions struct {
+EnableIPv4 *bool `yaml:"enable_ipv4"` // Enable or disable IPv4 for the network (true for docker by default)
+EnableIPv6 *bool `yaml:"enable_ipv6"` // Enable or disable IPv6 for the network (false for docker by default)
+}
// Host represents the configuration for the host.
@@ -187,6 +201,9 @@ func LoadDefault(file string) (*Config, error) {
if cfg.Runner.ReportCloseTimeout <= 0 {
cfg.Runner.ReportCloseTimeout = 10 * time.Second
}
+if cfg.Runner.PostTaskScript != "" && cfg.Runner.PostTaskScriptTimeout <= 0 {
+cfg.Runner.PostTaskScriptTimeout = DefaultPostTaskScriptTimeout
+}
if cfg.Metrics.Addr == "" {
cfg.Metrics.Addr = "127.0.0.1:9101"
}
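
Review note: the comment on `DefaultPostTaskScriptTimeout` says the fallback is applied both at config load and at the point of use. A minimal sketch of that point-of-use rule follows. The helper `effectiveTimeout` and the constant `defaultTimeout` are hypothetical names for illustration; the runner inlines the same check in `runPostTaskScript`.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTimeout mirrors the value of config.DefaultPostTaskScriptTimeout
// shown in this diff.
const defaultTimeout = 5 * time.Minute

// effectiveTimeout returns the configured timeout, or the default when the
// configured value is zero or negative (for example, a config struct built
// in code that never went through LoadDefault).
func effectiveTimeout(configured time.Duration) time.Duration {
	if configured <= 0 {
		return defaultTimeout
	}
	return configured
}

func main() {
	fmt.Println(effectiveTimeout(0))               // prints "5m0s"
	fmt.Println(effectiveTimeout(-time.Second))    // prints "5m0s"
	fmt.Println(effectiveTimeout(2 * time.Minute)) // prints "2m0s"
}
```

Applying the fallback in both places means a test that builds `&config.Config{}` by hand still gets a bounded script runtime.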


@@ -107,6 +107,34 @@ runner:
// TestLoadDefault_MalformedYAMLReturnsParseError pins the error surfaced for
// invalid YAML to the canonical "parse config file" message rather than the
// "for defaults metadata" variant — i.e. the main yaml.Unmarshal runs first.
func TestLoadDefault_LoadsPostTaskScript(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "config.yaml")
require.NoError(t, os.WriteFile(path, []byte(`
runner:
post_task_script: /usr/local/bin/post-task.sh
post_task_script_timeout: 2m
`), 0o600))
cfg, err := LoadDefault(path)
require.NoError(t, err)
assert.Equal(t, "/usr/local/bin/post-task.sh", cfg.Runner.PostTaskScript)
assert.Equal(t, 2*time.Minute, cfg.Runner.PostTaskScriptTimeout)
}
func TestLoadDefault_DefaultsPostTaskScriptTimeout(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "config.yaml")
require.NoError(t, os.WriteFile(path, []byte(`
runner:
post_task_script: /usr/local/bin/post-task.sh
`), 0o600))
cfg, err := LoadDefault(path)
require.NoError(t, err)
assert.Equal(t, 5*time.Minute, cfg.Runner.PostTaskScriptTimeout)
}
func TestLoadDefault_MalformedYAMLReturnsParseError(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "config.yaml")
@@ -117,3 +145,50 @@ func TestLoadDefault_MalformedYAMLReturnsParseError(t *testing.T) {
assert.Contains(t, err.Error(), "parse config file")
assert.NotContains(t, err.Error(), "defaults metadata")
}
func TestContainerNetworkCreateOptions(t *testing.T) {
// Verify that the enable_ipv4/enable_ipv6 YAML keys unmarshal into the *bool fields,
// distinguishing an explicit true/false from an omitted key (nil). A nil here is
// forwarded as-is to Docker, which applies its own default.
loadOptions := func(t *testing.T, yaml string) ContainerNetworkCreateOptions {
t.Helper()
dir := t.TempDir()
path := filepath.Join(dir, "config.yaml")
require.NoError(t, os.WriteFile(path, []byte(yaml), 0o600))
cfg, err := LoadDefault(path)
require.NoError(t, err)
return cfg.Container.NetworkCreateOptions
}
t.Run("enable_ipv6 true unmarshals to non-nil true", func(t *testing.T) {
opts := loadOptions(t, "container:\n network_create_options:\n enable_ipv6: true\n")
require.NotNil(t, opts.EnableIPv6)
assert.True(t, *opts.EnableIPv6)
})
t.Run("enable_ipv6 false unmarshals to non-nil false", func(t *testing.T) {
opts := loadOptions(t, "container:\n network_create_options:\n enable_ipv6: false\n")
require.NotNil(t, opts.EnableIPv6)
assert.False(t, *opts.EnableIPv6)
})
t.Run("enable_ipv4 false unmarshals to non-nil false", func(t *testing.T) {
opts := loadOptions(t, "container:\n network_create_options:\n enable_ipv4: false\n")
require.NotNil(t, opts.EnableIPv4)
assert.False(t, *opts.EnableIPv4)
})
t.Run("omitted keys stay nil", func(t *testing.T) {
opts := loadOptions(t, "container:\n network_create_options:\n enable_ipv4: true\n")
require.NotNil(t, opts.EnableIPv4)
assert.True(t, *opts.EnableIPv4)
assert.Nil(t, opts.EnableIPv6, "an omitted enable_ipv6 must remain nil so Docker's default applies")
})
t.Run("omitted block leaves both nil", func(t *testing.T) {
opts := loadOptions(t, "container:\n network: \"\"\n")
assert.Nil(t, opts.EnableIPv4)
assert.Nil(t, opts.EnableIPv6)
})
}
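
Review note: the tri-state behaviour that `TestContainerNetworkCreateOptions` pins down (explicit true, explicit false, or omitted and therefore nil) is a general property of pointer fields. The sketch below shows it with the standard library's `encoding/json` as a stand-in for the YAML decoder, to stay dependency-free; the assumption is that the YAML library treats omitted keys the same way, which is what the tests above assert. The type `networkCreateOptions` and the helpers `describe` and `parse` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// networkCreateOptions mirrors the shape of ContainerNetworkCreateOptions,
// with json tags instead of yaml tags.
type networkCreateOptions struct {
	EnableIPv4 *bool `json:"enable_ipv4"`
	EnableIPv6 *bool `json:"enable_ipv6"`
}

// describe renders the three possible states of a *bool.
func describe(b *bool) string {
	switch {
	case b == nil:
		return "unset" // key omitted: the consumer's own default applies
	case *b:
		return "true"
	default:
		return "false"
	}
}

// parse decodes doc and reports the state of both options.
func parse(doc string) (ipv4, ipv6 string, err error) {
	var opts networkCreateOptions
	if err := json.Unmarshal([]byte(doc), &opts); err != nil {
		return "", "", err
	}
	return describe(opts.EnableIPv4), describe(opts.EnableIPv6), nil
}

func main() {
	for _, doc := range []string{`{}`, `{"enable_ipv4": true}`, `{"enable_ipv6": false}`} {
		v4, v6, err := parse(doc)
		fmt.Println(v4, v6, err)
	}
}
```

A plain `bool` could not tell "set to false" apart from "not set", so an omitted `enable_ipv4` would end up disabling IPv4 instead of deferring to Docker's default.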


@@ -0,0 +1,29 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT
//go:build plan9
package process
import "os"
// Killer falls back to single-process termination on platforms without a
// process-group / Job Object tree-kill. The Job Object (Windows) and process
// group (Unix) based tree-kills live in killer_windows.go / killer_unix.go;
// here we just kill the direct child, matching the previous default behaviour.
type Killer struct {
p *os.Process
}
func NewKiller(p *os.Process) (*Killer, error) {
return &Killer{p: p}, nil
}
func (k *Killer) Kill() error {
if k == nil || k.p == nil {
return nil
}
return k.p.Kill()
}
func (k *Killer) Close() error { return nil }


@@ -0,0 +1,56 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT

//go:build !windows && !plan9

package process

import (
	"errors"
	"os"
	"syscall"
)

// Killer terminates a started process together with its whole process group,
// which is the Unix counterpart of the Windows Job Object tree-kill.
//
// Background: a process (a step or a post-task script) often launches a process
// tree (a shell that starts a child which in turn spawns further background
// processes). The default exec.CommandContext cancellation only kills the
// direct child, so cancelling left the rest of the tree running. Because those
// orphans inherited the parent's stdout/stderr pipe, cmd.Wait() also blocked
// forever and the runner hung.
//
// Processes are started with Setpgid (or Setsid for the PTY path, see
// SysProcAttr), which makes the process the leader of a new process group whose
// ID equals its PID. Signalling the negative PID delivers to every process
// still in that group, so we can tear down the whole tree atomically on
// cancellation, which also closes the inherited pipe handles so cmd.Wait() can
// return.
type Killer struct {
	pgid int
}

// NewKiller captures the process group of p (an already-started process).
// Because the process is launched with Setpgid/Setsid, p is a group leader and
// its PGID equals its PID; children spawned afterwards stay in the same group
// unless they explicitly create their own.
func NewKiller(p *os.Process) (*Killer, error) {
	return &Killer{pgid: p.Pid}, nil
}

// Kill sends SIGKILL to the entire process group (the process and every
// descendant that stayed in the group). A missing group (ESRCH) means the
// processes already exited and is not treated as an error.
func (k *Killer) Kill() error {
	if k == nil || k.pgid <= 0 {
		return nil
	}
	if err := syscall.Kill(-k.pgid, syscall.SIGKILL); err != nil && !errors.Is(err, syscall.ESRCH) {
		return err
	}
	return nil
}

// Close is a no-op on Unix; there is no job handle to release.
func (k *Killer) Close() error { return nil }


@@ -3,7 +3,7 @@
 //go:build !windows && !plan9

-package container
+package process

 import (
 	"fmt"
@@ -47,11 +47,12 @@ func processAlive(pid int) bool {
 	return true
 }

-// TestProcessKillerKillsTree verifies that a process group captured by the
-// killer is terminated together with a child the step spawns afterwards. This
-// mirrors a step that launches a child which spawns further processes, where
-// cancelling the job must take down the whole tree, not just the direct child.
-func TestProcessKillerKillsTree(t *testing.T) {
+// TestKillerKillsTree verifies that a process group captured by the killer is
+// terminated together with a child the process spawns afterwards. This mirrors
+// a step or post-task script that launches a child which spawns further
+// processes, where cancelling must take down the whole tree, not just the
+// direct child.
+func TestKillerKillsTree(t *testing.T) {
 	dir := t.TempDir()
 	pidFile := filepath.Join(dir, "child.pid")
@@ -60,8 +61,8 @@ func TestProcessKillerKillsTree(t *testing.T) {
 	// child stays in the parent's process group, so the group kill must reach it.
 	script := fmt.Sprintf(`sleep 600 & echo $! > %q; sleep 600`, pidFile)
 	cmd := exec.Command("/bin/sh", "-c", script)
-	// Launch as its own process-group leader, exactly like a real step does (see
-	// getSysProcAttr), so the killer's PGID == the process PID.
+	// Launch as its own process-group leader, exactly like a real process does
+	// (see SysProcAttr), so the killer's PGID == the process PID.
 	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
 	require.NoError(t, cmd.Start())
 	t.Cleanup(func() {
@@ -69,7 +70,7 @@ func TestProcessKillerKillsTree(t *testing.T) {
 		_ = cmd.Wait()
 	})

-	killer, err := newProcessKiller(cmd.Process)
+	killer, err := NewKiller(cmd.Process)
 	require.NoError(t, err)
 	defer killer.Close()


@@ -0,0 +1,72 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT

package process

import (
	"os"

	"golang.org/x/sys/windows"
)

// Killer terminates a started process together with its entire descendant tree
// via a Windows Job Object.
//
// Background: a process (a step or a post-task script) often launches a process
// tree (a shell that starts a child which in turn spawns further GUI or
// background processes). The default exec.CommandContext cancellation only kills
// the direct child, so cancelling left the rest of the tree running. Because
// those orphans inherited the parent's stdout/stderr pipe, cmd.Wait() also
// blocked forever and the runner hung.
//
// Assigning the process to a Job Object lets us kill the whole tree atomically
// on cancellation (TerminateJobObject), which also closes the inherited pipe
// handles so cmd.Wait() can return.
type Killer struct {
	job windows.Handle
}

// NewKiller creates a Job Object and assigns p (an already-started process) to
// it. Children spawned by p afterwards are automatically part of the job. The
// job does NOT use JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE, so closing the handle on
// normal completion does not kill legitimate background processes; the tree is
// only torn down by an explicit Kill (cancellation).
func NewKiller(p *os.Process) (*Killer, error) {
	job, err := windows.CreateJobObject(nil, nil)
	if err != nil {
		return nil, err
	}

	h, err := windows.OpenProcess(windows.PROCESS_SET_QUOTA|windows.PROCESS_TERMINATE, false, uint32(p.Pid))
	if err != nil {
		windows.CloseHandle(job)
		return nil, err
	}
	defer windows.CloseHandle(h)

	if err := windows.AssignProcessToJobObject(job, h); err != nil {
		windows.CloseHandle(job)
		return nil, err
	}

	return &Killer{job: job}, nil
}

// Kill terminates every process currently assigned to the job (the process and
// all of its descendants).
func (k *Killer) Kill() error {
	if k == nil || k.job == 0 {
		return nil
	}
	return windows.TerminateJobObject(k.job, 1)
}

// Close releases the job handle. It does not terminate the processes.
func (k *Killer) Close() error {
	if k == nil || k.job == 0 {
		return nil
	}
	h := k.job
	k.job = 0
	return windows.CloseHandle(h)
}


@@ -1,7 +1,7 @@
 // Copyright 2026 The Gitea Authors. All rights reserved.
 // SPDX-License-Identifier: MIT

-package container
+package process

 import (
 	"fmt"
@@ -32,11 +32,11 @@ func processAlive(pid int) bool {
 	return code == stillActive
 }

-// TestProcessKillerKillsTree verifies that a process assigned to the Job Object
-// is terminated together with a child it spawns afterwards. This mirrors a step
-// that launches a child which spawns further processes, where cancelling the
-// job must take down the whole tree, not just the direct child.
-func TestProcessKillerKillsTree(t *testing.T) {
+// TestKillerKillsTree verifies that a process assigned to the Job Object is
+// terminated together with a child it spawns afterwards. This mirrors a step or
+// post-task script that launches a child which spawns further processes, where
+// cancelling must take down the whole tree, not just the direct child.
+func TestKillerKillsTree(t *testing.T) {
 	dir := t.TempDir()
 	pidFile := filepath.Join(dir, "child.pid")
@@ -50,7 +50,7 @@ func TestProcessKillerKillsTree(t *testing.T) {
 	require.NoError(t, cmd.Start())
 	t.Cleanup(func() { _ = cmd.Process.Kill() })

-	killer, err := newProcessKiller(cmd.Process)
+	killer, err := NewKiller(cmd.Process)
 	require.NoError(t, err)
 	defer killer.Close()


@@ -0,0 +1,17 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT

//go:build plan9

package process

import "syscall"

// SysProcAttr returns the platform attributes used to start a process. Plan 9
// has no process-group tree-kill (see Killer), so we only request a new rfork
// note group here.
func SysProcAttr(cmdLine string, tty bool) *syscall.SysProcAttr {
	return &syscall.SysProcAttr{
		Rfork: syscall.RFNOTEG,
	}
}


@@ -0,0 +1,24 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT

//go:build !windows && !plan9

package process

import "syscall"

// SysProcAttr returns the platform attributes used to start a process so that a
// Killer can later tear down its whole process tree. On Unix the process becomes
// the leader of a new process group (or session, for the PTY path), so a
// signal to the negative PID reaches every descendant that stayed in the group.
func SysProcAttr(_ string, tty bool) *syscall.SysProcAttr {
	if tty {
		return &syscall.SysProcAttr{
			Setsid:  true,
			Setctty: true,
		}
	}
	return &syscall.SysProcAttr{
		Setpgid: true,
	}
}


@@ -0,0 +1,14 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT

package process

import "syscall"

// SysProcAttr returns the platform attributes used to start a process so that a
// Killer can later tear down its whole process tree. On Windows the process is
// placed in a new process group; the descendant tree is reclaimed via the Job
// Object set up by NewKiller.
func SysProcAttr(cmdLine string, tty bool) *syscall.SysProcAttr {
	return &syscall.SysProcAttr{CmdLine: cmdLine, CreationFlags: syscall.CREATE_NEW_PROCESS_GROUP}
}


@@ -0,0 +1,66 @@
// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT

package process

import (
	"os"
	"os/exec"
	"sync/atomic"
	"time"
)

// treeKillWaitDelay bounds how long Wait lingers for the command's I/O pipes to
// drain after the process exits before force-closing them and returning. It also
// covers a command that backgrounds a process holding a pipe open after a clean
// exit.
const treeKillWaitDelay = 10 * time.Second

// TreeKill wires an exec.Cmd so that cancelling it tears down the command's
// whole process tree (see Killer) rather than only the direct child, and bounds
// the post-exit I/O wait so a leftover pipe writer can never hang cmd.Wait.
//
// Background: a command often launches a process tree (a shell that starts a
// child which spawns further background processes). The default
// exec.CommandContext cancellation only kills the direct child, leaving the rest
// of the tree running; and because the orphans inherit cmd's stdout/stderr pipe,
// cmd.Wait() would block forever, hanging the caller.
//
// Callers still set cmd.SysProcAttr (via SysProcAttr) themselves, because the
// value differs between the plain and PTY execution paths.
type TreeKill struct {
	killer atomic.Pointer[Killer]
}

// NewTreeKill sets cmd.Cancel and cmd.WaitDelay. Call it before cmd.Start, then
// call Capture once after a successful Start.
func NewTreeKill(cmd *exec.Cmd) *TreeKill {
	t := &TreeKill{}
	cmd.Cancel = func() error {
		if k := t.killer.Load(); k != nil {
			return k.Kill()
		}
		if cmd.Process != nil {
			return cmd.Process.Kill()
		}
		return nil
	}
	cmd.WaitDelay = treeKillWaitDelay
	return t
}

// Capture assigns the started process (and the descendants it spawns) to a
// Killer so cancellation can reach the whole tree — a Job Object on Windows
// (children spawned afterwards are auto-included) and the process group on Unix.
// Call it once after cmd.Start. On failure the command falls back to the default
// single-process kill and the returned error is for logging only; WaitDelay
// still bounds the wait. The returned Killer should be closed when the command
// finishes (Close is nil-safe).
func (t *TreeKill) Capture(p *os.Process) (*Killer, error) {
	k, err := NewKiller(p)
	if err != nil {
		return nil, err
	}
	t.killer.Store(k)
	return k, nil
}


@@ -44,11 +44,13 @@ type Reporter struct {
 	// so the gauge skips no-op Set calls when the buffer size is unchanged.
 	lastLogBufferRows int

 	state             *runnerv1.TaskState
 	stateChanged      bool
 	stateMu           sync.RWMutex
 	outputs           sync.Map
 	daemon            chan struct{}
+	heartbeatStop     chan struct{}
+	heartbeatStopOnce sync.Once

 	// Unix-nanos of the last successful UpdateTask. Atomic so the heartbeat
 	// guard in ReportState reads it without contending stateMu.
@@ -99,7 +101,8 @@ func NewReporter(ctx context.Context, cancel context.CancelFunc, client client.C
 		state: &runnerv1.TaskState{
 			Id: task.Id,
 		},
 		daemon:        make(chan struct{}),
+		heartbeatStop: make(chan struct{}),
 	}

 	if task.Secrets["ACTIONS_STEP_DEBUG"] == "true" {
@@ -273,6 +276,15 @@ func (r *Reporter) RunDaemon() {
 	go r.runDaemonLoop()
 }

+// StopHeartbeats stops periodic UpdateTask heartbeats without cancelling the
+// task context. Close() still delivers the final flush. Safe to call multiple
+// times and when the context is already cancelled.
+func (r *Reporter) StopHeartbeats() {
+	r.heartbeatStopOnce.Do(func() {
+		close(r.heartbeatStop)
+	})
+}
+
 func (r *Reporter) stopLatencyTimer(active *bool, timer *time.Timer) {
 	if *active {
 		if !timer.Stop() {
@@ -339,6 +351,12 @@ func (r *Reporter) runDaemonLoop() {
 			// delivers the final flush on a detached context (flushFinal).
 			close(r.daemon)
 			return
+		case <-r.heartbeatStop:
+			// Stop heartbeating during post-task script execution. Close() still
+			// delivers the final flush on a detached context (flushFinal).
+			close(r.daemon)
+			return
 		}

 		r.stateMu.RLock()


@@ -921,3 +921,65 @@ func TestReporter_CloseReportsCancelledOnCanceledCtx(t *testing.T) {
 	assert.True(t, foundCancelled, "final log must contain a 'Cancelled' row")
 	assert.False(t, foundEarlyTermination, "final log must not contain 'Early termination' on the cancel path")
 }
+
+// TestReporter_StopHeartbeats verifies that StopHeartbeats ends periodic
+// UpdateTask heartbeats while Close() still flushes the final state.
+func TestReporter_StopHeartbeats(t *testing.T) {
+	var updateTaskCalls atomic.Int64
+
+	client := mocks.NewClient(t)
+	client.On("UpdateLog", mock.Anything, mock.Anything).Maybe().Return(
+		func(_ context.Context, req *connect_go.Request[runnerv1.UpdateLogRequest]) (*connect_go.Response[runnerv1.UpdateLogResponse], error) {
+			return connect_go.NewResponse(&runnerv1.UpdateLogResponse{
+				AckIndex: req.Msg.Index + int64(len(req.Msg.Rows)),
+			}), nil
+		},
+	)
+	client.On("UpdateTask", mock.Anything, mock.Anything).Return(
+		func(_ context.Context, _ *connect_go.Request[runnerv1.UpdateTaskRequest]) (*connect_go.Response[runnerv1.UpdateTaskResponse], error) {
+			updateTaskCalls.Add(1)
+			return connect_go.NewResponse(&runnerv1.UpdateTaskResponse{}), nil
+		},
+	)
+
+	ctx, cancel := context.WithCancel(context.Background())
+	defer cancel()
+
+	taskCtx, err := structpb.NewStruct(map[string]any{})
+	require.NoError(t, err)
+	cfg, err := config.LoadDefault("")
+	require.NoError(t, err)
+	cfg.Runner.StateReportInterval = 20 * time.Millisecond
+	cfg.Runner.LogReportInterval = time.Hour
+
+	reporter := NewReporter(ctx, cancel, client, &runnerv1.Task{Context: taskCtx}, cfg)
+	reporter.ResetSteps(1)
+	reporter.RunDaemon()
+
+	reporter.stateMu.Lock()
+	reporter.stateChanged = true
+	reporter.state.Result = runnerv1.Result_RESULT_SUCCESS
+	reporter.state.StoppedAt = timestamppb.Now()
+	reporter.stateMu.Unlock()
+
+	require.Eventually(t, func() bool {
+		return updateTaskCalls.Load() >= 1
+	}, time.Second, 5*time.Millisecond, "daemon must send at least one UpdateTask before StopHeartbeats")
+
+	beforeStop := updateTaskCalls.Load()
+	reporter.StopHeartbeats()
+
+	select {
+	case <-reporter.daemon:
+	case <-time.After(time.Second):
+		t.Fatal("StopHeartbeats must stop the daemon loop")
+	}
+
+	time.Sleep(3 * cfg.Runner.StateReportInterval)
+	assert.Equal(t, beforeStop, updateTaskCalls.Load(),
+		"UpdateTask must not be called after StopHeartbeats")
+
+	require.NoError(t, reporter.Close(""))
+	assert.Greater(t, updateTaskCalls.Load(), beforeStop,
+		"Close() must still send a final UpdateTask after StopHeartbeats")
+}