Files
act_runner/act/container/process_other.go
Nicolas 3996d6d032 fix(cleanup): kill Unix step process group on cancel to avoid hang (#1025)
Cancelling a job on a Linux/macOS host runner can leave the spawned process
tree running and hang the runner — the same failure mode fixed for Windows in
#1011, just on the other platforms.

Steps are launched as process-group leaders (`Setpgid`, or `Setsid` for the PTY
path), but the default `exec.CommandContext` cancellation only kills the
**direct child**. When a step launches a shell that starts a child which in turn
spawns further background processes, cancelling the job leaves the descendants
running. Because those orphans inherited the step's stdout/stderr pipe, the read
end never hits EOF and `cmd.Wait()` blocks forever.

Because the step executor never returns:
- the orphaned processes keep running (the cancelled work is not actually
  stopped), and
- end-of-job cleanup is never reached, so the runner appears to go offline / stop
  picking up jobs.

## Fix

Apply the same tree-kill approach as Windows, using the Unix counterpart of a Job
Object: the **process group**.

- Add a Unix `processKiller` (`process_unix.go`) that captures the step's PGID
  (== PID, since the step is launched as a group leader) and sends `SIGKILL` to
  the whole group on cancellation. This also closes the inherited pipe handles so
  `cmd.Wait()` can return. `ESRCH` (group already gone) is not treated as an error.
- Restrict the previous no-op stub (`process_other.go`) to `plan9` and have it
  fall back to a single-process kill, preserving plan9's prior behaviour.
- Wire `cmd.Cancel` (tree kill) and `cmd.WaitDelay` (10s) **unconditionally** in
  `exec()` instead of Windows-only. `WaitDelay` also covers a step that
  backgrounds a process holding the pipe open after the main process exits.

Reviewed-on: https://gitea.com/gitea/runner/pulls/1025
Reviewed-by: Zettat123 <39446+zettat123@noreply.gitea.com>
2026-06-14 20:52:42 +00:00

30 lines
764 B
Go

// Copyright 2026 The Gitea Authors. All rights reserved.
// SPDX-License-Identifier: MIT
//go:build plan9
package container
import "os"
// processKiller falls back to single-process termination on platforms without
// a process-group / Job Object tree-kill. The Job Object (Windows) and process
// group (Unix) based tree-kills live in process_windows.go / process_unix.go;
// here we just kill the direct child, matching the previous default behaviour.
type processKiller struct {
p *os.Process
}
func newProcessKiller(p *os.Process) (*processKiller, error) {
return &processKiller{p: p}, nil
}
func (k *processKiller) Kill() error {
if k == nil || k.p == nil {
return nil
}
return k.p.Kill()
}
func (k *processKiller) Close() error { return nil }