## Problem
Cancelling a job on a Windows host runner could leave the spawned process
tree running and hang the runner. When a step launched a shell that
started a child which in turn spawned further GUI/background processes,
cancelling the job killed only the process the runner started directly
(the default `exec.CommandContext` behaviour). The surviving descendants
had inherited the step's stdout/stderr pipe, so the read end never
reached EOF and `cmd.Wait()` blocked forever.
Because the step executor never returned:
- the orphaned processes kept running (the cancelled work was not
actually stopped), and
- end-of-job cleanup (`Remove` → `terminateRunningProcesses`) was never
reached, so the runner appeared to go offline / stop picking up jobs.
`CREATE_NEW_PROCESS_GROUP` does not help here — it affects Ctrl-C signal
delivery, not handle inheritance or tree termination.
## Fix
- Assign each Windows step process to a **Job Object** immediately after
`cmd.Start()`. Descendants created afterwards are automatically part
of the job.
- Override `cmd.Cancel` to call `TerminateJobObject`, so cancellation
kills the **entire descendant tree** atomically. Terminating the
descendants also closes their inherited pipe handles, so `cmd.Wait()`
can return.
- Set `cmd.WaitDelay` (10s) as a safety net: if the output pipes are
still open 10s after cancellation or process exit, `Wait` force-closes
them and returns rather than blocking forever. This covers the case
where the job-object setup fails (e.g. nested-job restrictions) and we
fall back to the previous single-process kill.
- The Job Object is created **without** `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`,
so closing the handle on normal completion does not kill legitimate
background processes; the tree is only torn down on explicit cancel.
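
The Job Object handling in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in `process_windows.go`: the `jobKiller` type and its method names are hypothetical, and the `ping` command is only a long-running placeholder. It assumes `golang.org/x/sys/windows` is available and builds only on Windows.

```go
//go:build windows

package main

import (
	"fmt"
	"log"
	"os/exec"

	"golang.org/x/sys/windows"
)

// jobKiller is a hypothetical wrapper around a Job Object handle.
type jobKiller struct {
	job windows.Handle
}

// newJobKiller creates an anonymous Job Object with default limits, which
// means JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE is not set.
func newJobKiller() (*jobKiller, error) {
	job, err := windows.CreateJobObject(nil, nil)
	if err != nil {
		return nil, fmt.Errorf("CreateJobObject: %w", err)
	}
	return &jobKiller{job: job}, nil
}

// assign adds an already started process to the job. Processes it creates
// afterwards join the job automatically.
func (k *jobKiller) assign(pid int) error {
	// AssignProcessToJobObject needs these two access rights on the handle.
	const access = windows.PROCESS_SET_QUOTA | windows.PROCESS_TERMINATE
	h, err := windows.OpenProcess(access, false, uint32(pid))
	if err != nil {
		return fmt.Errorf("OpenProcess: %w", err)
	}
	defer windows.CloseHandle(h)
	if err := windows.AssignProcessToJobObject(k.job, h); err != nil {
		return fmt.Errorf("AssignProcessToJobObject: %w", err)
	}
	return nil
}

// terminate kills every process currently in the job with exit code 1.
func (k *jobKiller) terminate() error {
	return windows.TerminateJobObject(k.job, 1)
}

// close releases the handle. Without kill-on-close, members keep running.
func (k *jobKiller) close() error {
	return windows.CloseHandle(k.job)
}

func main() {
	k, err := newJobKiller()
	if err != nil {
		log.Fatal(err)
	}
	defer k.close()

	// Placeholder for a step process that would outlive a plain Kill.
	cmd := exec.Command("ping", "-n", "30", "127.0.0.1")
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}
	if err := k.assign(cmd.Process.Pid); err != nil {
		log.Fatal(err)
	}
	if err := k.terminate(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("wait returned:", cmd.Wait())
}
```

Opening the process by PID after `Start()` is one way to obtain a handle with the required access rights; the actual implementation may obtain the handle differently.
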
Implemented behind `runtime.GOOS == "windows"` with a Windows-only
`processKiller` (Job Object) and no-op stubs elsewhere, so non-Windows
behaviour (default cancellation + `Setpgid`) is unchanged.
## Changes
- `act/container/process_windows.go` — Job Object `processKiller`
(create / assign / terminate).
- `act/container/process_other.go` — no-op stubs (`//go:build !windows`).
- `act/container/host_environment.go` — wire `cmd.Cancel` (tree kill)
and `cmd.WaitDelay` into `exec()`.
- `go.mod` / `go.sum` — promote `golang.org/x/sys` to a direct
dependency.
## Testing
Fully tested by the author before submission.
## Notes
Follow-up to the Windows leftover-process reaping in #996: that sweep
now actually runs on cancellation because the step no longer hangs
before reaching it.
Reviewed-on: https://gitea.com/gitea/runner/pulls/1011
Reviewed-by: techknowlogick <9+techknowlogick@noreply.gitea.com>