- Consume stateChanged atomically with the state snapshot under a single Lock
- Restore stateChanged on UpdateTask error so the change is not silently lost
- Collapse the early-return check into the same Lock to avoid triple locking
- Add tests covering the in-flight Fire race and the error-restore path
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace fixed 1s RunDaemon timer with event-driven select loop using
separate log (3s) and state (5s) tickers for periodic flush
- Add batch-size threshold (default 100 rows) to flush logs immediately
during bursty output like npm install
- Add max-latency timer (default 5s) to guarantee single log lines are
delivered within a bounded time
- Trigger immediate flush on step transitions (start/stop) and job
result for responsive frontend UX
- Skip ReportLog when no pending rows and ReportState when state is
unchanged to eliminate no-op HTTP requests
- Replace fixed-rate polling with exponential backoff and jitter to
prevent thundering herd on idle runners
- Tune HTTP client with MaxIdleConnsPerHost=10 and share a single
http.Client between Ping and Runner service clients
- Add configurable options: log_report_interval, log_report_max_latency,
log_report_batch_size, state_report_interval, fetch_interval_max
## Summary
- Fix data race on `r.closed` between `RunDaemon()` and `Close()` by protecting it with the existing `stateMu`, since `closed` is part of the reporter state. `RunDaemon()` reads it under `stateMu.RLock()`, and `Close()` sets it inside the existing `stateMu.Lock()` block
- `ReportState` now takes a parameter that stops `RunDaemon()` from reporting the result even when it is set; from now on only `Close` reports the result
- `Close` waits for `RunDaemon()` to signal exit by closing the `daemon` channel before reporting the final logs and the state with result; the wait is not expected to time out unless something has gone seriously wrong
- Add `TestReporter_EphemeralRunnerDeletion` which reproduces the exact scenario from #793: RunDaemon's `ReportState` racing with `Close`, causing the ephemeral runner to be deleted before final logs are sent
- Add `TestReporter_RunDaemonClose_Race` which exercises `RunDaemon()` and `Close()` concurrently to verify no data race on `r.closed` under `go test -race`
- Enable `-race` flag in `make test` so CI catches data races going forward
Based on #794, with fixes for the remaining unprotected `r.closed` reads that the race detector catches.
Fixes #793
---------
Co-authored-by: Christopher Homberger <christopher.homberger@web.de>
Co-authored-by: ChristopherHX <christopher.homberger@web.de>
Co-authored-by: rmawatson <rmawatson@hotmail.com>
Reviewed-on: https://gitea.com/gitea/act_runner/pulls/796
Reviewed-by: ChristopherHX <christopherhx@noreply.gitea.com>
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-committed-by: silverwind <me@silverwind.io>