perf: reduce runner-to-server connection load with adaptive reporting and polling

- Replace fixed 1s RunDaemon timer with event-driven select loop using
  separate log (3s) and state (5s) tickers for periodic flush
- Add batch-size threshold (default 100 rows) to flush logs immediately
  during bursty output like npm install
- Add max-latency timer (default 5s) to guarantee single log lines are
  delivered within a bounded time
- Trigger immediate flush on step transitions (start/stop) and job
  result for responsive frontend UX
- Skip ReportLog when no pending rows and ReportState when state is
  unchanged to eliminate no-op HTTP requests
- Replace fixed-rate polling with exponential backoff and jitter to
  prevent thundering herd on idle runners
- Tune HTTP client with MaxIdleConnsPerHost=10 and share a single
  http.Client between Ping and Runner service clients
- Add configurable options: log_report_interval, log_report_max_latency,
  log_report_batch_size, state_report_interval, fetch_interval_max

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Bo-Yi Wu
2026-04-10 22:41:38 +08:00
parent 90c1275f0e
commit ec07b8c00b
8 changed files with 304 additions and 83 deletions

View File

@@ -32,6 +32,22 @@ runner:
fetch_timeout: 5s
# The interval for fetching the job from the Gitea instance.
fetch_interval: 2s
# The maximum interval for fetching the job from the Gitea instance.
# The runner uses exponential backoff when idle, increasing the interval up to this maximum.
# Set to 0 or same as fetch_interval to disable backoff.
fetch_interval_max: 60s
# The base interval for periodic log flush to the Gitea instance.
# Logs may be sent earlier if the buffer reaches log_report_batch_size.
log_report_interval: 3s
# The maximum time a log row can wait before being sent.
# This ensures even a single log line appears on the frontend within this duration.
log_report_max_latency: 5s
# Flush logs immediately when the buffer reaches this many rows.
# This ensures bursty output (e.g., npm install) is delivered promptly.
log_report_batch_size: 100
# The interval for reporting task state (step status, timing) to the Gitea instance.
# State is also reported immediately on step transitions (start/stop).
state_report_interval: 5s
# The github_mirror of a runner is used to specify the mirror address of the github that pulls the action repository.
# It works when something like `uses: actions/checkout@v4` is used and DEFAULT_ACTIONS_URL is set to github,
# and github_mirror is not empty. In this case,