mirror of
https://github.com/llvm/llvm-project.git
synced 2025-04-16 01:46:36 +00:00

Before this patch, the job/workflow name determined the metric name, meaning a change in the workflow definition could break monitoring. This patch adds a map from workflow names to stable metric names.

In addition, it reworks how we track the last processed workflow: GitHub queries are broken when filtering is applied, so we get a list of workflows, ordered by 'created_at', which mixes completed and running workflows. We have no guarantee on the order of completion, meaning we cannot stop at the first completed job we find (even per-workflow).

This PR processes the last 1000 workflows, but allows an early stop if the created_at time is older than 8 hours. This means we could miss long-running workflows (>8 hours), and if the number of workflows started before another one completes becomes high (>1000), we'll miss it. To detect this kind of behavior, a new metric, "oldest workflow processed", is added; it should at least indicate if the depth is too small.

An alternative without an arbitrary cut would be to initially parse all workflows, then record the last non-completed one we find and always start from that one (moving the lower bound as they complete). But LLVM has forever-queued workflow runs (>1 year), hence this would cause us to iterate over a very large number of jobs.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
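
For illustration only, the bounded scan described in this message could look roughly like the sketch below. It is not the code from the patch: the names WORKFLOW_TO_METRIC, select_completed_runs and already_seen, the dict layout of each run, and the example workflow name are all assumptions, and the runs are passed in as plain dicts instead of being fetched from GitHub.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical map from workflow names to stable metric names.
# The real names used by the patch are not shown in this message.
WORKFLOW_TO_METRIC = {
    "Example Premerge Workflow": "example_premerge",
}

MAX_WORKFLOWS = 1000            # depth limit stated in the message
MAX_AGE = timedelta(hours=8)    # early-stop threshold stated in the message


def select_completed_runs(runs, now, already_seen):
    """Scan runs ordered by created_at (newest first).

    Returns (selected, oldest_seen): the completed, tracked, not yet
    processed runs as (metric_name, run_id) pairs, and the created_at
    of the oldest run inspected (None if nothing was inspected).
    """
    selected = []
    oldest_seen = None
    for index, run in enumerate(runs):
        if index >= MAX_WORKFLOWS:
            break  # depth limit reached
        if now - run["created_at"] > MAX_AGE:
            break  # everything after this run is older still
        oldest_seen = run["created_at"]
        # Running and completed runs are mixed, so skip instead of stopping.
        if run["status"] != "completed":
            continue
        metric_name = WORKFLOW_TO_METRIC.get(run["name"])
        if metric_name is None:
            continue  # workflow is not tracked
        if run["id"] in already_seen:
            continue  # already reported in a previous pass
        selected.append((metric_name, run["id"]))
    return selected, oldest_seen
```

In such a sketch, oldest_seen plays the role of the "oldest workflow processed" metric: if it stays far more recent than the 8 hour bound, the 1000-run depth is likely what cut the scan short.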