Time keeps marching on while you develop a branch, so by the time it is ready
to merge, history has diverged. `git` can usually merge the diverged history,
but it fails when both sides change the same or nearby lines, conflicting with
each other. Most Code Review tools will report when there would be one of these
line conflicts. What they can't do is report when there would be semantic
conflicts (e.g. adding new code that calls a function that was renamed).
Only running CI will detect these types of conflicts.
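
As a rough illustration of such a semantic conflict, the sketch below simulates two branches that touch different files, so a line-based merge succeeds, yet the merged result fails when it is run. All file contents and names (`fetch`, `fetch_url`, `homepage`, `run_merged`) are hypothetical and exist only for this example.

```python
# Hypothetical library file as it exists on the common ancestor.
BASE = """
def fetch(url):
    return f"GET {url}"
"""

# Branch A renames fetch() to fetch_url() in the library file.
BRANCH_A = """
def fetch_url(url):
    return f"GET {url}"
"""

# Branch B, written against BASE, adds a new file that calls fetch().
BRANCH_B_NEW_FILE = """
def homepage():
    return fetch("https://example.com")
"""


def run_merged(lib_source: str, caller_source: str) -> str:
    """Load both files into one namespace and call homepage(), as a CI test would."""
    namespace: dict = {}
    exec(lib_source, namespace)
    exec(caller_source, namespace)
    return namespace["homepage"]()


# Branch B on its own base: works.
print(run_merged(BASE, BRANCH_B_NEW_FILE))

# A merged with B: no line conflict (different files), but the call is broken.
try:
    run_merged(BRANCH_A, BRANCH_B_NEW_FILE)
except NameError as err:
    print(f"semantic conflict only visible at run time: {err}")
```

Neither branch is wrong in isolation; only exercising the merged result exposes the problem.
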
Strategies for reducing the impact of semantic conflicts include:
- CI runs against the developer's branch:
  - Merge race window: between last manual rebase and merge-time
  - Cost: none
  - User notification: post-merge
- CI runs against the developer's branch, merged with `master`:
  - Merge race window: between last push and merge-time
  - Cost: none
  - User notification: post-push
- Code Review policy rejects PRs that are X days old or Y commits behind:
  - Merge race window: bounded by configuration
  - Cost: compute time and/or accept latency due to "satisfy the tool" pushes
  - User notification: after X time
- Merges into `master` are serialized and only made available when CI passes:
  - Merge race window: none
  - Cost: compute time and/or merge latency from extra pipeline run
  - User notification: pre-merge
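
The last two strategies can be sketched roughly as follows. This is a minimal model under stated assumptions, not any particular tool's implementation: changes are plain strings, `ci_passes` stands in for a full pipeline run, and the thresholds in `is_stale` (7 days, 50 commits) are made-up values for "X" and "Y".

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable, List, Tuple


def is_stale(last_rebased: datetime, commits_behind: int, now: datetime,
             max_age: timedelta = timedelta(days=7),  # "X days" (assumed value)
             max_behind: int = 50) -> bool:           # "Y commits" (assumed value)
    """Policy check: reject PRs that are too old or too far behind master."""
    return (now - last_rebased) > max_age or commits_behind > max_behind


def run_submit_queue(master: List[str], queue: Iterable[str],
                     ci_passes: Callable[[List[str]], bool]
                     ) -> Tuple[List[str], List[str]]:
    """Serialize merges: test each change on top of the current master and
    advance master only when CI passes, so there is no merge race window."""
    rejected: List[str] = []
    for change in queue:
        candidate = master + [change]
        if ci_passes(candidate):
            master = candidate       # the change becomes visible only now
        else:
            rejected.append(change)  # the author is notified pre-merge
    return master, rejected


# Toy CI (hypothetical): fails when a rename and a caller of the old name coexist.
def toy_ci(tree: List[str]) -> bool:
    return not ("rename-fetch" in tree and "call-fetch" in tree)


master, rejected = run_submit_queue(
    ["base"], ["rename-fetch", "call-fetch", "docs"], toy_ci)
print(master, rejected)
```

In this toy run the second change is rejected before it ever reaches `master`, and the unrelated third change still lands.
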
Other benefits to Submit Queues:
- To speed up PR feedback, it's common to rely heavily on test avoidance, but there are gaps where we don't have precise enough information, and we err on the side of faster CI runs rather than more complete CI runs. Sometimes this is worked around with markers the user can leave to get extra validation. A Submit Queue can be used to implement a two-tier CI system that gives you the best of both worlds: fast PR feedback and a green master.
- There is a gap between a PR landing and CI pushing out its artifacts. Artifacts built by the Submit Queue could be staged and then published if the pipeline succeeds, making them available immediately on merge.
Considerations
- Latency for merging emergency fixes
- Uncaught failures in the Submit Queue CI run, potentially failing unrelated changes in the future
- CI Flakiness rejecting changes unnecessarily and disrupting any speculative CI runs
- General latency before changes become available
- Controlling for side effects (e.g. deployments) in Submit Queue CI runs since the merge might get rejected
- Improving throughput by predicting failure
  - ML, like Uber
  - Rejecting PRs that are too old, like Shopify
- Resource utilization
  - Required processing "bandwidth"
  - Cost
Existing Submit Queues
- bors
  - Used by Rust and other projects
  - Integrates with GitHub
  - Communicates through @-mentions in PR comments
  - Can request a full test suite run without a merge
  - Uses a priority queue
  - Batches changes
    - Batch per priority level
    - Only one batch is active at a time
    - Max on success latency is O(2 * ci_time)
    - Max system load is O(pipeline)
    - Cost is O(pipeline * ci_time)
  - Bisects batches on failure
    - Latency is O(num_errors * log queue_size)
    - Max system load is O(pipeline)
    - Cost is O(pipeline * ci_time * num_errors * log queue_size)
- GitLab Merge Trains
  - Communicates through MR UX
  - Has "Land immediately" button, jumping the queue
  - Currently conflicts with "Merge When Pipeline Succeeds" feature
  - Speculative execution, rather than batching
    - On success latency is always O(ci_time)
    - Max system load is O(pipeline * queue_size)
    - Cost is O(pipeline * ci_time * queue_size)
  - Restarts process on failure
    - Latency is O(num_errors * ci_time)
    - Max system load is O(queue_size * pipeline)
    - Cost is O(pipeline * ci_time * queue_size * num_errors)
- ShipIt Merge Queue
  - Used by Shopify
  - See also v1 announcement and v2 announcement
  - Integrates with GitHub
  - Auto-rejects stale PRs (age and #commits)
  - Communicates through @-mentions in PR comments
  - Has "Land immediately" command, jumping the queue
  - Queue locking for emergencies
  - Auto-lock on too many items in queue
  - Batching
    - Fixed size queue (they use 8) to balance throughput and fault isolation
    - Batches run in parallel (speculative execution?). They have it fixed at 3 to limit strain on CI resources.
- arc submit
  - Used by Uber (forked arc)
  - Speculation engine tries to predict build success
  - A graph is created from priority, predicted success, and whether the builds for two merges can safely be done at once
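
To make the batching versus speculative-execution trade-off above more concrete, here is a minimal sketch of both approaches. It is a toy model under stated assumptions, not the actual implementation of bors, GitLab Merge Trains, or any other tool listed: changes are strings, `ci_passes` is a stand-in for one pipeline run, and the function names are invented for this example.

```python
from typing import Callable, List, Tuple

CI = Callable[[List[str]], bool]
Result = Tuple[List[str], List[str], int]  # (new master, rejected, CI runs)


def land_batch(master: List[str], batch: List[str], ci_passes: CI) -> Result:
    """Batching: test the whole batch in one CI run; on failure, bisect it."""
    if not batch:
        return master, [], 0
    if ci_passes(master + batch):
        return master + batch, [], 1
    if len(batch) == 1:
        return master, list(batch), 1  # isolated a failing change
    mid = len(batch) // 2
    master, rej_l, runs_l = land_batch(master, batch[:mid], ci_passes)
    master, rej_r, runs_r = land_batch(master, batch[mid:], ci_passes)
    return master, rej_l + rej_r, 1 + runs_l + runs_r


def land_speculative(master: List[str], queue: List[str], ci_passes: CI) -> Result:
    """Speculation: run one pipeline per queue position (each assuming all
    earlier changes pass); on a failure, drop that change and restart."""
    queue = list(queue)
    rejected: List[str] = []
    runs = 0
    while queue:
        # In a real system these pipelines would run in parallel.
        results = [ci_passes(master + queue[:i + 1]) for i in range(len(queue))]
        runs += len(queue)
        if all(results):
            return master + queue, rejected, runs
        first_bad = results.index(False)
        master = master + queue[:first_bad]  # everything before it is proven good
        rejected.append(queue[first_bad])
        queue = queue[first_bad + 1:]        # restart with the remainder
    return master, rejected, runs


# Toy scenario (hypothetical): eight queued changes, one of which breaks CI.
changes = [f"c{i}" for i in range(8)]
toy_ci = lambda tree: "c5" not in tree

batch_result = land_batch(["base"], changes, toy_ci)
spec_result = land_speculative(["base"], changes, toy_ci)
print(batch_result)
print(spec_result)
```

Both variants land the same seven good changes and reject the same bad one. In this toy run, batching needs fewer pipeline runs but executes them one after another, while speculation spends more runs to finish in two parallel rounds, which mirrors the latency versus load figures in the list above.
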
TODO
- Butler
  - Native support for `fixup!` commits!
- Mergify
  - User-defined queues are a nice way of handling priorities
    - Particularly allowing PRs to bypass the PR CI check, since it would be redundant with the merge-queue check
  - Git-native PR dependencies
- MergeQueue
- Mergtastic