AI Engineering Governance - LeadDev London 2026
June 1, 2026
This is the talk track for the 10-minute LeadDev London demo.
The opening line is:
AI did not remove the bottleneck. It moved it from writing code to trusting code.
The goal is to make one idea land quickly: code generation has scaled, but the systems around review, risk, and governance have not automatically scaled with it. Qodo is the review and governance layer that helps engineering teams trust what enters the pull request workflow.
10-minute promise
By the end of the demo, the audience should see three things:
- Where AI changes the code review bottleneck.
- How Qodo reviews a real PR with context.
- How governance rules make standards enforceable.
The deck should stay demo-first. The statistics are there to establish urgency. The architecture is there to orient the audience. The demo is the proof.
The shift
Start with the evidence slide.
| Signal | Value | Meaning |
| --- | --- | --- |
| AI code-gen adoption | 74% | Code generation is mainstream. |
| AI-generated code | 41% | A large share of code now enters through AI-assisted workflows. |
| Longer PR reviews | +91% | Review is becoming the queue. |
| PR size increase | +154% | Review surface area is expanding. |
| Current spend / dev / year | $101-500 | Teams already fund the first wave of tools. |
| Emerging 2026 target | $1,000 | Budget is moving toward broader engineering AI, including review and governance. |
Talk track:
The first wave of AI engineering focused on helping developers produce code faster. That worked. But when code volume and PR size go up, the constraint moves. The limiting factor becomes whether the organization can review, verify, govern, and safely merge the code it can now produce.
The before / now contrast is simple:
| Period | Constraint | Optimization |
| --- | --- | --- |
| Before | Writing code | Generate faster |
| Now | Trusting code | Review better |
The pull request is the pressure point. AI can produce larger changes faster, but human review capacity does not scale at the same rate. Standards are still scattered across docs, linters, reviewer memory, prior PRs, and team habits. The merge decision still belongs to the engineering organization.
The question is no longer just:
Can we generate it?
It is:
Can we safely merge it?
Qodo architecture
Use the architecture diagram to explain the shape of the system before opening the demo.
The point is not to show every implementation detail. The point is to show that Qodo is not just reading a diff.
flowchart LR
PR["1. Developer opens<br/>AI-assisted PR"]
Inputs["2. Qodo gathers<br/>review inputs"]
Orchestrator["3. Review<br/>Orchestrator"]
Correctness["Correctness<br/>Agent"]
Security["Security<br/>Agent"]
Tests["Test<br/>Agent"]
Architecture["Architecture<br/>Agent"]
Compliance["Compliance<br/>Agent"]
Synthesis["4. Synthesis +<br/>Prioritization"]
Workflow["5. PR Comments +<br/>Required Actions"]
Analytics["5. Governance<br/>Analytics"]
Diff["PR diff"]
Patterns["Repo patterns"]
Context["Multi-repo context"]
Plan["Ticket intent<br/>Acceptance criteria"]
Rules["Governance rules<br/>Org / team / repo / path"]
PR --> Inputs
Inputs --> Diff
Inputs --> Patterns
Inputs --> Context
Inputs --> Plan
Inputs --> Rules
Diff --> Orchestrator
Patterns --> Orchestrator
Context --> Orchestrator
Plan --> Orchestrator
Rules --> Orchestrator
Orchestrator --> Correctness
Orchestrator --> Security
Orchestrator --> Tests
Orchestrator --> Architecture
Orchestrator --> Compliance
Correctness --> Synthesis
Security --> Synthesis
Tests --> Synthesis
Architecture --> Synthesis
Compliance --> Synthesis
Synthesis --> Workflow
Synthesis --> Analytics
classDef pr fill:#064e3b,stroke:#34d399,color:#fff
classDef context fill:#164e63,stroke:#22d3ee,color:#fff
classDef planning fill:#3b2f12,stroke:#fbbf24,color:#fff
classDef rules fill:#4c1d95,stroke:#a78bfa,color:#fff
classDef inputs fill:#581c87,stroke:#c084fc,color:#fff
classDef orchestrator fill:#1e3a8a,stroke:#60a5fa,color:#fff
classDef agent fill:#1f2937,stroke:#d1d5db,color:#fff
classDef synthesis fill:#374151,stroke:#f9fafb,color:#fff
classDef output fill:#7f1d1d,stroke:#f97316,color:#fff
class PR pr
class Diff,Patterns,Context context
class Plan planning
class Rules rules
class Inputs inputs
class Orchestrator orchestrator
class Correctness,Security,Tests,Architecture,Compliance agent
class Synthesis synthesis
class Workflow,Analytics output
Talk track:
Qodo is not just reading a diff. It combines intent, codebase context, governance rules, and specialist review agents before writing back into the PR workflow.
The sequence is:
- A developer opens an AI-assisted PR.
- Qodo gathers review inputs.
- The review orchestrator coordinates the review.
- Specialist agents review correctness, security, reliability, testability, architecture, performance, compliance, and ticket fit.
- Qodo synthesizes the findings into one PR workflow.
Why multi-agent review matters:
- Code review is not one job.
- Correctness, security, reliability, testability, architecture, performance, compliance, and ticket fit are different review responsibilities.
- Separating review responsibilities lets Qodo produce focused findings.
- Synthesis turns those findings back into one developer workflow.
Demo 1: review output
Open the Qodo review comment:
https://github.com/codium-ai/qodo-platform/pull/1768#issuecomment-4288244992
Talk track:
This is the first thing I want you to notice: Qodo is not returning one generic paragraph. It is producing a structured review. Findings are grouped with status, severity, and category. You can see bugs, rule violations, cross-repo conflicts, team insights, reliability, correctness, maintainability, and performance.
The important labels to call out:
- Bug
- Rule violation
- Cross-repo conflict
- Team insight
- Action required
- Remediation recommended
- Resolved
The stage point:
The value is not just that Qodo found issues. The value is that it gives the team a review object they can work through inside the PR.
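One way to picture a "review object" is as typed items a team can filter and work through. The sketch below is an assumption about shape only: the split of the labels into a `Kind` and a `Status`, and the `ReviewItem` class itself, are hypothetical and not taken from Qodo's schema. The second and third titles are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


# Assumed grouping of the labels shown in the review comment.
class Kind(Enum):
    BUG = "Bug"
    RULE_VIOLATION = "Rule violation"
    CROSS_REPO_CONFLICT = "Cross-repo conflict"
    TEAM_INSIGHT = "Team insight"


class Status(Enum):
    ACTION_REQUIRED = "Action required"
    REMEDIATION_RECOMMENDED = "Remediation recommended"
    RESOLVED = "Resolved"


@dataclass
class ReviewItem:
    title: str
    kind: Kind
    status: Status


def group_by_status(items: list[ReviewItem]) -> dict[Status, list[ReviewItem]]:
    """Bucket items so the PR author can start with what blocks the merge."""
    grouped: dict[Status, list[ReviewItem]] = defaultdict(list)
    for item in items:
        grouped[item.status].append(item)
    return grouped


def open_items(items: list[ReviewItem]) -> list[ReviewItem]:
    """Everything that still needs attention."""
    return [item for item in items if item.status is not Status.RESOLVED]


items = [
    ReviewItem("Persona PATCH needs batching", Kind.BUG, Status.ACTION_REQUIRED),
    ReviewItem("Placeholder finding A", Kind.RULE_VIOLATION, Status.REMEDIATION_RECOMMENDED),
    ReviewItem("Placeholder finding B", Kind.TEAM_INSIGHT, Status.RESOLVED),
]
for status, bucket in group_by_status(items).items():
    print(status.value, "->", [item.title for item in bucket])
```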
Demo 2: relevance and evidence
Stay on the same PR comment and open the "Persona PATCH needs batching" finding.
Talk track:
This is where the review becomes more interesting than a static analysis warning. Qodo is explaining relevance, prior context, and evidence.
Call out the parts of the finding:
- Relevance: Medium
- Similar prior PRs: PR-#1551 and PR-#1267
- Recommendation generated from similar findings in past PRs
- Evidence from backend code, unit tests, and frontend configuration behavior
The evidence text says the backend performs cross-field validation after merging a PATCH, and frontend configuration submits only dirty keys. That creates a likely partial-PATCH failure mode when a user flips auto-select without also touching persona_identifier.
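If the audience needs the failure mode spelled out, the following sketch reproduces the general pattern. The actual validation rule is not stated in the finding, so the rule here (a persona identifier must be empty when auto-select is on) is invented purely for illustration, as are the function names and the `auto_select` key spelling. Only `persona_identifier` and the merge-then-validate and dirty-keys behaviors come from the evidence text.

```python
def validate(config: dict) -> list[str]:
    # Hypothetical cross-field rule, invented for this example.
    errors = []
    if config.get("auto_select") and config.get("persona_identifier"):
        errors.append("persona_identifier must be empty when auto_select is on")
    return errors


def apply_patch(stored: dict, patch: dict) -> dict:
    """Backend side: merge the PATCH into the stored record, then validate."""
    merged = {**stored, **patch}
    errors = validate(merged)
    if errors:
        raise ValueError("; ".join(errors))
    return merged


def dirty_keys(original: dict, edited: dict) -> dict:
    """Frontend side: submit only the fields the user actually changed."""
    return {k: v for k, v in edited.items() if original.get(k) != v}


stored = {"auto_select": False, "persona_identifier": "persona-123"}

# The user flips the toggle and touches nothing else.
form = {**stored, "auto_select": True}
patch = dirty_keys(stored, form)
try:
    apply_patch(stored, patch)
except ValueError as exc:
    print("partial PATCH rejected:", exc)

# Sending both related fields in one request passes the same validation.
batched = {"auto_select": True, "persona_identifier": None}
print("batched PATCH accepted:", apply_patch(stored, batched))
```

Neither side is wrong in isolation. The problem only appears when the backend's merge-then-validate step meets the frontend's dirty-keys behavior, which is why a diff-only review would likely miss it.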
The stage point:
This is what context-aware review should look like. The finding is not just "this line might be wrong." It connects code behavior, tests, frontend workflow, and prior team decisions.
Governance
Move from the demo back to the general governance story.
Governance is review judgment made explicit.
Before governance:
- Standards live in documents.
- Senior reviewers remember edge cases.
- Security guidance is applied inconsistently.
- Repeated PR feedback stays informal.
- Leadership cannot see what still merges unresolved.
After governance:
- Define the rule once.
- Scope it by org, team, repository, or path.
- Set severity.
- Enforce it during review.
- Measure violations.
- Tune low-signal rules.
The key line:
Governance is not a meeting after development. It is your standards showing up in every PR.
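A rough sketch of "define once, scope, set severity" as data. The `Rule` class, the rule names, and the severity levels are hypothetical, and the glob matching uses Python's standard `fnmatch` module rather than anything Qodo-specific.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str                   # assumed levels: "error" or "warning"
    team: Optional[str] = None      # None means the rule is org-wide
    repo: Optional[str] = None
    path_glob: Optional[str] = None

    def applies_to(self, team: str, repo: str, path: str) -> bool:
        """A rule applies when every scope it sets matches the changed file."""
        if self.team is not None and self.team != team:
            return False
        if self.repo is not None and self.repo != repo:
            return False
        return self.path_glob is None or fnmatchcase(path, self.path_glob)


# Example rules, invented for illustration.
RULES = [
    Rule("no-hardcoded-secrets", "error"),
    Rule("payments-need-idempotency-keys", "error", team="payments"),
    Rule("migrations-need-rollback", "warning", repo="backend", path_glob="migrations/*"),
]


def rules_for(team: str, repo: str, path: str) -> list[str]:
    """Names of the rules a reviewer should enforce for one changed file."""
    return [rule.name for rule in RULES if rule.applies_to(team, repo, path)]


print(rules_for("payments", "backend", "migrations/0042_add_index.py"))
print(rules_for("growth", "frontend", "src/App.tsx"))
```

The narrower the scope, the less noise: the frontend file only picks up the org-wide rule, while the payments migration picks up all three.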
Rules engine lifecycle
Explain rules as an operating loop, not a policy document.
| Step | Action | Meaning |
| --- | --- | --- |
| 1 | Discover | Find repeated review feedback. |
| 2 | Codify | Turn it into a clear rule. |
| 3 | Scope | Apply it by org, repo, or path. |
| 4 | Enforce | Surface it in every relevant PR. |
| 5 | Measure | Track passed, detected, and merged violations. |
| 6 | Tune | Adjust, disable, or retire noisy rules. |
The line to land:
Rules become governance when they are enforced, measured, and improved.
Suggested rules
Suggested rules are the bridge from existing team behavior into formal governance.
Qodo can detect recurring review patterns and suggest rules. Suggestions are not policy. They become governance when an admin reviews, scopes, and activates them.
flowchart LR
Pattern["Recurring<br/>review pattern"]
Suggestion["Suggested<br/>rule"]
Review["Admin<br/>review"]
Scope["Scope"]
Activate["Activate"]
Governance["Governance"]
Pattern --> Suggestion --> Review --> Scope --> Activate --> Governance
Talk track:
AI can find the pattern, but the organization still decides what becomes policy.
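The suggestion-to-governance path can be sketched as a small state machine where nothing is enforced until an admin has walked it through every step. The state names mirror the flow described here. The `REJECTED` state, the `SuggestedRule` class, and the example rule text are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    SUGGESTED = auto()
    REVIEWED = auto()
    SCOPED = auto()
    ACTIVE = auto()
    REJECTED = auto()  # assumption: an admin can also decline a suggestion


# Allowed transitions; anything not listed here is refused.
ALLOWED = {
    State.SUGGESTED: {State.REVIEWED, State.REJECTED},
    State.REVIEWED: {State.SCOPED, State.REJECTED},
    State.SCOPED: {State.ACTIVE},
}


@dataclass
class SuggestedRule:
    text: str
    state: State = State.SUGGESTED
    history: list = field(default_factory=list)

    def advance(self, to: State, admin: str) -> None:
        """Move one step forward, recording which admin made the call."""
        if to not in ALLOWED.get(self.state, set()):
            raise ValueError(f"cannot move from {self.state.name} to {to.name}")
        self.history.append((self.state, to, admin))
        self.state = to

    @property
    def enforced(self) -> bool:
        return self.state is State.ACTIVE


rule = SuggestedRule("Example: API handlers validate input before use")
print(rule.enforced)  # False: a suggestion is not policy
for step in (State.REVIEWED, State.SCOPED, State.ACTIVE):
    rule.advance(step, admin="admin@example.com")
print(rule.enforced)  # True
```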
Governance analytics
Governance only becomes credible when it is measurable.
The analytics framing is:
| Metric | Meaning |
| --- | --- |
| Passed | The rule was evaluated and passed. |
| Detected | A rule caught risk before merge. |
| Merged | The PR merged with unresolved risk. |
| Leadership signal | Merged violations show which risks still got through. |
Talk track:
Merged violations are the leadership signal. They answer a different question from developer productivity: which risks are still making it through the process?
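As a sketch of how the three outcomes might roll up into that signal, the snippet below counts outcomes per rule and lists the rules with merged violations. The evaluation log and rule names are made up, and treating the three outcomes as mutually exclusive per evaluation is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical evaluation log: (rule name, outcome). Outcomes are assumed to
# be exactly one of "passed", "detected", or "merged" per evaluation.
EVALUATIONS = [
    ("no-hardcoded-secrets", "passed"),
    ("no-hardcoded-secrets", "detected"),
    ("no-hardcoded-secrets", "passed"),
    ("migrations-need-rollback", "merged"),
    ("migrations-need-rollback", "detected"),
    ("migrations-need-rollback", "merged"),
]


def summarize(evaluations):
    """Count outcomes per rule."""
    summary = defaultdict(Counter)
    for rule, outcome in evaluations:
        summary[rule][outcome] += 1
    return summary


def merged_violations(summary):
    """Rules with at least one merged violation, most merged first."""
    rows = [(rule, counts["merged"]) for rule, counts in summary.items() if counts["merged"] > 0]
    return sorted(rows, key=lambda row: row[1], reverse=True)


summary = summarize(EVALUATIONS)
for rule, counts in summary.items():
    print(rule, dict(counts))
print("leadership signal:", merged_violations(summary))
```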
Trust: benchmarks
Once the workflow is clear, the trust question becomes:
Can we trust the reviewer?
Use the benchmark table as evidence, not as the main story. The agentic-review-benchmarks/benchmark-pr-mapping repository compares agentic code review tools on precision, recall, and F1.
| Agent | Precision | Recall | F1 |
| --- | --- | --- | --- |
| Qodo - Exhaustive | 63.8% | 56.7% | 60.1% |
| Qodo - Precise | 74.5% | 44.2% | 55.4% |
| Augment | 70.6% | 32.1% | 44.1% |
| Copilot | 50.1% | 37.4% | 42.8% |
| Cursor | 78.5% | 26.2% | 39.3% |
| Greptile | 68.5% | 27.2% | 39.0% |
| Codex | 83.0% | 24.3% | 37.6% |
| Coderabbit | 53.7% | 19.0% | 28.0% |
| Sentry | 85.3% | 13.8% | 23.7% |
Source: https://github.com/agentic-review-benchmarks/benchmark-pr-mapping
Talk track:
Benchmarks help compare review systems, but they do not replace fit with your workflow, rules, codebase, and risk model.
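For anyone in the audience unsure how the three columns relate: F1 is the harmonic mean of precision and recall, so a tool cannot score well by maximizing only one of them. The sketch below recomputes F1 from the published precision and recall figures. Small differences from the reported column are expected, since the inputs are already rounded to one decimal place.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the benchmark table.
ROWS = {
    "Qodo - Exhaustive": (63.8, 56.7, 60.1),
    "Qodo - Precise": (74.5, 44.2, 55.4),
    "Augment": (70.6, 32.1, 44.1),
    "Copilot": (50.1, 37.4, 42.8),
    "Cursor": (78.5, 26.2, 39.3),
    "Greptile": (68.5, 27.2, 39.0),
    "Codex": (83.0, 24.3, 37.6),
    "Coderabbit": (53.7, 19.0, 28.0),
    "Sentry": (85.3, 13.8, 23.7),
}

for agent, (precision, recall, reported) in ROWS.items():
    computed = f1(precision, recall)
    print(f"{agent:18s} computed={computed:5.1f} reported={reported:5.1f}")
```

This also explains the ordering: the highest-precision rows sit near the bottom because their low recall pulls the harmonic mean down.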
Build vs buy
End on build vs buy. Keep it short.
| Capability | Build | Qodo |
| --- | --- | --- |
| PR review agent | Platform-owned | Productized |
| Rule enforcement | Custom system | Built into review |
| Multi-repo context | Hard to maintain | Context engine |
| Governance analytics | Separate reporting | Portal metrics |
| Time to value | Months | Days |
The 10-minute talk track
| Time | Segment | Talk track |
| --- | --- | --- |
| 0:00-0:45 | Opening | AI did not remove the bottleneck. It moved it from writing code to trusting code. |
| 0:45-2:00 | The shift | Code generation, PR size, review time, and budget are all moving. The PR is the pressure point. |
| 2:00-3:00 | Architecture | Qodo combines PR context, planning intent, rules, multi-repo context, specialist agents, and synthesis. |
| 3:00-5:00 | Demo 1 | Show the structured Qodo review output: bugs, rule violations, cross-repo conflict, team insight, action required. |
| 5:00-6:30 | Demo 2 | Open one finding and show relevance, prior PRs, and evidence across backend, tests, and frontend workflow. |
| 6:30-8:15 | Governance | Rules turn repeated review judgment into enforceable, scoped, measurable behavior. |
| 8:15-9:15 | Analytics and trust | Merged violations show which risks still got through. Benchmarks help evaluate reviewer quality. |
| 9:15-10:00 | Build vs buy | The hard part is not prototyping a reviewer. It is operating governed review across teams, repos, rules, and workflows. |
References
- Demo PR comment: https://github.com/codium-ai/qodo-platform/pull/1768#issuecomment-4288244992
- Agentic review benchmark: https://github.com/agentic-review-benchmarks/benchmark-pr-mapping
- Qodo governance hub: https://docs.qodo.ai/governance/governance-hub
- Qodo rule enforcement: https://docs.qodo.ai/code-review/get-started/rule-enforcement
- Qodo suggested rules: https://docs.qodo.ai/code-review/get-started/rule-enforcement/suggested-rules
- Qodo rule analytics: https://docs.qodo.ai/qodo-documentation/code-review/get-started/rule-enforcement/analytics