AI Engineering Governance - LeadDev London 2026

June 1, 2026

This is the talk track for the 10-minute LeadDev London demo.

The opening line is:

AI did not remove the bottleneck. It moved it from writing code to trusting code.

The goal is to make one idea land quickly: code generation has scaled, but the systems around review, risk, and governance have not automatically scaled with it. Qodo is the review and governance layer that helps engineering teams trust what enters the pull request workflow.

10-minute promise

By the end of the demo, the audience should see three things:

  1. Where AI changes the code review bottleneck.
  2. How Qodo reviews a real PR with context.
  3. How governance rules make standards enforceable.

The deck should stay demo-first. The statistics are there to establish urgency. The architecture is there to orient the audience. The demo is the proof.

The shift

Start with the evidence slide.

| Signal | Value | Meaning |
|---|---|---|
| AI code-gen adoption | 74% | Code generation is mainstream. |
| AI-generated code | 41% | A large share of code now enters through AI-assisted workflows. |
| Longer PR reviews | +91% | Review is becoming the queue. |
| PR size increase | +154% | Review surface area is expanding. |
| Current spend / dev / year | $101-$500 | Teams already fund the first wave of tools. |
| Emerging 2026 target | $1,000 | Budget is moving toward broader engineering AI, including review and governance. |

Talk track:

The first wave of AI engineering focused on helping developers produce code faster. That worked. But when code volume and PR size go up, the constraint moves. The limiting factor becomes whether the organization can review, verify, govern, and safely merge the code it can now produce.

The before / now contrast is simple:

| Period | Constraint | Optimization |
|---|---|---|
| Before | Writing code | Generate faster |
| Now | Trusting code | Review better |

The pull request is the pressure point. AI can produce larger changes faster, but human review capacity does not scale at the same rate. Standards are still scattered across docs, linters, reviewer memory, prior PRs, and team habits. The merge decision still belongs to the engineering organization.

The question is no longer just:

Can we generate it?

It is:

Can we safely merge it?

Qodo architecture

Use the architecture diagram to explain the shape of the system before opening the demo.

The point is not to show every implementation detail. The point is to show that Qodo is not just reading a diff.

flowchart LR
    PR["1. Developer opens<br/>AI-assisted PR"]
    Inputs["2. Qodo gathers<br/>review inputs"]
    Orchestrator["3. Review<br/>Orchestrator"]
    Correctness["Correctness<br/>Agent"]
    Security["Security<br/>Agent"]
    Tests["Test<br/>Agent"]
    Architecture["Architecture<br/>Agent"]
    Compliance["Compliance<br/>Agent"]
    Synthesis["4. Synthesis +<br/>Prioritization"]
    Workflow["5. PR Comments +<br/>Required Actions"]
    Analytics["5. Governance<br/>Analytics"]

    Diff["PR diff"]
    Patterns["Repo patterns"]
    Context["Multi-repo context"]
    Plan["Ticket intent<br/>Acceptance criteria"]
    Rules["Governance rules<br/>Org / team / repo / path"]

    PR --> Inputs
    Inputs --> Diff
    Inputs --> Patterns
    Inputs --> Context
    Inputs --> Plan
    Inputs --> Rules
    Diff --> Orchestrator
    Patterns --> Orchestrator
    Context --> Orchestrator
    Plan --> Orchestrator
    Rules --> Orchestrator
    Orchestrator --> Correctness
    Orchestrator --> Security
    Orchestrator --> Tests
    Orchestrator --> Architecture
    Orchestrator --> Compliance
    Correctness --> Synthesis
    Security --> Synthesis
    Tests --> Synthesis
    Architecture --> Synthesis
    Compliance --> Synthesis
    Synthesis --> Workflow
    Synthesis --> Analytics

    classDef pr fill:#064e3b,stroke:#34d399,color:#fff
    classDef context fill:#164e63,stroke:#22d3ee,color:#fff
    classDef planning fill:#3b2f12,stroke:#fbbf24,color:#fff
    classDef rules fill:#4c1d95,stroke:#a78bfa,color:#fff
    classDef inputs fill:#581c87,stroke:#c084fc,color:#fff
    classDef orchestrator fill:#1e3a8a,stroke:#60a5fa,color:#fff
    classDef agent fill:#1f2937,stroke:#d1d5db,color:#fff
    classDef synthesis fill:#374151,stroke:#f9fafb,color:#fff
    classDef output fill:#7f1d1d,stroke:#f97316,color:#fff
    class PR pr
    class Diff,Patterns,Context context
    class Plan planning
    class Rules rules
    class Inputs inputs
    class Orchestrator orchestrator
    class Correctness,Security,Tests,Architecture,Compliance agent
    class Synthesis synthesis
    class Workflow,Analytics output

Talk track:

Qodo is not just reading a diff. It combines intent, codebase context, governance rules, and specialist review agents before writing back into the PR workflow.

The sequence is:

  1. A developer opens an AI-assisted PR.
  2. Qodo gathers review inputs.
  3. The review orchestrator coordinates the review.
  4. Specialist agents review correctness, security, reliability, testability, architecture, performance, compliance, and ticket fit.
  5. Qodo synthesizes the findings into one PR workflow.

Why multi-agent review matters:

  • Code review is not one job.
  • Correctness, security, reliability, testability, architecture, performance, compliance, and ticket fit are different review responsibilities.
  • Separating review responsibilities lets Qodo produce focused findings.
  • Synthesis turns those findings back into one developer workflow.

Demo 1: review output

Open the Qodo review comment:

https://github.com/codium-ai/qodo-platform/pull/1768#issuecomment-4288244992

Talk track:

This is the first thing I want you to notice: Qodo is not returning one generic paragraph. It is producing a structured review. Findings are grouped with status, severity, and category. You can see bugs, rule violations, cross-repo conflicts, team insights, reliability, correctness, maintainability, and performance.

The important labels to call out:

  • Bug
  • Rule violation
  • Cross-repo conflict
  • Team insight
  • Action required
  • Remediation recommended
  • Resolved

The stage point:

The value is not just that Qodo found issues. The value is that it gives the team a review object they can work through inside the PR.
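
One way to picture that review object is as a list of labeled findings grouped by what the team has to do next. The sketch below is an assumption about shape only: `ReviewFinding` and `group_by_status` are hypothetical names, the split of the labels into category versus status is my reading of the comment, and the example titles are placeholders rather than findings from the demo PR.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed ordering: the most urgent status is shown first.
STATUS_ORDER = ("Action required", "Remediation recommended", "Resolved")


@dataclass(frozen=True)
class ReviewFinding:
    title: str
    category: str  # e.g. "Bug", "Rule violation", "Cross-repo conflict", "Team insight"
    status: str    # one of STATUS_ORDER


def group_by_status(findings):
    """Return findings bucketed by status, most urgent bucket first."""
    buckets = defaultdict(list)
    for finding in findings:
        buckets[finding.status].append(finding)
    return {status: buckets[status] for status in STATUS_ORDER if status in buckets}


findings = [
    ReviewFinding("Placeholder finding A", "Team insight", "Resolved"),
    ReviewFinding("Placeholder finding B", "Bug", "Action required"),
    ReviewFinding("Placeholder finding C", "Rule violation", "Action required"),
]
grouped = group_by_status(findings)
```

Because the findings are data rather than prose, the same list can be rendered as a PR comment, filtered by category, or counted for reporting.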

Demo 2: relevance and evidence

Stay on the same PR comment and open the Persona PATCH needs batching finding.

Talk track:

This is where the review becomes more interesting than a static analysis warning. Qodo is explaining relevance, prior context, and evidence.

Call out the parts of the finding:

  • Relevance: Medium
  • Similar prior PRs: PR-#1551 and PR-#1267
  • Recommendation generated from similar findings in past PRs
  • Evidence from backend code, unit tests, and frontend configuration behavior

The evidence text says the backend performs cross-field validation after merging a PATCH, and frontend configuration submits only dirty keys. That creates a likely partial-PATCH failure mode when a user flips auto-select without also touching persona_identifier.

The stage point:

This is what context-aware review should look like. The finding is not just "this line might be wrong." It connects code behavior, tests, frontend workflow, and prior team decisions.
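
The failure mode in the evidence text is easy to reproduce in miniature. The sketch below is illustrative only: `persona_identifier` comes from the finding, but the `auto_select` key name, the specific cross-field rule, and every function name are assumptions, and reading "batching" as "send the dependent field in the same PATCH" is my interpretation of the finding title.

```python
class ValidationError(Exception):
    """Raised when the merged configuration breaks a cross-field rule."""


def validate(config):
    # Hypothetical rule: auto-select and a pinned persona cannot coexist.
    if config["auto_select"] and config["persona_identifier"] is not None:
        raise ValidationError("clear persona_identifier when auto_select is on")


def apply_patch(stored, patch):
    """Backend: merge the PATCH into stored state, then validate the whole."""
    merged = {**stored, **patch}
    validate(merged)
    return merged


def dirty_keys(stored, form):
    """Frontend: submit only the keys the user actually changed."""
    return {key: value for key, value in form.items() if stored.get(key) != value}


stored = {"auto_select": False, "persona_identifier": "persona-123"}

# The user flips auto-select and touches nothing else.
partial_patch = dirty_keys(stored, {**stored, "auto_select": True})

try:
    apply_patch(stored, partial_patch)
    partial_result = "accepted"
except ValidationError:
    partial_result = "rejected"

# Batching the dependent field into the same PATCH keeps the merge valid.
batched_patch = {**partial_patch, "persona_identifier": None}
batched_result = apply_patch(stored, batched_patch)
```

Neither side is wrong in isolation: the backend validates correctly and the frontend sends a minimal payload. The bug only appears when the two behaviors meet, which is why a diff-only review would be unlikely to catch it.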

Governance

Move from the demo back to the general governance story.

Governance is review judgment made explicit.

Before governance:

  • Standards live in documents.
  • Senior reviewers remember edge cases.
  • Security guidance is applied inconsistently.
  • Repeated PR feedback stays informal.
  • Leadership cannot see what still merges unresolved.

After governance:

  • Define the rule once.
  • Scope it by org, team, repository, or path.
  • Set severity.
  • Enforce it during review.
  • Measure violations.
  • Tune low-signal rules.

The key line:

Governance is not a meeting after development. It is your standards showing up in every PR.
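
To make "define once, scope, set severity" concrete, here is a minimal sketch of a scoped rule and the lookup that decides which rules apply to a given PR. The `Rule` fields, the glob-based path matching, and the example rule names are all assumptions for illustration and do not reflect Qodo's actual rule schema.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str                    # assumed values: "error" or "warning"
    org: str
    team: Optional[str] = None       # None means "every team in the org"
    repo: Optional[str] = None       # None means "every repo"
    path_glob: Optional[str] = None  # None means "every path"

    def applies_to(self, org, team, repo, path):
        """A rule applies when every scope level it sets matches the change."""
        return (
            self.org == org
            and self.team in (None, team)
            and self.repo in (None, repo)
            and (self.path_glob is None or fnmatch(path, self.path_glob))
        )


def rules_for_change(rules, org, team, repo, paths):
    """Return the rules that apply to at least one changed path in the PR."""
    return [
        rule for rule in rules
        if any(rule.applies_to(org, team, repo, path) for path in paths)
    ]


rules = [
    Rule("no-raw-sql", "error", org="acme"),
    Rule("payments-needs-audit-log", "error", org="acme",
         repo="billing", path_glob="services/payments/*"),
    Rule("frontend-a11y", "warning", org="acme", team="web"),
]
matched = rules_for_change(
    rules, org="acme", team="platform", repo="billing",
    paths=["services/payments/api.py"],
)
```

The org-wide rule and the path-scoped rule both match this change, while the rule scoped to a different team does not, so reviewers only see standards that are relevant to the code in front of them.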

Rules engine lifecycle

Explain rules as an operating loop, not a policy document.

| Step | Action | Meaning |
|---|---|---|
| 1 | Discover | Find repeated review feedback. |
| 2 | Codify | Turn it into a clear rule. |
| 3 | Scope | Apply it by org, team, repo, or path. |
| 4 | Enforce | Surface it in every relevant PR. |
| 5 | Measure | Track passed, detected, and merged violations. |
| 6 | Tune | Adjust, disable, or retire noisy rules. |

The line to land:

Rules become governance when they are enforced, measured, and improved.

Suggested rules

Suggested rules are the bridge from existing team behavior into formal governance.

Qodo can detect recurring review patterns and suggest rules. Suggestions are not policy. They become governance when an admin reviews, scopes, and activates them.

flowchart LR
    Pattern["Recurring<br/>review pattern"]
    Suggestion["Suggested<br/>rule"]
    Review["Admin<br/>review"]
    Scope["Scope"]
    Activate["Activate"]
    Governance["Governance"]

    Pattern --> Suggestion --> Review --> Scope --> Activate --> Governance

Talk track:

AI can find the pattern, but the organization still decides what becomes policy.
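
The human gate in that flow can be sketched as a small state machine in which only an admin can move a suggestion forward and only an active rule is enforced. The stage names follow the diagram; the `REJECTED` stage, the `advance` function, and the string-based actor check are assumptions added for illustration.

```python
from enum import Enum


class Stage(Enum):
    SUGGESTED = "suggested"
    UNDER_REVIEW = "under_review"
    SCOPED = "scoped"
    ACTIVE = "active"
    REJECTED = "rejected"  # assumption: an admin review can also decline


# Each stage lists the only stages it may move to next.
ALLOWED = {
    Stage.SUGGESTED: {Stage.UNDER_REVIEW},
    Stage.UNDER_REVIEW: {Stage.SCOPED, Stage.REJECTED},
    Stage.SCOPED: {Stage.ACTIVE},
    Stage.ACTIVE: set(),
    Stage.REJECTED: set(),
}


def advance(current, target, actor):
    """Move a suggested rule one step forward; only admins may do so."""
    if actor != "admin":
        raise PermissionError("only an admin can promote a suggested rule")
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.name} to {target.name}")
    return target


def is_enforced(stage):
    """Suggestions are not policy: nothing is enforced until it is active."""
    return stage is Stage.ACTIVE
```

The table of allowed transitions is the point of the sketch: a suggestion cannot jump straight to enforcement without passing through review and scoping.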

Governance analytics

Governance only becomes credible when it is measurable.

The analytics framing is:

| Metric | Meaning |
|---|---|
| Passed | The rule was evaluated and passed. |
| Detected | A rule caught risk before merge. |
| Merged | The PR merged with unresolved risk. |
| Leadership signal | Merged violations show which risks still got through. |

Talk track:

Merged violations are the leadership signal. They answer a different question from developer productivity: which risks are still making it through the process?
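
As a rough sketch of how those three counts could be derived from per-PR rule evaluations: the record shape below is hypothetical, and treating the three outcomes as mutually exclusive is my assumption, since the talk track does not say whether a merged violation is also counted as detected.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleEvaluation:
    rule: str
    pr: int
    violated: bool
    resolved: bool  # violation fixed before the PR closed
    merged: bool    # the PR was merged


def classify(evaluation):
    """Map one rule evaluation to passed, detected, or merged."""
    if not evaluation.violated:
        return "passed"
    if evaluation.merged and not evaluation.resolved:
        return "merged"
    return "detected"


def summarize(evaluations):
    """Overall counts per outcome."""
    return Counter(classify(e) for e in evaluations)


def merged_by_rule(evaluations):
    """The leadership view: which rules are being merged past, and how often."""
    return Counter(e.rule for e in evaluations if classify(e) == "merged")


evaluations = [
    RuleEvaluation("no-raw-sql", pr=101, violated=False, resolved=False, merged=True),
    RuleEvaluation("no-raw-sql", pr=102, violated=True, resolved=True, merged=True),
    RuleEvaluation("audit-log", pr=102, violated=True, resolved=False, merged=True),
    RuleEvaluation("audit-log", pr=103, violated=True, resolved=False, merged=True),
]
summary = summarize(evaluations)
hotspots = merged_by_rule(evaluations)
```

Breaking merged violations down by rule is what turns the metric into a decision: a rule that is frequently merged past is either too noisy and needs tuning, or is a real risk that needs stronger enforcement.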

Trust: benchmarks

Once the workflow is clear, the trust question becomes:

Can we trust the reviewer?

Use the benchmark table as evidence, not as the main story. The agentic-review-benchmarks/benchmark-pr-mapping repository compares agentic code review tools on precision, recall, and F1.

| Agent | Precision | Recall | F1 |
|---|---|---|---|
| Qodo - Exhaustive | 63.8% | 56.7% | 60.1% |
| Qodo - Precise | 74.5% | 44.2% | 55.4% |
| Augment | 70.6% | 32.1% | 44.1% |
| Copilot | 50.1% | 37.4% | 42.8% |
| Cursor | 78.5% | 26.2% | 39.3% |
| Greptile | 68.5% | 27.2% | 39.0% |
| Codex | 83.0% | 24.3% | 37.6% |
| Coderabbit | 53.7% | 19.0% | 28.0% |
| Sentry | 85.3% | 13.8% | 23.7% |

Source: https://github.com/agentic-review-benchmarks/benchmark-pr-mapping

Talk track:

Benchmarks help compare review systems, but they do not replace fit with your workflow, rules, codebase, and risk model.
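
For anyone reading the table on stage, it helps to remember that F1 is the harmonic mean of precision and recall, so a tool with very high precision but low recall still scores low. The snippet below recomputes F1 for three rows of the table using the standard formula; the function and variable names are mine, and small differences from the listed values are expected, presumably because the published precision and recall are already rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, listed F1 %) copied from the table above.
rows = {
    "Qodo - Exhaustive": (63.8, 56.7, 60.1),
    "Codex": (83.0, 24.3, 37.6),
    "Sentry": (85.3, 13.8, 23.7),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in rows.items()}

for name, (_, _, listed) in rows.items():
    print(f"{name}: listed {listed:.1f}, recomputed {recomputed[name]:.1f}")
```

The harmonic mean is dominated by the smaller of the two numbers, which is why the highest-precision rows in the table are not the highest-F1 rows.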

Build vs buy

End on build vs buy. Keep it short.

| Capability | Build | Qodo |
|---|---|---|
| PR review agent | Platform-owned | Productized |
| Rule enforcement | Custom system | Built into review |
| Multi-repo context | Hard to maintain | Context engine |
| Governance analytics | Separate reporting | Portal metrics |
| Time to value | Months | Days |

The 10-minute talk track

| Time | Segment | Talk track |
|---|---|---|
| 0:00-0:45 | Opening | AI did not remove the bottleneck. It moved it from writing code to trusting code. |
| 0:45-2:00 | The shift | Code generation, PR size, review time, and budget are all moving. The PR is the pressure point. |
| 2:00-3:00 | Architecture | Qodo combines PR context, planning intent, rules, multi-repo context, specialist agents, and synthesis. |
| 3:00-5:00 | Demo 1 | Show the structured Qodo review output: bugs, rule violations, cross-repo conflict, team insight, action required. |
| 5:00-6:30 | Demo 2 | Open one finding and show relevance, prior PRs, and evidence across backend, tests, and frontend workflow. |
| 6:30-8:15 | Governance | Rules turn repeated review judgment into enforceable, scoped, measurable behavior. |
| 8:15-9:15 | Analytics and trust | Merged violations show which risks still got through. Benchmarks help evaluate reviewer quality. |
| 9:15-10:00 | Build vs buy | The hard part is not prototyping a reviewer. It is operating governed review across teams, repos, rules, and workflows. |

References

  • Demo PR comment: https://github.com/codium-ai/qodo-platform/pull/1768#issuecomment-4288244992
  • Agentic review benchmark: https://github.com/agentic-review-benchmarks/benchmark-pr-mapping
  • Qodo governance hub: https://docs.qodo.ai/governance/governance-hub
  • Qodo rule enforcement: https://docs.qodo.ai/code-review/get-started/rule-enforcement
  • Qodo suggested rules: https://docs.qodo.ai/code-review/get-started/rule-enforcement/suggested-rules
  • Qodo rule analytics: https://docs.qodo.ai/qodo-documentation/code-review/get-started/rule-enforcement/analytics