Agentic Code Review: AI-Driven Code Reviews

Our team has been using AI agents for code review for six months: not simple lint rules, but deep reviews that try to understand business logic. Here is what we have learned.

What is Agentic Code Review

Traditional lint: checks syntax, formatting, known rules
AI Review: understands code intent, finds logic issues, suggests architectural improvements

Agentic = AI agent autonomously:
  1. Reads relevant code context
  2. Calls tools to gather more information
  3. Performs multi-step reasoning
  4. Generates structured review comments
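
The four steps above can be sketched as a small loop. This is only an illustration, not our production agent: `ModelStep`, `Tool`, `Model`, and `runAgent` are hypothetical names, and the model is injected as a plain function so the loop can run without calling any API.

```typescript
// Hypothetical types for illustration; a real agent would wrap an LLM API here.
type ToolCall = { kind: "tool"; name: string; input: string };
type Final = { kind: "final"; comments: string[] };
type ModelStep = ToolCall | Final;

type Tool = (input: string) => Promise<string>;
type Model = (transcript: string[]) => Promise<ModelStep>;

async function runAgent(
  model: Model,
  tools: Record<string, Tool>,
  diff: string,
  maxSteps = 5,
): Promise<string[]> {
  // Step 1: start from the code under review.
  const transcript: string[] = [`diff:\n${diff}`];

  for (let step = 0; step < maxSteps; step++) {
    // Step 3: the model reasons over everything gathered so far.
    const next = await model(transcript);

    // Step 4: the model has enough information and returns its comments.
    if (next.kind === "final") return next.comments;

    // Step 2: run the requested tool and feed the result back in.
    const tool = tools[next.name];
    const output = tool ? await tool(next.input) : `unknown tool: ${next.name}`;
    transcript.push(`tool ${next.name}(${next.input}) -> ${output}`);
  }

  // Stop after maxSteps rather than loop forever.
  return [];
}
```

The step cap is a design choice: without it, a model that keeps requesting tools would never finish the review.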

Implementation

// scripts/ai-review.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

interface ReviewResult {
  file: string;
  line: number;
  severity: "error" | "warning" | "suggestion";
  category: string;
  message: string;
  suggestion?: string;
}

async function reviewPR(
  diff: string,
  context: string,
): Promise<ReviewResult[]> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 8192,
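    system: `You are a senior frontend engineer responsible for code review.
Review dimensions:
1. Correctness: logic errors, edge cases
2. Security: XSS, CSRF, data leaks
3. Performance: unnecessary re-renders, large lists, memory leaks
4. Maintainability: naming, abstraction, coupling
5. TypeScript: type safety, use of any

Output format: a JSON array in which each element contains file, line, severity, category, message, suggestion`,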
    messages: [
      {
        role: "user",
        content: `Please review the following code changes:

## Context (related files)
${context}

## Diff
${diff}`,
      },
    ],
  });

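  // The reply is a list of content blocks; the JSON is expected in the first one.
  const content = response.content[0];
  if (content?.type === "text") {
    try {
      return JSON.parse(content.text);
    } catch {
      // The model may wrap the JSON in prose or a code fence.
      console.warn("AI review reply was not valid JSON; reporting no findings");
    }
  }
  return [];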
}
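
`JSON.parse` on a raw model reply is fragile: the reply may be wrapped in a Markdown fence or a sentence of prose, and individual items may be missing fields. The sketch below shows one way to harden that step. `isReviewResult` and `parseReviewResults` are hypothetical helpers, not part of the Anthropic SDK, and the bracket search is a simple heuristic that assumes the surrounding prose contains no square brackets of its own.

```typescript
interface ReviewResult {
  file: string;
  line: number;
  severity: "error" | "warning" | "suggestion";
  category: string;
  message: string;
  suggestion?: string;
}

const SEVERITIES = ["error", "warning", "suggestion"];

// Runtime check that an unknown value has the shape of a ReviewResult.
function isReviewResult(value: unknown): value is ReviewResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.file === "string" &&
    typeof v.line === "number" &&
    Number.isInteger(v.line) &&
    v.line > 0 &&
    typeof v.severity === "string" &&
    SEVERITIES.includes(v.severity) &&
    typeof v.category === "string" &&
    typeof v.message === "string" &&
    (v.suggestion === undefined || typeof v.suggestion === "string")
  );
}

// Hypothetical helper: tolerate prose or a Markdown fence around the JSON array.
function parseReviewResults(text: string): ReviewResult[] {
  const start = text.indexOf("[");
  const end = text.lastIndexOf("]");
  if (start === -1 || end <= start) return [];
  try {
    const parsed: unknown = JSON.parse(text.slice(start, end + 1));
    // Keep well-formed items and silently drop the rest.
    return Array.isArray(parsed) ? parsed.filter(isReviewResult) : [];
  } catch {
    return []; // Not valid JSON: report nothing rather than crash the CI job.
  }
}
```

Dropping malformed items instead of throwing means one bad entry does not discard the rest of the review.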

CI Integration

# .github/workflows/ai-review.yml
name: AI Code Review
on:
  pull_request:
    types: [opened, synchronize]

permissions:
  contents: read # needed by actions/checkout once permissions are restricted
  pull-requests: write

jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get changed files
        id: changes
        run: |
          DIFF=$(git diff origin/${{ github.base_ref }}...HEAD)
          echo "diff<<EOF" >> $GITHUB_OUTPUT
          echo "$DIFF" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

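      - name: Run AI Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          # Assumption: the script reads the diff from PR_DIFF and writes review-results.json.
          PR_DIFF: ${{ steps.changes.outputs.diff }}
        run: npx tsx scripts/ai-review.ts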

      - name: Post Review Comments
        uses: actions/github-script@v7
        with:
          script: |
            const review = require('./review-results.json');
            for (const item of review) {
              await github.rest.pulls.createReviewComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                pull_number: context.issue.number,
                body: `**[${item.severity.toUpperCase()}]** ${item.message}\n\n${item.suggestion || ''}`,
                commit_id: context.payload.pull_request.head.sha,
                path: item.file,
                line: item.line,
              });
            }
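
GitHub only accepts review comments on lines that are part of the pull request diff, so a finding that points at any other line makes the `createReviewComment` call fail. One way to guard against that is to filter the results against the diff before posting. The sketch below parses a unified diff for that purpose; `commentableLines` and `keepPostable` are hypothetical helpers, and the parser assumes standard `git diff` output.

```typescript
// Collect, per file, the new-side line numbers that appear in a unified diff.
function commentableLines(diff: string): Map<string, Set<number>> {
  const byFile = new Map<string, Set<number>>();
  let lines: Set<number> | undefined;
  let inHunk = false;
  let newLine = 0;

  for (const row of diff.split("\n")) {
    if (row.startsWith("diff --git ")) {
      // A new file section begins; its headers follow.
      inHunk = false;
      lines = undefined;
    } else if (!inHunk && row.startsWith("+++ ")) {
      // "+++ b/src/app.ts" names the new version; "+++ /dev/null" is a deletion.
      const path = row.slice(4).replace(/^b\//, "");
      lines = path === "/dev/null" ? undefined : new Set<number>();
      if (lines) byFile.set(path, lines);
    } else if (row.startsWith("@@")) {
      // "@@ -10,4 +12,6 @@": the number after "+" is the first new-side line.
      const match = /^@@ -\d+(?:,\d+)? \+(\d+)/.exec(row);
      inHunk = match !== null;
      if (match) newLine = Number(match[1]);
    } else if (inHunk && lines) {
      if (row.startsWith("+") || row.startsWith(" ")) {
        lines.add(newLine);
        newLine++;
      }
      // "-" rows exist only on the old side, so the counter does not advance.
    }
  }
  return byFile;
}

// Keep only the findings whose file and line can receive a review comment.
function keepPostable<T extends { file: string; line: number }>(
  items: T[],
  diff: string,
): T[] {
  const allowed = commentableLines(diff);
  return items.filter((item) => allowed.get(item.file)?.has(item.line) ?? false);
}
```

Findings that are filtered out could still be posted as a single summary comment on the PR instead of being dropped.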

Review Rule Examples

// Typical issues AI can catch:

// 1. Memory leak
function Dashboard() {
  const [data, setData] = useState([]);

  useEffect(() => {
    // Problem: no cleanup, so state is still updated after the component unmounts
    fetch("/api/data")
      .then((r) => r.json())
      .then(setData);
  }, []);
}

// AI suggestion:
// "Async operations in useEffect need a cleanup.
//  Use AbortController to cancel the request."

// 2. Performance issue
function UserList({ users }: { users: User[] }) {
  return (
    <div>
      {users.map((user) => (
        // Problem: a new function is created on every render
        <UserCard
          key={user.id}
          user={user}
          onClick={() => navigate(`/users/${user.id}`)}
        />
      ))}
    </div>
  );
}

// AI suggestion:
// "onClick creates a new function on every render.
//  If UserCard uses memo, this will cause unnecessary re-renders."

// 3. Type safety
async function getUser(id: string) {
  const res = await fetch(`/api/users/${id}`);
  const data = await res.json(); // Problem: typed as any
  return data; // Missing return type
}

// AI suggestion:
// "Add return type User, validate the response with zod or io-ts."
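
Suggestions 1 and 3 can be applied together. The sketch below is one possible fix, not the exact code the AI proposed: the `User` shape is an assumption (the real type is not shown in this article), and a hand-written type guard stands in for zod or io-ts so the example has no dependencies.

```typescript
// Assumed shape; the real User type is not shown in this article.
interface User {
  id: string;
  name: string;
}

// Runtime check, so the return type is backed by validation rather than a cast.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.id === "string" && typeof v.name === "string";
}

// Accepts an optional AbortSignal so that callers can cancel the request.
async function getUser(id: string, signal?: AbortSignal): Promise<User> {
  const res = await fetch(`/api/users/${encodeURIComponent(id)}`, { signal });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  const data: unknown = await res.json();
  if (!isUser(data)) throw new Error("Response is not a valid User");
  return data;
}

// Inside a component, the effect creates the controller and aborts on cleanup:
//   useEffect(() => {
//     const controller = new AbortController();
//     getUser(id, controller.signal).then(setUser, (err) => {
//       if (err.name !== "AbortError") console.error(err);
//     });
//     return () => controller.abort();
//   }, [id]);
```

Typing `data` as `unknown` forces the validation step: the compiler will not let the function return it as a `User` until the guard has passed.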

Review Effectiveness

3-month tracking data:
  Issues found by AI:        847
  Confirmed valid by humans: 623 (73.6%)
  False positives:           224 (26.4%)

  Most effective categories:
    1. TypeScript type issues    (92% valid)
    2. Security vulnerabilities  (88% valid)
    3. Performance issues        (75% valid)
    4. Code style                (70% valid)
    5. Architecture suggestions  (55% valid)

  Human supplemental reviews:
    Issues caught by humans but missed by AI: 15%
    (mainly business logic and requirement understanding)
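
The "valid" percentages are plain ratios: findings confirmed by a human divided by all findings the AI raised in that category. The sketch below shows how such a tally could be computed from labeled findings. `LabeledFinding` and `validRateByCategory` are hypothetical names, and this is not the script we used for the numbers above.

```typescript
interface LabeledFinding {
  category: string;
  confirmed: boolean; // true if a human reviewer agreed with the AI
}

// Returns, per category, the fraction of AI findings that humans confirmed.
function validRateByCategory(findings: LabeledFinding[]): Map<string, number> {
  const totals = new Map<string, { valid: number; all: number }>();
  for (const f of findings) {
    const t = totals.get(f.category) ?? { valid: 0, all: 0 };
    t.all += 1;
    if (f.confirmed) t.valid += 1;
    totals.set(f.category, t);
  }

  const rates = new Map<string, number>();
  for (const [category, t] of totals) {
    rates.set(category, t.valid / t.all);
  }
  return rates;
}
```

Tracking the rate per category rather than overall makes it clear which kinds of comments are worth keeping and which mostly add noise.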

Summary

Agentic code review doesn't replace human review; it acts as a first filter.
AI is good at finding technical issues (types, security, performance) but weak on business logic.
Set a reasonable severity threshold so that review comments don't turn into noise.
Treat AI review comments in a PR as a reference; humans have the final say.
Keep refining the prompts to bring the false positive rate down.
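
A severity threshold can be as small as a rank table and a filter. The sketch below assumes the three severity values from the `ReviewResult` interface; `RANK` and `atOrAbove` are hypothetical names.

```typescript
type Severity = "error" | "warning" | "suggestion";

// Higher rank means more severe.
const RANK: Record<Severity, number> = { suggestion: 0, warning: 1, error: 2 };

// Keep only the items whose severity is at or above the given threshold.
function atOrAbove<T extends { severity: Severity }>(
  items: T[],
  threshold: Severity,
): T[] {
  return items.filter((item) => RANK[item.severity] >= RANK[threshold]);
}
```

For example, `atOrAbove(results, "warning")` keeps errors and warnings and drops suggestion-level comments before they are posted to the PR.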
