
Building Production-Grade Security Platforms with AI

The State of AI in Application Security

Application security tooling has been stuck in a frustrating pattern for years. Static analysis tools generate hundreds of findings, the majority of which are false positives. Dynamic scanners catch surface-level issues but miss business logic flaws. Manual penetration testing is expensive and infrequent. Development teams either drown in alerts they don’t trust or ignore security tooling entirely.

HARBOR SOFTWARE · Engineering Insights

When we set out to build VibeGuard, we weren’t trying to replace any of these tools. We were trying to answer a specific question that none of them handle well: given a code change in a pull request, what are the security implications that a human reviewer should pay attention to?

Not a list of CVEs. Not a compliance checklist. A focused, contextual analysis that understands what the code is doing and identifies the specific patterns that introduce risk. That’s what VibeGuard does, and building it taught us more about the intersection of AI and security than any conference talk ever could.

Why Pull Requests Are the Right Scope

Early in the design process, we debated whether VibeGuard should scan entire repositories or focus on pull requests. The full-repo approach is what most SAST tools do, and it’s part of why they produce so much noise. Scanning a mature codebase will always surface hundreds of findings — some real, most contextual, all requiring triage time that developers don’t have.

Pull requests are different. A PR is a bounded unit of change with clear intent. The developer is actively working on it. They have context. They’re already in review mode. If you can surface security-relevant observations during review, the cost of addressing them is orders of magnitude lower than finding them in a quarterly scan.

We also realized that PR-level analysis gives us something repository scanning doesn’t: the diff as a signal. When a developer changes an authentication flow, adds a new API endpoint, or modifies input validation logic, those changes carry implicit risk that can be identified from the diff alone. You don’t need to understand the entire codebase — you need to understand what changed and why it matters.

There’s a third benefit we didn’t anticipate: PR-level analysis creates a natural feedback loop. Developers see the analysis immediately, while the code is fresh in their minds. They can confirm, dismiss, or correct findings in seconds. That feedback is gold for improving the system over time.

Architecture of VibeGuard

VibeGuard is structured as a GitHub App that triggers on pull request events. The pipeline has four stages, each designed to be independently testable and replaceable.

Stage 1: Diff Extraction and Enrichment

When a PR is opened or updated, VibeGuard fetches the diff via the GitHub API. But raw diffs lack context. A line that adds user_input to a SQL query is meaningless without knowing what user_input is and where it came from.

We enrich the diff by fetching surrounding context — typically 50 lines above and below each changed hunk — and resolving imports and function definitions that are referenced in the changed code. This gives the analysis enough context to reason about data flow without requiring a full codebase index.
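The context-window part of enrichment is simple to sketch. This is a minimal illustration, not VibeGuard's actual API — `surroundingContext` and its parameters are assumed names for the full file split into lines plus the 0-based bounds of a changed hunk:

```typescript
// Sketch: expand a changed hunk with up to 50 lines of context on each side.
// `fileLines`, `start`, and `end` are illustrative names, not VibeGuard's API.
const CONTEXT_RADIUS = 50;

function surroundingContext(fileLines: string[], start: number, end: number): string {
  // Clamp the window to the file boundaries.
  const from = Math.max(0, start - CONTEXT_RADIUS);
  const to = Math.min(fileLines.length, end + CONTEXT_RADIUS + 1);
  return fileLines.slice(from, to).join('\n');
}
```

Import and definition resolution is the expensive half of the stage; the window extraction above is essentially free.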

interface EnrichedHunk {
  filePath: string;
  language: string;
  changedLines: DiffLine[];
  surroundingContext: string;
  resolvedImports: ResolvedImport[];
  functionSignatures: FunctionSig[];
  fileClassification: FileType;
}

type FileType = 'auth' | 'api' | 'database' | 'config' | 'middleware' 
  | 'validation' | 'ui' | 'test' | 'infrastructure' | 'other';

The fileClassification field is important. We classify each changed file based on its path and content patterns. A file in /api/routes/ that exports Express handlers is classified as api. A file that imports bcrypt or jwt is classified as auth. A Terraform or CloudFormation file is classified as infrastructure. This classification determines which security rules are most relevant to apply.

Classification uses a combination of path-based heuristics and content-based detection. The path heuristics are fast and handle 80% of cases (files in /middleware/ are middleware, files in /test/ or /__tests__/ are tests). For ambiguous files, we scan imports and function patterns. A file that calls db.query() or uses a Prisma client is classified as database regardless of where it lives in the directory tree.
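The two-tier approach can be sketched as follows. The specific path and content patterns here are illustrative examples of the heuristics described above, not VibeGuard's full rule set:

```typescript
// Sketch of two-tier file classification: cheap path heuristics first,
// content-based detection for ambiguous files. Patterns are illustrative.
type FileType = 'auth' | 'api' | 'database' | 'config' | 'middleware'
  | 'validation' | 'ui' | 'test' | 'infrastructure' | 'other';

function classifyFile(filePath: string, content: string): FileType {
  // Tier 1: path-based heuristics (fast, handle the common cases).
  if (/\/(test|tests|__tests__)\//.test(filePath)) return 'test';
  if (/\/middleware\//.test(filePath)) return 'middleware';
  if (/\.tf$/.test(filePath)) return 'infrastructure';
  if (/\/api\/routes\//.test(filePath)) return 'api';

  // Tier 2: content-based detection, regardless of directory.
  if (/\bdb\.query\s*\(|@prisma\/client/.test(content)) return 'database';
  if (/\b(bcrypt|jsonwebtoken)\b/.test(content)) return 'auth';
  return 'other';
}
```

The ordering matters: path rules win when they fire, so a query helper inside `/test/` stays classified as a test file.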

Stage 2: Pattern-Based Pre-Filter

Before sending anything to an LLM, we run a fast pattern-based pre-filter. This catches the low-hanging fruit: hardcoded secrets, known dangerous functions (like eval() or dangerouslySetInnerHTML), disabled security headers, and dependency changes that introduce known vulnerable packages.

const securityPatterns: SecurityPattern[] = [
  {
    id: 'hardcoded-secret',
    pattern: /(?:api[_-]?key|secret|password|token)\s*[:=]\s*['"][^'"]{8,}['"]/gi,
    severity: 'critical',
    message: 'Potential hardcoded secret detected',
    fileTypes: ['all'],
  },
  {
    id: 'sql-concat',
    pattern: /(?:query|execute|raw)\s*\(\s*[`'"].*\$\{/gi,
    severity: 'high',
    message: 'Possible SQL injection via string interpolation',
    fileTypes: ['database', 'api', 'other'],
  },
  {
    id: 'eval-usage',
    pattern: /\beval\s*\(/g,
    severity: 'high',
    message: 'Use of eval() — verify input is trusted',
    fileTypes: ['api', 'middleware', 'other'],
  },
  {
    id: 'cors-wildcard',
    pattern: /origin:\s*['"]\*['"]/gi,
    severity: 'medium',
    message: 'CORS configured with wildcard origin',
    fileTypes: ['config', 'middleware', 'api'],
  },
  {
    id: 'disabled-csrf',
    pattern: /csrf\s*:\s*false/gi,
    severity: 'high',
    message: 'CSRF protection appears to be disabled',
    fileTypes: ['config', 'middleware'],
  },
  // ... 40+ additional patterns
];

This pre-filter serves two purposes. First, it catches definitive issues without spending LLM tokens. A hardcoded API key is a hardcoded API key — you don’t need GPT-4 to identify it. Second, it provides signal to the AI analysis. If the pre-filter flags a potential SQL injection, the LLM analysis can focus on confirming whether the input is actually user-controlled and whether parameterization is used elsewhere in the same function.

The pre-filter also respects file classifications. We don’t flag eval() in test files because test code legitimately uses dynamic evaluation. We don’t flag hardcoded strings in fixture files. This classification-aware filtering eliminates roughly 30% of potential false positives before the AI analysis even begins.
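Classification-aware filtering amounts to checking each pattern's `fileTypes` list before running it. A rough sketch, using the `SecurityPattern` shape shown above (`applyPreFilter` itself is an assumed name):

```typescript
// Sketch: run only the patterns whose fileTypes include this file's
// classification (or 'all'). Mirrors the SecurityPattern shape above.
interface SecurityPattern {
  id: string;
  pattern: RegExp;
  severity: string;
  message: string;
  fileTypes: string[];
}

interface PreFilterHit { id: string; line: number; message: string }

function applyPreFilter(
  patterns: SecurityPattern[],
  fileType: string,
  source: string,
): PreFilterHit[] {
  const hits: PreFilterHit[] = [];
  const lines = source.split('\n');
  for (const p of patterns) {
    // Skip rules that don't apply to this file classification.
    if (!p.fileTypes.includes('all') && !p.fileTypes.includes(fileType)) continue;
    lines.forEach((line, i) => {
      p.pattern.lastIndex = 0; // patterns carry the /g flag; reset between tests
      if (p.pattern.test(line)) {
        hits.push({ id: p.id, line: i + 1, message: p.message });
      }
    });
  }
  return hits;
}
```

Note the `lastIndex` reset: a global regex reused across `test()` calls resumes matching from its previous position, which would silently skip hits.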

Stage 3: AI-Powered Contextual Analysis

This is the core of VibeGuard. Each enriched hunk, along with its pre-filter results and file classification, is sent to an LLM for contextual security analysis. We use Claude for this stage because of its strong performance on code reasoning tasks and its ability to follow nuanced instructions without over-flagging.

The prompt engineering here was the hardest part of the entire project. Security analysis requires precision. A false positive erodes trust. A false negative is a missed vulnerability. We iterated on the system prompt for months, and the key breakthrough was framing the task as risk identification, not vulnerability detection.

The distinction matters. “This code has a SQL injection vulnerability” is a binary claim that’s often wrong when made without full codebase context. “This code constructs a SQL query using a variable that appears to originate from user input, which creates SQL injection risk if the input is not validated upstream” is a contextual observation that’s almost always useful, even when the developer has already handled validation elsewhere. The first statement demands certainty. The second communicates risk. Developers respond much better to the second framing.

const analysisPrompt = `You are a senior security engineer reviewing a pull request.
Your job is to identify security-relevant changes and explain their implications.

Rules:
1. Focus on what CHANGED, not pre-existing issues in surrounding code
2. Classify findings as: critical, high, medium, informational
3. For each finding, explain the SPECIFIC risk and a SPECIFIC remediation
4. Do NOT flag standard framework usage as insecure (e.g., Express middleware, React hooks)
5. Consider the file classification when assessing relevance
6. If a pre-filter flagged something, confirm or dismiss with context
7. If a pattern looks suspicious but you lack context to confirm, classify as informational
8. Test files should only be flagged for hardcoded secrets, never for code patterns
9. Output structured JSON matching the FindingSchema

File: ${hunk.filePath} (classified as: ${hunk.fileClassification})
Language: ${hunk.language}
Pre-filter results: ${JSON.stringify(preFilterResults)}

Changed code with context:
${hunk.surroundingContext}

Diff (lines prefixed with + are additions, - are removals):
${formatDiff(hunk.changedLines)}`;
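The prompt asks for output matching a `FindingSchema`, which the article doesn't show. One plausible shape, given the fields the surrounding stages use — every field name here is an assumption:

```typescript
// Hypothetical sketch of the FindingSchema the prompt refers to.
// The actual schema is not shown; all field names are assumptions.
interface Finding {
  patternId: string;        // links back to a pre-filter pattern where one fired
  severity: 'critical' | 'high' | 'medium' | 'informational';
  confidence: number;       // 0..1, consumed by later confidence thresholding
  filePath: string;
  lineRange: [number, number];
  risk: string;             // the SPECIFIC risk (prompt rule 3)
  remediation: string;      // the SPECIFIC remediation (prompt rule 3)
}

// Minimal runtime guard for validating the model's JSON output.
function isFinding(x: unknown): x is Finding {
  const f = x as Finding;
  return typeof f?.patternId === 'string'
    && ['critical', 'high', 'medium', 'informational'].includes(f?.severity)
    && typeof f?.confidence === 'number'
    && typeof f?.risk === 'string';
}
```

Validating the model's output against a schema like this is what makes the downstream deduplication and severity-adjustment stages safe to automate.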

Stage 4: Deduplication and Reporting

The AI analysis runs per-hunk, which means a PR with changes across multiple files can produce overlapping findings. A developer adding a new API endpoint might trigger findings about missing authentication, missing rate limiting, and missing input validation — all of which might reference the same root cause.

Stage 4 deduplicates findings using both exact matching (same file, same line range, same finding type) and semantic matching (findings from different files that describe the same underlying risk). It then ranks findings by severity and confidence and generates the final report.
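The exact-match half of deduplication is the easy part; a sketch (the semantic half, which compares risk descriptions across files, is not shown, and these names are illustrative):

```typescript
// Sketch of exact-match deduplication: findings sharing file, line range,
// and finding type collapse to one, keeping the first occurrence.
interface DedupableFinding {
  filePath: string;
  lineRange: [number, number];
  patternId: string;
}

function dedupeExact<T extends DedupableFinding>(findings: T[]): T[] {
  const seen = new Map<string, T>();
  for (const f of findings) {
    const key = `${f.filePath}:${f.lineRange[0]}-${f.lineRange[1]}:${f.patternId}`;
    if (!seen.has(key)) seen.set(key, f);
  }
  return Array.from(seen.values());
}
```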

We put significant effort into the report format. Early versions were verbose and developers skipped them. The current format leads with a one-line summary (“2 high-risk findings, 1 informational note”), followed by collapsible sections for each finding. Each finding includes the specific code, the risk explanation, and a concrete fix suggestion with sample code. Developers told us repeatedly: if you can’t show me the fix, don’t show me the problem.

False Positive Management

The single biggest challenge in building VibeGuard was false positive management. Developers will disable any tool that cries wolf. Our target was a false positive rate below 15%, and achieving that required several mechanisms working together.

First, the file classification system filters out irrelevant checks. There’s no point analyzing a CSS file for SQL injection. There’s no point flagging dangerouslySetInnerHTML in a Storybook story. This alone eliminated about 30% of false positives from our initial prototype.

Second, the pre-filter and AI analysis work as a two-pass system. The pre-filter casts a wide net, and the AI analysis narrows it with context. A regex match on password = " might fire on a test fixture, a type definition, or an environment variable placeholder. The AI analysis can usually determine from context whether it’s a real credential.

Third, we implemented a feedback loop. When a developer dismisses a finding with a reason (“this is a test file”, “input is validated in middleware”, “false positive — this is a type definition”), that feedback is stored and used to refine future analyses. After 500+ feedback events, we added a post-processing step that automatically downgrades findings matching common false positive patterns.

const adjustSeverity = (finding: Finding, feedbackHistory: Feedback[]): Finding => {
  const similarDismissals = feedbackHistory.filter(
    fb => fb.patternId === finding.patternId 
      && fb.action === 'dismiss'
      && fb.reason !== 'wont-fix' // wont-fix means "real issue, just not fixing now"
  );
  
  const totalForPattern = feedbackHistory.filter(
    fb => fb.patternId === finding.patternId
  ).length;
  
  const dismissalRate = similarDismissals.length / (totalForPattern || 1);

  if (dismissalRate > 0.7 && totalForPattern >= 5) {
    return { ...finding, severity: 'informational', 
      note: 'Auto-downgraded: frequently dismissed by reviewers in similar contexts' };
  }
  return finding;
};

Fourth, and this was non-obvious, we added a confidence threshold. The AI analysis assigns a confidence score to each finding. Findings below 0.6 confidence are silently dropped. Findings between 0.6 and 0.8 are shown as “informational” regardless of their severity classification. Only findings above 0.8 confidence are shown at their classified severity. This prevents the system from making bold claims about issues it’s uncertain about.
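The gate described above can be sketched in a few lines. The treatment of exactly 0.8 is a guess, since the article only says "above 0.8":

```typescript
// Sketch of the confidence gate: drop below 0.6, downgrade 0.6–0.8 to
// informational, keep higher-confidence findings at classified severity.
type Severity = 'critical' | 'high' | 'medium' | 'informational';

interface Gated { severity: Severity; confidence: number }

function gateByConfidence(finding: Gated): Gated | null {
  if (finding.confidence < 0.6) return null; // silently dropped
  if (finding.confidence < 0.8) return { ...finding, severity: 'informational' };
  return finding; // shown at its classified severity
}
```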

What VibeGuard Actually Catches

After six months in production across our internal projects and several client codebases, the findings that provide the most value fall into clear categories:

  • Authentication and authorization gaps: New API endpoints that forget to apply auth middleware. This is the most common real finding — developers add a route and forget to wrap it in the auth check that every other route has. VibeGuard catches this by noticing that other routes in the same file use requireAuth middleware and the new route doesn’t.
  • Input validation regressions: Changes to validation schemas that widen accepted input. A developer updating a Zod schema to make a field optional doesn’t always realize they’re also making it bypassable. VibeGuard flags when .optional() or .nullable() is added to fields that were previously required.
  • Dependency risk: New packages with known vulnerabilities, packages with unusually broad permissions in their package.json, or packages that haven’t been updated in years. We integrate with the OSV database for known CVEs and npm audit data for advisory information.
  • Secret exposure: This still happens more often than you’d expect. Not just API keys in code, but configuration files that reference environment variables by the wrong name (which means the fallback — often a hardcoded default — gets used in production). Also .env files accidentally added to commits when .gitignore is misconfigured.
  • Logging sensitive data: Debug logging statements that include request bodies, tokens, or user data. These often survive code review because the log statement itself looks harmless. console.log('Request:', req.body) in a login handler logs every password that passes through it.
  • Insecure defaults: New configuration options that default to the less secure choice. CORS set to * in development configs that might make it to production. Session cookies without secure or httpOnly flags.
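The first category, missing auth middleware, is fundamentally a consistency check against sibling routes in the same file. A rough sketch of that idea, with `requireAuth` and the route regex as illustrative stand-ins for whatever convention a given codebase uses:

```typescript
// Sketch: if sibling routes in a file apply an auth middleware and a route
// doesn't, that route is suspicious. Names and patterns are illustrative.
function routesMissingAuth(fileLines: string[]): string[] {
  const routeRe = /\brouter\.(get|post|put|patch|delete)\s*\(/;
  const routes = fileLines.filter((l) => routeRe.test(l));
  const withAuth = routes.filter((l) => l.includes('requireAuth'));
  // Only meaningful when auth middleware is the file's prevailing convention.
  if (withAuth.length === 0) return [];
  return routes.filter((l) => !l.includes('requireAuth'));
}
```

In VibeGuard this reasoning happens inside the LLM analysis using the enriched context, not as a standalone regex pass, but the underlying signal is the same.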

Performance and Cost

VibeGuard needs to be fast enough that it doesn’t bottleneck the review process. Our target was under 60 seconds for a typical PR (10-30 changed files). The pre-filter stage runs in under 2 seconds. The enrichment stage takes 3-5 seconds depending on the number of imports to resolve. The AI analysis is the bottleneck — we parallelize across hunks, but each LLM call takes 2-8 seconds depending on the context size.

For a 20-file PR, total wall-clock time is typically 15-25 seconds. For large PRs (100+ files), we batch and prioritize, analyzing high-risk file types first and delivering incremental results. The GitHub comment updates as new findings come in, so reviewers can start reading findings while the analysis is still running on lower-priority files.
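Prioritizing by file classification is a simple sort over a risk ordering. A sketch, with an illustrative ordering rather than VibeGuard's exact ranking:

```typescript
// Sketch of large-PR prioritization: analyze high-risk classifications
// first so early findings reach the reviewer while lower-priority files
// are still queued. The ordering itself is an illustrative assumption.
const riskOrder = [
  'auth', 'api', 'database', 'middleware', 'validation',
  'config', 'infrastructure', 'ui', 'other', 'test',
];

function byRiskPriority<T extends { fileType: string }>(files: T[]): T[] {
  return [...files].sort(
    (a, b) => riskOrder.indexOf(a.fileType) - riskOrder.indexOf(b.fileType),
  );
}
```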

Cost-wise, the AI analysis runs roughly $0.02-0.08 per PR depending on size. At scale (hundreds of PRs per month), this is negligible compared to the cost of a single security incident that could have been caught in review. One client estimated that a single authentication bypass that VibeGuard caught before production would have cost $50,000+ in incident response and customer notification alone.

Integration With Our Other Tools

VibeGuard doesn’t exist in isolation. When we deploy it alongside Agent2WP for WordPress projects, it automatically applies WordPress-specific security checks: nonce verification in form handlers, capability checks before data modification, proper escaping of output using esc_html(), esc_attr(), and wp_kses(). The WordPress security landscape has its own vocabulary, and VibeGuard’s file classification system recognizes WordPress plugin and theme structures to apply the appropriate rules.

For our SparkAI project, VibeGuard’s analysis includes API rate limiting checks and data retention policies, since SparkAI handles social media data that has privacy implications. When code changes touch data fetching or storage functions, VibeGuard checks for proper data lifecycle management — ensuring that fetched Reddit data has TTL policies and that personal information isn’t stored beyond its intended retention period.

This cross-product integration isn’t coincidental. When you build multiple products within the same engineering culture, security patterns become reusable. The rule set that VibeGuard applies to Agent2WP grew out of real security findings we encountered during development. Every real bug found across our products becomes a new rule in VibeGuard’s pattern library.

What’s Next

We’re working on three major improvements. First, expanding beyond PR analysis to continuous monitoring of deployed configurations — cloud IAM policies, firewall rules, environment variables, Kubernetes manifests. The same AI-contextual-analysis approach that works for code diffs applies to infrastructure changes. Second, adding a “security score” that tracks a repository’s security posture over time — number of findings resolved, average time-to-fix, trend lines that show whether the codebase is getting more or less secure with each sprint. Third, building a VS Code extension that runs VibeGuard analysis locally before the code even gets to a PR, giving developers instant feedback while they’re writing code rather than during review.

The broader vision is that security analysis should be ambient — always present, never blocking, increasingly accurate. VibeGuard is the first step in that direction, and every PR it reviews makes it smarter through the feedback loop. That’s the kind of compound improvement that turns a good tool into an indispensable one. After six months, our internal teams don’t think about VibeGuard the same way they think about a security tool. They think about it the same way they think about syntax highlighting — it’s just there, and its absence would feel wrong.
