Automated Vulnerability Scanning: Beyond the Basics

Most engineering teams treat vulnerability scanning like a checkbox. Install Snyk or Dependabot, enable automated PRs, merge the updates, and call it a day. The problem is that this approach catches roughly 30% of the vulnerabilities that actually matter in production. The rest live in your container images, your IaC templates, your runtime configurations, and your custom code paths that no dependency scanner will ever touch.

HARBOR SOFTWARE · Engineering Insights

After running security for multiple SaaS products over the past six years, we have learned that effective vulnerability scanning requires layering multiple tools across multiple stages of the development lifecycle. Here is what that looks like in practice, and why most teams get it wrong.

The Dependency Scanner Trap

Dependency scanners like Snyk, Dependabot, and npm audit are table stakes. Every team should have them. But they create a dangerous illusion of security when they are your only automated scanning tool.

Consider a typical Node.js application. Dependabot scans package-lock.json and flags outdated packages with known CVEs. Great. But it cannot see:

  • Vulnerabilities in your Docker base image (outdated OpenSSL, glibc issues, libc patches that upstream hasn’t released)
  • Misconfigurations in your Terraform modules (S3 buckets with public ACLs, security groups with 0.0.0.0/0 ingress, unencrypted RDS instances)
  • Hardcoded secrets in your codebase (API keys, database credentials, JWT signing secrets, private keys committed accidentally)
  • SQL injection, XSS, or SSRF vulnerabilities in your application code
  • Insecure default configurations in your Kubernetes manifests (running as root, no resource limits, no network policies, hostPath mounts)

Each of these categories requires a different scanning tool, and more importantly, a different integration point in your pipeline. A dependency scanner runs against your lockfile. A container scanner runs against a built image. An IaC scanner runs against your Terraform files. A SAST tool runs against your source code. A DAST tool runs against a live application. Treating any one of these as sufficient is like locking the front door and leaving every window open.

We ran an experiment internally last quarter. We disabled all scanners except Dependabot for one month and tracked which vulnerabilities were missed. The results were sobering: 14 container image CVEs (3 critical), 8 Terraform misconfigurations (2 high severity), 4 hardcoded secrets in test fixtures that referenced real staging credentials, and 2 SQL injection vulnerabilities in admin endpoints. All of these would have been caught by our full scanning pipeline.

Building a Multi-Layer Scanning Pipeline

The scanning pipeline we use at Harbor Software has five distinct layers, each triggered at different points in the development lifecycle. Here is the architecture and the reasoning behind each layer.

Layer 1: Pre-Commit (Developer Machine)

Before code even reaches your repository, two checks should run locally. These are lightweight, fast (under 3 seconds), and catch the most embarrassing vulnerabilities before they enter version control:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/zricethezav/gitleaks
    rev: v8.16.1
    hooks:
      - id: gitleaks
  - repo: https://github.com/hadolint/hadolint
    rev: v2.12.0
    hooks:
      - id: hadolint
        args: ['--ignore', 'DL3008']

Gitleaks catches secrets before they enter version control. This is critical because once a secret hits a Git repository, it lives in the history forever (or until you do a painful history rewrite). We have caught AWS access keys, Stripe secret keys, database connection strings, and even SSH private keys at this stage. One engineer accidentally included an .env file in a commit that contained our production database URL. Gitleaks blocked it before it ever left their machine.
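To make the idea concrete, here is a minimal sketch of pattern-based secret detection. It is not how Gitleaks is implemented (real scanners ship far larger, tuned rule sets and add entropy checks); the two patterns and the `find_secrets` helper are illustrative assumptions.

```python
import re

# Illustrative patterns only (assumption): a real rule set is much larger.
SECRET_PATTERNS = {
    # AWS access key IDs start with AKIA (long-lived) or ASIA (temporary).
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # PEM header that precedes a private key body.
    "private-key-header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a pattern."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, rule_name))
    return hits
```

A pre-commit hook built on this idea would run the check against staged content and exit non-zero when the list is not empty.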

Hadolint lints Dockerfiles against best practices. It flags things like using latest tags (non-deterministic builds), running as root, installing packages without pinning versions, and using ADD instead of COPY for local files. These are not vulnerabilities per se, but they create the conditions for vulnerabilities to appear in production. A Dockerfile that runs as root means any container escape gives the attacker root on the host. A Dockerfile that uses latest tags means your builds are not reproducible and you cannot audit exactly what packages are installed.

Pre-commit hooks require developer buy-in. We make them mandatory by checking in the .pre-commit-config.yaml and running pre-commit install as part of our onboarding script. New developers get the hooks automatically when they set up their development environment.

Layer 2: CI Pipeline (Pull Request)

When a PR is opened, four scanners run in parallel. This is the primary scanning layer and catches the majority of vulnerabilities:

# .github/workflows/security-scan.yml
name: security-scan
on: pull_request

jobs:
  dependency-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Snyk
        uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high

  container-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build image
        run: docker build -t app:${{ github.sha }} .
      - name: Run Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: app:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: 1

  iac-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          directory: ./terraform
          framework: terraform
          soft_fail: false

  sast-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Semgrep
        uses: returntocorp/semgrep-action@v1
        with:
          config: p/owasp-top-ten p/nodejs

Snyk handles dependency vulnerabilities. We set the severity threshold to high to avoid drowning developers in low-priority noise. Medium and low findings go to a weekly triage report instead. Snyk’s advantage over raw npm audit is its reachability analysis: it can determine whether a vulnerable code path is actually called in your application, dramatically reducing false positives.

Trivy scans the built container image. This catches vulnerabilities in the OS packages of your base image, which are completely invisible to dependency scanners. We have found critical OpenSSL vulnerabilities in Alpine images, outdated curl versions with known exploits, and even vulnerabilities in the package manager itself. Trivy scans in about 30 seconds and produces clean, actionable output. We run it with exit-code: 1 so it fails the PR if critical or high vulnerabilities are found.

Checkov scans Terraform and Kubernetes manifests for misconfigurations. It checks over 1,000 policies covering AWS, GCP, Azure, and Kubernetes. Common findings include S3 buckets without encryption, RDS instances without backup retention, security groups with overly permissive rules, IAM policies with wildcard actions, and CloudTrail configurations that miss management events. Checkov’s output includes remediation guidance with specific code examples, which makes the findings immediately actionable.

Semgrep performs static application security testing (SAST) on the source code. Unlike traditional SAST tools that produce mountains of false positives by pattern-matching on syntactic structures without understanding data flow, Semgrep uses a combination of pattern-based rules and taint analysis that is remarkably accurate. The p/owasp-top-ten ruleset catches SQL injection, XSS, path traversal, SSRF, and other common web vulnerabilities with a false positive rate under 5% in our experience. We also maintain a set of custom rules for patterns specific to our codebase.

Layer 3: Post-Merge (Main Branch)

After code merges to main, we run a more thorough scan that includes DAST (Dynamic Application Security Testing). DAST scans test the running application the way an attacker would, by sending requests and analyzing responses:

# Nightly DAST scan against staging
curl -X POST "https://api.zap.example.com/scan" \
  -H "Content-Type: application/json" \
  -d '{
    "target": "https://staging.app.com",
    "scanType": "full",
    "alertThreshold": "high"
  }'

We use OWASP ZAP in daemon mode, pointed at our staging environment. ZAP crawls the application, submits forms, follows links, and attempts common attack patterns against every endpoint it discovers. This catches runtime vulnerabilities that static analysis misses entirely: CORS misconfigurations, missing security headers (CSP, X-Frame-Options, HSTS), authentication bypass issues, session management flaws, and information disclosure through verbose error messages.

DAST scans are slow (ours takes about 45 minutes for a medium-sized application), which is why we run them nightly rather than on every PR. The findings feed into the same triage workflow as everything else. We have also configured ZAP with authenticated scanning, providing it with valid session cookies so it can test authenticated endpoints that would otherwise return 401s.

Layer 4: Container Registry (Pre-Deployment)

Before any image is deployed to production, the container registry runs its own scan. We use AWS ECR’s built-in scanning (powered by Clair), but you can achieve the same with Harbor registry, Docker Hub’s scanning features, or Google Artifact Registry’s scanning capabilities.

This layer exists as a safety net. Even if the CI scan passed when the image was built, new CVEs are published daily. An image that was clean on Monday might have a critical vulnerability by Thursday. Registry scanning catches these drift vulnerabilities before they reach production. We have caught 3 critical CVEs this way in the past 6 months that were published after our images were built but before they were deployed.

# ECR scan on push policy
aws ecr put-image-scanning-configuration \
  --repository-name my-app \
  --image-scanning-configuration scanOnPush=true

# Also run periodic re-scans on existing images
aws ecr start-image-scan \
  --repository-name my-app \
  --image-id imageTag=production-latest
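
The drift check itself is simple to reason about. The sketch below is an illustration of the logic only, not how ECR or any registry scanner works internally; the `Advisory` shape, the exact-version matching, and the `drift_findings` helper are simplifying assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Advisory:
    """Hypothetical, simplified advisory record."""
    cve_id: str
    package: str
    affected_versions: frozenset[str]
    published: datetime


def drift_findings(installed: dict[str, str], built_at: datetime,
                   advisories: list[Advisory]) -> list[str]:
    """Return CVE IDs published after the image was built that affect
    the exact version of a package installed in that image."""
    return [
        adv.cve_id
        for adv in advisories
        if adv.published > built_at
        and installed.get(adv.package) in adv.affected_versions
    ]
```

Anything this returns is a vulnerability the CI scan could not have seen, which is exactly the gap the registry layer closes.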

Layer 5: Runtime (Production)

The final layer monitors running containers for vulnerabilities that only manifest at runtime and for active exploitation attempts:

  • Falco watches for suspicious runtime behavior: unexpected network connections to external IPs, file system modifications in read-only paths, privilege escalation attempts via setuid/setgid, shell spawning inside containers, and unexpected process execution. Falco rules are surprisingly easy to write and incredibly effective at catching compromised containers.
  • AWS GuardDuty monitors AWS API calls for indicators of compromise: unusual EC2 instance launches, S3 bucket enumeration patterns, IAM credential abuse from unexpected geolocations, cryptocurrency mining detection, and DNS exfiltration patterns.
  • Runtime SCA tools like Datadog Application Security track which vulnerable libraries are actually loaded and called in production, helping you prioritize remediation based on actual runtime exposure rather than theoretical risk.

Runtime scanning is where you separate signal from noise. A vulnerable library that is installed but never imported is low priority. A vulnerable library that handles user input on every request is critical. Runtime analysis gives you that context, and it fundamentally changes how you prioritize remediation.

Triage: The Part Everyone Gets Wrong

Scanning tools are useless without a triage process. Here is the uncomfortable truth: most teams that adopt vulnerability scanning end up ignoring the results within three months. The alert fatigue is real. A medium-sized application can easily generate 200+ findings across all five layers, and without a systematic triage process, developers will either try to fix everything (burning out) or fix nothing (giving up).

Our triage process works like this:

  1. Critical + High severity findings with known exploits: Block the PR or deployment. Fix immediately. No exceptions. The developer who triggered the finding is responsible for the fix, with support from the security champion if needed.
  2. Critical + High severity findings without known exploits: Fix within the current sprint (2 weeks max). These go into the sprint backlog with a security label.
  3. Medium severity findings: Add to the backlog. Review weekly in the security champion’s triage session. Fix within 30 days.
  4. Low severity findings: Add to the backlog. Review monthly. Fix when convenient or as part of a quarterly security sprint.
  5. Informational findings: Ignore unless they indicate a pattern. Multiple info-level findings in the same area might indicate a design problem worth investigating.
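
The five tiers above can be expressed as a small lookup. This is a minimal sketch: the action labels and the `triage_action` name are assumptions chosen for illustration, while the deadlines come from the list.

```python
from datetime import timedelta
from typing import Optional


def triage_action(severity: str,
                  known_exploit: bool = False) -> tuple[str, Optional[timedelta]]:
    """Map a finding to (action, fix deadline) following the five tiers."""
    level = severity.lower()
    if level in ("critical", "high"):
        if known_exploit:
            return ("block", timedelta(0))              # fix immediately
        return ("sprint-backlog", timedelta(days=14))   # current sprint
    if level == "medium":
        return ("backlog-weekly-review", timedelta(days=30))
    if level == "low":
        return ("backlog-monthly-review", None)         # fix when convenient
    return ("ignore-unless-pattern", None)              # informational
```

Encoding the policy this way keeps the decision out of individual judgment calls: the same finding always gets the same deadline.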

The key insight is that severity alone is not enough. You need to factor in exploitability (is there a public exploit? is it being actively exploited in the wild?), exposure (is this internet-facing, internal-only, or only accessible via VPN?), and data sensitivity (does this component handle PII, financial data, or authentication credentials?). We use a simple scoring matrix that combines these factors to produce a priority score from 1 to 5, and we have automated this scoring in a small Python script that reads scanner output and annotates it with our priority scores.
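
As a rough sketch of such a scoring function: the weights, the category names, and the convention that 5 is the most urgent are all illustrative assumptions, not the actual matrix, and you should tune them to your own environment.

```python
# Assumed weights: higher means more urgent.
SEVERITY = {"low": 0, "medium": 1, "high": 2, "critical": 3}
EXPLOITABILITY = {"none": 0, "public-exploit": 1, "actively-exploited": 2}
EXPOSURE = {"vpn-only": 0, "internal": 1, "internet-facing": 2}


def priority_score(severity: str, exploitability: str, exposure: str,
                   handles_sensitive_data: bool) -> int:
    """Combine the four factors into a priority from 1 (lowest) to 5 (highest)."""
    raw = (
        SEVERITY[severity]
        + EXPLOITABILITY[exploitability]
        + EXPOSURE[exposure]
        + (1 if handles_sensitive_data else 0)
    )
    # raw ranges from 0 to 8; halve it (rounding up) and shift into 1..5.
    return 1 + (raw + 1) // 2
```

For example, a high-severity finding with no known exploit on an internal-only service that holds no sensitive data scores 3, while the same finding on an internet-facing service that handles PII scores 4.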

Reducing False Positives

False positives are the silent killer of vulnerability scanning programs. Every false positive costs 15-30 minutes of developer time to investigate and dismiss, and each one erodes trust in the scanning tools. Once developers stop trusting the tools, they stop reading the findings. Here is how we keep our false positive rate under 3%:

Semgrep custom rules: Instead of running every available rule, we curate a ruleset specific to our tech stack. We started with p/owasp-top-ten and p/nodejs, then added custom rules for patterns specific to our codebase and removed rules that consistently produced false positives in our context:

# .semgrep/custom-rules.yml
rules:
  - id: unsafe-sql-query
    patterns:
      - pattern: |
          db.query($SQL, ...)
      - pattern-not: |
          db.query($SQL, $PARAMS)
      - metavariable-regex:
          metavariable: $SQL
          regex: '.*\$\{.*'
    message: "SQL query uses string interpolation instead of parameterized queries"
    severity: ERROR
    languages: [javascript, typescript]

  - id: unsafe-redirect
    patterns:
      - pattern: |
          res.redirect($URL)
      - pattern-not-inside: |
          if (isAllowedRedirectUrl($URL)) { ... }
    message: "Redirect URL is not validated against an allowlist - potential open redirect"
    severity: WARNING
    languages: [javascript, typescript]

Trivy ignore file: For known false positives in container images (vulnerabilities in packages we do not use or that are disputed upstream), we maintain a .trivyignore file with mandatory justifications. The justification requirement is key; it forces the person adding the ignore to document why the vulnerability does not apply:

# .trivyignore
# CVE-2023-12345: affects libfoo's XML parsing module, which we don't use.
# Our application only uses libfoo for JSON serialization (verified via ldd).
CVE-2023-12345
# CVE-2023-67890: disputed by upstream maintainers, not a real vulnerability.
# See: https://github.com/example/libbar/issues/456
CVE-2023-67890

Checkov skip annotations: For IaC findings that are intentional design decisions (e.g., a public S3 bucket that serves static assets, or a security group that allows inbound HTTP from anywhere because it sits behind CloudFront), we use inline skip annotations with mandatory justifications:

resource "aws_s3_bucket" "static_assets" {
  # checkov:skip=CKV_AWS_18:This bucket intentionally serves public static assets behind CloudFront
  # checkov:skip=CKV_AWS_20:Public read access required for website hosting
  bucket = "my-app-static"
}

We review the ignore and skip lists quarterly. Stale entries (vulnerabilities that have been patched upstream, design decisions that have changed) are removed. This prevents the ignore lists from growing without bound.
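
The justification requirement is easy to enforce mechanically. The sketch below is one possible check, assuming the convention shown earlier (a comment on the line directly above each CVE entry); the `unjustified_entries` helper is hypothetical and not part of Trivy.

```python
import re

# A bare CVE identifier on its own line, e.g. CVE-2023-12345.
CVE_LINE = re.compile(r"^CVE-\d{4}-\d{4,}$")


def unjustified_entries(ignore_text: str) -> list[str]:
    """Return CVE IDs that have no comment on the line directly above them."""
    missing = []
    previous_was_comment = False
    for raw_line in ignore_text.splitlines():
        line = raw_line.strip()
        if CVE_LINE.match(line) and not previous_was_comment:
            missing.append(line)
        previous_was_comment = line.startswith("#")
    return missing
```

Running this in CI against .trivyignore and failing when the list is not empty keeps undocumented ignores from slipping in between quarterly reviews.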

Metrics That Matter

You cannot improve what you do not measure. We track four metrics for our scanning program, reported monthly to engineering leadership:

  1. Mean Time to Remediate (MTTR): How long from detection to fix? Our target is under 7 days for critical, under 30 days for high. We currently average 4.2 days for critical and 18 days for high. This metric tells you whether your triage process is working.
  2. False Positive Rate: Number of findings dismissed as false positives divided by total findings. We track this monthly and investigate any month where it exceeds 5%. A rising false positive rate means your scanning rules need tuning.
  3. Coverage: Percentage of repositories, container images, and IaC modules covered by scanning. Our target is 100% and we audit quarterly. Coverage gaps usually happen when new repositories are created without adding them to the scanning pipeline.
  4. Vulnerability Density: Number of open vulnerabilities per 1,000 lines of code. This trends downward over time if your program is working. A flat or rising trend means you are generating vulnerabilities faster than you fix them.
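
Three of these metrics reduce to a few lines of arithmetic. The following is a minimal sketch; the finding fields (`detected_at`, `fixed_at`, `false_positive`) are assumed names, not the output format of any particular scanner.

```python
from statistics import mean


def mttr_days(findings):
    """Mean days from detection to fix, over findings that have been fixed."""
    durations = [
        (f["fixed_at"] - f["detected_at"]).total_seconds() / 86400
        for f in findings
        if f.get("fixed_at") is not None
    ]
    return mean(durations) if durations else None


def false_positive_rate(findings):
    """Findings dismissed as false positives divided by total findings."""
    if not findings:
        return 0.0
    return sum(1 for f in findings if f.get("false_positive")) / len(findings)


def vulnerability_density(open_vulnerabilities, lines_of_code):
    """Open vulnerabilities per 1,000 lines of code."""
    return open_vulnerabilities / (lines_of_code / 1000)
```

Note that `mttr_days` only counts fixed findings, so a backlog of old open findings will not show up in it. Track the age of open findings alongside it if that matters to you.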

Cost and Tooling Decisions

Here is what our scanning stack costs for a team of 15 engineers with 12 repositories:

  • Snyk: $0 (free tier covers up to 200 tests/month, which is enough for our volume)
  • Trivy: $0 (open source, maintained by Aqua Security)
  • Checkov: $0 (open source, Bridgecrew/Prisma Cloud platform is paid but optional)
  • Semgrep: $0 (open source CLI, Semgrep Cloud is paid but we use the CLI directly)
  • Gitleaks: $0 (open source)
  • OWASP ZAP: $0 (open source)
  • Falco: $0 (open source, Falco Cloud is paid but optional)
  • AWS ECR scanning: Included in ECR pricing, negligible additional cost
  • AWS GuardDuty: ~$150/month for our workload size

Total cost: approximately $150/month plus the engineering time to set up and maintain the pipeline. That engineering time was about 2 weeks for initial setup and roughly 2 hours per week for ongoing maintenance and triage. Compare that to the average cost of a data breach ($4.35 million according to IBM’s 2022 Cost of a Data Breach Report), and the ROI is obvious.

Conclusion

Automated vulnerability scanning is not a product you buy. It is a pipeline you build. No single tool covers the entire attack surface of a modern application. You need dependency scanning, container scanning, IaC scanning, SAST, DAST, and runtime monitoring working together across the entire development lifecycle, each tool covering the blind spots of the others.

Start with the CI pipeline layer (Layer 2) if you are building from scratch. It gives you the most coverage for the least effort. Add pre-commit hooks next for fast feedback. Then expand to registry scanning and runtime monitoring as your security program matures. Build the triage process from day one, because tools without triage produce noise without value.

The goal is not zero vulnerabilities. That is impossible. The goal is to find and fix the ones that matter before an attacker does, and to do it fast enough that your MTTR is measured in days, not months.
