Technical Due Diligence: What We Look for in Software Acquisitions
Over the past three years, Harbor Software has conducted technical due diligence on 11 software acquisition targets for private equity firms and strategic acquirers. The deal sizes ranged from $2 million to $45 million. We have recommended proceeding on 7 of those deals, recommended against 3, and recommended proceeding with significant price adjustments on 1. This post distills our due diligence methodology into the patterns we evaluate, the red flags that kill deals, and the technical debt signals that justify price reductions.
The Framework: Three Lenses
Every technical due diligence engagement evaluates the target through three lenses, in this order of priority:
Lens 1: Can the product continue to operate without the current team? This is the most important question for any acquisition. If the founders leave (which is common 12-24 months post-acquisition), can the acquiring team maintain and extend the product? This lens evaluates documentation quality, code readability, architecture complexity, deployment automation, test coverage of critical paths, and institutional knowledge concentration. A product that works perfectly but can only be maintained by its creator has a value ceiling determined by that creator’s continued involvement.
Lens 2: What is the true cost of ownership? The target’s current hosting bill is not the cost of ownership. The cost of ownership includes infrastructure (hosting, CDN, monitoring, backups), licensing (third-party APIs, commercial libraries, SaaS tools), operational labor (the engineering hours required to keep the system running, deploy updates, respond to incidents, manage infrastructure), security and compliance costs (penetration testing, SOC 2 maintenance, data handling compliance), and scaling costs (what does the infrastructure bill look like at 2x, 5x, and 10x current load?). A product that costs $2,000/month to host but requires a full-time engineer to babysit deployments and fix production issues has a true cost of ownership closer to $15,000/month when you account for that engineer’s salary.
Lens 3: What is the technical ceiling? How far can the current architecture scale before it needs a fundamental rewrite? If the acquirer’s growth plan requires 10x the current user base within 3 years, can the system handle it with incremental improvements (adding read replicas, implementing caching, optimizing hot paths), or does it need a platform migration (rewriting the backend in a different language, migrating from a monolith to microservices, replacing the database engine)? Rewrites are expensive—typically 12-18 months and $500K-$2M for the scale of products we evaluate—so the technical ceiling directly impacts the post-acquisition investment thesis. A product approaching its ceiling requires the acquirer to budget for a major platform investment within the first 2-3 years.
What We Actually Examine
A typical due diligence engagement takes 2-3 weeks and covers the following areas. For each area, we describe what we look at and the findings that have materially impacted deal terms.
Codebase Health
We clone the repository and run a series of automated analyses before reading a single line of code manually. The automated pass takes about 2 hours and produces a quantitative profile of the codebase:
```bash
# Our standard first-pass analysis
cloc --json .                                # Lines of code by language
git log --oneline --since="1 year" | wc -l   # Commit frequency
git shortlog -sn --since="1 year"            # Contributor concentration
npm audit / pip-audit / cargo audit          # Known vulnerabilities
npx depcheck                                 # Unused dependencies
npx madge --circular --extensions ts .       # Circular dependencies
pyright / tsc --noEmit                       # Type-checking errors
# Custom script: calculate test coverage on critical paths
# Custom script: identify files with no test coverage
# Custom script: measure function complexity (cyclomatic)
```
The numbers that matter most:
- Contributor concentration. If one developer authored more than 60% of commits in the past year, that is a knowledge concentration risk. We have seen this in 8 of 11 evaluations—small teams where the CTO wrote most of the code. This is not inherently a problem (it is normal for early-stage products), but it means the acquirer needs a knowledge transfer plan and should budget 3-6 months of overlap with the departing team. In one evaluation, the sole developer had also set up the infrastructure, configured the CI/CD pipeline, and managed the production database—all without documentation. The knowledge transfer risk was so high that we recommended adding a 6-month earnout clause tied to documentation deliverables.
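The concentration check itself is a few lines. Here is a minimal sketch that parses `git shortlog -sn` output; the 60% threshold matches our rule of thumb above, and the sample names are illustrative:

```python
import re

def top_contributor_share(shortlog_output: str) -> float:
    """Return the top contributor's share of commits, given the output of
    `git shortlog -sn --since="1 year"` (count, tab, name on each line)."""
    counts = []
    for line in shortlog_output.strip().splitlines():
        match = re.match(r"\s*(\d+)\t", line)
        if match:
            counts.append(int(match.group(1)))
    total = sum(counts)
    return max(counts) / total if total else 0.0

# Hypothetical shortlog output for a small team
sample = "  412\tAlice Founder\n   80\tBob Contractor\n   20\tCarol Intern\n"
share = top_contributor_share(sample)
print(f"top contributor: {share:.0%}")  # 412/512 ≈ 80%
if share > 0.60:
    print("FLAG: knowledge concentration risk")
```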
- Dependency age. We flag every dependency that is more than 2 major versions behind the current release. Outdated dependencies are not just a security risk; they are a compounding maintenance cost. Each outdated dependency makes the next upgrade harder because of cascading compatibility issues. In one evaluation, the target was running React 16.8 in 2025—four major versions behind. The estimated cost to upgrade was 6 weeks of full-time engineering work, primarily because the codebase used class components extensively, relied on several deprecated lifecycle methods, and contained patterns incompatible with concurrent rendering. We factored this into the deal as a $75,000 technical debt liability.
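A sketch of the dependency-age flag, assuming the installed/latest version pairs have already been collected (for example from `npm outdated --json` or `pip list --outdated`); the package versions shown are illustrative:

```python
def majors_behind(current: str, latest: str) -> int:
    """Major-version gap between the installed and latest release."""
    return int(latest.split(".")[0]) - int(current.split(".")[0])

def flag_outdated(versions: dict[str, tuple[str, str]], threshold: int = 2) -> list[str]:
    """Return dependencies at least `threshold` major versions behind.
    `versions` maps package name -> (installed, latest)."""
    return sorted(
        name for name, (cur, latest) in versions.items()
        if majors_behind(cur, latest) >= threshold
    )

deps = {
    "react": ("16.8.0", "19.0.0"),    # flagged: >= 2 majors behind
    "lodash": ("4.17.21", "4.17.21"), # current
    "express": ("4.18.2", "5.0.0"),   # one major behind: not flagged
}
print(flag_outdated(deps))  # ['react']
```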
- Test coverage, but not the percentage. A codebase with 90% coverage that tests only utility functions and ignores API endpoints, data migrations, and payment flows is less reliable than a codebase with 40% coverage that tests the critical paths. We map the test coverage to the revenue-critical paths (user registration, payment processing, core product features, data export) and evaluate coverage there specifically. If the payment processing code has zero test coverage, that is a material finding regardless of the overall coverage percentage, because any change to payment processing is a high-risk change without tests to catch regressions.
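The critical-path mapping can be sketched like this, assuming per-file covered/total line counts have been parsed out of whatever coverage report the target produces; the paths and glob patterns are hypothetical:

```python
from fnmatch import fnmatch

def critical_path_coverage(per_file: dict[str, tuple[int, int]],
                           critical_globs: list[str]) -> float:
    """Line coverage restricted to revenue-critical files.
    `per_file` maps path -> (covered_lines, total_lines)."""
    covered = total = 0
    for path, (c, t) in per_file.items():
        if any(fnmatch(path, g) for g in critical_globs):
            covered += c
            total += t
    return covered / total if total else 0.0

report = {
    "src/payments/charge.py": (0, 120),   # zero coverage on payments
    "src/auth/register.py": (90, 100),
    "src/utils/strings.py": (200, 200),   # well-tested, but not critical
}
pct = critical_path_coverage(report, ["src/payments/*", "src/auth/*"])
print(f"critical-path coverage: {pct:.0%}")  # 90/220 ≈ 41%
```

The overall coverage here would look healthy (69%), which is exactly why we report the critical-path number separately.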
- Circular dependencies. Circular dependency count is a proxy for architectural coupling. A codebase with more than 10 circular dependency cycles is going to be difficult to modify incrementally—every change has unpredictable ripple effects. We use `madge` for JavaScript/TypeScript and custom scripts for Python. The most tangled codebase we evaluated had 67 circular dependency cycles. We estimated 3 months of full-time refactoring just to untangle the dependency graph enough to allow safe feature development.
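The Python-side cycle detection is a standard strongly-connected-components pass over the import graph. A minimal sketch (module names are made up); any component with more than one module is a cycle:

```python
def circular_groups(imports: dict[str, list[str]]) -> list[list[str]]:
    """Return groups of modules that import each other, given a
    module -> imported-modules map, via Tarjan's SCC algorithm."""
    index_of, lowlink, stack, on_stack = {}, {}, [], set()
    groups, counter = [], [0]

    def strongconnect(v):
        index_of[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in imports.get(v, []):
            if w not in index_of:
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index_of[w])
        if lowlink[v] == index_of[v]:
            # v is the root of a strongly connected component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            if len(comp) > 1:  # a single module is not a cycle
                groups.append(sorted(comp))

    for v in list(imports):
        if v not in index_of:
            strongconnect(v)
    return groups

graph = {
    "billing": ["models", "email"],
    "models": ["billing"],   # billing <-> models: one cycle
    "email": ["templates"],
    "templates": [],
}
print(circular_groups(graph))  # [['billing', 'models']]
```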
Infrastructure and Deployment
We evaluate the deployment pipeline from commit to production. The specific questions:
- Is deployment automated? If deploying requires SSH-ing into a server and running manual commands, that is a risk multiplier for every other finding. Automated deployment (CI/CD with GitHub Actions, GitLab CI, or similar) is table stakes. Manual deployment means every production update is a potential incident, and the frequency of deployments is limited by the availability of the person who knows the deployment procedure. In 4 of 11 evaluations, deployment was partially automated—CI ran tests, but the actual deploy was a manual SSH-and-pull process.
- Is the infrastructure defined as code? Terraform, Pulumi, CloudFormation, CDK—the specific tool matters less than the existence of infrastructure-as-code. If the infrastructure was set up manually through a cloud console over 3 years of incremental changes, nobody knows the exact configuration, and reproducing the environment (for disaster recovery, staging, or scaling) is a manual, error-prone process that takes days instead of hours.
- What is the disaster recovery posture? We ask three questions: (1) Are backups automated and tested? (2) What is the recovery time objective (RTO), and is it documented and agreed with stakeholders? (3) Has the team ever actually restored from a backup? In 6 of 11 evaluations, the answer to question 3 was “no.” Untested backups are not backups—they are hopes. We always recommend that the acquirer test backup restoration during the diligence period, before the deal closes.
In one memorable evaluation, the target’s entire production infrastructure was running on a single EC2 instance with no auto-scaling, no load balancer, no backup strategy beyond EBS snapshots that had not been tested, and no staging environment (the founder tested changes on production). The instance had been running continuously for 26 months without a reboot. The founder described the deployment process as “I SSH in and do a git pull, then I restart the service.” We recommended the acquirer budget $80,000 for infrastructure modernization as a condition of the deal, covering CI/CD setup, infrastructure-as-code migration, automated backups with tested restoration, a staging environment, and basic monitoring.
Data and Security
Data handling and security posture have become deal-critical in every evaluation since 2023, driven by increased regulatory scrutiny (GDPR enforcement actions, state privacy laws in the US) and buyer sensitivity to breach risk. A data breach post-acquisition is the acquirer’s liability.
- PII handling. Where is personally identifiable information stored? Is it encrypted at rest? Is access logged and auditable? Can it be deleted on request (GDPR/CCPA compliance)? In 4 evaluations, we found PII in application logs—email addresses, IP addresses, and in one case, partial credit card numbers in error logs that were shipped to a third-party log aggregation service. This is fixable (redact PII from logs, rotate log storage) but represents a compliance risk that needs to be disclosed to the acquirer and remediated within a defined timeline.
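Remediation usually starts with a log filter. A minimal redaction sketch; the two patterns shown are illustrative, not an exhaustive PII pattern set, and a real filter needs review against the target's actual log corpus:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def redact(line: str) -> str:
    """Strip common PII patterns from a log line before it is shipped
    to any third-party aggregation service."""
    line = EMAIL.sub("[email]", line)
    line = IPV4.sub("[ip]", line)
    return line

print(redact("login failed for jane@example.com from 203.0.113.7"))
# login failed for [email] from [ip]
```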
- Authentication and authorization. Is authentication handled by a mature library or managed service (Auth0, Clerk, Cognito, Supabase Auth), or is it custom-built? Custom authentication is not inherently bad, but it is a red flag that requires deeper evaluation. We have found password hashing with MD5 (in a 2024 codebase, unfortunately), session tokens stored in localStorage without expiration, API keys hardcoded in frontend JavaScript (visible to any user who opens browser dev tools), and a role-based access control system with a privilege escalation vulnerability that allowed any authenticated user to access admin endpoints by modifying a cookie value.
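For contrast with the MD5 finding, here is a minimal example of what acceptable password handling looks like, using the standard library's PBKDF2 as a stand-in for bcrypt or argon2; the point is a slow, salted key-derivation function rather than a fast digest:

```python
import hashlib, hmac, os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Salted PBKDF2-SHA256 with a per-user random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return hmac.compare_digest(candidate, digest)  # constant-time compare

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("not the password", salt, digest))  # False
```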
- Secret management. Where are API keys, database credentials, and encryption keys stored? Environment variables on the server are acceptable as a minimum. Hardcoded in source code is a deal-critical finding. We have found production database credentials in committed `.env` files in 3 of 11 evaluations. In each case, the credentials were still valid and the Git history was publicly accessible on GitHub. One evaluation revealed an AWS root account access key committed to a public repository in 2022 and never rotated—the key was still active and had full administrative privileges.
Architecture and Scalability
We map the system architecture—services, databases, external dependencies, data flows, integration points—and evaluate it against the acquirer’s growth plan:
- Database design. The database schema reveals more about the product’s maturity than any other artifact. We look for normalization issues (customer email stored in both the users table and the orders table, with no guarantee of consistency), missing indexes on frequently queried columns (we run `EXPLAIN ANALYZE` on the top 20 queries by frequency), lack of migration tooling (schema changes applied manually via SQL console), and data integrity gaps (foreign keys that should exist but do not, allowing orphaned records to accumulate). The database is the hardest thing to change post-acquisition because data migration is risky, schema changes affect every application layer, and the database often outlives every other component.
- Coupling and modularity. Can individual components be modified independently, or does every change require touching multiple files across the codebase? Tight coupling is the most reliable predictor of slow future development velocity. We measure this by examining import graphs, identifying circular dependencies, and estimating the “blast radius” of common change types: add a field to a core entity (how many files change?), modify a business rule (how many modules are affected?), add a new API endpoint (how much boilerplate is required?). A well-modularized codebase has blast radii of 3-5 files for typical changes. A tightly coupled codebase has blast radii of 15-30 files.
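The blast-radius estimate itself is a reverse-dependency walk over the import graph. A minimal sketch with hypothetical module names: everything that transitively imports the changed module may need to change with it:

```python
from collections import deque

def blast_radius(imports: dict[str, list[str]], changed: str) -> set[str]:
    """Modules that may need to change when `changed` changes:
    the reverse-dependency closure over the import graph."""
    importers: dict[str, set[str]] = {}
    for mod, deps in imports.items():
        for d in deps:
            importers.setdefault(d, set()).add(mod)
    seen, queue = set(), deque([changed])
    while queue:
        mod = queue.popleft()
        for parent in importers.get(mod, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

graph = {
    "api/orders": ["models/order"],
    "api/billing": ["models/order", "models/invoice"],
    "jobs/export": ["api/orders"],
    "models/order": [],
    "models/invoice": [],
}
print(sorted(blast_radius(graph, "models/order")))
# ['api/billing', 'api/orders', 'jobs/export']
```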
- Third-party dependencies. What happens if a critical third-party service goes down, changes its pricing by 5x, or shuts down entirely? We identify all external dependencies, classify them by criticality (“the product does not work without this” vs. “the product is degraded without this” vs. “the product is unaffected”), evaluate the switching cost for each critical dependency, and check whether the target has negotiated committed pricing or is on pay-as-you-go plans that can be repriced at any time.
The Red Flags That Kill Deals
In our experience, three findings consistently lead to a “do not acquire” recommendation:
- The product works only because one person understands it. If the codebase is undocumented, untested, and architecturally complex, and the only person who can maintain it is the founder who is leaving post-acquisition, the acquirer is buying a product that will degrade rapidly. The cost of building institutional knowledge from scratch—by reverse-engineering the codebase, documenting the architecture, writing tests for critical paths, and training a new team—often exceeds the cost of building the product from scratch. We recommended against one deal specifically on this basis: the product’s value was in its user base and brand, not its technology, and the acquirer was better served building a replacement product and migrating users.
- The core technology is a liability, not an asset. If the product is built on a deprecated framework (AngularJS, Rails 4, Django 1.x), an end-of-life language version (Python 2, PHP 5), or a technology stack that makes hiring difficult (proprietary languages, obscure frameworks with single-digit GitHub stars), the acquirer is buying technical debt that will consume engineering resources indefinitely. The cost of migration must be factored into the deal price, and if the migration cost exceeds 30% of the deal value, the deal economics rarely work.
- The data is not clean. If the product’s primary value is its data (customer relationships, content library, training data, usage analytics), and that data has integrity issues (duplicates, missing fields, inconsistent formats, no audit trail, PII mixed with non-PII without classification), the acquirer is buying data that may cost more to clean than to regenerate. We evaluated a content platform where the database contained 340,000 content items, but after deduplication, correcting encoding errors, and removing items with missing required fields, only 180,000 were usable. The acquirer adjusted their valuation accordingly.
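A simplified sketch of that kind of cleanup pass; the field names, dedupe key, and sample records are hypothetical:

```python
def usable_items(items: list[dict], required: list[str],
                 dedupe_key: str = "title") -> list[dict]:
    """Keep the first item per normalized dedupe key, dropping records
    that are missing any required field."""
    seen, kept = set(), []
    for item in items:
        key = (item.get(dedupe_key) or "").strip().lower()
        if not key or key in seen:
            continue  # duplicate or unkeyable record
        if any(not item.get(field) for field in required):
            continue  # missing a required field
        seen.add(key)
        kept.append(item)
    return kept

raw = [
    {"title": "Intro to Widgets", "body": "...", "author": "A"},
    {"title": "intro to widgets", "body": "...", "author": "B"},  # duplicate
    {"title": "Advanced Widgets", "body": "", "author": "A"},     # missing body
]
print(len(usable_items(raw, required=["body", "author"])))  # 1
```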
What Good Looks Like
The best acquisition targets we have evaluated share common traits: automated CI/CD pipelines, infrastructure as code, test suites that cover critical paths, clean separation of concerns in the codebase, documented architecture decisions (even informal ones in README files or decision records), and honest disclosure of known technical debt. Perfection is not the standard—every product has technical debt. The standard is awareness and management of that debt. A team that knows their database schema needs refactoring, has a migration plan documented, and has estimated the cost is in a fundamentally better position than a team that does not know (or will not acknowledge) that the problem exists.
If you are building a product that you expect to sell someday, the due diligence evaluation starts now—not when a buyer appears. Every decision you make about code quality, documentation, testing, deployment automation, and security posture either increases or decreases the value of your company when a buyer evaluates it. Technical due diligence does not evaluate your product’s features. It evaluates the sustainability of your product’s features. Build accordingly.
The Deliverable: What Our Reports Contain
Our due diligence report follows a consistent structure that acquirers and their legal teams expect. The report is typically 25-40 pages and includes:
- Executive summary (1-2 pages): Go/no-go recommendation with the top 3 findings that drive the recommendation. Written for a non-technical audience (the PE partner or corporate development VP who is making the deal decision).
- Architecture overview (3-5 pages): System diagram, technology stack, data flows, external dependencies. This section establishes the shared vocabulary that the rest of the report uses.
- Detailed findings (10-20 pages): Each finding is classified as Critical (deal-breaker or requires immediate remediation), Major (requires remediation within 6 months post-acquisition), Minor (should be addressed but does not impact deal terms), or Informational (context for the acquirer’s technical team). Each finding includes a description, evidence, business impact assessment, and recommended remediation with estimated cost.
- Technical debt inventory (3-5 pages): Itemized list of known technical debt with estimated remediation cost for each item. This section is the basis for price adjustments—if the codebase has $200,000 in technical debt that must be addressed within the first year, the acquirer can negotiate that into the deal price.
- Risk register (2-3 pages): Forward-looking risks that are not current problems but could become problems under the acquirer’s growth plan. Example: “The database is currently adequate for 10,000 concurrent users, but the acquirer’s growth target of 50,000 concurrent users within 2 years will require a database migration estimated at $150,000-$250,000.”
- Recommendations (1-2 pages): Prioritized list of actions for the first 90 days post-acquisition, including quick wins (security patches, dependency updates, monitoring setup) and strategic investments (infrastructure modernization, test coverage improvement, documentation projects).
The report is written for two audiences simultaneously: the deal team (who needs the executive summary and risk register to make a go/no-go decision and negotiate terms) and the acquirer’s engineering team (who needs the detailed findings and technical debt inventory to plan their post-acquisition integration work). Both audiences need the same information presented at different levels of technical detail, which is why the report structure separates strategic assessment from tactical recommendations.