CI/CD for Small Teams: Getting Maximum Value With Minimal Overhead
CI/CD pipelines at large companies are maintained by dedicated platform engineering teams with six-figure infrastructure budgets. At a 4-person startup, you are the platform engineering team, the SRE team, and the development team—simultaneously. Your CI budget is whatever GitHub Actions’ free tier provides, and every hour spent configuring pipelines is an hour not spent building product. The challenge is not building the most comprehensive pipeline—it is building the smallest effective pipeline that catches real problems before they reach production without becoming a maintenance burden that slows you down. At Harbor Software, our CI/CD setup runs on GitHub Actions, costs $0/month (within free tier limits), takes 4 minutes per run, and has caught 23 bugs before they reached production in the last three months. This post covers exactly what we run, what we deliberately skip, and why each decision matters.
The Pipeline We Actually Run
Here is our complete .github/workflows/ci.yml, annotated with the reasoning behind each choice:
name: CI
# Run on pushes to main and on pull requests targeting main.
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  # Fastest checks first: a lint or type failure stops the pipeline before the slower test job runs.
  lint-and-type-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm run type-check
  # Tests run against real PostgreSQL and Redis service containers, not mocks.
  test:
    runs-on: ubuntu-latest
    needs: lint-and-type-check
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: harbor_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      redis:
        image: redis:7
        ports:
          - 6379:6379
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: 'npm'
      - run: npm ci
      # Verifies the migrations themselves and creates the schema the tests need.
      - run: npm run db:migrate
        env:
          DATABASE_URL: postgresql://test:test@localhost:5432/harbor_test
      - run: npm test -- --coverage
        env:
          DATABASE_URL: postgresql://test:test@localhost:5432/harbor_test
          REDIS_URL: redis://localhost:6379
          NODE_ENV: test
      # Upload coverage even when tests fail; coverage is reported, not enforced.
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: coverage-report
          path: coverage/
  # Confirms the full compile produces deployable output.
  build:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: 'npm'
      - run: npm ci
      - run: npm run build
Three jobs, run sequentially with dependencies: lint + type-check, test (with real PostgreSQL and Redis), build. Total runtime: 3–4 minutes including dependency installation. Let me explain the reasoning behind each decision.
Job 1: Lint and Type-Check (90 seconds)
This is the fastest feedback loop and catches the most common issues at the lowest cost. It runs two commands:
npm run lint runs ESLint with our configured rules. We keep only the rules that matter for correctness, not style; Prettier handles formatting, so we do not waste CI time on semicolons or quote styles:
// .eslintrc.js - only the rules that catch real bugs
module.exports = {
  rules: {
    // Prevent importing from internal feature modules (architecture boundary)
    'no-restricted-imports': ['error', { patterns: ['@/features/*/!(index)'] }],
    // Catch forgotten awaits on promises (causes silent failures)
    '@typescript-eslint/no-floating-promises': 'error',
    // Prevent unused variables (usually indicates incomplete refactoring)
    '@typescript-eslint/no-unused-vars': ['error', { argsIgnorePattern: '^_' }],
    // Catch accidental 'any' types (defeats the purpose of TypeScript)
    '@typescript-eslint/no-explicit-any': 'warn',
    // Prevent console.log in production code (use structured logger instead)
    'no-console': ['error', { allow: ['warn', 'error'] }],
    // Prevent == instead of === (type coercion bugs)
    'eqeqeq': ['error', 'always'],
    // Catch unreachable code after return/throw
    'no-unreachable': 'error'
  }
};
Each of these rules has caught at least one real bug in the last three months. The no-floating-promises rule is the single most valuable: a missing await on an async function call means the error is silently swallowed and the function appears to succeed. We caught 7 instances of this in code review—and 3 more that made it past review but were caught by the lint step in CI.
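To make the floating-promise failure mode concrete, here is a minimal sketch. The function names are hypothetical and not from our codebase, and the .catch call in the buggy version exists only so the demo does not abort on an unhandled rejection; a real floating promise has no handler at all.

```typescript
// Hypothetical stand-in for any async side effect (database write, API call).
async function writeAuditEntry(entry: string): Promise<void> {
  if (entry.length === 0) {
    throw new Error('audit entry must not be empty');
  }
}

// Bug: the call is not awaited, so the rejection never reaches this try/catch
// and the caller is told the write succeeded.
async function recordWithoutAwait(entry: string): Promise<'ok' | 'failed'> {
  try {
    const pending = writeAuditEntry(entry);
    // Demo-only handler so the runtime does not abort on an unhandled rejection.
    pending.catch(() => undefined);
    return 'ok';
  } catch {
    return 'failed';
  }
}

// Fix: awaiting the call routes the rejection into the catch block.
async function recordWithAwait(entry: string): Promise<'ok' | 'failed'> {
  try {
    await writeAuditEntry(entry);
    return 'ok';
  } catch {
    return 'failed';
  }
}
```

Calling recordWithoutAwait('') resolves to 'ok' even though the write failed, while recordWithAwait('') resolves to 'failed'. The lint rule flags the unawaited call before the code ever runs.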
npm run type-check runs tsc --noEmit. With our strict TypeScript configuration (strict: true, noUncheckedIndexedAccess, exactOptionalPropertyTypes), the type checker catches bugs that would otherwise only surface at runtime. In the last three months, the type checker caught 12 issues in CI that would have been runtime errors: null reference errors from unchecked array access, incorrect function argument types after refactoring, and missing properties on objects passed between modules.
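As a rough illustration of the unchecked-access class of bug, the sketch below shows code written the way noUncheckedIndexedAccess requires. The function names and the price table are hypothetical examples, not taken from our codebase.

```typescript
// Under noUncheckedIndexedAccess, totals[0] has type number | undefined,
// so the compiler forces the empty-array case to be handled explicitly.
function firstInvoiceTotalCents(totals: number[]): number {
  const first = totals[0];
  if (first === undefined) {
    return 0; // explicit decision for "no invoices yet"
  }
  return first;
}

// Hypothetical price table; the same flag applies to Record lookups.
const planPricesCents: Record<string, number> = { 'pro-monthly': 5000 };

function priceForPlan(planId: string): number {
  const price = planPricesCents[planId]; // number | undefined under the flag
  if (price === undefined) {
    throw new Error(`unknown plan: ${planId}`);
  }
  return price;
}
```

Without the flag, both lookups would be typed as plain number, and the missing-value case would only show up at runtime as undefined flowing into arithmetic.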
Why run lint and type-check before tests? Because they are fast (30 seconds each) and catch a different class of bugs than tests do. If lint or type-check fails, there is no point running a 2-minute test suite—you need to fix the type error or lint violation first. The sequential dependency (needs: lint-and-type-check) saves CI minutes and provides faster feedback.
Job 2: Test With Real Services (2 minutes)
Our test job spins up real PostgreSQL and Redis instances using GitHub Actions service containers. This is the most important job in the pipeline, and the decision to use real services instead of mocks was deliberate and consequential.
We explicitly rejected mocking the database in tests. Mocked database tests verify that your code calls the mock correctly; they do not verify that your SQL queries return correct results, that your migration scripts produce a valid schema, that your unique constraints actually prevent duplicates, or that your connection pooling handles concurrent requests without deadlocking. Every database-related bug we shipped to production in the first three months (four bugs total) would have been caught by a test running against a real PostgreSQL instance. Since adding real-database integration tests, we have shipped zero database bugs to production.
The npm run db:migrate step runs our migration suite against the test database. This serves double duty: it verifies the migrations themselves (a broken migration fails CI before anyone tries to run it in staging), and it creates the schema that the test suite needs. If you add a migration that conflicts with an existing one, or that creates an invalid foreign key reference, CI catches it immediately.
Our test suite is structured in three tiers, each providing different coverage:
// Tier 1: Unit tests (fast, no dependencies, test pure logic)
describe('calculateProration', () => {
  it('prorates correctly for mid-month upgrade', () => {
    const result = calculateProration({
      oldPriceCents: 5000,
      newPriceCents: 10000,
      daysRemaining: 15,
      totalDays: 30
    });
    expect(result).toBe(2500); // (10000 - 5000) * 15/30
  });
  it('returns zero when downgrading', () => {
    const result = calculateProration({
      oldPriceCents: 10000,
      newPriceCents: 5000,
      daysRemaining: 15,
      totalDays: 30
    });
    expect(result).toBe(0); // No charge for downgrades
  });
});
// Tier 2: Integration tests (real database, test data layer + business logic)
describe('SubscriptionService', () => {
  beforeEach(async () => {
    await db.query('DELETE FROM invoices');
    await db.query('DELETE FROM subscriptions');
    await db.query('DELETE FROM organizations');
  });
  it('creates subscription with correct billing period', async () => {
    const org = await createTestOrganization();
    const result = await subscriptionService.create({
      organizationId: org.id,
      planId: 'pro-monthly'
    });
    expect(result.success).toBe(true);
    if (result.success) {
      expect(result.data.currentPeriodEnd).toBeInstanceOf(Date);
      expect(result.data.status).toBe('active');
    }
  });
  it('prevents duplicate subscriptions for same organization', async () => {
    const org = await createTestOrganization();
    await subscriptionService.create({ organizationId: org.id, planId: 'pro-monthly' });
    const duplicate = await subscriptionService.create({ organizationId: org.id, planId: 'enterprise-monthly' });
    expect(duplicate.success).toBe(false);
  });
});
// Tier 3: API tests (real HTTP, test full request/response cycle)
describe('POST /api/subscriptions', () => {
  it('returns 201 with valid input and auth', async () => {
    const org = await createTestOrganization();
    const token = await createTestToken(org.id);
    const response = await request(app)
      .post('/api/subscriptions')
      .set('Authorization', `Bearer ${token}`)
      .send({ organizationId: org.id, planId: 'pro-monthly' });
    expect(response.status).toBe(201);
    expect(response.body.data.id).toBeDefined();
  });
  it('returns 401 without authentication', async () => {
    const response = await request(app)
      .post('/api/subscriptions')
      .send({ organizationId: 'fake', planId: 'pro-monthly' });
    expect(response.status).toBe(401);
  });
});
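The Tier 1 tests above exercise a calculateProration function without showing it. One possible implementation that satisfies both test cases is sketched below. It is an assumption for illustration, not our production code; in particular, the rounding and the input validation are guesses.

```typescript
interface ProrationInput {
  oldPriceCents: number;
  newPriceCents: number;
  daysRemaining: number;
  totalDays: number;
}

// Returns the prorated upgrade charge in cents; downgrades are never charged.
function calculateProration(input: ProrationInput): number {
  const { oldPriceCents, newPriceCents, daysRemaining, totalDays } = input;
  if (totalDays <= 0 || daysRemaining < 0 || daysRemaining > totalDays) {
    throw new RangeError('totalDays must be positive and daysRemaining must be within the period');
  }
  const priceDifferenceCents = newPriceCents - oldPriceCents;
  if (priceDifferenceCents <= 0) {
    return 0; // no charge for downgrades or same-price changes
  }
  // Charge only for the unused share of the billing period, rounded to whole cents.
  return Math.round((priceDifferenceCents * daysRemaining) / totalDays);
}
```

Pure functions like this are what make Tier 1 tests cheap: no database, no clock, no network, so they run in milliseconds.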
The ratio in our codebase is roughly 40% unit tests, 40% integration tests, 20% API tests. Many testing guides recommend the classic test pyramid: mostly unit tests, with fewer integration and API tests above them. For a SaaS application, the most dangerous bugs sit at integration boundaries: database queries that return wrong results, API serialization that drops fields, authentication middleware that admits unauthorized requests. For that kind of application, we find our balanced ratio catches more production-impacting bugs per test than a unit-test-heavy approach.
Job 3: Build Verification (60 seconds)
The build job runs npm run build (which executes tsc to compile TypeScript to JavaScript). This catches a specific class of issue that type-checking alone misses: import resolution problems, circular dependencies that cause runtime errors, and environment-specific build configurations that fail in production. tsc --noEmit (type-check) verifies types but does not verify that all imports resolve correctly in the compiled output. The build step verifies the entire compilation pipeline works end-to-end and produces deployable artifacts.
What We Deliberately Do Not Run in CI
The things you leave out of a CI pipeline matter as much as what you include. Every addition increases build time, maintenance burden, and the probability of flaky failures that erode team trust in the pipeline.
End-to-end tests. E2E tests with Playwright or Cypress are valuable but slow (5–15 minutes) and flaky (browser timing issues, network variability, animation delays). For a 4-person team, maintaining flaky E2E tests in CI is a net negative—you spend more time investigating false failures and adding waitFor hacks than you save from catching real bugs. We run E2E tests locally before major releases and in a nightly scheduled workflow that does not block PRs.
Docker image builds. Building a Docker image adds 3–5 minutes and requires Docker-in-Docker or Buildx configuration that is fragile across GitHub Actions runner updates. Our CI verifies code correctness; our deployment pipeline (separate workflow triggered on merge to main) builds the Docker image. Separation of concerns: CI answers “is the code correct?” and deployment answers “can we build and deploy it?”
Code coverage thresholds. We upload coverage reports as artifacts (useful for tracking trends over time and reviewing during code review) but do not fail CI when coverage drops below a threshold. Coverage thresholds incentivize writing low-value tests that exercise code paths without verifying behavior—adding a test that asserts expect(fn).toBeDefined() increases coverage without catching any bug. We prefer reviewing coverage data in PRs over enforcing a number.
Automated security scanning. We run npm audit weekly in a scheduled workflow and review results manually, not in every CI run. Running npm audit in CI generates noise—known vulnerabilities in transitive dependencies (three levels deep in your dependency tree) with no available fix and no practical exploit path. This noise desensitizes the team to real vulnerabilities. A weekly manual review with human triage is more effective than daily automated false positives.
Caching: The Biggest Speed Win
The cache: 'npm' directive in the setup-node action caches npm's download cache (~/.npm) between runs, keyed on package-lock.json. Without it, npm ci downloads approximately 400 MB of packages on every run, taking 60–90 seconds depending on network speed. With caching, subsequent runs skip the download and install from the cache in 3–5 seconds. This single configuration line saves up to 85 seconds per CI run. Over 200 CI runs per month (our current rate), that is up to 4.7 hours of saved CI time for one line of YAML.
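The savings arithmetic can be checked with a few lines. The sketch below uses only the figures quoted above; the 85-second figure corresponds to the slow end of both ranges (90 seconds uncached, 5 seconds cached), so the fast end is shown as well.

```typescript
const runsPerMonth = 200;

// Seconds saved per run at the fast and slow ends of the quoted ranges.
const savedSecondsFastNetwork = 60 - 3; // 57
const savedSecondsSlowNetwork = 90 - 5; // 85

// Convert a per-run saving into hours per month.
function monthlyHoursSaved(savedSecondsPerRun: number): number {
  return (savedSecondsPerRun * runsPerMonth) / 3600;
}

const lowEstimateHours = monthlyHoursSaved(savedSecondsFastNetwork); // about 3.2 hours
const highEstimateHours = monthlyHoursSaved(savedSecondsSlowNetwork); // about 4.7 hours
```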
If your project has additional expensive steps (building native dependencies, generating code, compiling assets), consider caching their outputs too:
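- uses: actions/cache@v4
  with:
    # ~/.npm is npm's download cache; node_modules/.cache holds tool-specific build caches.
    path: |
      ~/.npm
      node_modules/.cache
    # Exact key changes whenever the lockfile changes.
    key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json') }}
    # Fallback prefix: reuse the most recent cache when there is no exact match.
    restore-keys: |
      ${{ runner.os }}-deps-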
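The lookup order behind key and restore-keys can be modelled roughly as follows. This is a simplified sketch, not the action's actual implementation: it ignores branch scoping and cache eviction, and resolveCache and CacheEntry are hypothetical names.

```typescript
interface CacheEntry {
  key: string;
  createdAt: number; // larger means newer
}

// Simplified model: an exact key match wins; otherwise each restore key is
// tried in order as a prefix, and the newest matching entry is returned.
function resolveCache(
  key: string,
  restoreKeys: string[],
  entries: CacheEntry[]
): CacheEntry | undefined {
  const exact = entries.find((entry) => entry.key === key);
  if (exact !== undefined) {
    return exact;
  }
  for (const prefix of restoreKeys) {
    const newestFirst = entries
      .filter((entry) => entry.key.startsWith(prefix))
      .sort((a, b) => b.createdAt - a.createdAt);
    if (newestFirst.length > 0) {
      return newestFirst[0];
    }
  }
  return undefined; // full cache miss: everything is rebuilt from scratch
}
```

The practical effect is that changing package-lock.json does not start from an empty cache: the previous cache is restored through the prefix, and only the changed packages need downloading.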
Branch Protection: Making CI Mandatory
CI is only valuable if you cannot bypass it. We configured GitHub branch protection rules on main with these settings:
- Require all three status checks to pass before merging
- Require at least one approval on pull requests
- Do not allow bypassing the above settings, even for administrators
- Require branches to be up to date before merging (prevents merge-day failures from stale branches)
The “even for administrators” setting is the one people skip and regret. Without it, the temptation to push a “quick fix” directly to main at 11 PM when production is broken is irresistible. The quick fix that introduces a regression because it was not tested is a startup cliche for a reason. Force yourself through the pipeline. The 4 minutes of CI are cheap insurance against a 4 AM debugging session.
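These settings can also be applied through GitHub's REST API (PUT /repos/{owner}/{repo}/branches/{branch}/protection) rather than the web UI. The sketch below only builds the request body and makes no network call. It assumes the status check names match the job IDs in our workflow, and buildMainProtection is a hypothetical helper name.

```typescript
// Shape of the fields we set; the API accepts more options than shown here.
interface BranchProtectionRequest {
  required_status_checks: { strict: boolean; contexts: string[] } | null;
  enforce_admins: boolean | null;
  required_pull_request_reviews: { required_approving_review_count: number } | null;
  restrictions: null;
}

function buildMainProtection(checkNames: string[]): BranchProtectionRequest {
  return {
    required_status_checks: {
      strict: true, // branch must be up to date with main before merging
      contexts: checkNames // every listed check must pass
    },
    enforce_admins: true, // the "even for administrators" setting
    required_pull_request_reviews: { required_approving_review_count: 1 },
    restrictions: null // no per-user push restrictions
  };
}

const mainProtection = buildMainProtection(['lint-and-type-check', 'test', 'build']);
```

Keeping the payload in a script has one advantage over clicking through settings: the protection rules become reviewable and reproducible if the repository is ever recreated.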
Deployment: Separate Pipeline, Separate Trigger
Our deployment pipeline is intentionally separate from CI. CI runs on every pull request against main and on every push to main. Deployment runs only when code is merged to main:
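name: Deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    # Redundant while push is the only trigger; guards against accidental deploys if other triggers are added.
    if: github.event_name == 'push'
    steps:
      - uses: actions/checkout@v4
      # Tag with the registry prefix so the push below refers to an image that exists locally.
      # Assumes the runner is already authenticated to the registry.
      - run: docker build -t ${{ secrets.REGISTRY_URL }}/harbor-api:${{ github.sha }} .
      - run: docker push ${{ secrets.REGISTRY_URL }}/harbor-api:${{ github.sha }}
      - run: ./scripts/deploy.sh ${{ github.sha }}
      - run: ./scripts/health-check.sh ${{ secrets.PRODUCTION_URL }}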
This separation provides three benefits: a failed deployment does not block CI for other PRs, CI results are available before merging (giving reviewers confidence), and the deployment pipeline can be re-run independently if the deployment fails for infrastructure reasons without re-running the full test suite.
Measuring Pipeline Effectiveness
A CI pipeline is a tool, and like any tool, its value should be measured. Over the last three months, here are our numbers:
- 23 bugs caught by CI that would have reached production without it (12 type errors, 6 integration test failures, 3 lint violations with correctness implications, 2 build failures from circular imports)
- Average CI time: 3.8 minutes (target: under 5 minutes; if CI exceeds 5 minutes, we investigate what changed)
- Flaky test rate: 1.2% (1–2 spurious failures per month, always timing-related in async tests; each flaky test gets fixed within one sprint)
- CI cost: $0/month (within GitHub Actions free tier of 2,000 minutes/month; we use approximately 800 minutes)
- Developer time spent on CI maintenance: ~1 hour/month (updating dependencies, fixing flaky tests, adjusting configurations)
The 23 bugs caught translate to roughly 2 bugs per week that would have required a production hotfix. At 30–60 minutes per hotfix (investigate + fix + deploy + verify + write incident notes), CI saves us 4–8 engineering hours per month in avoided incident response. The pipeline took 4 hours to set up initially and requires 1 hour of maintenance per month. The return on investment is strongly positive and improves over time as the codebase grows and the risk of regression increases.
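The return-on-investment estimate can be reproduced from the figures in this section. The sketch below assumes three months is about 13 weeks and reuses the 200 runs per month mentioned in the caching section; everything else comes from the list above.

```typescript
const bugsCaught = 23;
const monthsObserved = 3;
const weeksObserved = 13; // assumption: three months is roughly 13 weeks

const bugsPerWeek = bugsCaught / weeksObserved; // about 1.8, i.e. roughly 2
const bugsPerMonth = bugsCaught / monthsObserved; // about 7.7

// 30 to 60 minutes per avoided hotfix.
const hoursSavedPerMonth = {
  min: bugsPerMonth * 0.5, // about 3.8
  max: bugsPerMonth * 1.0 // about 7.7
};

const setupHours = 4;
const maintenanceHoursPerMonth = 1;

// Net benefit after maintenance, and months needed to recoup the setup cost.
const netHoursPerMonth = {
  min: hoursSavedPerMonth.min - maintenanceHoursPerMonth,
  max: hoursSavedPerMonth.max - maintenanceHoursPerMonth
};
const paybackMonths = {
  best: setupHours / netHoursPerMonth.max, // about 0.6
  worst: setupHours / netHoursPerMonth.min // about 1.4
};

// Sanity check on free-tier usage: 200 runs at 3.8 minutes each.
const ciMinutesPerMonth = 200 * 3.8; // about 760, consistent with "approximately 800"
```

Even at the pessimistic end, the setup cost is recovered in under two months.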
Conclusion
An effective CI pipeline for a small team has three properties: it is fast (under 5 minutes), it catches real bugs (type errors and integration failures, not style nits), and it requires minimal maintenance (few flaky tests, no complex infrastructure). Our three-job pipeline (lint + type-check, integration tests with real databases, build verification) achieves all three. The most impactful elements are strict TypeScript configuration (catches bugs at the type level for zero runtime cost), integration tests against real services (catches the bugs that actually cause production incidents), and mandatory branch protection (prevents anyone from bypassing the pipeline under pressure). Skip the expensive additions (E2E in CI, automated security scanning, coverage thresholds) until your team is large enough to maintain them without resenting them. A simple pipeline that everyone trusts and that catches real bugs is far more valuable than a comprehensive pipeline that everyone has learned to ignore because it cries wolf three times a week.
When to Add Complexity: A Growth Roadmap
Our pipeline has stayed simple for three months, but it will not stay this way forever. Here is our planned progression, mapped to team and product milestones rather than arbitrary dates:
When we reach 8–10 engineers: Add E2E tests to CI. At this team size, the risk of cross-feature regressions increases because not everyone understands every part of the codebase. E2E tests catch integration issues that unit and integration tests miss—broken API contracts between frontend and backend, authentication flows that work in isolation but fail end-to-end, payment flows where a Stripe webhook arrives before the database write commits. We will accept the 5–10 minute runtime increase because the bug-catching value justifies it at this team size.
When we handle sensitive data (SOC 2, HIPAA): Add automated dependency scanning with a human-reviewed allow-list for known false positives. At this point, the noise problem is outweighed by compliance requirements. We will use a tool like Snyk or Dependabot with configured ignore rules for vulnerabilities in transitive dependencies that have no practical exploit path.
When we deploy multiple services: Add contract testing (Pact or similar) to verify that service A’s expectations about service B’s API match service B’s actual implementation. This is the microservices equivalent of our current integration tests and becomes critical when deployments are no longer atomic—when service A and service B deploy independently, you need automated verification that they still agree on the API contract.
When build time exceeds 8 minutes: Investigate parallelization (running lint, test, and build concurrently instead of sequentially) and incremental builds (only running tests for files affected by the change). Tools like Turborepo and Nx excel at this. We will also evaluate whether test suite organization needs restructuring—large integration test suites that spin up databases can often be parallelized across multiple CI jobs with database-per-job isolation.
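To see what parallelization would buy, compare wall-clock time for the two layouts. The sketch below uses the approximate per-job durations from the section headings earlier in this post; those are rounded and sum to slightly more than our measured 3.8-minute average.

```typescript
// Approximate job durations in seconds, taken from the job headings above.
const jobSeconds = {
  lintAndTypeCheck: 90,
  test: 120,
  build: 60
};

const durations = Object.values(jobSeconds);

// Sequential: each job waits for the previous one, so durations add up.
const sequentialSeconds = durations.reduce((total, seconds) => total + seconds, 0);

// Parallel: all jobs start together, so the slowest job sets the wall-clock time.
const parallelSeconds = Math.max(...durations);

// Billed runner time is the same in both layouts when every job runs to completion.
const billedSeconds = sequentialSeconds;
```

Parallel jobs would cut feedback time from about 270 seconds to about 120, but they give up the early exit that the needs: chain provides: a lint failure would no longer prevent the test job from consuming its minutes.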
The key principle: add complexity in response to specific pain points, not in anticipation of hypothetical future needs. Every CI step you add today is a step you maintain tomorrow, whether or not it catches bugs. Wait for the problem, then solve it.