
Implementing Role-Based Access Control from Scratch

Access control is one of those features that seems simple in the elevator pitch and complex in the implementation. “Users have roles, roles have permissions, check the permission before doing the thing.” That covers about 20% of the real design. The remaining 80% involves permission inheritance, resource-level access, multi-tenancy isolation, role hierarchies, cache invalidation timing, frontend integration patterns, audit logging, and a dozen edge cases that only surface when real users interact with the system in ways you did not anticipate.

Article Overview

1. The Data Model
2. Seeding Permissions and Default Roles
3. The Authorization Middleware
4. Cache Invalidation: The Hard Part
5. Resource-Level Permissions
6. Frontend Integration
7. Testing Authorization Thoroughly

HARBOR SOFTWARE · Engineering Insights

We built RBAC from scratch at Harbor Software for our multi-tenant SaaS platform. We evaluated Auth0 Authorization, Casbin, OPA (Open Policy Agent), and AWS Cognito Groups before deciding to build our own. The deciding factors were our need for fine-grained resource-level permissions (not just route-level “can access this endpoint”), our multi-tenant data isolation requirements, the desire to keep authorization logic close to our business logic rather than in a separate service with its own latency and failure modes, and the need for our permission model to evolve rapidly as we added features. Here is the complete design, implementation, and the lessons we learned over twelve months of running this in production.

The Data Model

RBAC requires four core entities: users, roles, permissions, and the assignments connecting them. Our schema also includes tenants for multi-tenancy. Here is the complete schema with indexes:

-- Tenants (organizations / workspaces)
CREATE TABLE tenants (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name VARCHAR(255) NOT NULL,
  slug VARCHAR(100) UNIQUE NOT NULL,
  plan VARCHAR(50) DEFAULT 'free',
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Users (global, not tenant-scoped)
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email VARCHAR(255) UNIQUE NOT NULL,
  name VARCHAR(255) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Permissions (system-defined, immutable by end users)
CREATE TABLE permissions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  resource VARCHAR(100) NOT NULL,  -- e.g., 'project', 'invoice', 'user'
  action VARCHAR(50) NOT NULL,     -- e.g., 'create', 'read', 'update', 'delete'
  description TEXT NOT NULL,
  UNIQUE(resource, action)
);

-- Roles (per-tenant, customizable)
CREATE TABLE roles (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
  name VARCHAR(100) NOT NULL,
  description TEXT,
  is_system BOOLEAN DEFAULT FALSE,  -- System roles cannot be modified or deleted
  created_at TIMESTAMPTZ DEFAULT NOW(),
  UNIQUE(tenant_id, name)
);
CREATE INDEX idx_roles_tenant ON roles(tenant_id);

-- Role-Permission mapping (which permissions each role grants)
CREATE TABLE role_permissions (
  role_id UUID NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
  permission_id UUID NOT NULL REFERENCES permissions(id) ON DELETE CASCADE,
  PRIMARY KEY (role_id, permission_id)
);
CREATE INDEX idx_role_permissions_role ON role_permissions(role_id);

-- User-Tenant-Role assignment (a user can have different roles in different tenants)
CREATE TABLE user_roles (
  user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
  role_id UUID NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
  assigned_by UUID REFERENCES users(id),
  assigned_at TIMESTAMPTZ DEFAULT NOW(),
  PRIMARY KEY (user_id, tenant_id, role_id)
);
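-- The primary key already serves (user_id, tenant_id) lookups. This index serves
-- the reverse lookup "which users hold this role?" used during cache invalidation.
CREATE INDEX idx_user_roles_role_tenant ON user_roles(role_id, tenant_id);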

-- Audit log for all authorization-related changes
CREATE TABLE auth_audit_log (
  id BIGSERIAL PRIMARY KEY,
  action VARCHAR(100) NOT NULL,
  actor_id UUID NOT NULL,
  tenant_id UUID NOT NULL,
  target_type VARCHAR(50) NOT NULL,
  target_id UUID NOT NULL,
  details JSONB,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_audit_tenant_time ON auth_audit_log(tenant_id, created_at DESC);

Key design decisions and their rationale:

  • Permissions are system-defined and immutable: End users cannot create permissions. The application defines the complete set at deployment time via a seed script. This prevents permission sprawl (hundreds of ad-hoc permissions with unclear meaning) and ensures every permission corresponds to an actual capability check in the codebase.
  • Roles are tenant-scoped and customizable: Each tenant can create custom roles that combine permissions in ways that match their organizational structure. One tenant might have “Developer” and “Designer” roles; another might have “Analyst” and “Manager”. System roles (admin, member, viewer) are created automatically for every new tenant and cannot be deleted.
  • Users can have multiple roles per tenant: A user might be both a “Developer” (can create/read/update projects) and a “Billing Admin” (can read/update billing settings). Their effective permissions are the union of all role permissions. This is more flexible than single-role assignment and reflects how real organizations work.
  • Users can belong to multiple tenants: A user can hold different roles in each tenant, which is essential for consultants, agencies, freelancers, and anyone who works across organizations. The user_roles table’s composite primary key on (user_id, tenant_id, role_id) prevents duplicate assignments while allowing this flexibility.
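
The union rule for multi-role users, and the fact that roles held in one tenant contribute nothing in another, can be sketched as a pure function. This is a minimal in-memory illustration rather than the production query (which lives in SQL); the `Assignment` type, the `effectivePermissions` name, and the sample role and tenant names are all hypothetical.

```typescript
// Hypothetical in-memory mirror of the user_roles and role_permissions tables.
type Assignment = { userId: string; tenantId: string; roleId: string };
type RolePermissions = Record<string, readonly string[]>; // roleId -> "resource:action"

// Effective permissions = union of the permissions of every role the user
// holds in ONE tenant. Roles held in other tenants are ignored.
function effectivePermissions(
  assignments: readonly Assignment[],
  rolePermissions: RolePermissions,
  userId: string,
  tenantId: string
): Set<string> {
  const result = new Set<string>();
  for (const a of assignments) {
    if (a.userId !== userId || a.tenantId !== tenantId) continue;
    for (const permission of rolePermissions[a.roleId] ?? []) {
      result.add(permission);
    }
  }
  return result;
}

const rolePermissions: RolePermissions = {
  developer: ['project:create', 'project:read', 'project:update'],
  billing_admin: ['billing:read', 'billing:update'],
  viewer: ['project:read'],
};
const assignments: Assignment[] = [
  { userId: 'u1', tenantId: 'acme', roleId: 'developer' },
  { userId: 'u1', tenantId: 'acme', roleId: 'billing_admin' },
  { userId: 'u1', tenantId: 'globex', roleId: 'viewer' },
];

const inAcme = effectivePermissions(assignments, rolePermissions, 'u1', 'acme');
const inGlobex = effectivePermissions(assignments, rolePermissions, 'u1', 'globex');
```

Here `inAcme` holds the five permissions from both roles, while `inGlobex` holds only `project:read`.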

Seeding Permissions and Default Roles

Permissions are defined in code as the single source of truth and seeded to the database on every deployment. This ensures the permission set is version-controlled, consistent across environments, and always in sync with the authorization checks in the codebase:

// src/auth/permissions.ts - Single source of truth
export const PERMISSIONS = {
  project: ['create', 'read', 'update', 'delete', 'archive', 'export'],
  invoice: ['create', 'read', 'update', 'delete', 'send', 'void'],
  user: ['invite', 'read', 'update', 'remove', 'impersonate'],
  role: ['create', 'read', 'update', 'delete', 'assign'],
  report: ['read', 'export', 'schedule'],
  billing: ['read', 'update'],
  settings: ['read', 'update'],
  api_key: ['create', 'read', 'revoke'],
  webhook: ['create', 'read', 'update', 'delete', 'test'],
} as const;

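// Type-safe permission string type. The mapped type pairs each resource only with
// its own actions, so strings like 'billing:impersonate' are rejected.
export type Permission = {
  [R in keyof typeof PERMISSIONS]: `${R}:${(typeof PERMISSIONS)[R][number]}`;
}[keyof typeof PERMISSIONS];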

// Default role templates applied to every new tenant
export const DEFAULT_ROLES: Record<string, { description: string; permissions: Permission[] }> = {
  admin: {
    description: 'Full access to all resources and settings',
    permissions: Object.entries(PERMISSIONS).flatMap(
      ([resource, actions]) => actions.map(action => `${resource}:${action}` as Permission)
    ), // Every permission
  },
  member: {
    description: 'Can manage projects and view reports',
    permissions: [
      'project:create', 'project:read', 'project:update',
      'invoice:read',
      'report:read',
      'user:read',
      'webhook:read',
    ],
  },
  viewer: {
    description: 'Read-only access to projects and reports',
    permissions: [
      'project:read',
      'invoice:read',
      'report:read',
      'user:read',
    ],
  },
};

// Seed script: runs on every deployment (idempotent via ON CONFLICT)
async function seedPermissions() {
  let count = 0;
  for (const [resource, actions] of Object.entries(PERMISSIONS)) {
    for (const action of actions) {
      await db.query(
        `INSERT INTO permissions (resource, action, description)
         VALUES ($1, $2, $3)
         ON CONFLICT (resource, action) DO UPDATE SET description = $3`,
        [resource, action, `Can ${action} ${resource}`] // no naive plural ("settingss")
      );
      count++;
    }
  }
  console.log(`Seeded ${count} permissions`);
}
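
Because role templates are plain arrays of strings, a mistyped permission can also be caught at runtime before anything is written to the database, which complements whatever compile-time checking the `Permission` type provides. The following is a minimal sketch using a trimmed-down copy of the catalogue; `expandCatalogue` and `findUnknownPermissions` are hypothetical helper names, not part of the seed script above.

```typescript
// Trimmed-down copy of the permission catalogue, for illustration only.
const CATALOGUE = {
  project: ['create', 'read', 'update', 'delete'],
  invoice: ['read', 'send'],
  billing: ['read', 'update'],
} as const;

// Flatten { resource: [actions] } into "resource:action" strings.
function expandCatalogue(catalogue: Record<string, readonly string[]>): string[] {
  return Object.entries(catalogue).flatMap(([resource, actions]) =>
    actions.map((action) => `${resource}:${action}`)
  );
}

// Return every permission named by a role template that the catalogue does not define.
function findUnknownPermissions(
  templates: Record<string, { permissions: readonly string[] }>,
  catalogue: Record<string, readonly string[]>
): { role: string; permission: string }[] {
  const known = new Set(expandCatalogue(catalogue));
  const unknown: { role: string; permission: string }[] = [];
  for (const [role, template] of Object.entries(templates)) {
    for (const permission of template.permissions) {
      if (!known.has(permission)) unknown.push({ role, permission });
    }
  }
  return unknown;
}

const templates = {
  viewer: { permissions: ['project:read', 'invoice:read'] },
  broken: { permissions: ['project:read', 'billing:impersonate'] }, // deliberate mistake
};
const allPermissions = expandCatalogue(CATALOGUE);
const problems = findUnknownPermissions(templates, CATALOGUE);
```

A seed script could refuse to run when `problems` is non-empty, so that a bad template fails the deployment instead of silently granting less than intended.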

The Authorization Middleware

The authorization check runs on every API request that requires access control. It must be fast (under 5 ms on a cache hit), correct (no false grants), and tenant-aware (a user who is an admin in Tenant A has no permissions in Tenant B):

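// src/middleware/authorize.ts
import type { Request, Response, NextFunction } from 'express';
import { Redis } from 'ioredis';
import type { Permission } from '../auth/permissions';
// db (a node-postgres style client) and logAuthEvent are application modules
// whose import paths are not shown here. req.auth is assumed to be populated
// by the authentication middleware that runs before this one.

const redis = new Redis(process.env.REDIS_URL!);
const CACHE_TTL = 300; // 5 minutes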

export function authorize(resource: string, action: string) {
  return async (req: Request, res: Response, next: NextFunction) => {
    const { userId, tenantId } = req.auth;

    if (!userId || !tenantId) {
      return res.status(401).json({
        error: { code: 'UNAUTHENTICATED', message: 'Authentication required' }
      });
    }

    const permitted = await checkPermission(userId, tenantId, resource, action);

    if (!permitted) {
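      // Log the denial for security monitoring. The target fields mirror the
      // NOT NULL target_type / target_id columns of auth_audit_log.
      await logAuthEvent({
        action: 'authorization.denied',
        actorId: userId,
        tenantId,
        targetType: 'user',
        targetId: userId,
        details: { resource, action },
      });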

      return res.status(403).json({
        error: {
          code: 'FORBIDDEN',
          message: `You do not have permission to ${action} this ${resource}`,
          required_permission: `${resource}:${action}`
        }
      });
    }

    next();
  };
}

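// Read the cached permission list; null means a cache miss.
async function getCachedPermissions(cacheKey: string): Promise<string[] | null> {
  const raw = await redis.get(cacheKey);
  return raw ? (JSON.parse(raw) as string[]) : null;
}

async function checkPermission(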
  userId: string,
  tenantId: string,
  resource: string,
  action: string
): Promise<boolean> {
  const cacheKey = `perms:${tenantId}:${userId}`;
  let permissions = await getCachedPermissions(cacheKey);

  if (!permissions) {
    permissions = await fetchPermissionsFromDB(userId, tenantId);
    await redis.set(cacheKey, JSON.stringify(permissions), 'EX', CACHE_TTL);
  }

  return permissions.includes(`${resource}:${action}`);
}

async function fetchPermissionsFromDB(userId: string, tenantId: string): Promise<string[]> {
  const result = await db.query(
    `SELECT DISTINCT p.resource || ':' || p.action AS permission
     FROM user_roles ur
     JOIN role_permissions rp ON rp.role_id = ur.role_id
     JOIN permissions p ON p.id = rp.permission_id
     WHERE ur.user_id = $1 AND ur.tenant_id = $2`,
    [userId, tenantId]
  );
  return result.rows.map((r: any) => r.permission);
}

// Usage in routes:
router.get('/projects', authorize('project', 'read'), listProjects);
router.post('/projects', authorize('project', 'create'), createProject);
router.put('/projects/:id', authorize('project', 'update'), updateProject);
router.delete('/projects/:id', authorize('project', 'delete'), deleteProject);

The Redis cache is keyed by tenant+user, not just user, because the same user can have different permissions in different tenants. The 5-minute TTL balances performance (avoiding a DB query on every request) with freshness (permission changes take effect within 5 minutes at most, or immediately via explicit invalidation).
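
The two properties described above, tenant-scoped keys and time-bounded staleness, can be demonstrated without a Redis server. The sketch below uses a small in-memory stand-in with an injectable clock; `TtlCache` and `permsKey` are hypothetical names, and the class only imitates the expiry behavior of a key written with a TTL.

```typescript
// In-memory stand-in for a key-value store with per-key expiry. The clock is
// injected so expiry can be demonstrated without waiting. Illustration only.
class TtlCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  set(key: string, value: string, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): string | null {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }
}

// Same key shape as the middleware: tenant first, then user.
const permsKey = (tenantId: string, userId: string): string => `perms:${tenantId}:${userId}`;

let clock = 0; // fake time in milliseconds
const cache = new TtlCache(() => clock);
cache.set(permsKey('tenant-a', 'user-1'), JSON.stringify(['project:read']), 300);

const sameTenant = cache.get(permsKey('tenant-a', 'user-1'));   // hit
const otherTenant = cache.get(permsKey('tenant-b', 'user-1'));  // miss: different key
clock = 299000;
const beforeExpiry = cache.get(permsKey('tenant-a', 'user-1')); // still cached
clock = 300000;
const afterExpiry = cache.get(permsKey('tenant-a', 'user-1'));  // expired, forces a reload
```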

Cache Invalidation: The Hard Part

Caching permissions is essential for performance but creates a consistency challenge. When a user’s role changes, the cache must be invalidated so the new permissions take effect. For security-critical changes (revoking access), even 5 minutes of stale cache is too long:

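async function assignRole(userId: string, tenantId: string, roleId: string, assignedBy: string) {
  // user_roles has no constraint tying role_id to tenant_id, so verify here
  // that the role belongs to this tenant before assigning it.
  const role = await db.query(
    'SELECT 1 FROM roles WHERE id = $1 AND tenant_id = $2',
    [roleId, tenantId]
  );
  if (role.rows.length === 0) {
    throw new Error('Role does not belong to this tenant');
  }

  await db.query(
    'INSERT INTO user_roles (user_id, tenant_id, role_id, assigned_by) VALUES ($1, $2, $3, $4) ON CONFLICT DO NOTHING',
    [userId, tenantId, roleId, assignedBy]
  );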

  // Immediately invalidate the cache
  await redis.del(`perms:${tenantId}:${userId}`);

  // Audit log
  await logAuthEvent({
    action: 'role.assigned',
    actorId: assignedBy,
    tenantId,
    targetType: 'user',
    targetId: userId,
    details: { roleId }
  });
}

async function revokeRole(userId: string, tenantId: string, roleId: string, revokedBy: string) {
  await db.query(
    'DELETE FROM user_roles WHERE user_id = $1 AND tenant_id = $2 AND role_id = $3',
    [userId, tenantId, roleId]
  );

  await redis.del(`perms:${tenantId}:${userId}`);

  await logAuthEvent({
    action: 'role.revoked',
    actorId: revokedBy,
    tenantId,
    targetType: 'user',
    targetId: userId,
    details: { roleId }
  });
}

// When a role's permissions change, invalidate ALL users with that role
async function updateRolePermissions(
  roleId: string,
  tenantId: string,
  newPermissionIds: string[],
  updatedBy: string
) {
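  // BEGIN and COMMIT must run on the same connection. Assuming db is a
  // node-postgres Pool, db.query() may pick a different connection per call,
  // so check out one client for the whole transaction.
  const client = await db.connect();
  try {
    await client.query('BEGIN');

    // Refuse roles from other tenants and system roles (which are immutable)
    const role = await client.query(
      'SELECT is_system FROM roles WHERE id = $1 AND tenant_id = $2 FOR UPDATE',
      [roleId, tenantId]
    );
    if (role.rows.length === 0 || role.rows[0].is_system) {
      throw new Error('Role not found in this tenant, or it is a system role');
    }

    await client.query('DELETE FROM role_permissions WHERE role_id = $1', [roleId]);
    for (const permId of newPermissionIds) {
      await client.query(
        'INSERT INTO role_permissions (role_id, permission_id) VALUES ($1, $2)',
        [roleId, permId]
      );
    }
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }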

  // Find ALL users who have this role and invalidate their caches
  const affected = await db.query(
    'SELECT DISTINCT user_id FROM user_roles WHERE role_id = $1 AND tenant_id = $2',
    [roleId, tenantId]
  );

  if (affected.rows.length > 0) {
    const pipeline = redis.pipeline();
    for (const { user_id } of affected.rows) {
      pipeline.del(`perms:${tenantId}:${user_id}`);
    }
    await pipeline.exec();
    console.log(`Invalidated cache for ${affected.rows.length} users after role permission update`);
  }

  await logAuthEvent({
    action: 'role.permissions_updated',
    actorId: updatedBy,
    tenantId,
    targetType: 'role',
    targetId: roleId,
    details: { permissionCount: newPermissionIds.length }
  });
}

The Redis pipeline for batch invalidation is important when a popular role (like “member”) is modified. If a tenant has 200 users with the “member” role, we need to invalidate 200 cache keys. A pipeline sends all 200 DEL commands in a single round trip rather than 200 separate round trips.
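
An alternative to deleting one key per user, which is not what the code above does, is to put a per-tenant version number into the cache key and bump it whenever a role changes. The sketch below shows the idea with a synchronous in-memory store standing in for Redis; with ioredis each call would be awaited and the bump would be an INCR. All names here (`MemoryStore`, `versionedKey`, `invalidateTenant`, the `perms-epoch:` prefix) are hypothetical.

```typescript
// Synchronous in-memory stand-in for the Redis commands GET, SET and INCR.
class MemoryStore {
  private data = new Map<string, string>();

  get(key: string): string | null {
    return this.data.get(key) ?? null;
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
  }

  incr(key: string): number {
    const next = Number(this.data.get(key) ?? '0') + 1;
    this.data.set(key, String(next));
    return next;
  }
}

const store = new MemoryStore();
const epochKey = (tenantId: string): string => `perms-epoch:${tenantId}`;

// The tenant's current epoch is part of every permission cache key.
function versionedKey(tenantId: string, userId: string): string {
  const epoch = store.get(epochKey(tenantId)) ?? '0';
  return `perms:${tenantId}:v${epoch}:${userId}`;
}

// One write makes every cached entry for the tenant unreachable. The orphaned
// keys would be left to expire through their TTL (not modeled here).
function invalidateTenant(tenantId: string): void {
  store.incr(epochKey(tenantId));
}

// Populate caches for two users in one tenant and one user in another.
store.set(versionedKey('acme', 'u1'), JSON.stringify(['project:read']));
store.set(versionedKey('acme', 'u2'), JSON.stringify(['project:read']));
store.set(versionedKey('globex', 'u3'), JSON.stringify(['invoice:read']));

invalidateTenant('acme'); // e.g. after a widely held role was edited

const u1AfterBump = store.get(versionedKey('acme', 'u1'));   // miss, reload from DB
const u2AfterBump = store.get(versionedKey('acme', 'u2'));   // miss, reload from DB
const u3AfterBump = store.get(versionedKey('globex', 'u3')); // other tenant unaffected
```

The trade-off is an extra read per permission check and coarser invalidation (every user in the tenant reloads, not only holders of the edited role), in exchange for a single write regardless of how many users hold the role.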

Resource-Level Permissions

Route-level RBAC (“can this user access the projects endpoint?”) is necessary but not sufficient for most real applications. You also need resource-level checks (“can this user access this specific project?”). This is where RBAC intersects with data ownership and team membership:

async function authorizeResourceAccess(
  userId: string,
  tenantId: string,
  resource: string,
  action: string,
  resourceId: string
): Promise<{ allowed: boolean; reason?: string }> {
  // Check 1: Does the user have the permission at all?
  const hasPermission = await checkPermission(userId, tenantId, resource, action);
  if (!hasPermission) {
    return { allowed: false, reason: 'missing_permission' };
  }

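  // Check 2: Does the resource exist and belong to this tenant?
  // Table names cannot be bound as query parameters, so resolve them through an
  // allowlist instead of interpolating caller input (extend the map per resource).
  const RESOURCE_TABLES: Record<string, string> = {
    project: 'projects',
    invoice: 'invoices',
  };
  const table = RESOURCE_TABLES[resource];
  if (!table) {
    return { allowed: false, reason: 'not_found' };
  }
  const resourceRecord = await db.query(
    `SELECT id, tenant_id, created_by FROM ${table} WHERE id = $1`,
    [resourceId]
  );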
  if (resourceRecord.rows.length === 0) {
    return { allowed: false, reason: 'not_found' };
  }
  if (resourceRecord.rows[0].tenant_id !== tenantId) {
    // Resource exists but belongs to a different tenant
    // Return not_found to avoid leaking existence information
    return { allowed: false, reason: 'not_found' };
  }

  // Check 3 (optional): Resource-level membership
  // For mutating actions on projects, require project membership
  if (resource === 'project' && ['update', 'delete', 'archive'].includes(action)) {
    const isMember = await db.query(
      'SELECT 1 FROM project_members WHERE project_id = $1 AND user_id = $2',
      [resourceId, userId]
    );
    if (isMember.rows.length === 0) {
      return { allowed: false, reason: 'not_member' };
    }
  }

  return { allowed: true };
}

Note the security detail in Check 2: when a resource belongs to a different tenant, we return “not_found” rather than “forbidden”. Returning “forbidden” would confirm that the resource exists, which is an information leak. A user should not be able to probe for the existence of resources in other tenants.
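
One way to turn the `reason` values into HTTP responses is a small mapping function, sketched below under the assumption that `not_found` becomes a 404 and every other denial a 403 (consistent with the status codes the integration tests expect). The `toErrorResponse` name and the `NOT_FOUND` error code are hypothetical; only `FORBIDDEN` appears in the middleware above.

```typescript
type AccessResult = { allowed: boolean; reason?: string };
type ErrorResponse = {
  status: number;
  body: { error: { code: string; message: string } };
};

// Translate a resource-level check into an HTTP error, or null when allowed.
// Cross-tenant lookups arrive here already collapsed into 'not_found', so the
// response cannot reveal whether the resource exists in another tenant.
function toErrorResponse(result: AccessResult, resource: string): ErrorResponse | null {
  if (result.allowed) return null;

  if (result.reason === 'not_found') {
    return {
      status: 404,
      body: { error: { code: 'NOT_FOUND', message: `${resource} not found` } },
    };
  }

  // missing_permission, not_member, and any unrecognized reason fail closed.
  return {
    status: 403,
    body: {
      error: {
        code: 'FORBIDDEN',
        message: `You do not have permission to access this ${resource}`,
      },
    },
  };
}

const allowedResult = toErrorResponse({ allowed: true }, 'project');
const crossTenant = toErrorResponse({ allowed: false, reason: 'not_found' }, 'project');
const nonMember = toErrorResponse({ allowed: false, reason: 'not_member' }, 'project');
const unknownReason = toErrorResponse({ allowed: false }, 'project');
```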

Frontend Integration

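The frontend needs to know the current user’s permissions to show or hide UI elements. We expose permissions through a dedicated endpoint that is called once at login and cached client-side. Because permissions are tenant-scoped, the cached list applies only to the active tenant and has to be refetched when the user switches tenants: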

// GET /api/v1/me/permissions (example response for the default "member" role)
{
  "permissions": [
    "project:create", "project:read", "project:update",
    "invoice:read",
    "report:read",
    "user:read",
    "webhook:read"
  ],
  "roles": [
    { "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7", "name": "member", "is_system": true }
  ]
}

// React hook for permission-based UI
// (useMemo is imported from 'react'; useAuth is the application's own auth context hook)
function usePermission(resource: string, action: string): boolean {
  const { permissions } = useAuth();
  return useMemo(
    () => permissions.includes(`${resource}:${action}`),
    [permissions, resource, action]
  );
}

// Component usage (handleEdit, handleExport, and handleDelete are defined elsewhere)
function ProjectToolbar({ project }: { project: Project }) {
  const canEdit = usePermission('project', 'update');
  const canDelete = usePermission('project', 'delete');
  const canExport = usePermission('project', 'export');

  return (
    <div className="toolbar">
      {canEdit && <button onClick={handleEdit}>Edit</button>}
      {canExport && <button onClick={handleExport}>Export</button>}
      {canDelete && <button onClick={handleDelete} className="danger">Delete</button>}
    </div>
  );
}

Critical principle: frontend permission checks are for UX only. They hide buttons and menu items that the user cannot use, preventing confusion and reducing support requests. The real authorization check always happens on the server in the middleware. A sophisticated user who manipulates the frontend JavaScript to reveal hidden buttons will still be blocked by the server-side authorize() middleware when they attempt the action.
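
The same lookup can be kept outside React, which makes it easy to unit test and to reuse in non-component code such as route guards. The sketch below is a framework-agnostic helper; `createPermissionChecker`, `canAny`, and `canAll` are hypothetical names, and like the hook above it is a UX aid only, not an authorization boundary.

```typescript
type PermissionPair = readonly [resource: string, action: string];

// Build a checker from the permission list returned by the permissions endpoint.
// A Set gives constant-time lookups when many components ask the same questions.
function createPermissionChecker(permissions: readonly string[]) {
  const granted = new Set(permissions);
  const can = (resource: string, action: string): boolean =>
    granted.has(`${resource}:${action}`);

  return {
    can,
    // True when at least one of the listed permissions is held.
    canAny: (pairs: ReadonlyArray<PermissionPair>): boolean =>
      pairs.some(([resource, action]) => can(resource, action)),
    // True only when every listed permission is held.
    canAll: (pairs: ReadonlyArray<PermissionPair>): boolean =>
      pairs.every(([resource, action]) => can(resource, action)),
  };
}

const checker = createPermissionChecker(['project:read', 'project:update', 'invoice:read']);

const showEdit = checker.can('project', 'update');
const showDelete = checker.can('project', 'delete');
// Render the toolbar if any of its buttons would be visible.
const showToolbar = checker.canAny([['project', 'update'], ['project', 'delete']]);
const showBulkActions = checker.canAll([['project', 'update'], ['project', 'delete']]);
```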

Testing Authorization Thoroughly

Authorization bugs are security vulnerabilities. A missing check, a wrong role assignment, or a cache invalidation failure can grant unauthorized access to sensitive data. We test at three levels with high coverage; two of them, permission logic and API endpoint authorization, are shown here:

// Unit tests: permission logic
describe('checkPermission', () => {
  it('grants permission when user has it via a role', async () => {
    await assignRole(userId, tenantId, adminRoleId, systemUserId);
    expect(await checkPermission(userId, tenantId, 'project', 'delete')).toBe(true);
  });

  it('denies permission when user lacks it', async () => {
    await assignRole(userId, tenantId, viewerRoleId, systemUserId);
    expect(await checkPermission(userId, tenantId, 'project', 'delete')).toBe(false);
  });

  it('unions permissions across multiple roles', async () => {
    await assignRole(userId, tenantId, viewerRoleId, systemUserId);
    await assignRole(userId, tenantId, billingAdminRoleId, systemUserId);
    expect(await checkPermission(userId, tenantId, 'project', 'read')).toBe(true);  // from viewer
    expect(await checkPermission(userId, tenantId, 'billing', 'update')).toBe(true); // from billing
    expect(await checkPermission(userId, tenantId, 'project', 'delete')).toBe(false); // neither role
  });

  it('enforces tenant isolation strictly', async () => {
    await assignRole(userId, tenantA, adminRoleId, systemUserId);
    // Admin in tenant A has zero permissions in tenant B
    expect(await checkPermission(userId, tenantB, 'project', 'read')).toBe(false);
  });

  it('reflects cache invalidation after role revocation', async () => {
    await assignRole(userId, tenantId, adminRoleId, systemUserId);
    expect(await checkPermission(userId, tenantId, 'project', 'delete')).toBe(true);

    await revokeRole(userId, tenantId, adminRoleId, systemUserId);
    // Cache should be invalidated; permission should be denied immediately
    expect(await checkPermission(userId, tenantId, 'project', 'delete')).toBe(false);
  });
});

// Integration tests: API endpoint authorization
describe('DELETE /api/v1/projects/:id', () => {
  it('returns 401 for unauthenticated requests', async () => {
    const res = await request(app).delete(`/api/v1/projects/${projectId}`);
    expect(res.status).toBe(401);
  });

  it('returns 403 for viewers', async () => {
    const res = await request(app)
      .delete(`/api/v1/projects/${projectId}`)
      .set('Authorization', `Bearer ${viewerToken}`);
    expect(res.status).toBe(403);
    expect(res.body.error.code).toBe('FORBIDDEN');
  });

  it('returns 403 for non-members even with delete permission', async () => {
    const res = await request(app)
      .delete(`/api/v1/projects/${projectId}`)
      .set('Authorization', `Bearer ${adminOfOtherProjectToken}`);
    expect(res.status).toBe(403);
  });

  it('returns 404 for resources in other tenants (prevents existence leak)', async () => {
    const res = await request(app)
      .delete(`/api/v1/projects/${otherTenantProjectId}`)
      .set('Authorization', `Bearer ${adminToken}`);
    expect(res.status).toBe(404); // NOT 403
  });

  it('succeeds for project members with delete permission', async () => {
    const res = await request(app)
      .delete(`/api/v1/projects/${projectId}`)
      .set('Authorization', `Bearer ${projectMemberWithDeleteToken}`);
    expect(res.status).toBe(200);
  });
});
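
Hand-written cases like the ones above tend to cover grants better than denials. One way to make every denial explicit is to generate a full role-by-permission matrix from the same data that defines the roles. The sketch below uses a trimmed-down catalogue and two role templates; `buildPermissionMatrix` and `MatrixCase` are hypothetical names.

```typescript
// Trimmed-down catalogue and role templates, for illustration only.
const CATALOGUE: Record<string, readonly string[]> = {
  project: ['create', 'read', 'delete'],
  invoice: ['read'],
};
const ROLE_TEMPLATES: Record<string, readonly string[]> = {
  admin: ['project:create', 'project:read', 'project:delete', 'invoice:read'],
  viewer: ['project:read', 'invoice:read'],
};

type MatrixCase = { role: string; resource: string; action: string; expected: boolean };

// One case per (role, permission) pair, so every denial is asserted explicitly
// instead of being implied by the absence of a test.
function buildPermissionMatrix(
  catalogue: Record<string, readonly string[]>,
  roles: Record<string, readonly string[]>
): MatrixCase[] {
  const cases: MatrixCase[] = [];
  for (const [role, granted] of Object.entries(roles)) {
    const grantedSet = new Set(granted);
    for (const [resource, actions] of Object.entries(catalogue)) {
      for (const action of actions) {
        cases.push({
          role,
          resource,
          action,
          expected: grantedSet.has(`${resource}:${action}`),
        });
      }
    }
  }
  return cases;
}

const matrix = buildPermissionMatrix(CATALOGUE, ROLE_TEMPLATES);
const denials = matrix.filter((c) => !c.expected);
```

Each entry in `matrix` can then drive one assertion against `checkPermission`, for example through a parameterized test helper such as `it.each` in runners that provide it.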

Conclusion

Building RBAC from scratch is a significant investment (we estimate it took about 3 weeks of focused development including tests, frontend integration, and audit logging), but it gives you complete control over your authorization model and eliminates the runtime dependency on an external authorization service. The core pattern is straightforward: permissions define capabilities, roles group permissions, users are assigned roles within tenants, and a middleware checks permissions on every request with a Redis cache for performance and explicit invalidation for consistency.

The complexity lies in the details: immediate cache invalidation when roles change, resource-level checks beyond route-level checks, tenant isolation that does not leak information across boundaries, frontend integration that is helpful but not authoritative, thorough testing of every authorization path including negative cases, and audit logging that satisfies compliance requirements.

If your authorization needs are simple (route-level checks, single-tenant, no resource-level permissions), use Auth0 Authorization or a similar managed service. If you need the flexibility and control of custom authorization logic deeply integrated with your business domain and data model, building from scratch is the right choice. Just budget more time than you think it will take, and invest heavily in test coverage because authorization bugs are security bugs.
