Engineering Deep Dive

Building DBPilot: An AI-Powered Database Assistant That Talks to Your Localhost

By Sreehari · April 2026 · 18 min read

01 The Problem

Every company has databases. Most people who need to query them — a support engineer triaging a bug, a product manager checking conversion numbers, an intern investigating a customer issue — either don't know SQL or don't know the schema well enough to write it confidently. The options today are all bad:

Ask a developer. Slow, disruptive, doesn't scale.
Build an internal dashboard. Expensive, goes stale, doesn't cover one-off needs.
Give everyone DBeaver. Now you have untrained people writing DELETE FROM users by accident.

But there's a deeper problem nobody talks about: the localhost access gap. Most development databases, staging environments, and internal tools run on private machines. A hosted AI tool can't just dial out to localhost:5432. You'd have to expose your database to the internet — which is exactly the kind of thing you should never do.

So I built DBPilot — an app that lets anyone query a database in plain English, with a secure tunnel that reaches databases running on your laptop. No SQL required. No port forwarding. No VPN.

02 What DBPilot Does

At its core, DBPilot takes a plain English question, generates the right query, runs it safely, and shows you the results in a clean table. That's the whole thing.

Natural language → SQL / MongoDB PostgreSQL MySQL MongoDB Read-only enforcement WebSocket local tunnel Schema caching Persistent chat history

The piece that makes it actually usable for real teams is the local tunnel. You run a small CLI tool on your machine (npx dbpilot-agent), it connects back to the DBPilot server over a WebSocket, and from that point your localhost database is queryable through the app — without any firewall changes, port forwarding, or exposing anything to the internet.

03 Tech Stack

Angular 18 PrimeNG v19 Tailwind CSS Node.js + Express Prisma + PostgreSQL OpenAI GPT-4o WebSockets (ws) Docker + Nginx AES-256-GCM node-sql-parser

Angular 18 with standalone components was the right call here — the UI has enough complexity (modals, guards, interceptors, conditional views) that a structured framework pays for itself. I used signals instead of RxJS for local component state; it's less ceremony for simple reactive values.

Prisma over a raw query builder because the app database schema is fixed and relational — two models, clean migrations, and type-safe queries without any boilerplate. The user's databases are a different story: those are connected at runtime with native drivers (pg, mysql2, mongodb), not through Prisma.

node-sql-parser is the unsung hero of the safety layer. It parses SQL into an AST, which lets me check the statement type reliably — no regex tricks, no edge-case bypasses.

04 Architecture Overview

The project is three pieces:


┌─────────────────────────────────────────────────────────────────┐
│                    FRONTEND (Angular 18)                        │
│                                                                 │
│  ┌───────────┐  ┌────────────┐  ┌───────────┐  ┌────────────┐   │
│  │  Landing  │  │ Dashboard  │  │   Chat    │  │ Connect DB │   │
│  │   Page    │  │(connections│  │ Interface │  │   Modal    │   │
│  └───────────┘  └────────────┘  └─────┬─────┘  └────────────┘   │
│                                       │                         │
└───────────────────────────────────────┼─────────────────────────┘
                                        │ HTTPS
┌───────────────────────────────────────┼─────────────────────────┐
│                  BACKEND (Node.js + Express)                    │
│                                       │                         │
│  ┌──────────┐  ┌────────────┐  ┌────────────┐  ┌───────────────┐│
│  │   Auth   │  │ Connections│  │  AI Query  │  │ Agent Gateway ││
│  │  (JWT)   │  │  (CRUD)    │  │  Engine    │  │  (WebSocket)  ││
│  └──────────┘  └────────────┘  └────────┬───┘  └───────┬───────┘│
│                                         │              │        │
│  ┌──────────────────────────────────────┘              │        │
│  │  Query Safety Layer (AST validation)              ┌─┘        │
│  └────────────┬───────────────────────────────────   │          │
│               │ Direct connection                    │ WebSocket│
│               ▼                                      ▼          │
│         Remote DB                           dbpilot-agent       │
│         (PostgreSQL /                       (on user's machine) │
│          MySQL / MongoDB)                         │             │
│                                                   ▼             │
│                                           localhost:5432        │
│                                                                 │
│       Application DB: Prisma + PostgreSQL                       │
└─────────────────────────────────────────────────────────────────┘

The backend is the brain — it handles auth, stores encrypted connection credentials, drives the AI, validates queries, and manages the WebSocket tunnel. The frontend is the interface. The agent CLI is the bridge to your local machine.

05 The AI Query System

The query flow is straightforward: you type a question, the backend grabs your database schema, sends both to GPT-4o with a tightly scoped system prompt, gets back a SQL query and an explanation, validates it, runs it, and returns the results.

The Schema Prompt

The AI gets just enough context to write a correct query — table names, column names, and types. No raw data, no credentials. For PostgreSQL and MySQL it looks like this:

Table: "orders"
Columns: "id" (integer), "user_id" (integer), "total" (numeric), "created_at" (timestamp)

Table: "users"
Columns: "id" (integer), "email" (text), "created_at" (timestamp)

For MongoDB, it's the same idea but with collections and field types. The schema is extracted once when you test a connection and cached in the app database — so queries don't pay the extraction cost every time.

Mock Fallback

If OPENAI_API_KEY isn't set, the AI service falls back to a keyword-matching mock engine that pattern-matches words like "count", "recent", "list" against the schema and constructs a plausible query. Every mock response is labelled [MOCK] in the UI. This was genuinely useful — the entire app worked end-to-end before I had a real API key wired in.

06 The Query Safety Layer

Users are connecting real databases. Piping AI output directly to a database driver is not an option. The safety layer has two jobs: block mutations and cap result size.

const BLOCKED_TYPES = new Set([
  'delete', 'update', 'insert', 'drop', 'alter',
  'truncate', 'create', 'replace', 'grant', 'revoke',
]);

export function validateAndSanitizeQuery(sql: string): string {
  const ast = parser.astify(sql);

  for (const stmt of ast) {
    if (stmt.type !== 'select') {
      throw new QuerySafetyError('Only SELECT statements are allowed.');
    }
  }

  // Append LIMIT 100 if not already present
  return /\bLIMIT\b/i.test(sql) ? sql : `${sql} LIMIT 100`;
}

Why AST, not regex?

A regex blocklist can be bypassed with SQL comments, alternate casing, or multi-statement batches. Parsing to an AST and checking stmt.type is the only approach that's actually reliable. No amount of obfuscation changes the parsed statement type.

For MongoDB, the pipeline coming from the AI is checked for write-stage operators — $out, $merge, and friends. If anything other than a read operation comes back from the model, a QuerySafetyError is thrown with a 400 before a single byte touches the database.

07 The Hardest Problem: Reaching Localhost

The app is hosted at dbpilot.iamsreehari.in. A user wants to query their local PostgreSQL running on localhost:5432. The server can't initiate a TCP connection to that address — it would just resolve to itself or fail. The standard solutions are all painful: set up a VPN, configure port forwarding, use ngrok. None of that is acceptable for a product targeting non-developers.

The Solution: Reverse WebSocket Tunnel

Key Insight

Instead of the server connecting to the user's machine (impossible through firewalls and NAT), the user's machine connects to the server. Outbound connections are always allowed. The server then sends query requests over that already-established connection — reversing the normal direction.

Here's the full flow:


Browser
  │  POST /api/ai/query  (question + dbId)
  ▼
DBPilot Server
  │  Sees isLocal = true
  │  Finds user's active WebSocket in agentSockets Map
  │  Sends QUERY_REQUEST over the socket
  ▼
dbpilot-agent  (running on your machine)
  │  Receives QUERY_REQUEST
  │  Runs query against localhost:5432
  │  Sends QUERY_RESPONSE
  ▼
DBPilot Server
  │  Resolves the pending Promise
  │  Returns rows to the browser
  ▼
Browser → renders results table

How the Server Side Works

The server keeps a Map<userId, WebSocket>. When a query comes in for a local connection, it looks up the right socket and sends a request with a UUID. The pending map holds the resolve/reject pair for each in-flight request — when the response arrives, the matching Promise resolves. Requests that don't come back within 30 seconds are rejected automatically.

export function executeQueryViaAgent(userId: string, params): Promise<QueryResult> {
  const ws = agentSockets.get(userId);
  if (!ws) throw new Error('Agent not connected');

  const requestId = uuidv4();
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('Agent timeout')), 30000);
    pending.set(requestId, { resolve, reject, timer });
    ws.send(JSON.stringify({ type: 'QUERY_REQUEST', requestId, ...params }));
  });
}

One Subtle Detail: Sharing the HTTP Server

The WebSocket server shares the same port as Express. This isn't the obvious path — most examples show creating a separate HTTP server. The trick is to create the Node.js http.Server manually, mount Express on it, and pass the same reference to WebSocketServer. The WebSocket library handles the upgrade event internally.

const server = http.createServer(app);
initAgentGateway(server); // attaches WebSocket to the same HTTP server
server.listen(PORT);

08 The dbpilot-agent CLI

The agent is a standalone npm package. You run it with your JWT token (the same one you get when you log in) and it connects back to the server over a WebSocket:

npx dbpilot-agent --token <your-jwt-token>

Auto-Reconnect

Network connections drop. Corporate proxies time out idle WebSockets. The agent reconnects automatically with exponential backoff — starting at 5 seconds, doubling each attempt, capping at 60 seconds. When the connection comes back, the backoff resets.

URL Normalisation

Users copy URLs from their browser bar. The agent normalises whatever you pass — it handles http://, https://, and raw wss:// addresses and converts them to the right WebSocket URL. You can pass the app URL or the raw socket address and it just works.

Corporate Networks

Corporate SSL inspection creates self-signed certificates that break TLS validation. The agent supports --insecure for a quick fix (disables certificate verification) and NODE_EXTRA_CA_CERTS for the proper approach — pointing to your corporate CA bundle.

Agent Status Indicator

The connection modal in the frontend shows a live badge — Agent Connected or Agent Not Connected. One button hit checks GET /api/agent/status, which looks up the agentSockets map and returns { connected: true/false }. Simple, but it removes a huge amount of confusion for new users.

09 Authentication & Security

JWT Authentication

Signup and login produce a JWT access token signed with a minimum 32-character secret. Every protected route sits behind an authenticateToken middleware that verifies the token and attaches the userId to the request. The same token authenticates the WebSocket connection from the agent.

Credential Encryption

Every saved connection's credentials — host, port, username, password, database name — are encrypted with AES-256-GCM before being written to the app database. The key lives in an environment variable. Each field gets a unique IV, so identical values produce different ciphertexts. Credentials are decrypted in memory, used, and discarded — they never appear in API responses.

Design Note

The initial implementation only encrypted the password. In practice, the hostname and database name also reveal sensitive infrastructure details. The final version encrypts all credential fields separately.

Read-Only by Design

The SQL safety layer makes the app structurally incapable of running mutations. Even if GPT-4o hallucinated a DROP TABLE, the AST parser catches it before it reaches the database driver. No config flag, no mode switch — it's enforced at the code level.

10 Frontend Architecture

Angular 18 with standalone components throughout — no NgModules. State is signals and services, not a state library. PrimeNG v19 handles the heavy UI components (data tables, dialogs, dropdowns) while Tailwind handles layout and spacing.

Key Components

Component	What it does
`ChatComponent`	The main query interface — tracks messages as a signal, sends questions to the AI endpoint, renders the SQL and a collapsible result table
`DashboardComponent`	Grid of saved connections with type icons, local badges, schema cache timestamps, and quick-action buttons
`ConnectDbModalComponent`	Add/edit connection form — URL mode vs host/port mode, the "localhost database" toggle, and the agent status indicator
`LandingComponent`	Public marketing page

Auth Interceptor & Route Guard

An HttpInterceptor attaches the Authorization: Bearer … header to every outgoing request, reading the token from localStorage. An AuthGuard redirects unauthenticated users to /login. Both are registered once and just work everywhere.

11 Theming System

PrimeNG v19 dropped the old single-file theming approach in favour of separate CSS bundles per theme. That sounds simple, but dynamically switching themes at runtime — without a page reload — took some work.

The solution: all theme CSS bundles are listed in angular.json as named build entries with "inject": false. Angular builds them but doesn't load any of them on startup. At runtime, ThemeService reads the active theme from localStorage, builds the right bundle filename, and programmatically injects a <link> element. When the user switches themes, the old link is removed, the new one is inserted — no reload, no flash.

Theme Family	Variants	Feel
Noir	Light / Dark	High-contrast, minimal
Arctic	Light / Dark	Cool blue, clean
Ember	Light / Dark	Warm orange tones
Ocean	Light / Dark	Deep teal, calm
Aurora	Light / Dark	Violet-green energy

A sun/moon toggle in the topbar switches dark/light within the current family. A colour picker lets users choose the family. The preference persists in localStorage.

12 Deployment

Backend and frontend are containerised with Docker and deployed on a Linux VM behind Nginx. The interesting part is the WebSocket proxy config:

location /ws/agent {
    proxy_pass http://backend:3000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600s;
}

The proxy_read_timeout is the important line. Without it, Nginx closes the WebSocket after 60 seconds of inactivity — which completely breaks the tunnel. Setting it to an hour keeps long-lived agent connections alive.

13 Lessons Learned

1. Sharing an HTTP port with WebSocket is not obvious

Most documentation shows running the WebSocket server on a separate port. Running them on the same port (which you want, so Nginx only needs one proxy rule) requires creating the http.Server manually and passing it to both Express and WebSocketServer. Once you see it, it's simple. Getting there took an embarrassing amount of time.

2. Always add timeouts to pending Promises

The first version of the agent gateway had no request timeout. If the local database was slow or the agent crashed mid-query, the server-side Promise would hang until the process restarted. A 30-second timeout on every pending entry fixed this — and the cleanup path in the WebSocket close handler rejects any remaining pending requests instead of leaking them.

3. AST parsing beats regex for safety-critical validation

I considered a regex blocklist early on. SQL has enough edge cases — inline comments, Unicode, multi-statement batches — that any regex approach can be bypassed. Parsing to an AST and checking stmt.type is the only reliable approach. This is one of those cases where the "simple" solution is actually the dangerous one.

4. Schema extraction lives in two places

Schema extraction is implemented once in the backend (for remote connections) and again in the agent (for local ones). Both sides return the same TableSchema[] structure, but the code is duplicated. It's a minor maintenance cost, but worth knowing about before you add a new database type.

5. Encrypt all credential fields, not just the password

The hostname, port, and database name reveal infrastructure details too. The final version encrypts every credential field separately with unique IVs. Encrypting only the password felt complete until I thought about what an attacker could infer from an unencrypted hostname.

6. PrimeNG's "inject: false" is the key to runtime theme switching

Once you understand that Angular's build pipeline can produce CSS bundles that aren't loaded automatically, the whole pattern clicks. Build all the themes, load none of them, inject the right one at runtime via a <link> tag. It works cleanly and the switch is instant.

14 What's Next

Query history — persist past queries per connection so you can review and replay them
Shareable results — generate a time-limited shareable link for a result set
Multiple agents — support one user running the agent on multiple machines at the same time
Team workspaces — shared connections across an organisation with role-based access