Database Table Profiler

The table profiler in Bollard generates statistical summaries of database tables natively on the SQL engine. It avoids loading raw rows into local memory, preventing memory spikes and high latency over network connections.

1. How it Works (Pushed Aggregation)

When you run profile_table(connection="prod", table_name="users"), Bollard inspects the cached schema columns and types. It constructs and executes parallel, single-round-trip sub-queries for each column.

No Pandas, Numpy, or local processing is used. The database server performs all calculations. Only a compact metadata payload containing the stats is sent back to the MCP server:

SELECT 
  COUNT(*) AS total_rows,
  COUNT("status") AS non_null_count,
  ROUND(100.0 * COUNT(*) FILTER (WHERE "status" IS NULL) / NULLIF(COUNT(*), 0), 2) AS null_pct,
  COUNT(DISTINCT "status") AS distinct_count
FROM "users";

2. Metrics Gathered

For each column, the profile aggregates the following statistics:

  • All Columns: Total row counts, non-null counts, and null percentages (NULL%).
  • Numeric Columns (integers, floats, decimals, serials): MIN, MAX, and AVG values.
  • Datetime Columns (dates, timestamps, time): MIN (earliest entry) and MAX (latest entry) ranges.
  • Categorical Columns: If a column contains 100 or fewer distinct values, Bollard runs a value distribution query and returns the Top-5 occurring values alongside their frequencies.
Table: orders24,891 rows • 8 columns • profiled in 0.43s
ColumnTypeNull%MinMaxUnique
idinteger0%12489124891
statusvarchar2%4
countryvarchar0%47
total_amountdecimal1%0.509999.99
created_attimestamp0%2021-012024-06
status — Top 5
active
58%
pending
22%
closed
15%
refund
5%
country — Top 5
US
41%
UK
18%
IN
12%
DE
9%
CA
8%

3. Security Policy Integration

The table profiler integrates with the connection's permission policy:

  • Forbidden Table Verifications: Prior to executing profile sub-queries, Bollard checks if the target table matches any pattern in the forbidden_tables list. If matched, the profile is aborted and returns an access rejection message.
  • Intent-Based SELECT * Interception: In preview_query, bare SELECT * FROM table queries on large tables (exceeding 15 rows) are automatically intercepted and redirected to profile_table. If the target table is forbidden, the interception is bypassed to let the query validator block the request, preventing metadata leaks.