pg_catalog_rs is a PostgreSQL system catalog compatibility layer for Apache DataFusion.
It enables PostgreSQL clients (e.g., DBeaver, pgAdmin, JDBC, ODBC, BI platforms) to connect to a DataFusion-backed database by emulating the behavior of PostgreSQL's pg_catalog schema.
This makes it possible to support tooling that expects PostgreSQL system metadata, even if the underlying engine is not Postgres.
Note: This is WIP heavily and API can change.
-
Emulates core system tables from the
pg_catalogschema:pg_classpg_attributepg_namespacepg_typepg_proc
-
Supports PostgreSQL-specific built-in functions:
pg_get_constraintdef(oid)current_database()has_schema_privilege(...)
-
Allows standard metadata queries used by:
- DBeaver, DataGrip, and other GUI tools
- BI tools using JDBC or ODBC
- Postgres CLI (
psql)
-
Compatible with
pgwirefor Postgres wire protocol handling -
Fully in-memory and extensible via DataFusion APIs
Add the crate from GitHub to your Cargo.toml:
[dependencies]
datafusion_pg_catalog = { git = "https://github.com/ybrs/pg_catalog" }Create a SessionContext preloaded with the catalog tables and register your own schema:
use std::collections::BTreeMap;
use pg_catalog_rs::{
get_base_session_context, register_user_database, register_schema,
register_user_tables, ColumnDef,
};
let (ctx, _log) = get_base_session_context(
Some("pg_catalog_data/pg_schema"),
"pgtry".to_string(),
"public".to_string(),
).await?;
register_user_database(&ctx, "crm").await?;
register_schema(&ctx, "crm", "public").await?;
let mut cols = BTreeMap::new();
cols.insert(
"id".to_string(),
ColumnDef { col_type: "int".to_string(), nullable: false },
);
register_user_tables(&ctx, "crm", "public", "users", vec![cols]).await?;Then you can run queries like:
SELECT oid, relname FROM pg_class WHERE relnamespace = 2200;
SELECT attname FROM pg_attribute WHERE attrelid = 12345;Or use DBeaver/psql to introspect the schema.
When executing SQL, call dispatch_query to automatically
route catalog queries to the internal handler while forwarding all other
statements to your application logic.
use pg_catalog_rs::router::dispatch_query;
let result = dispatch_query(&ctx, "SELECT * FROM pg_class", |ctx, sql| async move {
ctx.sql(sql).await?.collect().await
}).await?;- Works with the
pgwirecrate for wire protocol emulation - Can be combined with custom
MemTableor real storage backends (Parquet, Arrow, etc.) - Designed to be embedded in hybrid SQL engines or compatibility layers
Launch a PostgreSQL-compatible endpoint using the provided helper:
use std::sync::Arc;
use pg_catalog_rs::{get_base_session_context, start_server};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let (ctx, _log) = get_base_session_context(
Some("pg_catalog_data/pg_schema"),
"pgtry".to_string(),
"public".to_string(),
).await?;
start_server(Arc::new(ctx), "127.0.0.1:5433", "pgtry", "public", None).await?;
Ok(())
}If you already expose your own SQL server you can integrate pg_catalog_rs without
starting the bundled pgwire endpoint. Build a SessionContext and register your
tables, then call dispatch_query from your handler. Catalog
queries will be executed internally and all other statements are forwarded.
use pg_catalog_rs::router::dispatch_query;
async fn handle_request(ctx: &SessionContext, sql: &str) -> DFResult<Vec<RecordBatch>> {
let (batches, _schema) = dispatch_query(ctx, sql, None, None, |ctx, sql| async move {
ctx.sql(sql).await?.collect().await
}).await?;
Ok(batches)
}- ❌ No persistence — catalog is in-memory only
- 🟡 Partial function support (more can be added)
- 🟠 Schema reflection based on user-defined tables must be manually tracked
- ❌ No write-back support to catalog (read-only)
- Hook into
CREATE TABLEto auto-populate metadata - Add missing catalog tables (
pg_index,pg_constraint, etc.) - Catalog persistence to disk or external store
- Enhanced type inference and function overloads
Functional tests are written in Python and executed with pytest.
After installing the dependencies from requirements.txt, run:
pytestLicensed under either of:
at your option.
Pull requests are welcome. Contributions that improve PostgreSQL compatibility or support new clients are especially appreciated.
Before submitting a change please run the Rust unit tests and the Python functional tests:
cargo test
pytestThis project is inspired by:
- PostgreSQL's system catalog architecture
- Apache DataFusion as the execution backend
pgwirefor wire protocol support