Introduction

At a glance

pgRDF: semantic web inside PostgreSQL. License: MIT. Postgres: 14, 15, 16, 17. Latest release: v0.6.14, with the native staged bulk loader (the full 8.2-billion-triple Wikidata truthy graph in a single instance) on top of the complete four-pillar surface: SPARQL SELECT/ASK/CONSTRUCT/DESCRIBE with property paths and full UPDATE, OWL 2 RL and RDFS materialization, and SHACL Core 25/25.

pgRDF is a PostgreSQL extension that turns your Postgres database into a high-performance semantic-web platform. Written in Rust on top of pgrx, it provides four engines under one extension surface:

| Engine | What it does | Entry-point UDFs |
| --- | --- | --- |
| Storage | Dictionary-encoded triples in LIST-partitioned tables with SPO / POS / OSP covering indexes. Turtle, TriG, and N-Quads ingest, plus the native staged bulk loader for graphs that won't fit a single transaction. | `pgrdf.load_turtle`, `pgrdf.load_turtle_staged_run`, `pgrdf.parse_turtle`, `pgrdf.parse_trig`, `pgrdf.parse_nquads`, `pgrdf.add_graph` |
| SPARQL | SPARQL 1.1 SELECT / ASK / CONSTRUCT / DESCRIBE + full UPDATE + property paths, parsed via `spargebra`, translated to dynamic SQL, and executed with a per-backend plan cache. | `pgrdf.sparql`, `pgrdf.construct`, `pgrdf.describe`, `pgrdf.sparql_parse` |
| Inference | OWL 2 RL and RDFS forward-chaining via the `reasonable` reasoner. Writes inferences back as queryable rows. | `pgrdf.materialize` |
| Validation | Native SHACL Core (W3C SHACL Core 25/25) via the rudof project's `shacl` crate. Returns a W3C `sh:ValidationReport`-shape JSONB document. | `pgrdf.validate` |
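
To make the Storage row more concrete, here is a minimal sketch of what dictionary encoding, LIST partitioning by graph, and SPO / POS / OSP indexes can look like in plain PostgreSQL. The table, column, and index names (`demo_dict`, `demo_quads`, and so on) are hypothetical and chosen for illustration only; they are not pgRDF's actual catalog objects, and the real layout may differ.

```sql
-- Hypothetical illustration: these names are NOT pgRDF's real schema.

-- Dictionary: each distinct IRI or literal is stored once and referenced by id.
CREATE TABLE demo_dict (
  id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  term text NOT NULL UNIQUE
);

-- Triples hold only integer ids and are LIST-partitioned by graph.
CREATE TABLE demo_quads (
  graph_id bigint NOT NULL,
  s        bigint NOT NULL,
  p        bigint NOT NULL,
  o        bigint NOT NULL
) PARTITION BY LIST (graph_id);

-- One partition per graph, here graph 100.
CREATE TABLE demo_quads_g100 PARTITION OF demo_quads FOR VALUES IN (100);

-- Three orderings; each index contains all three columns, so a triple
-- pattern can be answered from whichever ordering matches its bound terms.
CREATE INDEX demo_quads_spo ON demo_quads (s, p, o);
CREATE INDEX demo_quads_pos ON demo_quads (p, o, s);
CREATE INDEX demo_quads_osp ON demo_quads (o, s, p);

-- Encode one triple: <alice> foaf:name "Alice".
INSERT INTO demo_dict (term) VALUES
  ('http://example.org/alice'),
  ('http://xmlns.com/foaf/0.1/name'),
  ('"Alice"');

INSERT INTO demo_quads (graph_id, s, p, o)
SELECT 100, s.id, p.id, o.id
FROM demo_dict s, demo_dict p, demo_dict o
WHERE s.term = 'http://example.org/alice'
  AND p.term = 'http://xmlns.com/foaf/0.1/name'
  AND o.term = '"Alice"';

-- Decode: the pattern { ?s foaf:name ?o } restricted to graph 100.
SELECT s.term AS subject, o.term AS object
FROM demo_quads q
JOIN demo_dict p ON p.id = q.p
JOIN demo_dict s ON s.id = q.s
JOIN demo_dict o ON o.id = q.o
WHERE q.graph_id = 100
  AND p.term = 'http://xmlns.com/foaf/0.1/name';
```

The point of the design is that long IRIs are stored once, the triple rows stay narrow, and a filter on `graph_id` lets the planner prune to a single partition.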

All four engines run inside Postgres. There's no separate service, no second deployment surface, no extra protocol. Every operation is addressable from any Postgres client (psql, psycopg, sqlx, pgx, postgres.js — anything that speaks the Postgres wire protocol).

```sql
-- One-time install
CREATE EXTENSION pgrdf;

-- Load any Turtle file from the server filesystem
SELECT pgrdf.load_turtle('/fixtures/ontologies/foaf.ttl', 100);
--  → 631

-- Query with SPARQL
SELECT * FROM pgrdf.sparql(
  'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
   SELECT ?s ?n WHERE { ?s foaf:name ?n }');

-- Materialize OWL 2 RL inferences
SELECT pgrdf.materialize(100);
--  → {"base_triples": 631, "inferred_triples_written": 1842, ...}

-- Validate against SHACL shapes
SELECT pgrdf.validate(100, 200);
--  → {"conforms": true, "results": []}
```
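
Because the validation report is a JSONB document, it can be inspected with ordinary Postgres JSON operators. The following is a minimal sketch, not an official recipe. It assumes that the first argument of `pgrdf.validate` is the data graph and the second the shapes graph (this page shows only the call `pgrdf.validate(100, 200)`), and that the report carries the `conforms` and `results` keys seen in the sample output.

```sql
-- Sketch: abort the surrounding transaction when graph 100 does not conform.
-- Assumption: pgrdf.validate(data_graph, shapes_graph) returns jsonb with
-- the keys "conforms" and "results", as in the sample output on this page.
DO $$
DECLARE
  report jsonb := pgrdf.validate(100, 200);
BEGIN
  IF NOT (report ->> 'conforms')::boolean THEN
    RAISE EXCEPTION 'SHACL validation failed: %', report -> 'results';
  END IF;
END
$$;
```

Run inside a transaction, a check like this would let a load-and-validate step roll back as a unit when the shapes are violated.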

This documentation describes the v0.6.14 release: the complete, coherent four-pillar capability set plus the v0.6 native staged bulk loader. Every feature on this site is shipped, tagged, and exercised by the test bar (**294 pgrx + 93 pg_regress + 51 W3C-SPARQL + 25 W3C SHACL Core + 3 LUBM**). The remaining backlog is the graph-carving work tracked across the v0.6.n line toward the v0.7.0 graduation. Two SPARQL items are documented upstream gates (not pgRDF defects); see the SPARQL forward edge.

Ingest at scale, reason on a slice

v0.6 adds a native, multi-backend staged bulk loader: a background-worker pipeline (STAGE → DICT → RESOLVE → INDEX) that commits per phase and is resumable on failure. It ingests the full 8.2-billion-triple Wikidata truthy graph into a single PostgreSQL instance in 4 h 53 m, at 466 K triples per second on a 128-vCPU box, dictionary-encoded and with complete SPO/POS/OSP covering indexes. It also self-tunes down to ordinary hardware, so the same load runs out of the box on stock PostgreSQL. See Scale & benchmarks.
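
As a quick sanity check, the headline figures are mutually consistent: dividing the triple count by the sustained rate gives the quoted wall-clock time. The sketch below uses only built-in PostgreSQL functions and the rounded numbers from the paragraph above, so the result is approximate.

```sql
-- 8.2 billion triples at 466 K triples per second.
-- Both inputs are rounded, so treat the result as roughly 4 h 53 m.
SELECT make_interval(secs => 8.2e9 / 466e3) AS estimated_wall_clock;
```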

This frames the operating model: ingest is parallel and scales with the box, but reasoning is single-threaded — so you reason over a right-sized graph carved to fit your hardware, not the entire source graph. The composable verb chains that make this work (import → seal → carve → reason → validate → query) are described in Processes & flows.

Why pgRDF, not a separate triplestore?

Knowledge graphs are usually deployed as a second database alongside Postgres — with their own backups, IAM, observability, SLOs, and operator burden. Application code glues two systems together over the wire.

pgRDF takes a different view:

| Concern | Separate triplestore | pgRDF |
| --- | --- | --- |
| Where the graph lives | Second process tree | Inside Postgres |
| Connection pool | New | Reuses your existing pool |
| Backup / WAL / PITR | Bespoke | Inherited from Postgres |
| IAM, auth, TLS | Bespoke | Inherited from Postgres |
| Multi-tenancy | Bespoke | LIST partitions per `graph_id` |
| Composition with SQL | Bridge layer | Native: SPARQL is a set-returning SQL function |
| Monitoring | Extra agent | Standard Postgres tooling + `pgrdf.stats()` |
| Install footprint | Cluster, operator, CRD | Three files bind-mounted onto a stock PG image |

The four standard semantic-web operations (loading Turtle, querying with SPARQL, materializing OWL 2 RL inferences, and validating with SHACL) are all first-class SQL functions.
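
Since SPARQL is exposed as a set-returning SQL function, its result can be used wherever a relation is expected. The sketch below wraps the quick-start query in a CTE. It assumes pgRDF is installed and the FOAF fixture has been loaded as in the quick start. The column layout of `pgrdf.sparql` results is not specified on this page, so the example deliberately uses only `count(*)`, which works for any row shape.

```sql
-- Sketch: treat SPARQL output as an ordinary relation inside a SQL query.
-- Assumption: the FOAF data from the quick start has already been loaded.
WITH named AS (
  SELECT *
  FROM pgrdf.sparql(
    'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
     SELECT ?s ?n WHERE { ?s foaf:name ?n }')
)
SELECT count(*) AS named_resources
FROM named;
```

The same pattern extends to joins against application tables once the result columns are known for your release.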

Audience

This site is the user-facing documentation. It assumes you know SQL and the basics of at least one of RDF, SPARQL, OWL, and SHACL; you do not need all four. Each pillar's documentation can be read independently.

Project managers: start with The four pillars to scope what pgRDF can absorb from your backlog.
Data scientists: start with Storage and SPARQL queries to see how you'd join graph data against your existing SQL tables.
Ontologists: start with Inference and Validation to see which slice of OWL 2 RL and SHACL Core runs at the storage layer.
Backend engineers: start with Compose with regular SQL to see how to call pgRDF from your existing connection pool.
Operators: start with Drop-in install, pgrdf.stats(), and Scale & benchmarks.

Where this content comes from

This site is the published companion to pgRDF's engineering specs, most directly SPEC.pgRDF.LLD.v0.6.14.md, the as-built low-level design. Each page is grounded in the shipped v0.6.14 source and cites the test fixture in styk-tv/pgRDF/tests/ that pins the contract, so every documented feature is exercised by a test in CI. The whole v0.6.14 surface is callable on main today.

The forward backlog (graph carving, per-graph index reorganization, RDF 1.2 triple terms, a native SHACL-SPARQL engine, federated SERVICE, and incremental materialisation) is tracked in SPEC.pgRDF.LLD.v0.6-FUTURE.md and on the Roadmap. Two of those items depend on upstream crates that haven't shipped the required surface yet; they are documented upstream gates, not missing pgRDF features.

Get the bits

Releases: https://github.com/styk-tv/pgRDF/releases/tag/v0.6.14. The v0.6.14 release ships per-PG binary tarballs (pg14–17 × amd64/arm64) plus SHA256SUMS, and is marked Latest.
OCI bundle: `ghcr.io/styk-tv/pgrdf-bundle:0.6.14`. Anonymously pullable with zero credentials: `oras pull ghcr.io/styk-tv/pgrdf-bundle:0.6.14`. Every digest carries a verifiable SLSA Build Provenance v1 attestation (`gh attestation verify oci://ghcr.io/styk-tv/pgrdf-bundle:0.6.14 --repo styk-tv/pgRDF`).
CHANGELOG: https://github.com/styk-tv/pgRDF/blob/main/CHANGELOG.md.
Source: https://github.com/styk-tv/pgRDF.
crates.io: https://crates.io/crates/pgrdf (namespace placeholder; the OCI bundle and release tarball are the consumption path).

See Drop-in install for the five-minute path on stock postgres:17 containers via per-file :ro bind mounts (.so / .control / .sql), and the credential-free OCI pull.

