
Introduction

At a glance

pgRDF · Semantic web inside PostgreSQL · License MIT · Postgres 14 · 15 · 16 · 17 · Latest release v0.6.14 — the native staged bulk loader (full 8.2-billion-triple Wikidata truthy graph in a single instance) on top of the complete four-pillar surface: SPARQL SELECT/ASK/CONSTRUCT/DESCRIBE + property paths + full UPDATE, OWL 2 RL and RDFS materialization, SHACL Core 25/25 · GitHub releases · anonymous OCI bundle · GitHub · Install →

pgRDF is a PostgreSQL extension that turns your Postgres database into a high-performance semantic-web platform. Written in Rust on top of pgrx, it provides four engines under one extension surface:

| Engine | What it does | Entry-point UDFs |
| --- | --- | --- |
| Storage | Dictionary-encoded triples in LIST-partitioned tables with SPO / POS / OSP covering indexes. Turtle, TriG, and N-Quads ingest, plus the native staged bulk loader for graphs that won't fit a single transaction. | pgrdf.load_turtle, pgrdf.load_turtle_staged_run, pgrdf.parse_turtle, pgrdf.parse_trig, pgrdf.parse_nquads, pgrdf.add_graph |
| SPARQL | SPARQL 1.1 SELECT / ASK / CONSTRUCT / DESCRIBE + full UPDATE + property paths, parsed via spargebra, translated to dynamic SQL, executed with a per-backend plan cache. | pgrdf.sparql, pgrdf.construct, pgrdf.describe, pgrdf.sparql_parse |
| Inference | OWL 2 RL and RDFS forward-chaining via the reasonable reasoner. Writes inferences back as queryable rows. | pgrdf.materialize |
| Validation | Native SHACL Core (W3C SHACL Core 25/25) via the rudof project's shacl crate. Returns a W3C sh:ValidationReport-shape JSONB document. | pgrdf.validate |
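
To make the Storage row concrete, here is a minimal Python sketch of dictionary encoding with three sorted key orders (SPO, POS, OSP). It is a toy illustration of the general technique only: the class name `TinyTripleStore`, its methods, and the in-memory sorted lists are assumptions made for this example, not pgRDF's schema or code.

```python
from bisect import bisect_left


class TinyTripleStore:
    """Toy dictionary-encoded store with three sorted permutation indexes."""

    def __init__(self):
        self.term_to_id = {}   # dictionary: RDF term -> small integer
        self.id_to_term = []   # reverse lookup: integer -> RDF term
        self.spo, self.pos, self.osp = [], [], []

    def encode(self, term):
        # Assign the next integer id the first time a term is seen.
        if term not in self.term_to_id:
            self.term_to_id[term] = len(self.id_to_term)
            self.id_to_term.append(term)
        return self.term_to_id[term]

    def add(self, s, p, o):
        s, p, o = self.encode(s), self.encode(p), self.encode(o)
        # Same triple under three key orders, each list kept sorted.
        for index, key in ((self.spo, (s, p, o)),
                           (self.pos, (p, o, s)),
                           (self.osp, (o, s, p))):
            i = bisect_left(index, key)
            if i == len(index) or index[i] != key:   # skip duplicates
                index.insert(i, key)

    def subjects(self, p, o):
        """Answer the pattern (?s, p, o) with a prefix scan on POS."""
        if p not in self.term_to_id or o not in self.term_to_id:
            return []
        prefix = (self.term_to_id[p], self.term_to_id[o])
        i = bisect_left(self.pos, prefix)
        found = []
        while i < len(self.pos) and self.pos[i][:2] == prefix:
            found.append(self.id_to_term[self.pos[i][2]])
            i += 1
        return found


store = TinyTripleStore()
store.add("ex:alice", "foaf:name", '"Alice"')
store.add("ex:bob", "foaf:knows", "ex:alice")
store.add("ex:carol", "foaf:knows", "ex:alice")
print(store.subjects("foaf:knows", "ex:alice"))  # ['ex:bob', 'ex:carol']
```

Each key order serves a different access pattern: SPO when the subject is bound, POS when the predicate and object are bound, OSP when only the object is bound. Storing integers rather than full IRIs keeps the index entries small.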

All four engines run inside Postgres. There's no separate service, no second deployment surface, no extra protocol. Every operation is addressable from any Postgres client (psql, psycopg, sqlx, pgx, postgres.js — anything that speaks the Postgres wire protocol).

sql
-- One-time install
CREATE EXTENSION pgrdf;

-- Load any Turtle file from the server filesystem
SELECT pgrdf.load_turtle('/fixtures/ontologies/foaf.ttl', 100);
--  → 631

-- Query with SPARQL
SELECT * FROM pgrdf.sparql(
  'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
   SELECT ?s ?n WHERE { ?s foaf:name ?n }');

-- Materialize OWL 2 RL inferences
SELECT pgrdf.materialize(100);
--  → {"base_triples": 631, "inferred_triples_written": 1842, ...}

-- Validate against SHACL shapes
SELECT pgrdf.validate(100, 200);
--  → {"conforms": true, "results": []}
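
The SPARQL engine is described above as translating queries to dynamic SQL. The general idea can be sketched in Python: each triple pattern becomes one alias of a triples table, constants become bound parameters, and a variable shared between patterns becomes a join condition. Everything here is an assumption for illustration: the function `bgp_to_sql`, the plain-text `triples(s, p, o)` table, and the use of SQLite as a stand-in database. pgRDF's actual translator, schema, and generated SQL are not shown on this page.

```python
import sqlite3


def bgp_to_sql(patterns):
    """Translate triple patterns into one self-join over triples(s, p, o).

    Terms starting with '?' are variables; anything else is a constant.
    Returns (sql, params, variable_names).
    """
    first_seen = {}              # variable -> "tN.col" where it first occurs
    conditions, params = [], []
    for n, pattern in enumerate(patterns):
        for col, term in zip("spo", pattern):
            ref = f"t{n}.{col}"
            if not term.startswith("?"):
                conditions.append(f"{ref} = ?")   # constant: bound parameter
                params.append(term)
            elif term in first_seen:
                conditions.append(f"{ref} = {first_seen[term]}")  # join
            else:
                first_seen[term] = ref
    select = ", ".join(f'{ref} AS "{var[1:]}"'
                       for var, ref in first_seen.items()) or "1"
    tables = ", ".join(f"triples AS t{n}" for n in range(len(patterns)))
    where = " AND ".join(conditions) or "1 = 1"
    names = [var[1:] for var in first_seen]
    return f"SELECT {select} FROM {tables} WHERE {where}", params, names


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
db.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
])

# { ?s foaf:knows ?friend . ?friend foaf:name ?n }
sql, params, names = bgp_to_sql([("?s", "foaf:knows", "?friend"),
                                 ("?friend", "foaf:name", "?n")])
rows = db.execute(sql, params).fetchall()
print(names)  # ['s', 'friend', 'n']
print(rows)   # [('ex:alice', 'ex:bob', 'Bob')]
```

Because the output is ordinary SQL, the database planner chooses the join order and indexes, which is the main attraction of translating rather than interpreting the query.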

This documentation describes the v0.6.14 release: the complete, coherent four-pillar capability set plus the v0.6 native staged bulk loader. Every feature on this site is shipped, tagged, and exercised by the test bar (**294 pgrx + 93 pg_regress + 51 W3C-SPARQL + 25 W3C SHACL Core + 3 LUBM**). The remaining backlog is the graph-carving work tracked across the v0.6.n line toward the v0.7.0 graduation; two SPARQL items are documented upstream gates (not pgRDF defects); see the SPARQL forward edge.

Ingest at scale, reason on a slice

v0.6 adds a native, multi-backend staged bulk loader: a background-worker pipeline (STAGE → DICT → RESOLVE → INDEX) that commits per phase and is resumable on failure. It ingests the full 8.2-billion-triple Wikidata truthy graph into a single PostgreSQL instance — 4 h 53 m at 466 K triples per second on a 128-vCPU box, dictionary-encoded with a complete SPO/POS/OSP hexastore, and it self-tunes down to ordinary hardware so the same load runs out-of-the-box on stock PostgreSQL. See Scale & benchmarks.
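
The "commits per phase and is resumable on failure" behaviour can be sketched in a few lines of Python. This is a hedged illustration of the pattern only: `run_load`, the `checkpoint` list (standing in for durable per-phase state), and the simulated crash are all hypothetical, and the real loader's background workers and phase internals are not modelled.

```python
PHASES = ["STAGE", "DICT", "RESOLVE", "INDEX"]


def run_load(phase_fns, checkpoint):
    """Run the phases in order, recording each completed one in `checkpoint`.

    `checkpoint` stands in for durable state (hypothetical), such as a
    status row committed at the end of each phase. Phases already recorded
    are skipped, so a rerun resumes where the last attempt stopped.
    """
    for name in PHASES:
        if name in checkpoint:
            continue              # committed by an earlier attempt
        phase_fns[name]()         # may raise; nothing is recorded if it does
        checkpoint.append(name)   # "commit" the phase
    return checkpoint


calls = []                        # phases that actually did work
attempts = {"RESOLVE": 0}


def make_phase(name):
    def phase():
        if name == "RESOLVE":
            attempts["RESOLVE"] += 1
            if attempts["RESOLVE"] == 1:
                raise RuntimeError("simulated crash during RESOLVE")
        calls.append(name)
    return phase


phase_fns = {name: make_phase(name) for name in PHASES}
checkpoint = []
try:
    run_load(phase_fns, checkpoint)   # first attempt: stops inside RESOLVE
except RuntimeError:
    pass
run_load(phase_fns, checkpoint)       # second attempt: resumes at RESOLVE
print(calls)  # ['STAGE', 'DICT', 'RESOLVE', 'INDEX']
```

STAGE and DICT run exactly once even though the load was started twice, which is the property that matters when each phase takes hours.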

This frames the operating model: ingest is parallel and scales with the box, but reasoning is single-threaded — so you reason over a right-sized graph carved to fit your hardware, not the entire source graph. The composable verb chains that make this work (import → seal → carve → reason → validate → query) are described in Processes & flows.
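
For readers new to forward chaining, the sketch below shows what "reasoning" means operationally: rules are applied repeatedly until a pass produces no new triples. It implements just two RDFS entailment rules in naive Python. The function name `forward_chain` and the tuple representation are assumptions for this example; it is not the reasonable reasoner's algorithm and says nothing about its performance.

```python
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"


def forward_chain(triples):
    """Apply two RDFS rules until no new triple appears (a fixpoint).

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)       => (x type B)
    Returns only the inferred triples, not the input.
    """
    base = set(triples)
    known = set(base)
    while True:
        subclass_pairs = [(s, o) for s, p, o in known if p == SUBCLASS]
        new = set()
        for a, b in subclass_pairs:
            for s, p, o in known:
                if p == SUBCLASS and s == b:
                    new.add((a, SUBCLASS, o))   # rdfs11
                if p == TYPE and o == a:
                    new.add((s, TYPE, b))       # rdfs9
        new -= known
        if not new:
            return known - base
        known |= new


inferred = forward_chain([
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:alice", TYPE, "ex:Student"),
])
print(sorted(inferred))
# [('ex:Student', 'rdfs:subClassOf', 'ex:Agent'),
#  ('ex:alice', 'rdf:type', 'ex:Agent'),
#  ('ex:alice', 'rdf:type', 'ex:Person')]
```

Every pass re-reads the whole set of known triples, so the cost grows with the size of the graph being reasoned over. That is the intuition behind reasoning on a carved slice rather than on the full source graph.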

Why pgRDF, not a separate triplestore?

Knowledge graphs are usually deployed as a second database alongside Postgres — with their own backups, IAM, observability, SLOs, and operator burden. Application code glues two systems together over the wire.

pgRDF takes a different view:

| Concern | Separate triplestore | pgRDF |
| --- | --- | --- |
| Where the graph lives | Second process tree | Inside Postgres |
| Connection pool | New | Reuses your existing pool |
| Backup / WAL / PITR | Bespoke | Inherited from Postgres |
| IAM, auth, TLS | Bespoke | Inherited from Postgres |
| Multi-tenancy | Bespoke | LIST partitions per graph_id |
| Composition with SQL | Bridge layer | Native: SPARQL is a set-returning SQL function |
| Monitoring | Extra agent | Standard Postgres tooling + pgrdf.stats() |
| Install footprint | Cluster, operator, CRD | Three files bind-mounted onto a stock PG image |

You also get the four standard semantic-web operations as first-class features: load Turtle, SPARQL query, OWL 2 RL materialize, and SHACL validate.

Audience

This site is the user-facing documentation. It assumes you know SQL and the basics of at least one of RDF, SPARQL, OWL, and SHACL, but not necessarily all four. Each pillar's documentation can be read independently.

Where this content comes from

This site is the published companion to pgRDF's engineering specs, most directly SPEC.pgRDF.LLD.v0.6.14.md, the as-built low-level design. Each page is grounded in the shipped v0.6.14 source and cites the test fixture in styk-tv/pgRDF/tests/ that pins the contract, so every feature is exercised by a test in CI. The whole v0.6.14 surface is callable on main today.

The forward backlog (graph carving, per-graph index reorganization, RDF 1.2 triple terms, a native SHACL-SPARQL engine, federated SERVICE, incremental materialisation) is tracked in SPEC.pgRDF.LLD.v0.6-FUTURE.md and on the Roadmap. Two of those items depend on upstream crates that haven't shipped the required surface yet; they are documented upstream gates, not missing pgRDF features.

Get the bits

See Drop-in install for the five-minute path on stock postgres:17 containers via per-file :ro bind mounts (.so / .control / .sql), and the credential-free OCI pull.


MIT licensed. Documentation for pgRDF — built with VitePress, served via GitHub Pages.