Pillar 1 — Semantic storage

RDF storage in PostgreSQL. Turtle goes in; dictionary-encoded, hexastore-indexed, LIST-partitioned quads come out — addressable from any Postgres client.

v0.6 headline — the native staged bulk loader

v0.6 adds a native, multi-backend staged bulk loader: a background-worker pipeline that commits after each phase (STAGE → DICT → RESOLVE → INDEX) and can resume after a failure. It loads the full 8.2-billion-triple Wikidata truthy graph into a single PostgreSQL instance, dictionary-encoded with a complete SPO/POS/OSP hexastore. It also self-tunes down to ordinary hardware, so the same load works out of the box on stock PostgreSQL.
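To make the commit-per-phase, resumable idea concrete, here is a minimal sketch in plain PostgreSQL (11 or later). It is not pgrdf's implementation: the table `demo_load_progress` and the procedure `demo_staged_load` are hypothetical names invented for this illustration, and the per-phase work is a placeholder. Only the four phase names come from the text above.

```sql
-- Hypothetical progress table; not part of pgrdf.
CREATE TABLE IF NOT EXISTS demo_load_progress (
    phase       text PRIMARY KEY,
    finished_at timestamptz NOT NULL DEFAULT now()
);

-- Hypothetical procedure: runs each phase in its own transaction and
-- records completion, so a rerun skips phases that already committed.
CREATE OR REPLACE PROCEDURE demo_staged_load()
LANGUAGE plpgsql
AS $$
DECLARE
    phase_name text;
BEGIN
    FOREACH phase_name IN ARRAY ARRAY['STAGE', 'DICT', 'RESOLVE', 'INDEX'] LOOP
        -- Resume support: skip any phase a previous run already finished.
        CONTINUE WHEN EXISTS (
            SELECT 1 FROM demo_load_progress WHERE phase = phase_name
        );

        -- Placeholder for the real work of this phase.
        RAISE NOTICE 'running phase %', phase_name;

        INSERT INTO demo_load_progress (phase) VALUES (phase_name);
        COMMIT;  -- durable checkpoint before the next phase starts
    END LOOP;
END;
$$;

-- Must be called outside an explicit transaction block.
CALL demo_staged_load();
```

If the session dies partway through, the phases that already committed stay recorded, and calling the procedure again continues from the first unfinished phase. Emptying `demo_load_progress` starts the sketch over from STAGE.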

Features in this pillar

Load Turtle from disk — one UDF reads any .ttl / .nt file off the server filesystem and ingests it; on preloaded servers it auto-selects the staged path for N-Triples.
Native staged bulk loader — load_turtle_staged_run: a commit-per-phase, resumable, background-worker pipeline that loads the full 8.2 B graph.
Inline Turtle / TriG / N-Quads ingest — same parser family, no filesystem dependency; parse_trig / parse_nquads for quad-bearing serialisations.
Verbose ingest statistics — JSONB report of timing, cache hits, batch counts.
Per-graph LIST partitions — cheap whole-graph drops, isolated namespaces.
Named graphs (IRI ↔ id mapping) — symmetric IRI lookup for graph-scoped SPARQL.
Hexastore + dictionary — three covering indexes (SPO/POS/OSP), interned terms, proven at billion scale.
Term types — typed literals, language tags, blank nodes, RDF collections.
Bulk ingest — the loader family: prepared INSERT, parallel bulk_load, streaming windows, staged pipeline.
Shared-memory dictionary cache — cross-backend hot path for repeated IRIs.
Graph lifecycle UDFs — drop_graph, clear_graph, copy_graph, move_graph as partition-level primitives.
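Several of the features above (interned terms, per-graph LIST partitions, the SPO/POS/OSP indexes, and partition-level graph drops) fit together in a way that is easier to see in a small schema. The following is a sketch in stock PostgreSQL only. Every `demo_*` name is hypothetical, and pgrdf's actual tables, column names, and term-type codes may differ.

```sql
-- Hypothetical schema for illustration only; pgrdf's real tables may differ.

-- Dictionary: each distinct term is interned once and referenced by id.
CREATE TABLE demo_dictionary (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    term_type smallint NOT NULL,  -- tag for IRI / literal / blank node (coding assumed)
    lexical   text NOT NULL,
    UNIQUE (term_type, lexical)
);

-- Quads store only integer ids and are LIST-partitioned by graph id.
CREATE TABLE demo_quads (
    graph_id bigint NOT NULL,
    s        bigint NOT NULL,
    p        bigint NOT NULL,
    o        bigint NOT NULL
) PARTITION BY LIST (graph_id);

-- One partition per graph; here, graph 100.
CREATE TABLE demo_quads_g100 PARTITION OF demo_quads FOR VALUES IN (100);

-- Three column orderings, so that any combination of bound s/p/o terms
-- is a leading prefix of at least one index.
CREATE INDEX demo_quads_spo ON demo_quads (s, p, o);
CREATE INDEX demo_quads_pos ON demo_quads (p, o, s);
CREATE INDEX demo_quads_osp ON demo_quads (o, s, p);

-- Dropping a whole graph removes its partition instead of deleting rows.
DROP TABLE demo_quads_g100;
```

Indexes created on the partitioned parent are created on each partition automatically, which is why a new graph partition needs no extra index statements in this sketch.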

At a glance

```sql
-- One-time
CREATE EXTENSION pgrdf;
SELECT pgrdf.add_graph(100);

-- Load
SELECT pgrdf.load_turtle('/fixtures/foaf.ttl', 100);
--  → 631

-- Inspect
SELECT pgrdf.count_quads(100);
SELECT * FROM pgrdf._pgrdf_dictionary
 WHERE term_type = 1 LIMIT 5;
```

Next — Load Turtle from disk →

Training

A recommended path through Pillar 1 — read in this order and each page builds on the previous one:

Start with Load Turtle from disk — the single UDF call that takes you from "file on disk" to "queryable quads". The simplest end-to-end win.
Then Per-graph LIST partitions — the partition-per-graph model is the structural choice everything else hangs off. Understand it before going further.
Then Named graphs (IRI ↔ id) → Hexastore + dictionary — how graphs are addressed and how terms are stored. The two pillars of cheap storage and fast lookup.
Then Term types — datatypes, language tags, blank nodes, RDF lists. The detail you'll need before any FILTER or aggregate.
Then Bulk ingest → Native staged bulk loader + Shared-memory dictionary cache — performance characteristics from ontology-scale loads up to the full 8.2 B graph.
Finish with Graph lifecycle UDFs — once you understand the storage model, the lifecycle operations are partition-level primitives.

Learn more

Refresh on RDF foundations with the RDF 1.1 Primer.
The RDF 1.1 Turtle spec — what parse_turtle is implementing.
Postgres internals — partitioning chapter of the PG manual.
