Idempotence + operator safety

pgrdf.materialize(g) is idempotent. Re-run as often as you like; the prior is_inferred = TRUE rows are dropped first. The base graph is untouched.

The contract

| Property | Guaranteed by |
| --- | --- |
| Re-running yields the same shape of return JSONB | Algorithm is deterministic on the base graph |
| Re-running does not duplicate inferred rows | Prior is_inferred = TRUE rows are deleted before re-emit |
| Base rows are never touched | Only rows with is_inferred = TRUE are removed |
| Empty graph → base_triples = 0, non-negative inferred count | Locked by regression test |
| Inferred count is always ≥ 0 | Locked by regression test |
| A new materialize call after editing the base graph reflects the new base | Reasoner reads is_inferred = FALSE rows only as input |
| previous_inferred_dropped on a back-to-back call equals the prior inferred_triples_written | Drop-then-rederive sequencing |
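
Two rows of the contract can be checked by hand: the back-to-back equality and the untouched base. The block below is a minimal sketch of such a check, not part of pgRDF. It assumes graph 100 already exists, that materialize returns jsonb with the keys shown on this page, and that nothing else writes to the graph while it runs. The variable names are illustrative.

```sql
-- Sketch: assert drop-then-rederive sequencing and base-row stability for graph 100.
DO $$
DECLARE
  first_run   jsonb;
  second_run  jsonb;
  base_before bigint;
  base_after  bigint;
BEGIN
  -- Base rows are the ones the reasoner reads as input (is_inferred = FALSE).
  SELECT count(*) INTO base_before
    FROM pgrdf._pgrdf_quads
   WHERE graph_id = 100 AND is_inferred = FALSE;

  first_run  := pgrdf.materialize(100);
  second_run := pgrdf.materialize(100);

  SELECT count(*) INTO base_after
    FROM pgrdf._pgrdf_quads
   WHERE graph_id = 100 AND is_inferred = FALSE;

  -- The second call should drop exactly what the first call wrote.
  IF (second_run->>'previous_inferred_dropped')::bigint
       IS DISTINCT FROM (first_run->>'inferred_triples_written')::bigint THEN
    RAISE EXCEPTION 'second run dropped % rows, first run wrote %',
      second_run->>'previous_inferred_dropped',
      first_run->>'inferred_triples_written';
  END IF;

  -- Materialization must leave the base rows alone.
  IF base_after <> base_before THEN
    RAISE EXCEPTION 'base rows changed: % -> %', base_before, base_after;
  END IF;
END
$$;
```

If both conditions hold, the block completes silently. A failed condition raises an exception and rolls back both materialize calls.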

Why this matters to operators

Operators can wire materialization into:

  • A nightly cron — the call is safe to re-run; drift between the inferred and base sets is eliminated each run.
  • A trigger on bulk-ingest completion — call materialize at the tail of an ingest batch, no special "first run vs. subsequent" branching needed.
  • An on-demand SQL function — re-materialize during development without polluting the inferred set.
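
For the nightly case, one option is to schedule the call from inside PostgreSQL. The sketch below assumes the pg_cron extension is installed in the same database. pg_cron is a separate extension, not part of pgRDF. The job name and graph id 100 are placeholders.

```sql
-- Assumption: pg_cron is installed (separate extension, not part of pgRDF).
-- Schedule materialize for graph 100 at 03:00 every day,
-- in whatever time zone pg_cron is configured to use.
SELECT cron.schedule(
  'pgrdf-materialize-100',            -- hypothetical job name
  '0 3 * * *',                        -- standard cron syntax
  $$SELECT pgrdf.materialize(100)$$   -- the only pgRDF call involved
);
```

Because each call drops the prior inferred rows before re-deriving, a repeated or manually re-triggered run needs no cleanup step.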

Because the closure write-back refreshes planner statistics automatically, queries against the enlarged graph stay fast after each run — no manual ANALYZE between materialize and query.

Size the graph, not just the schedule

materialize is single-threaded (#1), so point the cron job at a right-sized graph: one your hardware can close within your batch window, not a billion-scale graph. See Scale of reasoning.

Worked example

sql
SELECT pgrdf.materialize(100);
--  → {"base_triples": 3, "inferred_triples_written": 11, "previous_inferred_dropped": 0, ...}

SELECT pgrdf.materialize(100);
--  → {"base_triples": 3, "inferred_triples_written": 11, "previous_inferred_dropped": 11, ...}
-- Same result. No row duplication; the prior inferred set was dropped first.

-- Now edit the base graph.
SELECT pgrdf.parse_turtle('
@prefix ex: <http://example.com/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
ex:bob rdf:type ex:Engineer .
', 100);

SELECT pgrdf.materialize(100);
--  → {"base_triples": 4, "inferred_triples_written": 14, ...}
-- Base is now 4; inferred count grew accordingly.
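
To confirm the absence of duplicates from the table itself, the rows can be counted by is_inferred and compared with the JSONB summary. This is a sketch that assumes direct read access to pgrdf._pgrdf_quads and reuses graph 100 from the example.

```sql
-- Count base rows (is_inferred = FALSE) and inferred rows (is_inferred = TRUE)
-- for graph 100.
SELECT is_inferred, count(*) AS row_count
  FROM pgrdf._pgrdf_quads
 WHERE graph_id = 100
 GROUP BY is_inferred
 ORDER BY is_inferred;
```

Running this after each materialize call should show an inferred count that does not grow on repeated calls over an unchanged base.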

Empty-graph safety

sql
SELECT pgrdf.add_graph(999);
-- No data loaded.
SELECT pgrdf.materialize(999);
--  → {"base_triples": 0, "inferred_triples_written": 0, ...}
-- Safe; no error.

Removing inferred state without re-deriving

sql
DELETE FROM pgrdf._pgrdf_quads
 WHERE graph_id = $1 AND is_inferred = TRUE;

Fast under partition pruning. The base graph is preserved, and the next materialize call re-derives the inferred rows.
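
The $1 in the statement is a bind parameter. One way to run it interactively is to wrap it in a prepared statement. The name drop_inferred is made up for this sketch, and graph 100 is the one from the worked example.

```sql
-- Prepare the delete once; PostgreSQL infers the type of $1 from graph_id.
PREPARE drop_inferred AS
  DELETE FROM pgrdf._pgrdf_quads
   WHERE graph_id = $1 AND is_inferred = TRUE;

-- Remove the inferred rows of graph 100 only; base rows are not matched.
EXECUTE drop_inferred(100);

-- Release the prepared statement when done.
DEALLOCATE drop_inferred;
```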

Tests

MIT licensed. Documentation for pgRDF — built with VitePress, served via GitHub Pages.