The Large Knowledge Graph (LKG) is the substrate the Graph Foundation Model (GFM) is pre-trained on. It captures companies, individuals, assets, and the events that connect them as a temporal graph, not a snapshot. Most data products treat relationships as enrichment: a column appended to a row. The LKG treats relationships as structure. Ownership chains, supply paths, judicial and regulatory events, and geographic context are modeled as edges in a graph, not flattened into features.

What goes into the graph

Corporate and legal records

Public registries, corporate filings, ownership structures, and judicial records form the authoritative skeleton of how entities are formally connected.

Behavioral and alternative data

Licensed datasets capture economic activity, transaction patterns, and operational signals: the parts of entity behavior that official records do not see.

Geographic and macroeconomic context

Regional dynamics, infrastructure, and sector conditions situate every entity in the economy it actually operates in.

Your business context

Customer-provided signals enter through your Relational Foundation Model (RFM), which stays separate from the LKG and is composed at inference time.

Relationships as first-class structure

The graph encodes the relationships that actually drive outcomes:
  • Ownership and control — direct and indirect participation, holding structures, beneficial ownership
  • Counterparty and supply — observed business relationships, transaction proximity, dependency chains
  • Judicial and regulatory — proceedings, sanctions, and compliance events that cascade through networks
  • Geographic and infrastructure — regional clustering, shared facilities, supply-route proximity
  • Domain-specific — franchise networks, branch hierarchies, group structures, industry associations
Each relationship type carries its own semantics in the graph. The GFM learns from all of them simultaneously.
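To make "relationships as structure" concrete, here is a minimal sketch of a graph whose edges carry a relationship type. It is not the LKG's schema: the `RelationType` labels, the `TypedGraph` class, and the entity names are all hypothetical, chosen only to mirror the families listed above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class RelationType(Enum):
    # Hypothetical labels mirroring the relationship families listed above.
    OWNERSHIP = "ownership"
    SUPPLY = "supply"
    JUDICIAL = "judicial"
    GEOGRAPHIC = "geographic"
    DOMAIN = "domain"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: RelationType


class TypedGraph:
    """Toy multi-relational graph: outgoing edges are stored per source node."""

    def __init__(self):
        self._out = defaultdict(list)

    def add_edge(self, source, target, relation):
        self._out[source].append(Edge(source, target, relation))

    def neighbors(self, node, relation=None):
        """Return targets of `node`, optionally restricted to one relation type."""
        return [
            e.target
            for e in self._out.get(node, [])
            if relation is None or e.relation == relation
        ]


# Illustrative entities only; none of these come from real data.
graph = TypedGraph()
graph.add_edge("HoldCo", "OpCo", RelationType.OWNERSHIP)
graph.add_edge("OpCo", "Supplier A", RelationType.SUPPLY)
graph.add_edge("OpCo", "Court Case 1", RelationType.JUDICIAL)

owned = graph.neighbors("HoldCo", RelationType.OWNERSHIP)
all_from_opco = graph.neighbors("OpCo")
```

Because the type lives on the edge, the same node can be queried along one relationship family or across all of them, which a single flattened feature column cannot express.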

Temporal by design

A graph that only describes today cannot predict tomorrow.
  • Every node and edge is versioned. We track when a relationship appears, changes, and disappears.
  • Historical states are preserved. The GFM reasons about velocity and momentum, not just current state.
  • Decision dates are explicit. Predictions for a past date see only the graph as it existed then — no future leakage, by construction.
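One common way to get this behavior is to attach a validity interval to every edge and filter by decision date. The sketch below assumes that approach; `VersionedEdge`, `valid_from`, `valid_to`, and `snapshot` are hypothetical names, and the page does not say how the LKG stores versions internally.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class VersionedEdge:
    source: str
    target: str
    relation: str
    valid_from: date                 # first day the relationship is observed
    valid_to: Optional[date] = None  # first day it no longer holds; None = still active

    def active_on(self, as_of: date) -> bool:
        # Half-open interval: [valid_from, valid_to)
        return self.valid_from <= as_of and (self.valid_to is None or as_of < self.valid_to)


def snapshot(edges, as_of):
    """Return only the edges that existed on the decision date `as_of`."""
    return [e for e in edges if e.active_on(as_of)]


# Illustrative history: ownership of OpCo passes from HoldCo to NewCo in mid-2022.
edges = [
    VersionedEdge("HoldCo", "OpCo", "ownership", date(2019, 1, 1), date(2022, 6, 1)),
    VersionedEdge("NewCo", "OpCo", "ownership", date(2022, 6, 1)),
    VersionedEdge("OpCo", "Supplier A", "supply", date(2021, 3, 15)),
]

past = snapshot(edges, date(2020, 1, 1))     # sees only HoldCo -> OpCo
present = snapshot(edges, date(2023, 1, 1))  # sees NewCo -> OpCo and OpCo -> Supplier A
```

A prediction dated 2020 is built from `past`, so the later ownership change and supplier relationship cannot leak into it. This sketch tracks only when a relationship held; when a record became known to the system is a separate question that it does not model.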

Why it matters

The LKG is not a deliverable on its own. It is the structured substrate that lets the GFM, your RFM, and every downstream model reason about relationships at scale.
  • Cold-start coverage — new entities are positioned relative to their neighbors, not blanked out
  • Multi-hop reasoning — risk and opportunity propagate through the graph, not just through direct connections
  • Stable temporal grounding — backtests, replays, and shadow evaluations operate on the graph as it existed at decision time
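As a rough illustration of multi-hop reasoning and cold-start positioning, the sketch below runs a bounded breadth-first search over a toy adjacency map. It only shows how a new entity can be related to nodes several hops away; it is not how the GFM propagates signal, and `within_hops` and the entity names are hypothetical.

```python
from collections import deque


def within_hops(adjacency, start, max_hops):
    """Breadth-first search returning {node: hop distance} for nodes within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # report neighbors only, not the start node itself
    return distances


# Illustrative directed edges: NewCo has no history of its own, only a position in the graph.
adjacency = {
    "NewCo": ["HoldCo"],
    "HoldCo": ["OpCo"],
    "OpCo": ["Sanctioned Supplier"],
}

direct = within_hops(adjacency, "NewCo", 1)
extended = within_hops(adjacency, "NewCo", 3)
```

With one hop, `NewCo` is connected only to `HoldCo`; with three hops, its exposure to `Sanctioned Supplier` becomes visible even though no direct edge exists between them.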