Your proprietary business data is what trains your Relational Foundation Model. Because Avra is a relational AI platform built on top of a Large Knowledge Graph, we natively ingest multi-table datasets (customer masters, invoices, payments, collections, support tickets, and beyond) and align them to the graph through shared keys. We accept both relational and tabular structures in the same handoff. Send the tables you already manage, link them with consistent identifiers (legal document numbers or your own surrogate keys), and we will reconcile them into a consolidated relational view before training.
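To illustrate what linking tables through consistent identifiers can look like before a handoff, here is a minimal sketch of a key-consistency check. It is not part of Avra's ingestion process or API. The table names, the `customer_id` column, and both helper functions are hypothetical, chosen only for illustration.

```python
from collections import Counter


def find_orphan_keys(parent_rows, child_rows, key):
    """Return child key values that have no matching row in the parent table."""
    parent_keys = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows if row[key] not in parent_keys})


def find_duplicate_keys(rows, key):
    """Return key values that appear more than once in a table."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample tables; real column names will differ per customer.
customers = [
    {"customer_id": "C001", "name": "Acme Ltd"},
    {"customer_id": "C002", "name": "Globex"},
]
invoices = [
    {"invoice_id": "I-1", "customer_id": "C001", "amount": 1200.0},
    {"invoice_id": "I-2", "customer_id": "C003", "amount": 310.0},
]

# Invoices pointing at customers that do not exist in the customer master.
orphans = find_orphan_keys(customers, invoices, "customer_id")
# Customer identifiers that are not unique in the customer master.
duplicates = find_duplicate_keys(customers, "customer_id")
print("orphan keys:", orphans)
print("duplicate keys:", duplicates)
```

Running a check like this on each pair of related tables surfaces unmatched or duplicated identifiers early, before the tables are reconciled into a single relational view.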

Data Types We Support

Relational Data

Complex, interconnected datasets that capture business relationships:
  • Customer Networks: Account hierarchies, subsidiary relationships, partnership structures
  • Transaction Chains: Multi-party transactions, payment flows, supplier relationships
  • Event Sequences: Customer journey data, interaction timelines, lifecycle events

Tabular Data

Structured datasets from your operational systems:
  • Customer Records: Demographics, firmographics, account details
  • Transaction History: Payments, purchases, service usage, billing events
  • Outcome Data: Defaults, renewals, upgrades, churn events

Critical Success Factors

Data Quality Requirements

The quality of your Relational Foundation Model, and of every downstream model trained from it, depends directly on the quality of your historical data:
  • Good Data Definitions: Clear, consistent definitions of outcomes, customer states, and business events across your historical dataset.
  • Event Time Columns: Every table should expose event timestamps (for example, created_at, updated_at, effective_at, or occurred_at) so we can reconstruct the state of the world at any point in time. These fields are critical for leakage-safe training runs and for replaying historical decisions.
  • Accurate As-Of Dates: Complement event timestamps with clear “as-of” semantics indicating when the information became available for decision-making.
  • Sufficient History: Adequate volume of historical outcomes to enable robust model training and validation.
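The role of event timestamps and as-of dates can be sketched as a point-in-time lookup: for a given date, keep only records that had both taken effect and become available, then pick the most recent one per entity. This is a simplified illustration, not Avra's implementation. The `effective_at` name comes from the examples above, while `available_at`, `credit_limit`, and `state_as_of` are hypothetical names assumed for this sketch.

```python
from datetime import datetime


def state_as_of(rows, key, as_of):
    """Return the latest known record per entity as of a given moment.

    A record counts only if it had taken effect (effective_at) AND was
    available for decision-making (available_at) at or before `as_of`.
    """
    latest = {}
    for row in rows:
        if row["effective_at"] > as_of or row["available_at"] > as_of:
            continue  # not yet true, or not yet known, at that time
        current = latest.get(row[key])
        if current is None or row["effective_at"] > current["effective_at"]:
            latest[row[key]] = row
    return latest


# Hypothetical history of credit-limit changes for one customer.
credit_limits = [
    {"customer_id": "C001", "credit_limit": 5000,
     "effective_at": datetime(2024, 1, 1), "available_at": datetime(2024, 1, 2)},
    {"customer_id": "C001", "credit_limit": 8000,
     "effective_at": datetime(2024, 3, 1), "available_at": datetime(2024, 3, 5)},
]

# On March 3 the new limit was in effect but had not yet been recorded,
# so a decision made that day could only have seen the old limit.
snapshot = state_as_of(credit_limits, "customer_id", datetime(2024, 3, 3))
print(snapshot["C001"]["credit_limit"])
```

The gap between the two timestamps is the reason both are requested: filtering on the event time alone would expose the March 1 change two days before anyone could have acted on it.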

Preventing Data Leakage

Data leakage occurs when future information accidentally influences past predictions. Our data ingestion process includes:
  • Temporal Validation: Ensuring all features were available at the time of decision
  • As-Of Date Enforcement: Strict temporal boundaries for training data
  • Outcome Window Definitions: Clear separation between prediction time and outcome measurement
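As a rough sketch of how these three safeguards fit together, the example below builds one training example: features use only events at or before the prediction time, and the label looks only at outcomes inside a fixed window after it. The function `build_example`, the field names, and the 90-day window are assumptions made for illustration; they do not describe Avra's actual pipeline.

```python
from datetime import datetime, timedelta


def build_example(events, outcomes, prediction_time, outcome_window):
    """Build (features, label) for one entity without using future data.

    Features: aggregated from events at or before prediction_time.
    Label: True if any outcome falls in (prediction_time, prediction_time + window].
    """
    window_end = prediction_time + outcome_window
    visible = [e for e in events if e["occurred_at"] <= prediction_time]
    features = {
        "payment_count": len(visible),
        "total_paid": sum(e["amount"] for e in visible),
    }
    label = any(prediction_time < o["occurred_at"] <= window_end for o in outcomes)
    return features, label


# Hypothetical payment history and default events for a single customer.
payments = [
    {"occurred_at": datetime(2024, 1, 10), "amount": 100.0},
    {"occurred_at": datetime(2024, 2, 10), "amount": 100.0},
    {"occurred_at": datetime(2024, 4, 10), "amount": 50.0},  # after prediction time
]
defaults = [{"occurred_at": datetime(2024, 5, 1)}]

features, label = build_example(
    payments, defaults,
    prediction_time=datetime(2024, 3, 1),
    outcome_window=timedelta(days=90),
)
print(features, label)
```

The April payment is excluded from the features because it occurred after the prediction time, while the May default counts toward the label because it falls inside the 90-day outcome window.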

Implementation Process

Implementation details are tailored to each customer’s specific data architecture, regulatory requirements, and technical constraints.

Next Steps: Contact your account representative to begin the data discovery process and design a custom ingestion strategy that ensures high-quality data flows while maintaining compliance with your internal governance requirements.