
RDF Graph Context

Purpose

RdfGraphContextBase serves as the central graph management system for RDF data within the application. It is an abstract base class that acts as a container and lifecycle manager for multiple named RDF graphs, providing operations to create, load, query, and dispose of graph instances.

The graph context is the foundation for both read and write operations performed by repositories.

Key Responsibilities

  1. Graph Lifecycle Management

    • Create and dispose of RDF graphs
    • Load RDF data from multiple sources (files, URIs, strings, streams)
    • Track graph metadata (creation time, load count, triple count, namespace prefix count)
    • Support both synchronous and asynchronous disposal
  2. Multi-Source Data Loading

    • File-based loading with automatic format detection
    • URI/HTTP-based loading with content negotiation
    • String-based loading for in-memory RDF content
    • Stream-based loading with optimized buffering strategies
  3. Format Support

    • RDF/XML (.rdf, .xml)
    • Turtle (.ttl)
    • N-Triples (.nt)
    • Automatic format detection from file extensions and content
  4. Thread-Safe Operations

    • Stores graphs in a ConcurrentDictionary for safe concurrent access
    • Uses SemaphoreSlim to lock async operations
    • Supports cancellation tokens for long-running operations
  5. Configuration Management

    • Configurable via RdfGraphContextOptions
    • Supports graph initializers for custom setup
    • Provides options for error handling, auto-reset, and memory management
    • Configurable base URI template for graphs
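
The thread-safety approach described in item 4 can be sketched as follows. This is a minimal illustration, not the RdfGraphContextBase implementation: the NamedStore type and its LoadAsync method are hypothetical names, and the payload is a generic T rather than a graph. It shows a ConcurrentDictionary for lookups, a SemaphoreSlim that admits one asynchronous writer at a time, and a cancellation token that is honored while waiting for the lock.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified store used only to illustrate the locking pattern.
public sealed class NamedStore<T> : IDisposable
{
    // Lookups by name are safe from multiple threads without extra locking.
    private readonly ConcurrentDictionary<string, T> _items =
        new ConcurrentDictionary<string, T>();

    // Admits a single asynchronous writer at a time.
    private readonly SemaphoreSlim _writeLock = new SemaphoreSlim(1, 1);

    public bool Contains(string name) => _items.ContainsKey(name);

    public async Task<T> LoadAsync(
        string name,
        Func<CancellationToken, Task<T>> loader,
        CancellationToken cancellationToken = default)
    {
        // Waiting for the lock is itself cancellable.
        await _writeLock.WaitAsync(cancellationToken);
        try
        {
            T item = await loader(cancellationToken);
            _items[name] = item; // replaces any previous entry with this name
            return item;
        }
        finally
        {
            // Always release, even if the loader throws or is cancelled.
            _writeLock.Release();
        }
    }

    public void Dispose() => _writeLock.Dispose();
}
```

A caller would pass its own loader delegate, for example `await store.LoadAsync("Default", ct => ParseAsync(stream, ct), token)`, where ParseAsync is whatever parsing routine the caller provides.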

Architecture

graph TD
    A["RdfGraphContextBase"] --> B["ConcurrentDictionary<string, GraphMetadata>"]
    A --> C["RdfGraphContextOptions"]
    A --> D["SemaphoreSlim"]

    B --> E["GraphMetadata"]
    E --> F["IGraph"]
    E --> G["Metadata<br/>(Created, LastLoaded, TripleCount, NamespacePrefixCount)"]

    C --> H["DefaultGraphName"]
    C --> I["AutoResetOnReload"]
    C --> J["GraphInitializers"]
    C --> K["BaseUriResolver"]

    A --> L["Load Operations"]
    L --> M["LoadFromFileAsync"]
    L --> N["LoadFromUriAsync"]
    L --> O["LoadFromStreamAsync"]
    L --> P["LoadFromStringAsync"]

Key Features

Graph Creation

public virtual IGraph CreateGraph(string? name = null, bool overwrite = false)

public virtual Task<IGraph> CreateGraphAsync(string? name = null, bool overwrite = false, CancellationToken cancellationToken = default)

  • Creates a new graph with a specified name (defaults to DefaultGraphName)
  • Supports overwriting existing graphs
  • Applies graph initializers for custom namespace prefixes and configuration
  • Sets the graph's BaseUri using the configured BaseUriResolver
  • Tracks creation metadata

Graph Retrieval

public virtual IGraph GetGraph(string? name = null)

public virtual Task<IGraph> GetGraphAsync(string? name = null, CancellationToken cancellationToken = default)

public virtual bool TryGetGraph(out IGraph graph, string? name = null)

public virtual bool GraphExists(string? name = null)

public virtual IReadOnlyList<string> GetGraphNames()

  • Retrieves existing graphs by name
  • Provides both synchronous and asynchronous access
  • Includes the safe TryGetGraph pattern
  • Checks existence with GraphExists
  • Lists all graph names with GetGraphNames

Graph Removal

public virtual bool RemoveGraph(string name)

public virtual ValueTask<bool> RemoveGraphAsync(string name)

public virtual void Clear(string? name = null)

  • Removes graphs by name (the default graph cannot be removed)
  • Clears all triples from a graph without removing it

Data Loading

// From file with auto-detection
await context.LoadFromFileAsync("data.ttl", "myGraph");

// From URI with content negotiation
await context.LoadFromUriAsync(new Uri("https://example.org/data.rdf"), "myGraph");

// From URI string
await context.LoadFromUriAsync("https://example.org/data.rdf", "myGraph");

// From string with explicit format
await context.LoadFromStringAsync(rdfContent, RdfFormat.Turtle, "myGraph");

// From stream with format specification
await context.LoadFromStreamAsync(stream, RdfFormat.RdfXml, "myGraph");
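
For the URI-based overloads, content negotiation usually means sending an Accept header that lists the RDF media types the client can parse. The sketch below shows how an HttpClient could be configured from values like HttpTimeout, MaxHttpConnections, and HttpUserAgent (see Configuration Options). The RdfHttpClientSketch helper is hypothetical, and the preference order expressed by the q-values is an assumption rather than the library's documented behavior.

```csharp
using System;
using System.Net.Http;

public static class RdfHttpClientSketch
{
    // Hypothetical helper: builds a client from option-like values.
    public static HttpClient Create(TimeSpan timeout, int maxConnections, string userAgent)
    {
        // Caps the number of concurrent connections per server.
        var handler = new SocketsHttpHandler { MaxConnectionsPerServer = maxConnections };
        var client = new HttpClient(handler) { Timeout = timeout };

        client.DefaultRequestHeaders.UserAgent.ParseAdd(userAgent);

        // Assumed preference order: Turtle, then RDF/XML, then N-Triples.
        client.DefaultRequestHeaders.Accept.ParseAdd("text/turtle");
        client.DefaultRequestHeaders.Accept.ParseAdd("application/rdf+xml;q=0.9");
        client.DefaultRequestHeaders.Accept.ParseAdd("application/n-triples;q=0.8");
        return client;
    }
}
```

The server's Content-Type response header can then feed the format detection described in the next section.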

Format Detection

The context uses a multi-stage format detection strategy:

  • File Extension: .ttl, .rdf, .xml, .nt
  • Content-Type Header: For HTTP responses (turtle, rdf+xml, n-triples)
  • Content Sniffing: Analyzes the first 4 KB of content for format hints
  • Fallback: Defaults to RDF/XML if undetectable
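
The staged strategy above could be sketched roughly as follows. This is an illustration under assumptions, not the library's detector: the RdfFormat enum here is a stand-in for the one used in the loading examples, the stages are assumed to run in the order listed, and the sniffing heuristics are deliberately crude.

```csharp
#nullable enable
using System;
using System.IO;

// Stand-in for the RdfFormat enum used in the loading examples.
public enum RdfFormat { RdfXml, Turtle, NTriples }

public static class RdfFormatDetector
{
    // Tries each stage in turn; the first stage that yields a format wins.
    public static RdfFormat Detect(string? fileName, string? contentType, string? head)
    {
        return FromExtension(fileName)
            ?? FromContentType(contentType)
            ?? Sniff(head)
            ?? RdfFormat.RdfXml; // documented fallback
    }

    private static RdfFormat? FromExtension(string? fileName)
    {
        if (string.IsNullOrEmpty(fileName)) return null;
        switch (Path.GetExtension(fileName).ToLowerInvariant())
        {
            case ".ttl": return RdfFormat.Turtle;
            case ".nt": return RdfFormat.NTriples;
            case ".rdf":
            case ".xml": return RdfFormat.RdfXml;
            default: return null;
        }
    }

    private static RdfFormat? FromContentType(string? contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return null;
        if (contentType.Contains("turtle", StringComparison.OrdinalIgnoreCase)) return RdfFormat.Turtle;
        if (contentType.Contains("rdf+xml", StringComparison.OrdinalIgnoreCase)) return RdfFormat.RdfXml;
        if (contentType.Contains("n-triples", StringComparison.OrdinalIgnoreCase)) return RdfFormat.NTriples;
        return null;
    }

    // Assumed heuristics; 'head' is the first few KB of content decoded as text.
    private static RdfFormat? Sniff(string? head)
    {
        if (string.IsNullOrWhiteSpace(head)) return null;
        string text = head.TrimStart();
        if (text.StartsWith("<?xml", StringComparison.OrdinalIgnoreCase) ||
            text.StartsWith("<rdf:RDF", StringComparison.Ordinal))
            return RdfFormat.RdfXml;
        if (text.Contains("@prefix", StringComparison.Ordinal) ||
            text.Contains("@base", StringComparison.Ordinal))
            return RdfFormat.Turtle;
        if (text.StartsWith("<", StringComparison.Ordinal) ||
            text.StartsWith("_:", StringComparison.Ordinal))
            return RdfFormat.NTriples;
        return null;
    }
}
```

Because N-Triples is a subset of Turtle, a prefix-free Turtle document would be classified as N-Triples by this sketch; a real detector would need stronger checks.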

Memory Management

The context implements several strategies for efficient memory management:

  • Configurable memory threshold (MemoryStreamThreshold, default 10MB)
  • Streams larger than the threshold are copied to memory for better async performance
  • Smaller streams use direct parsing for efficiency
  • Proper disposal patterns (sync and async)
  • Configurable stream buffer size (StreamBufferSize, default 8KB)

Graph Initializers

Graph initializers implement the IRdfGraphInitializer interface and are automatically executed when graphs are created or loaded. They provide a mechanism to configure graphs with default settings, namespace prefixes, or seed data.

Graph initializers are designed to:

  1. Pre-register Common Namespace Prefixes: Avoid repetitive namespace declarations in RDF documents
  2. Support CIM Standards: Configure graphs for Common Information Model (CIM) data exchange
  3. Enable Domain-Specific Extensions: Add custom namespaces for industry standards (ENTSO-E, CGMES, etc.)
  4. Seed Default Data: Optionally add baseline triples or vocabulary definitions
  5. Ensure Consistency: Guarantee that all graphs follow organizational naming conventions

Initialization Flow

The initialization flow differs based on the operation:

For new empty graphs (CreateGraph / CreateGraphAsync):

  1. Create new graph with a fallback base URI from Options.ResolveBaseUri(name)
  2. Apply initializers immediately (no data to detect from)

For loading operations (LoadFromStreamAsync, LoadFromFileAsync, etc.):

  1. Create new graph with a fallback base URI from Options.ResolveBaseUri(name)
  2. Load RDF data first (namespaces from file are now available)
  3. Apply initializers after loading so version detection can inspect loaded namespaces
  4. Re-resolve base URI using Options.ResolveBaseUri(name, graph) which can inspect loaded data
  5. Update metadata (counts, timestamps, source)

sequenceDiagram
    participant Client
    participant Context as RdfGraphContextBase
    participant Graph as IGraph
    participant Parser as IRdfReader
    participant Initializer as IRdfGraphInitializer

    Client->>Context: LoadFromStreamAsync(stream, format)
    Context->>Graph: Create new Graph()
    Context->>Graph: Set fallback BaseUri via ResolveBaseUri(name)
    Context->>Parser: Load(graph, stream)
    Parser->>Graph: Parse and add triples/namespaces
    Note over Graph: Namespaces from RDF file are now available
    Context->>Initializer: Initialize(graph)
    Initializer->>Graph: Inspect namespaces, add missing prefixes
    Context->>Context: ResolveBaseUri(name, graph)
    Note over Context: Can now detect version from loaded namespaces
    Context->>Graph: Update BaseUri with resolved value
    Context-->>Client: Graph ready with correct configuration

Adding Custom Initializers

options.GraphInitializers.Add(new CustomNamespaceInitializer());
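
A custom initializer such as the CustomNamespaceInitializer registered above might look like the sketch below. The interface declared here is only a stand-in: the sequence diagram suggests an Initialize(graph) method, but the real IRdfGraphInitializer signature may differ. The namespace URIs are illustrative CIM values, and the NamespaceMap calls come from dotNetRDF (VDS.RDF).

```csharp
using System;
using VDS.RDF;

// Stand-in for the library interface; remove it when referencing the real one.
public interface IRdfGraphInitializer
{
    void Initialize(IGraph graph);
}

public sealed class CustomNamespaceInitializer : IRdfGraphInitializer
{
    public void Initialize(IGraph graph)
    {
        // Illustrative prefixes; replace with the namespaces your data uses.
        AddIfMissing(graph, "cim", "http://iec.ch/TC57/CIM100#");
        AddIfMissing(graph, "md", "http://iec.ch/TC57/61970-552/ModelDescription/1#");
    }

    // Initializers run after loading, so prefixes declared by the file are kept.
    private static void AddIfMissing(IGraph graph, string prefix, string uri)
    {
        if (!graph.NamespaceMap.HasNamespace(prefix))
        {
            graph.NamespaceMap.AddNamespace(prefix, new Uri(uri));
        }
    }
}
```

Only adding missing prefixes matters for the load path: a file that declares cim with an older schema URI keeps its own declaration, which later lets a base URI resolver see the version that was actually loaded.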

Configuration Options

Configurable via RdfGraphContextOptions:

public class RdfGraphContextOptions
{
  // Default graph name when not specified 
  public string DefaultGraphName { get; set; } = "Default";

  // Strategy for resolving base URI for graphs.
  // Receives the graph name and the graph instance (which may contain loaded data).
  // May return null if no base URI can be resolved.
  public Func<string, IGraph?, Uri?> BaseUriResolver { get; set; }

  // Ignore validation/parse errors and keep the old graph if load fails
  public bool IgnoreValidationErrors { get; set; } = false;

  // Automatically clear previous triples before loading new content
  public bool AutoResetOnReload { get; set; } = true;

  // Preserve existing data in the graph when loading new content
  public bool PreserveExistingDataOnLoad { get; set; } = false;

  // Clear the graph if parsing fails when loading new content
  public bool ClearOnParseFailure { get; set; } = true;

  // Streams larger than this threshold (in bytes) are copied to memory for parsing
  public int MemoryStreamThreshold { get; set; } = 10 * 1024 * 1024; // 10MB

  // Buffer size for stream reading operations
  public int StreamBufferSize { get; set; } = 8192; // 8KB

  // Timeout for HTTP requests when loading from URIs
  public TimeSpan HttpTimeout { get; set; } = TimeSpan.FromSeconds(30);

  // Maximum number of concurrent HTTP connections
  public int MaxHttpConnections { get; set; } = 10;

  // User-Agent string for HTTP requests
  public string HttpUserAgent { get; set; } = "GridLabRdfGraph/1.0";

  // Graph initializers executed after a new graph is created or reloaded
  public List<IRdfGraphInitializer> GraphInitializers { get; set; } = new();

  // Resolve base URI for a graph, using the configured BaseUriResolver
  public Uri? ResolveBaseUri(string graphName, IGraph? graph = null)
}

The BaseUriResolver is a delegate that allows custom logic for resolving base URIs. The default implementation generates URIs using the template https://gridlab.io/graphs/{name}. Custom resolvers can inspect the graph's loaded namespaces to determine version-specific base URIs.
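
As a rough sketch of a version-aware resolver, the helper below takes the graph name and the namespace URIs found in a graph and returns a base URI. Only the default template comes from the text above; the BaseUriSketch class, the "cim100" path segment, the CIM100 check, and the escaping of the graph name are all assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class BaseUriSketch
{
    // Default template stated in the documentation: https://gridlab.io/graphs/{name}
    private const string DefaultTemplate = "https://gridlab.io/graphs/{0}";

    // Hypothetical rule: graphs declaring a CIM100 namespace get a versioned path.
    public static Uri Resolve(string graphName, IEnumerable<Uri>? namespaceUris = null)
    {
        // Escaping the name is an assumption; the real default may not do this.
        string name = Uri.EscapeDataString(graphName);

        bool isCim100 = namespaceUris != null &&
            namespaceUris.Any(u => u.AbsoluteUri.Contains("CIM100", StringComparison.Ordinal));

        return isCim100
            ? new Uri("https://gridlab.io/graphs/cim100/" + name)
            : new Uri(string.Format(DefaultTemplate, name));
    }
}

// Possible wiring, assuming dotNetRDF's NamespaceMap API:
// options.BaseUriResolver = (name, graph) => BaseUriSketch.Resolve(
//     name,
//     graph?.NamespaceMap.Prefixes.Select(p => graph.NamespaceMap.GetNamespaceUri(p)));
```

When the resolver is called before any data is loaded, the graph argument has no file-declared namespaces (or is null), so the sketch falls back to the default template, which matches the fallback step in the initialization flow.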

Utility Methods

// Check if graph has been loaded (has triples)
public virtual bool IsLoaded(string? name = null)

// Get timestamp when graph was last loaded
public virtual DateTimeOffset GetLoadedAt(string? name = null)

// Get namespaces defined in the graph
public virtual Dictionary<string, Uri> GetNamespaces(string? name = null)

// Get number of triples in the graph
public virtual int GetTripleCount(string? name = null)

// Get number of namespace prefixes in the graph
public virtual int GetNamespacePrefixCount(string? name = null)

Architectural Relationships

The RdfGraphContextBase is a foundational component within the RDF data management layer of the application.

It interacts with the RdfGraphRepository component. Repositories use the context to access and manipulate RDF graphs for both read and write operations, while the context handles the underlying graph lifecycle and data loading.

Dependency Injection

RdfGraphContextBase is an abstract class designed to be extended by concrete implementations. It uses ABP Framework's lazy service provider for dependency resolution:

public abstract class RdfGraphContextBase : IRdfGraphContext
{
    public IAbpLazyServiceProvider LazyServiceProvider { get; set; }

    public ILogger<RdfGraphContextBase> Logger => 
        LazyServiceProvider.LazyGetService<ILogger<RdfGraphContextBase>>(
            NullLogger<RdfGraphContextBase>.Instance);

    public RdfGraphContextBase(IOptions<RdfGraphContextOptions> options)
    {
        Options = options?.Value ?? new RdfGraphContextOptions();
    }
}

Concrete implementations can be registered as scoped dependencies:

public class MyRdfGraphContext : RdfGraphContextBase
{
    public MyRdfGraphContext(IOptions<RdfGraphContextOptions> options) : base(options) { }
}

Key Integration Points

Graph Context as Data Source

RdfGraphRepository depends on IRdfGraphContext (typically a concrete implementation of RdfGraphContextBase) to access the underlying RDF graph data:

public class RdfGraphRepository<TGraphContext, TEntity, TKey>
    where TGraphContext : class, IRdfGraphContext
{
    protected TGraphContext GraphContext { get; }

    public RdfGraphRepository(TGraphContext graphContext, ...)
    {
        GraphContext = graphContext;
    }
}

The context handles low-level RDF operations; the repository handles high-level entity mapping. This separation keeps the two concerns independent and makes each easier to test.

See the RDF Graph Repository documentation for details.

Dependency Flow

graph LR
    A["Client Application"] --> B["RdfGraphRepository"]
    B --> C["RdfGraphContextBase"]
    B --> D["IRdfInstanceParser"]
    B --> E["IRdfInstanceWriter"]
    B --> F["IRdfMapperFactory"]

    C --> G["IGraph<br/>(VDS.RDF)"]
    C --> H["IRdfGraphInitializer"]
    C --> I["RdfGraphContextOptions"]
    I --> J["BaseUriResolver"]
    I --> K["GraphInitializers"]
    D --> G
    E --> G
    F --> L["IRdfEntityMapper"]
    L --> M["Domain Entity"]

Graph Modification

While the RdfGraphContextBase itself doesn't directly modify graph content (that's the responsibility of IRdfInstanceWriter), it provides the IGraph instances that are modified:

  1. Context provides graphs: context.GetGraph(name) returns the mutable IGraph
  2. Writer modifies graphs: writer.Insert(graph, definition) asserts triples
  3. Changes are immediate: Modifications are applied directly to the in-memory graph
  4. Persistence is separate: Saving graphs back to files requires explicit serialization

In-Memory vs Persistent Changes

// Load graph
await context.LoadFromFileAsync("model.rdf", "cim-graph");

// Get graph reference
var graph = context.GetGraph("cim-graph");

// Modifications via repository/writer are immediate in memory
repository.Insert(newEntity);

// To persist changes, serialize the graph (RDF/XML here)
var writer = new RdfXmlWriter();
using var output = new StreamWriter(File.Create("model-updated.rdf"));
writer.Save(graph, output);

GraphMetadata

The context tracks metadata for each graph internally:

| Property | Type | Description |
| --- | --- | --- |
| Name | string | Graph identifier |
| Graph | IGraph | The underlying VDS.RDF graph |
| Created | DateTimeOffset | When the graph was created |
| LastLoaded | DateTimeOffset | When data was last loaded into the graph |
| IsModified | bool | Whether the graph has been modified |
| Source | string? | Source identifier (e.g., "stream_Turtle") |
| LoadCount | long | Number of times data has been loaded |
| TripleCount | int | Cached count of triples |
| NamespacePrefixCount | int | Cached count of namespace prefixes |

References