RDF Graph Context

Purpose

RdfGraphContextBase serves as the central graph management system for RDF data within the application. It is an abstract base class that acts as a container and lifecycle manager for multiple named RDF graphs, providing operations to create, load, query, and dispose of graph instances.

The graph context is the foundation for both read and write operations performed by repositories.

Key Responsibilities

Graph Lifecycle Management
- Create, retrieve, and remove named RDF graphs
- Load RDF data from multiple sources (files, URIs, strings, streams)
- Track graph metadata (creation time, load count, triple count, namespace prefix count, model type)
- Dispose of graphs properly (sync and async)
Multi-Source Data Loading
- File-based loading with automatic format detection
- URI/HTTP-based loading with content negotiation
- String-based loading for in-memory RDF content
- Stream-based loading with optimized buffering strategies
Format Support
- RDF/XML (.rdf, .xml)
- Turtle (.ttl)
- JSON-LD (.jsonld, .json)
- Automatic format detection from file extensions and content
Thread-Safe Operations
- Stores graphs in a ConcurrentDictionary for safe concurrent access
- Uses SemaphoreSlim to serialize async operations
- Supports cancellation tokens for long-running operations
Configuration Management
- Configurable via RdfGraphContextOptions
- Supports graph initializers for custom setup
- Provides options for error handling, auto-reset, and memory management
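
The thread-safety items above can be illustrated with a minimal sketch. It is not the library's implementation: `GraphRegistrySketch` and its members are hypothetical names, and the stored value is reduced to a load counter. It only shows how a ConcurrentDictionary and a SemaphoreSlim can be combined so that async loads are serialized and still honor a cancellation token.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the context's internal storage; not the library's actual code.
public sealed class GraphRegistrySketch : IDisposable
{
    // Thread-safe map from graph name to the number of completed loads.
    private readonly ConcurrentDictionary<string, int> _loadCounts = new();

    // SemaphoreSlim(1, 1) acts as a mutex that can be awaited.
    private readonly SemaphoreSlim _loadLock = new(1, 1);

    public async Task<int> LoadAsync(
        string name,
        Func<CancellationToken, Task> load,
        CancellationToken cancellationToken = default)
    {
        // Throws OperationCanceledException if the token is cancelled while waiting.
        await _loadLock.WaitAsync(cancellationToken);
        try
        {
            await load(cancellationToken);
            // Atomically insert 1 or increment the existing count.
            return _loadCounts.AddOrUpdate(name, 1, (_, count) => count + 1);
        }
        finally
        {
            // Always release, even when the load delegate throws.
            _loadLock.Release();
        }
    }

    public bool Exists(string name) => _loadCounts.ContainsKey(name);

    public void Dispose() => _loadLock.Dispose();
}
```

Reads such as `Exists` go straight to the dictionary without taking the semaphore, while writers queue behind it.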

Architecture

graph TD
    A["RdfGraphContextBase"] --> B["ConcurrentDictionary&lt;string, GraphMetadata&gt;"]
    A --> C["RdfGraphContextOptions"]
    A --> D["SemaphoreSlim"]
    A --> T["IRdfFormatDetector"]

    B --> E["GraphMetadata"]
    E --> F["IGraph"]
    E --> G["Metadata<br/>(Created, LastLoaded, TripleCount, NamespacePrefixCount)"]
    E --> EX["Extensions&lt;Type, object&gt;"]

    C --> H["DefaultGraphName"]
    C --> I["AutoResetOnReload"]
    C --> J["GraphInitializers"]
    C --> K["BaseUriResolver"]
    C --> L["HTTP Configuration<br/>(Timeout, MaxConnections, UserAgent)"]
    C --> M["Memory Management<br/>(MemoryStreamThreshold, StreamBufferSize)"]

    A --> N["Load Operations"]
    N --> O["LoadFromFileAsync"]
    N --> P["LoadFromUriAsync"]
    N --> Q["LoadFromStreamAsync"]
    N --> R["LoadFromStringAsync"]

    T --> S["Format Detection"]
    S --> U["Content Analysis"]
    S --> V["Confidence Scoring"]
    S --> W["Hint Adjustment"]

Key Features

Graph Creation

public virtual IGraph CreateGraph(string? name = null, bool overwrite = false)

public virtual Task<IGraph> CreateGraphAsync(string? name = null, bool overwrite = false, CancellationToken cancellationToken = default)

Creates a new graph with a specified name (defaults to DefaultGraphName)
Supports overwriting existing graphs (preserves LoadCount from existing graph when overwriting)
Applies graph initializers for custom namespace prefixes and configuration
Sets the graph's BaseUri using the configured BaseUriResolver
Tracks creation metadata
Implements proper error handling and resource cleanup on failures

Graph Retrieval

public virtual IGraph GetGraph(string? name = null)

public virtual Task<IGraph> GetGraphAsync(string? name = null, CancellationToken cancellationToken = default)

public virtual bool TryGetGraph(out IGraph graph, string? name = null)

public virtual bool GraphExists(string? name = null)

public virtual IReadOnlyList<string> GetGraphNames()

Retrieves existing graphs by name
Provides both synchronous and asynchronous access
Includes a safe TryGetGraph pattern
Checks existence with GraphExists
Lists all graph names with GetGraphNames

Graph Removal

public virtual bool RemoveGraph(string name)

public virtual ValueTask<bool> RemoveGraphAsync(string name)

public virtual void Clear(string? name = null)

Removes graphs by name (the default graph cannot be removed)
Clears all triples from a graph without removing it

Data Loading

// From file with auto-detection (source is automatically set to file path)
await context.LoadFromFileAsync("data.ttl", name: "myGraph");

// From URI with content negotiation (source is automatically set to URI)
await context.LoadFromUriAsync(new Uri("https://example.org/data.rdf"), name: "myGraph");

// From URI string
await context.LoadFromUriAsync("https://example.org/data.rdf", name: "myGraph");

// From string content with auto-detection (format inferred from content)
await context.LoadFromStringAsync(rdfContent, name: "myGraph");

// From string content with explicit format
await context.LoadFromStringAsync(rdfContent, RdfFormat.Turtle, "myGraph");

// From stream with format specification and optional source tracking
await context.LoadFromStreamAsync(stream, RdfFormat.Turtle, "myGraph");
await context.LoadFromStreamAsync(stream, RdfFormat.Turtle, "myGraph", "custom-source-id");

Method Signatures:

Task LoadFromFileAsync(string filePath, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)

Task LoadFromUriAsync(Uri uri, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)

Task LoadFromUriAsync(string uriString, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)

Task LoadFromStringAsync(string rdfContent, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)

Task LoadFromStreamAsync(Stream stream, RdfFormat? format = null, string? name = null, string? source = null, CancellationToken cancellationToken = default)

Key Features:

Asynchronous loading with proper cancellation support
Automatic format detection for files, HTTP responses, and string content
HTTP content negotiation with Accept headers for common RDF formats
Error handling with configurable behavior for parse failures
Memory optimization for large streams (configurable threshold)
Source tracking - automatically tracks where data was loaded from (file path, URI, string content, or custom source)

Supported RdfFormat values:

RdfFormat.Turtle - Turtle serialization
RdfFormat.RdfXml - RDF/XML serialization
RdfFormat.JsonLd - JSON-LD serialization
RdfFormat.Unknown - Used when format cannot be determined

Format Detection

The context delegates format detection to IRdfFormatDetector, which analyzes stream content with confidence scoring.

Detection Strategy:

Content Analysis: Reads a sample of bytes from the stream and analyzes patterns
Hint Adjustment: Applies bonuses/penalties based on file extension and MIME type hints
Confidence Threshold: Returns the format if confidence meets the threshold (default: 50%)
Fallback: Returns RdfFormat.Unknown if no format meets the threshold
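
The detection strategy above can be sketched as follows. This is an illustration only, not the real IRdfFormatDetector: the `SketchFormat` enum, the `FormatDetectionSketch` class, the content patterns, the scores (60/70), and the +20 extension bonus are all assumptions chosen for the example, and penalties are left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for RdfFormat so the sketch compiles on its own.
public enum SketchFormat { Unknown, Turtle, RdfXml, JsonLd }

public static class FormatDetectionSketch
{
    public static (SketchFormat Format, int Confidence) Detect(
        string sample, string? extensionHint = null, int threshold = 50)
    {
        var text = sample.TrimStart();

        // Content analysis: assign an illustrative score per format.
        var scores = new Dictionary<SketchFormat, int>
        {
            [SketchFormat.Turtle] = text.Contains("@prefix") || text.Contains("PREFIX ") ? 60 : 0,
            [SketchFormat.RdfXml] = text.StartsWith('<') && text.Contains("rdf:RDF") ? 70 : 0,
            [SketchFormat.JsonLd] = (text.StartsWith('{') || text.StartsWith('['))
                                    && text.Contains("\"@context\"") ? 70 : 0,
        };

        // Hint adjustment: a matching file extension adds a bonus, capped at 100.
        var hinted = extensionHint?.ToLowerInvariant() switch
        {
            ".ttl" => SketchFormat.Turtle,
            ".rdf" or ".xml" => SketchFormat.RdfXml,
            ".jsonld" or ".json" => SketchFormat.JsonLd,
            _ => SketchFormat.Unknown,
        };
        if (hinted != SketchFormat.Unknown)
            scores[hinted] = Math.Min(100, scores[hinted] + 20);

        // Confidence threshold with fallback to Unknown.
        var best = scores.OrderByDescending(kv => kv.Value).First();
        return best.Value >= threshold
            ? (best.Key, best.Value)
            : (SketchFormat.Unknown, best.Value);
    }
}
```

In this sketch a hint alone never reaches the 50% threshold, so an extension can strengthen a content match but cannot replace it.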

Detection Result:

public class DetectionResult
{
    public RdfFormat Format { get; set; }
    public int Confidence { get; set; }      // 0-100%
    public string? DetectedBy { get; set; }  // "content", "extension", etc.
    public string? Warning { get; set; }
    public bool IsReliable => Confidence >= 70;
}

Supported Formats:

Turtle: .ttl files, text/turtle Content-Type
RDF/XML: .rdf/.xml files, application/rdf+xml Content-Type
JSON-LD: .jsonld/.json files, application/ld+json Content-Type

HTTP Configuration

The context exposes the following HTTP client settings for URI-based loading:

Timeout Configuration: Configurable request timeout (default: 30 seconds)
Connection Pooling: Configurable maximum concurrent connections (default: 10)
User-Agent Header: Configurable User-Agent string (default: "GridLabRdfGraph/1.0")
Content Negotiation: Automatic Accept headers for RDF formats
Error Handling: Proper handling of network errors and timeouts

HTTP Accept Headers sent:

Accept: text/turtle, application/rdf+xml, application/ld+json, text/plain
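
As a rough illustration of how these settings map onto the standard .NET HTTP stack, the sketch below builds an HttpClient with the same timeout, connection limit, User-Agent, and Accept headers. The `RdfHttpClientSketch` class is a hypothetical helper. How the context actually constructs its client is not documented here, so treat this as one possible wiring.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper; shows one way to apply the documented defaults to HttpClient.
public static class RdfHttpClientSketch
{
    public static HttpClient Create(TimeSpan timeout, int maxConnections, string userAgent)
    {
        // Connection pooling limit per server.
        var handler = new SocketsHttpHandler { MaxConnectionsPerServer = maxConnections };

        // Request timeout applied to every call made through this client.
        var client = new HttpClient(handler) { Timeout = timeout };

        client.DefaultRequestHeaders.UserAgent.ParseAdd(userAgent);

        // Content negotiation: advertise the RDF media types listed above.
        foreach (var mediaType in new[]
                 { "text/turtle", "application/rdf+xml", "application/ld+json", "text/plain" })
        {
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue(mediaType));
        }

        return client;
    }
}
```

Calling `RdfHttpClientSketch.Create(TimeSpan.FromSeconds(30), 10, "GridLabRdfGraph/1.0")` reproduces the defaults listed in this section.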

Memory Management

The context implements several strategies for efficient memory management:

Configurable memory threshold (MemoryStreamThreshold, default 10MB)
Large stream optimization: Streams larger than the threshold are buffered to memory for better async performance
Small stream efficiency: Smaller streams use direct parsing for efficiency
Proper disposal patterns: Sync and async disposal of graphs and metadata
Configurable stream buffer size (StreamBufferSize, default 8KB)
Resource cleanup: Automatic cleanup on errors and exceptions
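
The threshold rule described above might look like the following sketch. `StreamBufferingSketch` is a hypothetical helper, and the handling of non-seekable streams (passed through unchanged because their length is unknown) is an assumption rather than documented behavior.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; parameter defaults mirror MemoryStreamThreshold and StreamBufferSize.
public static class StreamBufferingSketch
{
    public static async Task<Stream> PrepareAsync(
        Stream source,
        long memoryStreamThreshold = 10 * 1024 * 1024,
        int streamBufferSize = 8192,
        CancellationToken cancellationToken = default)
    {
        // Length is only available on seekable streams (assumption: others are parsed directly).
        bool isLarge = source.CanSeek && source.Length > memoryStreamThreshold;
        if (!isLarge)
            return source; // Small stream: hand it to the parser as-is.

        // Large stream: copy asynchronously into memory using the configured buffer size.
        var buffered = new MemoryStream();
        await source.CopyToAsync(buffered, streamBufferSize, cancellationToken);
        buffered.Position = 0; // Rewind so the parser starts at the beginning.
        return buffered;
    }
}
```

When a new MemoryStream is returned, the caller owns it and should dispose it after parsing.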

Graph Initializers

Graph initializers implement the IRdfGraphInitializer interface and are automatically executed when graphs are created or loaded. They provide a mechanism to configure graphs with default settings, namespace prefixes, or seed data.

Graph initializers are designed to:

Pre-register Common Namespace Prefixes: Avoid repetitive namespace declarations in RDF documents
Support CIM Standards: Configure graphs for Common Information Model (CIM) data exchange
Enable Domain-Specific Extensions: Add custom namespaces for industry standards (ENTSO-E, CGMES, etc.)
Seed Default Data: Optionally add baseline triples or vocabulary definitions
Ensure Consistency: Guarantee that all graphs follow organizational naming conventions

Initialization Flow

The initialization flow differs based on the operation:

For new empty graphs (CreateGraph / CreateGraphAsync):

Create new graph instance
Resolve base URI using Options.ResolveBaseUri(name, metaData)
Apply initializers immediately (no data to detect from)
Update metadata (counts, timestamps)

For loading operations (LoadFromStreamAsync, LoadFromFileAsync, etc.):

Create new graph or get existing graph
Set fallback base URI from Options.ResolveBaseUri(name, metaData)
Load RDF data first (namespaces from file are now available)
Apply initializers after loading so detection can inspect loaded graph metadata
Re-resolve base URI using Options.ResolveBaseUri(name, metaData) which can inspect loaded graph metadata
Update metadata (counts, timestamps, source, load count)

sequenceDiagram
    participant Client
    participant Context as RdfGraphContextBase
    participant Graph as IGraph
    participant Parser as IRdfReader
    participant Initializer as IRdfGraphInitializer

    Client->>Context: LoadFromStreamAsync(stream, format, name, source)
    Context->>Graph: Create new Graph()
    Context->>Graph: Set fallback BaseUri via ResolveBaseUri(name, metaData)
    Context->>Parser: LoadRdfDataAsync(parser, graph, stream)
    Parser->>Graph: Parse and add triples/namespaces
    Note over Graph: Namespaces from RDF file are now available
    Context->>Initializer: ApplyInitializers
    Initializer->>Graph: Inspect namespaces, add missing prefixes
    Context->>Context: ResolveBaseUri(name, metaData)
    Note over Context: Can now detect version from loaded namespaces
    Context->>Graph: Update BaseUri with resolved value
    Context->>Context: Update metadata (Source, LoadCount++)
    Context-->>Client: Graph ready with correct configuration

Adding Custom Initializers

options.GraphInitializers.Add(new CustomNamespaceInitializer());

Configuration Options

Configurable via RdfGraphContextOptions:

public class RdfGraphContextOptions
{
  /// <summary>
  /// Default graph name when not specified.
  /// </summary>
  public string DefaultGraphName { get; set; } = "Default";

  /// <summary>
  /// Strategy for resolving base URI for graphs.
  /// Receives the graph name and the graph instance (which may contain loaded data).
  /// May return null if no base URI can be resolved.
  /// </summary>
  public Func<string, IGraph?, Uri?> BaseUriResolver { get; set; }

  /// <summary>
  /// Ignore validation/parse errors and keep the old graph if load fails.
  /// </summary>
  public bool IgnoreValidationErrors { get; set; } = false;

  /// <summary>
  /// Automatically clear previous triples before loading new content.
  /// </summary>
  public bool AutoResetOnReload { get; set; } = true;

  /// <summary>
  /// Clear the graph if parsing fails when loading new content.
  /// </summary>
  public bool ClearOnParseFailure { get; set; } = true;

  /// <summary>
  /// Streams larger than this threshold (bytes) are copied to memory for parsing.
  /// </summary>
  public int MemoryStreamThreshold { get; set; } = 10 * 1024 * 1024; // 10MB

  /// <summary>
  /// Buffer size for stream reading operations.
  /// </summary>
  public int StreamBufferSize { get; set; } = 8192; // 8KB

  /// <summary>
  /// Timeout for HTTP requests when loading from URIs.
  /// </summary>
  public TimeSpan HttpTimeout { get; set; } = TimeSpan.FromSeconds(30);

  /// <summary>
  /// Maximum number of concurrent HTTP connections.
  /// </summary>
  public int MaxHttpConnections { get; set; } = 10;

  /// <summary>
  /// User-Agent string for HTTP requests.
  /// </summary>
  public string HttpUserAgent { get; set; } = "GridLabRdfGraph/1.0";

  /// <summary>
  /// Graph initializers executed after a new graph is created or reloaded.
  /// </summary>
  public List<IRdfGraphInitializer> GraphInitializers { get; set; } = new();

  /// <summary>
  /// Resolve base URI for a graph, using the configured BaseUriResolver.
  /// </summary>
  public Uri? ResolveBaseUri(string? graphName = null, GraphMetadata? metaData = null)
}

The BaseUriResolver is a delegate that allows custom logic for resolving base URIs. The default implementation returns the static value https://gridlab.io/models/1/. Custom resolvers should return URIs that end with / so that relative references resolve beneath the base URI.
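
The trailing-slash requirement can be illustrated with a small sketch. `BaseUriResolverSketch` and its methods are hypothetical helpers, and the per-graph URI layout is an assumption. The sketch also assumes base URIs without a query or fragment.

```csharp
using System;

// Hypothetical helper; not part of the library.
public static class BaseUriResolverSketch
{
    // Static default value taken from the documentation above.
    private static readonly Uri DefaultBase = new("https://gridlab.io/models/1/");

    // Appends "/" when missing. Without it, a relative reference replaces the
    // last path segment of the base URI instead of being appended to it.
    public static Uri EnsureTrailingSlash(Uri uri) =>
        uri.AbsoluteUri.EndsWith('/') ? uri : new Uri(uri.AbsoluteUri + "/");

    // Assumed strategy: one sub-path per graph name under the default base.
    public static Uri ResolveForGraph(string graphName) =>
        new Uri(DefaultBase, Uri.EscapeDataString(graphName) + "/");
}
```

A lambda such as `(name, graph) => BaseUriResolverSketch.ResolveForGraph(name)` would match the shape of the BaseUriResolver delegate shown in the options class.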

Utility Methods

// Check if graph has been loaded (has triples)
public virtual bool IsLoaded(string? name = null)

// Get timestamp when graph was last loaded
public virtual DateTimeOffset GetLoadedAt(string? name = null)

// Get namespaces defined in the graph
public virtual Dictionary<string, Uri> GetNamespaces(string? name = null)

// Get number of triples in the graph
public virtual int GetTripleCount(string? name = null)

// Get number of namespace prefixes in the graph
public virtual int GetNamespacePrefixCount(string? name = null)

// Get complete metadata for a graph
public virtual GraphMetadata? GetGraphMetadata(string? name = null)

Note: These methods provide access to graph metadata and statistics for monitoring and diagnostic purposes. Domain-specific metadata like CimModelType is stored in the Extensions dictionary and accessed via GraphMetadataExtensions.

Error Handling

The RDF Graph Context handles the following failure scenarios:

HTTP Loading Errors

When loading from URIs, the context handles:

HttpRequestException: Network or HTTP protocol errors → throws RdfLoadException
TaskCanceledException: Request timeouts → throws RdfTimeoutException
Content negotiation failures: Falls back to format detection from file extension or content sniffing
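
The exception translation listed above could be structured roughly as in the sketch below. The two exception classes are stand-ins defined only so the example compiles on its own (the real RdfLoadException and RdfTimeoutException ship with the library), and `HttpErrorTranslationSketch` is a hypothetical wrapper. Storing the URI under `Data["Uri"]` is an assumption based on the error handling example in this document.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Stand-in exception types for this sketch only.
public sealed class LoadFailedSketchException : Exception
{
    public LoadFailedSketchException(string message, Exception inner) : base(message, inner) { }
}

public sealed class TimeoutSketchException : Exception
{
    public TimeoutSketchException(string message, Exception inner) : base(message, inner) { }
}

public static class HttpErrorTranslationSketch
{
    public static async Task<T> ExecuteAsync<T>(
        Uri uri,
        Func<CancellationToken, Task<T>> request,
        CancellationToken cancellationToken = default)
    {
        try
        {
            return await request(cancellationToken);
        }
        catch (HttpRequestException ex)
        {
            // Network or HTTP protocol failure.
            throw WithUri(new LoadFailedSketchException($"Failed to load {uri}", ex), uri);
        }
        catch (TaskCanceledException ex) when (!cancellationToken.IsCancellationRequested)
        {
            // HttpClient reports its own timeout as TaskCanceledException;
            // cancellation requested by the caller is left to propagate unchanged.
            throw WithUri(new TimeoutSketchException($"Request timed out for {uri}", ex), uri);
        }
    }

    // Attaches the URI so callers can log it from ex.Data["Uri"].
    private static TException WithUri<TException>(TException exception, Uri uri)
        where TException : Exception
    {
        exception.Data["Uri"] = uri.ToString();
        return exception;
    }
}
```

The exception filter is the key design choice: it separates a timeout from a deliberate cancellation, which would otherwise both surface as TaskCanceledException.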

Parse Errors

RdfException: RDF parsing failures from VDS.RDF
Configurable behavior via IgnoreValidationErrors option
Automatic cleanup with ClearOnParseFailure option
Error logging with context information

Resource Management

Automatic disposal of resources on errors
Graph cleanup on creation failures
Memory leak prevention through proper exception handling

Example Error Handling

try
{
    // The graph name must be passed as a named argument; the second positional parameter is the format
    await context.LoadFromUriAsync("https://example.org/invalid.rdf", name: "test-graph");
}
catch (RdfLoadException ex)
{
    // Handle HTTP/network errors
    Logger.LogError(ex, "Failed to load from URI: {Uri}", ex.Data["Uri"]);
}
catch (RdfTimeoutException ex)
{
    // Handle timeout errors
    Logger.LogError(ex, "Request timeout for URI: {Uri}", ex.Data["Uri"]);
}
catch (RdfParseException ex)
{
    // Handle RDF parsing errors
    Logger.LogError(ex, "Failed to parse RDF content: {Message}", ex.Message);
}

Architectural Relationships

The RdfGraphContextBase is a foundational component within the RDF data management layer of the application.

It interacts with the RdfGraphRepository component. Repositories use the context to access and manipulate RDF graphs for both read and write operations, while the context handles the underlying graph lifecycle and data loading.

Dependency Injection

RdfGraphContextBase is an abstract class designed to be extended by concrete implementations to provide RDF graph processing capabilities.

public abstract class RdfGraphContextBase : IRdfGraphContext
{
    public ITransientCachedServiceProvider CachedServiceProvider { get; set; } = default!;

    public ILogger<RdfGraphContextBase> Logger => 
        CachedServiceProvider.GetService<ILogger<RdfGraphContextBase>>(
            NullLogger<RdfGraphContextBase>.Instance);

    public RdfGraphContextBase(IOptions<RdfGraphContextOptions> options)
    {
        Options = options?.Value ?? new RdfGraphContextOptions();
    }
}

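Concrete implementations are registered as scoped dependencies. A minimal implementation only forwards its options to the base constructor, and it can then be registered in one of the two ways described below: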

public class MyRdfGraphContext : RdfGraphContextBase
{
    public MyRdfGraphContext(IOptions<RdfGraphContextOptions> options) : base(options) { }
}

Method 1: AddCimGraphContext, CGMES-aware registration

// Option A: Uses CGMES defaults automatically
// Applies: DefaultNamespaceInitializer, DefaultModelTypeInitializer, CgmesBaseUriResolver
services.AddCimGraphContext<MyRdfGraphContext>();

// Option B: CGMES defaults + custom overrides
services.AddCimGraphContext<MyRdfGraphContext>(options =>
{
    options.DefaultGraphName = "CustomDefault";
    options.AutoResetOnReload = false;
    // CGMES defaults are still applied after your custom configuration
});

With configure = null: Full CGMES configuration is applied automatically
With custom configure: Your custom settings are applied first, then CGMES defaults fill in any missing configuration
Always includes: CIM/CGMES namespace prefixes, model type detection, and URI resolution

When using AddCimGraphContext, these components are automatically configured:

DefaultNamespaceInitializer - Adds standard CIM/CGMES namespace prefixes
DefaultModelTypeInitializer - Detects FullModel vs DifferenceModel from graph headers
CgmesBaseUriResolver - Resolves base URIs for CGMES graphs with proper versioning

Method 2: AddRdfGraphContext, Pure custom registration

// Full control - no CGMES configuration is applied
services.AddRdfGraphContext<MyRdfGraphContext>(options =>
{
    // You must configure everything manually
    // No CGMES-specific features are enabled by default
    // (initializer types and the URI below are placeholders)
    options.GraphInitializers.Add(new CustomNamespaceInitializer());
    options.GraphInitializers.Add(new CustomModelTypeInitializer());
    options.BaseUriResolver = (name, graph) => new Uri($"https://example.org/graphs/{name}/");
});

The CGMES configuration is applied through CgmesGraphContextOptionsConfigurator which implements IConfigureOptions<RdfGraphContextOptions>:

public sealed class CgmesGraphContextOptionsConfigurator : IConfigureOptions<RdfGraphContextOptions>
{
    public void Configure(RdfGraphContextOptions options)
    {
        // Applies CGMES configuration using dependency-injected services
        options.UseCgmes(_cimVersionOptions, _cgmesVersionOptions, _cimModelTypeResolver);
    }
}

Key Integration Points

Graph Context as Data Source

RdfGraphRepository depends on IRdfGraphContext (typically a concrete implementation of RdfGraphContextBase) to access the underlying RDF graph data:

public class RdfGraphRepository<TGraphContext, TRdfDefinition, TEntity, TKey>
    where TGraphContext : class, IRdfGraphContext
    where TRdfDefinition : class, IRdfDefinition
    where TEntity : class, IEntity<TKey>
{
    protected TGraphContext GraphContext { get; }

    public RdfGraphRepository(TGraphContext graphContext, ...)
    {
        GraphContext = graphContext;
    }
}

Context handles low-level RDF operations; Repository handles high-level entity mapping. This separation enables clean architecture and testability.

See the RDF Graph Repository documentation for detailed information.

Dependency Flow

graph LR
    A["Client Application"] --> B["RdfGraphRepository"]
    B --> C["RdfGraphContextBase"]
    B --> D["IRdfDefinitionReader"]
    B --> E["IRdfDefinitionWriter"]
    B --> F["IRdfEntityMapperFactory"]

    C --> G["IGraph<br/>(VDS.RDF)"]
    C --> H["IRdfGraphInitializer"]
    C --> I["RdfGraphContextOptions"]
    I --> J["BaseUriResolver"]
    I --> K["GraphInitializers"]
    D --> G
    E --> G
    F --> L["IRdfEntityMapper"]
    L --> M["Domain Entity"]

Graph Modification

While the RdfGraphContextBase itself doesn't directly modify graph content (that's the responsibility of IRdfDefinitionWriter), it provides the IGraph instances that are modified:

Context provides graphs: context.GetGraph(name) returns the mutable IGraph
Writer modifies graphs: writer.Insert(graph, definition) asserts triples
Changes are immediate: Modifications are applied directly to the in-memory graph
Persistence is separate: Saving graphs back to files requires explicit serialization

In-Memory vs Persistent Changes

// Load graph
await context.LoadFromFileAsync("model.rdf", name: "cim-graph");

// Get graph reference
var graph = context.GetGraph("cim-graph");

// Modifications via repository/writer are immediate in memory
repository.Insert(newEntity);

// To persist changes, serialize the graph
var writer = new RdfXmlWriter();
writer.Save(graph, "model-updated.rdf");

GraphMetadata

The context tracks metadata for each graph internally:

Property	Type	Description
`Name`	string	Graph identifier
`Graph`	IGraph	The underlying VDS.RDF graph
`Created`	DateTimeOffset	When the graph was created
`LastLoaded`	DateTimeOffset	When data was last loaded into the graph
`IsModified`	bool	Whether the graph has been modified
`Source`	string?	Source identifier - file path, URI, "string", or "stream:{format}" for direct stream loads
`LoadCount`	long	Number of times data has been loaded (incremented on each load, preserved on overwrite)
`TripleCount`	int	Cached count of triples (updated via `UpdateCounts()`)
`NamespacePrefixCount`	int	Cached count of namespace prefixes (updated via `UpdateCounts()`)
`Extensions`	IDictionary	Typed extension dictionary for custom metadata (e.g., CimModelType)