RDF Graph Context¶
Purpose¶
RdfGraphContextBase serves as the central graph management system for RDF data within the application. It is an abstract base class that acts as a container and lifecycle manager for multiple named RDF graphs, providing operations to create, load, query, and dispose of graph instances.
The graph context is the foundation for both read and write operations performed by repositories.
Key Responsibilities¶
- Graph Lifecycle Management
  - Create and dispose of RDF graphs
  - Load RDF data from multiple sources (files, URIs, strings, streams)
  - Track graph metadata (creation time, load count, triple count, namespace prefix count, model type)
  - Dispose of graphs properly (sync and async)
- Multi-Source Data Loading
  - File-based loading with automatic format detection
  - URI/HTTP-based loading with content negotiation
  - String-based loading for in-memory RDF content
  - Stream-based loading with optimized buffering strategies
- Format Support
  - RDF/XML (.rdf, .xml)
  - Turtle (.ttl)
  - JSON-LD (.jsonld, .json)
  - N-Triples (.nt)
  - Automatic format detection from file extensions and content
- Thread-Safe Operations
  - Uses ConcurrentDictionary for graph storage, allowing concurrent access to graphs
  - Uses SemaphoreSlim to serialize async operations
  - Supports cancellation tokens for long-running operations
- Configuration Management
  - Configurable via RdfGraphContextOptions
  - Supports graph initializers for custom setup
  - Provides options for error handling, auto-reset, and memory management
Architecture¶
graph TD
A["RdfGraphContextBase"] --> B["ConcurrentDictionary<string, GraphMetadata>"]
A --> C["RdfGraphContextOptions"]
A --> D["SemaphoreSlim"]
A --> T["IRdfFormatDetector"]
B --> E["GraphMetadata"]
E --> F["IGraph"]
E --> G["Metadata<br/>(Created, LastLoaded, TripleCount, NamespacePrefixCount)"]
E --> EX["Extensions<Type, object>"]
C --> H["DefaultGraphName"]
C --> I["AutoResetOnReload"]
C --> J["GraphInitializers"]
C --> K["BaseUriResolver"]
C --> L["HTTP Configuration<br/>(Timeout, MaxConnections, UserAgent)"]
C --> M["Memory Management<br/>(MemoryStreamThreshold, StreamBufferSize)"]
A --> N["Load Operations"]
N --> O["LoadFromFileAsync"]
N --> P["LoadFromUriAsync"]
N --> Q["LoadFromStreamAsync"]
N --> R["LoadFromStringAsync"]
T --> S["Format Detection"]
S --> U["Content Analysis"]
S --> V["Confidence Scoring"]
S --> W["Hint Adjustment"]
Key Features¶
Graph Creation¶
public virtual IGraph CreateGraph(string? name = null, bool overwrite = false)
public virtual Task<IGraph> CreateGraphAsync(string? name = null, bool overwrite = false, CancellationToken cancellationToken = default)
- Creates a new graph with the specified name (defaults to DefaultGraphName)
- Supports overwriting existing graphs (preserves LoadCount from the existing graph when overwriting)
- Applies graph initializers for custom namespace prefixes and configuration
- Sets the graph's BaseUri using the configured BaseUriResolver
- Tracks creation metadata
- Implements proper error handling and resource cleanup on failures
Graph Retrieval¶
public virtual IGraph GetGraph(string? name = null)
public virtual Task<IGraph> GetGraphAsync(string? name = null, CancellationToken cancellationToken = default)
public virtual bool TryGetGraph(out IGraph graph, string? name = null)
public virtual bool GraphExists(string? name = null)
public virtual IReadOnlyList<string> GetGraphNames()
- Retrieves existing graphs by name
- Provides both synchronous and asynchronous access
- Includes safe TryGetGraph pattern
- Check existence with GraphExists
- List all graph names with GetGraphNames
Graph Removal¶
public virtual bool RemoveGraph(string name)
public virtual ValueTask<bool> RemoveGraphAsync(string name)
public virtual void Clear(string? name = null)
- Remove graphs by name (cannot remove default graph)
- Clear all triples from a graph without removing it
Data Loading¶
// From file with auto-detection (source is automatically set to file path)
// The graph name is passed by name because the second positional parameter is the format
await context.LoadFromFileAsync("data.ttl", name: "myGraph");
// From URI with content negotiation (source is automatically set to URI)
await context.LoadFromUriAsync(new Uri("https://example.org/data.rdf"), name: "myGraph");
// From URI string
await context.LoadFromUriAsync("https://example.org/data.rdf", name: "myGraph");
// From string content with auto-detection (format inferred from content)
await context.LoadFromStringAsync(rdfContent, name: "myGraph");
// From string content with explicit format
await context.LoadFromStringAsync(rdfContent, RdfFormat.Turtle, "myGraph");
// From stream with format specification and optional source tracking
await context.LoadFromStreamAsync(stream, RdfFormat.Turtle, "myGraph");
await context.LoadFromStreamAsync(stream, RdfFormat.Turtle, "myGraph", "custom-source-id");
Method Signatures:
Task LoadFromFileAsync(string filePath, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)
Task LoadFromUriAsync(Uri uri, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)
Task LoadFromUriAsync(string uriString, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)
Task LoadFromStringAsync(string rdfContent, RdfFormat? format = null, string? name = null, CancellationToken cancellationToken = default)
Task LoadFromStreamAsync(Stream stream, RdfFormat? format = null, string? name = null, string? source = null, CancellationToken cancellationToken = default)
Key Features:
- Asynchronous loading with proper cancellation support
- Automatic format detection for files, HTTP responses, and string content
- HTTP content negotiation with Accept headers for common RDF formats
- Error handling with configurable behavior for parse failures
- Memory optimization for large streams (configurable threshold)
- Source tracking: automatically tracks where data was loaded from (file path, URI, string content, or custom source)
Supported RdfFormat values:
- RdfFormat.Turtle - Turtle serialization
- RdfFormat.RdfXml - RDF/XML serialization
- RdfFormat.JsonLd - JSON-LD serialization
- RdfFormat.NTriples - N-Triples serialization
- RdfFormat.Unknown - Used when format cannot be determined
Format Detection¶
The context delegates format detection to IRdfFormatDetector, which analyzes stream content with confidence scoring.
Detection Strategy:
1. Content Analysis: Reads a sample of bytes from the stream and analyzes patterns
2. Hint Adjustment: Applies bonuses/penalties based on file extension and MIME type hints
3. Confidence Threshold: Returns the format if confidence meets the threshold (default: 50%)
4. Fallback: Returns RdfFormat.Unknown if no format meets the threshold
Detection Result:
public class DetectionResult
{
public RdfFormat Format { get; set; }
public int Confidence { get; set; } // 0-100%
public string? DetectedBy { get; set; } // "content", "extension", etc.
public string? Warning { get; set; }
public bool IsReliable => Confidence >= 70;
}
Supported Formats:
- Turtle: .ttl files, text/turtle Content-Type
- RDF/XML: .rdf/.xml files, application/rdf+xml Content-Type
- JSON-LD: .jsonld/.json files, application/ld+json Content-Type
- N-Triples: .nt files, application/n-triples Content-Type
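The detection strategy can be illustrated with a minimal, self-contained sketch. It is not the IRdfFormatDetector implementation: the content markers, the scores, and the hint bonus are assumptions chosen for illustration, formats are represented as plain strings instead of RdfFormat values, and only three formats are covered. Only the 50% threshold comes from the description above.

```csharp
#nullable enable
using System;

const int Threshold = 50;  // documented default confidence threshold
const int HintBonus = 20;  // assumed bonus when the extension hint agrees with the content

// Usage: Turtle content plus a matching .ttl hint scores 60 + 20.
var result = Detect("@prefix ex: <http://example.org/> .", ".ttl");
Console.WriteLine($"{result.Format} ({result.Confidence}%)"); // Turtle (80%)

// Step 1: content analysis with simple, illustrative markers (assumed, not the real rules).
int ScoreContent(string sample, string format)
{
    string s = sample.TrimStart();
    switch (format)
    {
        case "RdfXml": return s.StartsWith("<?xml") || s.Contains("<rdf:RDF") ? 60 : 0;
        case "JsonLd":
            if (!s.StartsWith("{") && !s.StartsWith("[")) return 0;
            return s.Contains("\"@context\"") ? 70 : 40;
        case "Turtle": return s.Contains("@prefix") || s.Contains("PREFIX ") ? 60 : 0;
        default: return 0;
    }
}

// Maps a file extension hint to a format name, or null when there is no usable hint.
string? HintFor(string? extension) => extension?.ToLowerInvariant() switch
{
    ".ttl" => "Turtle",
    ".rdf" or ".xml" => "RdfXml",
    ".jsonld" or ".json" => "JsonLd",
    _ => null
};

(string Format, int Confidence) Detect(string sample, string? extensionHint)
{
    string bestFormat = "Unknown";
    int bestScore = 0;
    foreach (var format in new[] { "Turtle", "RdfXml", "JsonLd" })
    {
        int score = ScoreContent(sample, format);
        // Step 2: hint adjustment rewards agreement between hint and content.
        if (score > 0 && HintFor(extensionHint) == format) score += HintBonus;
        score = Math.Min(score, 100);
        if (score > bestScore) { bestFormat = format; bestScore = score; }
    }
    // Steps 3 and 4: accept the best candidate only if it meets the threshold.
    if (bestScore < Threshold) bestFormat = "Unknown";
    return (bestFormat, bestScore);
}
```

In this sketch a JSON document without an @context key scores 40 and is reported as Unknown, which shows why passing an explicit format is the safer choice when the format is known.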
HTTP Configuration¶
The context provides comprehensive HTTP client configuration for URI-based loading:
- Timeout Configuration: Configurable request timeout (default: 30 seconds)
- Connection Pooling: Configurable maximum concurrent connections (default: 10)
- User-Agent Header: Configurable User-Agent string (default: "GridLabRdfGraph/1.0")
- Content Negotiation: Automatic Accept headers for RDF formats
- Error Handling: Proper handling of network errors and timeouts
HTTP Accept Headers sent:
Accept: text/turtle, application/rdf+xml, application/ld+json, text/plain
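As a rough sketch of how these settings map onto the .NET HTTP stack, the snippet below builds an HttpClient with the documented defaults. How the context wires its client internally is not shown in this document, so the use of SocketsHttpHandler and MaxConnectionsPerServer for the connection limit is an assumption.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Values mirror the documented defaults (HttpTimeout, MaxHttpConnections, HttpUserAgent).
var handler = new SocketsHttpHandler { MaxConnectionsPerServer = 10 };
using var client = new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(30) };
client.DefaultRequestHeaders.UserAgent.ParseAdd("GridLabRdfGraph/1.0");

// Content negotiation: advertise the RDF media types listed in the Accept header above.
foreach (var mediaType in new[] { "text/turtle", "application/rdf+xml", "application/ld+json", "text/plain" })
{
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue(mediaType));
}

Console.WriteLine($"Accept entries: {client.DefaultRequestHeaders.Accept.Count}"); // Accept entries: 4
```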
Memory Management¶
The context implements several strategies for efficient memory management:
- Configurable memory threshold (MemoryStreamThreshold, default 10MB)
- Large stream optimization: Streams larger than the threshold are buffered to memory for better async performance
- Small stream efficiency: Smaller streams use direct parsing for efficiency
- Proper disposal patterns: Sync and async disposal of graphs and metadata
- Configurable stream buffer size (StreamBufferSize, default 8KB)
- Resource cleanup: Automatic cleanup on errors and exceptions
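The threshold rule can be sketched as follows. This is an illustration under stated assumptions, not the context's actual code: PrepareForParsingAsync is a hypothetical helper, and it assumes the stream length can only be checked when the stream is seekable.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

const int MemoryStreamThreshold = 10 * 1024 * 1024; // documented default (10MB)
const int StreamBufferSize = 8192;                  // documented default (8KB)

// Usage: a 1KB stream is below the threshold, so it is handed to the parser unchanged.
using var small = new MemoryStream(new byte[1024]);
var prepared = await PrepareForParsingAsync(small, MemoryStreamThreshold, StreamBufferSize);
Console.WriteLine(object.ReferenceEquals(prepared, small)); // True

// Hypothetical helper: returns an in-memory copy for large seekable streams,
// otherwise the original stream for direct parsing.
async Task<Stream> PrepareForParsingAsync(
    Stream source, int threshold, int bufferSize, CancellationToken ct = default)
{
    if (source.CanSeek && source.Length > threshold)
    {
        var buffered = new MemoryStream();
        await source.CopyToAsync(buffered, bufferSize, ct);
        buffered.Position = 0; // rewind so the parser reads from the start
        return buffered;
    }
    return source;
}
```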
Graph Initializers¶
Graph initializers implement the IRdfGraphInitializer interface and are automatically executed when graphs are created or loaded. They provide a mechanism to configure graphs with default settings, namespace prefixes, or seed data.
Graph initializers are designed to:
- Pre-register Common Namespace Prefixes: Avoid repetitive namespace declarations in RDF documents
- Support CIM Standards: Configure graphs for Common Information Model (CIM) data exchange
- Enable Domain-Specific Extensions: Add custom namespaces for industry standards (ENTSO-E, CGMES, etc.)
- Seed Default Data: Optionally add baseline triples or vocabulary definitions
- Ensure Consistency: Guarantee that all graphs follow organizational naming conventions
Initialization Flow¶
The initialization flow differs based on the operation:
For new empty graphs (CreateGraph / CreateGraphAsync):
1. Create new graph instance
2. Resolve base URI using Options.ResolveBaseUri(name, metaData)
3. Apply initializers immediately (no data to detect from)
4. Update metadata (counts, timestamps)
For loading operations (LoadFromStreamAsync, LoadFromFileAsync, etc.):
1. Create new graph or get existing graph
2. Set fallback base URI from Options.ResolveBaseUri(name, metaData)
3. Load RDF data first (namespaces from file are now available)
4. Apply initializers after loading so detection can inspect loaded graph metadata
5. Re-resolve base URI using Options.ResolveBaseUri(name, metaData), which can now inspect loaded graph metadata
6. Update metadata (counts, timestamps, source, load count)
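To show why the load path applies initializers and re-resolves the base URI only after parsing, here is a small sketch that uses a plain dictionary as a stand-in for a graph's prefix map. The base URIs and the version check are hypothetical, and no IGraph or initializer API is used.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a graph's prefix map; the real IGraph API is not used here.
var namespaces = new Dictionary<string, Uri>();

// Step 2: fallback base URI, set before any data is available (hypothetical value).
var baseUri = new Uri("https://example.org/models/unknown/");

// Step 3: loading brings in the prefixes declared by the file itself.
namespaces["cim"] = new Uri("http://iec.ch/TC57/CIM100#");

// Step 4: initializers add only what the file did not declare.
namespaces.TryAdd("cim", new Uri("http://iec.ch/TC57/2013/CIM-schema-cim16#"));   // ignored, already loaded
namespaces.TryAdd("rdf", new Uri("http://www.w3.org/1999/02/22-rdf-syntax-ns#")); // added

// Step 5: re-resolve the base URI now that the loaded namespaces can be inspected.
if (namespaces.TryGetValue("cim", out var cim) && cim.AbsoluteUri.Contains("CIM100"))
{
    baseUri = new Uri("https://example.org/models/cim100/");
}

Console.WriteLine(baseUri); // https://example.org/models/cim100/
```

Had the initializer run before loading, the default cim prefix would already be registered and the version-specific base URI could not be derived from the file's own declarations.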
sequenceDiagram
participant Client
participant Context as RdfGraphContextBase
participant Graph as IGraph
participant Parser as IRdfReader
participant Initializer as IRdfGraphInitializer
Client->>Context: LoadFromStreamAsync(stream, format, name, source)
Context->>Graph: Create new Graph()
Context->>Graph: Set fallback BaseUri via ResolveBaseUri(name, metaData)
Context->>Parser: LoadRdfDataAsync(parser, graph, stream)
Parser->>Graph: Parse and add triples/namespaces
Note over Graph: Namespaces from RDF file are now available
Context->>Initializer: ApplyInitializers
Initializer->>Graph: Inspect namespaces, add missing prefixes
Context->>Context: ResolveBaseUri(name, metaData)
Note over Context: Can now detect version from loaded namespaces
Context->>Graph: Update BaseUri with resolved value
Context->>Context: Update metadata (Source, LoadCount++)
Context-->>Client: Graph ready with correct configuration
Adding Custom Initializers¶
options.GraphInitializers.Add(new CustomNamespaceInitializer());
Configuration Options¶
Configurable via RdfGraphContextOptions:
public class RdfGraphContextOptions
{
/// <summary>
/// Default graph name when not specified.
/// </summary>
public string DefaultGraphName { get; set; } = "Default";
/// <summary>
/// Strategy for resolving base URI for graphs.
/// Receives the graph name and the graph instance (which may contain loaded data).
/// May return null if no base URI can be resolved.
/// </summary>
public Func<string, IGraph?, Uri?> BaseUriResolver { get; set; }
/// <summary>
/// Ignore validation/parse errors and keep the old graph if load fails.
/// </summary>
public bool IgnoreValidationErrors { get; set; } = false;
/// <summary>
/// Automatically clear previous triples before loading new content.
/// </summary>
public bool AutoResetOnReload { get; set; } = true;
/// <summary>
/// Clear the graph if parsing fails when loading new content.
/// </summary>
public bool ClearOnParseFailure { get; set; } = true;
/// <summary>
/// Streams larger than this threshold (in bytes) are copied to memory for parsing.
/// </summary>
public int MemoryStreamThreshold { get; set; } = 10 * 1024 * 1024; // 10MB
/// <summary>
/// Buffer size for stream reading operations.
/// </summary>
public int StreamBufferSize { get; set; } = 8192; // 8KB
/// <summary>
/// Timeout for HTTP requests when loading from URIs.
/// </summary>
public TimeSpan HttpTimeout { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>
/// Maximum number of concurrent HTTP connections.
/// </summary>
public int MaxHttpConnections { get; set; } = 10;
/// <summary>
/// User-Agent string for HTTP requests.
/// </summary>
public string HttpUserAgent { get; set; } = "GridLabRdfGraph/1.0";
/// <summary>
/// Graph initializers executed after a new graph is created or reloaded.
/// </summary>
public List<IRdfGraphInitializer> GraphInitializers { get; set; } = new();
/// <summary>
/// Resolve base URI for a graph, using the configured BaseUriResolver.
/// </summary>
public Uri? ResolveBaseUri(string? graphName = null, GraphMetadata? metaData = null)
}
The BaseUriResolver is a delegate that allows custom logic for resolving base URIs. The default implementation generates URIs using the static https://gridlab.io/models/1/ value. Ensure URIs end with /.
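A custom resolver might look like the sketch below. It is simplified on purpose: the documented delegate also receives the graph instance, which is omitted here so the example stays self-contained, and the https://example.org/graphs/ root and the EnsureTrailingSlash helper are hypothetical.

```csharp
#nullable enable
using System;

var root = new Uri("https://example.org/graphs/"); // hypothetical root, not the library default

// Simplified resolver: returns null when no base URI can be resolved.
Func<string, Uri?> resolver = graphName =>
    string.IsNullOrWhiteSpace(graphName)
        ? null
        : EnsureTrailingSlash(new Uri(root, Uri.EscapeDataString(graphName)));

Console.WriteLine(resolver("cim-graph")); // https://example.org/graphs/cim-graph/

// Guarantees the trailing slash that base URIs are expected to end with.
Uri EnsureTrailingSlash(Uri uri) =>
    uri.AbsoluteUri.EndsWith("/", StringComparison.Ordinal) ? uri : new Uri(uri.AbsoluteUri + "/");
```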
Utility Methods¶
// Check if graph has been loaded (has triples)
public virtual bool IsLoaded(string? name = null)
// Get timestamp when graph was last loaded
public virtual DateTimeOffset GetLoadedAt(string? name = null)
// Get namespaces defined in the graph
public virtual Dictionary<string, Uri> GetNamespaces(string? name = null)
// Get number of triples in the graph
public virtual int GetTripleCount(string? name = null)
// Get number of namespace prefixes in the graph
public virtual int GetNamespacePrefixCount(string? name = null)
// Get complete metadata for a graph
public virtual GraphMetadata? GetGraphMetadata(string? name = null)
Note: These methods provide access to graph metadata and statistics for monitoring and diagnostic purposes. Domain-specific metadata like CimModelType is stored in the Extensions dictionary and accessed via GraphMetadataExtensions.
Error Handling¶
The RDF Graph Context provides comprehensive error handling for various failure scenarios:
HTTP Loading Errors¶
When loading from URIs, the context handles:
- HttpRequestException: Network or HTTP protocol errors → throws RdfLoadException
- TaskCanceledException: Request timeouts → throws RdfTimeoutException
- Content negotiation failures: Falls back to format detection from file extension or content sniffing
Parse Errors¶
- RdfException: RDF parsing failures from VDS.RDF
- Configurable behavior via the IgnoreValidationErrors option
- Automatic cleanup with the ClearOnParseFailure option
- Error logging with context information
Resource Management¶
- Automatic disposal of resources on errors
- Graph cleanup on creation failures
- Memory leak prevention through proper exception handling
Example Error Handling¶
try
{
    // Pass the graph name by name; the second positional parameter is the format
    await context.LoadFromUriAsync("https://example.org/invalid.rdf", name: "test-graph");
}
catch (RdfLoadException ex)
{
    // Handle HTTP/network errors (passing ex keeps the stack trace in the log)
    Logger.LogError(ex, "Failed to load from URI: {Uri}", ex.Data["Uri"]);
}
catch (RdfTimeoutException ex)
{
    // Handle timeout errors
    Logger.LogError(ex, "Request timeout for URI: {Uri}", ex.Data["Uri"]);
}
catch (RdfParseException ex)
{
    // Handle RDF parsing errors
    Logger.LogError(ex, "Failed to parse RDF content: {Message}", ex.Message);
}
Architectural Relationships¶
The RdfGraphContextBase is a foundational component within the RDF data management layer of the application.
It interacts with the RdfGraphRepository component. Repositories use the context to access and manipulate RDF graphs for both read and write operations, while the context handles the underlying graph lifecycle and data loading.
Dependency Injection¶
RdfGraphContextBase is an abstract class designed to be extended by concrete implementations to provide RDF graph processing capabilities.
public abstract class RdfGraphContextBase : IRdfGraphContext
{
public ITransientCachedServiceProvider CachedServiceProvider { get; set; } = default!;
public ILogger<RdfGraphContextBase> Logger =>
CachedServiceProvider.GetService<ILogger<RdfGraphContextBase>>(
NullLogger<RdfGraphContextBase>.Instance);
public RdfGraphContextBase(IOptions<RdfGraphContextOptions> options)
{
Options = options?.Value ?? new RdfGraphContextOptions();
}
}
Concrete implementations are registered as scoped dependencies and can be configured in different ways:
public class MyRdfGraphContext : RdfGraphContextBase
{
public MyRdfGraphContext(IOptions<RdfGraphContextOptions> options) : base(options) { }
}
Method 1: AddCimGraphContext, CGMES-aware registration
// Option A: Uses CGMES defaults automatically
// Applies: DefaultNamespaceInitializer, DefaultModelTypeInitializer, CgmesBaseUriResolver
services.AddCimGraphContext<MyRdfGraphContext>();
// Option B: CGMES defaults + custom overrides
services.AddCimGraphContext<MyRdfGraphContext>(options =>
{
options.DefaultGraphName = "CustomDefault";
options.AutoResetOnReload = false;
// CGMES defaults are still applied after your custom configuration
});
- With configure = null: Full CGMES configuration is applied automatically
- With custom configure: Your custom settings are applied first, then CGMES defaults fill in any missing configuration
- Always includes: CIM/CGMES namespace prefixes, model type detection, and URI resolution
When using AddCimGraphContext, these components are automatically configured:
- DefaultNamespaceInitializer - Adds standard CIM/CGMES namespace prefixes
- DefaultModelTypeInitializer - Detects FullModel vs DifferenceModel from graph headers
- CgmesBaseUriResolver - Resolves base URIs for CGMES graphs with proper versioning
Method 2: AddRdfGraphContext, Pure custom registration
// Full control - no CGMES configuration is applied
services.AddRdfGraphContext<MyRdfGraphContext>(options =>
{
// You must configure everything manually
// No CGMES-specific features are enabled by default
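// Uses the members documented on RdfGraphContextOptions (GraphInitializers, BaseUriResolver)
options.GraphInitializers.Add(new CustomNamespaceInitializer());
options.GraphInitializers.Add(new CustomModelTypeInitializer()); // hypothetical custom initializer
options.BaseUriResolver = (graphName, graph) => new Uri($"https://example.org/graphs/{graphName}/"); // hypothetical URI scheme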
});
The CGMES configuration is applied through CgmesGraphContextOptionsConfigurator which implements IConfigureOptions<RdfGraphContextOptions>:
public sealed class CgmesGraphContextOptionsConfigurator : IConfigureOptions<RdfGraphContextOptions>
{
public void Configure(RdfGraphContextOptions options)
{
// Applies CGMES configuration using dependency-injected services
options.UseCgmes(_cimVersionOptions, _cgmesVersionOptions, _cimModelTypeResolver);
}
}
Key Integration Points¶
Graph Context as Data Source¶
RdfGraphRepository depends on IRdfGraphContext (typically a concrete implementation of RdfGraphContextBase) to access the underlying RDF graph data:
public class RdfGraphRepository<TGraphContext, TRdfDefinition, TEntity, TKey>
where TGraphContext : class, IRdfGraphContext
where TRdfDefinition : class, IRdfDefinition
where TEntity : class, IEntity<TKey>
{
protected TGraphContext GraphContext { get; }
public RdfGraphRepository(TGraphContext graphContext, ...)
{
GraphContext = graphContext;
}
}
The context handles low-level RDF operations, while the repository handles high-level entity mapping. This separation keeps graph lifecycle concerns out of the repositories and makes both layers easier to test.
See: Rdf Graph Repository Documentation for detailed information.
Dependency Flow¶
graph LR
A["Client Application"] --> B["RdfGraphRepository"]
B --> C["RdfGraphContextBase"]
B --> D["IRdfInstanceParser"]
B --> E["IRdfInstanceWriter"]
B --> F["IRdfMapperFactory"]
C --> G["IGraph<br/>(VDS.RDF)"]
C --> H["IRdfGraphInitializer"]
C --> I["RdfGraphContextOptions"]
I --> J["BaseUriResolver"]
I --> K["GraphInitializers"]
D --> G
E --> G
F --> L["IRdfEntityMapper"]
L --> M["Domain Entity"]
Graph Modification¶
While the RdfGraphContextBase itself doesn't directly modify graph content (that's the responsibility of IRdfInstanceWriter), it provides the IGraph instances that are modified:
- Context provides graphs: context.GetGraph(name) returns the mutable IGraph
- Writer modifies graphs: writer.InsertInstance(graph, definition) asserts triples
- Changes are immediate: Modifications are applied directly to the in-memory graph
- Persistence is separate: Saving graphs back to files requires explicit serialization
In-Memory vs Persistent Changes¶
// Load graph
await context.LoadFromFileAsync("model.rdf", "cim-graph");
// Get graph reference
var graph = context.GetGraph("cim-graph");
// Modifications via repository/writer are immediate in memory
repository.Insert(newEntity);
// To persist changes, serialize the graph
// dotNetRDF writers save to a file name or a TextWriter
var writer = new RdfXmlWriter();
writer.Save(graph, "model-updated.rdf");
GraphMetadata¶
The context tracks metadata for each graph internally:
| Property | Type | Description |
|---|---|---|
| Name | string | Graph identifier |
| Graph | IGraph | The underlying VDS.RDF graph |
| Created | DateTimeOffset | When the graph was created |
| LastLoaded | DateTimeOffset | When data was last loaded into the graph |
| IsModified | bool | Whether the graph has been modified |
| Source | string? | Source identifier - file path, URI, "string", or "stream:{format}" for direct stream loads |
| LoadCount | long | Number of times data has been loaded (incremented on each load, preserved on overwrite) |
| TripleCount | int | Cached count of triples (updated via UpdateCounts()) |
| NamespacePrefixCount | int | Cached count of namespace prefixes (updated via UpdateCounts()) |
| Extensions | IDictionary | Typed extension dictionary for custom metadata (e.g., CimModelType) |
Key Features:
- Automatic count updates after loading or modifying graphs
- Source tracking for debugging and audit purposes
- Extensible metadata via typed extension dictionary (use GraphMetadataExtensions for typed access)
- Load statistics for performance monitoring
Metadata Extensions¶
The Extensions dictionary allows storing typed metadata using GraphMetadataExtensions:
// Set an extension
metadata.SetExtension(new CimModelTypeExtension { ModelType = CimModelType.FullModel });
// Get an extension
var ext = metadata.GetExtension<CimModelTypeExtension>();
// Get or create an extension
var extOrNew = metadata.GetOrCreateExtension<CimModelTypeExtension>();
// Check if extension exists
bool exists = metadata.HasExtension<CimModelTypeExtension>();
// Remove an extension
bool removed = metadata.RemoveExtension<CimModelTypeExtension>();
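One plausible way to back such helpers is a dictionary keyed by the extension's type. The sketch below is an assumption about the approach, not the GraphMetadataExtensions source: the helper names mirror the calls above but are written as local functions over a standalone dictionary, and System.Version stands in for an extension type such as CimModelTypeExtension.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Stand-in for the Extensions dictionary, keyed by the extension's type.
var extensions = new Dictionary<Type, object>();

SetExtension(new Version(3, 0));               // Version is only a placeholder extension type
Console.WriteLine(GetExtension<Version>());    // 3.0
Console.WriteLine(HasExtension<Version>());    // True
Console.WriteLine(RemoveExtension<Version>()); // True

void SetExtension<T>(T value) where T : class => extensions[typeof(T)] = value;

T? GetExtension<T>() where T : class =>
    extensions.TryGetValue(typeof(T), out var value) ? (T)value : null;

// Returns the stored extension, or stores and returns a new default instance.
T GetOrCreateExtension<T>() where T : class, new()
{
    if (GetExtension<T>() is { } existing) return existing;
    var created = new T();
    SetExtension(created);
    return created;
}

bool HasExtension<T>() where T : class => extensions.ContainsKey(typeof(T));

bool RemoveExtension<T>() where T : class => extensions.Remove(typeof(T));
```

Keying by type means each extension type can be stored at most once per graph, which keeps lookups free of string keys and casts at the call site.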
Implementation Notes¶
Current Limitations¶
- Format Detection: Content sniffing for format detection is limited and may not work reliably for all RDF content types. Explicit format specification is recommended when the format is known.
- Stream Seeking: HTTP response streams that don't support seeking may cause issues during format detection. The system attempts to handle this gracefully but may throw NotSupportedException in some cases.
Thread Safety¶
- Graph Storage: Uses ConcurrentDictionary for thread-safe graph storage
- Async Operations: Protected by SemaphoreSlim to ensure proper serialization of async operations
- Graph Access: Individual graph instances are not thread-safe; external synchronization is required for concurrent access to graph contents
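The combination of the two primitives can be sketched with standard library types only. In this illustration a List<string> stands in for a graph whose contents are not thread-safe, and LoadAsync is a hypothetical method; the context's real locking granularity is not documented here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Thread-safe storage for named graphs; each List<string> is itself not thread-safe.
var graphs = new ConcurrentDictionary<string, List<string>>();
var loadLock = new SemaphoreSlim(1, 1);

// 100 concurrent loads into the same graph complete without losing updates.
await Task.WhenAll(Enumerable.Range(0, 100).Select(i => LoadAsync("default", $"triple-{i}")));
Console.WriteLine(graphs["default"].Count); // 100

async Task LoadAsync(string name, string triple, CancellationToken ct = default)
{
    await loadLock.WaitAsync(ct); // serialize async load operations
    try
    {
        var graph = graphs.GetOrAdd(name, _ => new List<string>());
        await Task.Yield();       // placeholder for real async parsing work
        graph.Add(triple);        // safe only because the semaphore is held
    }
    finally
    {
        loadLock.Release();
    }
}
```

Code that reads or mutates a graph outside such a guarded method needs its own synchronization, which is the external locking the last bullet refers to.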
Performance Considerations¶
- Memory Threshold: Large streams (>10MB by default) are buffered to memory for better async performance
- HTTP Connection Pooling: Configurable connection limits prevent resource exhaustion
- Lazy Disposal: Graphs are disposed asynchronously when possible to avoid blocking operations
- Metadata Caching: Triple and namespace counts are cached and updated only when necessary
Disposal Pattern¶
The context implements both IDisposable and IAsyncDisposable:
// Synchronous disposal
context.Dispose();
// Asynchronous disposal (preferred)
await context.DisposeAsync();
// Using statement (automatically disposes)
using var context = serviceProvider.GetRequiredService<MyRdfGraphContext>();
Best Practices¶
- Use async methods when possible for better scalability
- Specify formats explicitly when known to avoid detection overhead
- Configure memory thresholds based on your application's memory constraints
- Implement custom base URI resolvers for domain-specific requirements
- Use graph initializers to ensure consistent namespace configurations
Related Resources¶
- Repository Documentation - How repositories use context for CRUD operations
- Mappers Documentation - Bidirectional entity mapping
References¶
- IEC 61970-301:2016 - Common Information Model (CIM) standard
- IEC 61970-552:2016 - CIMXML Model Exchange Specification
- ENTSO-E CGMES Documentation
- RDF 1.1 Concepts
- W3C RDF Schema
- VDS.RDF (dotNetRDF) - The underlying RDF library