2 changes: 1 addition & 1 deletion .github/workflows/pre-commit.yml
@@ -15,6 +15,6 @@ jobs:
with:
version: "0.9.28"
- name: Install dependencies (including dev)
run: uv sync --group dev
run: uv sync --group dev --extra sparql
- name: Run pre-commit
run: uv run pre-commit run --all-files
17 changes: 16 additions & 1 deletion CHANGELOG.md
@@ -5,7 +5,21 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
## [1.6.0] - 2026-02-17

### Added
- **SPARQL / RDF resource support**: Ingest data from SPARQL endpoints (e.g. Apache Fuseki) and local RDF files (`.ttl`, `.rdf`, `.n3`, `.jsonld`) into property graphs
- New `SparqlPattern` for mapping `rdf:Class` instances to resources, alongside existing `FilePattern` and `TablePattern`
- New `RdfDataSource` abstract parent with shared RDF-to-dict conversion logic; concrete subclasses `RdfFileDataSource` (local files via rdflib) and `SparqlEndpointDataSource` (remote endpoints via SPARQLWrapper)
- New `SparqlEndpointConfig` (extends `DBConfig`) with `from_docker_env()` for Fuseki containers
- New `RdfInferenceManager` auto-infers graflo `Schema` from OWL/RDFS ontologies: `owl:Class` to vertices, `owl:DatatypeProperty` to fields, `owl:ObjectProperty` to edges
- `GraphEngine.infer_schema_from_rdf()` and `GraphEngine.create_patterns_from_rdf()` for the RDF inference workflow
- `Patterns` class extended with `sparql_patterns` and `sparql_configs` dicts
- `RegistryBuilder` handles `ResourceType.SPARQL` to create the appropriate data sources
- `ResourceType.SPARQL`, `DataSourceType.SPARQL`, `DBType.SPARQL` enum values
- `rdflib` and `SPARQLWrapper` available as the `sparql` optional extra (`pip install graflo[sparql]`)
- Docker scripts (`start-all.sh`, `stop-all.sh`, `cleanup-all.sh`) updated to include Fuseki
- Test suite with 22 tests: RDF file parsing, ontology inference, and live Fuseki integration

### Changed
- **Top-level imports optimized**: Key classes are now importable directly from `graflo`:
@@ -17,6 +31,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **`graflo.filter` package exports**: `FilterExpression`, `ComparisonOperator`, and `LogicalOperator` are now re-exported from `graflo.filter.__init__` (previously only available via `graflo.filter.onto`)

### Documentation
- Added data-flow diagram (Pattern -> DataSource -> Resource -> GraphContainer -> Target DB) to Concepts page
- Added **Mermaid class diagrams** to Concepts page showing:
- `GraphEngine` orchestration: how `GraphEngine` delegates to `InferenceManager`, `ResourceMapper`, `Caster`, and `ConnectionManager`
- `Schema` architecture: the full hierarchy from `Schema` through `VertexConfig`/`EdgeConfig`, `Resource`, `Actor` subtypes, `Field`, and `FilterExpression`
107 changes: 62 additions & 45 deletions README.md
@@ -1,8 +1,8 @@
# GraFlo <img src="https://raw.githubusercontent.com/growgraph/graflo/main/docs/assets/favicon.ico" alt="graflo logo" style="height: 32px; width:32px;"/>

A framework for transforming **tabular** (CSV, SQL) and **hierarchical** data (JSON, XML) into property graphs and ingesting them into graph databases (ArangoDB, Neo4j, **TigerGraph**, **FalkorDB**, **Memgraph**).
A framework for transforming **tabular** (CSV, SQL), **hierarchical** (JSON, XML), and **RDF/SPARQL** data into property graphs and ingesting them into graph databases (ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph).

> **⚠️ Package Renamed**: This package was formerly known as `graphcast`.
> **Package Renamed**: This package was formerly known as `graphcast`.

![Python](https://img.shields.io/badge/python-3.11%2B-blue.svg)
[![PyPI version](https://badge.fury.io/py/graflo.svg)](https://badge.fury.io/py/graflo)
@@ -11,56 +11,35 @@ A framework for transforming **tabular** (CSV, SQL) and **hierarchical** data (J
[![pre-commit](https://github.com/growgraph/graflo/actions/workflows/pre-commit.yml/badge.svg)](https://github.com/growgraph/graflo/actions/workflows/pre-commit.yml)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15446131.svg)]( https://doi.org/10.5281/zenodo.15446131)

## Core Concepts
## Overview

### Property Graphs
graflo works with property graphs, which consist of:
graflo reads data from multiple source types, transforms it according to a declarative schema, and writes property-graph vertices and edges to a target graph database. The pipeline is:

- **Vertices**: Nodes with properties and optional unique identifiers
- **Edges**: Relationships between vertices with their own properties
- **Properties**: Both vertices and edges may have properties
**Pattern** (where data lives) --> **DataSource** (how to read it) --> **Resource** (what to extract) --> **GraphContainer** --> **Target DB**

### Schema
The Schema defines how your data should be transformed into a graph and contains:
### Supported sources

- **Vertex Definitions**: Specify vertex types, their properties, and unique identifiers
- Fields can be specified as strings (backward compatible) or typed `Field` objects with types (INT, FLOAT, STRING, DATETIME, BOOL)
- Type information enables better validation and database-specific optimizations
- **Edge Definitions**: Define relationships between vertices and their properties
- Weight fields support typed definitions for better type safety
- **Resource Mapping**: describe how data sources map to vertices and edges
- **Transforms**: Modify data during the casting process
- **Automatic Schema Inference**: Generate schemas automatically from PostgreSQL 3NF databases
| Source type | Pattern | DataSource | Schema inference |
|---|---|---|---|
| CSV / JSON / JSONL / Parquet files | `FilePattern` | `FileDataSource` | manual |
| PostgreSQL tables | `TablePattern` | `SQLDataSource` | automatic (3NF with PK/FK) |
| RDF files (`.ttl`, `.rdf`, `.n3`) | `SparqlPattern` | `RdfFileDataSource` | automatic (OWL/RDFS ontology) |
| SPARQL endpoints (Fuseki, ...) | `SparqlPattern` | `SparqlEndpointDataSource` | automatic (OWL/RDFS ontology) |
| REST APIs | -- | `APIDataSource` | manual |
| In-memory (list / DataFrame) | -- | `InMemoryDataSource` | manual |

### Resources
Resources are your data sources that can be:
### Supported targets

- **Table-like**: CSV files, database tables
- **JSON-like**: JSON files, nested data structures
ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph -- same API for all.

## Features

- **Graph Transformation Meta-language**: A powerful declarative language to describe how your data becomes a property graph:
- Define vertex and edge structures with typed fields
- Set compound indexes for vertices and edges
- Use blank vertices for complex relationships
- Specify edge constraints and properties with typed weight fields
- Apply advanced filtering and transformations
- **Typed Schema Definitions**: Enhanced type support throughout the schema system
- Vertex fields support types (INT, FLOAT, STRING, DATETIME, BOOL) for better validation
- Edge weight fields can specify types for improved type safety
- Backward compatible: fields without types default to None (suitable for databases like ArangoDB)
- **🚀 PostgreSQL Schema Inference**: **Automatically generate schemas from PostgreSQL 3NF databases** - No manual schema definition needed!
- Introspect PostgreSQL schemas to identify vertex-like and edge-like tables
- Automatically map PostgreSQL data types to graflo Field types (INT, FLOAT, STRING, DATETIME, BOOL)
- Infer vertex configurations from table structures with proper indexes
- Infer edge configurations from foreign key relationships
- Create Resource mappings from PostgreSQL tables automatically
- Direct database access - ingest data without exporting to files first
- **Async ingestion**: Efficient async/await-based ingestion pipeline for better performance
- **Parallel processing**: Use as many cores as you have
- **Database support**: Ingest into ArangoDB, Neo4j, **TigerGraph**, **FalkorDB**, and **Memgraph** using the same API (database agnostic). Source data from PostgreSQL and other SQL databases.
- **Server-side filtering**: Efficient querying with server-side filtering support (TigerGraph REST++ API)
- **Declarative graph transformation**: Define vertex/edge structures, indexes, weights, and transforms in YAML or Python dicts. Resources describe how each data source maps to vertices and edges.
- **Schema inference**: Automatically generate schemas from PostgreSQL 3NF databases (PK/FK heuristics) or from OWL/RDFS ontologies (class/property introspection).
- **RDF / SPARQL ingestion**: Read `.ttl` files via rdflib or query SPARQL endpoints (e.g. Apache Fuseki). `owl:Class` maps to vertices, `owl:ObjectProperty` to edges, `owl:DatatypeProperty` to vertex fields.
- **Typed fields**: Vertex fields and edge weights support types (`INT`, `FLOAT`, `STRING`, `DATETIME`, `BOOL`) for validation and database-specific optimization.
- **Parallel batch processing**: Configurable batch sizes and multi-core execution.
- **Database-agnostic**: Single API targeting ArangoDB, Neo4j, TigerGraph, FalkorDB, and Memgraph. Source data from PostgreSQL, SPARQL endpoints, files, APIs, or in-memory objects.

## Documentation
Full documentation is available at: [growgraph.github.io/graflo](https://growgraph.github.io/graflo)
@@ -69,6 +48,9 @@ Full documentation is available at: [growgraph.github.io/graflo](https://growgra

```bash
pip install graflo

# With RDF / SPARQL support (adds rdflib + SPARQLWrapper)
pip install graflo[sparql]
```

## Usage Examples
@@ -187,6 +169,34 @@ caster = Caster(schema)
# ... continue with ingestion
```

### RDF / SPARQL Ingestion

```python
from pathlib import Path
from graflo.hq import GraphEngine
from graflo.db.connection.onto import ArangoConfig

engine = GraphEngine()

# Infer schema from an OWL/RDFS ontology file
ontology = Path("ontology.ttl")
schema = engine.infer_schema_from_rdf(source=ontology)

# Create data-source patterns (reads a local .ttl file per rdf:Class)
patterns = engine.create_patterns_from_rdf(source=ontology)

# Or point at a SPARQL endpoint instead:
# from graflo.db.connection.onto import SparqlEndpointConfig
# sparql_cfg = SparqlEndpointConfig(uri="http://localhost:3030", dataset="mydata")
# patterns = engine.create_patterns_from_rdf(
# source=ontology,
# endpoint_url=sparql_cfg.query_endpoint,
# )

target = ArangoConfig.from_docker_env()
engine.define_and_ingest(schema=schema, target_db_config=target, patterns=patterns)
```

## Development

To install requirements
@@ -235,25 +245,32 @@ FalkorDB from [falkordb docker folder](./docker/falkordb) by
docker-compose --env-file .env up falkordb
```

and Memgraph from [memgraph docker folder](./docker/memgraph) by
Memgraph from [memgraph docker folder](./docker/memgraph) by

```shell
docker-compose --env-file .env up memgraph
```

and Apache Fuseki from [fuseki docker folder](./docker/fuseki) by

```shell
docker-compose --env-file .env up fuseki
```

To run unit tests

```shell
pytest test
```

> **Note**: Tests require external database containers (ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph) to be running. CI builds intentionally skip test execution. Tests must be run locally with the required database images started (see [Test databases](#test-databases) section above).
> **Note**: Tests require external database containers (ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph, Fuseki) to be running. CI builds intentionally skip test execution. Tests must be run locally with the required database images started (see [Test databases](#test-databases) section above).

## Requirements

- Python 3.11+ (Python 3.11 and 3.12 are officially supported)
- python-arango
- sqlalchemy>=2.0.0 (for PostgreSQL and SQL data sources)
- rdflib>=7.0.0 + SPARQLWrapper>=2.0.0 (optional, install with `pip install graflo[sparql]`)

## Contributing

2 changes: 1 addition & 1 deletion docker/cleanup-all.sh
@@ -9,7 +9,7 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"

# Database directories
DATABASES=("arango" "neo4j" "postgres" "falkordb" "memgraph" "nebula" "tigergraph")
DATABASES=("arango" "neo4j" "postgres" "falkordb" "memgraph" "nebula" "tigergraph" "fuseki")

# Colors for output
GREEN='\033[0;32m'
7 changes: 7 additions & 0 deletions docker/fuseki/.env
@@ -0,0 +1,7 @@
IMAGE_VERSION=secoresearch/fuseki:5.1.0
SPEC=graflo
CONTAINER_NAME="${SPEC}.fuseki"
TS_PORT=3032
TS_PASSWORD="abc123-qwe"
TS_USERNAME="admin"
TS_DATASET="test"
16 changes: 16 additions & 0 deletions docker/fuseki/docker-compose.yml
@@ -0,0 +1,16 @@
services:
fuseki:
image: ${IMAGE_VERSION}
user: "${UID}:${GID}"
restart: "no"
profiles: ["${CONTAINER_NAME}"]
ports:
- "${TS_PORT}:3030"
container_name: ${CONTAINER_NAME}
volumes:
- fuseki_data:/fuseki
environment:
- ADMIN_PASSWORD=${TS_PASSWORD}
volumes:
fuseki_data:
driver: local
2 changes: 1 addition & 1 deletion docker/start-all.sh
@@ -8,7 +8,7 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"

# Database directories
DATABASES=("arango" "neo4j" "postgres" "falkordb" "memgraph" "nebula" "tigergraph")
DATABASES=("arango" "neo4j" "postgres" "falkordb" "memgraph" "nebula" "tigergraph" "fuseki")

# Colors for output
GREEN='\033[0;32m'
2 changes: 1 addition & 1 deletion docker/stop-all.sh
@@ -8,7 +8,7 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"

# Database directories
DATABASES=("arango" "neo4j" "postgres" "falkordb" "memgraph" "nebula" "tigergraph")
DATABASES=("arango" "neo4j" "postgres" "falkordb" "memgraph" "nebula" "tigergraph" "fuseki")

# Colors for output
GREEN='\033[0;32m'
62 changes: 57 additions & 5 deletions docs/concepts/index.md
@@ -10,6 +10,51 @@ graflo transforms data sources into property graphs through a pipeline of compon

Each component plays a specific role in this transformation process.

### Data flow: Pattern → DataSource → Resource → GraphContainer → Target DB

The diagram below shows how different data sources (files, SQL tables, RDF/SPARQL)
flow through the unified ingestion pipeline.

```mermaid
flowchart LR
subgraph sources [Data Sources]
TTL["*.ttl / *.rdf files"]
Fuseki["SPARQL Endpoint\n(Fuseki)"]
Files["CSV / JSON files"]
PG["PostgreSQL"]
end
subgraph patterns [Patterns]
FP[FilePattern]
TP[TablePattern]
SP[SparqlPattern]
end
subgraph datasources [DataSource Layer]
subgraph rdfFamily ["RdfDataSource (abstract)"]
RdfDS[RdfFileDataSource]
SparqlDS[SparqlEndpointDataSource]
end
FileDS[FileDataSource]
SQLDS[SQLDataSource]
end
subgraph pipeline [Shared Pipeline]
Res[Resource Pipeline]
GC[GraphContainer]
DBW[DBWriter]
end

TTL --> SP --> RdfDS --> Res
Fuseki --> SP --> SparqlDS --> Res
Files --> FP --> FileDS --> Res
PG --> TP --> SQLDS --> Res
Res --> GC --> DBW
```

- **Patterns** describe *where* data comes from (file paths, SQL tables, SPARQL endpoints).
- **DataSources** handle *how* to read data in batches from each source type.
- **Resources** define *what* to extract from each document (vertices, edges, transforms).
- **GraphContainer** collects the resulting vertices and edges.
- **DBWriter** pushes the graph data into the target database (ArangoDB, Neo4j, TigerGraph, etc.).
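
For an RDF source, a single run touches every stage of this pipeline. The sketch below reuses the engine-level calls shown in the README and annotates which stage each one drives; it is a hedged illustration, and exact signatures may vary between versions:

```python
from pathlib import Path

from graflo.hq import GraphEngine
from graflo.db.connection.onto import ArangoConfig

engine = GraphEngine()
ontology = Path("ontology.ttl")  # assumed local OWL/RDFS ontology

# Pattern: one SparqlPattern per rdf:Class, describing *where* the data lives
patterns = engine.create_patterns_from_rdf(source=ontology)

# Resource mappings come from the inferred Schema (*what* to extract)
schema = engine.infer_schema_from_rdf(source=ontology)

# DataSource reads, GraphContainer collects, DBWriter pushes -- all handled by the engine
target = ArangoConfig.from_docker_env()
engine.define_and_ingest(schema=schema, target_db_config=target, patterns=patterns)
```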

## Class Diagrams

### GraphEngine orchestration
@@ -28,6 +73,8 @@ classDiagram
+introspect(postgres_config) SchemaIntrospectionResult
+infer_schema(postgres_config) Schema
+create_patterns(postgres_config) Patterns
+infer_schema_from_rdf(source) Schema
+create_patterns_from_rdf(source) Patterns
+define_schema(schema, target_db_config)
+define_and_ingest(schema, target_db_config, ...)
+ingest(schema, target_db_config, ...)
@@ -63,6 +110,7 @@ classDiagram
class Patterns {
+file_patterns: list~FilePattern~
+table_patterns: list~TablePattern~
+sparql_patterns: list~SparqlPattern~
}

class DBConfig {
@@ -293,7 +341,7 @@ The `Schema` is the central configuration that defines how data sources are tran
- Resource mappings
- Data transformations
- Index configurations
- Automatic schema inference from normalized PostgreSQL databases (3NF) with proper primary keys (PK) and foreign keys (FK) using intelligent heuristics
- Automatic schema inference from normalized PostgreSQL databases (3NF with PK/FK) or from OWL/RDFS ontologies

### Vertex
A `Vertex` describes vertices and their database indexes. It supports:
@@ -386,11 +434,13 @@ Edges in graflo support a rich set of attributes that enable flexible relationsh
A `DataSource` defines where data comes from and how it's retrieved. graflo supports multiple data source types:

- **File Data Sources**: JSON, JSONL, CSV/TSV files
- **RDF File Data Sources**: Turtle (`.ttl`), RDF/XML (`.rdf`), N3 (`.n3`), JSON-LD files -- parsed via `rdflib`, triples grouped by subject into flat dictionaries
- **SPARQL Data Sources**: Remote SPARQL endpoints (e.g. Apache Fuseki) queried via `SPARQLWrapper` with pagination
- **API Data Sources**: REST API endpoints with pagination, authentication, and retry logic
- **SQL Data Sources**: SQL databases via SQLAlchemy with parameterized queries
- **In-Memory Data Sources**: Python objects (lists, DataFrames) already in memory

Data sources are separate from Resources - they handle data retrieval, while Resources handle data transformation. Many data sources can map to the same Resource, allowing data to be ingested from multiple sources.
Data sources are separate from Resources -- they handle data retrieval, while Resources handle data transformation. Many data sources can map to the same Resource, allowing data to be ingested from multiple sources.
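
The "triples grouped by subject into flat dictionaries" behaviour of the RDF file source can be pictured with plain `rdflib`. This is a rough sketch of the idea (the file name is hypothetical), not graflo's internal code:

```python
from collections import defaultdict
from rdflib import Graph

g = Graph()
g.parse("data.ttl", format="turtle")  # hypothetical input file

docs = defaultdict(dict)
for s, p, o in g:  # every triple: (subject, predicate, object)
    field = str(p).rsplit("#", 1)[-1].rsplit("/", 1)[-1]  # local name of the predicate
    docs[str(s)][field] = o.toPython() if hasattr(o, "toPython") else str(o)

# each subject becomes one flat dictionary, ready for the Resource pipeline
rows = [{"_subject": subject, **fields} for subject, fields in docs.items()]
```

The SPARQL endpoint source retrieves the same kind of rows remotely. A paginated query via `SPARQLWrapper` might look like the following sketch -- the endpoint URL and page size are assumptions:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:3030/test/sparql")  # hypothetical Fuseki dataset
sparql.setReturnFormat(JSON)

page_size, offset, bindings = 1000, 0, []
while True:
    sparql.setQuery(
        f"SELECT ?s ?p ?o WHERE {{ ?s ?p ?o }} ORDER BY ?s LIMIT {page_size} OFFSET {offset}"
    )
    page = sparql.query().convert()["results"]["bindings"]
    if not page:
        break
    bindings.extend(page)  # each row: {"s": {...}, "p": {...}, "o": {...}}
    offset += page_size
```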

### Resource
A `Resource` is a set of mappings and transformations that define how data becomes a graph, defined as a hierarchical structure of `Actors`. Resources are part of the Schema and define:
@@ -431,7 +481,8 @@ A `Transform` defines data transforms, from renaming and type-casting to arbitra
- **Edge Constraints**: Ensure edge uniqueness based on source, target, and weight
- **Reusable Transforms**: Define and reference transformations by name
- **Vertex Filtering**: Filter vertices based on custom conditions
- **PostgreSQL Schema Inference**: Automatically infer schemas from normalized PostgreSQL databases (3NF) with proper primary keys (PK) and foreign keys (FK) decorated, using intelligent heuristics to detect vertices and edges
- **PostgreSQL Schema Inference**: Infer schemas from normalized PostgreSQL databases (3NF) with PK/FK constraints
- **RDF / OWL Schema Inference**: Infer schemas from OWL/RDFS ontologies -- `owl:Class` becomes vertices, `owl:ObjectProperty` becomes edges, `owl:DatatypeProperty` becomes vertex fields
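
The OWL/RDFS inference boils down to a handful of ontology lookups. A minimal sketch of the idea with `rdflib` is shown below; graflo's `RdfInferenceManager` wraps this into a `Schema`, and its actual implementation may differ:

```python
from rdflib import Graph
from rdflib.namespace import OWL, RDF, RDFS

g = Graph()
g.parse("ontology.ttl", format="turtle")  # hypothetical ontology file

# owl:Class -> candidate vertex collections
vertex_classes = set(g.subjects(RDF.type, OWL.Class))

# owl:DatatypeProperty -> vertex fields, grouped by their rdfs:domain class
vertex_fields = {}
for prop in g.subjects(RDF.type, OWL.DatatypeProperty):
    domain = g.value(prop, RDFS.domain)
    vertex_fields.setdefault(domain, []).append(prop)

# owl:ObjectProperty -> edges from the rdfs:domain class to the rdfs:range class
edges = [
    (g.value(prop, RDFS.domain), prop, g.value(prop, RDFS.range))
    for prop in g.subjects(RDF.type, OWL.ObjectProperty)
]
```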

### Performance Optimization
- **Batch Processing**: Process large datasets in configurable batches (`batch_size` parameter of `Caster`)
@@ -453,6 +504,7 @@ A `Transform` defines data transforms, from renaming and type-casting to arbitra
- Specify types for weight fields when using databases that require type information (e.g., TigerGraph)
- Use typed `Field` objects or dicts with `type` key for better validation
8. Leverage key matching (`match_source`, `match_target`) for complex matching scenarios
9. Use PostgreSQL schema inference for automatic schema generation from normalized databases (3NF) with proper PK/FK constraints - the heuristics work best when primary keys and foreign keys are properly decorated
10. Specify field types for better validation and database-specific optimizations, especially when targeting TigerGraph
9. Use PostgreSQL schema inference for automatic schema generation from normalized databases (3NF) with proper PK/FK constraints
10. Use RDF/OWL schema inference (`infer_schema_from_rdf`) when ingesting data from SPARQL endpoints or `.ttl` files with a well-defined ontology
11. Specify field types for better validation and database-specific optimizations, especially when targeting TigerGraph
