Transform your data into semantic, machine-readable knowledge graphs - no coding required.
🌐 Demo: http://rdfconverter.matolab.org/
RDFConverter bridges the gap between raw data (JSON, CSV, RDF) and standardized semantic ontologies using declarative YARRRML mappings. Whether you're publishing FAIR research data, integrating legacy systems, or implementing industry standards like Catena-X, RDFConverter makes semantic data transformation simple and maintainable.
- ✅ Declarative Mappings - Use human-readable YARRRML instead of imperative code
- ✅ Template-Based - Reuse semantic method templates across datasets
- ✅ Industry Standards - Built-in support for CSVW, SAMM, PROV-O
- ✅ Multi-Format - JSON, CSV, RDF/Turtle, JSON-LD - all work seamlessly
- ✅ Validation Built-In - SHACL shapes ensure data quality
- ✅ Production-Ready - RESTful API, Docker deployment, CI/CD tested
- ✅ Provenance Tracking - Automatic PROV-O metadata generation
RDFConverter doesn't reinvent YARRRML or RML - it packages the official tools (yarrrml-parser, rmlmapper-java) into a production-ready containerized service with a REST API.
✅ With RDFConverter:
- `docker compose up` - done! No dependency installation needed
- Call from any language via REST API (Python, R, JavaScript, curl, etc.)
- Pre-configured for production (logging, error handling, provenance)
- Built-in testing and validation endpoints
- Works on any OS (Linux, Mac, Windows)
📦 Without RDFConverter (Manual Setup):
- Install Node.js for yarrrml-parser
- Install Java for RML Mapper
- Configure both tools separately
- Write glue code to connect them
- Handle errors and edge cases yourself
- Set up logging and monitoring
For full YARRRML/RML capabilities, see the upstream yarrrml-parser and rmlmapper-java projects.
Prerequisites:
- Docker and Docker Compose installed and running
- Ports 3001, 4000, and 6003 available
```bash
# 1. Clone the repository
git clone https://github.com/Mat-O-Lab/RDFConverter.git
cd RDFConverter

# 2. Create configuration
cat > .env << EOF
PARSER_PORT=3001
MAPPER_PORT=4000
APP_PORT=6003
CONVERTER_PORT=5000
YARRRML_URL=http://yarrrml-parser:3001
MAPPER_URL=http://rmlmapper:4000
APP_MODE=development
SSL_VERIFY=True
EOF

# 3. Launch services
docker compose -f docker-compose.develop.yml up -d

# 4. Wait for services to start (~30 seconds)
sleep 30

# 5. Access the application
# Web UI: http://localhost:6003/
# API docs: http://localhost:6003/api/docs
```

Test the example CSVW mapping that transforms tensile test data:
```bash
curl -X POST http://localhost:6003/api/createrdf \
  -H "Content-Type: application/json" \
  -d '{"mapping_url":"https://github.com/Mat-O-Lab/RDFConverter/raw/main/examples/csvw-template-map.yaml"}'
```

You'll get back a JSON response with:

- `graph`: the generated RDF/Turtle output
- `num_mappings_applied`: the number of successful mapping rules
- `num_mappings_skipped`: rules that didn't match the data
Challenge: Laboratory generates CSV data from tensile testing machines. Data needs to be FAIR (Findable, Accessible, Interoperable, Reusable) and comply with DIN EN ISO 527-3 standard.
Solution: Use csvw-template-map.yaml to map CSV data to the MSEO (Materials Science and Engineering Ontology) using a standardized method template.
Result:
- Raw CSV becomes semantic RDF
- Data structured according to ISO standard
- Template ensures consistency across all tests
- Other researchers can understand and reuse the data
📁 See: examples/csvw-template-map.yaml
Challenge: Automotive partners in the Catena-X data space need to exchange batch manufacturing data in a standardized, interoperable format.
Solution: Use catenax-batch-map.yaml to transform JSON batch payloads into SAMM (Semantic Aspect Meta Model) compliant RDF that references the official io.catenax.batch:3.0.1 specification.
Result:
- JSON payloads become semantic documents
- Full compliance with Catena-X standards
- Traceability across supply chain partners
- Automated validation against SAMM models
📁 See: examples/catenax-batch-map.yaml
Have JSON/CSV/RDF data that needs semantic structure?
- Identify your data source format
- Find or create an ontology/template
- Write a YARRRML mapping (see examples)
- Test with RDFConverter
- Deploy and share!
| Approach | Setup Time | Use Case | Best For |
|---|---|---|---|
| RDFConverter | 5 minutes | Reusable workflows, teams, production | Declarative mappings + REST API |
| SPARQL CONSTRUCT | Immediate | One-off transformations | Already know SPARQL, simple transforms |
| Custom Scripts | Varies | Complex custom logic | Unique requirements, performance critical |
| GUI Tools (e.g., OntoRefine) | Minutes | Visual mapping | Non-developers, exploratory work |
Note: All approaches have their place. RDFConverter excels when you need maintainable, version-controlled mappings exposed via API.
Q: What transformations can RDFConverter handle? A: Any transformation supported by YARRRML/RML, including:
- CSV/JSON/XML to RDF
- RDF to RDF (vocabulary transformation)
- Complex joins and conditions
- Nested data structures
- Functions and data manipulation
Q: Can I transform between different RDF vocabularies? A: Yes! Use RDF as your data source in YARRRML mappings to map from one vocabulary to another (e.g., Dublin Core → Schema.org). See RML.io examples for RDF source handling.
Q: Does my data leave my server? A: No. All processing is local. RDFConverter only fetches URLs you provide in mappings.
Q: Can I use this in air-gapped/offline environments? A: Yes, once Docker images are pulled. No internet required for processing.
Q: How do I learn YARRRML syntax? A: See YARRRML Tutorial and try YARRRML Matey online editor.
Q: What about performance for large datasets? A: Performance depends on the underlying RML Mapper (Java). For specifics, see RML Mapper documentation.
Q: Can I integrate this into my Python/R/Node.js workflow? A: Yes! Use the REST API from any language that can make HTTP requests. See examples in the API Reference section.
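As a sketch of such an integration, here is a minimal Python client using only the standard library. The endpoint URL and JSON field names are taken from this README; the local deployment on port 6003 and the helper names are assumptions for illustration:

```python
import json
import urllib.request

API_URL = "http://localhost:6003/api/createrdf"  # assumes the default local deployment

def build_request(mapping_url, data_url=None):
    """Build the JSON POST request that /api/createrdf expects."""
    payload = {"mapping_url": mapping_url}
    if data_url:
        payload["data_url"] = data_url
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def create_rdf(mapping_url, data_url=None):
    """Execute a mapping and return the parsed JSON response (graph + statistics)."""
    with urllib.request.urlopen(build_request(mapping_url, data_url)) as resp:
        return json.load(resp)
```

The same two-field payload works from R, JavaScript, or curl; only the HTTP client changes.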
RDFConverter uses a simple 3-step process:
```text
┌─────────────────────────────────────┐
│ 1. Define Mapping (YARRRML)         │
│    - Specify data sources           │
│    - Define transformation rules    │
│    - Optional: reference templates  │
└─────────────────┬───────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ 2. RDFConverter Processes           │
│    - Converts YARRRML → RML         │
│    - Applies rules to your data     │
│    - Merges with templates (opt.)   │
│    - Validates output (opt.)        │
└─────────────────┬───────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ 3. Get Semantic RDF                 │
│    - Structured knowledge graph     │
│    - Validated against schemas      │
│    - Provenance metadata included   │
└─────────────────────────────────────┘
```
RDFConverter supports all standard RML reference formulations as defined in the RML specification:
| Format | MIME Type | Reference Formulation | Iterator Syntax | Example |
|---|---|---|---|---|
| CSV | text/plain | `csv` | N/A (column names) | `$(columnName)` |
| JSON | application/json | `jsonpath` | `$.path.to[*].data` | `$..items[*]` |
| XML | application/xml | `xpath` | `/path/to/element` | `//book[@id]` |
| XML | application/xml | `xquery` | XQuery expressions | `for $x in ...` |
RDFConverter automatically converts RDF formats to JSON-LD, enabling JSONPath-based mappings on semantic data sources. This is a key differentiator that allows you to use existing RDF/Turtle data with the powerful JSONPath iterator syntax.
| Input Format | File Extension | Conversion | Use With |
|---|---|---|---|
| Turtle | .ttl | → JSON-LD | referenceFormulation: jsonpath |
| RDF/XML | .rdf, .owl | → JSON-LD | referenceFormulation: jsonpath |
| N-Triples | .nt | → JSON-LD | referenceFormulation: jsonpath |
| N3 | .n3 | → JSON-LD | referenceFormulation: jsonpath |
| JSON-LD | .jsonld | Preserved | referenceFormulation: jsonpath |
The conversion process is transparent and automatic:
- Detection: RDFConverter detects RDF formats using file extension and content analysis
- Parsing: Loads RDF data with RDFLib into an in-memory graph
- Namespace Preservation: Extracts and preserves prefix bindings from original data
- JSON-LD Serialization: converts the graph to JSON-LD with a standard `@context`
- Mapping Execution: RML Mapper processes the JSON-LD using JSONPath iterators

Processing Pipeline:

```text
Turtle/RDF File → RDFLib Parser → Graph → JSON-LD Serializer →
JSON-LD with @context → RML Mapper + JSONPath → Output RDF
```
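The extension-based half of the detection step can be pictured as a simple lookup table. This is an illustrative sketch, not RDFConverter's actual code; `detect_rdf_format` is a hypothetical helper name, and the table mirrors the conversion matrix above:

```python
from pathlib import Path

# Extension → RDF serialization, mirroring the conversion matrix above (sketch only)
RDF_EXTENSIONS = {
    ".ttl": "turtle",
    ".rdf": "xml",
    ".owl": "xml",
    ".nt": "nt",
    ".n3": "n3",
    ".jsonld": "json-ld",
}

def detect_rdf_format(filename):
    """Guess the RDF serialization from the file extension; None means non-RDF input."""
    return RDF_EXTENSIONS.get(Path(filename).suffix.lower())

print(detect_rdf_format("data.ttl"))   # turtle
print(detect_rdf_format("data.csv"))   # None
```

The real detection also inspects file content, as noted in the Detection step above.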
Suppose you have a Turtle file `data.ttl`:

```turtle
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix ex: <http://example.org/> .

ex:table a csvw:Table ;
    csvw:column ex:col1, ex:col2 .

ex:col1 a csvw:Column ;
    csvw:name "Force" ;
    csvw:datatype "number" .

ex:col2 a csvw:Column ;
    csvw:name "Displacement" ;
    csvw:datatype "number" .
```

YARRRML Mapping:
```yaml
prefixes:
  csvw: 'http://www.w3.org/ns/csvw#'
  ex: 'http://example.org/'

sources:
  columns:
    access: 'https://example.com/data.ttl'      # ← Turtle file
    iterator: '$.[?(@.type=="csvw:Column")]'    # ← JSONPath on JSON-LD
    referenceFormulation: jsonpath              # ← Use JSONPath

mappings:
  ColumnMapping:
    sources: [columns]
    s: ex:measurement_$(["csvw:name"])
    po:
      - [rdf:type, ex:Measurement]
      - [ex:columnName, $(["csvw:name"])]
      - [ex:dataType, $(["csvw:datatype"])]
```

What Happens Behind the Scenes:
1. RDFConverter downloads `data.ttl` (Turtle format)
2. Converts it to JSON-LD:

```json
{
  "@context": {
    "csvw": "http://www.w3.org/ns/csvw#",
    "ex": "http://example.org/"
  },
  "@graph": [
    {
      "@id": "ex:col1",
      "@type": "csvw:Column",
      "csvw:name": "Force",
      "csvw:datatype": "number"
    },
    {
      "@id": "ex:col2",
      "@type": "csvw:Column",
      "csvw:name": "Displacement",
      "csvw:datatype": "number"
    }
  ]
}
```

3. RML Mapper applies the JSONPath iterator `$.[?(@.type=="csvw:Column")]`
4. Generates the output RDF with the mappings applied
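The effect of that iterator can be reproduced in a few lines of plain Python. This sketch assumes the simplified JSON-LD shape shown above and filters `@graph` nodes by type, just as the JSONPath filter does:

```python
import json

# Simplified JSON-LD, as produced by the conversion step above (assumed shape)
doc = json.loads("""
{
  "@context": {"csvw": "http://www.w3.org/ns/csvw#", "ex": "http://example.org/"},
  "@graph": [
    {"@id": "ex:col1", "@type": "csvw:Column", "csvw:name": "Force"},
    {"@id": "ex:col2", "@type": "csvw:Column", "csvw:name": "Displacement"},
    {"@id": "ex:table", "@type": "csvw:Table"}
  ]
}
""")

# Equivalent of the iterator $.[?(@.type=="csvw:Column")]: keep only Column nodes
columns = [node for node in doc["@graph"] if node.get("@type") == "csvw:Column"]
print([node["csvw:name"] for node in columns])  # ['Force', 'Displacement']
```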
The existing `csvw-template-map.yaml` example demonstrates this pattern - it uses CSVW metadata (which is JSON-LD) with JSONPath iterators:

```yaml
sources:
  columns:
    access: 'https://.../example-metadata.json'
    iterator: '$..columns[*]'      # JSONPath on the JSON-LD structure
    referenceFormulation: jsonpath
  annotations:
    access: 'https://.../example-metadata.json'
    iterator: '$.notes[*]'         # Another JSONPath iterator
    referenceFormulation: jsonpath
```

This works seamlessly because:

- CSVW metadata is already JSON-LD
- RDFConverter preserves the `@context`
- JSONPath can navigate the nested JSON structure
- RML Mapper generates semantic RDF output
Direct RDF/Turtle reference formulation is not supported:

```yaml
# ❌ This does NOT work:
sources:
  data:
    access: 'https://example.com/data.ttl'
    referenceFormulation: turtle   # ❌ Not supported by RML Mapper
```

Why? The underlying RML Mapper (Java implementation) does not support `referenceFormulation: turtle` or other RDF formats as iterators. RDF formats don't have a natural "iteration" concept like arrays in JSON or rows in CSV.
Use automatic JSON-LD conversion with JSONPath instead:

```yaml
# ✅ This DOES work:
sources:
  data:
    access: 'https://example.com/data.ttl'   # ✅ Auto-converted to JSON-LD
    referenceFormulation: jsonpath           # ✅ Use JSONPath
    iterator: '$'                            # ✅ Iterate over the JSON-LD structure
```

For complex RDF queries: if you need SPARQL-like pattern matching, consider:

- Pre-processing the RDF with SPARQL CONSTRUCT queries
- Using the resulting RDF as your data source
- Or restructuring your mapping to work with JSON-LD's hierarchical structure
✅ DO:

- Use JSONPath for all data sources (JSON, JSON-LD, auto-converted RDF)
- Leverage the automatic RDF → JSON-LD conversion for semantic sources
- Test with the `/api/test` endpoint to verify iterator expressions
- Check the JSON-LD structure if mappings don't match (use debug mode)

❌ DON'T:

- Try to use `referenceFormulation: turtle` (not supported)
- Assume the RDF structure will map directly to JSON arrays (test first)
- Mix reference formulations in a single source definition
RDFConverter consists of three Docker containers:
- RDFConverter (FastAPI, Python) - Main orchestrator and API
- YARRRML Parser (Node.js) - Converts human-readable YARRRML to machine-readable RML
- RML Mapper (Java) - Executes RML mapping rules on data
- YARRRML - Human-friendly RDF mapping language
- RML - RDF Mapping Language (W3C community specification)
- RDFLib - Python library for RDF processing
- PySHACL - SHACL constraint validation
- FastAPI - Modern Python web framework with OpenAPI
| Endpoint | Method | Purpose | Input | Output |
|---|---|---|---|---|
| `/api/yarrrmltorml` | POST | Convert YARRRML to RML | YARRRML file URL | RML (Turtle format) |
| `/api/createrdf` | POST | Execute mapping and generate RDF | Mapping URL + optional data URL | RDF graph + statistics |
| `/api/checkmapping` | POST | Test mapping applicability | Mapping URL + data URL | Rule coverage report |
| `/api/rdfvalidator` | POST | Validate RDF against SHACL | RDF URL + SHACL shapes URL | Validation report |
| `/api/test` | POST | Test mapping with detailed stats | Mapping URL + optional data URL | Per-rule statistics |
| `/api/docs` | GET | Interactive API documentation | - | Swagger UI |
```bash
# Execute a mapping
curl -X POST http://localhost:6003/api/createrdf \
  -H "Content-Type: application/json" \
  -d '{
    "mapping_url": "https://example.com/my-mapping.yaml",
    "data_url": "https://example.com/my-data.json"
  }'
```

Response:

```json
{
  "filename": "my-data-joined.ttl",
  "graph": "@prefix ex: <http://example.org/> .\n...",
  "num_mappings_applied": 7,
  "num_mappings_skipped": 0
}
```

The `/api/test` endpoint is particularly useful during development:
```bash
curl -X POST "http://localhost:6003/api/test" \
  -H "Content-Type: application/json" \
  -d '{"mapping_url":"https://github.com/Mat-O-Lab/RDFConverter/raw/main/examples/csvw-template-map.yaml"}'
```

Returns detailed statistics:
- Total rules vs. applied rules
- Per-rule triple counts
- Subject coverage
- Individual rule outputs
- Processing logs
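During iteration it can help to reduce those statistics to a single coverage number. The snippet below is a hypothetical convenience helper (not part of RDFConverter); the field names match the sample `/api/test` response shown later in this README:

```python
def rule_coverage(stats):
    """Fraction of mapping rules that actually fired, from a /api/test response."""
    total = stats.get("num_rules_total", 0)
    return stats.get("num_rules_applied", 0) / total if total else 0.0

# Example response fragment (values from the README's sample statistics)
example = {"num_rules_total": 7, "num_rules_applied": 7, "num_rules_skipped": 0}
print(rule_coverage(example))  # 1.0
```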
What it does: Transforms CSV tensile test data with CSVW metadata into a semantic knowledge graph structured according to the DIN EN ISO 527-3 standard using the MSEO (Materials Science and Engineering Ontology).
Key Features Demonstrated:
- Multiple Data Sources from CSVW:

```yaml
sources:
  columns:
    access: 'https://.../example-metadata.json'
    iterator: '$..columns[*]'      # Iterate over CSV columns
    referenceFormulation: jsonpath
  annotations:
    access: 'https://.../example-metadata.json'
    iterator: '$.notes[*]'         # Iterate over metadata notes
    referenceFormulation: jsonpath
```

- Template Graph Usage:

```yaml
prefixes:
  template: 'https://.../DIN_EN_ISO_527-3.drawio.ttl/'
```

The template provides a pre-structured semantic representation of the testing method. Individual measurements are linked to template placeholders.
- Conditional Mappings:

```yaml
mappings:
  SpecimenID:
    sources: [annotations]
    condition:
      function: equal
      parameters:
        - [str1, $(label)]
        - [str2, "aktuelle Probe"]   # Match the German label ("current specimen")
    po:
      - ['http://purl.obolibrary.org/obo/RO_0010002', 'template:SpecimenID~iri']
```

Rules only apply when their conditions match, enabling flexible data extraction.
Why this matters:
- Reusability: Same template works for all DIN EN ISO 527-3 tests
- Standardization: Data follows established ontology (MSEO)
- Interoperability: Other labs using the same standard can understand your data
- FAIR Data: Findable, Accessible, Interoperable, Reusable
Data Flow:
```text
CSV file → CSVW metadata (JSON-LD) → YARRRML rules →
Template graph + Mapped data → Validated semantic RDF
```
What it does: Transforms Catena-X batch manufacturing data (JSON format) into a semantic RDF document that conforms to the SAMM (Semantic Aspect Meta Model) batch specification version 3.0.1.
Key Features Demonstrated:
- Multiple Iterators for Complex JSON:

```yaml
sources:
  root:
    access: 'https://.../Batch.json'
    iterator: '$'                                     # Root object
  identifiers:
    iterator: '$.localIdentifiers[*]'                 # Array of identifiers
  sites_items:
    iterator: '$.manufacturingInformation.sites[*]'   # Nested array
  partClass_items:
    iterator: '$.partTypeInformation.partClassification[*]'
```

- Industry-Standard Namespace Prefixes:

```yaml
prefixes:
  cx: "urn:samm:io.catenax.batch:3.0.1#"
  samm: "urn:samm:org.eclipse.esmf.samm:meta-model:2.3.0#"
  ext-built: "urn:samm:io.catenax.shared.part_site_information_as_built:2.0.0#"
  ext-classification: "urn:samm:io.catenax.shared.part_classification:1.0.0#"
```

- Nested Object Mappings:
```yaml
mappings:
  batch:
    source: [root]
    s: urn-uuid:$(catenaXId)
    po:
      - [rdf:type, cx:Batch]
      - [cx:catenaXId, $(catenaXId)]
      - p: cx:localIdentifiers
        o:
          mapping: identifiers_list   # Links to another mapping
          condition:
            function: equal
            parameters: [[str1, "1"], [str2, "1"]]
```

- Reference to External Semantic Model:

```yaml
prefixes:
  method: "https://.../io.catenax.batch/3.0.1/Batch.ttl/"
```

The mapping references the official Catena-X batch semantic model, ensuring compliance.
Why this matters:
- Industry Compliance: Meets Catena-X data space requirements
- Supply Chain Traceability: Enables end-to-end batch tracking across partners
- Semantic Interoperability: Different systems from different vendors can exchange data
- Automated Validation: Output can be validated against SAMM models
- Scalability: Same pattern works for other Catena-X aspects (SerialPart, JustInSequencePart, etc.)
Data Flow:
```text
Catena-X JSON payload → YARRRML rules →
SAMM batch model reference → Compliant semantic RDF
```
Real-World Application: In the automotive industry, this enables a battery manufacturer to provide semantic batch data to the OEM, who can then trace that batch through the entire vehicle lifecycle - all while maintaining data sovereignty and interoperability.
RDFConverter uses GitHub Actions to automatically test all example mappings on every push:
- Workflow Location: `.github/workflows/TestExamples.yml`
- Tests Both Endpoints:
  - `/api/yarrrmltorml` - generates `.rml` files
  - `/api/createrdf` - generates `.rdf` files
- Commit Results: generated outputs are committed back to `examples/` for review
- Test on Push: runs on all pushes to the `main` and `develop` branches
Via API Endpoint:

```bash
# Test a specific mapping
curl -X POST "http://localhost:6003/api/test?mapping_url=https://github.com/Mat-O-Lab/RDFConverter/raw/main/examples/csvw-template-map.yaml"
```

Test Your Own Mapping:

- Add your `.yaml` file to the `examples/` directory
- Commit and push to GitHub
- GitHub Actions tests it automatically
- Review the generated `.rml` and `.rdf` files in the commit
The test endpoint provides detailed statistics:
```json
{
  "success": true,
  "num_rules_total": 7,
  "num_rules_applied": 7,
  "num_rules_skipped": 0,
  "num_triples_generated": 156,
  "rule_statistics": [
    {
      "rule_name": "SpecimenID",
      "predicate": "http://purl.obolibrary.org/obo/RO_0010002",
      "triples_generated": 1,
      "subjects_affected": 1
    },
    ...
  ]
}
```

For local development with hot-reloading:

```bash
docker compose -f docker-compose.develop.yml up -d
```

Access at: http://localhost:6003
For production deployment:

```bash
docker compose up -d
```

Or use pre-built images from the GitHub Container Registry:

```yaml
services:
  rdfconverter:
    image: ghcr.io/mat-o-lab/rdfconverter:latest
  yarrrml-parser:
    image: ghcr.io/mat-o-lab/yarrrml-parser:latest
  rmlmapper:
    image: ghcr.io/mat-o-lab/rmlmapper-webapi:latest
```

Configure via the `.env` file:
| Variable | Description | Default |
|---|---|---|
| `APP_PORT` | External port for RDFConverter API | 6003 |
| `CONVERTER_PORT` | Internal container port | 5000 |
| `PARSER_PORT` | YARRRML Parser port | 3001 |
| `MAPPER_PORT` | RML Mapper port | 4000 |
| `APP_MODE` | "development" or "production" | production |
| `SSL_VERIFY` | Verify SSL certificates | True |
| `YARRRML_URL` | Internal URL to the YARRRML Parser | http://yarrrml-parser:3001 |
| `MAPPER_URL` | Internal URL to the RML Mapper | http://rmlmapper:4000 |
1. Identify Your Data Source
   - JSON file (plain or JSON-LD)
   - CSV with CSVW metadata
   - Existing RDF (Turtle, RDF/XML, etc.)

2. Choose or Create a Semantic Template (optional)
   - Look for existing ontologies in your domain
   - Create a template RDF file with placeholders
   - Reference the template URL in your mapping

3. Write the YARRRML Mapping

   Basic structure:

   ```yaml
   prefixes:
     ex: "http://example.org/"
     template: "http://example.org/template/"  # Optional
   base: "http://example.org/data/"

   sources:
     mydata:
       access: "https://example.com/data.json"
       iterator: "$"
       referenceFormulation: jsonpath

   mappings:
     MyMapping:
       sources: [mydata]
       s: ex:$(id)
       po:
         - [rdf:type, ex:MyClass]
         - [ex:property, $(value)]
   ```

4. Test Your Mapping

   ```bash
   curl -X POST http://localhost:6003/api/test \
     -H "Content-Type: application/json" \
     -d '{"mapping_url":"https://example.com/my-mapping.yaml"}'
   ```

5. Iterate
   - Check `num_rules_applied` vs `num_rules_skipped`
   - Review the per-rule statistics
   - Adjust conditions and mappings
   - Test again

6. Deploy
   - Add the mapping to your production workflow
   - Set up automated validation if needed
   - Monitor mapping success rates
- YARRRML Tutorial - learn YARRRML syntax
- RML Specification - detailed RML reference
- Example Mappings - see the `examples/` directory for real-world examples
- YARRRML Matey - online YARRRML editor
Docker not running

```text
Error: Cannot connect to the Docker daemon
```

→ Start Docker and try again

Port already in use

```text
Error: bind: address already in use
```

→ Change the ports in the `.env` file or stop the conflicting services
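A quick way to check a port before editing `.env` is a short TCP probe. This is an optional convenience sketch (`port_free` is a helper of our own, not part of the project):

```python
import socket

def port_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) != 0

# Check the three ports RDFConverter needs by default
for port in (3001, 4000, 6003):
    print(port, "free" if port_free(port) else "in use")
```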
No triples generated

```json
{"num_mappings_applied": 0, "num_mappings_skipped": 7}
```

→ Check that:

- The data source URL is accessible
- The mapping conditions match your data
- The `/api/test` endpoint gives detailed diagnostics

YARRRML syntax error

```text
Could not read mapping file - is it valid YAML format?
```
→ Validate YAML syntax at https://www.yamllint.com/
Services not starting

```bash
# Check service status
docker compose ps

# View logs
docker compose logs rdfconverter
docker compose logs yarrrml-parser
docker compose logs rmlmapper

# Restart services
docker compose restart
```

Enable detailed logging:

```bash
# Set in .env
APP_MODE=development

# Restart services
docker compose restart rdfconverter
```

View logs in real time:

```bash
docker compose logs -f rdfconverter
```

We welcome contributions! Here's how you can help:
- Report bugs - Open an issue with detailed reproduction steps
- Suggest features - Describe your use case and requirements
- Submit mappings - Share your YARRRML mappings as examples
- Improve docs - Help make the documentation clearer
- Fix issues - Submit pull requests with bug fixes or enhancements
- Create your `.yaml` mapping file
- Test it thoroughly with `/api/test`
- Add it to the `examples/` directory
- Submit a pull request with:
  - The mapping file
  - A description of what it demonstrates
  - A sample data source (or URL to public data)
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
The authors would like to thank the Federal Government and the Heads of Government of the Länder for their funding and support within the framework of the Platform Material Digital consortium.
Funded by:
- Federal Ministry of Education and Research (BMBF)
- MaterialDigital initiative
- Project KupferDigital (Project ID: 13XP5119)
- Documentation: You're reading it! 📖
- API Docs: http://localhost:6003/api/docs (when running)
- Issues: https://github.com/Mat-O-Lab/RDFConverter/issues
- Organization: https://github.com/Mat-O-Lab
- Contact: thomas.hanke@iwm.fraunhofer.de
Made with ❤️ by the Mat-O-Lab team