Context Layer Evaluation Framework based on Semantic Information Theory


Semiosis: Unit Testing for Documentation Quality


Semiosis is an open-source framework for measuring the semantic quality of static documentation and context systems. Think of it as unit testing for your knowledge base: Semiosis reveals how much of your documentation is redundant, which parts are critical, and where it breaks down.

🎯 Why Semiosis?

The Problem: You've built extensive documentation (DBT projects, API docs, knowledge bases) but don't know if it's actually good. Is there redundancy? What happens if parts go missing? Is it token-efficient?

The Solution: Semiosis measures context system quality using standardized LLM probes to evaluate:

  • Completeness: Does your documentation cover all necessary concepts?
  • Redundancy: How much can you remove while maintaining performance?
  • Semantic Density: How much information per documentation unit?
  • Robustness: How gracefully does performance degrade as context is removed?
  • Critical Boundaries: What's the minimum viable documentation set?
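Robustness and critical boundaries of this kind are usually estimated empirically: remove progressively larger random fractions of the documentation, re-run a fixed probe, and record how task accuracy degrades. The sketch below shows the shape of that loop; the probe is a toy stand-in (accuracy = fraction of required concepts still covered), not the Semiosis API:

```python
import random

def toy_probe_accuracy(retained_docs, required=frozenset({"schema", "joins", "metrics"})):
    """Stand-in probe: accuracy is the fraction of required concepts
    still covered by the retained documentation units."""
    covered = {concept for doc in retained_docs for concept in doc["covers"]}
    return len(required & covered) / len(required)

def progressive_removal_curve(docs, fractions, seed=0, trials=20):
    """For each removal fraction, average probe accuracy over random removals."""
    rng = random.Random(seed)
    curve = {}
    for frac in fractions:
        keep = max(1, round(len(docs) * (1 - frac)))
        scores = [toy_probe_accuracy(rng.sample(docs, keep)) for _ in range(trials)]
        curve[frac] = sum(scores) / len(scores)
    return curve

docs = [
    {"name": "schema.yml", "covers": {"schema", "joins"}},
    {"name": "models.md",  "covers": {"metrics"}},
    {"name": "columns.md", "covers": {"schema"}},   # redundant with schema.yml
    {"name": "readme.md",  "covers": set()},        # contributes nothing
]
curve = progressive_removal_curve(docs, [0.0, 0.25, 0.5, 0.75])
```

The resulting curve is exactly the "quality curve" described below: a flat region signals redundancy (here, `columns.md` is removable), and a sharp drop marks the critical boundary.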

🚀 Vision

When complete, Semiosis will provide comprehensive documentation quality analysis:

# Analyze your DBT project documentation quality
semiosis evaluate \
    --context dbt \
    --context-args project_path=./my_dbt_project \
    --environment text-to-sql \
    --interventions progressive_removal,schema_corruption

# Expected results: context quality report
# 📊 Baseline Performance: 94% (excellent documentation)
# 🎯 Semantic threshold: η_c = 0.35 (robust to 65% removal)
# 💎 Critical components: schema.yml files (high impact)
# 📈 Redundancy: Column descriptions (medium overlap)
# 🏆 Benchmark: 75th percentile vs industry average

๐Ÿ—๏ธ Architecture

Semiosis provides a modular framework for context quality measurement:

  • ๐ŸŒ Environments: Define evaluation scenarios (text-to-SQL, code generation, custom domains)
  • ๐Ÿค– Standardized Probes: Built-in LLM agents as measurement instruments
  • ๐Ÿ“š Context Systems: Integration with documentation sources (DBT, API docs, knowledge bases)
  • โšก Interventions: Systematic context modifications (removal, corruption, reordering)
  • ๐Ÿ“ˆ Quality Engine: Mathematical framework for measuring semantic information density
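One way these components could compose is through small structural interfaces. The `Protocol` classes below are an illustrative guess at that plugin surface, not the actual Semiosis API; `DropFraction` is a hypothetical example of a removal intervention:

```python
from typing import Protocol, Sequence

class ContextSystem(Protocol):
    """Yields documentation units (e.g. files, schema entries)."""
    def units(self) -> Sequence[str]: ...

class Intervention(Protocol):
    """Systematically modifies a sequence of context units."""
    def apply(self, units: Sequence[str]) -> Sequence[str]: ...

class Environment(Protocol):
    """Scores a probe's task performance given some context."""
    def score(self, context: Sequence[str]) -> float: ...

class DropFraction:
    """Example Intervention: keep only the first (1 - fraction) of the units."""
    def __init__(self, fraction: float) -> None:
        self.fraction = fraction

    def apply(self, units: Sequence[str]) -> Sequence[str]:
        keep = max(1, round(len(units) * (1 - self.fraction)))
        return units[:keep]
```

Structural typing keeps the pieces decoupled: any object with the right methods can serve as an environment, context system, or intervention without inheriting from a framework base class.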

🔬 Planned Use Cases

Documentation Optimization

# Find minimal documentation set for reliable performance
semiosis evaluate --context dbt --interventions progressive_removal
# Expected: Need only 40% of semantic models for 90% accuracy

Pre-Deployment Validation

# Test documentation robustness before agent deployment
semiosis evaluate --interventions corruption,missing_schemas,outdated_docs
# Expected: Performance drops to 60% with 30% schema corruption

📊 Expected Results

  • 📈 Quality Curves: How performance degrades with documentation removal
  • 🎯 Semantic Thresholds: Critical information boundaries (η_c values)
  • 💎 Component Analysis: Which documentation sections are most valuable
  • 📊 Redundancy Maps: What information overlaps and can be consolidated
  • 🏆 Benchmarking: How your context compares to industry standards
  • ⚡ Intervention Impact: Quantified effects of specific documentation changes

๐Ÿ› ๏ธ Planned Integrations

Standardized Measurement Probes

  • Frontier Models: the leading commercial model APIs (OpenAI, Anthropic, Google, etc.)
  • Open Source Models: SQLCoder, Kimi K2, Mistral, etc.
  • Cloud Platforms: AWS Bedrock, Google Vertex AI for enterprise deployment

Evaluation Environments

  • Text-to-SQL: Spider 2.0, BIRD-SQL datasets for database query generation
  • Code Generation: Programming task evaluation with execution validation
  • Custom Domains: YAML-configurable environments for any documentation type
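Custom domains are planned to be YAML-configurable, but no schema has been published yet, so every field name in the fragment below is a guess at what such an environment definition might look like:

```yaml
# Hypothetical custom environment definition (all field names are illustrative)
name: internal-api-qa
task: question-answering
context:
  source: knowledge-base
  path: ./docs
probe:
  model: gpt-4o        # any supported probe model
  temperature: 0.0     # deterministic probing for reproducible measurements
interventions:
  - progressive_removal
  - schema_corruption
metrics:
  - accuracy
  - semantic_threshold
```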

Documentation Sources

  • DBT Projects: Schema definitions, model docs, semantic layer analysis
  • API Documentation: OpenAPI specs, endpoint descriptions, parameter definitions
  • Knowledge Bases: Markdown files, wikis, technical documentation
  • Custom Sources: Any structured documentation via plugins

🧮 Mathematical Foundation

Semiosis will implement a rigorous mathematical framework based on semantic information theory:

Agent state:              𝐚 = (q, y, ℓ, c, b, θ)
Environment state:        𝐞 = (D, Q, T)
Context system:           𝒮_η = [s₁, …, sₙ]
Intervention:             𝒮_η' = 𝒮_η + s_{n+1}
Agent output:             p_θ(y | q, D, 𝒮_η)
Token probability:        p_θ(tᵢ | t_{<i}, q, D, 𝒮_η)
Log-likelihood:           LL_η(t) = Σᵢ log p_θ(tᵢ | t_{<i}, q, D, 𝒮_η)
Cross-entropy:            H_η = 𝔼[−LL_η(t(q))]
Trust update:             ℓ' = ℓ + f(LL(t))
Budget update:            b' = b − c + g(ℓ')
Viability:                V(η) = Pr(ℓ > ℓ_min ∧ b > 0)
Semantic threshold:       η_c = inf{η | V(η) ≤ ½V(1)}

Where agents maintain trust (ℓ) through performance and budget (b) through resource management, with viability measuring sustainable operation probability.
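Given a sampled viability curve V(η), the semantic threshold can be read off numerically as the boundary of the region where viability has dropped to half its full-context value. A minimal sketch, where the curve is a synthetic illustration rather than Semiosis output:

```python
def semantic_threshold(viability):
    """Read eta_c off a sampled viability curve.

    viability: dict mapping context fraction eta (0 < eta <= 1) to V(eta).
    Returns the largest sampled eta with V(eta) <= 0.5 * V(1) -- on a
    monotone curve, the boundary of {eta | V(eta) <= 0.5 * V(1)} -- or
    None if viability never drops that far.
    """
    half_full = 0.5 * viability[1.0]
    below = [eta for eta, v in viability.items() if v <= half_full]
    return max(below) if below else None

# Synthetic curve: viability degrades as context is removed (eta shrinks)
curve = {1.0: 0.96, 0.8: 0.94, 0.6: 0.90, 0.5: 0.75, 0.35: 0.40, 0.2: 0.10}
eta_c = semantic_threshold(curve)
```

With this curve, viability first falls below half its full-context value at η = 0.35, matching the kind of report shown in the Vision section.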

๐Ÿค Contributing

We welcome contributions! Key areas for community involvement:

  • ๐ŸŒ Environments: Create evaluation scenarios for specific domains
  • ๐Ÿ“š Context Systems: Integrate new semantic layer/knowledgebase/documentation technologies

See our Contributing Guide for detailed instructions.

Development Setup

git clone https://github.com/AnswerLayer/semiosis.git
cd semiosis
pip install -e ".[dev]"
# Note: Core framework still in development - tests coming soon

📚 Citation

If you use Semiosis in your research, please cite:

@software{semiosis2025,
  title={Semiosis: Evaluate Semantic Layers for AI Agent Performance},
  author={AnswerLayer Team},
  year={2025},
  url={https://github.com/AnswerLayer/semiosis}
}

📖 References

This framework builds on foundational work in semantic information theory:

[1] Kolchinsky, A. and Wolpert, D.H. Semantic information, autonomous agency, and nonequilibrium statistical physics. New Journal of Physics, 20(9):093024, 2018. arXiv:1806.08053

[2] Sowinski, D.R., Balasubramanian, V., and Kolchinsky, A. Semantic information in a model of resource gathering agents. Physical Review E, 107(4):044404, 2023. arXiv:2304.03286

[3] Balasubramanian, V. and Kolchinsky, A. Exo-Daisy World: Revisiting Gaia Theory through an Informational Architecture Perspective. Planetary Science Journal, 4(12):236, 2023.

[4] Sowinski, D.R., Frank, A., and Ghoshal, G. Information-theoretic description of a feedback-control Kuramoto model. Physical Review Research 6, 043188, 2024. arXiv:2505.20315

📄 License

MIT License - see LICENSE file for details.

Status: Alpha - Active development. APIs may change.

Roadmap: See GitHub Issues for current development plan.
