ZON-Format · ronibhakta1 · Nov 30, 2025
diff --git a/README.md b/README.md
diff --git a/SPEC.md b/SPEC.md
diff --git a/zon-format/CHANGELOG.md b/zon-format/CHANGELOG.md
@@ -1,10 +1,60 @@
 # Changelog
 
-All notable changes to the ZON Format project will be documented in this file.
+All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.0.4] - 2025-11-30
+
+### Added
+- **Colon-less Syntax:** Objects and arrays in nested positions now use `key{...}` and `key[...]` syntax, removing redundant colons.
+- **Smart Flattening:** Top-level nested objects are automatically flattened to dot notation (e.g., `config.db{...}`).
+- **Control Character Escaping:** All control characters (ASCII 0-31) are now properly escaped to prevent binary file creation.
+- **Runtime Schema Validation:** New `zon` builder and `validate()` function for LLM guardrails.
+- **Algorithmic Benchmark Generation**: Replaced LLM-based question generation with a deterministic algorithm for consistent benchmarks.
+- **Expanded Dataset**: Added "products" and "feed" data to unified dataset for real-world e-commerce scenarios.
+- **Tricky Questions**: Introduced edge cases (non-existent fields, logic traps, case sensitivity) to stress-test LLM reasoning.
+- **Robust Benchmark Runner**: Added exponential backoff and rate limiting to handle Azure OpenAI S0 tier constraints.
+
+### Changed
+- **Benchmark Formats**: Refined tested formats to ZON, TOON, JSON, JSON (Minified), and CSV for focused analysis.
+- **Documentation**: Updated README and API references with the latest benchmark results (GPT-5 Nano) and accurate token counts.
+- **Token Efficiency**: Recalculated efficiency scores based on the expanded dataset, confirming ZON's leadership (1430.6 score).
+
+### Improved
+- **Token Efficiency:** Achieved up to 23.8% reduction vs JSON (GPT-4o) thanks to syntax optimizations.
+- **Readability:** Cleaner, block-like structure for nested data.
+
+### Fixed
+- **Critical Data Integrity**: Fixed roundtrip failures for strings containing newlines, empty strings, and escaped characters.
+- **Decoder Logic**: Fixed `_split_by_delimiter` to correctly handle nested arrays and objects within table cells (e.g., `[10, 20]`).
+- **Encoder Logic**: Added mandatory quoting for empty strings and strings with newlines to prevent data loss.
+- **Rate Limiting**: Resolved 429 errors during benchmarking with robust retry logic.
+
+## [1.0.3] - 2025-11-28
+
+### 🎯 100% LLM Retrieval Accuracy Achieved
+
+**Major Achievement**: ZON now achieves **100% LLM retrieval accuracy** while maintaining superior token efficiency over TOON!
+
+### Changed
+- **Explicit Sequential Columns**: Disabled automatic sequential column omission (`[id]` notation)
+  - All columns now explicitly listed in table headers for better LLM comprehension
+  - Example: `users:@(5):active,id,lastLogin,name,role` (was `users:@(5)[id]:active,lastLogin,name,role`)
+  - Trade-off: +1.7% token increase for 100% LLM accuracy
+
+### Performance
+- **LLM Accuracy**: 100% (24/24 questions) vs TOON 100%, JSON 91.7%
+- **Token Efficiency**: 19,995 tokens (4.7% fewer than TOON's 20,988)
+- **Overall Savings vs TOON**: 4.6% (Claude) to 17.6% (GPT-4o)
+
+### Quality
+- ✅ All unit tests pass (28/28)
+- ✅ All roundtrip tests pass (27/27 datasets)
+- ✅ No data loss or corruption
+- ✅ Production ready
+
 ## [1.0.2] - 2025-11-24
 
 ### Changed - "ClearText" Major Format Overhaul
@@ -14,8 +64,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **YAML-like metadata**: Changed from `M=key="val"` to clean `key:val` syntax
 - **Clean @table syntax**: Replaced schema markers with readable `@tablename(count):cols`
 - **Aggressive quote removal**: Only quote when absolutely necessary (commas, control chars)
-  - Spaces no longer trigger quoting: `Blue Lake Trail` instead of `"Blue Lake Trail"`
-  - Colons allowed in values
 - **Compact array syntax**: `[item1,item2,item3]` with minimal inner quotes
 - **No spaces after separators**: Removed spaces after `:` and `,` for compactness
 
@@ -24,78 +72,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **25.6% better** than TOON (up from 20.8%)
 - Tested on 318 records across 6 real-world datasets
 
-#### New Features
-- Singleton bypass: 1-item lists flatten to metadata (`items.0.id:1`)
-- Pure list handling: Lists without wrapper use default `@data` table name
-- Boolean hard rule: Always explicit `T`/`F`, never inferred from empty cells
-
-#### Documentation
-- Comprehensive README.md with visual comparisons
-- EXAMPLES.md with detailed symbol reference
-- Benchmark sample generation scripts
-- `/benchmarks/encoded_samples/` with `.json`, `.zon`, and `.toon` comparisons
-
-### Fixed
-- Boolean preservation in roundtrip encoding/decoding
-- Array index handling in decoder unflatten logic
-- Pure list encoding/decoding (was returning empty string)
-
 ## [1.0.0] - 2025-11-23
 
 ### Added - Initial Release
-
-#### Core Features
-- ZON v7.0 format with pipe-based protocol syntax
-- Compression rules: Range (R), Liquid (L), Solid (S), Pattern (P), Value (V)
-- Anchor-based row references
-- Global dictionary for repeated strings
+- ZON v1.0 format implementation
+- Full encoder/decoder with lossless round-trips
 - CLI tool for encoding/decoding
 - Comprehensive test suite
 
-#### Performance
-- ~27% average compression vs JSON
-- ~21% better than TOON on structured data
-
-#### Package
-- Python 3.8+ support
-- PyPI distribution
-- Apache 2.0 license
-
----
-
-## Upgrade Notes
-
-### From 1.0.0 to 1.0.2
-
-**⚠️ Breaking Change**: The encoded format has changed completely. Data encoded with v1.0.0 will **not** decode correctly with v1.0.2.
-
-**Migration**: Re-encode your data with v1.0.2:
-
-```python
-import zon
-
-# Load your JSON data
-with open('data.json') as f:
-    data = json.load(f)
-
-# Encode with new format
-encoded = zon.encode(data)
-
-# Decode works as before
-decoded = zon.decode(encoded)
-```
-
-**Benefits**: The new format is much more readable and efficient. The migration is worth it for:
-- ✅ 4.5% additional compression
-- ✅ Zero protocol overhead
-- ✅ Better LLM readability
-- ✅ Cleaner visual appearance
-
----
-
-## Links
-
-- [PyPI](https://pypi.org/project/zon-format/)
-- [GitHub](https://github.com/ZON-Format/ZON)
-- [Examples](EXAMPLES.md)
-- [README](README.md)
+[1.0.4]: https://github.com/ZON-Format/ZON/releases/tag/v1.0.4
+[1.0.3]: https://github.com/ZON-Format/ZON/releases/tag/v1.0.3
+[1.0.2]: https://github.com/ZON-Format/ZON/releases/tag/v1.0.2
+[1.0.0]: https://github.com/ZON-Format/ZON/releases/tag/v1.0.0
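
The 1.0.4 "Decoder Logic" entry says `_split_by_delimiter` now handles nested arrays and objects inside table cells (e.g., `[10, 20]`). The diff shown here does not include that function, so the following is only a minimal sketch of the general technique, assuming a comma delimiter and double-quoted strings with backslash escapes. The name `split_top_level` is hypothetical and is not the project's API.

```python
def split_top_level(text, delimiter=","):
    """Split text on delimiter, skipping delimiters inside [], {} or double quotes.

    Hypothetical helper for illustration; not ZON's actual _split_by_delimiter.
    """
    parts = []
    current = []
    depth = 0          # nesting level of [] and {}
    in_quotes = False  # True while inside a double-quoted string
    escaped = False    # True right after a backslash inside quotes

    for ch in text:
        if escaped:
            # Character following a backslash is taken literally.
            current.append(ch)
            escaped = False
            continue
        if in_quotes:
            current.append(ch)
            if ch == "\\":
                escaped = True
            elif ch == '"':
                in_quotes = False
            continue
        if ch == '"':
            in_quotes = True
            current.append(ch)
        elif ch in "[{":
            depth += 1
            current.append(ch)
        elif ch in "]}":
            depth = max(depth - 1, 0)
            current.append(ch)
        elif ch == delimiter and depth == 0:
            # Only a delimiter at nesting depth zero ends a cell.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)

    parts.append("".join(current))
    return parts


cells = split_top_level('1,[10, 20],Alice')
print(cells)
```

Tracking bracket depth and quote state in a single pass keeps a cell such as `[10, 20]` intact, where a plain `str.split(",")` would cut it in two.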