159 changes: 159 additions & 0 deletions WEAVE-REFERENCE-FIXES-SUMMARY.md
@@ -0,0 +1,159 @@
# Weave Reference Generation Script Fixes

This document summarizes the fixes applied to the Weave reference documentation generation scripts based on PR #1888 feedback.

## Issues Fixed

### 1. Models Reference Files Being Renamed (CRITICAL BUG)
**Problem**: `fix_casing.py` was incorrectly targeting `models/ref/python/public-api` files instead of Weave reference docs.

**Fix**: Updated `fix_casing.py` to only target `weave/reference/python-sdk` files.
- Changed path from `models/ref/python/public-api` to `weave/reference/python-sdk`
- Removed the logic that was renaming Models API files (ArtifactCollection, etc.)
- Added clear comments indicating this should NEVER touch Models reference docs

**Files Modified**:
- `scripts/reference-generation/weave/fix_casing.py`

### 2. TypeScript SDK Using PascalCase Filenames
**Problem**: TypeScript SDK files were being generated with PascalCase filenames (e.g., `Dataset.mdx`, `WeaveClient.mdx`), which causes case-sensitivity issues in Git: on case-insensitive file systems, `Dataset.mdx` and `dataset.mdx` are treated as the same file.

**Fix**: Updated generation scripts to use lowercase filenames throughout.
- Modified `generate_typescript_sdk_docs.py` to convert filenames to lowercase when creating `.mdx` files
- Updated function and type-alias extraction to use lowercase filenames
- Updated internal links to use lowercase paths

**Files Modified**:
- `scripts/reference-generation/weave/generate_typescript_sdk_docs.py` (lines 259, 319-320, 369-370, 379)
- `scripts/reference-generation/weave/fix_casing.py` (simplified to just convert to lowercase)
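
The lowercase conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code in `fix_casing.py`: the function name `lowercase_mdx_filenames`, the `.renaming` staging suffix, and the collision check are assumptions added here so that a case-only rename also behaves predictably on a case-insensitive file system.

```python
from pathlib import Path
import tempfile


def lowercase_mdx_filenames(directory: Path) -> list[tuple[str, str]]:
    """Rename every *.mdx file in `directory` to a lowercase filename.

    Returns (old_name, new_name) pairs for the files that were renamed.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for file in sorted(directory.glob("*.mdx")):
        lower = file.stem.lower()
        if file.stem == lower:
            continue
        target = file.with_name(f"{lower}.mdx")
        # Hypothetical staging name: going through it makes a case-only
        # rename explicit on case-insensitive file systems.
        staging = file.with_name(f"{file.name}.renaming")
        file.rename(staging)
        if target.exists():
            # A different file already owns the lowercase name; undo and stop.
            staging.rename(file)
            raise FileExistsError(target)
        staging.rename(target)
        renamed.append((file.name, target.name))
    return renamed


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "Dataset.mdx").write_text("class page", encoding="utf-8")
    (demo_dir / "init.mdx").write_text("function page", encoding="utf-8")
    renamed = lowercase_mdx_filenames(demo_dir)
    remaining = sorted(p.name for p in demo_dir.iterdir())
```

Internal links that point at the old PascalCase paths would still need to be updated separately, as the list above notes.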

### 3. H1 in service-api/index.mdx
**Problem**: The generated `service-api/index.mdx` had both a frontmatter title and an H1, which is redundant in Mintlify.

**Fix**: Removed the H1 heading since Mintlify uses the frontmatter title.

**Files Modified**:
- `scripts/reference-generation/weave/generate_service_api_spec.py` (line 31)

### 4. Duplicate H3 Headings in service-api.mdx
**Problem**: The `service-api.mdx` file had duplicate category sections (e.g., "### Calls" appeared on both line 23 and line 158), listing the same endpoints twice.

**Fix**: Added deduplication logic to prevent duplicate categories and duplicate endpoints.
- Track which categories have been written to prevent duplicate H3 headings
- Deduplicate endpoints within each category by (method, path) tuple
- This prevents the same endpoint from being listed multiple times if it appears in the OpenAPI spec with duplicate tags

**Files Modified**:
- `scripts/reference-generation/weave/update_service_api_landing.py` (lines 99-118)
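
As a rough sketch of the deduplication idea (not the actual code in `update_service_api_landing.py`), the following groups endpoints so that each category heading is emitted once and each `(method, path)` pair appears once per category. The function names and the sample endpoint paths are illustrative assumptions.

```python
def group_endpoints(endpoints):
    """Group (category, method, path) triples, dropping repeats.

    Each category appears once, and each (method, path) pair appears at most
    once within its category. First-seen order is preserved.
    """
    grouped = {}
    for category, method, path in endpoints:
        seen = grouped.setdefault(category, {})
        # A dict used as an ordered set: repeated keys are ignored.
        seen.setdefault((method.upper(), path), None)
    return {category: list(seen) for category, seen in grouped.items()}


def render_markdown(grouped):
    """Render one H3 per category followed by its endpoint list."""
    lines = []
    for category, endpoints in grouped.items():
        lines.append(f"### {category}")
        lines.extend(f"- `{method} {path}`" for method, path in endpoints)
        lines.append("")
    return "\n".join(lines)


# Illustrative input: the "Calls" category and one endpoint are repeated.
sample = [
    ("Calls", "post", "/call/start"),
    ("Tables", "post", "/table/create"),
    ("Calls", "post", "/call/start"),
    ("Calls", "post", "/call/end"),
]
grouped = group_endpoints(sample)
landing = render_markdown(grouped)
```

Keying on the `(method, path)` tuple rather than on the endpoint title means two operations with the same summary text but different routes are both kept.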

### 5. Markdown Table Formatting Errors (------ lines)
**Problem**: Python SDK docs contained standalone lines with just dashes (`------`) which break markdown parsing.

**Example**: In `trace_server_interface.mdx`, lines such as 22, 30, and 39 contained a bare `------` that created invalid table structures.

**Fix**: Added regex pattern to remove these malformed table separators.
- Pattern: `\n\s*------+\s*\n` → `\n\n`
- This removes lines that are just dashes with optional whitespace

**Files Modified**:
- `scripts/reference-generation/weave/generate_python_sdk_docs.py` (lines 258-260)
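
The effect of the pattern can be checked in isolation. The snippet below applies the same regular expression to two small, made-up strings: one with a dash-only line and one with an ordinary Markdown table, which should be left alone because its separator row contains pipes. The helper name `strip_dash_lines` and the sample text are assumptions for illustration.

```python
import re

# Same pattern as listed above: a line holding only six or more dashes.
DASH_LINE = re.compile(r'\n\s*------+\s*\n')


def strip_dash_lines(content: str) -> str:
    """Replace each dash-only line with a blank line."""
    return DASH_LINE.sub('\n\n', content)


sample = "**Attributes:**\n------\n- `id`: call identifier\n"
cleaned = strip_dash_lines(sample)

# A well-formed table is not matched, since its separator starts with "|".
table = "| a | b |\n| --- | --- |\n| 1 | 2 |\n"
table_after = strip_dash_lines(table)
```

Two caveats follow from how the pattern is written. A setext-style heading (a title line followed by `------`) would also lose its underline, and because each match consumes its trailing newline, the second of two directly consecutive dash-only lines is not removed in a single pass.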

## Testing Recommendation

Before merging, test the fixes by running the reference generation locally:

```bash
# From the docs repository root
cd scripts/reference-generation/weave
python generate_weave_reference.py
```

Then verify:
1. No files in `models/ref/python/public-api` were modified
2. All TypeScript SDK files in `weave/reference/typescript-sdk/` have lowercase filenames
3. `weave/reference/service-api/index.mdx` has no H1 heading
4. `weave/reference/service-api.mdx` has no duplicate H3 category headings
5. No `------` lines in `weave/reference/python-sdk/trace_server/trace_server_interface.mdx`
6. In `docs.json`, modules under `weave/reference/python-sdk/trace/` are grouped as "Core" (not "Other")
7. In `docs.json`, the Service API `openapi` configuration uses the local spec (not a GitHub URL) if `sync_openapi_spec.py` was run with `--use-local`
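
Some of these checks can be automated. The sketch below covers only items 2 and 5 (lowercase TypeScript SDK filenames and no dash-only lines in the Python SDK docs); the function name `check_generated_docs` and the demonstration files are assumptions, and the remaining items would still need a manual look or additional checks.

```python
from pathlib import Path
import re
import tempfile

# A line consisting only of six or more dashes (frontmatter "---" is ignored).
DASH_ONLY = re.compile(r'^\s*------+\s*$', flags=re.MULTILINE)


def check_generated_docs(root: Path) -> list[str]:
    """Return human-readable problems found under the docs root."""
    problems = []

    ts_dir = root / "weave/reference/typescript-sdk"
    if ts_dir.is_dir():
        for mdx in sorted(ts_dir.rglob("*.mdx")):
            if mdx.name != mdx.name.lower():
                rel = mdx.relative_to(root).as_posix()
                problems.append(f"uppercase filename: {rel}")

    py_dir = root / "weave/reference/python-sdk"
    if py_dir.is_dir():
        for mdx in sorted(py_dir.rglob("*.mdx")):
            if DASH_ONLY.search(mdx.read_text(encoding="utf-8")):
                rel = mdx.relative_to(root).as_posix()
                problems.append(f"dash-only line: {rel}")

    return problems


# Demonstration against a throwaway tree that violates both checks.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    ts = root / "weave/reference/typescript-sdk/classes"
    py = root / "weave/reference/python-sdk/trace_server"
    ts.mkdir(parents=True)
    py.mkdir(parents=True)
    (ts / "Dataset.mdx").write_text("---\ntitle: Dataset\n---\n", encoding="utf-8")
    (py / "trace_server_interface.mdx").write_text(
        "Fields\n------\n- id\n", encoding="utf-8"
    )
    problems = check_generated_docs(root)
```

An empty list from `check_generated_docs(Path("."))`, run from the docs repository root, would indicate that both checks pass.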

### 6. Incorrect Section Grouping ("Core" Modules Grouped as "Other")
**Problem**: Python SDK modules in the `trace/` directory were being incorrectly grouped as "Other" instead of "Core" in docs.json navigation.

**Root Cause**: The path checking logic in `update_weave_toc.py` was checking `if parts[0] == "weave"`, but paths are relative to `python-sdk/`, so `parts[0]` is actually the module subdirectory (`trace`, `trace_server`, etc.), not `weave`.

**Fix**: Corrected the path checking logic to check the actual first path component.
- Changed from checking `parts[0] == "weave"` and then `parts[1] == "trace"` to directly checking `parts[0] == "trace"`, `parts[0] == "trace_server"`, etc.

**Files Modified**:
- `scripts/reference-generation/weave/update_weave_toc.py` (lines 33-45)
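
A minimal sketch of the corrected check, assuming paths are given relative to `python-sdk/` as described above. The function name `section_for` and the `"Trace Server"` label are hypothetical; only the `trace` to "Core" mapping and the "Other" fallback come from this document.

```python
from pathlib import PurePosixPath

GROUP_BY_TOP_LEVEL_DIR = {
    "trace": "Core",
    # Hypothetical label: the document names this directory, not its group.
    "trace_server": "Trace Server",
}


def section_for(relative_path: str) -> str:
    """Pick a navigation group from the first component of a path.

    `relative_path` is relative to python-sdk/, so its first component is
    the module subdirectory, never "weave".
    """
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return "Other"
    return GROUP_BY_TOP_LEVEL_DIR.get(parts[0], "Other")


core = section_for("trace/op")
# The old assumption (a leading "weave" component) never matches real input,
# which is why everything fell through to "Other".
old_style = section_for("weave/trace/op")
```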

### 7. OpenAPI Configuration Being Overwritten
**Problem**: `update_weave_toc.py` was unconditionally overwriting the OpenAPI spec configuration in docs.json to use a remote URL, ignoring the local spec that `sync_openapi_spec.py` downloads and configures.

**Impact**: Even though `sync_openapi_spec.py` downloads the OpenAPI spec locally and can configure docs.json to use it, `update_weave_toc.py` would immediately overwrite it with a remote GitHub URL, defeating the purpose of the local spec.

**Fix**: Removed the Service API OpenAPI configuration code from `update_weave_toc.py`. This script should only manage Python/TypeScript SDK navigation, not the OpenAPI spec source.
- Deleted lines 209-224 that were setting `page["openapi"]` to remote URLs
- Added comment noting that OpenAPI configuration is managed by `sync_openapi_spec.py`

**Files Modified**:
- `scripts/reference-generation/weave/update_weave_toc.py` (lines 206-207)

### 8. Missing Root Module Documentation (CRITICAL - WEAVE PACKAGE REGRESSION)
**Problem**: The generated `python-sdk.mdx` file is only 8 lines (just frontmatter), completely missing all the important API documentation for functions like `init()`, `publish()`, `ref()`, `get()`, etc.

**Expected**: The current version (Weave 0.52.10) has 2074 lines documenting all the core Weave functions and classes.

**Root Cause**: **This is a WEAVE PACKAGE REGRESSION, not a script bug.**

Something changed in Weave between versions **0.52.10** (current docs) and **0.52.16** (PR version) that broke documentation generation for the root `weave` module. The generation scripts haven't changed, and lazydocs hasn't changed, so this is an upstream issue in the Weave package itself.

Possible causes:
1. Changes to `weave/__init__.py` that affect how the module exports its public API
2. Module structure refactoring that lazydocs can't handle
3. New import patterns or lazy loading that breaks introspection

**Status**: **CRITICAL UPSTREAM BUG** - This makes the Python SDK documentation completely unusable for version 0.52.16.

**Action Required**: Report this to the Weave team immediately:
1. File an issue: https://github.com/wandb/weave/issues
2. Include: "Documentation generation broken in 0.52.16 - root module exports not discoverable by lazydocs"
3. Mention: "Works fine in 0.52.10, broken in 0.52.16"
4. Tag: @dbrian57 or relevant Weave maintainers

**Recommendation**:
- **DO NOT MERGE PR #1888** - it will break Python SDK documentation
- Either: Fix the Weave package and regenerate docs
- Or: Stay on 0.52.10 documentation until the Weave package is fixed

**Files to Investigate** (in Weave repo):
- `weave/__init__.py` between versions 0.52.10 and 0.52.16
- Any structural changes to the weave package in that version range
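
One way to narrow this down is to compare what each Weave version exposes at the module root. The helper below is a generic introspection sketch using only the standard library; it does not reproduce lazydocs' own discovery rules, which may differ, and the name `public_api` is an assumption. It is demonstrated on the standard `json` module so that it can run without Weave installed.

```python
import importlib
import inspect


def public_api(module_name: str) -> dict[str, str]:
    """Map public callables of a module to "class" or "function".

    Uses __all__ when the module defines it, otherwise every attribute
    whose name does not start with an underscore.
    """
    module = importlib.import_module(module_name)
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in dir(module) if not n.startswith("_")]

    api = {}
    for name in names:
        # A name listed in __all__ but missing from the module is skipped.
        obj = getattr(module, name, None)
        if inspect.isclass(obj):
            api[name] = "class"
        elif callable(obj):
            api[name] = "function"
    return api


# Demonstration on a standard-library module.
json_api = public_api("json")
```

Running `public_api("weave")` in one environment with 0.52.10 installed and another with 0.52.16, then diffing the two results, would show whether names such as `init` or `publish` stopped being visible through plain introspection.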

### 9. OpenAPI Spec Validation (New Feature)
**Enhancement**: Added validation to detect issues in the OpenAPI spec itself, which can help identify upstream problems.

**Features**:
- Detects duplicate endpoint definitions (same method+path defined multiple times)
- Identifies endpoints appearing in multiple categories/tags
- Warns when critical issues like duplicate endpoints are found
- Suggests reporting issues to the Weave team when spec problems are detected

**Files Modified**:
- `scripts/reference-generation/weave/sync_openapi_spec.py` (added `validate_spec()` function and integration in `main()`)

This will help identify if duplicate H3s or other issues originate from the OpenAPI spec rather than our generation scripts.
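
To illustrate one of these checks, the sketch below scans a parsed OpenAPI document for operations that carry more than one distinct tag. It is not the `validate_spec()` function itself; the name `find_multi_tag_operations` and the sample spec are assumptions. Note that once a JSON file has been parsed into a dictionary, literally repeated keys have already been collapsed, so detecting a path defined twice in the raw file would need a different approach.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def find_multi_tag_operations(spec: dict) -> list[tuple[str, str, list[str]]]:
    """Return (METHOD, path, tags) for operations listed under several tags."""
    findings = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            tags = sorted(set(operation.get("tags", [])))
            if len(tags) > 1:
                findings.append((method.upper(), path, tags))
    return findings


# Illustrative spec fragment; the paths and tags are made up for the example.
sample_spec = {
    "paths": {
        "/call/start": {"post": {"tags": ["Calls"]}},
        "/calls/query": {
            "parameters": [],
            "post": {"tags": ["Calls", "Queries"]},
        },
    }
}
findings = find_multi_tag_operations(sample_spec)
```

An endpoint reported here would be rendered under each of its tags, which is one way the same route can end up listed in more than one category.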

## Files Modified Summary

1. `scripts/reference-generation/weave/fix_casing.py`
2. `scripts/reference-generation/weave/generate_typescript_sdk_docs.py`
3. `scripts/reference-generation/weave/generate_service_api_spec.py`
4. `scripts/reference-generation/weave/update_service_api_landing.py`
5. `scripts/reference-generation/weave/generate_python_sdk_docs.py`
6. `scripts/reference-generation/weave/update_weave_toc.py`
7. `scripts/reference-generation/weave/sync_openapi_spec.py` (new validation feature)

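All script fixes are backward compatible and will take effect on the next reference documentation generation run. Issue 8 is an upstream regression in the Weave package and is not resolved by these changes.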
109 changes: 24 additions & 85 deletions scripts/reference-generation/weave/fix_casing.py
@@ -12,108 +12,47 @@
from pathlib import Path

def fix_typescript_casing(base_path):
"""Fix TypeScript SDK file casing."""
print("Fixing TypeScript SDK file casing...")
"""Fix TypeScript SDK file casing - ensure all files use lowercase."""
print("Fixing TypeScript SDK file casing to lowercase...")

ts_base = Path(base_path) / "weave/reference/typescript-sdk/weave"
ts_base = Path(base_path) / "weave/reference/typescript-sdk"
if not ts_base.exists():
print(f" TypeScript SDK path not found: {ts_base}")
return

# Define correct names for each directory
casing_rules = {
"classes": {
"dataset": "Dataset",
"evaluation": "Evaluation",
"weaveclient": "WeaveClient",
"weaveobject": "WeaveObject",
},
"interfaces": {
"callschema": "CallSchema",
"callsfilter": "CallsFilter",
"weaveaudio": "WeaveAudio",
"weaveimage": "WeaveImage",
},
"functions": {
# Functions should be lowercase/camelCase
"init": "init",
"login": "login",
"op": "op",
"requirecurrentcallstackentry": "requireCurrentCallStackEntry",
"requirecurrentchildsummary": "requireCurrentChildSummary",
"weaveaudio": "weaveAudio",
"weaveimage": "weaveImage",
"wrapopenai": "wrapOpenAI",
},
"type-aliases": {
"op": "Op", # Type alias Op is uppercase
"opdecorator": "OpDecorator",
"messagesprompt": "MessagesPrompt",
"stringprompt": "StringPrompt",
}
}
# All TypeScript SDK files should use lowercase filenames for consistency
# This applies to classes, functions, interfaces, and type-aliases
subdirs_to_check = ["classes", "functions", "interfaces", "type-aliases"]

for dir_name, rules in casing_rules.items():
dir_path = ts_base / dir_name
for subdir in subdirs_to_check:
dir_path = ts_base / subdir
if not dir_path.exists():
continue

for file in dir_path.glob("*.mdx"):
basename = file.stem.lower()
if basename in rules:
correct_name = rules[basename]
if file.stem != correct_name:
new_path = file.parent / f"{correct_name}.mdx"
print(f" Renaming: {file.name} → {correct_name}.mdx")
shutil.move(str(file), str(new_path))
# Convert filename to lowercase
lowercase_name = file.stem.lower()
if file.stem != lowercase_name:
new_path = file.parent / f"{lowercase_name}.mdx"
print(f" Renaming: {file.name} → {lowercase_name}.mdx")
shutil.move(str(file), str(new_path))

def fix_python_casing(base_path):
"""Fix Python SDK file casing."""
print("Fixing Python SDK file casing...")
"""Fix Python SDK file casing for WEAVE reference docs only."""
print("Fixing Weave Python SDK file casing...")

py_base = Path(base_path) / "models/ref/python/public-api"
# IMPORTANT: This should ONLY touch Weave reference docs, never Models reference docs
py_base = Path(base_path) / "weave/reference/python-sdk"
if not py_base.exists():
print(f" Python SDK path not found: {py_base}")
print(f" Weave Python SDK path not found: {py_base}")
return

# Python class files that should be uppercase
uppercase_files = {
"artifactcollection": "ArtifactCollection",
"artifactcollections": "ArtifactCollections",
"artifactfiles": "ArtifactFiles",
"artifacttype": "ArtifactType",
"artifacttypes": "ArtifactTypes",
"betareport": "BetaReport",
"file": "File",
"member": "Member",
"project": "Project",
"registry": "Registry",
"run": "Run",
"runartifacts": "RunArtifacts",
"sweep": "Sweep",
"team": "Team",
"user": "User",
}

# Files that should remain lowercase
lowercase_files = ["api", "artifacts", "automations", "files", "projects",
"reports", "runs", "sweeps", "_index"]
# For Weave Python SDK, we generally want lowercase filenames
# Only specific files might need special casing - currently none known
# Most Weave modules use lowercase with underscores (e.g., weave_client.mdx)

for file in py_base.glob("*.mdx"):
basename = file.stem.lower()

if basename in uppercase_files:
correct_name = uppercase_files[basename]
if file.stem != correct_name:
new_path = file.parent / f"{correct_name}.mdx"
print(f" Renaming: {file.name} → {correct_name}.mdx")
shutil.move(str(file), str(new_path))
elif basename in lowercase_files:
# Ensure these stay lowercase
if file.stem != basename:
new_path = file.parent / f"{basename}.mdx"
print(f" Renaming: {file.name} → {basename}.mdx")
shutil.move(str(file), str(new_path))
print(f" Weave Python SDK files are generated with correct casing")
print(f" No casing changes needed for Weave reference documentation")

def main():
"""Main function to fix all casing issues."""
@@ -255,6 +255,10 @@ def generate_module_docs(module, module_name: str, src_root_path: str, version:
# Remove <b>` at the start of lines that don't have a closing </b>
content = re.sub(r'^- <b>`([^`\n]*?)$', r'- \1', content, flags=re.MULTILINE)

# Remove malformed table separators that lazydocs sometimes generates
# These appear as standalone lines with just dashes (------) which break markdown parsing
content = re.sub(r'\n\s*------+\s*\n', '\n\n', content)

# Fix parameter lists that have been broken by lazydocs
# Strategy: Parse all parameters into a structured format, then reconstruct them properly
def fix_parameter_lists(text):
56 changes: 8 additions & 48 deletions scripts/reference-generation/weave/generate_service_api_spec.py
@@ -12,59 +12,19 @@
def main():
"""Main function."""
print("Service API configuration:")
print(" Using remote OpenAPI spec: https://trace.wandb.ai/openapi.json")
print(" Mintlify will generate documentation for all 41 endpoints")
print(" Using OpenAPI spec from sync_openapi_spec.py")
print(" Mintlify will generate documentation for endpoints")
print("")

# Create the service-api directory structure
# Create the service-api directory structure for openapi.json
# Note: The landing page is service-api.mdx (not service-api/index.mdx)
# and is managed by update_service_api_landing.py
service_api_dir = Path("weave/reference/service-api")
service_api_dir.mkdir(parents=True, exist_ok=True)

# Create an index file if it doesn't exist
index_file = service_api_dir / "index.mdx"
if not index_file.exists():
index_content = """---
title: "Service API"
description: "REST API endpoints for the Weave service"
---

# Weave Service API

The Weave Service API provides REST endpoints for interacting with the Weave tracing service.

## Available Endpoints

This documentation is automatically generated from the OpenAPI specification at https://trace.wandb.ai/openapi.json.

The API includes endpoints for:
- **Calls**: Start, end, update, query, and manage traces
- **Tables**: Create, update, and query data tables
- **Files**: Upload and manage file attachments
- **Objects**: Store and retrieve versioned objects
- **Feedback**: Collect and query user feedback
- **Costs**: Track and query usage costs
- **Inference**: OpenAI-compatible inference endpoints

## Authentication

Most endpoints require authentication. Include your W&B API key in the request headers:

```
Authorization: Bearer YOUR_API_KEY
```

## Base URL

All API requests should be made to:

```
https://trace.wandb.ai
```
"""
index_file.write_text(index_content)
print(f"✓ Created Service API index at {index_file}")

print("✓ Service API setup complete!")
print("✓ Service API directory structure ready")
print(" Note: Landing page at weave/reference/service-api.mdx")
print(" Note: OpenAPI spec at weave/reference/service-api/openapi.json")


if __name__ == "__main__":