apache · davsclaus · Jan 16, 2026 · Jan 5, 2026 · Jan 5, 2026 · Jan 5, 2026
diff --git a/.docsearch.README.md b/.docsearch.README.md
@@ -0,0 +1,111 @@
+# DocSearch Configuration
+
+This directory contains the Algolia DocSearch configuration for the Apache Camel website.
+
+## Overview
+
+The `.docsearch.config.json` file defines how Algolia's crawler indexes the Camel website for search functionality. This configuration ensures that all relevant content is discoverable through the site search, including:
+
+- All component documentation (not just canonical versions)
+- Tables with component specifications and supported models
+- Metadata sections and inline code
+- Multiple documentation versions (next, latest, and release branches)
+
+## Key Configuration Elements
+
+### Index Settings (`index`)
+- **name**: `apache_camel` - The Algolia index where content is stored
+- **startUrls**: Entry points for the crawler
+- **pathsToMatch**: URL patterns to include in indexing
+- **pathsToIgnore**: URLs to skip (search pages, error pages, etc.)
+- **includeHeadingLevels**: All heading levels (h1-h6) are indexed for better navigation
+
+### Content Selectors (`selectors`)
+
+These CSS selectors define what content gets indexed:
+
+- **lvl0-lvl5**: Heading hierarchy (h1-h6) used to build the breadcrumb structure
+- **text**: Main content to index including:
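+  - Paragraphs (`p`), list items (`li`), and generic containers (`span:not(.tooltip)`, `div:not([class*='hidden'])`)
+  - Table cells (`td`, `th`) and table bodies (`table tbody`) - **Important for component specs**
+  - Definition terms and descriptions (`dt`, `dd`)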
+  - Code blocks (`code`, `pre`)
+
+This ensures keywords like "PyTorch" in Model Zoo tables are indexed, fixing issue #1209.
+
+### Exclusions (`selectors_exclude`)
+
+Navigation, sidebars, footers, and other non-content elements are excluded to improve search quality:
+- `.no_index`, `[data-no-index]` - Custom exclusion class and attribute
+- Navigation elements: `nav`, `.navbar`, `.menu`, `.sidebar`, `.breadcrumb`, `.toc`
+- Footer and copyright: `footer`, `.footer`, `.copyright`
+- Hidden elements: `.hide`, `.hidden`, `[aria-hidden='true']`
+
+### Crawling Rules (`crawler`)
+
+- **maxDepth**: 20 - Allows deep navigation through component docs
+- **maxUrls**: 50,000 - Sufficient for Camel's comprehensive documentation
+- **sitemapUrls**: Uses sitemap for efficient crawling
+- **timeoutMs**: 30,000 - Adequate for large pages with tables
+
+### Multi-Version Support (`start_urls`)
+
+The configuration crawls multiple documentation versions and site sections. The site root (`https://camel.apache.org/`) is also a start URL and carries the highest `page_rank` (8):
+
+1. **next** (page_rank: 5) - Development version
+2. **latest** (page_rank: 5) - Latest stable
+3. **`\d+\.\d+\.x`** (page_rank: 4) - Release branches (4.4.x, 4.10.x, etc.)
+4. **manual** (page_rank: 7) - Core documentation
+5. **docs** (page_rank: 6) - General documentation
+6. **blog** (page_rank: 3) - Blog posts
+
+This addresses the issue where only canonical (4.4.x) pages were indexed.
+
+### Search Behavior (`custom_settings`)
+
+- **searchableAttributes**: Fields available for full-text search
+- **separatorsToIndex**: Include underscores, dots, and dashes in search (important for component names like `camel-k`)
+- **attributeForDistinctResults**: Deduplicate results by URL to avoid showing the same page multiple times
+
+## Maintenance
+
+When making changes to this configuration:
+
+1. **Test locally** - Build the site and check that the targeted elements appear in the generated HTML
+2. **Document changes** - Explain why selectors or URLs were modified
+3. **Consider impacts** - Changes affect search indexing across all users
+4. **Verify coverage** - Use Algolia dashboard to check what's indexed
+
+### Common Modifications
+
+**Adding new documentation sections:**
+```json
+{
+  "url": "https://camel.apache.org/new-section/",
+  "page_rank": 5
+}
+```
+
+**Excluding problematic content:**
+```json
+"selectors_exclude": [
+  ".no_index",
+  ".problematic-element"
+]
+```
+
+**Adjusting content extraction:**
+Modify the `text` selector in the `selectors` section to include additional elements.
+
+## Related Issue
+
+- **Issue #1209**: "The search is not finding several fields"
+  - Problem: Keywords like Bradley, firmata, PyTorch not indexed from component documentation
+  - Root cause: Missing configuration for table content and non-canonical versions
+  - Solution: This configuration file with improved selectors and multi-version crawling
+
+## References
+
+- [Algolia DocSearch Documentation](https://docsearch.algolia.com/)
+- [Camel Website GitHub](https://github.com/apache/camel-website)
+- [Issue #1209](https://github.com/apache/camel-website/issues/1209)
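
The release-branch entry in `start_urls` is written as a pattern rather than a literal URL. The sketch below checks which URLs that pattern accepts, assuming the crawler treats the entry as a regular expression matched from the start of the URL (an assumption, not something this PR confirms). The helper name `release_branch` and the sample page names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the release-branch start_urls entry, JSON escaping removed.
RELEASE_BRANCH = re.compile(r"https://camel.apache.org/components/(\d+)\.(\d+)\.x/")


def release_branch(url: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) if the URL is under a release branch, else None."""
    match = RELEASE_BRANCH.match(url)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample URLs; the page names are illustrative only.
samples = [
    "https://camel.apache.org/components/4.4.x/index.html",
    "https://camel.apache.org/components/4.10.x/index.html",
    "https://camel.apache.org/components/next/index.html",
    "https://camel.apache.org/components/latest/index.html",
]

for url in samples:
    print(url, "->", release_branch(url))
```

Under that assumption, `next` and `latest` are not covered by the pattern, which is consistent with the config listing them as separate entries.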
diff --git a/.docsearch.config.json b/.docsearch.config.json
@@ -0,0 +1,125 @@
+{
+  "index": {
+    "name": "apache_camel",
+    "startUrls": [
+      "https://camel.apache.org/"
+    ],
+    "ignoreCanonicalTo": false,
+    "pathsToMatch": [
+      "https://camel.apache.org/**"
+    ],
+    "pathsToIgnore": [
+      "https://camel.apache.org/search",
+      "https://camel.apache.org/404.html"
+    ],
+    "includeHeadingLevels": [1, 2, 3, 4, 5, 6],
+    "stripQueryParameters": true
+  },
+  "crawler": {
+    "userAgent": "Algolia Crawler",
+    "maxDepth": 20,
+    "maxUrls": 50000,
+    "waitUntilFired": true,
+    "timeoutMs": 30000,
+    "sitemapUrls": [
+      "https://camel.apache.org/sitemap.xml"
+    ],
+    "ignoreRobotsTxt": false,
+    "allowedDomains": [
+      "camel.apache.org"
+    ]
+  },
+  "selectors": {
+    "lvl0": {
+      "selector": "h1",
+      "global": true,
+      "default_value": "Documentation"
+    },
+    "lvl1": "h2",
+    "lvl2": "h3",
+    "lvl3": "h4",
+    "lvl4": "h5",
+    "lvl5": "h6",
+    "text": "p, li, td, th, dt, dd, span:not(.tooltip), div:not([class*='hidden']), table tbody, code, pre"
+  },
+  "selectors_exclude": [
+    ".no_index",
+    "[data-no-index]",
+    ".sidebar",
+    ".breadcrumb",
+    "nav",
+    ".navbar",
+    ".menu",
+    ".toc",
+    "footer",
+    ".footer",
+    ".copyright",
+    ".hide",
+    ".hidden",
+    "[aria-hidden='true']",
+    "script",
+    "style",
+    ".language-toggle",
+    ".sidebar-toggle"
+  ],
+  "min_indexed_level": 1,
+  "only_content_level": false,
+  "start_urls": [
+    {
+      "url": "https://camel.apache.org/components/next/",
+      "page_rank": 5
+    },
+    {
+      "url": "https://camel.apache.org/components/latest/",
+      "page_rank": 5
+    },
+    {
+      "url": "https://camel.apache.org/components/(\\d+)\\.(\\d+)\\.x/",
+      "page_rank": 4
+    },
+    {
+      "url": "https://camel.apache.org/manual/",
+      "page_rank": 7
+    },
+    {
+      "url": "https://camel.apache.org/docs/",
+      "page_rank": 6
+    },
+    {
+      "url": "https://camel.apache.org/blog/",
+      "page_rank": 3
+    },
+    {
+      "url": "https://camel.apache.org/",
+      "page_rank": 8
+    }
+  ],
+  "stop_urls": [
+    "\\?",
+    "#"
+  ],
+  "custom_settings": {
+    "separatorsToIndex": "_.-",
+    "attributesForFaceting": [
+      "version"
+    ],
+    "attributesToIndex": [
+      "hierarchy",
+      "content",
+      "url"
+    ],
+    "minWordSizefor1Typo": 4,
+    "minWordSizefor2Typos": 8,
+    "exactOnSingleWordQuery": "none",
+    "attributeForDistinctResults": "url",
+    "searchableAttributes": [
+      "hierarchy.lvl0",
+      "hierarchy.lvl1",
+      "hierarchy.lvl2",
+      "hierarchy.lvl3",
+      "hierarchy.lvl4",
+      "hierarchy.lvl5",
+      "content"
+    ]
+  }
+}
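
To see how the `start_urls` entries rank against each other, the following sketch sorts them by `page_rank`. It assumes that a higher `page_rank` means higher search priority, as the README describes; the function name `by_priority` is hypothetical, and the data is copied from the config above.

```python
import json

# start_urls as they appear in .docsearch.config.json.
START_URLS = json.loads(r"""
[
  {"url": "https://camel.apache.org/components/next/", "page_rank": 5},
  {"url": "https://camel.apache.org/components/latest/", "page_rank": 5},
  {"url": "https://camel.apache.org/components/(\\d+)\\.(\\d+)\\.x/", "page_rank": 4},
  {"url": "https://camel.apache.org/manual/", "page_rank": 7},
  {"url": "https://camel.apache.org/docs/", "page_rank": 6},
  {"url": "https://camel.apache.org/blog/", "page_rank": 3},
  {"url": "https://camel.apache.org/", "page_rank": 8}
]
""")


def by_priority(entries):
    """Sort entries from highest to lowest page_rank (ties keep file order)."""
    return sorted(entries, key=lambda entry: entry["page_rank"], reverse=True)


for entry in by_priority(START_URLS):
    print(entry["page_rank"], entry["url"])
```

The site root comes out on top with rank 8, followed by `manual` (7) and `docs` (6); the blog ranks lowest.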
diff --git a/README.md b/README.md
@@ -453,6 +453,35 @@ all generated sources in the project first.
 
 Of course this then takes some more time than an optimized rebuild (time to grab another coffee!).
 
+## Search Indexing Configuration
+
+The website uses [Algolia DocSearch](https://docsearch.algolia.com/) to provide site-wide search functionality. The search configuration is defined in [`.docsearch.config.json`](.docsearch.config.json).
+
+### What is indexed
+
+The configuration ensures that Algolia's crawler indexes:
+- All documentation versions (development `next`, latest, and release branches like `4.4.x`)
+- Component specifications and tables (fixing issue #1209)
+- All heading levels and content blocks
+- Code blocks and inline code snippets
+
+### Maintaining the search configuration
+
+If you need to modify what gets indexed or how content is crawled:
+
+1. Edit [`.docsearch.config.json`](.docsearch.config.json) to change selectors or crawling rules
+2. Review the detailed documentation in [`.docsearch.README.md`](.docsearch.README.md)
+3. Test your changes by building the site locally: `yarn build`
+4. Verify that the content is indexable by trying the search in the preview
+
+Key elements to be aware of:
+- **Selectors** define what HTML elements are indexed (headings, paragraphs, tables, code)
+- **start_urls** control which parts of the site are crawled and their search priority
+- **selectors_exclude** specify elements to skip (navigation, sidebars, footers)
+- **custom_settings** control search behavior and index settings
+
+For more details, see [`.docsearch.README.md`](.docsearch.README.md).
+
 # Checks, publishing the website
 
 The content of the website, as built by the [Camel.website](https://ci-builds.apache.org/job/Camel/job/Camel.website/job/main/)
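
One way to approach the "verify content is indexable" step without waiting for a crawl is to confirm that a keyword really sits inside table cells of a built page, since `td` and `th` are among the elements the `text` selector targets. This is a minimal sketch using only the standard library; the sample HTML is made up for illustration, and `cells_containing` is a hypothetical helper, not part of the website tooling.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text found inside <td> and <th> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while the parser is inside a table cell
        self.cells = []   # text of each completed outermost cell
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.cells.append("".join(self._buffer).strip())
                self._buffer = []

    def handle_data(self, data):
        if self.depth:
            self._buffer.append(data)


def cells_containing(html, keyword):
    """Return the table cells whose text contains keyword (case-insensitive)."""
    parser = CellText()
    parser.feed(html)
    parser.close()
    return [cell for cell in parser.cells if keyword.lower() in cell.lower()]


# Illustrative markup only; in practice, read a page from the local build output.
SAMPLE = """
<table>
  <tr><th>Engine</th><th>Models</th></tr>
  <tr><td>PyTorch</td><td>ssd, resnet</td></tr>
</table>
<p>PyTorch outside a table</p>
"""

print(cells_containing(SAMPLE, "pytorch"))  # ['PyTorch']
```

An empty result for a keyword that is visibly on the page suggests the text is rendered outside `td`/`th` elements, so changing the table selectors alone would not help.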

diff --git a/antora-ui-camel/src/css/header.css b/antora-ui-camel/src/css/header.css
@@ -303,6 +303,8 @@ html:not([data-scroll='0']) .navbar {
   margin-right: 10px;
   overflow-y: auto;
   max-height: 80vh;
+  max-width: min(600px, 90vw);
+  min-width: 300px;
   scrollbar-width: thin; /* Firefox */
 }
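
The two added declarations interact on narrow screens: `max-width: min(600px, 90vw)` drops below 300px once the viewport is narrower than about 334px, and CSS resolves a `min-width` larger than `max-width` in favour of `min-width`. The sketch below works through that arithmetic for a few viewport widths; `width_bounds` is a hypothetical helper and the viewport sizes are arbitrary examples.

```python
from typing import Tuple


def width_bounds(viewport_px: float) -> Tuple[float, float]:
    """Return the (min, max) width in px that the two declarations allow."""
    min_width = 300.0                               # min-width: 300px
    max_width = min(600.0, viewport_px * 90 / 100)  # max-width: min(600px, 90vw)
    # When min-width exceeds max-width, CSS uses min-width.
    return min_width, max(max_width, min_width)


for viewport in (320, 480, 1280):
    print(viewport, width_bounds(viewport))
```

At a 320px viewport the element is pinned to 300px, which exceeds 90vw (288px); whether that matters depends on the surrounding margins in the layout.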