Skip to content

Eliminate unnecessary session writes for guest users#5541

Open
Deltik wants to merge 7 commits intoe107inc:masterfrom
Deltik:fix/5262
Open

Eliminate unnecessary session writes for guest users#5541
Deltik wants to merge 7 commits intoe107inc:masterfrom
Deltik:fix/5262

Conversation

@Deltik
Copy link
Member

@Deltik Deltik commented Aug 5, 2025

Motivation and Context

This change solves the rapidly growing session table issue where bot traffic and anonymous visitors were creating thousands of orphaned database sessions, causing the e107_session table to balloon from a reported 6 rows to 17,334+ rows within a week, with the vast majority being guest sessions (session_user = 0).

Fixes: #5262

Description

This PR comprehensively eliminates unnecessary session writes for guest/unauthenticated users through multiple optimizations:

Key Changes

  1. JWT-based CAPTCHA – Replace session-based CAPTCHA with stateless JWT tokens
  2. Cookie-based CSRF for guests – Use HTTP-only cookies instead of sessions for guest CSRF tokens
  3. Lazy session initialization – Don't write sessions to persistent storage when they're empty
  4. Conditional CHAP challenges – Skip challenge generation when pref password_CHAP is disabled
  5. Message handler optimization – Prevent session writes when no messages exist
  6. Remove unused session keys – Eliminate the unused language-list session key

Implementation Details

  • Added e_jwt handler for stateless token generation
  • Added CSRF handlers for cookie-based CSRF protection for guests and (the existing) session-based CSRF protection for sessions that have authentication tokens inside them
  • Modified session handler to support lazy initialization and conditional writes
  • Maintained full backward compatibility through magic methods and fallback mechanisms

Breaking Changes

  • secure_image::$random_number is now private (access is possible using __get() and __set() magic methods but discouraged by deprecation)
  • CSRF tokens for guests now use cookies instead of sessions

The session table growth should be dramatically reduced for sites with bot traffic and other one-time visitors. Only authenticated users and guests actively interacting with forms will trigger session creation.

How Has This Been Tested?

Automated

New automated tests in secure_imageTest at e107_tests/tests/unit/secure_imageTest.php

Manual

  • @Jimmi08 reviewed CAPTCHA functionality
  • Walked through the Double-Submit Cookie CSRF protection for guests
  • Observed that the session table no longer adds records for simple guest page views
  • Checked backwards compatibility by not noticing anything broken in eMessage usages, logging in, logging out, and clicking around

Types of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist

BREAKING CHANGE: The `secure_image::$random_number` property is now
private and contains JWT tokens instead of session IDs. Legacy code
accessing this property will continue to work via magic methods, but
direct manipulation may cause undefined behavior.

This change eliminates the need for database sessions when generating
CAPTCHAs for anonymous users, solving the rapidly growing session
table issue where bot traffic was creating thousands of orphaned
sessions.

Key changes:
- Add JWT handler (`e_jwt`) for stateless CAPTCHA token generation
- Implement lazy loading with magic methods for backward compatibility
- Store CAPTCHA solutions in self-contained encrypted JWT tokens
- Add configurable TTL (default 10 min) and optional IP verification
- Remove dependency on database sessions for anonymous CAPTCHA users
- Add test coverage for backwards compatibility

The session table growth should be enormously reduced for sites with
bot traffic and other one-time visitors that don't otherwise make use
of sessions.

Fixes: e107inc#5262
@Deltik Deltik requested review from CaMer0n and Moc August 5, 2025 09:50
Deltik added 5 commits August 5, 2025 14:23
This change prevents the message handler from writing to the session
storage when there are no messages to store, reducing unnecessary
database writes for guest users. This fix:

- Adds an `eMessage::isSessionDataEmpty()` helper to check if message
  arrays are empty
- Introduces an `eMessage::setSessionData()` wrapper that only writes
  non-empty data
- Clears the session key when all message arrays are empty
- Adds an `eMessage::getSessionData()` wrapper to ensure the proper
  array structure when cleared or partially populated

This eliminates the `_system_messages` key session write for visitors
who have no pending messages to display on the next page, reducing the
session storage overhead.

Supports: e107inc#5262
Implement a hybrid CSRF token system that eliminates session writes
for anonymous visitors while maintaining backward compatibility for
authenticated users.

- Add new CSRF handler classes with strategy pattern
  - `CSRFTokenHandler`: Abstract base class
  - `CSRFSessionHandler`: Session-based tokens for authenticated users
  - `CSRFCookieHandler`: JWT + double-submit cookie for guests
- Refactor `e_session` to use CSRF handlers via `getCSRFHandler()`
- Use `$_SESSION[e_COOKIE]` to deduce auth status before USERID is
  defined
- Automatically clean up guest CSRF cookie on authentication
- Reuse existing session validation logic for request fingerprinting
- Maintain full backward compatibility with `e_TOKEN` constant

The cookie-based handler uses JWT tokens with the same TTL as the
session's and validates using the double-submit cookie pattern.
Request fingerprinting includes IP address and user agent validation
based on security level.

This change significantly reduces session storage overhead for sites
with high guest traffic by eliminating the `__form_token` session
write for anonymous visitors.

Supports: e107inc#5262
CHAP authentication is largely obsolete and unnecessary for most
sites. When the `password_CHAP` preference is disabled (which is the
default), skip generating `challenge`, `prevchallenge`, and
`prevprevchallenge` session values.

This reduces unnecessary session writes for guest users, helping
address issue e107inc#5262 where sessions were being created for anonymous
visitors.

In the unlikely case that sites that have `password_CHAP` disabled but
rely on the challenge session values being present, admins will need
to enable `password_CHAP` in their core preferences.

This change also removes the unused "ubrowser" session key that was
bundled in the CHAP challenge code generator.

Supports: e107inc#5262
When guests visit the site without performing any actions that require
session state, avoid creating persistent session storage. This
optimization:

- Skips session validation data storage for guests (no
  `_session_validate_data`)
- Skips cookie validation timer for guests (no
  `_cookie_session_validate`)
- Destroys empty sessions during shutdown to prevent disk writes

Supports: e107inc#5262
@Deltik Deltik force-pushed the fix/5262 branch 2 times, most recently from cc949b2 to dec1316 Compare August 6, 2025 09:03
@Deltik Deltik changed the title Replace session-based CAPTCHA with JSON Web Tokens Eliminate unnecessary session writes for guest users Aug 6, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work Nick! What do you think about renaming this file to e_jwt_class.php ? (to match e_file, e_parse etc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The precedent looked uncertain and inconsistent to me because we also have:

  • e_bbcode in bbcode_handler.php
  • e_form in form_handler.php
  • e_jshelper in js_helper.php
  • e_media in media_class.php
  • e_menu in menu_class.php
  • e_model in model_class.php
  • e_online in online_class.php

So I felt my way through the name jwt_handler.php. But you're the boss, so if you'd change the name jwt_handler.phpe_jwt_class.php, we can do that!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. We're kind of stuck with those old files. If you don't mind changing it to e_jwt_class that would be great. I believe PHPStorm is happier when there's a file-name/class-name match too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in fixup

Update MySQL and Xdebug sources for unit tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Question]: huge table e107_session

2 participants