initial scim implementation by fcollman · Pull Request #75 · CAVEconnectome/middle_auth

fcollman · 2026-02-06T05:50:05Z

This adds a blueprint to middle_auth that implements SCIM 2.0 (https://scim.cloud/), so that middle_auth users can be provisioned and arranged into groups by a SCIM-compatible server.

Note

High Risk
Adds new externally-facing provisioning endpoints and modifies database schema/migrations plus auth gating behavior, so mistakes could impact user/group/dataset integrity or allow unintended access if misconfigured.

Overview
Introduces a new neuroglancer_auth.scim blueprint mounted at /{URL_PREFIX}/scim/v2 implementing SCIM 2.0 discovery endpoints and full CRUD/PATCH behavior for Users, Groups, and a custom Dataset resource, including SCIM-compliant error responses and RFC-style filtering.

Adds persistent SCIM identifiers by extending the user, group, and dataset models/tables with scim_id (deterministic UUID5) and optional external_id, plus Alembic migrations to backfill existing rows and create indexes/uniqueness constraints. Updates creation flows to auto-generate scim_id on insert and expands User.update() to allow email changes.
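
As an illustration of what a deterministic UUID5 scim_id means in practice, here is a minimal sketch. The namespace value and the "type:id" naming scheme are assumptions made for this example; the namespace and name format actually used in the PR are not shown on this page.

```python
import uuid

# Hypothetical namespace; the one used by middle_auth is not shown here.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")


def make_scim_id(resource_type: str, internal_id: int) -> str:
    """Derive a stable SCIM id from a resource type and a database id."""
    return str(uuid.uuid5(EXAMPLE_NAMESPACE, f"{resource_type}:{internal_id}"))


# The same inputs always produce the same id, so a backfill migration and
# the insert path can agree without storing any extra state.
first = make_scim_id("user", 42)
second = make_scim_id("user", 42)
other = make_scim_id("group", 42)
```

Including the resource type in the hashed name keeps a user and a group with the same integer id from colliding.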

Adds SCIM integration test infrastructure: GitHub Actions workflow (Postgres/Redis services) to run migrations, start the app with auth bypass, and execute pytest integration tests; plus Docker Compose + dedicated test runner image/config and new test dependencies (scim2-filter-parser, pytest, requests).

Written by Cursor Bugbot for commit b99ac5b. This will update automatically on new commits.


cursor · 2026-02-06T20:52:05Z

neuroglancer_auth/scim/filter.py

+        "ne": lambda attr, val: attr != val,
+        "co": lambda attr, val: attr.ilike(f"%{val}%"),  # contains (case-insensitive)
+        "sw": lambda attr, val: attr.ilike(f"{val}%"),  # starts with
+        "ew": lambda attr, val: attr.ilike(f"%{val}"),  # ends with


LIKE wildcards in filter values not escaped

Low Severity

The co, sw, and ew filter operators pass user-supplied values directly into ilike patterns without escaping SQL LIKE wildcards (% and _). A filter like userName co "%" generates the pattern %%%, matching all records instead of only those containing a literal percent sign. This produces incorrect filter results.
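
One possible fix, sketched under the assumption that a backslash is used as the LIKE escape character: escape the wildcards in the user-supplied value before building the pattern. The helper names below are hypothetical and not taken from the PR.

```python
def escape_like(value: str, escape: str = "\\") -> str:
    """Escape LIKE metacharacters so they match literally."""
    # Escape the escape character first so later replacements are not doubled.
    value = value.replace(escape, escape + escape)
    value = value.replace("%", escape + "%")
    return value.replace("_", escape + "_")


def contains_pattern(value: str) -> str:
    """Build the pattern for the SCIM `co` (contains) operator."""
    return f"%{escape_like(value)}%"


# With SQLAlchemy the pattern would be used roughly as:
#   attr.ilike(contains_pattern(val), escape="\\")
pattern = contains_pattern("%")
```

The `escape` argument has to be passed to `ilike` as well, otherwise the database does not know that the backslash is meant as an escape character.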


cursor · 2026-02-07T17:41:57Z

neuroglancer_auth/model/user.py


    def update(self, data):
-        user_fields = ["admin", "name", "pi", "gdpr_consent", "read_only"]
+        user_fields = ["admin", "name", "pi", "gdpr_consent", "read_only", "email"]


Adding email to update fields affects existing endpoints

Medium Severity

Adding "email" to user_fields in User.update() changes behavior of the existing modify_user_route and modify_service_account_route admin endpoints, which pass raw request JSON to user.update(data). Previously, an email key in the request body was silently ignored; now it updates the email. Those endpoints don't catch IntegrityError, so a duplicate email triggers an unhandled 500 error. This also subtly broadens what admin PUT requests can modify.

Additional Locations (2)

neuroglancer_auth/server.py#L586-L587

neuroglancer_auth/server.py#L940-L941
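
A possible way to keep the legacy endpoints unchanged is to make email an opt-in field and to check uniqueness before the update. This is only a sketch: `allowed_updates`, `ensure_email_free`, and the in-memory `emails_by_user_id` mapping (standing in for a database lookup) are hypothetical names, not part of middle_auth.

```python
GENERIC_FIELDS = ("admin", "name", "pi", "gdpr_consent", "read_only")


class DuplicateEmailError(ValueError):
    """The requested email already belongs to another user."""


def allowed_updates(data, allow_email=False):
    """Keep only the keys this caller is permitted to change."""
    allowed = GENERIC_FIELDS + (("email",) if allow_email else ())
    return {key: value for key, value in data.items() if key in allowed}


def ensure_email_free(new_email, user_id, emails_by_user_id):
    """Raise DuplicateEmailError if another user already has new_email."""
    wanted = new_email.casefold()  # assumption: emails compare case-insensitively
    for other_id, email in emails_by_user_id.items():
        if other_id != user_id and email.casefold() == wanted:
            raise DuplicateEmailError(new_email)


payload = {"name": "Ada", "email": "ada@example.org"}
legacy_view = allowed_updates(payload)                  # email is dropped
scim_view = allowed_updates(payload, allow_email=True)  # email is kept
```

With this split, only the SCIM handlers would opt in to email changes, and a conflict can be turned into a 409 response instead of an unhandled IntegrityError.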

cursor · 2026-02-07T17:41:57Z

neuroglancer_auth/scim/routes.py

+                                    UserGroup.add(user.id, group.id)
+                                    user.update_cache()  # Update user cache after group change
+                                except sqlalchemy.exc.IntegrityError:
+                                    pass  # Already in group


Missing session rollback after caught IntegrityError breaks subsequent operations

High Severity

When UserGroup.add(), GroupDatasetPermission.add(), or ServiceTable.add() throws an IntegrityError (because the record already exists), their internal db.session.commit() fails and puts the SQLAlchemy session into a "pending rollback" state. The except IntegrityError: pass blocks never call db.session.rollback(), so any subsequent database operation (e.g., the next loop iteration's find_user_by_scim_identifier query) raises PendingRollbackError, crashing the request with a 500 error. This pattern is repeated across ~12 locations in this file.

Additional Locations (2)

neuroglancer_auth/scim/routes.py#L886-L891

neuroglancer_auth/scim/routes.py#L1157-L1163
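
Since the same pattern appears in many places, one option is a small helper that always rolls back when it swallows a duplicate. The sketch below uses a stub session and a stand-in exception so that it runs without a database; in the real code the session would be `db.session` and `duplicate_error` would be `sqlalchemy.exc.IntegrityError`. The helper name `add_if_absent` is hypothetical.

```python
class DuplicateRow(Exception):
    """Stand-in for sqlalchemy.exc.IntegrityError in this sketch."""


def add_if_absent(session, add, *args, duplicate_error=DuplicateRow):
    """Call add(*args); on a duplicate, roll back and report False."""
    try:
        add(*args)
        return True
    except duplicate_error:
        # Without this rollback the session stays in a failed state and
        # the next query on it raises.
        session.rollback()
        return False


class StubSession:
    """Tiny stub that only counts rollbacks; not a real SQLAlchemy session."""

    def __init__(self):
        self.rollbacks = 0

    def rollback(self):
        self.rollbacks += 1


memberships = set()


def add_membership(user_id, group_id):
    """Toy replacement for UserGroup.add() that rejects duplicates."""
    if (user_id, group_id) in memberships:
        raise DuplicateRow("already in group")
    memberships.add((user_id, group_id))


session = StubSession()
results = [add_if_absent(session, add_membership, 1, 7) for _ in range(2)]
```

Note that a rollback discards all uncommitted work in the session, so this is only safe where the `add()` helpers commit on their own, as the comment above describes.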


cursor · 2026-02-07T22:43:19Z

migrations/versions/15ae05f61e12_add_scim_fields.py

+    op.drop_index("ix_group_external_id", table_name="group")
+    op.drop_index("ix_group_scim_id", table_name="group")
+    op.drop_index("ix_user_external_id", table_name="user")
+    op.drop_index("ix_user_scim_id", table_name="user")


First migration downgrade drops indexes from second migration

Medium Severity

The downgrade() of migration 15ae05f61e12 drops indexes (e.g., ix_user_scim_id) that are only created by the subsequent migration a1b2c3d4e5f6. When rolling back sequentially, the second migration's downgrade() drops these indexes first, then the first migration's downgrade() tries to drop them again, causing a failure. The index drops belong only in the second migration's downgrade().

Additional Locations (1)

migrations/versions/a1b2c3d4e5f6_populate_scim_ids_and_add_constraints.py#L68-L81

cursor · 2026-02-07T22:43:19Z

neuroglancer_auth/scim/routes.py

+                        ug = UserGroup.get(group.id, member_user.id)
+                        if ug:
+                            db.session.delete(ug)
+                            affected_users.add(member_user.id)


Remove-all-members PATCH misses path == "members" with no value

Medium Severity

In patch_group, the "remove all members" logic (elif value is None at line 1281) is only reachable when path.startswith("members["), not when path == "members". Per SCIM RFC 7644, {"op": "remove", "path": "members"} (with no value) means remove all members. But when path == "members" and value is None, the isinstance(value, list) check on line 1275 is False, nothing happens, and the elif value is None branch is skipped because it's an elif to the if path == "members" branch.
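
The intended branching can be sketched as a pure function over a set of member ids. This is a simplified illustration, not the PR's `patch_group`: the function name is hypothetical and only the `members[value eq "..."]` filter form is recognised.

```python
import re

# Matches paths such as: members[value eq "abc"]
_MEMBER_FILTER = re.compile(r'^members\[value eq "([^"]+)"\]$')


def members_after_remove(current, path, value=None):
    """Return the member ids left after one SCIM `remove` operation."""
    current = set(current)
    if path == "members":
        if value is None:
            return set()  # no value given: remove every member
        if not isinstance(value, list):
            raise ValueError("value must be a list of member objects")
        return current - {member["value"] for member in value}
    match = _MEMBER_FILTER.match(path)
    if match:
        return current - {match.group(1)}
    raise ValueError(f"unsupported path: {path}")
```

Here the `value is None` check sits inside the `path == "members"` branch, so `{"op": "remove", "path": "members"}` clears the group instead of falling through.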

cursor · 2026-02-07T22:43:19Z

neuroglancer_auth/scim/routes.py

+    resource = DatasetSCIMSerializer.to_scim(dataset)
+    response = flask.jsonify(resource)
+    response.headers["Content-Type"] = "application/scim+json"
+    return response


Dataset PATCH operations lack atomicity and error handling

Medium Severity

Unlike patch_user and patch_group which wrap all operations in a try/except and commit once atomically at the end, patch_dataset commits after each individual operation (via dataset.update(), ServiceTable.add(), db.session.commit()). If a multi-operation PATCH partially fails, the database is left in an inconsistent state. There's also no error handling — any IntegrityError or other exception propagates as an unhandled 500 instead of a SCIM-compliant error response.
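
The all-or-nothing behaviour can be illustrated without a database by applying every operation to a copy and only handing the result back if all of them succeed. The names below are hypothetical; with SQLAlchemy the equivalent would be a single commit at the end and a rollback in the except block.

```python
import copy


def apply_patch_atomically(state, operations, handlers):
    """Apply every operation to a copy; return it only if all succeed."""
    working = copy.deepcopy(state)
    for op in operations:
        handler = handlers[op["op"].lower()]  # KeyError for unknown ops
        handler(working, op.get("path"), op.get("value"))
    return working  # the caller persists this in one commit


def _replace(resource, path, value):
    """Toy `replace` handler that rejects unknown attributes."""
    if path not in resource:
        raise KeyError(path)
    resource[path] = value


HANDLERS = {"replace": _replace}

dataset = {"name": "ds1"}
renamed = apply_patch_atomically(
    dataset, [{"op": "replace", "path": "name", "value": "ds2"}], HANDLERS
)

try:
    # The second operation fails, so the first must not take effect either.
    apply_patch_atomically(
        dataset,
        [
            {"op": "replace", "path": "name", "value": "ds3"},
            {"op": "replace", "path": "missing", "value": 1},
        ],
        HANDLERS,
    )
    partial_failure = False
except KeyError:
    partial_failure = True
```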

cursor · 2026-02-10T16:31:30Z

migrations/versions/a1b2c3d4e5f6_populate_scim_ids_and_add_constraints.py

+        connection.execute(
+            sa.text("UPDATE group SET scim_id = :scim_id WHERE id = :id"),
+            {"scim_id": scim_id, "id": group_id}
+        )


Migration raw SQL uses unquoted reserved word group

High Severity

The raw SQL queries use unquoted group and user table names via sa.text(). In PostgreSQL (this project uses psycopg2), group is a fully reserved keyword (part of GROUP BY), so SELECT id FROM group and UPDATE group SET ... will raise a syntax error. The user table has the same risk. These identifiers need to be double-quoted in the raw SQL strings (e.g., "\"group\"", "\"user\""). Alembic's op.add_column() in the first migration handles quoting automatically, but sa.text() does not; it passes SQL through verbatim.

Additional Locations (1)

migrations/versions/a1b2c3d4e5f6_populate_scim_ids_and_add_constraints.py#L27-L34
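
The effect of the missing quotes can be reproduced with the standard library. The sketch below uses SQLite purely as a stand-in, because it also treats GROUP as a keyword; PostgreSQL raises a different exception type, but the cause and the fix (double quotes around the identifier) are the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "group" (id INTEGER PRIMARY KEY, scim_id TEXT)')
conn.execute('INSERT INTO "group" (id) VALUES (1)')


def set_scim_id(sql):
    """Run an UPDATE statement with the same named parameters as the migration."""
    conn.execute(sql, {"scim_id": "abc", "id": 1})


try:
    # Unquoted reserved word: the statement does not even parse.
    set_scim_id("UPDATE group SET scim_id = :scim_id WHERE id = :id")
    unquoted_failed = False
except sqlite3.OperationalError:
    unquoted_failed = True

# Quoted identifier: parses and updates the row.
set_scim_id('UPDATE "group" SET scim_id = :scim_id WHERE id = :id')
stored = conn.execute('SELECT scim_id FROM "group" WHERE id = 1').fetchone()[0]
```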

cursor · 2026-02-10T16:31:31Z

neuroglancer_auth/model/user.py


    def update(self, data):
-        user_fields = ["admin", "name", "pi", "gdpr_consent", "read_only"]
+        user_fields = ["admin", "name", "pi", "gdpr_consent", "read_only", "email"]


Adding email to User.update() changes existing API behavior

Medium Severity

Adding "email" to user_fields in User.update() introduces a side effect: existing non-SCIM admin routes (modify_user_route, modify_service_account_route) now accept email changes from the raw JSON request body. Those routes pass flask.request.json directly to user.update() without checking email uniqueness, so a duplicate email will trigger an unhandled IntegrityError (500). Previously, email in the request body was silently ignored.

Additional Locations (1)

neuroglancer_auth/server.py#L586-L587

cursor · 2026-02-16T05:00:28Z

neuroglancer_auth/model/user.py


    def update(self, data):
-        user_fields = ["admin", "name", "pi", "gdpr_consent", "read_only"]
+        user_fields = ["admin", "name", "pi", "gdpr_consent", "read_only", "email"]


Email updates now bypass endpoint safeguards

Medium Severity

Adding email to User.update() makes legacy admin endpoints mutate email through a generic path that lacks duplicate-email handling. A conflicting update now raises a database IntegrityError outside SCIM handlers, turning a normal validation case into a server error path.

cursor · 2026-02-16T05:00:28Z

neuroglancer_auth/scim/routes.py

+                service_name=st["serviceName"],
+                table_name=st["tableName"],
+                dataset=dataset.name,
+            )


Dataset creation is not atomic

Medium Severity

create_dataset() persists the dataset via create_dataset_with_scim() before creating ServiceTable rows. If a later ServiceTable.add() fails, the request can return an error even though the dataset was already committed, leaving partial state and misleading conflict behavior.

Additional Locations (1)

neuroglancer_auth/model/dataset.py#L33-L45

cursor · 2026-02-16T05:00:28Z

neuroglancer_auth/scim/routes.py

+                _handle_user_add(user, path, value)
+            elif op_type == "remove":
+                _handle_user_remove(user, path, value)
+


Invalid PATCH operations succeed silently

Medium Severity

PATCH handlers never validate op values. Unknown operations are skipped and the request still returns success, so clients can believe updates were applied when nothing changed. This produces silent data drift for Users, Groups, and Datasets.

Additional Locations (2)

neuroglancer_auth/scim/routes.py#L1362-L1374

neuroglancer_auth/scim/routes.py#L1859-L1870
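
A validation pass before any handler runs would reject such requests up front. The sketch below is one way to do it; `ScimPatchError` and `validate_operations` are hypothetical names, and lower-casing the op (so that clients sending "Add" are accepted) is an assumption about the desired leniency.

```python
VALID_OPS = {"add", "remove", "replace"}


class ScimPatchError(ValueError):
    """Carries what is needed to build a SCIM 400 error response."""

    status = 400
    scim_type = "invalidSyntax"


def validate_operations(operations):
    """Return the normalized operations, or raise if any op is not recognised."""
    if not isinstance(operations, list) or not operations:
        raise ScimPatchError("Operations must be a non-empty list")
    normalized = []
    for index, op in enumerate(operations):
        if not isinstance(op, dict):
            raise ScimPatchError(f"Operations[{index}] must be an object")
        name = str(op.get("op", "")).lower()
        if name not in VALID_OPS:
            raise ScimPatchError(
                f"Operations[{index}]: unsupported op {op.get('op')!r}"
            )
        normalized.append({**op, "op": name})
    return normalized
```

Calling this once at the top of each PATCH handler means an unknown op produces a client error before anything is modified.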

cursor · 2026-02-16T05:00:28Z

neuroglancer_auth/scim/routes.py

+                service_name=st["serviceName"],
+                table_name=st["tableName"],
+                dataset=dataset.name,
+            )


Malformed serviceTables can trigger server error

Low Severity

create_dataset() indexes serviceTables entries with st["serviceName"] and st["tableName"] without validating structure. Missing keys raise KeyError, which is not caught in this handler, causing a 500 instead of a SCIM client error response.
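
A structural check before indexing would turn this into a client error. The following is a sketch only; `service_table_errors` is a hypothetical helper, and requiring non-empty strings for both keys is an assumption about what the handler expects.

```python
REQUIRED_KEYS = ("serviceName", "tableName")


def service_table_errors(service_tables):
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(service_tables, list):
        return ["serviceTables must be a list"]
    errors = []
    for index, entry in enumerate(service_tables):
        if not isinstance(entry, dict):
            errors.append(f"serviceTables[{index}] must be an object")
            continue
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            if not isinstance(value, str) or not value:
                errors.append(f"serviceTables[{index}].{key} is required")
    return errors
```

If the returned list is non-empty, the handler can respond with a 400 SCIM error that includes the messages, and no ServiceTable row is created.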

cursor · 2026-02-16T05:00:28Z

neuroglancer_auth/scim/utils.py

+
+    base_url = os.environ.get("SCIM_BASE_URL")
+    if base_url:
+        return base_url.rstrip("/")


SCIM base URL duplicates version segment

Medium Severity

When SCIM_BASE_URL already includes /v2, get_base_url() returns it unchanged and serializers append another /v2, producing invalid meta.location and Location URLs like /v2/v2/.... This breaks resource references returned by POST and discovery responses.

Additional Locations (1)

neuroglancer_auth/scim/serializers.py#L65-L67
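
One way to make the result independent of how SCIM_BASE_URL is written is to normalize it so that it ends in exactly one version segment. The helper names and the example URLs below are hypothetical.

```python
def scim_root(base_url: str, version: str = "v2") -> str:
    """Return a base URL that ends in exactly one version segment."""
    trimmed = base_url.rstrip("/")
    suffix = "/" + version
    if trimmed.endswith(suffix):
        trimmed = trimmed[: -len(suffix)]
    return trimmed + suffix


def user_location(base_url: str, scim_id: str) -> str:
    """Build a meta.location value for a User resource."""
    return f"{scim_root(base_url)}/Users/{scim_id}"


with_version = user_location("https://example.org/auth/scim/v2/", "abc")
without_version = user_location("https://example.org/auth/scim", "abc")
```

Both configurations then yield the same location, so the serializers no longer need to know whether the operator included /v2.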

chrisj · 2026-02-17T03:52:46Z

neuroglancer_auth/scim/routes.py

+    return build_error_response(404, "NOT_FOUND", f"User {scim_id} not found")
+
+
+def _sanitize_pi_field(pi_value):


this seems verbose

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


cursor · 2026-02-17T04:39:37Z

neuroglancer_auth/scim/auth.py

+
+            # If it's an auth error (400/401/403), ensure it's SCIM-formatted
+            if status_code in [400, 401, 403]:
+                # Check if already SCIM-formatted


SCIM auth can return OAuth redirects

Medium Severity

scim_auth_required delegates to auth_required, but only rewrites 400/401/403 responses. When no Authorization header is sent, auth_required can return a 302 OAuth redirect, so SCIM endpoints emit non-SCIM redirects instead of SCIM auth errors.
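
A possible approach is to treat any redirect from the underlying auth check as "not authenticated" and answer with a SCIM error body instead. The sketch below works on bare status codes so that it runs without Flask; the function names and the choice of 401 for redirects are assumptions, while the error schema URN is the one defined by SCIM 2.0.

```python
SCIM_ERROR_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:Error"


def scim_error(status: int, detail: str) -> dict:
    """Build a SCIM error body; `status` is a string per the SCIM error format."""
    return {"schemas": [SCIM_ERROR_SCHEMA], "status": str(status), "detail": detail}


def normalize_auth_result(status_code: int, detail: str = ""):
    """Map an upstream auth status to (status, body), or None to pass through."""
    if 300 <= status_code < 400:
        # A redirect to the OAuth flow is meaningless to a SCIM client.
        return 401, scim_error(401, "Authentication required")
    if status_code in (400, 401, 403):
        return status_code, scim_error(status_code, detail or "Request not authorized")
    return None


redirect_result = normalize_auth_result(302)
ok_result = normalize_auth_result(200)
```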

cursor · 2026-02-17T04:39:38Z

neuroglancer_auth/scim/utils.py

+
+    base_url = os.environ.get("SCIM_BASE_URL")
+    if base_url:
+        return base_url.rstrip("/")


SCIM locations can include duplicated version segment

Medium Severity

get_base_url returns SCIM_BASE_URL as-is, but serializers/routes append "/v2/..." to that value. With SCIM_BASE_URL set to a value already ending in /v2, generated meta.location URLs become .../v2/v2/..., producing invalid resource links.

Additional Locations (1)

test_config.py#L33-L34

chrisj · 2026-02-17T18:56:39Z

The test is failing because it looks for test_config.py in neuroglancer_auth.

initial implementation

bb22207

cursor bot reviewed Feb 6, 2026


fcollman added 5 commits February 5, 2026 22:15

fix missing import os

4242003

fixing bad header to return scim compatible error

7122ccd

removing idempotent creation, returning conflict error instead

9a32f21

fixing filter case handling

3e35364

bugfixes

156c2a5

cursor bot reviewed Feb 6, 2026


fcollman added 4 commits February 6, 2026 07:10

fixing auth check

1764f12

fixing no PI logic in user creation, patch and update

af47570

fixing scim ID namespace, and removing fallback code

840c9a1

fixing removal of group membership when deleting groups

85a34c9

cursor bot reviewed Feb 6, 2026


fcollman added 3 commits February 6, 2026 07:24

make deleting a dataset more complete

6e4d2ad

fixing group PUT to replace all

2615145

fix present filter

6050404

cursor bot reviewed Feb 6, 2026


fcollman added 3 commits February 6, 2026 11:51

fixing dataset external_id update behavior

5ddbe3d

capture remove with patterns

1ecd1b4

fixing cache update on group updates

ac01535

cursor bot reviewed Feb 6, 2026


fcollman added 3 commits February 6, 2026 12:55

fixing group delete commit

ea67cf2

fixing scim root url finding

f3e396b

switching to scim parsing library

aaba21b

cursor bot reviewed Feb 6, 2026


cursor bot reviewed Feb 6, 2026


removing default inclusion of members and permissions

8b95bf5

cursor bot reviewed Feb 6, 2026


add proper error handling

cca93b7

cursor bot reviewed Feb 6, 2026


cursor bot reviewed Feb 7, 2026


fix typo in filter

e495023

cursor bot reviewed Feb 7, 2026


fcollman added 5 commits February 7, 2026 09:17

fixing pagination error handling

739ce6c

finishing pagination error and updating external id

ec27e25

fix count=0 behavior

87f65f8

allowing email address changes with checking for uniqueness constraint

3d334ed

fixing case insensitivity

9bbbc71

cursor bot reviewed Feb 7, 2026


fcollman added 4 commits February 7, 2026 14:15

making scim_id appearance more reproducible

c2d6645

making patches atomic

3b2de11

remove nullable=False constraint for now

5eca2c1

fix typo

650b574

cursor bot reviewed Feb 7, 2026


fcollman added 8 commits February 10, 2026 07:51

fixing downgrade script

52a3047

fixing filter mapping

e96b8ec

removing fallback serialization

c97ea76

consolidating scim id generation to model methods

6aaa328

refactoring patch to functions

d65e57b

removing shared lexer and parser

74c876e

adding AUTH_DISABLED function to scim auth wrapper

0b95b8d

fixing content type on response

ceaf2ca

cursor bot reviewed Feb 10, 2026


fcollman added 2 commits February 11, 2026 21:01

fix quotes on sql

7979ee8

adding tests

f3452c1

cursor bot reviewed Feb 16, 2026


chrisj reviewed Feb 17, 2026


adding test_config

b99ac5b

cursor bot reviewed Feb 17, 2026
