Hello there,
I have now had time to upgrade our operator to 0.3.7, and I still experience the issue described in #222. Again, reverting to 0.3.5 fixes the issue.
This is on a somewhat standard Kubernetes, 1-replica setup:
---
apiVersion: ts.opentelekomcloud.com/v1alpha1
kind: TypesenseCluster
metadata:
  name: zebra-preprod
spec:
  image: mirror.gcr.io/typesense/typesense:29.0
  replicas: 1
  storage:
    size: "128Mi"
    storageClassName: csi-cinder-sc-delete-wait
  adminApiKey:
    name: typesense-zebra-preprod-bootstrap-key
  healthProbeTimeoutInMilliseconds: 10000
  incrementalQuorumRecovery: true
  resources:
    limits:
      cpu: 1
      memory: 768Mi
    requests:
      memory: 768Mi
  metrics:
    release: 'undefined'
    resources:
      limits:
        cpu: 100m
        memory: 64Mi
      requests:
        memory: 64Mi
  healthcheck:
    resources:
      limits:
        cpu: 100m
        memory: 32Mi
      requests:
        memory: 32Mi

With 0.3.7:
I20260129 09:56:20.883572 291 raft_server.cpp:605] Finished loading collections from disk.
I20260129 09:56:20.883666 291 raft_server.cpp:616] Loaded 0conversation model(s).
I20260129 09:56:20.883678 291 raft_server.cpp:620] Initializing batched indexer from snapshot state...
I20260129 09:56:20.883739 291 batched_indexer.cpp:635] Restored 0 in-flight requests from snapshot.
I20260129 09:56:20.883786 291 raft_server.cpp:633] Loaded 0 personalization model(s).
I20260129 09:56:20.883822 291 raft_server.h:294] Configuration of this group is 10.64.95.250:8107:8108
I20260129 09:56:20.883937 291 snapshot_executor.cpp:264] node default_group:10.64.67.24:8107:8108 snapshot_load_done, last_included_index: 6367974 last_included_term: 605 peers: "10.64.95.250:8107:8108"
I20260129 09:56:20.885480 263 raft_meta.cpp:521] Loaded single stable meta, path /usr/share/typesense/data/state/meta term 607 votedfor 0.0.0.0:0:0 time: 1235
I20260129 09:56:20.885545 263 node.cpp:608] node default_group:10.64.67.24:8107:8108 init, term: 607 last_log_id: (index=6367975,term=605) conf: 10.64.95.250:8107:8108 old_conf:
I20260129 09:56:20.885615 263 raft_server.cpp:141] Node last_index: 6367975
I20260129 09:56:20.885628 263 typesense_server_utils.cpp:309] Typesense peering service is running on 10.64.67.24:8107
I20260129 09:56:20.885643 263 typesense_server_utils.cpp:310] Snapshot interval configured as: 3600s
I20260129 09:56:20.885654 263 typesense_server_utils.cpp:311] Snapshot max byte count configured as: 4194304
W20260129 09:56:20.885668 263 controller.cpp:1550] SIGINT was installed with 1
I20260129 09:56:20.885769 263 raft_server.cpp:692] Term: 607, pending_queue: 0, last_index: 6367975, committed: 0, known_applied: 6367974, applying: 0, pending_writes: 0, queued_writes: 0, local_sequence: 31837413
W20260129 09:56:20.885792 263 raft_server.cpp:717] Node with no leader. Resetting peers of size: 1
W20260129 09:56:20.885810 263 node.cpp:926] node default_group:10.64.67.24:8107:8108 set_peer from 10.64.95.250:8107:8108 to 10.64.67.47:8107:8108
I20260129 09:56:20.891244 263 raft_meta.cpp:546] Saved single stable meta, path /usr/share/typesense/data/state/meta term 608 votedfor 0.0.0.0:0:0 time: 5383
I20260129 09:56:26.165112 285 node.cpp:1579] node default_group:10.64.67.24:8107:8108 term 608 start pre_vote
W20260129 09:56:26.165217 285 node.cpp:1589] node default_group:10.64.67.24:8107:8108 can't do pre_vote as it is not in 10.64.67.47:8107:8108
> ..... More log lines, but check the timestamps
I20260129 09:57:54.755228 287 node.cpp:1579] node default_group:10.64.67.24:8107:8108 term 608 start pre_vote
W20260129 09:57:54.755301 287 node.cpp:1589] node default_group:10.64.67.24:8107:8108 can't do pre_vote as it is not in 10.64.67.47:8107:8108
I20260129 09:57:59.898531 287 node.cpp:1579] node default_group:10.64.67.24:8107:8108 term 608 start pre_vote
W20260129 09:57:59.898599 287 node.cpp:1589] node default_group:10.64.67.24:8107:8108 can't do pre_vote as it is not in 10.64.67.47:8107:8108
I20260129 09:58:00.904165 263 raft_server.cpp:692] Term: 608, pending_queue: 0, last_index: 6367975, committed: 0, known_applied: 6367974, applying: 0, pending_writes: 0, queued_writes: 0, local_sequence: 31837413
W20260129 09:58:00.904224 263 raft_server.cpp:717] Node with no leader. Resetting peers of size: 1
I20260129 09:58:03.886561 1 typesense_server_utils.cpp:60] Stopping Typesense server...
> … until it dies
The same single-replica cluster, on 0.3.5:
I20260129 10:02:17.028793 287 raft_server.cpp:605] Finished loading collections from disk.
I20260129 10:02:17.028932 287 raft_server.cpp:616] Loaded 0conversation model(s).
I20260129 10:02:17.028945 287 raft_server.cpp:620] Initializing batched indexer from snapshot state...
I20260129 10:02:17.028995 287 batched_indexer.cpp:635] Restored 0 in-flight requests from snapshot.
I20260129 10:02:17.029036 287 raft_server.cpp:633] Loaded 0 personalization model(s).
I20260129 10:02:17.029070 287 raft_server.h:294] Configuration of this group is 10.64.95.250:8107:8108
I20260129 10:02:17.029173 287 snapshot_executor.cpp:264] node default_group:10.64.67.131:8107:8108 snapshot_load_done, last_included_index: 6367974 last_included_term: 605 peers: "10.64.95.250:8107:8108"
I20260129 10:02:17.030550 263 raft_meta.cpp:521] Loaded single stable meta, path /usr/share/typesense/data/state/meta term 610 votedfor 0.0.0.0:0:0 time: 1179
I20260129 10:02:17.030592 263 node.cpp:608] node default_group:10.64.67.131:8107:8108 init, term: 610 last_log_id: (index=6367975,term=605) conf: 10.64.95.250:8107:8108 old_conf:
I20260129 10:02:17.030647 263 raft_server.cpp:141] Node last_index: 6367975
I20260129 10:02:17.030664 263 typesense_server_utils.cpp:309] Typesense peering service is running on 10.64.67.131:8107
I20260129 10:02:17.030674 263 typesense_server_utils.cpp:310] Snapshot interval configured as: 3600s
I20260129 10:02:17.030683 263 typesense_server_utils.cpp:311] Snapshot max byte count configured as: 4194304
W20260129 10:02:17.030730 263 controller.cpp:1550] SIGINT was installed with 1
I20260129 10:02:17.030818 263 raft_server.cpp:692] Term: 610, pending_queue: 0, last_index: 6367975, committed: 0, known_applied: 6367974, applying: 0, pending_writes: 0, queued_writes: 0, local_sequence: 31837413
W20260129 10:02:17.030834 263 raft_server.cpp:717] Node with no leader. Resetting peers of size: 1
W20260129 10:02:17.030848 263 node.cpp:926] node default_group:10.64.67.131:8107:8108 set_peer from 10.64.95.250:8107:8108 to 10.64.67.197:8107:8108
I20260129 10:02:17.036015 263 raft_meta.cpp:546] Saved single stable meta, path /usr/share/typesense/data/state/meta term 611 votedfor 0.0.0.0:0:0 time: 5138
I20260129 10:02:22.091174 278 node.cpp:1579] node default_group:10.64.67.131:8107:8108 term 611 start pre_vote
W20260129 10:02:22.091253 278 node.cpp:1589] node default_group:10.64.67.131:8107:8108 can't do pre_vote as it is not in 10.64.67.197:8107:8108
> … it tries for some time, then finally:
W20260129 10:03:37.046319 263 raft_server.cpp:717] Node with no leader. Resetting peers of size: 1
W20260129 10:03:37.046335 263 node.cpp:926] node default_group:10.64.67.131:8107:8108 set_peer from 10.64.67.197:8107:8108 to 10.64.67.131:8107:8108
I20260129 10:03:37.054266 263 raft_meta.cpp:546] Saved single stable meta, path /usr/share/typesense/data/state/meta term 612 votedfor 0.0.0.0:0:0 time: 7861
I20260129 10:03:39.442730 278 node.cpp:1579] node default_group:10.64.67.131:8107:8108 term 612 start pre_vote
I20260129 10:03:39.442939 278 node.cpp:1645] node default_group:10.64.67.131:8107:8108 term 612 start vote and grant vote self
I20260129 10:03:39.448323 287 raft_meta.cpp:546] Saved single stable meta, path /usr/share/typesense/data/state/meta term 613 votedfor 10.64.67.131:8107:8108 time: 5227
I20260129 10:03:39.448383 287 node.cpp:1899] node default_group:10.64.67.131:8107:8108 term 613 become leader of group 10.64.67.131:8107:8108
I20260129 10:03:39.453203 287 raft_server.h:294] Configuration of this group is 10.64.67.131:8107:8108
I20260129 10:03:39.453270 287 node.cpp:3298] node default_group:10.64.67.131:8107:8108 reset ConfigurationCtx, new_peers: 10.64.67.131:8107:8108, old_peers: 10.64.67.131:8107:8108
I20260129 10:03:39.453289 287 raft_server.h:277] Node becomes leader, term: 613
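To make the difference between the two runs easier to spot in longer logs, here is a minimal sketch that extracts the `set_peer` events and reports whether the node's own IP appears in the new peer list. The function name `peer_resets` and the regular expression are illustrative, not part of Typesense or the operator, and the sketch assumes that multiple peers would be comma-separated in the `ip:port:port` form seen above.

```python
import re

# Matches lines such as:
#   node default_group:10.64.67.24:8107:8108 set_peer from A to B
# The pattern is an assumption based only on the log lines in this report.
SET_PEER = re.compile(
    r"node \S+?:(?P<self>\d+\.\d+\.\d+\.\d+):\d+:\d+ set_peer "
    r"from (?P<old>\S+) to (?P<new>\S+)"
)


def peer_resets(log_text):
    """Return (own_ip, new_peers, own_ip_in_new_peers) for each set_peer line."""
    events = []
    for line in log_text.splitlines():
        match = SET_PEER.search(line)
        if not match:
            continue
        own_ip = match.group("self")
        new_peers = match.group("new")
        # Assumed format: comma-separated "ip:port:port" entries.
        new_ips = [peer.split(":")[0] for peer in new_peers.split(",")]
        events.append((own_ip, new_peers, own_ip in new_ips))
    return events


if __name__ == "__main__":
    sample = (
        "W20260129 09:56:20.885810 263 node.cpp:926] node default_group:10.64.67.24:8107:8108 "
        "set_peer from 10.64.95.250:8107:8108 to 10.64.67.47:8107:8108\n"
        "W20260129 10:03:37.046335 263 node.cpp:926] node default_group:10.64.67.131:8107:8108 "
        "set_peer from 10.64.67.197:8107:8108 to 10.64.67.131:8107:8108\n"
    )
    for own_ip, new_peers, ok in peer_resets(sample):
        print(f"{own_ip} -> {new_peers} (own IP in peers: {ok})")
```

In the excerpts above, the only `set_peer` line in the 0.3.7 log points the node at an address other than its own, while the 0.3.5 log eventually shows a `set_peer` to the node's own address, right before it wins the election.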
Happy to provide more input or logs. I also manage a second, 3-node cluster that exhibits the same issue.