Skip to content

Conversation

@Yuval-Ariel
Copy link
Collaborator

With a setting of multiple cfs and WriteBufferManager with allow_stall, the DB can enter a deadlock when the WBM initiates a stall. This happens since only the oldest cf is picked for flush when HandleWriteBufferManagerFlush is called to flush the data and prevent the stall. When using multiple CFs, this does not ensure the FreeMem will evict enough memory to prevent a stall and no other flush is scheduled.

To fix this, add cfs to the flush queue so that we'll be below the mutable_limit_.

closes #857

With a setting of multiple cfs and WriteBufferManager with allow_stall,
the DB can enter a deadlock when the WBM initiates a stall.
This happens since only the oldest cf is picked for flush when
HandleWriteBufferManagerFlush is called to flush the data and prevent the stall.
When using multiple CFs, this does not ensure the FreeMem will evict
enough memory to prevent a stall and no other flush is scheduled.

To fix this, add cfs to the flush queue so that we'll be below the mutable_limit_.
@Yuval-Ariel Yuval-Ariel added the bug fix Fixes a known bug label Apr 14, 2024
@Yuval-Ariel Yuval-Ariel requested a review from ofriedma April 14, 2024 13:52
@Yuval-Ariel Yuval-Ariel self-assigned this Apr 14, 2024
Copy link
Contributor

@ofriedma ofriedma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a permanent stall using this command:
./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_concurrent_memtable_write=0 --allow_data_in_errors=True --allow_wbm_stalls=1 --async_io=0 --avoid_flush_during_recovery=1 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=1.3037923446511857 --bottommost_compression_type=none --bytes_per_sync=0 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=1 --charge_filter_construction=0 --charge_table_reader=0 --checkpoint_one_in=0 --checksum_type=kxxHash --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=0 --compaction_ttl=1000 --compare_full_db_state_snapshot=0 --compression_max_dict_buffer_bytes=511 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_use_zstd_dict_trainer=0 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --create_timestamped_snapshot_one_in=20 --customopspercent=0 --data_block_index_type=0 --db=/tmp/rocksdb_crashtest_blackboxod86ddmf --db_write_buffer_size=67108864 --delpercent=19 --delrangepercent=0 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=0 --expected_values_dir=/tmp/rocksdb_crashtest_expected_vy0z2l_r --fail_if_options_file_error=0 --fifo_allow_compaction=0 --file_checksum_impl=crc32c --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=100000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=4 --index_type=0 --ingest_external_file_one_in=0 --initial_auto_readahead_size=16384 --initiate_wbm_flushes=0 --iterpercent=7 --key_len_percent_dist=100 --level_compaction_dynamic_level_bytes=True --lock_wal_one_in=1000000 --long_running_snapshots=0 --manual_wal_flush_one_in=1000 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=16384 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=102400 --max_key_len=1 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=64 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=10485760000 --memtable_prefix_bloom_size_ratio=0 --memtable_protection_bytes_per_key=2 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=30 --num_file_reads_for_auto_readahead=2 --num_iterations=25 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=1 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=1000000 --periodic_compaction_seconds=10 --pinning_policy=speedb_scoped_pinning_policy --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=1 --preserve_internal_time_seconds=36000 --progress_reports=0 --read_fault_one_in=32 --readahead_size=0 --readpercent=28 --recycle_log_file_num=0 --reopen=0 --ribbon_starting_level=999 --secondary_cache_fault_one_in=32 --secondary_cache_uri= --seed=3618268266 --set_options_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=1048576 --start_delay_percent=22 --stats_dump_period_sec=600 --subcompactions=2 --sync=0 --sync_fault_injection=1 --sync_wal_one_in=100000 --target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=3 --txn_write_policy=0 --unordered_write=0 --unpartitioned_pinning=3 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_dynamic_delay=1 --use_full_merge_v1=False --use_get_entity=1 --use_merge=1 --use_multiget=1 --use_put_entity_one_in=0 --use_txn=1 --user_timestamp_size=0 --value_size_mult=32 --verify_before_write=False --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=none --write_buffer_size=1073741824 --write_dbid_to_manifest=1 --writepercent=46

and memory_usage() instead of mutable_memtable_memory_usage()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix Fixes a known bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

db_stress gets stuck

3 participants