forked from databendlabs/databend
refactor(query): stream style block writer for hash join spill #40
Open
bohutang wants to merge 3,722 commits into main from refactor/stream_writer
Conversation
…abendlabs#18458)
* refactor(meta): move Lua functions to metactl namespace
  Move all Lua functions from global scope to metactl namespace to prevent conflicts with other Lua libraries:
  - metactl.new_grpc_client() replaces new_grpc_client()
  - metactl.spawn() replaces spawn()
  - metactl.sleep() replaces sleep()
  - metactl.NULL replaces NULL
* docs(meta): add Lua API documentation and benchmarking tools
  Add comprehensive documentation for the metactl Lua runtime API, including all available functions, client methods, and usage patterns. Add benchmark script and runner for performance testing of concurrent meta operations.
  - Complete API documentation with examples and best practices
  - Benchmark script with configurable concurrent workers
  - Python test runner with meta service setup automation
* chore: add README to benchmark dir
* fix: vacuum drop table with limit does not work
* add result set
* fix test
* fix test
…atabendlabs#18461)
* feat: add Lua admin client support and metrics subcommand to metactl
  - Add MetricsArgs and metrics subcommand to metactl CLI
  - Implement LuaAdminClient with admin API methods (metrics, status, transfer_leader, etc.)
  - Add new_admin_client function to Lua environment
  - Add comprehensive test suite for Lua admin client functionality
  - Update utils.py to improve error handling in run_command
* M tests/metactl/subcommands/cmd_metrics.py
…endlabs#18450)
* [chore] update comment in rule_grouping_sets_to_union.rs
* Addressed
Co-authored-by: sundyli <543950155@qq.com>
perf(query): improve JSON parsing performance
* chore: refine cte profile
* chore: add setting
* make lint
* refactor(query): refactor row fetcher to avoid OOM (the same message repeated across 11 squashed commits)
* fix: collect statistics for MaterializeCTERef
* fix test
* make lint
* fix test
* improve stream write
* fix virtual column builder
* add block statistics
* remove null in bloom index builder
* use metahll
* avoid large string
* fix
* use segment level stats
* add test
* remove unused code
* fix review comment
* fix test
Co-authored-by: Bohu <overred.shuttler@gmail.com>
* fix: extend the task time unit by 5x, since weak hardware can easily lead to CI failures
* chore: fix test
* chore: fix test
…atabendlabs#18465)
* fix: attach table does not carry the indexes of the original table
* chore: fix test (×6)
* chore: add more index type on attach table test
* chore: fix test (×2)
* fix: make update table meta idempotent
* add ut
* refine
* fix
* rename func
* polish unit test
Co-authored-by: dantengsky <dantengsky@gmail.com>
…stage (databendlabs#18453)
* feat(query): add zero table (the same message repeated across 6 squashed commits)
* fix: missing 'values' when displaying insert stmt.
* feat: add header X-DATABEND-CLIENT-CAPS.
* feat: add header X-DATABEND-CLIENT-CAPS.
* chore(query): add max node quota (the same message repeated across 4 squashed commits)
fix: refresh index loses data
…ndlabs#18722)
* fix(meta-service): detach the SysData to avoid a race condition
  When creating a new level in the state machine, it should detach the SysData to avoid a race condition with snapshot building. Before this commit, the new writable level and the snapshot compactor shared the same data, so newly applied data increased the `last-log-id` of a newly built snapshot, resulting in a snapshot that lacks some of the log entries it declares to have (an illustrative sketch of the idea follows this entry).
* M src/meta/raft-store/src/sm_v003/compact_immutable_levels_test.rs
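The race described above comes from the writable level and the snapshot compactor sharing one SysData instance. Below is a minimal sketch of the "detach instead of share" idea; the type, field, and method names (`SysData`, `LeveledMap`, `last_log_id`, `freeze_level`) are illustrative placeholders, not databend-meta's actual definitions.

```rust
/// Illustrative placeholder for the state-machine system data.
#[derive(Clone, Default)]
struct SysData {
    last_log_id: Option<u64>,
}

/// Illustrative placeholder for the leveled state-machine storage.
struct LeveledMap {
    /// The level currently receiving applied log entries.
    writable_sys: SysData,
    /// Frozen levels read by the snapshot compactor; each holds its own
    /// detached copy of SysData, so later applies cannot change it.
    frozen_sys: Vec<SysData>,
}

impl LeveledMap {
    /// Freeze the current level by detaching (cloning) SysData rather than
    /// sharing it. A snapshot built from `frozen_sys` records the
    /// last_log_id as of the freeze point, never a later one.
    fn freeze_level(&mut self) {
        let detached = self.writable_sys.clone();
        self.frozen_sys.push(detached);
    }

    /// Applying a new log entry only touches the writable copy.
    fn apply(&mut self, log_id: u64) {
        self.writable_sys.last_log_id = Some(log_id);
    }
}
```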
* chore: move GetSubTable to separate file
* chore: replace Arc<Mutex<SysData>> with SysData
* chore: add error check on private task test script
* chore: codefmt (×3)
* chore: enable private_task_warehouse.sh on ci test private task
* chore: codefmt
…mpatibility (databendlabs#18724)
* fix(query): set the Parquet default encoding to `PLAIN` to ensure data compatibility
* add comments
* fix
* only set the encoding for decimal columns (see the sketch below)
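For context, pinning PLAIN on a single column with the arrow-rs `parquet` crate looks roughly like the sketch below. This assumes databend writes Parquet through that crate's `WriterProperties`; the column name is made up, and dictionary encoding is disabled for the column because it would otherwise take precedence over the requested encoding.

```rust
use parquet::basic::Encoding;
use parquet::file::properties::WriterProperties;
use parquet::schema::types::ColumnPath;

/// Sketch: writer properties that force PLAIN encoding for one (hypothetical)
/// decimal column while leaving all other columns at their defaults.
fn plain_decimal_writer_props() -> WriterProperties {
    let decimal_col = ColumnPath::from("amount"); // hypothetical column name
    WriterProperties::builder()
        // Dictionary encoding would override the requested encoding, so turn
        // it off for this column before pinning PLAIN.
        .set_column_dictionary_enabled(decimal_col.clone(), false)
        .set_column_encoding(decimal_col, Encoding::PLAIN)
        .build()
}
```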
…#18728)
- Add ANY_VALUE as an alias for the ANY aggregate function to improve compatibility with standard SQL and other database systems
- ANY_VALUE is widely used in BI tools and analytical workloads as shown in the Snowflake paper analyzing 667M queries
- Add comprehensive tests to verify ANY_VALUE works correctly
…bs#18736) Add new compaction modules: compact_all, compact_conductor, compact_min_adjacent
* refactor: new setting `max_vacuum_threads`
  Add a new setting `max_vacuum_threads`, which controls the degree of concurrency during vacuum operations.
* cargo fmt
- Add DropCallback to call a callback when being dropped (see the sketch after this list).
- Remove `CompactingData`, use `LeveledMap` directly.
- Refine `WriterPermit` and `CompactorPermit` logging.
- When building a snapshot, acquire both the writer and compactor permits, because snapshot building needs to modify both the writable and the `immutable` data.
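A DropCallback of the kind mentioned in the first bullet is a standard Rust guard pattern. The sketch below is illustrative only and is not databend-meta's actual type.

```rust
/// Runs a user-supplied callback exactly once when the guard is dropped.
/// Illustrative sketch; the real type in databend-meta may differ.
pub struct DropCallback<F: FnOnce()> {
    callback: Option<F>,
}

impl<F: FnOnce()> DropCallback<F> {
    pub fn new(callback: F) -> Self {
        Self { callback: Some(callback) }
    }
}

impl<F: FnOnce()> Drop for DropCallback<F> {
    fn drop(&mut self) {
        // take() guarantees the FnOnce is invoked at most once.
        if let Some(cb) = self.callback.take() {
            cb();
        }
    }
}

fn main() {
    let _guard = DropCallback::new(|| println!("permit released"));
    println!("doing work while the guard is alive");
} // `_guard` dropped here, callback fires
```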
…databendlabs#18741)
During long-running SQL queries, the system repeatedly logs empty pages with rows=0, which creates excessive log noise. This change only logs non-empty pages and the final completion status (see the sketch after this entry).
Changes:
- Skip logging empty pages (rows=0) during query execution
- Only log when pages contain actual data (rows>0)
- Log final completion status when query ends with empty page
- Preserve all error and cleanup logs for debugging
This significantly reduces log volume while maintaining visibility into actual data processing and query completion.
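A minimal sketch of the logging condition described above, with made-up function and parameter names (the actual databend code path differs):

```rust
/// Sketch: decide whether a response page is worth logging.
/// `is_final` marks the last page of the query.
fn log_page(query_id: &str, page_rows: usize, is_final: bool) {
    if page_rows > 0 {
        // Only pages that actually carry data are logged, so a long-running
        // query polled for many empty pages no longer floods the log.
        log::info!("query {query_id}: sent page with {page_rows} rows");
    } else if is_final {
        // The terminal empty page still produces one completion record.
        log::info!("query {query_id}: finished with an empty final page");
    }
    // Intermediate empty pages (rows=0) are intentionally not logged;
    // error and cleanup logging elsewhere is unchanged.
}
```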
* feat(rbac): procedure objects support RBAC
* 1. replace p_id with procedure_id
  2. add new function create_id_value_with_cleanup to process ownership kv
  3. modify test
* no need to assert the seq of ownership key
* refactor create_id_value
* refactor cleanup_old_fn
* fix
…cation (databendlabs#18732)
* refactor(query): refactor the join partition to reduce memory amplification (the same message repeated across 6 squashed commits)
databendlabs#18744)
* fix: fuse_vacuum2 panics while vacuuming an empty table with the data_retention_num_snapshots_to_keep policy
  Return early if the table has no snapshot
* revert test config file
* improve(vacuum): enhance vacuum drop table logging for better progress tracking
  - Replace verbose TableMeta output with concise table_name(id:table_id) format
  - Add clear start/completion markers with === delimiters
  - Improve result summary with specific counts of success/failed operations
  - Add detailed progress information while preserving all debug data
  - Failed table IDs are still logged separately for troubleshooting
* tweak logs
Co-authored-by: BohuTANG <overred.shuttler@gmail.com>
…in the same transaction (databendlabs#18739)
…ndlabs#18749)
- Remove `is_opened` flag from `RaftStore`
- Remove obsolete config no_sync, which is only used by sled tree store
- Add config to MetaRaftLog
- Remove `RaftStoreInner`
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
refactor(query): stream style block writer for hash join spill
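The description does not spell out the writer interface, so the following is only a rough, hypothetical sketch of what a "stream style" spill block writer can look like: blocks are length-prefixed and flushed one at a time as they are produced, rather than being buffered and written in a single batch. None of the names below come from the actual patch.

```rust
use std::fs::File;
use std::io::{BufWriter, Write};

/// Hypothetical stream-style spill writer: at most one block is held in
/// memory at a time, which is the point of a streaming design for spill.
struct StreamBlockWriter {
    out: BufWriter<File>,
    blocks_written: usize,
}

impl StreamBlockWriter {
    fn create(path: &str) -> std::io::Result<Self> {
        Ok(Self {
            out: BufWriter::new(File::create(path)?),
            blocks_written: 0,
        })
    }

    /// Append one already-serialized block to the spill file.
    fn write_block(&mut self, encoded_block: &[u8]) -> std::io::Result<()> {
        // Length-prefix framing so a reader can stream blocks back later.
        self.out.write_all(&(encoded_block.len() as u64).to_le_bytes())?;
        self.out.write_all(encoded_block)?;
        self.blocks_written += 1;
        Ok(())
    }

    /// Flush and return how many blocks were spilled.
    fn finish(mut self) -> std::io::Result<usize> {
        self.out.flush()?;
        Ok(self.blocks_written)
    }
}
```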
Tests
Type of change