… can forbid copying of volatile expressions
…namic function call
Note that this does not currently work for extensions that are double-loaded. This might happen (for example in tpch) if the _init function calls Load explicitly.
…be reading random bytes)
This cleans up the CMake syntax and fixes a problem where empty strings would be conflated with no argument
fix(jdbc): support non-string parameter types
Few more fuzzer fixes
We have our own signing mechanism, and the two conflict, making the Apple signature invalid
Co-authored-by: Carlo Piovesan <piovesan.carlo@gmail.com>
Bump spatial
…ions Avoid performing Apple codesign on extensions
Filter out single relation predicates before join ordering
…value Fix `last_value` in the `duckdb_sequences` metadata function
Limit batch insert threads based on available memory, similar to Parquet write
[Vacuum] Fix serialization and Copy of the VacuumStatement
gropaul pushed a commit that referenced this pull request on Feb 18, 2025
We had two users crash with the following backtrace:
```
frame #0: 0x0000ffffab2571ec
frame #1: 0x0000aaaaac00c5fc duckling`duckdb::InternalException::InternalException(this=<unavailable>, msg=<unavailable>) at exception.cpp:328:2
frame #2: 0x0000aaaaac1ee418 duckling`duckdb::optional_ptr<duckdb::OptimisticDataWriter, true>::CheckValid(this=<unavailable>) const at optional_ptr.hpp:34:11
frame #3: 0x0000aaaaac1eea8c duckling`duckdb::MergeCollectionTask::Execute(duckdb::PhysicalBatchInsert const&, duckdb::ClientContext&, duckdb::GlobalSinkState&, duckdb::LocalSinkState&) [inlined] duckdb::optional_ptr<duckdb::OptimisticDataWriter, true>::operator*(this=<unavailable>) at optional_ptr.hpp:43:3
frame #4: 0x0000aaaaac1eea84 duckling`duckdb::MergeCollectionTask::Execute(this=0x0000aaaaf1b06150, op=<unavailable>, context=0x0000aaaba820d8d0, gstate_p=0x0000aaab06880f00, lstate_p=<unavailable>) at physical_batch_insert.cpp:219:90
frame #5: 0x0000aaaaac1d2e10 duckling`duckdb::PhysicalBatchInsert::Sink(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSinkInput&) const [inlined] duckdb::PhysicalBatchInsert::ExecuteTask(this=0x0000aaaafa62ab40, context=<unavailable>, gstate_p=0x0000aaab06880f00, lstate_p=0x0000aab12d442960) const at physical_batch_insert.cpp:425:8
frame #6: 0x0000aaaaac1d2dd8 duckling`duckdb::PhysicalBatchInsert::Sink(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSinkInput&) const [inlined] duckdb::PhysicalBatchInsert::ExecuteTasks(this=0x0000aaaafa62ab40, context=<unavailable>, gstate_p=0x0000aaab06880f00, lstate_p=0x0000aab12d442960) const at physical_batch_insert.cpp:431:9
frame #7: 0x0000aaaaac1d2dd8 duckling`duckdb::PhysicalBatchInsert::Sink(this=0x0000aaaafa62ab40, context=0x0000aab2fffd7cb0, chunk=<unavailable>, input=<unavailable>) const at physical_batch_insert.cpp:494:4
frame #8: 0x0000aaaaac353158 duckling`duckdb::PipelineExecutor::ExecutePushInternal(duckdb::DataChunk&, duckdb::ExecutionBudget&, unsigned long) [inlined] duckdb::PipelineExecutor::Sink(this=0x0000aab2fffd7c00, chunk=0x0000aab2fffd7d30, input=0x0000fffec0aba8d8) at pipeline_executor.cpp:521:24
frame #9: 0x0000aaaaac353130 duckling`duckdb::PipelineExecutor::ExecutePushInternal(this=0x0000aab2fffd7c00, input=0x0000aab2fffd7d30, chunk_budget=0x0000fffec0aba980, initial_idx=0) at pipeline_executor.cpp:332:23
frame #10: 0x0000aaaaac34f7b4 duckling`duckdb::PipelineExecutor::Execute(this=0x0000aab2fffd7c00, max_chunks=<unavailable>) at pipeline_executor.cpp:201:13
frame #11: 0x0000aaaaac34f258 duckling`duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) [inlined] duckdb::PipelineExecutor::Execute(this=<unavailable>) at pipeline_executor.cpp:278:9
frame #12: 0x0000aaaaac34f250 duckling`duckdb::PipelineTask::ExecuteTask(this=0x0000aab16dafd630, mode=<unavailable>) at pipeline.cpp:51:33
frame #13: 0x0000aaaaac348298 duckling`duckdb::ExecutorTask::Execute(this=0x0000aab16dafd630, mode=<unavailable>) at executor_task.cpp:49:11
frame #14: 0x0000aaaaac356600 duckling`duckdb::TaskScheduler::ExecuteForever(this=0x0000aaaaf0105560, marker=0x0000aaaaf00ee578) at task_scheduler.cpp:189:32
frame #15: 0x0000ffffab0a31fc
frame #16: 0x0000ffffab2ad5c8
```
Core dump analysis showed that the assertion `D_ASSERT(lstate.writer);`
in `MergeCollectionTask::Execute` was not satisfied (i.e. the crash
happens because `lstate.writer` is a null pointer) when
`PhysicalBatchInsert::Sink` was processing merge tasks from other
pipeline executors.
My suspicion is that this is only likely to happen for heavily
concurrent workloads (which applies to the two users who hit the crash).
The patch submitted as part of this PR has addressed the issue for these
users.
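To make the failure mode in frames #1 to #3 easier to picture, here is a minimal sketch of a checked pointer that throws on a null dereference instead of segfaulting. It is not DuckDB's `optional_ptr`; `CheckedPtr`, `Writer`, `LocalState` and `TryMerge` are hypothetical names invented for illustration, and `std::logic_error` stands in for the internal exception seen in the backtrace.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a checked optional pointer (assumption, not DuckDB code).
template <class T>
class CheckedPtr {
public:
    CheckedPtr() = default;
    explicit CheckedPtr(T *ptr) : ptr_(ptr) {}

    // True when the pointer has been set.
    explicit operator bool() const { return ptr_ != nullptr; }

    // Every dereference validates the pointer first.
    T &operator*() const {
        CheckValid();
        return *ptr_;
    }
    T *operator->() const {
        CheckValid();
        return ptr_;
    }

private:
    void CheckValid() const {
        if (!ptr_) {
            throw std::logic_error("dereferencing an optional pointer that is not set");
        }
    }
    T *ptr_ = nullptr;
};

// Hypothetical writer and per-thread state; the writer may never have been created.
struct Writer {
    int flushed = 0;
    void Flush() { ++flushed; }
};

struct LocalState {
    CheckedPtr<Writer> writer;
};

// Returns true if the merge step ran, false if the writer was unset and
// the checked dereference raised.
bool TryMerge(LocalState &lstate) {
    try {
        lstate.writer->Flush();
        return true;
    } catch (const std::logic_error &) {
        return false;
    }
}
```

In this sketch a thread whose `LocalState` never received a writer fails the check as soon as it picks up a merge task, which matches the shape of the backtrace: the exception is raised from the validity check, not from the merge logic itself.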
gropaul pushed a commit that referenced this pull request on May 15, 2025
gropaul pushed a commit that referenced this pull request on Dec 1, 2025
…uckdb#19680) (duckdb#19811)

Fixes duckdb#19680

This fixes a bug where queries using `NOT EXISTS` with `IS DISTINCT FROM` returned incorrect results due to improper handling of NULL semantics in the optimizer.

The issue was that the optimizer's deliminator incorrectly treated `DISTINCT FROM` variants the same as regular equality/inequality comparisons, which have different NULL handling:

- `IS DISTINCT FROM`: NULL-aware (NULL IS DISTINCT FROM NULL = FALSE)
- `!=` or `=`: NULL-unaware (NULL != NULL = NULL, filters out NULLs)

### Incorrect Query Plan

```
┌───────────────────────────┐
│         PROJECTION        │
│    ────────────────────   │
│             c2            │
│                           │
│          ~0 rows          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│         PROJECTION        │
│    ────────────────────   │
│             #5            │
│__internal_decompress_integ│
│     ral_integer(#3, 1)    │
│             #1            │
│                           │
│          ~0 rows          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│      NESTED_LOOP_JOIN     │
│    ────────────────────   │
│      Join Type: ANTI      │
│    Conditions: c2 != c2   ├──────────────┐
│                           │              │
│          ~0 rows          │              │
└─────────────┬─────────────┘              │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│         PROJECTION        ││         PROJECTION        │
│    ────────────────────   ││    ────────────────────   │
│            NULL           ││            NULL           │
│             #2            ││             #2            │
│            NULL           ││            NULL           │
│             #1            ││             #1            │
│            NULL           ││            NULL           │
│             #0            ││             #0            │
│            NULL           ││            NULL           │
│                           ││                           │
│          ~2 rows          ││           ~1 row          │
└─────────────┬─────────────┘└─────────────┬─────────────┘
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│         PROJECTION        ││         PROJECTION        │
│    ────────────────────   ││    ────────────────────   │
│             #0            ││             #0            │
│__internal_compress_integra││__internal_compress_integra│
│     l_utinyint(#1, 1)     ││     l_utinyint(#1, 1)     │
│             #2            ││             #2            │
│                           ││                           │
│          ~2 rows          ││           ~1 row          │
└─────────────┬─────────────┘└─────────────┬─────────────┘
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│         PROJECTION        ││         PROJECTION        │
│    ────────────────────   ││    ────────────────────   │
│            NULL           ││            NULL           │
│             #0            ││             #0            │
│            NULL           ││            NULL           │
│                           ││                           │
│          ~2 rows          ││           ~1 row          │
└─────────────┬─────────────┘└─────────────┬─────────────┘
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│          SEQ_SCAN         ││           FILTER          │
│    ────────────────────   ││    ────────────────────   │
│         Table: t0         ││     (col0 IS NOT NULL)    │
│   Type: Sequential Scan   ││                           │
│      Projections: c2      ││                           │
│                           ││                           │
│          ~2 rows          ││           ~1 row          │
└───────────────────────────┘└─────────────┬─────────────┘
                             ┌─────────────┴─────────────┐
                             │          SEQ_SCAN         │
                             │    ────────────────────   │
                             │         Table: t0         │
                             │   Type: Sequential Scan   │
                             │      Projections: c2      │
                             │                           │
                             │          ~2 rows          │
                             └───────────────────────────┘
```

The buggy plan shows two critical issues:

```
┌─────────────┴─────────────┐
│      NESTED_LOOP_JOIN     │
│      Join Type: ANTI      │
│    Conditions: c2 != c2   │ ← ❌ Wrong (the join conditions should be c2 IS DISTINCT FROM c2)
│          ~0 rows          │
└─────────────┬─────────────┘
              │
              └─────────────┐
                           ┌┴─────────────┐
                           │    FILTER    │
                           │ (col0 IS NOT │ ← ❌ Wrong (the filter should be removed)
                           │    NULL)     │
                           └──────────────┘
```

### Solution

This PR adds proper support for DISTINCT FROM operators throughout the optimization pipeline:

1. Preserve DISTINCT FROM semantics in join conversion (src/optimizer/deliminator.cpp).
```
// NOTE: We should NOT convert DISTINCT FROM to != in general
// Only convert if the ORIGINAL join had != or = (not DISTINCT FROM variants)
if (delim_join.join_type != JoinType::MARK &&
    original_join_comparison != ExpressionType::COMPARE_DISTINCT_FROM &&
    original_join_comparison != ExpressionType::COMPARE_NOT_DISTINCT_FROM) {
    // Safe to convert
}
```
2. Skip NULL filters for DISTINCT FROM variants (src/optimizer/deliminator.cpp).
```
// Only add IS NOT NULL filter for regular equality/inequality comparisons
// Do NOT add for DISTINCT FROM variants, as they handle NULL correctly
if (cond.comparison != ExpressionType::COMPARE_NOT_DISTINCT_FROM &&
    cond.comparison != ExpressionType::COMPARE_DISTINCT_FROM) {
    // Add IS NOT NULL filter
}
```
3. Added negation support for COMPARE_DISTINCT_FROM and COMPARE_NOT_DISTINCT_FROM in expression type handling (src/common/enums/expression_type.cpp).
4. Updated parser to properly negate IS DISTINCT FROM expressions when wrapped with NOT (src/parser/transform/expression/transform_bool_expr.cpp).
5. Added regression test in test/sql/subquery/exists/test_correlated_exists_with_derived_table.test
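The NULL-handling difference described in this commit message can be sketched outside the optimizer. The block below is a standalone illustration, not DuckDB code: it assumes C++17, models SQL NULL with `std::nullopt`, and the names `SqlInt`, `SqlBool`, `NotEquals`, `IsDistinctFrom`, `IsNotDistinctFrom` and `AntiJoin` are hypothetical helpers. `AntiJoin` mimics `NOT EXISTS`: a left row is kept only if no right row makes the condition TRUE.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// std::nullopt models SQL NULL (assumption made for this sketch).
using SqlInt = std::optional<int>;
using SqlBool = std::optional<bool>;

// a != b under three-valued logic: the result is NULL if either side is NULL.
SqlBool NotEquals(const SqlInt &a, const SqlInt &b) {
    if (!a || !b) {
        return std::nullopt;
    }
    return *a != *b;
}

// a IS DISTINCT FROM b: NULL-aware, never returns NULL.
bool IsDistinctFrom(const SqlInt &a, const SqlInt &b) {
    if (!a || !b) {
        // Distinct only when exactly one side is NULL.
        return a.has_value() != b.has_value();
    }
    return *a != *b;
}

// Negating IS DISTINCT FROM yields IS NOT DISTINCT FROM; it stays two-valued.
bool IsNotDistinctFrom(const SqlInt &a, const SqlInt &b) {
    return !IsDistinctFrom(a, b);
}

// Anti join: keep a left row only if no right row makes the condition TRUE.
// A NULL condition result counts as "no match".
template <class Pred>
std::vector<SqlInt> AntiJoin(const std::vector<SqlInt> &left,
                             const std::vector<SqlInt> &right, Pred pred) {
    std::vector<SqlInt> result;
    for (const auto &l : left) {
        bool matched = false;
        for (const auto &r : right) {
            SqlBool m = pred(l, r);
            if (m.value_or(false)) {
                matched = true;
                break;
            }
        }
        if (!matched) {
            result.push_back(l);
        }
    }
    return result;
}
```

With hypothetical inputs `left = {NULL, 1}` and `right = {NULL}`, `AntiJoin(left, right, NotEquals)` keeps both rows because every comparison evaluates to NULL, while `AntiJoin(left, right, IsDistinctFrom)` keeps only the NULL row. Swapping one condition for the other therefore changes the query result, which is why the rewrite to `!=` is only safe when the original join did not use a DISTINCT FROM variant.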