fix the condition for runtime filters #18496
* remove mergingObjs check (matrixorigin#18482)
* forbidden restore database or table to other account (matrixorigin#18337)
  Approved by: @iamlinjunhong, @heni02
* Fix Ceiling (matrixorigin#18479)
  matrixorigin/MO-Cloud#3955
  Approved by: @aunjgr, @heni02
* fileservice: release semaphore after reader close in objectStorageSemaphore.Read (matrixorigin#18463)
  The semaphore should be held for the duration of the Read; otherwise it does not limit the I/O concurrency to S3.
  Approved by: @fengttt
* cn commit tombstone objects to tn. (matrixorigin#18423)
  Remove delta location and use object stats instead.
  Approved by: @daviszhen, @zhangxu19830126, @reusee, @LeftHandCold, @m-schen, @XuPeng-SH, @triump2020, @aunjgr
* malloc: use unique.Handle in GetStacktrace (matrixorigin#18489)
  Use the unique package to reduce memory consumption.
  Approved by: @m-schen
* Fengttt conn (matrixorigin#18421)
  Repro matrixorigin#18420
  Approved by: @reusee, @zhangxu19830126, @daviszhen, @qingxinhome
* fileservice: do disk cache eviction after loading and before writing files (matrixorigin#18394)
  More eviction.
  Approved by: @fengttt
* print debug message when waiting message timeout (matrixorigin#18468)
  Print a debug message when waiting for a message times out, and return err instead of panicking.
  Approved by: @fengttt, @aunjgr, @m-schen
* refactor tombstone read related code (matrixorigin#18506)
  Refactor tombstone-related code for further usage.
  Approved by: @triump2020, @LeftHandCold
* fix mpool leak (matrixorigin#18493)
  1. close abort vector
  2. close object batches
  Approved by: @XuPeng-SH
* refactoring dispatch scopes (matrixorigin#18462)
  Pipelines are expanded at runtime, and the dispatch operator currently gets some special handling there. This change moves that logic to the compile stage and simplifies it. The benefits are, first, simpler runtime logic, which makes it easier to connect pipelines later, and second, the complete pipeline is visible before execution, which makes debugging easier. It also fixes a bug that caused the daily run to hang.
  Approved by: @ouyuanning, @fengttt, @aunjgr, @sukki37, @heni02
* fix the condition for runtime filters (matrixorigin#18496)
  Fix the condition for runtime filters so that runtime filters are pushed in more cases. Fix some comments.
  Approved by: @aunjgr
* remove restriction for top value optimization (matrixorigin#18503)
  The stats for filters are sometimes wrong, which disables top value optimization. Remove the restriction.
  Approved by: @aunjgr, @heni02
* Change the type string in SHOW CREATE TABLE to lowercase (matrixorigin#18380)
  Change the type string in SHOW CREATE TABLE to lowercase, consistent with MySQL, to be compatible with Python SQLAlchemy.
  Approved by: @ck89119, @aunjgr, @heni02
* fix suspend not effect in main (matrixorigin#18491)
  Approved by: @daviszhen, @reusee, @sukki37
* fix too early return in processLoadLocal (matrixorigin#18498)
  Fix the load local logic, where the function returned too early without reading all the packets sent by the client.
  Approved by: @daviszhen
* release hashmap memory when input batch rowcnt is 0 (matrixorigin#18514)
  Approved by: @m-schen, @aunjgr
* fix reading of deleted data (matrixorigin#18511)
  Fix unmatched column sizes in a read batch.
  Approved by: @XuPeng-SH
* refactor the run method of pipelines (matrixorigin#18510)
  MO pipelines run in one of four ways: remoterun, mergerun, parallelrun and normalrun. remoterun sends the pipeline to a remote cn for execution. mergerun runs all prescopes in goroutines and monitors their results. parallelrun expands the scope at runtime, and normalrun calls directly starting from the rootOp operator. Pipelines used to be broken into many small pieces, so the run method could be very simple; now that pipelines are connected, a more complex run method is needed. A single pipeline may go through remoterun->mergerun->parallelrun->normalrun.
  Approved by: @m-schen
* code refactoring of some pipelines (matrixorigin#18525)
  Refactor the insert scope and the delete merge scope, reusing some logic to make the code clearer. Optimize shuffle scopes to break after the shuffle operator. Remove unused code.
  Approved by: @ouyuanning, @heni02
* add UT for S3 Delete (matrixorigin#18519)
  Approved by: @XuPeng-SH
* remove unused code (matrixorigin#18517)
  Approved by: @LeftHandCold, @triump2020
* Fix data race (matrixorigin#18536)
  Approved by: @LeftHandCold
* fix bug: make sure vector will be free in some case (matrixorigin#18490)
  Approved by: @m-schen, @badboynt1, @aunjgr, @fengttt
* fix TestGetInitPort (matrixorigin#18528)
  Approved by: @reusee
* fix disttae mpool leak (matrixorigin#18509)
  Fix disttae mem leak.
  Approved by: @triump2020, @XuPeng-SH
* fix transfer page race (matrixorigin#18548)
  Approved by: @XuPeng-SH
* refactor pipelines of broadcast join to reduce pipeline break (matrixorigin#18549)
  Also remove unused code to make the code clearer.
  Approved by: @ouyuanning, @m-schen, @heni02, @aunjgr
* avoid redundant and time-consuming type check operation using vector api (matrixorigin#18505)
* remove transfer delete log (matrixorigin#18560)
* add test case for create partition based shard table (matrixorigin#18515)
  Approved by: @m-schen, @triump2020, @badboynt1, @reusee, @daviszhen, @qingxinhome, @ouyuanning, @aressu1985, @heni02, @LeftHandCold, @aunjgr
* fix ut TestManyRangeLockInManyGoroutines data race (matrixorigin#18478)
  Approved by: @zhangxu19830126
* save transfer page memory usage (matrixorigin#18557)
  Approved by: @XuPeng-SH
* fix a bug that sometimes hashmap is not released (matrixorigin#18554)
  Free memory when the query finishes.
  Approved by: @m-schen, @aunjgr
* fix: buildload LoadWriteS3 (matrixorigin#18556)
  The condition for setting builder.qry.LoadWriteS3 in the buildload function was inverted.
  Approved by: @badboynt1
* Supplement v1.2.3 version upgrade package (matrixorigin#18431)
  Approved by: @daviszhen, @zhangxu19830126, @LeftHandCold
* refactor: value_scan (matrixorigin#18460)
  Move constructValueScanBatch to value_scan.
  Approved by: @ouyuanning, @aunjgr, @m-schen
* support group by with rollup in parser (matrixorigin#18526)
  Approved by: @iamlinjunhong, @ouyuanning
* Fix some data race in frontend (matrixorigin#18545)
  Fix some data races in frontend.Conn.
  Approved by: @daviszhen
* fix TransferDeleteIntent for S3 tombstone (matrixorigin#18541)
  Approved by: @zhangxu19830126, @XuPeng-SH
* update CODEOWNERS (matrixorigin#18552)
* fix checkpoint entry use after clean (matrixorigin#18580)
* add sharding replica balance test case (matrixorigin#18566)
  Add a cluster sharding test case for replica balance. Replicas can be re-balanced when a cn is added to or removed from the cluster.
  Approved by: @daviszhen, @reusee
* fileservice: re-implement memory cache with fifocache (matrixorigin#18569)
  Approved by: @fengttt, @LeftHandCold
* reduce *Conn.Read syscall (matrixorigin#18379)
  Improve *Conn.Read. In most cases, only one syscall is needed.
  Approved by: @daviszhen
* delay ranges call for dml queries [part 1] (matrixorigin#18581)
  Approved by: @ouyuanning
* optimize shuffle plans (matrixorigin#18570)
  After the pipeline refactoring, shuffle pipelines are always broken, which incurs extra copy overhead, so the cost of shuffle needs to be recalculated.
  Approved by: @ouyuanning, @heni02
* Aggr statements with same error msg and keep the error msg info (matrixorigin#18537)
  Changes: aggregate error statements, filtering by identical error msg.
  Approved by: @heni02
* optimize insert pipelines to reduce pipeline break (matrixorigin#18574)
  Also remove some unused code.
  Approved by: @ouyuanning
* skip restart cn test case (matrixorigin#18607)
* reduce insert memory usage (matrixorigin#18542)
  Clean batches in s3writer right after they are used. Replace `GetUnionOneFunction` with `GetUnionAllFunction` to reduce memory allocation.
  Approved by: @XuPeng-SH, @aunjgr, @m-schen
* adapt old config observablity::metricUpdateStorageUsageInterval (matrixorigin#18573)
  1. Adapt the mo 1.2 config item that was renamed.
  Approved by: @daviszhen
* Upgrade panic bug main (matrixorigin#18550)
  Resolve horizontal upgrade exception handling for the same version.
  Approved by: @daviszhen, @zhangxu19830126
* optimize pipelines for external scan to reduce pipeline breaks (matrixorigin#18612)
  load data inline statements always run with a single degree of parallelism. For load data local, or for compressed files, reads are always single-parallel, while the write parallelism is decided by size and is not always full parallelism.
  Approved by: @ouyuanning
* fix panic in ut (matrixorigin#18582)
  Remove the panic check inside CheckFlushTaskRetry and collectDelsAndTransfer. It's safe to retry this task.
  Approved by: @XuPeng-SH
* fix: change the default behavior of the load strict parameter (matrixorigin#18587)
  Change the default value of the load data strict parameter from false to true, consistent with the documentation.
  Approved by: @iamlinjunhong, @badboynt1
* mpool: remove memHdr.allocateStacktraceID (matrixorigin#18579)
  A unique.Handle field should not be put in memHdr.
  Approved by: @m-schen
* container/hashtable: replace mpool with malloc (matrixorigin#18446)
  hashtable: add getAllocator
  container/hashtable: use malloc in string hashtable
  container/hashtable: use malloc in int64 hashtable
  Approved by: @badboynt1, @m-schen, @aunjgr, @XuPeng-SH
* refactor some code for the future code reuse for disttae and tae (matrixorigin#18564)
  1. remove unused vector pool in the Read api
  2. code refactor for the future code reuse for disttae and tae
  Approved by: @ouyuanning, @daviszhen, @LeftHandCold, @reusee, @triump2020, @m-schen, @aunjgr
* fileservice: object storage semaphore refinements (matrixorigin#18543)
  fileservice: also acquire semaphore in List and Write
  fileservice: set default IO concurrency to 1024
  fileservice: ensure semaphore is released on error
  Approved by: @fengttt
* Refactor block prefetch (matrixorigin#18626)
  The Prefetch interface is redundant; unnecessary interfaces and parameters need to be removed.
  Approved by: @XuPeng-SH
* Incr blob size limit and clean up. (matrixorigin#18508)
  Increase the max blob size and clean up code.
  Approved by: @aressu1985, @heni02, @m-schen, @XuPeng-SH
* add more sharding metrics (matrixorigin#18628)
  Approved by: @aptend
* Optimize select count (matrixorigin#18627)
  1. fix previous false positive logic
  2. only check object metadata
  3. todo: refactor whole select count(*) later
  Approved by: @triump2020, @LeftHandCold
* Adding a create db test. (matrixorigin#18576)
  Approved by: @zhangxu19830126
* adding flag to block info and object info (matrixorigin#18527)
  flag:
  1. dependable
  2. sorted
  3. by cn created.
  Approved by: @daviszhen, @ouyuanning, @aunjgr, @m-schen, @XuPeng-SH, @LeftHandCold, @triump2020
* some code refactor (matrixorigin#18633)
  1. refactor some code for the future code reuse
  2. remove dummy code
  3. remove stale stats related code
  Approved by: @LeftHandCold, @triump2020
* adding object count metric for accounts. (matrixorigin#18363)
  Approved by: @daviszhen, @aptend, @heni02, @XuPeng-SH, @zhangxu19830126, @qingxinhome, @xzxiong
* [Subtask]: reduce shuffle memory allocation step 1 (matrixorigin#18616)
  Abstract a shufflePool object out of the shuffle operator, in preparation for concurrent shuffle operators sharing the same pool.
  Approved by: @ouyuanning
* retry lock without lock table (matrixorigin#18611)
  Approved by: @m-schen, @daviszhen, @reusee, @zhangxu19830126, @XuPeng-SH, @qingxinhome
* support rename multi table (matrixorigin#18572)
  Approved by: @iamlinjunhong, @ouyuanning, @m-schen, @heni02, @qingxinhome, @daviszhen, @aunjgr
* Fix test dedup snapshot3 (matrixorigin#18639)
  Update StartTS instead of SnapshotTS.
  Approved by: @XuPeng-SH
* [Bug]: performance regression of loading compressed file (matrixorigin#18644)
  Fix a bug in external scan stats that caused a performance regression.
  Approved by: @ouyuanning
* [refactor] : reuse memory for insert/delete partition table (matrixorigin#18617)
  Previously, for insert and delete on partitioned tables, a new batch was created for every use (see the original GroupByPartitionForDelete and GroupByPartitionForInsert). They now reuse the operator's own buffer.
  Approved by: @m-schen
* hashtable: add memory limit checks (matrixorigin#18646)
  Ensure the hashtable is not allocating excessive memory.
  Approved by: @XuPeng-SH, @aunjgr
* refactor: preinsert (matrixorigin#18516)
  preinsert: for composite primary keys and cluster by, allocate memory in advance and reuse it.
  Approved by: @badboynt1, @ouyuanning, @m-schen, @aunjgr
* dispatch operator must be in a pipeline with single parallel (matrixorigin#18659)
  During the pipeline refactoring, we tried to make the dispatch operator support multiple parallelism as well, in order to reduce pipeline breaks. That attempt ran into many problems, so the logic is rolled back for now and the dispatch operator stays single-parallel.
  Approved by: @ouyuanning
* remove persisted tombstones merge during flush (matrixorigin#18660)
  Do not merge persisted tombstones during flush.
  Approved by: @XuPeng-SH
* fix build plan pb (matrixorigin#18653)
  Approved by: @ouyuanning, @aunjgr
* fix ut (matrixorigin#18657)
  Fix data race.
  Approved by: @XuPeng-SH
* fix types.Rowid|types.Blockid compare related (matrixorigin#18635)
  1. Rowid and Blockid were compared with bytes.Compare, but byte order does not reflect the actual order of Rowid and Blockid.
  2. Without this fix, the Rowids in the tombstone file are out of order, which causes data loss during scan.
  3. Without this fix, the rowid column zone map may be wrong.
  4. This PR also improves the performance of Blockid.Compare|Objectid.Compare|Rowid.Compare.
  Approved by: @LeftHandCold, @badboynt1, @ouyuanning, @aunjgr, @m-schen, @triump2020
* bug fix: Top Operator should not change input batch (matrixorigin#18658)
  Approved by: @heni02, @m-schen
* Refactor join node compilation analyze (matrixorigin#18665)
  Fix and refactor the compilation and analysis of the join nodes.
  Approved by: @badboynt1, @ouyuanning, @m-schen
* New operator analyzer design (matrixorigin#18332)
* [bug] logtail: clear the tables in global stats when logtail re-connect (matrixorigin#18675)
  Approved by: @LeftHandCold, @XuPeng-SH
* fix select version (matrixorigin#18670)
  Approved by: @heni02, @qingxinhome
* fix a bug that cause load performance regression when there is sink_scan node in plan (matrixorigin#18682)
  Approved by: @ouyuanning
* fix replay incomplete data to catalog cache (matrixorigin#18662)
  The system must complete the catalog cache replay before it can process the push logtail for the three tables.
  Approved by: @XuPeng-SH
* fix non reserved keyword period (matrixorigin#18640)
  Approved by: @m-schen
* using pointers as function arguments to avoid relying on golang compiler optimizations (matrixorigin#18693)
  The compiler sometimes chooses to allocate memory on the heap for passed parameters, leading to performance degradation.
  Images: https://github.com/user-attachments/assets/6a9b6a89-205b-4501-800d-02cb1a6ae167 and https://github.com/user-attachments/assets/edfc8d8e-8d12-430a-ac4d-ec421396805d
  Approved by: @LeftHandCold, @triump2020
* add restart txn interface for reuse txn operator (matrixorigin#18700)
  Add a restart txn interface so the txn operator can be reused, avoiding memory allocation.
  Approved by: @daviszhen, @reusee, @qingxinhome
* avoid metadata cache access heap allocation (matrixorigin#18699)
  Prev: https://github.com/user-attachments/assets/b5927e30-6a6d-4a30-8d68-8d67b5085f38
  Now: https://github.com/user-attachments/assets/72dc4b1c-b9be-4f9a-9709-9f28fdfc72cc
  Approved by: @LeftHandCold, @triump2020
* malloc: optimize profiler (matrixorigin#18703)
  malloc: make decorator types generic
  malloc: add cacheline padding to ShardedAllocator
  Approved by: @qingxinhome
* refactor merge scheduler (matrixorigin#18531)
  Approved by: @aptend, @daviszhen, @XuPeng-SH
* bug fix: unique key/secondary key get incorrect value (matrixorigin#18711)
  Approved by: @heni02, @badboynt1, @aunjgr
* set userID & roleID for executor (matrixorigin#18708)
  Set UserID & RoleID for normal account.
  Approved by: @YANGGMM, @daviszhen, @heni02
* Datalink support stage URL and add save_file() function (matrixorigin#18668)
  Approved by: @heni02, @daviszhen, @m-schen, @aunjgr, @aressu1985, @XuPeng-SH
* optimize create vector index (matrixorigin#18713)
  Don't return an error when a message times out; only print a log. For product join, get one batch from the left child before receiving from the build side.
  Approved by: @m-schen, @XuPeng-SH
* add new ExpressionExecutor for plan.Expr_List (matrixorigin#18698)
  Approved by: @m-schen

---------

Co-authored-by: Wei Ziran
Co-authored-by: YANGGMM
Co-authored-by: zengyan1
Co-authored-by: reusee
Co-authored-by: gouhongshen
Co-authored-by: fengttt
Co-authored-by: nitao
Co-authored-by: XuPeng-SH
Co-authored-by: jiangxinmeng1
Co-authored-by: CJKkkk_
Co-authored-by: aptend
Co-authored-by: ou yuanning
Co-authored-by: fagongzi
Co-authored-by: Wenbin
Co-authored-by: iamlinjunhong
Co-authored-by: huby2358
Co-authored-by: qingxinhome
Co-authored-by: Jackson
Co-authored-by: GreatRiver
Co-authored-by: Jensen
Co-authored-by: LiuBo
Co-authored-by: Eric Lam
Fix the condition for runtime filters and try to push runtime filters in more cases. Fix some comments. Approved by: @aunjgr
cp to 1.2-dev 'fix the condition for runtime filters (#18496)' Approved by: @ouyuanning, @aunjgr, @sukki37
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #18387
What this PR does / why we need it:
Fix the condition for runtime filters and try to push runtime filters in more cases.
Fix some comments.
PR Type
Bug fix
Description
The `Selectivity` field is now calculated as the ratio of `outcnt` to `childStats.Outcnt`.
Changes walkthrough 📝
stats.go
Fix selectivity calculation for aggregation nodes
pkg/sql/plan/stats.go
`Selectivity` is now calculated as `outcnt / childStats.Outcnt`.
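As a rough illustration of the ratio described in this summary, here is a minimal Go sketch. The `Stats` struct and the `aggSelectivity` helper are hypothetical stand-ins, not the code in `pkg/sql/plan/stats.go`. The zero-denominator guard and the clamp to 1 are assumptions added for this example and are not confirmed by the PR.

```go
package main

import "fmt"

// Stats is a hypothetical, pared-down stand-in for the planner's
// per-node statistics; only the two fields discussed here are kept.
type Stats struct {
	Outcnt      float64 // estimated number of output rows
	Selectivity float64 // fraction of input rows that survive the node
}

// aggSelectivity returns outcnt / childOutcnt.
// Assumption: an empty or unknown child (childOutcnt <= 0) is treated as
// selectivity 1, and the result is clamped to at most 1.
func aggSelectivity(outcnt, childOutcnt float64) float64 {
	if childOutcnt <= 0 {
		return 1
	}
	s := outcnt / childOutcnt
	if s > 1 {
		return 1
	}
	return s
}

func main() {
	child := Stats{Outcnt: 1000} // rows flowing into the aggregation
	agg := Stats{Outcnt: 250}    // estimated number of groups produced
	agg.Selectivity = aggSelectivity(agg.Outcnt, child.Outcnt)
	fmt.Printf("selectivity = %.2f\n", agg.Selectivity)
}
```

With 1000 input rows and 250 estimated groups, the sketch yields a selectivity of 0.25. A downstream consumer of the statistics could then treat the aggregation as keeping a quarter of its input.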