Show stats for aggregation queries in SHOW STATS #6323

Closed
kokosing opened this issue Dec 13, 2020 · 2 comments · Fixed by #16201

@kokosing
Member

Currently (after #3109, see TestShowStats):

        assertQuery("SHOW STATS FOR (SELECT avg(totalprice) FROM orders GROUP BY orderkey)",
                "SELECT * FROM (VALUES " +
                        "   ('_col0', null, null, null, null, null, null), " +
                        "   (null, null, null, null, null, null, null))");
        assertQuery(
                "SHOW STATS FOR (SELECT DISTINCT * FROM orders)",
                "VALUES " +
                        "   ('orderkey', null, null, null, null, null, null), " +
                        "   ('custkey', null, null, null, null, null, null), " +
                        "   ('orderstatus', null, null, null, null, null, null), " +
                        "   ('totalprice', null, null, null, null, null, null), " +
                        "   ('orderdate', null, null, null, null, null, null), " +
                        "   ('orderpriority', null, null, null, null, null, null), " +
                        "   ('clerk', null, null, null, null, null, null), " +
                        "   ('shippriority', null, null, null, null, null, null), " +
                        "   ('comment', null, null, null, null, null, null), " +
                        "   (null, null, null, null, null, null, null)");
        assertQuery(
                "SHOW STATS FOR (SELECT DISTINCT regionkey FROM region)",
                "VALUES " +
                        "   ('regionkey', null, null, null, null, null, null), " +
                        "   (null, null, null, null, null, null, null)");
    }

However, we should be able to show the row count, since the NDV (number of distinct values) of the grouping columns is known.
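To illustrate the reasoning, here is a minimal sketch of how an output row count could be derived from grouping-key NDVs. It is not Trino's actual stats rule: the class `AggregationRowCountSketch` and the method `estimateGroupByRowCount` are hypothetical names, and the estimate (product of the key NDVs, capped at the input row count) is a common textbook heuristic assumed here for illustration. The numbers in `main` are illustrative inputs, not measured statistics.

```java
public final class AggregationRowCountSketch {
    private AggregationRowCountSketch() {}

    /**
     * Hypothetical estimator: the number of groups is at most the product of
     * the NDVs of the grouping keys, and never more than the input row count.
     * Unknown statistics are modelled as NaN and make the result unknown.
     */
    public static double estimateGroupByRowCount(double inputRowCount, double... groupingKeyNdvs) {
        if (groupingKeyNdvs.length == 0) {
            // A global aggregation (no GROUP BY keys) produces exactly one row.
            return 1.0;
        }
        if (Double.isNaN(inputRowCount)) {
            return Double.NaN;
        }
        double groups = 1.0;
        for (double ndv : groupingKeyNdvs) {
            if (Double.isNaN(ndv)) {
                // One unknown NDV makes the whole estimate unknown.
                return Double.NaN;
            }
            groups *= ndv;
        }
        return Math.min(groups, inputRowCount);
    }

    public static void main(String[] args) {
        // One grouping key with 5 distinct values over 5 input rows.
        System.out.println(estimateGroupByRowCount(5, 5)); // prints 5.0
        // Two grouping keys with 3 and 5 distinct values over 15000 input rows.
        System.out.println(estimateGroupByRowCount(15000, 3, 5)); // prints 15.0
    }
}
```

With an estimate like this, the summary row of SHOW STATS (the last row, where the column name is null) could carry a row count instead of null for the `GROUP BY` and `DISTINCT` queries above.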

@kokosing kokosing added the enhancement New feature or request label Dec 13, 2020
@kokosing kokosing changed the title from "Calculate statistics for aggregations" to "Calculate row count statistic for aggregations" Dec 13, 2020
@findepi
Member

findepi commented Dec 14, 2020

Calculate row count statistic for aggregations

The title can be misunderstood a bit.

We do calculate the row count for aggregations, and this is important for the CBO. The reason it didn't work in #3109 is, I am guessing, partial aggregations.
A potential solution could be disabling partial aggregations when planning for SHOW STATS.

@findepi findepi changed the title from "Calculate row count statistic for aggregations" to "Show stats for aggregation queries in SHOW STATS" Dec 14, 2020
@findepi
Member

findepi commented Apr 20, 2021

#6998 could help with this.
