Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(python): Support pre_execution_query parameter from connectorx #21288

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

jsjasonseba
Copy link

Implements #21287

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars labels Feb 16, 2025
Copy link

codecov bot commented Feb 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.03%. Comparing base (5e51218) to head (8547f90).
Report is 143 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #21288      +/-   ##
==========================================
+ Coverage   79.82%   80.03%   +0.20%     
==========================================
  Files        1596     1604       +8     
  Lines      228565   231339    +2774     
  Branches     2608     2642      +34     
==========================================
+ Hits       182462   185142    +2680     
- Misses      45507    45588      +81     
- Partials      596      609      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ritchie46
Copy link
Member

@alexander-beedie can you review this one?

@mcrumiller
Copy link
Contributor

@alexander-beedie got time to review this? I gave a few suggestions but haven't heard back from the PR creator, so if they've disappeared I can just take this one over--this would help my team quite a bit, as our postgres implementation (Greenplum) keeps the query optimizer disabled by default and this is the only way to enable it.

@jsjasonseba
Copy link
Author

Hi @mcrumiller , which suggestions are you talking about?

In the meantime, I used the pre_execution_query param in connectorx read_sql and set the destination to arrow, and then use the from_arrow in polars. This might be a viable workaround for you too

Copy link
Contributor

@mcrumiller mcrumiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @jsjasonseba, I used the review mechanism instead of normal comments and forgot to click the submit button. You should see my recommendations now.

)
elif engine == "adbc":
if not isinstance(query, str):
msg = "only a single SQL query string is accepted for adbc"
raise ValueError(msg)
if pre_execution_query:
msg = (
"the 'connectorx' engine does not support use of `pre_execution_query`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should say adbc, not connectorx.

query="SELECT * FROM test_data",
protocol="sqlite",
errclass=ValueError,
errmsg="the 'connectorx' engine does not support use of `pre_execution_query`",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

partition_num=partition_num,
protocol=protocol,
pre_execution_query=pre_execution_query,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor refactor, but this section would be a bit cleaner as:

try:
    if pre_execution_query:
        if parse_version(cx.__version__) < (0, 4, 2):
            msg = "pre_execution_query is only supported in connectorx version 0.4.2 or later."
            raise ValueError(msg)
        return_type = "arrow2"
        pre_execution_args = {"pre_execution_query": pre_execution_query}
    else:
        return_type = "arrow"
        pre_execution_args = {}
    tbl = cx.read_sql(
        conn=connection_uri,
        query=query,
        return_type=return_type,
        partition_on=partition_on,
        partition_range=partition_range,
        partition_num=partition_num,
        protocol=protocol,
        **pre_execution_args,
    )

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Appreciate your feedback!

pre_execution_args = {}
else:
return_type = "arrow"
pre_execution_args = {"pre_execution_query": pre_execution_query}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is more correct than my suggestion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, made some slight correction. But I got your point clearly. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants