Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URLPattern use of URL record is ambiguous #252

Open
youennf opened this issue Jan 15, 2025 · 6 comments
Open

URLPattern use of URL record is ambiguous #252

youennf opened this issue Jan 15, 2025 · 6 comments

Comments

@youennf
Copy link

youennf commented Jan 15, 2025

What is the issue with the URL Pattern Standard?

URLPattern is using URL records, and passing some URL records to URL algorithms. This might lead to ambiguous behaviour.

For instance https://urlpattern.spec.whatwg.org/#canonicalize-a-hostname is passing the URL record to parse hostname (via https://url.spec.whatwg.org/#hostname-state).
It is unclear whether an URL record is a special URL or not, so it is unclear whether parsing of the hostname (https://url.spec.whatwg.org/#concept-host-parser) will lead to https://url.spec.whatwg.org/#concept-opaque-host-parser.

Looking at WPT tests and Chrome results, the expected behaviour is to consider that the url is special and hostname is not considered opaque.

The same pattern is done in other places of the spec (canonicalisation algorithms of port, pathname, search and hash for instance), though it is not clear to me whether this has side effects.

Discussing with @annevk, it is unexpected to pass URL records directly like done in https://urlpattern.spec.whatwg.org/#canonicalize-a-hostname. A future url spec refactoring might make that clear.

Ideally, the url given to the basic URL Parser should be an URL built by a URL parser.
Maybe that is what the spec should do.

@sisidovski
Copy link
Collaborator

Thank you for filing this. I agree, the current spec doesn't specify the given URL is special or not and this should be explicit. Or even the current spec may be wrong because the default value of scheme is empty string, which is not a special scheme so the URL wouldn't be special.

Looking at WPT tests and Chrome results, the expected behaviour is to consider that the url is special and hostname is not considered opaque.

Yes.

Ideally, the url given to the basic URL Parser should be an URL built by a URL parser.
Maybe that is what the spec should do.

Are you referring URL parser? I see there is a comment in URL parser, saying "Non-web-browser implementations only need to implement the basic URL parser". I assume some non-web-browser implementers don't implement this algorithm. Do you have any concern about just using the result of basic URL parser, given "http://dummy.test" as an input and pass it as dummyURL in https://urlpattern.spec.whatwg.org/#canonicalize-a-hostname?

@annevk
Copy link
Member

annevk commented Mar 12, 2025

I think that generally works, but note that you cannot distinguish between someone passing in a hostname of : and a hostname of dummy.test. So how exactly do you know whether it succeeded?

@annevk
Copy link
Member

annevk commented Mar 12, 2025

I suppose we could maybe standardize upon a magic host value, such as urlpattern.invalid, but it doesn't seem great.

@annevk
Copy link
Member

annevk commented Mar 13, 2025

Perhaps we can make https://url.spec.whatwg.org/#hostname-state return failure in more cases to make it more suitable for this caller. As all other callers with hostname state as override state ignore the return value. Which does make this API stand out, but so be it. I'll investigate.

annevk added a commit to whatwg/url that referenced this issue Mar 13, 2025
URLPattern's canonicalize a hostname is the only invocation of the basic URL parser with a URL and state override set to hostname state that looks at the return failure.

And since the URL that URLPattern uses is a dummy URL that is ideally not exposed, knowing about all the failure conditions is kind of important.

(Now technically the second return failure added here cannot be observed as the dummy URL won't have credentials or a non-null port, but it seemed good to change that at the same time for consistency.)

This is covered by existing URLPattern web-platform-tests. Some more background can be found in whatwg/urlpattern#252.
annevk added a commit to web-platform-tests/wpt that referenced this issue Mar 13, 2025
@annevk
Copy link
Member

annevk commented Mar 13, 2025

@sisidovski I think whatwg/url#863 coupled with your suggested change here to create a proper dummy URL this should be good. When implementing this I found one impacted test for which I put up a patch at web-platform-tests/wpt#51316.

@sisidovski
Copy link
Collaborator

Thank you so much. I'll check your PRs and then work on the URLPattern side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants