native parser: tokenizer error: <U+2EBF0> is not a valid identifier #1288

Open
jakkdl opened this issue Jan 28, 2025 · 0 comments
Labels
bug (Something isn't working), parsing (Converting source code into CST nodes)

Comments

Contributor

jakkdl commented Jan 28, 2025

Hypothesis found a failing test case in the flake8-async CI, which boiled down to libcst's native parser incorrectly rejecting 𮯰 (U+2EBF0) as an invalid identifier. It is a valid identifier, but only on Python 3.13+. The pure parser works as expected.

Repro:

import libcst
libcst.parse_module('class 𮯰: pass\n')

$ export LIBCST_PARSER_TYPE=native
$ python repro.py
Traceback (most recent call last):
  File "/tmp/f8as_crash/foo.py", line 2, in <module>
    libcst.parse_module('class 𮯰: pass\n')
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/tmp/f8as_crash/.venv/lib/python3.13/site-packages/libcst/_parser/entrypoints.py", line 109, in parse_module
    result = _parse(
        "file_input",
    ...<3 lines>...
        detect_default_newline=True,
    )
  File "/tmp/f8as_crash/.venv/lib/python3.13/site-packages/libcst/_parser/entrypoints.py", line 55, in _parse
    return parse(source_str)
libcst._exceptions.ParserSyntaxError: Syntax Error @ 1:1.
tokenizer error: "𮯰" is not a valid identifier

class 𮯰: pass
^
$ python --version
Python 3.13.1
$ pip list | grep libcst
libcst                1.6.0
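
To check the "valid only on 3.13+" claim independently of libcst, here is a minimal sketch using only the standard library. It assumes (this is not confirmed in the issue) that the difference comes from the Unicode database version bundled with the interpreter. `describe_identifier` is a hypothetical helper name introduced for illustration.

```python
import sys
import unicodedata


def describe_identifier(name: str) -> dict:
    """Report how the running interpreter classifies `name`.

    Hypothetical helper for illustration; not part of libcst.
    """
    return {
        # Code points that make up the candidate identifier.
        "codepoints": [f"U+{ord(ch):04X}" for ch in name],
        # CPython's own verdict, based on its bundled Unicode database.
        "is_identifier": name.isidentifier(),
        # Version of the Unicode database this interpreter ships with.
        "unicode_version": unicodedata.unidata_version,
        "python_version": ".".join(map(str, sys.version_info[:3])),
    }


# The character from the failing test case, written as an escape
# so the source file stays ASCII-only.
report = describe_identifier("\U0002EBF0")
print(report)
```

Running this under different Python versions shows whether the interpreter itself accepts the character. If `is_identifier` is true while the native parser still raises `ParserSyntaxError`, the parser's identifier check disagrees with the interpreter's.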

zsol added the bug and parsing labels Jan 28, 2025