Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regexp caching #354

Merged
merged 10 commits into from
Jan 6, 2025
Merged

Regexp caching #354

merged 10 commits into from
Jan 6, 2025

Conversation

apalala
Copy link
Collaborator

@apalala apalala commented Jan 5, 2025

Caching by re.compile() is mostly undocumented and unreliable.

Implement caching of compiled regexes for the few critical ones in TatSu configurations.

@apalala apalala requested a review from vfazio January 5, 2025 15:48
Copy link
Collaborator

@vfazio vfazio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you want to address it in this MR, but I do want to call out that we're still applying (?m) implicitly on whitespace:

return re.compile(f'(?m){whitespace}')

You've updated the ebnf grammar to specify this flag explicitly

@@whitespace :: /(?m)\s+/

which was necessary when it was an re.Pattern but the parsing will be hitting the str branch of this logic. It may make sense to just remove that (?m) and update the documentation to say users need to update their patterns as necessary.

@vfazio
Copy link
Collaborator

vfazio commented Jan 6, 2025

seems ok so far, just a question on the whitespace pattern which could be spun off into a new issue/PR and the regex validation

@apalala
Copy link
Collaborator Author

apalala commented Jan 6, 2025

seems ok so far, just a question on the whitespace pattern which could be spun off into a new issue/PR and the regex validation

Yes! There may be more improvements in further issues and pull requests.

I'd like to merge and make a release now to get feedback about what projects we broke ASAP.

@vfazio
Copy link
Collaborator

vfazio commented Jan 6, 2025

Last minor fixup. Since you're updating whitespace I think it would be nice to add a note to the documentation like was done for the comments and eol_comments:

TatSu/docs/directives.rst

Lines 121 to 127 in f3fb92a

@@whitespace :: /[\t ]+/
To disable any parsing of whitespace, use ``None`` for the definition:
.. code::
@@whitespace :: None

Should be OK after this minor change

@vfazio
Copy link
Collaborator

vfazio commented Jan 6, 2025

@apalala do you want me to merge this or do you want to?

@apalala apalala merged commit 31dd78d into master Jan 6, 2025
2 checks passed
@apalala apalala deleted the regexp_caching branch January 6, 2025 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants