-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add test harness for regexp optimization #18213
Conversation
I didn't examine every line in detail, but I think it should be merged after squashing the commits. The outline looks good, and it is a new test that can be tweaked as we go along. |
@iabyn ping? |
On Tue, Oct 06, 2020 at 08:47:42AM -0700, Hugo van der Sanden wrote:
I welcome opinions on whether this is a sane approach to testing
optimization flags for regexps before I start adding lots more tests. I
can feel a lot of TODOs coming on ...
@iabyn of particular interest is whether this approach is going to be a
problem for your regexp substring plans. Should I change the return data
for re::optimization to provide substrs as an arrayref right now? Or are
those plans far enough out that I shouldn't worry about them?
You can view, comment on, or merge this pull request online at:
Such plans are so far off that you might as well go with anchored/floating
etc for now. So this gets a +1 from me.
Just some general comments on data file format:
- we should allow lines containing comments: i.e. 'next if /^\s*#/'
although they should still do a pass() so the test number corresponds to
the line number within __DATA__.
- possibly extend comments to any line: i.e. s/#.*$// (unless it's likely
people will need to write patterns or data containing '#'s)? If we do
this, then trailing comments will be delineated by a # rather than a
tab.
- should the field separator be whitepace rather than tabs?
…--
Counsellor Troi states something other than the blindingly obvious.
-- Things That Never Happen in "Star Trek" #16
|
Thanks @iabyn.
Yes, good idea; I'll add this, and extend it to blank lines too.
I'm not so sure about these: it is no great hardship to write I do realise that it can be a little inconvenient to insert tabs if one's editor is set up to avoid them in perl's source (as mine is), but to me the benefits seemed to outweigh that. |
Is this ready for merge? |
With a varying number of tests per data line, the plan is too much work to maintain.
Say why we're skipping; skip min/max tests for substrings if we didn't get the substring; skip checking test for substrings if we didn't get the substring we expect to be checked.
Probably, yes; sorry, I've had no spare capacity in recent weeks. I've rebased and pushed the latest state, which improves the handling of skips somewhat, and adds more tests: there are now 32 patterns generating 586 tests of which 14 are TODO; it would be useful also if you (@khwilliamson) could take a look at those TODOs and get a sense a) whether you agree the expected results are what we'd ideally want, and b) where those 14 cases fall on the spectrum between nice-to-have and bug. I don't think anything need gate on that though, I think it's probably fine to merge as is. |
I welcome opinions on whether this is a sane approach to testing optimization flags for regexps before I start adding lots more tests. I can feel a lot of TODOs coming on ...
@iabyn of particular interest is whether this approach is going to be a problem for your regexp substring plans. Should I change the return data for re::optimization to provide substrs as an arrayref right now? Or are those plans far enough out that I shouldn't worry about them?