Deparse forgets use utf8 #11334
From [email protected]

This should certainly be emitting a use utf8 at the top:

% perl -CS -MO=Deparse,-p -E 'say "\N{U+3b1}-\N{U+3c9}"'

--tom

Summary of my perl5 (revision 5 version 12 subversion 3) configuration:
Characteristics of this binary (from libperl):
From @cpansprout

On Sat May 14 14:14:45 2011, tom christiansen wrote:

If it were to put ‘use utf8’ at the top, would it not make sense for it
After all, ‘use utf8’ indicates that the *bytes* that follow consist of
And should it behave differently depending on whether the output is

-- Father Chrysostomos
The RT System itself - Status changed from 'new' to 'open'
From @cpansprout

On Thu Jan 05 14:04:42 2012, sprout wrote:

Don’t forget that (under use v5.16) eval("'\x{100}'") does the same

-- Father Chrysostomos
From @ikegami

On Thu, Jan 5, 2012 at 5:04 PM, Father Chrysostomos via RT <

That's not relevant. The issue is that the string built by the program

$ perl -E'$_="\N{U+3b1}-\N{U+3c9}"; say length;'
$ perl -MO=Deparse -E'$_="\N{U+3b1}-\N{U+3c9}"; say length;' | perl
From @cpansprout

On Thu Jan 05 14:24:41 2012, ikegami@adaelis.com wrote:

Note the wide character warning. If one were to eval() the string
It makes sense to me to *encode* output as utf8 by default, with ‘use

-- Father Chrysostomos
From @ikegami

On Thu, Jan 5, 2012 at 5:39 PM, Father Chrysostomos via RT <

Agree.

> It makes sense to me to *encode* output as utf8 by default, with ‘use utf8’.

Agree.
From @ap

* Father Chrysostomos via RT <perlbug-followup@perl.org> [2012-01-05 23:05]:

I think Deparse needs to do better for strings. The output here should
BEGIN {
That would be independent of encodings (well, beyond… ASCII I guess) as
Currently Deparse actually does the opposite transform for strings – if
(Maybe it should even use \N by default. In fact I would be sure, if it

Regards,
From @cpansprout

On Thu Jan 05 16:33:41 2012, aristotle wrote:

What about symbol names?

-- Father Chrysostomos
From @ap

* Father Chrysostomos via RT <perlbug-followup@perl.org> [2012-01-06 01:45]:

They don’t have an easy answer I can think of.

The subset of lexical variable names sort of has one, insofar as they
But no generalised solution for all identifiers comes to mind.
Then again identifiers should be getting normalised anyway (which Brian

Regards,
From @cpansprout

This ticket is about whether B::Deparse output should use "\x{100}" or "Ā" and whether the latter should be encoded or not and whether the output should include ‘use utf8’.

On Thu Jan 05 19:30:00 2012, aristotle wrote:

To make things more complex: What about /(?<айдэнтыфайер>)/? You can’t escape those characters, because you get a syntax error. You can’t change them, because they correspond to hash keys.

Also, the question as to whether coderef2text output should be evallable or evalbytesable is still unanswered. (My gut feeling is that output from -MO=Deparse should be a stream of bytes, so it can be output without wide char warnings:

$ ./perl -Ilib -MO=Deparse -e 'use utf8; our $фу'

But that coderef2text should be a Unicode string so it can be fed to ‘eval’.)

-- Father Chrysostomos
From @cpansprout

On Wed Dec 10 22:00:32 2014, sprout wrote:

And here is a similar issue:

use utf8;

I recently made it so that the "Böck" is output with an escape, just to avoid malformation errors. (It was being emitted as Latin-1, so the output fed back to perl resulted in corrupt strings.) But now the problem is that the test (from t/re/pat.t) fails, because we no longer have a utf8-flagged string.

Granted, this test is too sensitive, in that it is checking the internal storage of a scalar. But this is a *core* test that just ensures that the tests that follow are testing what we think they are testing.

This is another case where the core tests don’t lend themselves to being deparsed and re-run.

-- Father Chrysostomos
From @ap

* Father Chrysostomos via RT <perlbug-followup@perl.org> [2014-12-11 07:05]:

Ugh. *scrunchface* Your nose for lurking evil is just too good… :-)

Now, what answer do you expect? If that leaves no other option, then it

Certainly.

Seems a wash outside of the usability issue that people are probably

* Father Chrysostomos via RT <perlbug-followup@perl.org> [2014-12-11 07:20]:

Is that testing the regexp engine or the parser? If it’s not testing the
It should still assert that the flag has the required value, of course,
I don’t like perl making promises that particular forms of writing the

So as far as I care, this is a bug in the test. Not a bug in Deparse.

— • —

OTOH, if the test *were* trying to test the parser, I would say this is
It’s one thing for Deparse to preserve the exact semantics of a program.
But because there are many semantically identical representations of any
So at best you can test that Deparse re-deparses them consistently after

Regards,
What this does in 5.35.10 is

I think the behavior is acceptable, and am taking this ticket for the purpose of closing if I don't hear objections by May 31, 2024.
Migrated from rt.perl.org#90590 (status was 'open')
Searchable as RT90590$