-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Add copying where it might be necessary #4118
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall approach looks good to me - thanks for working on this!
One quick tip - please make your pull request(s) from a branch, not from master - https://hynek.me/articles/pull-requests-branch/ explains why this is important. I think in this case you'll need to close the PR, make a branch with your commits, and open a new one from that branch.
We want to get any variables that might be modified in a way that might | ||
cause side-effects so that we can copy them. | ||
|
||
A variable are potentially modified if: | ||
- a method is called on it, an attribute under it, or an index | ||
of it is anywhere on the LHS of an ast.Assign or an ast.AugAssign | ||
- a method called on it, an attribute of it, or an index of it | ||
- Do we want to count methods on the RHS? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd only count method calls if they're top-level statements, not part of an assignment or inside an index or whatever - the heuristic being that methods typically either mutate state, xor have a return value.
arg[index] = ...
arg.attr = ...
arg.method()
result = arg.other_method() # probably safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely convinced that this heuristic works-- I think it is a safe assumption that a standard accessor doesn't modify anything (which is not necessarily true, but is close enough to true) but the pattern of a method that mutates and then returns is one that's relatively common-- for instance, even in the standard library, IO.read
returns a value and modifies the offset.
Files are, of course, a case where side effects exist outside of the program where copying wouldn't help, but the pattern itself is very well established in Python, and this was a simple example!
if kind is inspect.Parameter.VAR_POSITIONAL: | ||
# TODO: I don't know what correct behavior is here | ||
pass | ||
if kind is inspect.Parameter.VAR_KEYWORD: | ||
# TODO: I don't know what correct behavior is here | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We never ghostwrite inputs for these, so no need to support them 🙂
if copy_args: | ||
potentially_modified = potentially_modified_vars(ast.parse(inspect.getsource(func)).body[0], set(params_dd.keys())) | ||
else: | ||
potentially_modified = set() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably want to deepcopy all arguments unconditionally if copy_args=True
, and autodetect on =None
.
The code would be a bit cleaner if we also do a from copy import deepcopy
, meaning this function would report a str, set-of-imports
tuple - there are plenty of examples you can see of how that works.
I'm making the new PR right now! I'll act on the comments here, no need to rewrite them, but I will be copying over the whole description |
NOTE: This is not ready for review, I have a significant amount of documentation to add, I need to update the test suite, and I missed several cases that I am writing out below. I am posting this early to receive feedback on the broad strokes of the approach, the implementation is (right now) entirely incorrect.
Description
A proposed fix for #4113 that identifies what parameters might be mutated within a function that Ghostwriter is called on. To prevent side effects (particularly in equivalence mode), adds a call to
copy.deepcopy
to the parameter in the test function.Right now, I mark a variable as potentially modified if:
ast.Assign
orast.AugAssign
that is not inside anast.Subscript
orast.Slice
. This includes when it is being indexed into or an attribute of it is being assignedCall
ed. Even if this isn't on the left-hand side of an assignment, it is safe to assume that any call may create side effects.Todo
value
ofast.Subscript
orslice
inast.Slice
potentially_modified_vars