Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some caching occurs when using pylint in a server like mode. #129

Closed
ChienTeLee opened this issue Jun 15, 2022 · 25 comments · Fixed by #267
Closed

Some caching occurs when using pylint in a server like mode. #129

ChienTeLee opened this issue Jun 15, 2022 · 25 comments · Fixed by #267
Assignees
Labels
bug Issue identified by VS Code Team member as probable bug info-needed Issue requires more information from poster triage-needed Issue is not triaged.

Comments

@ChienTeLee
Copy link

Hi, I just installed this extension and work on python 3.8.11. When I use relative import to import a file which have the same name with another file in other folder, pylint cannot import correctly and showing false positives. This happens globally, and I think it is the same issue with this one: pylint-dev/pylint#5645.

@github-actions github-actions bot added the triage-needed Issue is not triaged. label Jun 15, 2022
@ChienTeLee ChienTeLee changed the title import error [Bug] import error Jun 15, 2022
@karthiknadig karthiknadig self-assigned this Jun 15, 2022
@karthiknadig
Copy link
Member

@ChienTeLee This extension shows whatever we get from pylint. Try running pylint from the terminal and see if you get different results.

@karthiknadig karthiknadig added bug Issue identified by VS Code Team member as probable bug info-needed Issue requires more information from poster labels Jun 15, 2022
@DanielNoord
Copy link

DanielNoord commented Jun 24, 2022

I tried to create a reproducer for #132, but found that the extensions creates a different result compared to a manual invocation. I think it might be related to the bug reported here, but I'm not sure.

For:

cat test.py
def func(param: int) -> int:
    return paramcat test2.py
import test

test.func(1)

Pylint output:

/Users/daniel/.pyenv/versions/3.10.0/envs/pylint-3.10.0/bin/python -m pylint --reports=n --output-format=json --from-stdin /Users/daniel/DocumentenLaptop/Programming/Github/pylint/test2.py
CWD Linter: /Users/daniel/DocumentenLaptop/Programming/Github/pylint
file:///Users/daniel/DocumentenLaptop/Programming/Github/pylint/test2.py :
[
    {
        "type": "error",
        "module": "test2",
        "obj": "",
        "line": 3,
        "column": 0,
        "endLine": 3,
        "endColumn": 9,
        "path": "test2.py",
        "symbol": "no-member",
        "message": "Module 'test' has no 'func' member",
        "message-id": "E1101"
    }
]

Manual:

❯ /Users/daniel/.pyenv/versions/3.10.0/envs/pylint-3.10.0/bin/python -m pylint --reports=n --output-format=json /Users/daniel/DocumentenLaptop/Programming/Github/pylint/test2.py
[]

Note that I removed --from-stdin. I think we looked into --from-stdin previously, but there might still be something broken about that flag.

Edit: I also tried with --from-stdin:

cat test2.py | /Users/daniel/.pyenv/versions/3.10.0/envs/pylint-3.10.0/bin/python -m pylint --reports=n --output-format=json --from-stdin /Users/daniel/DocumentenLaptop/Programming/Github/pylint/test2.py
[]

So same difference..

@karthiknadig
Copy link
Member

karthiknadig commented Jun 24, 2022

@DanielNoord Can you dump what pylint sees as sys path in the manual run case. I think that might be the problem. Extension is not setting up the sys.path, as pylint would expect to see it.

You should be able to find the sys.path that the extension uses at the beginning of the logs.

@DanielNoord
Copy link

cat test2.py | /Users/daniel/.pyenv/versions/3.10.0/envs/pylint-3.10.0/bin/python -m pylint --reports=n --output-format=json --from-stdin /Users/daniel/DocumentenLaptop/Programming/Github/pylint/test2.py
['/Users/daniel/DocumentenLaptop/Programming/Github/pylint',
'/Users/daniel/.pyenv/versions/3.10.0/lib/python310.zip',
'/Users/daniel/.pyenv/versions/3.10.0/lib/python3.10', 
'/Users/daniel/.pyenv/versions/3.10.0/lib/python3.10/lib-dynload',
'/Users/daniel/.pyenv/versions/3.10.0/envs/pylint-3.10.0/lib/python3.10/site-packages',
'/Users/daniel/DocumentenLaptop/Programming/Github/pylint',
'/Users/daniel/DocumentenLaptop/Programming/Github/astroid']
[]

Just before we actually start checking the file.


diff --git a/pylint/lint/pylinter.py b/pylint/lint/pylinter.py
index e2bbf061e..092bdd6f9 100644
--- a/pylint/lint/pylinter.py
+++ b/pylint/lint/pylinter.py
@@ -715,6 +715,9 @@ class PyLinter(
         :param FileItem file: data about the file
         :raises AstroidError: for any failures stemming from astroid
         """
+        import sys
+
+        print(sys.path)
         self.set_current_module(file.name, file.filepath)
         # get the module representation
         ast_node = get_ast(file.filepath, file.name)

@DanielNoord
Copy link

sys.path of extension:

   /Users/daniel/.vscode/extensions/ms-python.pylint-2022.3.11671003/bundled/linter
   /Users/daniel/.pyenv/versions/3.10.0/lib/python310.zip
   /Users/daniel/.pyenv/versions/3.10.0/lib/python3.10
   /Users/daniel/.pyenv/versions/3.10.0/lib/python3.10/lib-dynload
   /Users/daniel/.pyenv/versions/3.10.0/envs/pylint-3.10.0/lib/python3.10/site-packages
   /Users/daniel/DocumentenLaptop/Programming/Github/pylint
   /Users/daniel/DocumentenLaptop/Programming/Github/astroid
   /Users/daniel/.vscode/extensions/ms-python.pylint-2022.3.11671003/bundled/libs

@karthiknadig How can I read anything printed to sys.stdout from the Extension?

@DanielNoord
Copy link

I noticed something strange as well. I'm using an editable instal pip install -e . in my environment. The extensions picks this up and shows that it is using both my local version of astroid (our underlying interpreter) and pylint.

However, whenever I make changes to those installs they don't get used by the extension. I need to restart the server for those changes to take effect.
So, when I tried to print out sys.path using the diff above for what the extensions uses (so not what the extension reports, but what pylint actually sees during a run) I had to restart the server for the print statement to take effect. So some element of the linter is being kept around in between server runs. This could create a "cache" that pylint doesn't know how to handle.

Furthermore, the sys.path reported by the extension and the sys.path used by pylint when run from the extension are not the same. For the sys.path above, using the diff from two posts previous I actually get:

['/Users/daniel/DocumentenLaptop/Programming/Github/pylint',
'/Users/daniel/.pyenv/versions/3.10.0/lib/python3.10',
'/Users/daniel/.pyenv/versions/3.10.0/lib/python3.10/lib-dynload',
'/Users/daniel/.pyenv/versions/3.10.0/envs/pylint-3.10.0/lib/python3.10/site-packages', 
/Users/daniel/DocumentenLaptop/Programming/Github/astroid',
'/Users/daniel/DocumentenLaptop/Programming/Github/pylint',
'/Users/daniel/.vscode/extensions/ms-python.pylint-2022.3.11671003/bundled/libs']

This could be because of the changes to sys.path pylint makes, but I think it might be useful to know that these are not the same.

@karthiknadig
Copy link
Member

@DanielNoord for diagnostic purposes, I dump everything that i get from sys.stdout each time to the Output>pylint. what you see in the logs when the linter runs on a file is exactly what we got from sys.stdout. result.stdout is the content of the substituted stream for stdout.

LSP_SERVER.show_message_log(f"{document.uri} :\r\n{result.stdout}")

If you want verbose logging then add "pylint.trace.server": "verbose", to your VSCode user (not workspace) settings.

@karthiknadig
Copy link
Member

So, when I tried to print out sys.path using the diff above for what the extensions uses (so not what the extension reports, but what pylint actually sees during a run) I had to restart the server for the print statement to take effect. So some element of the linter is being kept around in between server runs. This could create a "cache" that pylint doesn't know how to handle.

I believe this might be due to runpy and extension never unloading pylint after a running. The extension, loads pylint once, and reuses it. In the normal scenario, pylint would run, and exit, and the only way some state was persisted was by writing to a file. Now that pylint runs like a server, it seems some state is being re-used.

@DanielNoord
Copy link

Yeah, some of the caching in astroid depends on global states so that is likely causing issues then. I think it would take quite some time before we can untangle all the different states pylint and astroid need/use before we can smoothly support a server-like run. Is there anything we can do with runpy to avoid this? Like passing a flag to re-load?

@karthiknadig
Copy link
Member

@brettcannon Any suggestions here on how we can do reload safely?

@karthiknadig
Copy link
Member

For completeness I captured the sys.path for the ways of running pylint that we were looking at:

Running pylint as a module in terminal:

>type test2.py | python -m pylint --reports=n --output-format=json --from-stdin test2.py
sys.path = [
    'C:\\GIT\\repro\\linter2',
    'C:\\all_pys\\Python37\\python37.zip',
    'C:\\all_pys\\Python37\\DLLs',
    'C:\\all_pys\\Python37\\lib',
    'C:\\all_pys\\Python37',
    'C:\\GIT\\repro\\linter2\\.venv',
    'C:\\GIT\\repro\\linter2\\.venv\\lib\\site-packages'
]

Using runpy.run_module

>type test2.py | python viarunpy.py --reports=n --output-format=json --from-stdin test2.py
sys.path = [
    'C:\\GIT\\repro\\linter2',
    'C:\\all_pys\\Python37\\python37.zip',
    'C:\\all_pys\\Python37\\DLLs',
    'C:\\all_pys\\Python37\\lib',
    'C:\\all_pys\\Python37',
    'C:\\GIT\\repro\\linter2\\.venv',
    'C:\\GIT\\repro\\linter2\\.venv\\lib\\site-packages'
]

Via the extension

The typescript part launches this python bundled/linter/linter_server.py. Pylint is eventually launched by the handler for file open using runpy.run_module.

sys.path = [
    'c:\\GIT\\repro\\linter2',
    'C:\\all_pys\\Python37\\DLLs',
    'C:\\all_pys\\Python37\\lib',
    'C:\\all_pys\\Python37',
    'c:\\GIT\\repro\\linter2\\.venv',
    'c:\\GIT\\repro\\linter2\\.venv\\lib\\site-packages',
    'c:\\GIT\\linters\\vscode-pylint\\bundled\\libs'
]

The only thing missing from the sys.path in the extension C:\\all_pys\\Python37\\python37.zip, not sure why that gets skipped. May be because it is not running in a terminal?

@brettcannon
Copy link
Member

Is there anything we can do with runpy to avoid this? Like passing a flag to re-load?

No, because runpy isn't directly designed for this scenario; it's meant to handle -m from Python and everything else is left up to the calling code.

One potential way to solve this would be to call importlib.reload() on appropriate modules before each call into Pylint, but there's two issues with that.

One, you have to do it for all relevant modules, not just astroid.

Two, this will only work if the code imported the module and not objects off the module. For instance, if astroid.CACHE was an object being used as a cache, the reload trick only works if you did import astroid. If you did from astroid import CACHE then it won't work as the module that did the import will hold a reference to the old cache object. Now, if the caching is entirely self-contained in astroid then you could extend what you reloaded to not only the module with the caching but all modules utilizing the cache.

@DanielNoord
Copy link

@jacobtylerwalls You have a better understanding of clearing caches within astroid than I do. Do you think there is something we can easily fix before we eventually go through the effort of allowing a server-like pylint process to run?

@jacobtylerwalls
Copy link

Any suggestions here on how we can do reload safely?

Should be as simple as calling clear_cache() on any AstroidManager object, which can be accessed as either astroid.MANAGER or pylint.lint.pylinter.MANAGER. See the recent (and well-hidden) documentation at the bottom of this page:

https://pylint.pycqa.org/en/latest/development_guide/api/pylint.html?highlight=between%20runs#edited-between-runs

This should be done after every pylint call. Can we try that here?

If that doesn't work, then there's still a global state somewhere we don't know about (or we need to look into Brett's tip about importing off the module and verify it's not that). This is all assuming a single-process scenario. There are many known issues with parallel mode in pylint and it's not really safe to use IMO.


@DanielNoord we could look into some usability improvements, either always clearing the cache at the end of a run, and/or providing a kwarg argument to opt in to retaining the cache. WDYT?

@DanielNoord
Copy link

@jacobtylerwalls Seems like an option --clear-cache-post-run (forgive the name) would indeed be useful. That would require some version checking for this plugin but might be more user friendly going forward.

@brettcannon
Copy link
Member

This should be done after every pylint call. Can we try that here?

Yep, that should be doable. We can probably handle it somewhere in

def _run_module(
module: str, argv: Sequence[str], use_stdin: bool, source: str = None
) -> RunResult:
"""Runs linter as a module."""
str_output = CustomIO("<stdout>", encoding="utf-8")
str_error = CustomIO("<stderr>", encoding="utf-8")
try:
with substitute_attr(sys, "argv", argv):
with redirect_io("stdout", str_output):
with redirect_io("stderr", str_error):
if use_stdin and source is not None:
str_input = CustomIO("<stdin>", encoding="utf-8", newline="\n")
with redirect_io("stdin", str_input):
str_input.write(source)
str_input.seek(0)
runpy.run_module(module, run_name="__main__")
else:
runpy.run_module(module, run_name="__main__")
except SystemExit:
pass
return RunResult(str_output.get_value(), str_error.get_value())
.

There are many known issues with parallel mode in pylint and it's not really safe to use IMO.

We are running single-threaded, so that's not a concern here (perk of Pylint being run in a language server and those out-of-process to VS Code itself).

@karthiknadig karthiknadig changed the title [Bug] import error Some caching occurs when using pylint in a server like mode. Jun 30, 2022
@bthorsted
Copy link

Is this still being investigated? I have the same issue when doing refactoring in a project. Every time I change the name of a class, function or method that is imported elsewhere, I need to reload pylint to clear the error that is sure to show up after I save the changes, which triggers linting of the open files.

@brettcannon
Copy link
Member

@DanielNoord did any flag ever end up in Pylint? Or should we just hack up a forced reload of the appropriate modules?

@jacobtylerwalls
Copy link

jacobtylerwalls commented Nov 19, 2022

I put pylint-dev/pylint#7802 up just now as a proof of concept. I'd like to come back and add a test. EDIT: done.

@DanielNoord
Copy link

Does anybody from the team want to try out the new flag? We could put it an a patch release and get it out relatively easily, but since your the main target for this flag we'd like to get confirmation that we did indeed fix the issue. If not, we could try and find something else!

@brettcannon
Copy link
Member

We might not have time to get to this until January, but we will definitely give it a try when we can! No need to do a point release just for us.

@DanielNoord
Copy link

Thanks for getting back! I'll guess we'll just merge then and release at some point before January. We think this fits the use case, but don't hesitate to tell us if we missed anything!

@karthiknadig
Copy link
Member

@DanielNoord Which version of pylint has the flag?

@jacobtylerwalls
Copy link

2.16, which hasn't been released yet. See pylint-dev/pylint#7802

@Pierre-Sassoulas
Copy link

We're planning on releasing 2.16.0 soonish ™️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue identified by VS Code Team member as probable bug info-needed Issue requires more information from poster triage-needed Issue is not triaged.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants