give kinda helpful message if too many open files #1110

Open
wants to merge 1 commit into
base: main
from

leondz
Collaborator

@leondz leondz commented Feb 24, 2025

OS can get upset if parallel_attempts goes too high. Give a clearer error message about this.

(garak) 09:13:05 x1:~/dev/garak [main] $ python -m garak -m nim -n meta/llama-3.2-3b-instruct -p phrasing.PastTenseMini --parallel_attempts 1000 -g 5
garak LLM vulnerability scanner v0.10.2.post1 ( https://github.com/NVIDIA/garak ) at 2025-02-24T09:13:12.943850
📜 logging to /home/lderczynski/.local/share/garak/garak.log
🦜 loading generator: NIM: meta/llama-3.2-3b-instruct
📜 reporting to /home/lderczynski/.local/share/garak/garak_runs/garak.fb21a28e-16c8-4496-bd9e-b0f694333003.report.jsonl
🕵️  queue of probes: phrasing.PastTenseMini
probes.phrasing.PastTenseMini:   0%|                                                                                                                        | 0/200 [00:00<?, ?it/s]Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/lderczynski/dev/garak/garak/__main__.py", line 14, in <module>
    main()
  File "/home/lderczynski/dev/garak/garak/__main__.py", line 9, in main
    cli.main(sys.argv[1:])
  File "/home/lderczynski/dev/garak/garak/cli.py", line 594, in main
    command.probewise_run(
  File "/home/lderczynski/dev/garak/garak/command.py", line 237, in probewise_run
    probewise_h.run(generator, probe_names, evaluator, buffs)
  File "/home/lderczynski/dev/garak/garak/harnesses/probewise.py", line 107, in run
    h.run(model, [probe], detectors, evaluator, announce_probe=False)
  File "/home/lderczynski/dev/garak/garak/harnesses/base.py", line 123, in run
    attempt_results = probe.probe(model)
                      ^^^^^^^^^^^^^^^^^^
  File "/home/lderczynski/dev/garak/garak/probes/base.py", line 219, in probe
    attempts_completed = self._execute_all(attempts_todo)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lderczynski/dev/garak/garak/probes/base.py", line 181, in _execute_all
    with Pool(_config.system.parallel_attempts) as attempt_pool:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/pool.py", line 215, in __init__
    self._repopulate_pool()
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/pool.py", line 306, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/pool.py", line 329, in _repopulate_pool_static
    w.start()
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/context.py", line 282, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/lderczynski/anaconda3/envs/garak/lib/python3.12/multiprocessing/popen_fork.py", line 65, in _launch
    child_r, parent_w = os.pipe()
                        ^^^^^^^^^
OSError: [Errno 24] Too many open files
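
The diff itself is not shown here, so the following is only a minimal sketch of the kind of guard described above, not the code in this PR. It wraps pool creation (the `Pool(...)` call seen in `_execute_all` in the traceback) and turns errno 24 into a message that names `parallel_attempts`. The helper name `open_pool`, the `pool_factory` parameter, and the message wording are assumptions made for illustration.

```python
import errno
import logging
from multiprocessing import Pool


def open_pool(parallel_attempts, pool_factory=Pool):
    """Create a worker pool, translating "too many open files" into a clearer error.

    `open_pool` and `pool_factory` are hypothetical names; `pool_factory`
    exists only so the guard can be exercised without starting processes.
    """
    try:
        return pool_factory(parallel_attempts)
    except OSError as e:
        if e.errno == errno.EMFILE:  # errno 24, "Too many open files"
            msg = (
                f"parallel_attempts={parallel_attempts} needs more open files "
                "than the OS allows; try a lower value or raise the "
                "open-file limit"
            )
            logging.critical(msg)  # so the message reaches the log as well
            raise RuntimeError(msg) from e
        raise  # any other OSError is left untouched
```

A caller would then write `with open_pool(n) as attempt_pool:` in place of `with Pool(n) as attempt_pool:`. Guarding only the constructor keeps an `OSError` raised later by the work itself from being mislabelled.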

Verification

List the steps needed to make sure this thing works

  • Run garak -m test -p test.Test --parallel_attempts 1000; the new error should appear on the CLI and in the log. If it doesn't, try a higher number or lower the open-file limit (ulimit -n).
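
For anyone reproducing this, the open-file limit can also be inspected and changed from Python on Unix-like systems with the standard `resource` module. This is a sketch under that assumption (the module is not available on Windows), and the helper names are hypothetical. The change applies to the current process and any children it starts afterwards, so it only affects a run launched from that same process.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def set_soft_limit(new_soft):
    """Set the soft open-file limit, capped at the hard limit.

    Returns the previous soft limit so the caller can restore it.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no cap"; it may be -1, so avoid comparing with min().
    target = new_soft if hard == resource.RLIM_INFINITY else min(new_soft, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return soft
```

Calling `set_soft_limit(256)` before starting a large pool in the same process should make the errno 24 failure easier to trigger with a smaller `parallel_attempts` value.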

Collaborator

@jmartin-tech jmartin-tech left a comment

This looks reasonable to me for parallel_attempts. What are your thoughts on adding a similar guard in generators/base.py for parallel_requests as well?

In theory, if both were set, the error would bubble up from the generator sub-processes. However, since parallel_requests is independent, a generator that requires a single request per call could produce a similar error even when parallel_attempts is not set.

At the same time, I wonder about the value of catching OSError like this. Are we going down a path that will require additional handlers for various resource-limit errors across supported operating systems?

Consider the command used to test this: run on a Windows install with only 4GB of RAM, it can raise:

  File "C:\Users\Win10x64\miniconda3\Lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Users\Win10x64\miniconda3\Lib\multiprocessing\context.py", line 337, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Win10x64\miniconda3\Lib\multiprocessing\popen_spawn_win32.py", line 75, in __init__
    hp, ht, pid, tid = _winapi.CreateProcess(
                       ^^^^^^^^^^^^^^^^^^^^^^
OSError: [WinError 1455] The paging file is too small for this operation to complete
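
To make the maintenance question concrete, here is a rough sketch of what a per-platform lookup could look like. It is illustrative only and not a proposal for the actual code; the table contents, the function name `resource_hint`, and the hint wording are all assumptions. It covers just the two failures shown in this thread. The `winerror` attribute exists on `OSError` only on Windows, hence the `getattr`.

```python
import errno

# Hypothetical tables: each new platform-specific failure would need a new entry.
_ERRNO_HINTS = {
    errno.EMFILE: "too many open files; lower parallelism or raise the open-file limit",
}
_WINERROR_HINTS = {
    1455: "paging file too small; lower parallelism or free up memory",
}


def resource_hint(exc):
    """Return a short hint for a known resource-limit OSError, or None."""
    # winerror is absent on non-Windows platforms, so fall back to None.
    winerror = getattr(exc, "winerror", None)
    if winerror in _WINERROR_HINTS:
        return _WINERROR_HINTS[winerror]
    return _ERRNO_HINTS.get(exc.errno)
```

Anything not in the tables returns `None`, so the caller can re-raise the original exception unchanged. The cost is the one raised above: the tables grow with every OS-specific limit users run into.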
