What's the reason for ensuring 2 tests per node? #876

PrincipalsOffice · 2023-02-10T23:36:28Z

https://github.com/pytest-dev/pytest-xdist/blob/master/src/xdist/scheduler/load.py#L266
The comment doesn't seem to explain why we can't always do round robin.

nicoddemus · 2023-02-11T00:03:40Z

Because of the pytest_runtest_protocol hook: either the hook is called with an item as nextitem (so the test session is still running) or it needs to be called with None, which signals the session is ending.

So each node needs to keep 2 tests around, because if more tests come, it can continue calling the hook passing item and nextitem, otherwise when it receives the "finish session" signal, it can run the two tests and finally call the hook with nextitem=None.

This is requirement is historical, I was not even a pytest maintainer when it was designed. @RonnyPfannschmidt might have some other insights.

RonnyPfannschmidt · 2023-02-11T03:45:49Z

Setupstate cache retaining needs nextitem

PrincipalsOffice · 2023-02-12T03:35:38Z

@nicoddemus @RonnyPfannschmidt Thanks for the context. So how does the other case work given this constraint then https://github.com/pytest-dev/pytest-xdist/blob/master/src/xdist/scheduler/load.py#L257?

amezin · 2023-02-12T10:25:57Z

First of all,

The comment doesn't seem to explain why we can't always do round robin.

To reuse session/class fixtures from the previous test as much as possible. IMO this comment explains it: https://github.com/pytest-dev/pytest-xdist/blob/master/src/xdist/scheduler/load.py#L266

Yes, you can do round-robin always. LoadScheduling had been doing it before v3.0: 80130d5 But the performance with round-robin wasn't optimal at all #834

Round-robin distribution isn't optimal even in the "less than two tests per node" case. I kept existing code only because this corner case didn't look too important for me.

What's the reason for ensuring 2 tests per node?

It's a separate question, but the answer is the same: to avoid unnecessary fixture setup/teardown.

67f80b7#diff-ecec88c33adb7591ee6aa88e29b62ad52ef443611cba5e0f0ecac9b5725afdbaR7

fix pytest/xdist issue503: make sure that a node has usually
two items to execute to avoid scoped fixtures to be torn down
pre-maturely (fixture teardown/setup is "nextitem" sensitive).

~~503 isn't a github issue number, I guess it's some other issue tracker that was in use before~~
pytest-dev/pytest#503

PrincipalsOffice · 2023-02-14T18:25:47Z

Yes, you can do round-robin always. LoadScheduling had been doing it before v3.0: 80130d5 But the performance with round-robin wasn't optimal at all

@amezin Is that because the tests in self.pending are grouped by the same module, so consecutive tests are more likely to share the same fixtures (which means round robin would forcefully separate them to different nodes)? I initially assumed that if most tests are long enough (e.g. 10mins+), the benefit of running them in individual nodes would outweight the costs of additional fixture setup/teardown.

amezin · 2023-02-15T16:05:44Z

Is that because the tests in self.pending are grouped by the same module

Not only by module. Pytest sorts tests, taking parametrization into account:

https://github.com/pytest-dev/pytest/blob/464f29901fd437d03a3434bbcda0918c9c330868/src/_pytest/fixtures.py#L266-L345

I initially assumed that if most tests are long enough (e.g. 10mins+), the benefit of running them in individual nodes would outweight the costs of additional fixture setup/teardown.

Not enough context. Are we talking about some specific test suite? How many tests, how many nodes?

amezin · 2023-02-15T19:51:52Z

If the default scheduler doesn't work well for you, try --dist worksteal. I'd be really interested in the feedback.

PrincipalsOffice · 2023-02-15T20:17:01Z

For cases like running a single test file with ~20 parallelizable tests that each takes 10+ mins to finish, it seems like round robin is the more ideal solution. But the discussion in #812 seems to suggest round robin is almost never a good choice. I agree that when there are way more tests than nodes, which is probably the most common case, round robin prob wouldn't help much at all.

RonnyPfannschmidt · 2023-02-16T07:43:10Z

From arguably subjective experience I arrived at the opinion that for fanning out pytest tests, round Robin is never right

amezin · 2023-03-01T19:52:39Z

@PrincipalsOffice so. Initially, you've created the issue about the comment not being good enough.

Do you believe it's still the case?

If you just want to discuss scheduling algorithms - I think, Discussions is a more appropriate place

amezin closed this as completed Mar 1, 2023

robertschweizer mentioned this issue Apr 6, 2023

Allow one test per node #894

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What's the reason for ensuring 2 tests per node? #876

What's the reason for ensuring 2 tests per node? #876

PrincipalsOffice commented Feb 10, 2023

nicoddemus commented Feb 11, 2023 •

edited

Loading

RonnyPfannschmidt commented Feb 11, 2023

PrincipalsOffice commented Feb 12, 2023

amezin commented Feb 12, 2023 •

edited

Loading

PrincipalsOffice commented Feb 14, 2023

amezin commented Feb 15, 2023

amezin commented Feb 15, 2023

PrincipalsOffice commented Feb 15, 2023

RonnyPfannschmidt commented Feb 16, 2023

amezin commented Mar 1, 2023

What's the reason for ensuring 2 tests per node? #876

What's the reason for ensuring 2 tests per node? #876

Comments

PrincipalsOffice commented Feb 10, 2023

nicoddemus commented Feb 11, 2023 • edited Loading

RonnyPfannschmidt commented Feb 11, 2023

PrincipalsOffice commented Feb 12, 2023

amezin commented Feb 12, 2023 • edited Loading

PrincipalsOffice commented Feb 14, 2023

amezin commented Feb 15, 2023

amezin commented Feb 15, 2023

PrincipalsOffice commented Feb 15, 2023

RonnyPfannschmidt commented Feb 16, 2023

amezin commented Mar 1, 2023

nicoddemus commented Feb 11, 2023 •

edited

Loading

amezin commented Feb 12, 2023 •

edited

Loading