
Pool Connections in RGWAdmin Client #48

Merged
merged 2 commits into UMIACS:master on May 19, 2020

Conversation

Dorthu
Contributor

@Dorthu Dorthu commented May 18, 2020

It's common in my usage to create an RGWAdmin instance and use it for
multiple requests. As implemented now, this requires creating a new
connection per request, which isn't horribly slow but it's not as fast
as keeping the TCP connection open.

This change uses a `requests.Session` to keep the TCP connection open so
that multiple requests being made to the same endpoint don't have to
renegotiate the connection. This results in a small, but appreciable speed
improvement.

I tested the difference by making a client with pooling enabled and one
without and hitting the same admin endpoint 100 times in a loop, timing
the entire operation. The results were:

No pooling: ~19s
Pooling: ~6s
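
Roughly, the comparison looked like this — a simplified sketch rather than the exact script (the endpoint is a placeholder and request signing is omitted):

import time
import requests

URL = 'https://rgw.example.com/admin/usage?format=json'  # placeholder endpoint

# No pooling: every call negotiates a new TCP (and TLS) connection.
start = time.time()
for _ in range(100):
    requests.get(URL)
print('no pooling:', time.time() - start)

# Pooling: one Session reuses the same underlying connection.
session = requests.Session()
start = time.time()
for _ in range(100):
    session.get(URL)
print('pooling:', time.time() - start)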

I made this configurable - it's enabled by default since I assume
this is the common usage pattern, but it can be disabled and fall back
to the old behavior.
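
In client terms, the change boils down to something like this sketch (illustrative names only — see the diff for the actual implementation):

import requests

class PooledClient:
    def __init__(self, endpoint, pool_connections=True):
        self._endpoint = endpoint
        # With pooling enabled, every request goes through one shared
        # Session, so the TCP connection (and TLS handshake) is reused.
        self._session = requests.Session() if pool_connections else None

    def _request(self, method, path, **kwargs):
        url = self._endpoint + path
        if self._session is not None:
            return self._session.request(method, url, **kwargs)
        # Old behavior: a fresh connection for every call.
        return requests.request(method, url, **kwargs)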

@liammonahan
Contributor

Hi @Dorthu

This is a great idea. I am going to test whether this change has any implications for stale connections held open for long periods of time. I want to know how the exception handling behaves in those cases, and whether we need to catch any other edge cases that were possible but unlikely to manifest in the non-pooled configuration.

I will try to look at this today or tomorrow. Let me know if you have additional thoughts.

Thanks,
Liam

@liammonahan liammonahan self-assigned this May 18, 2020
@Dorthu
Contributor Author

Dorthu commented May 18, 2020

I didn't think to test that - I'll give that a shot now as well.

@liammonahan
Contributor

So for some background, we have a test and a prod cluster that are fronted by HAProxy, which load balances to the rgw origins. Since v1.5, HAProxy is supposed to attempt keep-alive on both the frontend and the backend by default. I also explicitly added enable_keep_alive=yes to the rgw_frontends config option in Ceph.

If I make calls in a loop with 9 seconds or fewer in between everything works fine with connection pooling. Once I go to 10 seconds, I get connection errors: "Remote end closed connection without response"
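
The failure is easy to surface with a plain Session as well; roughly (the endpoint is a placeholder):

import time
import requests

session = requests.Session()
url = 'https://rgw.example.com/admin/info'  # placeholder

# If the idle gap exceeds the load balancer's keep-alive timeout, the
# pooled connection is closed server-side and the next call fails with
# a ConnectionError ("Remote end closed connection without response").
for delay in (5, 9, 10, 15):
    session.get(url)
    time.sleep(delay)
    try:
        session.get(url)
        print(delay, 'ok')
    except requests.exceptions.ConnectionError as exc:
        print(delay, 'failed:', exc)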

I also added this and it did not seem to have the intended effect:

if pool_connections:
    self._session = requests.Session()
    self._session.headers.update({'Connection': 'keep-alive'})

Just adding this as a data point. What does your load-balancing chain look like for your use case?

Thanks,
Liam

@Dorthu
Contributor Author

Dorthu commented May 19, 2020

We're using OpenResty (nginx) to load balance to the rgws, and I haven't seen any issues like that in our setup. That said, the cluster I'm testing against is having some unrelated issues at the moment, so I haven't been able to try the loop you described.

Looks like this has the potential to cause issues for some deployments,
so it should be opt-in.
@Dorthu
Contributor Author

Dorthu commented May 19, 2020

Alright, I sorted out the issue with the test cluster, and ran the following without issue:

In [16]: from time import sleep

In [17]: for i in range(10):
    ...:     client.get_bucket()  # has pool_connections enabled
    ...:     sleep(10)

I tested sleeps between 5 and 20 seconds, all worked as expected.

That being said, if there's a possibility this is going to cause mysterious failures in some setups, I'd rather err on the side of caution and default it to disabled; I wouldn't want to break existing systems because they upgraded.

@liammonahan
Contributor

So I found that HAProxy has a default keep-alive timeout of 10 seconds. I'm guessing other people will have other load balancers with a wide array of different options. I think defaulting to connection pooling off is the right call.

I am planning to add some documentation to rgwadmin this week and can include a section on this. I think the next thing this feature needs in order to be robust is connection re-establishment: if the library could detect that the pooled connection had been closed and reconnect transparently, that would be neat.
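
One possible shape for that, purely as a sketch (nothing like this is in the library yet): catch the connection error and retry the call once through the same Session, which makes urllib3 open a fresh connection.

import requests

def request_with_retry(session, method, url, **kwargs):
    # A pooled connection that was closed while idle typically surfaces
    # as a ConnectionError; retrying through the same Session opens a
    # fresh TCP connection transparently to the caller.
    try:
        return session.request(method, url, **kwargs)
    except requests.exceptions.ConnectionError:
        return session.request(method, url, **kwargs)

Mounting an HTTPAdapter with a urllib3 Retry policy could push something similar down into the transport layer, though retrying non-idempotent admin calls would need care.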

Merging now. Thanks.

@liammonahan liammonahan merged commit bef0e93 into UMIACS:master May 19, 2020
@liammonahan
Contributor

Released in v2.3.1.

https://pypi.org/project/rgwadmin/2.3.1/#files

@liammonahan
Contributor

I added a note about connection pooling to the docs here: https://rgwadmin.readthedocs.io/en/latest/rgwadmin/user-guide.html#connection-pooling
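
For quick reference, opting in looks roughly like this (the pool_connections name matches the change in this PR; the user guide above is authoritative):

from rgwadmin import RGWAdmin

# Reuses one TCP connection across admin calls; leave it off (the
# default) if a load balancer in front of rgw closes idle connections.
rgw = RGWAdmin(
    access_key='ACCESS_KEY',
    secret_key='SECRET_KEY',
    server='rgw.example.com',
    pool_connections=True,
)

print(rgw.get_bucket())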
