Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pool methods override context #1287

Closed
fredczj opened this issue Aug 24, 2022 · 5 comments · Fixed by jackc/puddle#21
Closed

Pool methods override context #1287

fredczj opened this issue Aug 24, 2022 · 5 comments · Fixed by jackc/puddle#21
Labels

Comments

@fredczj
Copy link

fredczj commented Aug 24, 2022

Describe the bug

Context cancelled does cancel Ping (maybe others methods?) operation systematically.

It does 1 time. But then the Ping call hangs.

I observe this compared on v4.17.0. This sounds to be a regression from v4.16.1. But maybe v4.16.1 was defective and I used the bug as a feature… 

To Reproduce
You initially connect to a DB to achieve a connection. Then you simulate a problem with the connection by shutting (pausing) de DB image.

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/jackc/pgx/v4/pgxpool"
)

func main() {
	config, err := pgxpool.ParseConfig("user=jami password=roquai host=localhost port=5432 dbname=dbtest")
	if err != nil {
		log.Fatal(err)
	}

	conn, _ := pgxpool.ConnectConfig(context.Background(), config)

	closeCtx, _ := context.WithTimeout(context.Background(), 10*time.Second)

	// Your code here...
	for closeCtx.Err() == nil {
		qctx, _ := context.WithTimeout(context.Background(), 2*time.Second)
		if errPing := conn.Ping(qctx); errPing == nil {
			fmt.Println("DB is here")
		} else {
			fmt.Println("DB is not here")
		}
	}

	fmt.Println("leaving")
}

Expected behavior
Ping should not hang. Here, it overrides the context cancellation.

Actual behavior
Once the DB shuts down, the Ping returns an error (timed-out). Then it hangs.

Version

  • Go: $ go version : go version go1.18.3 darwin/amd64
  • PostgreSQL: I use in i docker-compose:
   image: postgres:12-alpine
    environment:
      - POSTGRES_USER=jami
      - POSTGRES_PASSWORD=roquai
      - POSTGRES_DB=dbtest
    ports:
      - "5432:5432"
  • pgx: $ grep 'github.com/jackc/pgx/v[0-9]' go.mod -> v4.17.0

Additional context
v4.16.1 was acting the expected way. v4.17.0 introduces this different behavior.

I use this ping tips to check the connection status to a database after a query error in order to handle load-balancing over replicas, and error management.

Therefore, when I collect the query error correctly (as indicated above, the first Ping without DB returns an error), but the Ping to check the DB hangs forever (until the database reconnects at least). But as I lose the hand, I cannot update the database status.

@fredczj fredczj added the bug label Aug 24, 2022
@jackc
Copy link
Owner

jackc commented Aug 25, 2022

There was a small change in how context cancellation is handled in pgxpool in v4.17.0. But its not obvious how that change would have caused this.

I don't have docker on my machine, so I can't run your example the exact same way you do. It works when running on MacOS and simply stopping the local database server.

...

Was finally able to duplicate it by connecting to a remote server severing the network connection with a firewall rule.

...

Okay, I think I have figured it out.

@jameshartig It looks like this was broken by #1259 / 91c9e84. Establishing a connection is no longer affected by context, but that means that Ping() or Query() or anything else that can trigger the creation of a new connection will block waiting for that new connection even if the ctx is cancelled.

Looked at it for a while but did not see any simple fix.

@jameshartig
Copy link
Contributor

jameshartig commented Aug 25, 2022

@jackc this might need to be fixed in puddle. Ideally it could create the resource in a goroutine and select against the ctx.Done. The only issue with the goroutine is that it'll get hard to ensure it doesn't go over the max. I see that it adds to the pool before creation so the goroutine shouldn't be an issue. I'll work on a change and see what you think.

Also, with your latest changes it shouldn't hang forever. It should only hang until the connect timeout or 20 seconds, right?

@jameshartig
Copy link
Contributor

@jackc created an MR for puddle here: jackc/puddle#21

@jackc
Copy link
Owner

jackc commented Aug 27, 2022

Fix is in release v4.17.1.

@fredczj
Copy link
Author

fredczj commented Aug 30, 2022

I confirm it works

Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants