-
-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"DiskFull" error when refreshing db views for service requests #1403
Comments
Hi @nichhk just trying to help you brainstorm and troubleshoot. I'm sure you did a lot of investigating however I was wondering if something in particular lead you to isolate this query to be the root cause of the error. Do we know the date the errors began to happen and what may have changed at that point in time? Have we eliminated the cause to be related to any other reason such as with running out of space we have on AWS or the docker container itself? Somebody had a similar issue and it had to do with the amount of space allocated in their docker container. There are a lot of possibilities why this might happen, perhaps it would be good to begin with a process of elimination: Also, is there a way for us to run the query adhoc within a console to see the error happen in real time? Would like to see how we can optimize the query if it comes down to it but not sure how to do so. Hope to somehow help you get to the bottom of this. Thanks! |
May have started failing on October 5th because selecting a date range from Oct 4 - Oct 17 (current date) is the last range to return data on the map: Oct 4 - Oct 17 (data exists)Oct 5 - Oct 17 (data does not exist)Perhaps we can inspect what happened differently during the nightly batch of Oct 4th |
Thanks for taking a look Edwin. I also noticed that the requests disappeared around that date. Unfortunately, the logs are only persisted for 1 week, so I wasn't able to look at the logs for 10/4. AFAICT, nothing has changed w.r.t. the data loading pipeline or the related infra in a long while. So the hypothesis that the query is causing issue makes sense--the amount of data that the query needs to look at has been steadily increasing each day, and now the query is too large. The other suggestions that you brought up are helpful! I don't think Docker is the issue, since all of the postgres temp files should be stored on AWS RDS (see "Short Description" from the AWS RDS link you shared), not some Docker container. I followed the suggestion from the AWS RDS link to look at FreeStorageSpace. We can see that this value is indeed declining, and it drops sharply everyday around 8am, which is when the daily data pipeline is run. The article offers an easier solution than optimizing the query: enable storage autoscaling for your instance. I'll try to enable that and hopefully that fixes the issue. |
Nice! |
Overview
Our nightly Prefect runs have been failing with this message:
It looks like the command "REFRESH MATERIALIZED VIEW CONCURRENTLY {view};" is quite expensive because it needs to create an entire extra copy of the database to allow the refresh and db reads to occur concurrently.
This post suggests that the query that we are using to create this view is too inefficient, so we'll have to look into fixing that. The current query certainly looks slow: we are joining 5 different tables. 4 of them are relatively small (~100 rows or less), but the last table, the requests table, is quite large--maybe millions of rows.
Action Items
Resources/Instructions
The text was updated successfully, but these errors were encountered: