
Added balances for Yield Yak vaults #5737

Merged: 30 commits merged into duneanalytics:main on Apr 25, 2024
Conversation

yy-analytics (Contributor)

Thank you for contributing to Spellbook!

Thank you for taking the time to submit code to Spellbook. A few things to consider:

  • If you are a first-time contributor, please sign the CLA by copy-pasting exactly what the bot requests in the PR comment
  • Refer to the docs section below for answers to common questions
  • The Dune team will review submitted PRs as soon as possible

Best practices

To speed up your development process in PRs, keep these tips in mind:

  • Each commit to your feature branch will rerun the CI tests (see example)
    • This includes all modified models on your branch
    • This includes the full history of the data
  • Two tips for faster development iteration:
    • Ensure dbt is installed locally (refer to the main readme) and run dbt compile
      • This outputs raw SQL in the target/ directory that you can copy/paste and run on Dune directly for initial query testing
    • Hardcode a WHERE filter for only ~7 days of history on large source tables, e.g. ethereum.transactions
      • This speeds up the CI tests and returns results quicker -- whether that's an error or a fully successful run
      • Once comfortable with the small timeframe, remove the filter and let the full history run
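The temporary-filter tip above can be sketched as follows. This is a minimal example, not code from this PR; the column selection follows the ethereum.transactions example, and the filter line is the part you would remove before the final full-history run:

```sql
select
    block_time
    , block_number
    , hash
    , "from"
    , "to"
from {{ source('ethereum', 'transactions') }}
-- temporary filter for faster CI iteration; remove before the full-history run
where block_time > now() - interval '7' day
```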

Incremental model setup

  • Make sure your unique key columns are exactly the same in the model config block, the schema yml file, and the seed match columns (where applicable)
  • There cannot be nulls in the unique key columns
    • Double-check that the key columns are correct, or COALESCE() as needed on the key column(s); otherwise the tests may fail on duplicates
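As a sketch of the setup above, the same key columns would appear in the model's config block and be kept non-null in the select. Model and column names here are hypothetical, not from this PR:

```sql
-- models/example/example_balances.sql (hypothetical model)
{{ config(
    materialized = 'incremental',
    unique_key = ['blockchain', 'block_date', 'wallet_address', 'token_address']
) }}

select
    blockchain
    , block_date
    -- COALESCE a nullable key column so the uniqueness tests don't fail
    , coalesce(wallet_address, 0x0000000000000000000000000000000000000000) as wallet_address
    , token_address
    , balance
from {{ ref('example_balances_source') }}   -- hypothetical upstream spell
```

The same four columns would then be listed for the uniqueness test in the schema yml file and as the seed match columns.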

🪄 Use the built CI tables for testing 🪄

Once CI completes, you can query the CI tables (and inspect any errors) on Dune.

  • For example:
    • In the run initial models and test initial models jobs, there will be a schema that looks like this: test_schema.git_dunesql_4da8bae_sudoswap_v2_base_pool_creations
    • This can be temporarily queried on Dune for ~24 hours

Leverage these tables to perform QA testing in the Dune query editor -- or even build full test dashboards!

Spellbook contribution docs

The docs directory was created to answer as many questions as possible. Please take the time to read each .md file within this directory to understand how to contribute efficiently and why the repo is designed the way it is 🪄

Please navigate through the docs directory to find as much info as you can.

Note: we're happy to take PRs to improve the docs, let us know 🤝

@dune-eng commented Apr 4, 2024: Workflow run id 8552515453 approved.

@dune-eng commented Apr 4, 2024: Workflow run id 8552515769 approved.

@dune-eng commented Apr 4, 2024: Workflow run id 8552521437 approved.

@dune-eng commented Apr 4, 2024: Workflow run id 8552521711 approved.

@yy-analytics (Contributor, Author)

I'm expecting some issues with this PR but would like to discuss the approach. I opened a thread in Discord here but haven't heard anything back yet. I've not used the FROM {{ source(schema, name) }} style when querying certain tables, because the list of tables to UNION is generated dynamically using run_query combined with the yield_yak_strategies(blockchain) macro, and I can't use macros in my sources.yml file, so I have no way to dynamically update the sources when using FROM {{ source(schema, name) }} statements. I've done it this way to avoid having to hardcode strategy contract names (new ones are sometimes added fairly often, and I'm trying to avoid opening a PR every time a new strategy is added).
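The pattern being described looks roughly like the sketch below. This is not the PR's actual code; the query and table names are hypothetical, and only the shape of the run_query-plus-loop approach is taken from the comment:

```sql
{% macro yield_yak_balances(blockchain) %}
    {# Fetch the list of strategy tables at compile time.
       run_query executes SQL against the warehouse during compilation,
       which is what the reviewer flags below as unsupported. #}
    {% set strategy_query %}
        select distinct namespace, name
        from {{ blockchain }}.contracts   -- hypothetical source of strategy names
        where name like 'YakStrategy%'
    {% endset %}

    {% set results = run_query(strategy_query) %}

    {# Build one UNION ALL branch per strategy table #}
    {% for row in results.rows %}
        select * from {{ row[0] }}.{{ row[1] }}
        {% if not loop.last %} union all {% endif %}
    {% endfor %}
{% endmacro %}
```

Because the table list is only known at compile time, none of these tables can be declared statically in sources.yml, which is the tension this comment raises.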

@jeff-dude jeff-dude added the WIP work in progress label Apr 5, 2024
@jeff-dude (Member)

> I've not used the FROM {{ source(schema, name) }} style when querying certain tables because the list of tables to UNION is generated dynamically by using run_query combined with the yield_yak_strategies(blockchain) macro, and I can't use macros in my sources.yml file […]

i believe i responded yesterday. unfortunately, i don't think your approach here will work with run_query. does it make sense to build a separate spell solely that outputs the list of tables you need, then call that spell in the main spell to obtain the list? then dbt will handle the dependencies for you.

we don't allow direct access to the db, so run_query will break some things.

@yy-analytics (Contributor, Author)

> i don't think your approach here will work with run_query. does it make sense to build a separate spell solely that outputs a list of tables you need, then call that spell in the main spell to obtain the list? […] we don't allow direct access to db, so run_query will break some things.

Thanks @jeff-dude. I could be wrong on this, but the issue I see with what you're describing is that I need the result of the query as a Jinja variable (specifically, a list) that I can then loop over to create the "main" query. If I were to have a separate spell and select from it using ref, isn't that just the same as having it in a CTE? The main approach I'm going for is in the yield_yak_balances.sql macro file.

@yy-analytics (Contributor, Author)

> i don't think your approach here will work with run_query. […] we don't allow direct access to db, so run_query will break some things.

> Thanks @jeff-dude. I could be wrong on this, but the issue I see with what you're describing is that I need the result of the query as a jinja variable (specifically, a list) that I can then loop over to create the "main" query. […]

@jeff-dude, do you know if the "no direct access to db" restriction also means that the dbt_utils.get_column_values macro cannot be run? From the source code it looks like it relies on a statement block (same as run_query), so I'm guessing not, but I thought I'd confirm, as I could make a separate spell as you suggested if dbt_utils.get_column_values works, and then point to that spell in the arguments for this macro. If that's not possible, then I think the only solution is going to be hardcoding the list of strategies and submitting a new PR each time that list changes.
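The fallback being asked about, assuming dbt_utils.get_column_values were usable, would look roughly like this. The helper-spell and column names are hypothetical, not from this PR:

```sql
{% macro yield_yak_balances(blockchain) %}
    {# dbt_utils.get_column_values also queries the warehouse at compile time
       (via a statement block, like run_query), so it would hit the same
       "no direct access to db" restriction discussed above.
       The ref() below points at a hypothetical helper spell. #}
    {% set strategy_addresses = dbt_utils.get_column_values(
        table=ref('yield_yak_avalanche_strategies'),
        column='strategy_address'
    ) %}

    {% for address in strategy_addresses %}
        select * from {{ blockchain }}.some_strategy_table  -- hypothetical table
        where contract_address = {{ address }}
        {% if not loop.last %} union all {% endif %}
    {% endfor %}
{% endmacro %}
```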

@dune-eng: Workflow run id 8649689954 approved.

@dune-eng: Workflow run id 8649690036 approved.

@dune-eng: Workflow run id 8649698543 approved.

@dune-eng: Workflow run id 8649698631 approved.

@dune-eng: Workflow run id 8649927884 approved.

@dune-eng: Workflow run id 8649928099 approved.

@dune-eng: Workflow run id 8650272962 approved.

@dune-eng: Workflow run id 8650272744 approved.

@yy-analytics (Contributor, Author)

Given the "QUERY HAS TOO MANY STAGES" error that's coming up now, I'm going to try breaking the balances model/macro into separate Deposits, Withdraws and Reinvests models/macros, which can then hopefully be combined to produce the balances model/macro without this error.
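Structurally, the split described above would make balances a thin model over separately materialized intermediate spells, so no single query plan carries all the stages. A rough illustration, with hypothetical model names (the actual PR also includes a reinvests model, omitted here for brevity):

```sql
-- illustrative only: a balances spell combining the split models
with flows as (
    select block_time, strategy_address, depositor, amount
    from {{ ref('yield_yak_avalanche_deposits') }}       -- hypothetical model name

    union all

    select block_time, strategy_address, depositor, -amount
    from {{ ref('yield_yak_avalanche_withdraws') }}      -- hypothetical model name
)

select
    strategy_address
    , depositor
    , sum(amount) as balance
from flows
group by 1, 2
```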

@dune-eng: Workflow run id 8691125503 approved.

@dune-eng: Workflow run id 8691125823 approved.

@dune-eng: Workflow run id 8783217293 approved.

@dune-eng: Workflow run id 8783217639 approved.

@dune-eng: Workflow run id 8784212730 approved.

@dune-eng: Workflow run id 8784212812 approved.

@yy-analytics yy-analytics requested a review from Hosuke April 22, 2024 12:57
@jeff-dude jeff-dude self-assigned this Apr 22, 2024
@jeff-dude jeff-dude added in review Assignee is currently reviewing the PR and removed ready-for-review this PR development is complete, please review labels Apr 23, 2024
@jeff-dude (Member) left a comment

fantastic code here, thank you for applying all the best practices (and even teaching me some more)!

left a few thoughts below prior to finalizing and merging

Comment on lines +60 to +61

{{ source(blockchain, 'transactions') }} t
ON t.hash = c.tx_hash

@jeff-dude (Member)

even if we can't apply an incremental filter, we should be able to enhance the join.

can you add block_number to the join conditions?

transactions is also partitioned on block_date -- we could likely add a condition for block_date = cast(date_trunc('day', block_time) as date) (assuming block_date doesn't exist in the decoded tables; if it does, then simply join on block_date)
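Putting the review suggestions together, the enhanced join would look roughly like this. The aliases follow the snippet above, but the decoded-events side and its columns are assumed, not taken from the PR:

```sql
select
    c.tx_hash
    , c.amount
    , t."from" as tx_sender
from decoded_events c                       -- placeholder for the decoded-events CTE/table
inner join {{ source(blockchain, 'transactions') }} t
    on t.hash = c.tx_hash
    and t.block_number = c.block_number     -- extra equi-join key suggested above
    -- prune partitions: transactions is partitioned on block_date
    and t.block_date = cast(date_trunc('day', c.block_time) as date)
```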

Comment on lines +60 to +61

{{ source(blockchain, 'transactions') }} t
ON t.hash = c.tx_hash

@jeff-dude (Member)

also, can it be an inner join? that's more efficient on the dunesql engine.

if an inner join isn't possible, it's usually best to first read from the larger of the two tables, then left/right join as needed to the smaller tables

@jeff-dude (Member)

also, there is some complex incremental logic here. are you aware of the output tables built via the CI pipelines attached, so you can query them on dune and test the data to ensure everything looks good prior to merge?

@yy-analytics (Contributor, Author)

> are you aware of the output tables built via CI pipelines attached, so you can query them on dune and test the data and ensure all looks good prior to merge?

@jeff-dude, no I'm not sure about this part. I know about the tables generated by dbt compile and have been testing those to make sure the output looks correct, but as far as I can tell, that's for the "initial model". Where do I find the attached CI pipelines/tables to check that the incremental logic is working correctly?

@dune-eng: Workflow run id 8812336139 approved.

@dune-eng: Workflow run id 8812336260 approved.

@jeff-dude (Member)

> Where do I find the attached CI pipelines/tables to check the incremental logic is working correctly?

yep, dbt compile is useful for testing the query in general for full-history or initial runs. however, when you introduce complex incremental logic in spellbook, sometimes that can bring bugs into subsequent runs. in order to test the data quality post-incremental run, we open up the CI test tables (from the attached gh actions on PRs) to be queried on dune for ~24 hours before they are deleted. please note that tables are dropped once a day at a set time, not 24 hours after the build, so a rerun of the gh action may be needed if bad timing happens and tables are dropped before you finish your testing.

check out these docs, hopefully they help (if not, let me know, we can modify those docs as needed):
https://github.com/duneanalytics/spellbook/blob/main/docs/ci_test/ci_test_overview.md

@jeff-dude jeff-dude added ready-for-merging and removed in review Assignee is currently reviewing the PR labels Apr 24, 2024
@jeff-dude (Member)

i'll let this sit for a little bit, if you wanted to query those CI output tables for data quality. let me know when ready to merge 🤝

@yy-analytics (Contributor, Author)

> we open up the CI test tables (from attached gh actions on PRs) to be queried on dune for ~24 hours before being deleted. […]

I think the 24 hr window has passed. To trigger the actions again I was just going to "Update branch", but it's been showing "Checking for ability to merge automatically..." for a while now (see screenshot below). Is there any other way to rerun the gh action?

[screenshot]

@jeff-dude (Member)

> I think the 24 hr window passed. […] Is there any other way to rerun the gh action?

i was able to merge in main. i think there is a universal gh issue right now, so it's been slow today. other ways to do it:

  • on your local setup, merge main from spellbook into your main branch, then push the changes via git locally (assuming there are new changes on main)
  • if there are no new changes, you can also go into the action itself and manually rerun it

[screenshot]

  • final approach: push a dummy change to your branch so a commit is made. each commit will rerun the actions.

@yy-analytics (Contributor, Author)

> i was able to merge in main. […] other ways to do it: merge main locally and push, manually rerun the action, or push a dummy commit.

Thanks for the tips! I've been able to review the CI tables and I'm happy with how it all looks, so am happy for you to merge when ready 🤝

@jeff-dude jeff-dude merged commit 236d382 into duneanalytics:main Apr 25, 2024
3 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Apr 25, 2024