Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recommendations Accelerator #5

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

Jack-Keene
Copy link
Collaborator

@Jack-Keene Jack-Keene commented Oct 10, 2023

@netlify
Copy link

netlify bot commented Oct 10, 2023

Deploy Preview for data-product-accelerators ready!

Name Link
🔨 Latest commit 21c20c8
🔍 Latest deploy log https://app.netlify.com/sites/data-product-accelerators/deploys/654789f4898b7300087da189
😎 Deploy Preview https://deploy-preview-5--data-product-accelerators.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@Jack-Keene Jack-Keene requested review from misterpig and zloff October 11, 2023 08:57
Copy link

@zloff zloff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgive my ignorance - once this is merged to main does it appear on the website? Do we leave this unmerged and branch off this one to update the accelerator until it's ready for release, or can we merge this and keep updating without it being published?

1h :s5, after s4, 05:00
section 6. Next steps
30min :s6, after s5, 05:30
{{</mermaid>}}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated to this PR, but Jack do you have any idea how to make this chart look better, so that the text doesn't go into the chart area? I tried when I was writing it but couldn't figure it out, and it looks kinda janky

@Jack-Keene
Copy link
Collaborator Author

Yes once this is merged it will appear at https://snowplow.io/data-product-accelerators/snowplow-recommendations/
I'm not sure this is massive problem as we haven't linked it from anywhere so no-one will stumble across it probably, so up to you how you want to managed development work.

It would definitely work to make branches off this PR and merge them as you go like a normal release though!

… the Flask app; updated the create Personalize Python script.
@misterpig misterpig requested a review from zloff October 27, 2023 12:20
Copy link

@zloff zloff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@misterpig looks good, just a couple small comments

…for the AWS Personalize role manual creation.
@misterpig misterpig requested a review from zloff November 5, 2023 12:26
Copy link

@zloff zloff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@misterpig changes all look good

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has no child pages?

Comment on lines +11 to +12
In your AWS account, create an access key for your user - https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey. Copy the access key id and secret access key to a file called credentials in the .aws folder in your home directory. The file should look like this:
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just draw a file tree here to visulaise where this file is meant to go?

Comment on lines +17 to +23
In this accelerator, we will enable you to:

* Run dbt to transform your Snowplow data into a format that can be used by AWS Personalize
* Set up the infrastructure required to run AWS Personalize, including the necessary IAM roles and policies
* Create ecommerce and/or media (video on demand) recommenders in AWS Personalize
* Run a local Flask application to interact with AWS Personalize
***
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an image of the app here, having a picture on the front page is always good

Comment on lines +77 to +83

If you are not using the sample data:
- An implementation of the dbt-snowplow-ecommerce package. This will have created the `snowplow_ecommerce_product_interactions` table, which is used by this dbt package to create the ecommerce dataset for AWS Personalize. If you don't have this table, you can create it by following the instructions [here](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/ecommerce/).

And/or

- An implementation of the dbt-snowplow-media-player package (>= 0.6.0). This will have created the `snowplow_media_player_base` table, which is used by this dbt package to create the video_on_demand dataset for AWS Personalize. If you don't have this table, you can create it by following the instructions [here](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/media-player/).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better way to format/write this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add something here to note the dbt package is under SPAL licensing

Comment on lines +28 to +30
The dataset/group, schema and import job names can be configured to your preference. The `s3_bucket_name` should be just the name of the S3 bucket (not the full s3 path) you created either manually or via Terraform; similarly the `role_arn` is the name of the Personalize IAM role you created, it corresponds to the Terraform variable `personalize_role_name` with a default name of `PersonalizeIAMRole`.

You may choose which or both of the domains for Personalize to create by setting the `enabled` flag in the config file to `true` or `false`. Additionally you can choose which datasets to be created for training by Personalize by adding them to the list for the `datasets` parameter. Ensure that whichever datasets you add here are also exported by your dbt package by adding the corresponding table names to the `dbt_project.yml`'s `tables_to_export` config value. Finally you will be able to select which recommender you'd like Personalize to generate for your datasets by adding their config names in to the list parameter `recommenders` (refer to the table below for the list of available recommenders).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to be formatted as code?

Comment on lines +31 to +34

#### **Step 2:** Create virtual environment
Before running the script to create the recommenders with AWS personalize, use the requirements.txt file to set up the necessary packages in a virtual environment.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specify python venv, maybe link to instructions to set up a venv?

Comment on lines +43 to +45

| ![Request Form](../images/recommendation_flask_ui.png) |
|:--:|
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this not a full-width image? Seems a bit big

Comment on lines +47 to +50

Which recommenders are available to you will depend on the domain(s) you chose and the dataset types you created. See the below table for more information about each recommender:

| Domain | Recommender | Required dataset types | Optional dataset types | Required input parameters | Optional input parameters |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put this under a new header to make clearer it applies to all options?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we normally have a recommended set of other accelerators at the end or did I make that up?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants