[ENH] - Mechanism to pre-populate Nebari with demo/example content #2179

Closed
dharhas opened this issue Jan 4, 2024 · 8 comments · Fixed by #2207

Comments

@dharhas
Member

dharhas commented Jan 4, 2024

Feature description

We need a mechanism to deploy examples/demos to Nebari for new users to interact with.

Ideally, this would be pulled from a set of git repos on first login but could potentially also be put in a read-only examples folder. The second option is not great since folks trying to run notebooks directly in the shared folder would hit errors.

Options to explore include nbgitpuller and gitsync.

Value and/or benefit

It gives people a starting point to explore Nebari's features and helps us customize for different use cases.

Anything else?

No response

@costrouc
Member

costrouc commented Jan 4, 2024

Several ideas on this:

I think init containers capture what we want to do better. It is possible to have multiple init containers, and they are guaranteed to run before the user's JupyterLab starts.

@pavithraes pavithraes moved this from New 🚦 to TODO 📬 in 🪴 Nebari Project Management Jan 4, 2024
@kcpevey
Contributor

kcpevey commented Jan 4, 2024

Copying over some brainstorming ideas:
image

@viniciusdc
Contributor

Both of those seem like really compelling ideas. I am leaning towards the init_container approach, as we would have more control if we want to quickly test something new (just a configmap update), while nbgitpuller requires updates in the Nebari YAML whenever we want to update the link.

So far we have:

  • A bash script or gitsync as part of the init container for the JupyterLab pod. This allows further customization, since we would have access to traitlets/Python and the username, plus quick debugging using k9s/configmap updates.

    • The downside that I can see right now is that those changes would live in the Nebari codebase, so getting this dynamic control would require a direct update to the Jupyter configmap.
    • The GitHub repository would also be statically tied to codebase updates.
  • Use of git sync as suggested above in @kcpevey's comment. This would allow easy inclusion of any GitHub repo without touching any code, though updating it would require an admin to perform a redeployment.

    • The downside that I can foresee is managing the GitHub repository after the user interacts with it, and the need for an admin to redeploy when a change is required.

@dcmcand
Contributor

dcmcand commented Jan 9, 2024

A bash script that checks for a .firstrun file and no-ops if it is present is one way to ensure the script only runs the first time. If the .firstrun file isn't present, the script runs and creates the file at the end.
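The first-run check described above could be sketched as follows. The comment proposes a bash script; this is the same logic in Python, for illustration only. The `run_once` function and its `populate` callback are hypothetical names, and only the `.firstrun` marker file comes from the comment.

```python
from pathlib import Path
from typing import Callable

MARKER_NAME = ".firstrun"  # marker file name taken from the comment above


def run_once(home: Path, populate: Callable[[Path], None]) -> bool:
    """Run `populate(home)` only if the marker file is absent.

    Returns True if `populate` ran, False if it was skipped.
    """
    marker = home / MARKER_NAME
    if marker.exists():
        return False  # no-op: a previous run already completed
    populate(home)
    # Create the marker last, so a failed run is retried next time.
    marker.touch()
    return True
```

Creating the marker only after `populate` returns means an interrupted first run will be attempted again on the next start.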

@pavithraes
Member

pavithraes commented Jan 9, 2024

Notes from the Nebari sync:

  • Dharhas suggested it'll be nice to have an examples section in nebari-config.yaml where the admin can specify example repos.
  • The examples should be in the users' home dir

@dharhas
Member Author

dharhas commented Jan 13, 2024

FYI, we will need to use git-lfs after the git clone to pull in datasets.
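As a rough sketch of that ordering, the two steps can be expressed as a pair of git invocations. The `clone_commands` helper is a hypothetical name, and the sketch assumes the git-lfs extension is installed wherever the commands run.

```python
from typing import List


def clone_commands(repo_url: str, dest: str) -> List[List[str]]:
    """Return the git commands to clone `repo_url` into `dest`, then fetch LFS files."""
    return [
        ["git", "clone", repo_url, dest],
        # Assumes the git-lfs extension is available in the environment.
        ["git", "-C", dest, "lfs", "pull"],
    ]
```

Each returned command could be passed to `subprocess.run(cmd, check=True)` in order, so that a failed clone stops the LFS step from running.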

@viniciusdc
Contributor

viniciusdc commented Jan 15, 2024

Thank you all for the comments. I had an issue with handling multiple repos using git sync, so I opted for a more straightforward solution: executing a bash command as part of the initContainer, which grants the same level of control and flexibility. The only downside is that it adds new code to maintain.
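To illustrate the general shape of this approach (not the code from the PRs), here is a minimal sketch that builds a Kubernetes init container spec whose shell command clones each repository unless its target directory already exists. The function name, container name, image, and paths are all hypothetical, and the volume mounts needed to reach the user's home directory are omitted.

```python
import shlex
from typing import Dict, List, Tuple


def build_init_container(repos: List[Tuple[str, str]], image: str) -> Dict:
    """Build an init container spec that clones each (url, dest) pair once."""
    lines = ["set -e"]  # abort the init container if any clone fails
    for url, dest in repos:
        q_url, q_dest = shlex.quote(url), shlex.quote(dest)
        # Skip the clone when the destination directory is already there.
        lines.append(f"[ -d {q_dest} ] || git clone {q_url} {q_dest}")
    return {
        "name": "clone-examples",  # hypothetical container name
        "image": image,  # must provide `sh` and `git`
        "command": ["sh", "-c", "\n".join(lines)],
    }
```

Skipping existing directories keeps the container from overwriting a repository the user has already modified.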

I have a solution already and will open 2 PRs soon to address that case.

@pavithraes
Member

From the community meeting, the config will look like:

Image

Instead of pre_populate_repositories, use one of the following:

  • default_repositories
  • initial_repositories
  • example_repositories
  • demo_repositories

cc @dharhas for name preference

Additionally, this feature needs to be documented, and the docs should note that:

  • users may need to run git-lfs separately
  • we can only clone public repos
