Add a velero status command to report if the server is ready to create backup or not #979

Open
carlisia opened this issue Oct 22, 2018 · 18 comments
Assignees
Labels
Area/CLI related to the command-line interface Enhancement/User End-User Enhancement to Velero kind/requirement Reviewed Q2 2021

Comments

@carlisia
Contributor

carlisia commented Oct 22, 2018

There are two issues in this ticket; one is addressed by #1967 (BSL/VSL readiness).

I will use this ticket to track adding a status command that reports readiness via the CLI.

Original report:

I purposefully started the ark server w/o setting a backup location via the cli or deploying the backupstoragelocation.yaml file.

In the end, the backup sync kept retrying for a really long time.

I wonder if the warning at the top should be an error, if the message should be stronger, or else there is a better way to avoid the user being stuck in this scenario.

image

@carlisia carlisia added this to the v0.10.0 milestone Oct 22, 2018
@carlisia carlisia changed the title Better ux for when there's no backuplocation Better ux for when there's no backup storage location Oct 22, 2018
@carlisia
Contributor Author

For reference: #898 (comment)

@carlisia
Contributor Author

Question: is it expected that Ark be used only for PV backup?

@ncdc
Contributor

ncdc commented Oct 23, 2018

> In the end, the backup sync kept retrying for a really long time.

This is acceptable and a reasonable design, imho. It should keep trying until it has work to do.

> I wonder if the warning at the top should be an error, if the message should be stronger, or else there is a better way to avoid the user being stuck in this scenario.

What if we added an ark status command that could give you an overview of the server's status? Things like "waiting on BSL", "waiting on VSL", etc?
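To make the idea concrete, here is a minimal sketch in Go of how such an overview could be derived. `ServerStatus`, its fields, and `Summary` are hypothetical names invented for illustration; none of this is existing Ark code, and a real command would have to get the counts from the cluster.

```go
package main

import "strings"

// ServerStatus is a hypothetical snapshot of what the server can see.
// The field names are assumptions for this sketch, not Ark types.
type ServerStatus struct {
	BackupStorageLocations  int // number of BSLs found
	VolumeSnapshotLocations int // number of VSLs found
}

// Summary lists what the server is still waiting on, or "ready"
// when at least one location of each kind exists.
func (s ServerStatus) Summary() string {
	var waiting []string
	if s.BackupStorageLocations == 0 {
		waiting = append(waiting, "waiting on BSL")
	}
	if s.VolumeSnapshotLocations == 0 {
		waiting = append(waiting, "waiting on VSL")
	}
	if len(waiting) == 0 {
		return "ready"
	}
	return strings.Join(waiting, ", ")
}
```

With no locations configured, `Summary` reports both waits, which is the scenario from the original report.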

@ncdc
Contributor

ncdc commented Oct 23, 2018

> Question: is it expected that Ark be used only for PV backup?

Are you asking if it should handle volumes that are directly referenced by the pod (e.g. spec.volumes[i].gcePersistentDisk)? If so, that's #408

@carlisia
Contributor Author

carlisia commented Oct 23, 2018

No, I'm asking something different. Currently we have backup storage and volume storage. Can we boot up the ark server w/o setting up/deploying configuration for the backup storage if we only want to do PV storage?

I know the answer to that ^ is no, but should we?

@carlisia
Contributor Author

> What if we added an ark status command that could give you an overview of the server's status? Things like "waiting on BSL", "waiting on VSL", etc?

Yes, I'd like that. Also, could we change the message from "Checking for backup storage" to "Waiting for backup storage"? I think the latter suggests that there's a user action that needs to be taken.

@ncdc
Contributor

ncdc commented Oct 23, 2018

> No, I'm asking something different. Currently we have backup storage and volume storage. Can we boot up the ark server w/o setting up/deploying configuration for the backup storage if we only want to do PV storage?
>
> I know the answer to that ^ is no, but should we?

That is along the lines of #504. I'm interested in exploring this, but I'm not sure how/where we'd record the information (volume snapshot info).

@nrb
Contributor

nrb commented Oct 23, 2018

A bit of a tangent, but I think that the mechanism for getting this information would be somewhat similar - would ark status need creation of an endpoint like #770?

@ncdc
Contributor

ncdc commented Oct 23, 2018

It depends on how we implement it. I could see 3 options:

  1. The ark server keeps a new status-y CR up to date and ark status would retrieve the CR's contents. This would potentially report false positives, e.g. in the event the ark server is offline.
  2. ark status creates and sends a new StatusRequest CR, and the ark server has a controller that updates it. The client waits/watches for it to be processed.
  3. The ark server exposes an HTTP endpoint that the client communicates with.

Of these, I'd probably vote for 2.

@nrb
Contributor

nrb commented Oct 23, 2018

I like 2, too. Seems like a lot less plumbing for users to have to wrangle.

@carlisia
Contributor Author

carlisia commented Oct 23, 2018

I like 1 better. I'm thinking that when I want to check on a status, it's because something is happening/not happening and I want to get a quick read on the situation. It sounds like there'd be latency with option 2. For option 1, could ark status not ping the server and check whether it is online? If it's not online, maybe add a disclaimer saying so and provide the last known status?

@carlisia
Contributor Author

Adding this for future reference: we might consider adding a field to the ark status output that indicates which locations the ark server treats as defaults. Currently, if there's more than one location, we can't tell which one is the default:

image
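
For illustration only, a small Go sketch of what marking the default could look like. `FormatLocations` and the location names are hypothetical, and how the server's default would be obtained is left open here.

```go
package main

import "strings"

// FormatLocations renders one location name per line and marks the one
// the server treats as the default. The default name is passed in by the
// caller; looking it up is outside the scope of this sketch.
func FormatLocations(names []string, defaultName string) string {
	var b strings.Builder
	for _, n := range names {
		b.WriteString(n)
		if n == defaultName {
			b.WriteString(" (default)")
		}
		b.WriteString("\n")
	}
	return b.String()
}
```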

@nrb
Contributor

nrb commented Nov 14, 2018

Given this wasn't included in #1052, I'm removing it from the v0.10.0 milestone

@nrb nrb removed this from the v0.10.0 milestone Nov 14, 2018
@rosskukulinski
Contributor

> In the end, the backup sync kept retrying for a really long time.
>
> This is acceptable and a reasonable design, imho. It should keep trying until it has work to do.

@ncdc If Ark doesn't have a functioning backup storage location, the process should eventually exit. If the Ark pod is running then I would expect that it's functional. Alternatively, we could leverage a liveness probe to help let users know that Ark is not in a happy state.

@ncdc
Contributor

ncdc commented Nov 26, 2018

I'm afraid that if we have the server crashloop because something such as the default BSL is missing, we're going to get bug reports that the server is crashlooping. Do you agree or disagree?

@rosskukulinski
Contributor

@ncdc ahh! I didn't realize that this is related to a missing default. Agree crashlooping for that is bad.

@skriss skriss added the Enhancement/User End-User Enhancement to Velero label Aug 29, 2019
@carlisia carlisia changed the title Better ux for when there's no backup storage location Add a velero status command to report if the server is ready to create backup or not Apr 13, 2020
@carlisia carlisia added Area/CLI related to the command-line interface and removed Bug labels Apr 13, 2020
@carlisia carlisia self-assigned this Apr 13, 2020
@skriss skriss added this to the v1.5 milestone May 28, 2020
@carlisia carlisia removed their assignment Jul 28, 2020
@carlisia carlisia modified the milestones: v1.5, v1.6 Jul 28, 2020
@carlisia
Contributor Author

This is different, but related: #675.

@eleanor-millman
Contributor

Looks like @codegold79's design doc might address this: https://github.com/vmware-tanzu/velero/pull/4270/files

Frankie, can you please take a look at this issue and mark it as a dupe of whatever other issue you are working off of if this is the case? Thanks!

@reasonerjt reasonerjt added kind/requirement Reviewed Q2 2021 Area/CLI related to the command-line interface Enhancement/User End-User Enhancement to Velero and removed Enhancement/User End-User Enhancement to Velero Area/CLI related to the command-line interface Reviewed Q2 2021 labels May 20, 2022
8 participants