Skip to content

Commit

Permalink
Make parallel restore configurable
Browse files Browse the repository at this point in the history
Signed-off-by: Ming Qiu <[email protected]>
  • Loading branch information
qiuming-best committed Mar 19, 2024
1 parent 84c1eca commit 64a3f2a
Show file tree
Hide file tree
Showing 15 changed files with 240 additions and 19 deletions.
2 changes: 2 additions & 0 deletions changelogs/unreleased/7512-qiuming-best
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Make parallel restore configurable

4 changes: 4 additions & 0 deletions config/crd/v1/bases/velero.io_restores.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -422,6 +422,10 @@ spec:
description: UploaderConfig specifies the configuration for the restore.
nullable: true
properties:
parallelFilesDownload:
description: ParallelFilesDownload is the concurrency number setting
for restore.
type: integer
writeSparseFiles:
description: WriteSparseFiles is a flag to indicate whether write
files sparsely or not.
Expand Down
2 changes: 1 addition & 1 deletion config/crd/v1/crds/crds.go

Large diffs are not rendered by default.

49 changes: 49 additions & 0 deletions design/Implemented/velero-uploader-configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -177,5 +177,54 @@ Roughly, the process is as follows:
4. Each respective controller within the CRs calls the uploader, and the WriteSparseFiles from map in CRs is passed to the uploader.
5. When the uploader subsequently calls the Kopia API, it can use the WriteSparseFiles to set the WriteSparseFiles parameter, and if the uploader calls the Restic command it would append `--sparse` flag within the restore command.

### Parallel Restore
Setting the parallelism of restore operations can improve the efficiency and speed of the restore process, especially when dealing with large amounts of data.

### Velero CLI
The Velero CLI will support a --parallel-files-download flag, allowing users to set the parallelism value when creating restores. when no value specified, the value of it would be the number of CPUs for the node that the node agent pod is running.
```bash
velero restore create --parallel-files-download $num
```

### UploaderConfig
below the sub-option parallel is added into UploaderConfig:

```go
type UploaderConfigForRestore struct {
// ParallelFilesDownload is the number of parallel for restore.
// +optional
ParallelFilesDownload int `json:"parallelFilesDownload,omitempty"`
}
```

#### Kopia Parallel Restore Policy

Velero Uploader can set restore policies when calling Kopia APIs. In the Kopia codebase, the structure for restore policies is defined as follows:

```go
// first get concurrrency from uploader config
restoreConcurrency, _ := uploaderutil.GetRestoreConcurrency(uploaderCfg)
// set restore concurrency into restore options
restoreOpt := restore.Options{
Parallel: restoreConcurrency,
}
// do restore with restore option
restore.Entry(..., restoreOpt)
```

#### Restic Parallel Restore Policy

Configurable parallel restore is not supported by restic, so we would return one error if the option is configured.
```go
restoreConcurrency, err := uploaderutil.GetRestoreConcurrency(uploaderCfg)
if err != nil {
return extraFlags, errors.Wrap(err, "failed to get uploader config")
}

if restoreConcurrency > 0 {
return extraFlags, errors.New("restic does not support parallel restore")
}
```

## Alternatives Considered
To enhance extensibility further, the option of storing `UploaderConfig` in a Kubernetes ConfigMap can be explored, this approach would allow the addition and modification of configuration options without the need to modify the CRD.
3 changes: 3 additions & 0 deletions pkg/apis/velero/v1/restore_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -136,6 +136,9 @@ type UploaderConfigForRestore struct {
// +optional
// +nullable
WriteSparseFiles *bool `json:"writeSparseFiles,omitempty"`
// ParallelFilesDownload is the concurrency number setting for restore.
// +optional
ParallelFilesDownload int `json:"parallelFilesDownload,omitempty"`
}

// RestoreHooks contains custom behaviors that should be executed during or post restore.
Expand Down
6 changes: 0 additions & 6 deletions pkg/builder/restore_builder.go
Original file line number Diff line number Diff line change
Expand Up @@ -171,9 +171,3 @@ func (b *RestoreBuilder) ItemOperationTimeout(timeout time.Duration) *RestoreBui
b.object.Spec.ItemOperationTimeout.Duration = timeout
return b
}

// WriteSparseFiles sets the Restore's uploader write sparse files
func (b *RestoreBuilder) WriteSparseFiles(val bool) *RestoreBuilder {
b.object.Spec.UploaderConfig.WriteSparseFiles = &val
return b
}
10 changes: 9 additions & 1 deletion pkg/cmd/cli/restore/create.go
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,7 @@ type CreateOptions struct {
ItemOperationTimeout time.Duration
ResourceModifierConfigMap string
WriteSparseFiles flag.OptionalBool
ParallelFilesDownload int
client kbclient.WithWatch
}

Expand Down Expand Up @@ -151,6 +152,8 @@ func (o *CreateOptions) BindFlags(flags *pflag.FlagSet) {

f = flags.VarPF(&o.WriteSparseFiles, "write-sparse-files", "", "Whether to write sparse files during restoring volumes")
f.NoOptDefVal = cmd.TRUE

flags.IntVar(&o.ParallelFilesDownload, "parallel-files-download", 0, "The number of restore operations to run in parallel. If set to 0, the default parallelism will be the number of CPUs for the node that node agent pod is running.")
}

func (o *CreateOptions) Complete(args []string, f client.Factory) error {
Expand Down Expand Up @@ -200,6 +203,10 @@ func (o *CreateOptions) Validate(c *cobra.Command, args []string, f client.Facto
return errors.New("existing-resource-policy has invalid value, it accepts only none, update as value")
}

if o.ParallelFilesDownload < 0 {
return errors.New("parallel-files-download cannot be negative")
}

Check warning on line 208 in pkg/cmd/cli/restore/create.go

View check run for this annotation

Codecov / codecov/patch

pkg/cmd/cli/restore/create.go#L207-L208

Added lines #L207 - L208 were not covered by tests

switch {
case o.BackupName != "":
backup := new(api.Backup)
Expand Down Expand Up @@ -324,7 +331,8 @@ func (o *CreateOptions) Run(c *cobra.Command, f client.Factory) error {
Duration: o.ItemOperationTimeout,
},
UploaderConfig: &api.UploaderConfigForRestore{
WriteSparseFiles: o.WriteSparseFiles.Value,
WriteSparseFiles: o.WriteSparseFiles.Value,
ParallelFilesDownload: o.ParallelFilesDownload,
},
},
}
Expand Down
4 changes: 3 additions & 1 deletion pkg/cmd/cli/restore/create_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ func TestCreateCommand(t *testing.T) {
allowPartiallyFailed := "true"
itemOperationTimeout := "10m0s"
writeSparseFiles := "true"

parallel := 2
flags := new(pflag.FlagSet)
o := NewCreateOptions()
o.BindFlags(flags)
Expand All @@ -108,6 +108,7 @@ func TestCreateCommand(t *testing.T) {
flags.Parse([]string{"--allow-partially-failed", allowPartiallyFailed})
flags.Parse([]string{"--item-operation-timeout", itemOperationTimeout})
flags.Parse([]string{"--write-sparse-files", writeSparseFiles})
flags.Parse([]string{"--parallel-files-download", "2"})
client := velerotest.NewFakeControllerRuntimeClient(t).(kbclient.WithWatch)

f.On("Namespace").Return(mock.Anything)
Expand Down Expand Up @@ -144,6 +145,7 @@ func TestCreateCommand(t *testing.T) {
require.Equal(t, allowPartiallyFailed, o.AllowPartiallyFailed.String())
require.Equal(t, itemOperationTimeout, o.ItemOperationTimeout.String())
require.Equal(t, writeSparseFiles, o.WriteSparseFiles.String())
require.Equal(t, parallel, o.ParallelFilesDownload)
})

t.Run("create a restore from schedule", func(t *testing.T) {
Expand Down
21 changes: 13 additions & 8 deletions pkg/cmd/util/output/restore_describer.go
Original file line number Diff line number Diff line change
Expand Up @@ -178,10 +178,7 @@ func DescribeRestore(ctx context.Context, kbClient kbclient.Client, restore *vel
d.Println()
d.Printf("Preserve Service NodePorts:\t%s\n", BoolPointerString(restore.Spec.PreserveNodePorts, "false", "true", "auto"))

if restore.Spec.UploaderConfig != nil && boolptr.IsSetToTrue(restore.Spec.UploaderConfig.WriteSparseFiles) {
d.Println()
DescribeUploaderConfigForRestore(d, restore.Spec)
}
describeUploaderConfigForRestore(d, restore.Spec)

Check warning on line 181 in pkg/cmd/util/output/restore_describer.go

View check run for this annotation

Codecov / codecov/patch

pkg/cmd/util/output/restore_describer.go#L181

Added line #L181 was not covered by tests

d.Println()
describeRestoreItemOperations(ctx, kbClient, d, restore, details, insecureSkipTLSVerify, caCertFile)
Expand All @@ -199,10 +196,18 @@ func DescribeRestore(ctx context.Context, kbClient kbclient.Client, restore *vel
})
}

// DescribeUploaderConfigForRestore describes uploader config in human-readable format
func DescribeUploaderConfigForRestore(d *Describer, spec velerov1api.RestoreSpec) {
d.Printf("Uploader config:\n")
d.Printf("\tWrite Sparse Files:\t%T\n", boolptr.IsSetToTrue(spec.UploaderConfig.WriteSparseFiles))
// describeUploaderConfigForRestore describes uploader config in human-readable format
func describeUploaderConfigForRestore(d *Describer, spec velerov1api.RestoreSpec) {
if spec.UploaderConfig != nil {
d.Println()
d.Printf("Uploader config:\n")
if boolptr.IsSetToTrue(spec.UploaderConfig.WriteSparseFiles) {
d.Printf("\tWrite Sparse Files:\t%v\n", boolptr.IsSetToTrue(spec.UploaderConfig.WriteSparseFiles))
}
if spec.UploaderConfig.ParallelFilesDownload > 0 {
d.Printf("\tParallel Restore:\t%d\n", spec.UploaderConfig.ParallelFilesDownload)
}
}
}

func describeRestoreItemOperations(ctx context.Context, kbClient kbclient.Client, d *Describer, restore *velerov1api.Restore, details bool, insecureSkipTLSVerify bool, caCertPath string) {
Expand Down
56 changes: 56 additions & 0 deletions pkg/cmd/util/output/restore_describer_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ import (
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/builder"
"github.com/vmware-tanzu/velero/pkg/itemoperation"
"github.com/vmware-tanzu/velero/pkg/util/boolptr"
"github.com/vmware-tanzu/velero/pkg/util/results"
)

Expand Down Expand Up @@ -181,3 +182,58 @@ func TestDescribePodVolumeRestores(t *testing.T) {
})
}
}
func TestDescribeUploaderConfigForRestore(t *testing.T) {
cases := []struct {
name string
spec velerov1api.RestoreSpec
expected string
}{
{
name: "UploaderConfigNil",
spec: velerov1api.RestoreSpec{}, // Create a RestoreSpec with nil UploaderConfig
expected: "",
},
{
name: "test",
spec: velerov1api.RestoreSpec{
UploaderConfig: &velerov1api.UploaderConfigForRestore{
WriteSparseFiles: boolptr.True(),
ParallelFilesDownload: 4,
},
},
expected: "\nUploader config:\n Write Sparse Files: true\n Parallel Restore: 4\n",
},
{
name: "WriteSparseFiles test",
spec: velerov1api.RestoreSpec{
UploaderConfig: &velerov1api.UploaderConfigForRestore{
WriteSparseFiles: boolptr.True(),
},
},
expected: "\nUploader config:\n Write Sparse Files: true\n",
},
{
name: "ParallelFilesDownload test",
spec: velerov1api.RestoreSpec{
UploaderConfig: &velerov1api.UploaderConfigForRestore{
ParallelFilesDownload: 4,
},
},
expected: "\nUploader config:\n Parallel Restore: 4\n",
},
}

for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
d := &Describer{
Prefix: "",
out: &tabwriter.Writer{},
buf: &bytes.Buffer{},
}
d.out.Init(d.buf, 0, 8, 2, ' ', 0)
describeUploaderConfigForRestore(d, tc.spec)
d.out.Flush()
assert.Equal(t, tc.expected, d.buf.String(), "Output should match expected")
})
}
}
14 changes: 12 additions & 2 deletions pkg/uploader/kopia/snapshot.go
Original file line number Diff line number Diff line change
Expand Up @@ -411,6 +411,8 @@ func Restore(ctx context.Context, rep repo.RepositoryWriter, progress *Progress,
IgnorePermissionErrors: true,
}

restoreConcurrency := runtime.NumCPU()

Check warning on line 415 in pkg/uploader/kopia/snapshot.go

View check run for this annotation

Codecov / codecov/patch

pkg/uploader/kopia/snapshot.go#L415

Added line #L415 was not covered by tests
if len(uploaderCfg) > 0 {
writeSparseFiles, err := uploaderutil.GetWriteSparseFiles(uploaderCfg)
if err != nil {
Expand All @@ -419,9 +421,17 @@ func Restore(ctx context.Context, rep repo.RepositoryWriter, progress *Progress,
if writeSparseFiles {
fsOutput.WriteSparseFiles = true
}

concurrency, err := uploaderutil.GetRestoreConcurrency(uploaderCfg)
if err != nil {
return 0, 0, errors.Wrap(err, "failed to get parallel restore uploader config")
}
if concurrency > 0 {

Check warning on line 429 in pkg/uploader/kopia/snapshot.go

View check run for this annotation

Codecov / codecov/patch

pkg/uploader/kopia/snapshot.go#L428-L429

Added lines #L428 - L429 were not covered by tests
restoreConcurrency = concurrency
}
}

log.Debugf("Restore filesystem output %v", fsOutput)
log.Debugf("Restore filesystem output %v, concurrency %d", fsOutput, restoreConcurrency)

err = fsOutput.Init(ctx)
if err != nil {
Expand All @@ -436,7 +446,7 @@ func Restore(ctx context.Context, rep repo.RepositoryWriter, progress *Progress,
}

stat, err := restoreEntryFunc(kopiaCtx, rep, output, rootEntry, restore.Options{
Parallel: runtime.NumCPU(),
Parallel: restoreConcurrency,
RestoreDirEntryAtDepth: math.MaxInt32,
Cancel: cancleCh,
ProgressCallback: func(ctx context.Context, stats restore.Stats) {
Expand Down
4 changes: 4 additions & 0 deletions pkg/uploader/provider/restic.go
Original file line number Diff line number Diff line change
Expand Up @@ -246,5 +246,9 @@ func (rp *resticProvider) parseRestoreExtraFlags(uploaderCfg map[string]string)
extraFlags = append(extraFlags, "--sparse")
}

if restoreConcurrency, err := uploaderutil.GetRestoreConcurrency(uploaderCfg); err == nil && restoreConcurrency > 0 {
return extraFlags, errors.New("restic does not support parallel restore")
}

Check warning on line 251 in pkg/uploader/provider/restic.go

View check run for this annotation

Codecov / codecov/patch

pkg/uploader/provider/restic.go#L250-L251

Added lines #L250 - L251 were not covered by tests

return extraFlags, nil
}
7 changes: 7 additions & 0 deletions pkg/uploader/provider/restic_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -434,6 +434,13 @@ func TestParseUploaderConfig(t *testing.T) {
},
expectedFlags: []string{},
},
{
name: "RestoreConcorrency",
uploaderConfig: map[string]string{
"Parallel": "5",
},
expectedFlags: []string{},
},
}

for _, testCase := range testCases {
Expand Down
17 changes: 17 additions & 0 deletions pkg/uploader/util/uploader_config.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ import (
const (
ParallelFilesUpload = "ParallelFilesUpload"
WriteSparseFiles = "WriteSparseFiles"
RestoreConcurrency = "ParallelFilesDownload"
)

func StoreBackupConfig(config *velerov1api.UploaderConfigForBackup) map[string]string {
Expand All @@ -42,6 +43,10 @@ func StoreRestoreConfig(config *velerov1api.UploaderConfigForRestore) map[string
} else {
data[WriteSparseFiles] = strconv.FormatBool(false)
}

if config.ParallelFilesDownload > 0 {
data[RestoreConcurrency] = strconv.Itoa(config.ParallelFilesDownload)
}
return data
}

Expand All @@ -68,3 +73,15 @@ func GetWriteSparseFiles(uploaderCfg map[string]string) (bool, error) {
}
return false, nil
}

func GetRestoreConcurrency(uploaderCfg map[string]string) (int, error) {
restoreConcurrency, ok := uploaderCfg[RestoreConcurrency]
if ok {
restoreConcurrencyInt, err := strconv.Atoi(restoreConcurrency)
if err != nil {
return 0, errors.Wrap(err, "failed to parse RestoreConcurrency config")
}
return restoreConcurrencyInt, nil
}
return 0, nil
}
Loading

0 comments on commit 64a3f2a

Please sign in to comment.