Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #8277: vtctld logs leak pii http headers #9669

Merged

Conversation

vmogilev
Copy link
Contributor

Description

This PR fixes vtctld logging which is currently leaking http header values that can contain PII.

Related Issue(s)

Fixes #8277

Checklist

  • Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

@deepthi deepthi requested a review from doeg February 11, 2022 00:13
HTTPHeaderRedactedMessage string = "*** redacted by sanitizeRequestHeader() ***"
)

var whiteListedHTTPHeaders = map[string]interface{}{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we tend to use "allowList" instead of "whitelist", as with #8630.

(Also thank you for listing the keys in alphabetical order. 😍)

Copy link
Contributor Author

@vmogilev vmogilev Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good 👍

@@ -157,9 +158,48 @@ func getID(url, base string) (int, error) {
return i, nil
}

const (
LogLevelDumpAllHTTPHeaders glog.Level = 50
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, using the loglevel to determine behavior is very elegant!

Another way to do this that you may have already considered is to add a flag. Recently, I came across #9550 by @mattlord, which is similar in spirit in that it redacts sensitive information from the logs (bind variables, in this particular case). That PR introduces the sanitize_log_messages flag on the vttablet for this purpose.

I'd bias towards using a flag, since it's explicitly opt-in, and I can see an (admittedly imaginary) scenario where one would want debug logging on but to still redact the headers. However! I acknowledge there are definitely trade-offs in adding a flag. (vtctlds already have a billion flags, for example.)

I don't feel very strongly about this, but I'm curious what you think. :)

(@mattlord maybe you have an opinion too, since you've been thinking about this for #9469 🙇 )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that adding a flag with the same name and general behavior makes sense for vtctld as well (I think we'll eventually add this to vtgate too). Thank you for working on this! ❤️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, it's good to be consistent with the rest of the project in how these toggles are implemented. Let me see how to wire that up with a flag 🚧

Copy link
Contributor Author

@vmogilev vmogilev Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any objections on making this new flag ( dump_all_http_headers ) default's value = false?

Copy link
Contributor Author

@vmogilev vmogilev Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most likely it'll end-up being workflow_manager_sanitize_http_headers with the default value = true ( to make default behavior safe out of the box )

how does that sound?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... with the goal to make default behavior safe out of the box

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I do like making things safer by default, but because this is arguably a breaking change, it might be easier to start with a default of workflow_manager_sanitize_http_headers=false. That way the behaviour is entirely opt-in. This is what #9550 does.

Either way, I would also suggest adding this to the release notes, which will be for Vitess 14.0, unless you feel strongly about backporting this to the current Vitess 13.0 release candidate. (FWIW it'll make no difference to Slack as we'll have to cherry pick this change to our fork anyhow, given that we're not yet ready for 13.0.) I don't think the release notes for 14.0 are being drafted yet, but I'll tag this PR with "release notes" so we don't forget! :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, with respect to the flag name of workflow_manager_sanitize_http_headers, it might be fine to make this flag a bit more general, which is again what #9550 does with its sanitize_log_messages flag. We could even use the same name, as @mattlord suggests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok this is done

@@ -132,3 +134,87 @@ func TestLongPolling(t *testing.T) {
cancel()
wg.Wait()
}

func TestSanitizeRequestHeader(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are :chef-kiss: ...thank you!

@vmogilev vmogilev force-pushed the vm_vitess-8277_vtctld-leaks-pii-http-logs2 branch from 1f839ca to 96d79ed Compare February 11, 2022 18:28
@doeg
Copy link
Contributor

doeg commented Feb 11, 2022

Hmm it looks like at least some of the CI failures are related to this change. For example, https://github.com/vitessio/vitess/runs/5160694796?check_suite_focus=true

ok  	vitess.io/vitess/go/vt/vtctl/workflow/vexec	0.018s
/tmp/go-build783617084/b1289/vtctld.test flag redefined: sanitize_log_messages
panic: /tmp/go-build783617084/b1289/vtctld.test flag redefined: sanitize_log_messages

goroutine 1 [running]:
flag.(*FlagSet).Var(0xc0000c8120, {0x2dba4e8, 0xc0000e1016}, {0x214d70d, 0x15}, {0x217307e, 0x24})
	/opt/hostedtoolcache/go/1.17.6/x64/src/flag/flag.go:879 +0x2f4
flag.(*FlagSet).BoolVar(...)
	/opt/hostedtoolcache/go/1.17.6/x64/src/flag/flag.go:632
flag.(*FlagSet).Bool(0x10, {0x214d70d, 0x15}, 0x0, {0x217307e, 0x24})
	/opt/hostedtoolcache/go/1.17.6/x64/src/flag/flag.go:645 +0x7d
flag.Bool(...)
	/opt/hostedtoolcache/go/1.17.6/x64/src/flag/flag.go:652
vitess.io/vitess/go/vt/vtctld.init()
	/home/runner/work/vitess/vitess/go/vt/vtctld/vtctld.go:45 +0x2db	

I wonder if this is somehow conflicting with sanitize_log_messages in vt/vttablet/tabletserver/tabletenv/config.go? That would be really surprising to me given that vttablet and vtctld are different things but I do know that scoping with the flag package can get messy. 😬 Let me double check this hypothesis in the Vitess #developers channel. (Link: https://vitess.slack.com/archives/CMDJ2KFEZ/p1644607886669289)

@doeg
Copy link
Contributor

doeg commented Feb 11, 2022

@vmogilev so @mattlord kindly brought me up to speed on flags and offered some suggestions for this PR:

  1. Re-use the existing sanitize_log_messages defined in vttablet like this:
diff --git a/go/vt/vtctld/vtctld.go b/go/vt/vtctld/vtctld.go
index 8ddfd67f3e..948eb348a2 100644
--- a/go/vt/vtctld/vtctld.go
+++ b/go/vt/vtctld/vtctld.go
@@ -43,6 +43,7 @@ var (
 	enableRealtimeStats = flag.Bool("enable_realtime_stats", false, "Required for the Realtime Stats view. If set, vtctld will maintain a streaming RPC to each tablet (in all cells) to gather the realtime health stats.")
 	enableUI            = flag.Bool("enable_vtctld_ui", true, "If true, the vtctld web interface will be enabled. Default is true.")
 	durabilityPolicy    = flag.String("durability_policy", "none", "type of durability to enforce. Default is none. Other values are dictated by registered plugins")
+	sanitizeLogMessages bool

 	_ = flag.String("web_dir", "", "NOT USED, here for backward compatibility")
 	_ = flag.String("web_dir2", "", "NOT USED, here for backward compatibility")
@@ -60,6 +61,9 @@ func InitVtctld(ts *topo.Server) error {
 		return err
 	}

+	sanitizeLogMessages = flag.Lookup("sanitize_log_messages").Value.(flag.Getter).Get().(bool)
+
 	actionRepo := NewActionRepository(ts)

 	// keyspace actions
  1. Generalize the flag's help text defined here as it's currently written to be specific for tablets. (I confirmed that it's ok to leave the actual flag.BoolVar definition there, even as we're re-using it elsewhere. Go flags are kinda like the wild west, but I think there are plans to use a more structured flag library at some point.)

So perhaps we can update flag.BoolVar(&currentConfig.SanitizeLogMessages, "sanitize_log_messages", false, "Remove potentially sensitive information in tablet INFO, WARNING, and ERROR log messages such as query parameters.") to something like Remove potentially sensitive information in INFO, WARNING, and ERROR log messages, such as query parameters and sensitive request headers.

I don't think you'll need to modify your tests at all, since this PR is calling sanitizeRequestHeader with a parameter rather than relying on flags. I think this is fine! If, for some reason, you do want to test this behaviour by toggling the flag, then https://github.com/vitessio/vitess/pull/9614/files#diff-522719ab2adeb062f5c7ccfbe820d27b8fdbd9e048946ce0858b1665108f7d50R46 might be a useful example on using flag.Set in a test. :)

@doeg
Copy link
Contributor

doeg commented Feb 11, 2022

Ha, ok I'm back with an even more detailed example from @mattlord 💖 here's a way to more safely look up a flag (in our case, to use the existing flag):

diff --git a/go/vt/vtctld/vtctld.go b/go/vt/vtctld/vtctld.go
index 8ddfd67f3e..26bdb8cf8c 100644
--- a/go/vt/vtctld/vtctld.go
+++ b/go/vt/vtctld/vtctld.go
@@ -43,6 +43,7 @@ var (
 	enableRealtimeStats = flag.Bool("enable_realtime_stats", false, "Required for the Realtime Stats view. If set, vtctld will maintain a streaming RPC to each tablet (in all cells) to gather the realtime health stats.")
 	enableUI            = flag.Bool("enable_vtctld_ui", true, "If true, the vtctld web interface will be enabled. Default is true.")
 	durabilityPolicy    = flag.String("durability_policy", "none", "type of durability to enforce. Default is none. Other values are dictated by registered plugins")
+	sanitizeLogMessages, sanitizeErrorMessages bool

 	_ = flag.String("web_dir", "", "NOT USED, here for backward compatibility")
 	_ = flag.String("web_dir2", "", "NOT USED, here for backward compatibility")
@@ -60,6 +61,23 @@ func InitVtctld(ts *topo.Server) error {
 		return err
 	}

+	slmFlag := flag.Lookup("sanitize_log_messages")
+	if slmFlag != nil {
+		getter, found := slmFlag.Value.(flag.Getter)
+		if found {
+			sanitizeLogMessages = getter.Get().(bool)
+		}
+	}
+	semFlag := flag.Lookup("sanitize_error_messages")
+	if semFlag != nil {
+		getter, found := semFlag.Value.(flag.Getter)
+		if found {
+			sanitizeErrorMessages = getter.Get().(bool)
+		}
+	}
+
+	log.Infof("Starting vtctld. Log sanitization: %v Error sanitization: %v", sanitizeLogMessages, sanitizeErrorMessages)
+
 	actionRepo := NewActionRepository(ts)

 	// keyspace actions

Also, per suggestion (2) above, if we update the helptext for the existing sanitize_log_messages flag, we'll also want to (separately) update the references in the vitess.io website: https://github.com/vitessio/website/search?q=sanitize_log_messages

@vmogilev
Copy link
Contributor Author

Hi @doeg , thank you for helping with CI failures - good catch on the duplicate flag name.

Small concern I'd like to run by you -- if we go with this shared flag solution, do you know if we'll be able to pull-in this change into our version of Vitess?

My concern is around the bit where we now depend on something we don't yet have in our branch (flag defined in another commit/pkg).

I also feel a bit weird about this indirect dependency, but if this is the pattern we already use - then perhaps it's warranted to repeat it here.

@deepthi
Copy link
Member

deepthi commented Feb 11, 2022

I also feel a bit weird about this indirect dependency, but if this is the pattern we already use - then perhaps it's warranted to repeat it here.

It's unfortunate that vtctld depends on some vttablet packages.
go list tells me that one of them is vitess.io/vitess/go/vt/vttablet/tabletserver/tabletenv so unfortunately that means that vtctld includes all flags from that package whether or not they make any sense for vtctld.

We don't use the flag.Lookup pattern very much. It's only in a couple of places in the logutil package right now. But it does seem like that is the only option if we want to use the same flag name.

Someday we will implement VEP-4 and clean all this up.

@vmogilev
Copy link
Contributor Author

between the two options:

  1. reuse vttablet flag ( sanitize_log_messages )
  2. implement a local vtctld flag ( vtctld_sanitize_log_messages )

I tend to favor 2)

@doeg
Copy link
Contributor

doeg commented Feb 14, 2022

between the two options:

  1. reuse vttablet flag ( sanitize_log_messages )
  2. implement a local vtctld flag ( vtctld_sanitize_log_messages )
    I tend to favor 2)

The second option sounds fine to me, but I defer to @mattlord and @deepthi as I work in the vtctld codebase too rarely to have formed any strong opinions. :)

Both ways have their merits. Selfishly, I do like that a separate flag will make cherry picking this onto our old release more straightforward.

(If Vitess's flag usage will change with vitessio/enhancements#8, perhaps more granular flags will make a migration to Cobra a tiny bit easier?)

Signed-off-by: Vitaliy Mogilevskiy <[email protected]>
@vmogilev
Copy link
Contributor Author

vmogilev commented Feb 14, 2022

sounds good - I think we already have a 👍 from @deepthi (see above)

@mattlord any objections if on this PR, we stick with a localized ( vtctld_sanitize_log_messages ) flag?

@doeg I just implemented this change to verify it works - it's currently pending an approval of the CI tests.

Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Really appreciate the time and effort put into the tests. ❤️

@doeg doeg merged commit bd5ad49 into vitessio:main Feb 16, 2022
@vmogilev vmogilev deleted the vm_vitess-8277_vtctld-leaks-pii-http-logs2 branch February 16, 2022 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vtctld should sanitize HTTP requests before logging to avoid logging sensitive headers
4 participants