-
Notifications
You must be signed in to change notification settings - Fork 64
Add flag to enable pprof + fix --metrics-bind-address #193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for agent-sandbox canceled.
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: antonipp The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Hi @antonipp. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/ok-to-test |
cmd/agent-sandbox-controller/main.go
Outdated
| if enablePprof { | ||
| setupLog.Info("pprof enabled") | ||
| metricsOpts.ExtraHandlers = map[string]http.Handler{ | ||
| "/debug/pprof/": http.HandlerFunc(pprof.Index), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we can enable just pprof profile and rest under a different flag ?
barney-s
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding pprof support and fixing the metrics address! This will be very helpful for performance analysis.
I have a few suggestions to improve this further, mainly around separating the health probe endpoint and considering the security implications of exposing all pprof endpoints. Please see the detailed comments.
cmd/agent-sandbox-controller/main.go
Outdated
| "Enable leader election for controller manager. "+ | ||
| "Enabling this will ensure there is only one active controller manager.") | ||
| flag.BoolVar(&extensions, "extensions", false, "Enable extensions controllers.") | ||
| flag.BoolVar(&enablePprof, "enable-pprof", false, "Enable pprof profiling endpoints on the metrics server.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The help text could be more descriptive. For example: "Enable profiling via pprof handlers on the metrics endpoint. All pprof endpoints will be exposed under /debug/pprof/."
| } | ||
| if enablePprof { | ||
| setupLog.Info("pprof enabled") | ||
| metricsOpts.ExtraHandlers = map[string]http.Handler{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exposing all pprof endpoints, especially the index, with a single flag in a production environment can be a security concern as it might leak sensitive information. It is better to have separate flags for different profiles or make it clear this is for debugging purposes only.
cmd/agent-sandbox-controller/main.go
Outdated
| if enablePprof { | ||
| setupLog.Info("pprof enabled") | ||
| metricsOpts.ExtraHandlers = map[string]http.Handler{ | ||
| "/debug/pprof/": http.HandlerFunc(pprof.Index), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pprof.Index handler provides links to all available profiles. If this controller is used in a multi-tenant environment, this could potentially leak information. Consider exposing only specific, non-sensitive profiles like profile and trace if this is intended for production use.
| metricsOpts.ExtraHandlers = map[string]http.Handler{ | ||
| "/debug/pprof/": http.HandlerFunc(pprof.Index), | ||
| "/debug/pprof/cmdline": http.HandlerFunc(pprof.Cmdline), | ||
| "/debug/pprof/profile": http.HandlerFunc(pprof.Profile), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The /debug/pprof/cmdline endpoint exposes the command-line arguments of the process. In a Kubernetes environment, this could leak sensitive information passed as flags. It would be safer to not expose this particular endpoint by default.
cmd/agent-sandbox-controller/main.go
Outdated
| "/debug/pprof/": http.HandlerFunc(pprof.Index), | ||
| "/debug/pprof/cmdline": http.HandlerFunc(pprof.Cmdline), | ||
| "/debug/pprof/profile": http.HandlerFunc(pprof.Profile), | ||
| "/debug/pprof/symbol": http.HandlerFunc(pprof.Symbol), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The /debug/pprof/symbol endpoint can be used to look up program counters, potentially giving an attacker more information about the binary's structure to aid in exploiting other vulnerabilities. This endpoint is not typically needed for standard profiling and should be disabled unless required for a specific debugging scenario.
|
Ok I split the flags into two: just tbh I just initially copied what Cilium had here: https://github.com/cilium/cilium/blob/3f34d20b73548b6fb68df9b3e79198178e2e1939/pkg/pprof/cell.go#L125-L129 |
Signed-off-by: Anton Ippolitov <[email protected]>
Description
This PR adds the possibility of enabling profiling. We would like to use the controller to run at a fairly large scale (pools of 1000s-10000s of sandboxes) so it would be great to have extra observability to detect potential bottlenecks.
I also noticed that
--metrics-bind-addresswas not used at all so I fixed that too. If enabled, thepprofendpoints will be exposed on the same port.Testing
Verified that this worked by running a forked version of the controller and enabling pprof. The profiles were properly collected (in our case by Datadog):
