
Eventing pods request much more CPU and memory than they need #17168

Closed
2 tasks done
pbochynski opened this issue Mar 24, 2023 · 9 comments · Fixed by #18006
Labels
area/eventing Issues or PRs related to eventing


@pbochynski
Contributor

pbochynski commented Mar 24, 2023

Description

The default installation of eventing requests a lot of resources:

  • publisher proxy: 410m CPU and 448Mi memory x 2 pods
  • controller: 410m CPU and 704Mi memory x 1 pod
  • nats: 400m CPU and 576Mi memory x 3 pods

In total that amounts to 2.43 CPU and 3.33 GB of memory.
The actual usage of the eventing pods in an idle cluster is 0.015 CPU and 0.35 GB.

Could you please adjust the request settings to allow better utilization of cluster resources?
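
For reference, these figures come from each component's resources.requests settings, i.e. what Kubernetes reserves per replica regardless of actual usage. A minimal sketch of that shape, using the publisher-proxy numbers listed above:

  resources:
    requests:
      cpu: 410m      # reserved per replica
      memory: 448Mi  # reserved per replica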

Acceptance

  • Combine the settings used in MPS-Config and the kyma/production profile and make them as low as possible
  • Validate using the load tester that, with the minimal settings, eventing still comes up and can handle a low workload (10 eps)
@pbochynski pbochynski assigned pbochynski and k15r and unassigned pbochynski Mar 24, 2023
@muralov muralov added the area/eventing Issues or PRs related to eventing label Mar 24, 2023
@kyma-bot
Contributor

This issue or PR has been automatically marked as stale due to the lack of recent activity.
Thank you for your contributions.

This bot triages issues and PRs according to the following rules:

  • After 60d of inactivity, lifecycle/stale is applied
  • After 7d of inactivity since lifecycle/stale was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Close this issue or PR with /close

If you think that I work incorrectly, kindly raise an issue with the problem.

/lifecycle stale

@kyma-bot kyma-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 23, 2023
@pbochynski
Contributor Author

Still valid.

@pbochynski pbochynski removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 25, 2023
@kyma-bot
Contributor

This issue or PR has been automatically marked as stale due to the lack of recent activity.
Thank you for your contributions.

This bot triages issues and PRs according to the following rules:

  • After 60d of inactivity, lifecycle/stale is applied
  • After 7d of inactivity since lifecycle/stale was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Close this issue or PR with /close

If you think that I work incorrectly, kindly raise an issue with the problem.

/lifecycle stale

@kyma-bot kyma-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 24, 2023
@kyma-bot
Contributor

This issue or PR has been automatically closed due to the lack of activity.
Thank you for your contributions.

This bot triages issues and PRs according to the following rules:

  • After 60d of inactivity, lifecycle/stale is applied
  • After 7d of inactivity since lifecycle/stale was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle stale

If you think that I work incorrectly, kindly raise an issue with the problem.

/close

@kyma-bot
Contributor

@kyma-bot: Closing this issue.

In response to this:

This issue or PR has been automatically closed due to the lack of activity.
Thank you for your contributions.

This bot triages issues and PRs according to the following rules:

  • After 60d of inactivity, lifecycle/stale is applied
  • After 7d of inactivity since lifecycle/stale was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle stale

If you think that I work incorrectly, kindly raise an issue with the problem.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k15r k15r removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2023
@k15r k15r reopened this Aug 3, 2023
@k15r
Contributor

k15r commented Aug 3, 2023

The production profile states:

controller:
  resources:
    limits:
      cpu: 1000m
      memory: 512Mi
    requests:
      cpu: 10m
      memory: 256Mi
  publisherProxy:
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 10m
        memory: 256Mi

nats:
  resources:
    limits:
      cpu: 500m
      memory: 1Gi
    requests:
      cpu: 10m
      memory: 512Mi
  logging:
    debug: false
    trace: false

This results in:

requests:
  cpu: 70m
  memory: 2.5Gi 

On SKRs, the current configuration is:

  controller.publisherProxy.replicas: "2"
  controller.publisherProxy.resources.limits.cpu: "500m"
  controller.publisherProxy.resources.limits.memory: "512Mi"
  controller.publisherProxy.resources.requests.cpu: "100m"
  controller.publisherProxy.resources.requests.memory: "64Mi"

  controller.resources.limits.cpu: "500m"
  controller.resources.limits.memory: "1Gi"
  controller.resources.requests.cpu: "100m"
  controller.resources.requests.memory: "64Mi"

  nats.cluster.replicas: "3"
  nats.nats.resources.limits.cpu: "500m"
  nats.nats.resources.limits.memory: "1Gi"
  nats.nats.resources.requests.cpu: "400m"
  nats.nats.resources.requests.memory: "512Mi"

That results in:

requests:
  cpu: 1.5
  memory: 1728Mi
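
For clarity, the breakdown behind those totals (assuming a single controller replica, which the overrides above do not set explicitly):

requests:
  cpu: 1500m      # 1 × 100m (controller) + 2 × 100m (publisher proxy) + 3 × 400m (NATS)
  memory: 1728Mi  # 1 × 64Mi (controller) + 2 × 64Mi (publisher proxy) + 3 × 512Mi (NATS)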

CPU requests can safely be decreased for all three deployments.
Memory requests should also be decreased for EPP and EC.

@mfaizanse
Member

@k15r There is a PR that is even increasing the requested resources for eventing.

@marcobebway marcobebway assigned marcobebway and unassigned k15r Aug 16, 2023
@marcobebway marcobebway linked a pull request Aug 18, 2023 that will close this issue
@marcobebway
Contributor

On SKRs, we reduced the Eventing requested resources as follows:

EC (1 replica):

  • CPU: 40m
  • MEM: 64Mi

EPP (sum of 2 replicas):

  • CPU: 80m (2 * 40m)
  • MEM: 128Mi (2 * 64Mi)

NATS (sum of 3 replicas):

  • CPU: 120m (3 * 40m)
  • MEM: 192Mi (3 * 64Mi)

Total:

  • CPU: 240m (6 * 40m)
  • MEM: 384Mi (6 * 64Mi)
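
Expressed in the flat override format quoted earlier in this thread, those per-replica requests correspond roughly to the following (key names assumed from that earlier snippet, values from the list above):

  controller.resources.requests.cpu: "40m"
  controller.resources.requests.memory: "64Mi"
  controller.publisherProxy.resources.requests.cpu: "40m"
  controller.publisherProxy.resources.requests.memory: "64Mi"
  nats.nats.resources.requests.cpu: "40m"
  nats.nats.resources.requests.memory: "64Mi"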

@marcobebway
Contributor

OSS Production:

controller:
  jetstream:
    retentionPolicy: interest
    streamReplicas: 3
    consumerDeliverPolicy: new
    maxMessages: -1
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 10m
      memory: 64Mi
  publisherProxy:
    replicas: 1
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 10m
        memory: 64Mi

nats:
  cluster:
    enabled: true
    replicas: 3
  reloader:
    enabled: false
  nats:
    jetstream:
      memStorage:
        enabled: true
        size: 1Gi
    resources:
      limits:
        cpu: 500m
        memory: 1Gi
      requests:
        cpu: 10m
        memory: 64Mi
    logging:
      debug: false
      trace: false
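
Assuming a single controller replica (with one publisher-proxy replica and three NATS replicas, as configured above), these production requests add up to roughly:

requests:
  cpu: 50m       # 10m + 1 × 10m + 3 × 10m
  memory: 320Mi  # 64Mi + 1 × 64Mi + 3 × 64Mi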

OSS Evaluation:

controller:
  jetstream:
    retentionPolicy: interest
    streamReplicas: 1
    consumerDeliverPolicy: new
    maxMessages: -1
  resources:
    limits:
      cpu: 20m
      memory: 256Mi
    requests:
      cpu: 1m
      memory: 32Mi
  publisherProxy:
    replicas: 1
    resources:
      limits:
        cpu: 10m
        memory: 32Mi
      requests:
        cpu: 1m
        memory: 16Mi

nats:
  cluster:
    enabled: false
    replicas: 1
  reloader:
    enabled: false
  nats:
    jetstream:
      memStorage:
        enabled: true
        size: 64Mi
    resources:
      limits:
        cpu: 20m
        memory: 64Mi
      requests:
        cpu: 1m
        memory: 16Mi
    logging:
      debug: true
      trace: true
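
By the same reckoning, the evaluation profile (single replicas throughout) adds up to roughly:

requests:
  cpu: 3m       # 1m + 1m + 1m
  memory: 64Mi  # 32Mi + 16Mi + 16Mi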
