
Multiple Scheduler Bugs: Deployment Update Resource Allocation and GPU Utilization #303

Open
michael-nammi opened this issue May 10, 2024 · 3 comments

Comments

@michael-nammi

michael-nammi commented May 10, 2024

Environment

Kubernetes version: v1.27.9
HAMi version: v2.3.9

Bug 1: Possible Scheduler Bug When Updating Deployment with Insufficient Resources

Encountered a scheduler bug when updating a Deployment's resource requirements beyond the available capacity in a Kubernetes cluster with heterogeneous memory and GPU resources.

Steps to reproduce the issue

  1. Pre-conditions:
    • Node 1: 4GiB Memory, 1 GPU
    • Node 2: 4GiB Memory, 1 GPU
    • Node 3: 16GiB Memory, 2 GPUs (each GPU with 16GiB)
  2. Create Deployment A:
    • Replicas: 1
    • Memory requirement: 16GiB
    • GPU requirement: 2
  3. Create Deployment B:
    • Replicas: 1
    • Memory requirement: 4GiB
    • GPU requirement: 1
  4. Delete Deployment A
  5. Modify Deployment B (see the kubectl sketch after this list)
    • Change replicas to 3
    • Change memory requirement to 8GiB
    • Change GPU requirement to 2
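
The update in step 5 can be made either by editing the live object or by re-applying a modified manifest. A minimal sketch, assuming the manifest from the comment below is saved as deployment-b.yaml (the filename is an assumption):

# Option 1: edit the live Deployment and change replicas/limits interactively
kubectl edit deployment deployment-b

# Option 2: re-apply an updated manifest (deployment-b.yaml is a placeholder filename)
kubectl apply -f deployment-b.yaml

# Watch the rollout; with insufficient capacity the extra replicas should stay Pending
kubectl rollout status deployment/deployment-b
kubectl get pods -l app=gpu -o wide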

Expected Behavior

The update should fail because there is not enough memory and GPUs available in the cluster to satisfy the requirements of 3 replicas of Deployment B with the specified resources.

  • Node 1: 4GiB Memory occupied by the pre-existing replica of Deployment B
  • Node 2: Unchanged (idle)
  • Node 3: 2 replicas of Deployment B fully occupy the memory of both GPUs (8GiB per replica on each GPU, i.e. 16GiB per GPU)

Actual Behavior

The update fails, but the node resource allocation is reported incorrectly (a kubectl sketch for inspecting the reported allocation follows this list):

  • Node 1: 4GiB Memory
  • Node 2: Unchanged (idle)
  • Node 3: allocated GPU memory is reported as 8GiB and 12GiB, which is inconsistent with the expected result of both GPUs being fully allocated (16GiB each)
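
A minimal way to cross-check the reported allocation against the scheduler's view, using only standard kubectl commands (the node name is a placeholder):

# Allocated extended resources (nvidia.com/gpu, nvidia.com/gpumem) as seen by Kubernetes
kubectl describe node <node-3> | grep -A 10 "Allocated resources"

# Which replicas actually landed on which node
kubectl get pods -l app=gpu -o wide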

Prometheus Metrics

[Prometheus metrics screenshot showing the GPU memory allocation reported for Node 3]
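
One rough way to pull the underlying numbers directly, assuming the HAMi scheduler's metrics endpoint is reachable; the address below is a placeholder and the grep filter is illustrative rather than an exact HAMi metric name:

# Dump the raw Prometheus metrics and filter for GPU-related series
curl -s http://<scheduler-host>:<metrics-port>/metrics | grep -i gpu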

Bug 2: Incorrect GPU Utilization

Encountered a scheduler bug when creating a Deployment whose GPU-core requirement exceeds 100% of a single GPU in a Kubernetes cluster with heterogeneous memory and GPU resources; a verification sketch follows the reproduction steps below.

Steps to reproduce the issue

  1. Pre-conditions:
    • Node 1: 4GiB Memory, 1 GPU (Max Utilization: 100%)
    • Node 2: 4GiB Memory, 1 GPU (Max Utilization: 100%)
    • Node 3: 16GiB Memory, 2 GPUs (each GPU with 16GiB Memory and Max Utilization: 100%)
  2. Create Deployment A:
    • Replicas: 1
    • Memory requirement: 4GiB
    • GPU requirement: 1
    • GPUcores requirement: 120 (i.e., more than 100% GPU utilization, taking 100 as the maximum)
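
As mentioned above, a quick verification sketch for checking whether the pod was rejected (expected) or scheduled anyway (actual), using standard kubectl commands:

# If the gpucores request were rejected, the pod should stay Pending
kubectl get pods -l app=gpu -o wide

# Look for a FailedScheduling event (expected) versus a normal Scheduled event (actual)
kubectl get events --sort-by=.lastTimestamp | grep -i schedul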

Expected Behavior

The deployment should fail to be scheduled due to the GPU utilization requirement exceeding the maximum limit of 100%.

  • Node 1: Memory should remain unallocated (4GiB)
  • Node 2: Memory should remain unallocated (4GiB)
  • Node 3: Both GPUs should remain unallocated (16GiB + 100%, and 16GiB + 100%)

Actual Behavior

The deployment is incorrectly scheduled with the following resource allocation:

  • Node 1: Unchanged (4GiB Memory idle)
  • Node 2: Unchanged (4GiB Memory idle)
  • Node 3: Resources are reported incorrectly:
    • First GPU: Appears as if 4GiB Memory + 100% Utilization has been allocated to Deployment A (should be no allocation)
    • Second GPU: Unallocated (16GiB Memory and 100% Utilization idle)

Hi @michael-nammi,
Thanks for opening an issue!
We will look into it as soon as possible.


@wawa0210
Member

@michael-nammi It would be great if you could provide the YAML for each deployment used in the test process; that would speed up our troubleshooting.

@michael-nammi
Author

Here are the YAML files for the deployments:

Test for bug 1

Steps:
  1. Create Deployment A:
    • Replicas: 1
    • Memory requirement: 16GiB
    • GPU requirement: 2
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu
  template:
    metadata:
      labels:
        app: gpu
    spec:
      containers:
      - name: ubuntu-container
        image: ubuntu:18.04
        command: ["bash", "-c", "sleep 86400"]
        resources:
          limits:
            nvidia.com/gpu: 2
            nvidia.com/gpumem: 16384
  2. Create Deployment B:
    • Replicas: 1
    • Memory requirement: 4GiB
    • GPU requirement: 1
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu
  template:
    metadata:
      labels:
        app: gpu
    spec:
      containers:
      - name: ubuntu-container
        image: ubuntu:18.04
        command: ["bash", "-c", "sleep 86400"]
        resources:
          limits:
            nvidia.com/gpu: 1
            nvidia.com/gpumem: 4096
  3. Delete Deployment A

    • kubectl delete deployment deployment-a
  4. Modify Deployment B

    • Change replicas to 3
    • Change memory requirement to 8GiB
    • Change GPU requirement to 2
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-b
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gpu
  template:
    metadata:
      labels:
        app: gpu
    spec:
      containers:
      - name: ubuntu-container
        image: ubuntu:18.04
        command: ["bash", "-c", "sleep 86400"]
        resources:
          limits:
            nvidia.com/gpu: 2
            nvidia.com/gpumem: 8192
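
To run this step end-to-end, the updated manifest can be re-applied and the rollout observed; with only two 16GiB GPUs on Node 3, the third replica is expected to stay Pending (the filename is a placeholder):

kubectl apply -f deployment-b.yaml
kubectl rollout status deployment/deployment-b --timeout=120s
kubectl get pods -l app=gpu -o wide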

Test for bug 2

Steps:
  1. Create Deployment A:
    • Replicas: 1
    • Memory requirement: 4GiB
    • GPU requirement: 1
    • GPUcores requirement: 120
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu
  template:
    metadata:
      labels:
        app: gpu
    spec:
      containers:
      - name: ubuntu-container
        image: ubuntu:18.04
        command: ["bash", "-c", "sleep 86400"]
        resources:
          limits:
            nvidia.com/gpu: 1
            nvidia.com/gpumem: 4096
            nvidia.com/gpucores: 120
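
After applying this manifest the expected outcome is a FailedScheduling event, since nvidia.com/gpucores: 120 exceeds the 100% maximum; a quick check (the filename is a placeholder):

kubectl apply -f deployment-a-gpucores.yaml
kubectl get pods -l app=gpu
kubectl describe pod -l app=gpu | grep -A 5 Events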
