28.11.22

UEFI and Hardware versions

There's a new PR in flight to set the VMX version of machines when booting: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/pull/1673 .


I wanted to know whether this would open up a hackaround where I could

- take a non-UEFI (i.e. a BIOS) image

- turn EFI boot on (by modifying the vSphere template parameters)

- set the hardware version for the booted VM to 17 (by doing an "Upgrade Template to ESXi version" in vSphere).


If so, that would mean people could easily run GPU workloads on TKG with vSphere without needing to build custom OVAs. However, I couldn't get this to work, because evidently UEFI images need to be set up at the disk level. So even if you

- switch the firmware (to use UEFI)

- raise the hardware version (to support GPUs),

the disk image still has to be set up to use UEFI to begin with.
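To make "set up at the disk level" a bit more concrete, here is a rough Python sketch of how you could check whether a disk image carries an EFI System Partition. This is my own illustration, not something from the TKG tooling. It assumes a raw disk image with 512-byte sectors and a GPT partition table (a VMDK from a vSphere template would need converting to raw first), and the function name is made up for this example. Finding an EFI System Partition also doesn't prove a bootloader is installed on it; it only tells you whether the disk layout is even a candidate for EFI boot.

```python
import struct
import uuid

SECTOR = 512  # assumption: 512-byte logical sectors
# Partition type GUID that the UEFI spec assigns to the EFI System Partition.
ESP_TYPE = uuid.UUID("c12a7328-f81f-11d2-ba4b-00a0c93ec93b")


def has_efi_system_partition(image: bytes) -> bool:
    """Return True if the raw image bytes contain a GPT entry of type ESP."""
    header = image[SECTOR:2 * SECTOR]  # the GPT header lives in LBA 1
    if header[:8] != b"EFI PART":
        return False  # no GPT signature, so probably an MBR/BIOS layout

    # Header fields: partition-entry array start LBA at offset 72,
    # entry count at offset 80, size of each entry at offset 84.
    (entries_lba,) = struct.unpack_from("<Q", header, 72)
    count, entry_size = struct.unpack_from("<II", header, 80)

    for i in range(count):
        offset = entries_lba * SECTOR + i * entry_size
        type_guid = image[offset:offset + 16]
        if len(type_guid) < 16:
            break  # ran past the bytes we were given
        # GPT stores GUIDs in mixed-endian form, which bytes_le decodes.
        if uuid.UUID(bytes_le=type_guid) == ESP_TYPE:
            return True
    return False
```

The partition table sits at the very start of the disk, so passing in the first megabyte of the image (for example open(path, "rb").read(1 << 20)) is normally enough. A BIOS-only image like the one I started with would come back False here.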

Well, anyway, here are some screenshots of what I learned/tried. For folks wanting to run GPU containers on Kubernetes in VMware Tanzu, we have our TKG docs here: https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.6/vmware-tanzu-kubernetes-grid-16/GUID-build-images-linux.html . Consult those for official instructions and advice.


These are just some hacky notes and screenshots I took during my failed experiment :)

If you don't have a UEFI partition but you set "EFI" as the boot option in vSphere, the boot fails.



So, in order to use UEFI with TKG 1.6, you need to

- run TKG's image-builder to build a UEFI-compatible image with the filesystem laid out for EFI boot

- upgrade the hardware version from 15 -> 17 as part of this process; this is an image-builder parameter

How can I know if my image has this setting?

Run govc vm.info -json /dc0/vm/ubuntu-2004-kube-v1.23.10+vmware.1-tkg.1 | grep -i vmx

You'll see something like this; the field to look at is HwVersion, at the end of the Config block:

        "Config": {
          "Name": "gpu3-upgraded-to-host",
          "Template": true,
          "VmPathName": "[108-ds01] gpu3-upgraded-to-host/gpu3-upgraded-to-host.vmtx",
          "MemorySizeMB": 2048,
          "CpuReservation": 0,
          "MemoryReservation": 0,
          "NumCpu": 2,
          "NumEthernetCards": 1,
          "NumVirtualDisks": 1,
          "Uuid": "42390be8-c541-c75a-434a-78b885684a47",
          "InstanceUuid": "503947a2-a4cf-a0c5-12d2-57fda4a482f3",
          "GuestId": "ubuntu64Guest",
          "GuestFullName": "Ubuntu Linux (64-bit)",
          "Annotation": "Cluster API vSphere image - Ubuntu 20.04 and Kubernetes v1.23.10+vmware.1 - https://github.com/kubernetes-sigs/cluster-api-provider-vsphere",
          "Product": {
            "Key": 0,
            "ClassId": "",
            "InstanceId": "",
            "Name": "Ubuntu 20.04 and Kubernetes v1.23.10+vmware.1",
            "Vendor": "VMware Inc.",
            "Version": "kube-v1.23.10+vmware.1",
            "FullVersion": "kube-v1.23.10+vmware.1",
            "VendorUrl": "https://vmware.com",
            "ProductUrl": "",
            "AppUrl": ""
          },
          "InstallBootRequired": false,
          "FtInfo": null,
          "ManagedBy": null,
          "TpmPresent": false,
          "NumVmiopBackings": 0,
          "HwVersion": "vmx-19"
        },
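If you'd rather not eyeball the JSON, here is a small Python sketch that pulls the HwVersion value out of the govc output and checks it against the version 17 I was aiming for. It is only an illustration: the helper names are made up, and because I don't want to assume exactly where the Config block sits in the full document (or how the keys are cased in your govc version), it just searches the whole structure for the key, case-insensitively.

```python
import json
import re


def find_values(node, key):
    """Recursively collect every value stored under `key` (case-insensitive)."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k.lower() == key.lower():
                found.append(v)
            found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


def hw_version_number(hw_version):
    """Turn a string like 'vmx-19' into the integer 19."""
    match = re.fullmatch(r"vmx-(\d+)", hw_version)
    if match is None:
        raise ValueError(f"unexpected hardware version string: {hw_version!r}")
    return int(match.group(1))


def meets_minimum(govc_json_text, minimum=17):
    """True if at least one HwVersion is present and all are >= minimum."""
    data = json.loads(govc_json_text)
    versions = find_values(data, "HwVersion")
    return bool(versions) and all(
        hw_version_number(v) >= minimum for v in versions
    )


# Trimmed-down copy of the Config block shown above.
sample = '{"Config": {"Name": "gpu3-upgraded-to-host", "Template": true, "HwVersion": "vmx-19"}}'
print(meets_minimum(sample))  # prints True
```

To use it against a real template, feed it the output of the govc vm.info -json command from above, for example by reading sys.stdin.read() and passing that to meets_minimum.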



