There's a new PR in flight to set the VMX version of machines when booting: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/pull/1673 .
I wanted to know whether this would mean there's a hackaround where I could
- take a non-UEFI (i.e. a BIOS) image
- turn EFI boot on (by modifying the vSphere template parameters)
- set the hardware version for the booted VM to 17 (by doing a "Upgrade Template to ESXI version" in vSphere).
If so, that would mean people could easily run GPU workloads on TKG with vSphere without needing to build custom OVAs. However, I found out I couldn't get this to work, because evidently UEFI images need to be set up at the disk level. So even if you
- switch the firmware to EFI (to boot via UEFI)
- upgrade the hardware version (to support GPUs),
the disk image still has to be set up for UEFI to begin with: the EFI firmware looks for an EFI system partition with a bootloader on it, and a BIOS image doesn't have one.
Well, anyway, here are some screenshots of what I learned/tried. For folks wanting to run GPU containers on Kubernetes in VMware Tanzu, we have our TKG docs here https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.6/vmware-tanzu-kubernetes-grid-16/GUID-build-images-linux.html which have the official instructions. Consult those for official advice.
These are just some hacky notes and screenshots I took during my failed experiment :)
If you don't have a UEFI partition but you set "EFI" as the boot option in vSphere, you get this thing where the boot fails.
So, in order to UEFI-boot TKG 1.6, you need to
- run TKG's image-builder to build a UEFI-compatible image with the filesystem laid out for it
- upgrade the hardware version from 15 -> 17 in this process; this is an image-builder parameter
How can I know if my image has this setting?
Run govc vm.info -json /dc0/vm/ubuntu-2004-kube-v1.23.10+vmware.1-tkg.1 | grep -i vmx
In the Config section of the output you'll see something like this (HwVersion, at the end, is the field to look for):
"Config": {
  "Name": "gpu3-upgraded-to-host",
  "Template": true,
  "VmPathName": "[108-ds01] gpu3-upgraded-to-host/gpu3-upgraded-to-host.vmtx",
  "MemorySizeMB": 2048,
  "CpuReservation": 0,
  "MemoryReservation": 0,
  "NumCpu": 2,
  "NumEthernetCards": 1,
  "NumVirtualDisks": 1,
  "Uuid": "42390be8-c541-c75a-434a-78b885684a47",
  "InstanceUuid": "503947a2-a4cf-a0c5-12d2-57fda4a482f3",
  "GuestId": "ubuntu64Guest",
  "GuestFullName": "Ubuntu Linux (64-bit)",
  "Annotation": "Cluster API vSphere image - Ubuntu 20.04 and Kubernetes v1.23.10+vmware.1 - https://github.com/kubernetes-sigs/cluster-api-provider-vsphere",
  "Product": {
    "Key": 0,
    "ClassId": "",
    "InstanceId": "",
    "Name": "Ubuntu 20.04 and Kubernetes v1.23.10+vmware.1",
    "Vendor": "VMware Inc.",
    "Version": "kube-v1.23.10+vmware.1",
    "FullVersion": "kube-v1.23.10+vmware.1",
    "VendorUrl": "https://vmware.com",
    "ProductUrl": "",
    "AppUrl": ""
  },
  "InstallBootRequired": false,
  "FtInfo": null,
  "ManagedBy": null,
  "TpmPresent": false,
  "NumVmiopBackings": 0,
  "HwVersion": "vmx-19"
},
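If you'd rather not eyeball the JSON, here's a rough Python sketch that pulls HwVersion out of the govc output and compares it to the version 17 mentioned above. A few assumptions that aren't from my experiment: the helper names (find_key, hardware_version) are made up for illustration, the "VirtualMachines"/"Summary" wrapper in the sample is a guess at how govc nests things, and the key lookup is case-insensitive because the exact field casing may differ between govc versions.

```python
import json
import re

# Hardware version these notes treat as the minimum for GPU workloads.
MIN_GPU_HW_VERSION = 17


def find_key(node, wanted):
    """Depth-first search for the first non-null value stored under `wanted`.

    The key comparison is case-insensitive; the exact nesting of the govc
    output is not assumed, so the whole document is searched.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key.lower() == wanted.lower() and value is not None:
                return value
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, wanted)
        if found is not None:
            return found
    return None


def hardware_version(govc_json_text):
    """Return the numeric hardware version (19 for "vmx-19"), or None."""
    value = find_key(json.loads(govc_json_text), "HwVersion")
    if not isinstance(value, str):
        return None
    match = re.fullmatch(r"vmx-(\d+)", value)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Trimmed-down sample; the outer wrapper keys are assumed, not verified.
    sample = (
        '{"VirtualMachines": [{"Summary": {"Config": '
        '{"Template": true, "HwVersion": "vmx-19"}}}]}'
    )
    version = hardware_version(sample)
    print(version, version is not None and version >= MIN_GPU_HW_VERSION)
```

To try it on a real template, you could save the output of the govc command above to a file and pass the file contents to hardware_version. Keep in mind this only checks the hardware version; it says nothing about whether the disk inside the image is actually laid out for UEFI.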