The latest update for VMware ESXi 8.0, designated Update 3b, brings significant improvements and fixes. One of the resolved issues is an unclear message shown during the precheck that runs before an ESXi update: the previous wording, "Hosts are remediated", could be misleading and has been changed to the more precise "Hosts are remediated sequentially". In addition, the previous release (8.0 Update 2) had time-synchronization problems on ESXi hosts caused by changes in the NTP package; the latest update eliminates these errors and restores correct synchronization with NTP servers. For more details, read on.
Resolved issues:
- PR 3421434: When you migrate virtual machines with snapshots from a vSAN ESA 8.x datastore, you might see errors at the destination datastore
In vSAN ESA 8.x environments, under certain conditions, migrating VMs with snapshots from a vSAN ESA datastore by using either a Storage vMotion, cross vMotion, cold relocate or clone operation might result in errors at the destination datastore, such as VMs failing to boot up. The issue is specific to vSAN ESA and is not applicable to vSAN OSA. It can affect both the VM snapshots and the running VM.
This issue is resolved in this release.
- PR 3388844: You cannot activate Kernel Direct Memory Access (DMA) Protection for Windows guest OS on ESXi hosts with Intel CPU
If an input–output memory management unit (IOMMU) is active on a Windows guest OS, the Kernel DMA Protection option under System Information might be off for VMs running on ESXi hosts with Intel CPUs. As a result, you might not be able to fulfill some security requirements for your environment.
This issue is resolved in this release. The fix deactivates Kernel DMA Protection by default. ESXi 8.0 Update 3b adds the vmx parameter acpi.dmar.enableDMAProtection, and to activate Kernel DMA Protection in a Windows guest OS you must add acpi.dmar.enableDMAProtection=TRUE to the vmx file.
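As a minimal sketch only, assuming the usual key = "value" syntax of .vmx configuration files and a hypothetical datastore path, the parameter can be appended from the ESXi shell while the VM is powered off:
  # Hypothetical VM path; acpi.dmar.enableDMAProtection is the parameter named above
  echo 'acpi.dmar.enableDMAProtection = "TRUE"' >> /vmfs/volumes/datastore1/WinVM/WinVM.vmx
The same parameter can also be added through the VM's advanced configuration parameters in the vSphere Client before powering the VM on.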
- PR 3377863: You see a "Hosts are remediated" message in the upgrade precheck results
When running a precheck before an ESXi update, you see a message such as Hosts are remediated in the precheck results, which is not clear and might be misleading.
This issue is resolved in this release. The new message is Hosts are remediated sequentially.
- PR 3417329: Virtual machine tasks might intermittently fail due to a rare issue with the memory slab
Due to a rare issue with the vSAN DOM object, where a reference count on a component object might not decrement correctly, the in-memory object might never be released from the slab and can cause the component manager slab to reach its limit. As a result, you might not be able to create or migrate VMs, or you might encounter VM power-on failures on vSAN clusters, either OSA or ESA.
This issue is resolved in this release.
- PR 3406140: Extracting a vSphere Lifecycle Manager image from an existing ESXi host might fail after a kernel configuration change
Each update of the kernel configuration also triggers an update to the /bootbank/useropts.gz file, but due to a known issue, the basemisc.tgz might not contain the default useropts file after such an update. As a result, when attempting to extract a vSphere Lifecycle Manager image from an existing ESXi host, the absence of the default useropts file leads to failure to create the esx-base.vib file and the operation also fails.
This issue is resolved in this release.
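A quick diagnostic sketch, assuming busybox tar is available in the ESXi shell and that the bootbank paths below are where your host keeps these files, to check whether the default useropts file is still present after a kernel configuration change:
  # Both paths are assumptions based on the description above; the commands are read-only
  ls -l /bootbank/useropts.gz
  tar -tzf /bootbank/basemisc.tgz | grep useropts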
- PR 3408477: Some ESXi hosts might not have a locker directory after an upgrade from ESXi 6.x to 8.x
When you upgrade an ESXi host with a boot disk of less than 10 GB and not on USB from ESXi 6.x to 8.x, the upgrade process might not create a locker directory and the /locker symbolic link is not active.
This issue is resolved in this release. If you already face the issue, upgrading to ESXi 8.0 Update 3b creates a locker directory but does not automatically create a VMware Tools repository. As a result, clusters that contain such ESXi hosts display as non-compliant. Remediate the cluster again to create a VMware Tools repository and to become compliant.
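On a host you suspect is affected, a minimal read-only check from the ESXi shell (the packages subdirectory is an assumption about where the VMware Tools repository normally lives):
  # Verify that /locker resolves to a real directory and that a Tools repository exists
  ls -l /locker
  ls /locker/packages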
- PR 3422005: ESXi hosts of version 8.0 Update 2 and later might fail to synchronize time with certain NTP servers
A change in the ntp-4.2.8p17 package in ESXi 8.0 Update 2 might cause the NTP client to reject certain server packets as poorly formatted or invalid. For example, if a server sends packets with a ppoll value of 0, the NTP client on ESXi does not synchronize with the server.
This issue is resolved in this release.
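To check whether a host is failing to synchronize, a small sketch, assuming the ntpq utility and the esxcli system ntp namespace are available on your ESXi build:
  # Read-only checks of the NTP client state
  esxcli system ntp get      # configured servers and whether the NTP service is enabled
  ntpq -p localhost          # peer list; no '*' selection marker suggests the client is not synchronized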
- PR 3408145: The vmx service might fail with a core dump due to a rare issue with the vSphere Data Protection solution running out of resources
In rare cases, if a backup operation starts during high concurrent guest I/O load, for example a VM with high write I/O intensity and a high number of overwrites, the VAIO filter component of the vSphere Data Protection solution might run out of resources to handle the guest write I/Os. As a result, the vmx service might fail with a core dump and restart.
This issue is resolved in this release. Alternatively, run backups during periods of low guest I/O activity.
- PR 3405912: In the vSphere Client, you do not see the correct total vSAN storage consumption for which you have a license
Due to a rare race condition in environments with many vSAN clusters managed by a single vCenter instance, in the vSphere Client under Licensing > Licenses you might see a discrepancy between the total claimed vSAN storage capacity and the reported value for the clusters.
This issue is resolved in this release.
- PR 3414588: Snapshot tasks on virtual machines on NVMe/TCP datastores might take much longer than on VMs provisioned on NVMe/FC datastores
When creating or deleting a snapshot on a VM provisioned on datastores backed by NVMe/TCP namespaces, such tasks might take much longer than on VMs provisioned on datastores backed by NVMe/FC namespaces. The issue occurs because the nvmetcp driver handles some specific NVMe commands in a way that the NVMe/TCP target systems do not expect.
This issue is resolved in this release.
- PR 3392173: A rare issue with the Virsto vSAN component might cause failure to create vSAN objects, or unmount disk groups, or reboot ESXi hosts
In very rare cases, if a Virsto component creation task fails, it might not be properly handled and cause background deletion of virtual disks to stop. As a result, deleting virtual disks in tasks such as creating vSAN objects, or unmounting of disk groups, or rebooting ESXi hosts does not occur as expected and might cause such tasks to fail.
This issue is resolved in this release.
- PR 3415365: ESXi upgrade to 8.0 Update 3 fails with an error in the vFAT bootbank partitions
ESXi 8.0 Update 3 adds a precheck in the upgrade workflow that uses the dosfsck tool to catch vFAT corruptions. One of the errors that dosfsck flags is the dirty bit set, but ESXi does not use that concept and such errors are false positives.
In the vSphere Client, you see an error such as A problem with one or more vFAT bootbank partitions was detected. Please refer to KB 91136 and run dosfsck on bootbank partitions.
In the remediation logs on ESXi hosts, you see logs such as:
2024-07-02T16:01:16Z In(14) lifecycle[122416262]: runcommand:199 runcommand called with: args = ['/bin/dosfsck', '-V', '-n', '/dev/disks/naa.600508b1001c7d25f5336a7220b5afc1:6'], outfile = None, returnoutput = True, timeout = 10.
2024-07-02T16:01:16Z In(14) lifecycle[122416262]: upgrade_precheck:1836 dosfsck output: b'CP850//TRANSLIT: Invalid argument\nCP850: Invalid argument\nfsck.fat 4.1+git (2017-01-24)\n0x25: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.\n Automatically removing dirty bit.\nStarting check/repair pass.\nStarting verification pass.\n\nLeaving filesystem unchanged.\n/dev/disks/naa.600508b1001c7d25f5336a7220b5afc1:6: 121 files, 5665/65515 clusters\n'
This issue is resolved in this release.
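The precheck can be reproduced manually in the ESXi shell by running the same command shown in the log; the device name below is the example from that log and must be replaced with the bootbank partition of your own host (see KB 91136):
  # Read-only check (-n) of a vFAT bootbank partition; the device name is only an example
  /bin/dosfsck -V -n /dev/disks/naa.600508b1001c7d25f5336a7220b5afc1:6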
- PR 3421084: An ESXi host might become temporarily inaccessible from vCenter if an NFSv3 datastore fails to mount during reboot or bring up
During an ESXi host reboot, if an NFSv3 datastore fails to mount during the reboot or bring up in VMware Cloud Foundation environments, retries to mount the datastore continue in the background. However, while the datastore is still not available, the hostd daemon might fail with a core dump when trying to access it and cause the host to lose connectivity to the vCenter system for a short period.
This issue is resolved in this release.
- PR 3407251: ESXi host fails with a purple diagnostic screen due to a rare physical CPU (PCPU) lockup
In the vSphere Client, when you use the Delete from Disk option to remove a virtual machine from a vCenter system and delete all VM files from the datastore, including the configuration file and virtual disk files, if any of the files is corrupted, a rare issue with handling corrupted files in the delete path might lead to a PCPU lockup. As a result, the ESXi host fails with a purple diagnostic screen and a message such as NMI IPI: Panic requested by another PCPU.
This issue is resolved in this release.
- PR 3406627: VMFS6 automatic UNMAP feature might fail to reclaim filesystem space beyond 250 GB
In certain cases, when you delete more than 250 GB of filesystem space on a VMFS6 volume, for example 1 TB, and the volume has no active references such as active VMs, the VMFS6 automatic UNMAP feature might fail to reclaim the filesystem space beyond 250 GB.
This issue is resolved in this release.
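For context, the automatic space-reclamation (UNMAP) settings of a VMFS6 datastore can be inspected from the ESXi shell; this is a read-only sketch and the datastore label is an example:
  # Shows the reclaim granularity and priority configured for the given VMFS6 volume
  esxcli storage vmfs reclaim config get --volume-label=datastore1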
- PR 3419241: After a snapshot deletion or snapshot consolidation, some virtual machines intermittently fail
When a VM on a NFSv3 datastore has multiple snapshots, such as s1, s2, s3, s4, if the VM reverts to one of the snapshots, for example s2, then powers on, and then one of the other snapshots, such as s3, is deleted, the vmx service might fail. The issue occurs because the code tries to consolidate links of a disk that is not part of the VM current state and gets a null pointer. As a result, snapshot consolidation might also fail and cause the vmx service to fail as well.
This issue is resolved in this release. If you already face the issue, power off the VM, edit its .vmx file to add the setting consolidate.upgradeNFS3Locks = "FALSE", and power on the VM, as shown in the sketch below.
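A sketch of that workaround from the ESXi shell, assuming the vim-cmd utility is available; the VM ID, datastore, and file names are placeholders you must substitute:
  # Find the VM ID, power the VM off, append the setting, reload the configuration, power back on
  vim-cmd vmsvc/getallvms
  vim-cmd vmsvc/power.off <vmid>
  echo 'consolidate.upgradeNFS3Locks = "FALSE"' >> /vmfs/volumes/<datastore>/<vm>/<vm>.vmx
  vim-cmd vmsvc/reload <vmid>
  vim-cmd vmsvc/power.on <vmid>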
- PR 3407532: VMs with snapshots and active encryption experience higher I/O latency
In rare cases, encrypted VMs with snapshots might experience higher than expected latency. This issue occurs due to unaligned I/O operations that generate excessive metadata requests to the underlying storage and lead to increased latency.
This issue is resolved in this release. To optimize performance, the VMcrypt I/O filter is enhanced to allocate memory in 4K-aligned blocks for both read and write operations. This reduction in metadata overhead significantly improves overall I/O performance.
- PR 3410311: You cannot log in to the Direct Console User Interface (DCUI) with regular Active Directory credentials
When you try to log in to the DCUI of an ESXi host with a regular Active Directory account by using either a remote management application such as HP Integrated Lights-Out (iLO) or Dell Remote Access Card (DRAC), or a server management system such as Lenovo XCC or Huawei iBMC, login might fail. In the DCUI, you see an error such as Wrong user name or password. In the vmkernel.log file, you see logs such as:
2024-07-01T10:40:53.007Z In(182) vmkernel: cpu1:264954)VmkAccess: 106: dcui: running in dcuiDom(7): socket = /etc/likewise/lib/.lsassd (unix_stream_socket_connect): Access denied by vmkernel access control policy.
The issue occurs due to a restriction of ESXi processes to access certain resources, such as Likewise.
This issue is resolved in this release.
- PR 3403706: Hot extending a non-shared disk in a Windows Server Failover Cluster might result in lost reservations on shared disks
In a WSFC cluster, due to an issue with releasing SCSI reservations, in some cases hot extending a non-shared disk might result in lost reservations on shared disks and failover of the disk resource.
This issue is resolved in this release. The fix makes sure that the release of SCSI reservations is properly handled for all types of shared disks.
- PR 3394043: Creating a vSAN File Service fails when you use IPs within the 172.17.0.0/16 range as mount points
Prior to vSphere 8.0 Update 3b, if the specified file service network overlaps with the default Docker internal network 172.17.0.0/16, you must change your network configuration; otherwise, you see Skyline Health warnings for DNS lookup failures and you cannot create vSAN File Services.
This issue is resolved in this release. The fix routes traffic to the correct endpoint to avoid possible conflicts.
- PR 3389766: During a vSphere vMotion migration of a fault tolerant Primary VM with encryption, the migration task might fail and vSphere FT failover occurs
In rare cases, the encryption key package might not be sent or received correctly during a vSphere vMotion migration of a fault tolerant Primary VM with encryption, and as a result the migration task fails and vSphere FT failover occurs.
This issue is resolved in this release. The fix sends the encryption key at the end of the vSphere FT checkpoint to avoid errors.
- PR 3403680: Mounting of a vSphere Virtual Volumes stretched storage container fails with an undeclared fault
In the vSphere Client, you might see the error Undeclared fault while mounting a newly created vSphere Virtual Volumes stretched storage container from an existing storage array. The issue occurs due to a rare race condition. vSphere Virtual Volumes generates a core dump and restarts after the failure.
This issue is resolved in this release.
- PR 3408300: You cannot remove or delete a VMFS partition on a 4K native (4Kn) Software Emulation (SWE) disk
When you attempt to remove or delete a VMFS partition on a 4Kn SWE disk, in the vSphere Client you see an error such as Read-only file system during write on /dev/disks/<device name> and the operation fails. In the vmkernel.log, you see entries such as in-use partition <part num>, modification is not supported.
This issue is resolved in this release.
- PR 3402823: Fresh installation or creating VMFS partitions on Micron 7500 or Intel D5-P5336 NVMe drives might fail with a purple diagnostic screen
UNMAP commands enable ESXi hosts to release storage space that is mapped to data deleted from the host. In NVMe, the equivalent of UNMAP commands is a deallocate DSM request. Micron 7500 and Intel D5-P5336 devices advertise a very large value in one of the deallocate limit attributes, DMSRL, which is the maximum number of logical blocks in a single range for a Dataset Management command. This leads to an integer overflow when the ESXi unmap split code converts the number of blocks to a number of bytes, which in turn might cause a failure of either the installation or the VMFS creation. You see a purple diagnostics screen with an error such as Exception 14 or corruption in dlmalloc. The issue affects ESXi 8.0 Update 2 and later.
This issue is resolved in this release.
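For illustration only, with assumed numbers: if a device advertised a DMSRL of 0xFFFFFFFF blocks and the host multiplied that by a 512-byte block size, the resulting byte count of roughly 2.2 × 10^12 would no longer fit in an unsigned 32-bit variable, which is the kind of overflow described above; the actual block size and variable widths involved are not stated in the release note.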
- PR 3396479: Standard image profiles for ESXi 8.0 Update 3 show the last modified date as the release date
The Release Date field of the standard image profile for ESXi 8.0 Update 3 shows the Last Modified Date value. The issue is only applicable to the image profiles used in Auto Deploy or ESXCLI. Base images used in vSphere Lifecycle Manager workflows display the release date correctly. This issue has no functional impact, but if you search for profiles by release date, the affected profile is not listed under its actual release date.
This issue is resolved in this release.