
Performance tuning KubeVirt for SQL Server


Following on from my last post, Getting Started With KubeVirt & SQL Server, I want to see if I can improve on the performance of the initial test I ran.

In the previous test I used SQL Server 2025 RC1…so I wanted to change that to RTM (now that it’s been released), but I was getting some strange issues running it in the StatefulSet. SQL Server 2022, however, seemed to have no issues, and as much as I want to investigate what’s going on with 2025 (I’m pretty sure it’s host based, not an issue with SQL 2025 itself)…I want to dive into KubeVirt more…so let’s go with 2022 in both KubeVirt and the StatefulSet.

I also separated out the system databases, user database data, and user database log files onto separate volumes…here’s what the StatefulSet manifest looks like: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
      annotations:
        stork.libopenstorage.org/disableHyperconvergence: "true"
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/rhel/server:2022-CU22-rhel-9.1
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
            - name: MSSQL_DATA_DIR
              value: "/opt/sqlserver/data"
            - name: MSSQL_LOG_DIR
              value: "/opt/sqlserver/log"
          resources:
            requests:
              memory: "8192Mi"
              cpu: "4000m"
            limits:
              memory: "8192Mi"
              cpu: "4000m"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql/
            - name: sqldata
              mountPath: /opt/sqlserver/data/
            - name: sqllog
              mountPath: /opt/sqlserver/log/
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: px-fa-direct-access
    - metadata:
        name: sqldata
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: px-fa-direct-access
    - metadata:
        name: sqllog
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: px-fa-direct-access
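
Once that’s applied, it’s worth sanity-checking that the three PVCs came up and that SQL Server picked up the data/log paths from MSSQL_DATA_DIR and MSSQL_LOG_DIR. A quick sketch (assuming the manifest is saved as mssql-statefulset.yaml, and noting the sqlcmd path varies by image — newer images ship /opt/mssql-tools18, which needs -C to trust the server certificate): –

kubectl apply -f mssql-statefulset.yaml

# PVCs from volumeClaimTemplates are named <template>-<statefulset>-<ordinal>
kubectl get pvc | grep mssql-statefulset-0

# confirm the instance default data/log paths
kubectl exec -it mssql-statefulset-0 -- /opt/mssql-tools18/bin/sqlcmd -C -S localhost -U sa -P 'Testing1122' \
  -Q "SELECT SERVERPROPERTY('InstanceDefaultDataPath') AS data_path, SERVERPROPERTY('InstanceDefaultLogPath') AS log_path"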

And here’s what the KubeVirt VM manifest looks like: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        firmware:
          bootloader:
            efi:
              secureBoot: false
        resources: # requesting same limits and requests for guaranteed QoS
          requests:
            memory: "8Gi"
            cpu: "4"
          limits:
            memory: "8Gi"
            cpu: "4"
        devices:
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi
            # Disk 2: SQL System
            - name: sqlsystem
              disk:
                bus: scsi
            # Disk 3: SQL Data
            - name: sqldata
              disk:
                bus: scsi
            # Disk 4: SQL Log
            - name: sqllog
              disk:
                bus: scsi
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2022-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio
              bridge: {}
              ports:
                - port: 3389 # port for RDP
                - port: 1433 # port for SQL Server      
      networks:
        - name: default
          pod: {}
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqlsystem
          persistentVolumeClaim:
            claimName: sqlsystem
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2022-iso
          persistentVolumeClaim:
            claimName: sql2022-pvc
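
With runStrategy set to Manual, the VM won’t start on its own. Assuming virtctl is installed, starting it and watching it come up looks something like this: –

virtctl start win2025

# wait for the VMI to report Running
kubectl get vmi win2025 -w

# console access for the initial Windows/SQL setup (RDP on port 3389 once networking is up)
virtctl vnc win2025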

I then ran the HammerDB test again…running for 10 minutes with a 2-minute ramp-up time. In the HammerDB CLI, the driver settings for that kind of run look roughly like this (the server address and virtual user count below are placeholders, not necessarily what I used): –
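
dbset db mssqls
diset connection mssqls_server <sql-server-address>
diset tpcc mssqls_driver timed
diset tpcc mssqls_rampup 2
diset tpcc mssqls_duration 10
loadscript
vuset vu 10
vucreate
vurun

Here are the results: –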

# StatefulSet result
TEST RESULT : System achieved 46594 NOPM from 108126 SQL Server TPM

# KubeVirt result
TEST RESULT : System achieved 18029 NOPM from 41620 SQL Server TPM

Oooooook…that has made a difference! KubeVirt TPM is now up to 38% of the StatefulSet TPM. But I’m still seeing high privileged CPU time in the KubeVirt VM.

So I went through the docs and found that there are a whole bunch of options for VM configuration…the first one I tried was the Hyper-V feature. This should allow Windows to use paravirtualized interfaces instead of emulated hardware, reducing VM exit overhead and improving interrupt, timer, and CPU coordination performance.

Here’s what I added to the VM manifest: –

        features:
          hyperv: {} # turns on Hyper-V feature so the guest “thinks” it’s running under Hyper-V - needs the Hyper-V clock timer too, otherwise VM pod will not start
        clock:
          timer:
            hyperv: {} 

N.B. – for more information on what’s happening here, check out this link: –

https://www.qemu.org/docs/master/system/i386/hyperv.html
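
If you want to confirm the enlightenments actually made it into the guest definition, you can inspect the live libvirt domain from inside the virt-launcher pod. A sketch — the pod name suffix is generated, and this assumes the VM is running in the default namespace (libvirt domain names are <namespace>_<vm-name>): –

# find the virt-launcher pod backing the VM
kubectl get pods -l kubevirt.io/domain=win2025

# dump the running domain XML and check for the <hyperv> feature block
kubectl exec -it virt-launcher-win2025-abcde -- virsh dumpxml default_win2025 | grep -A 5 "<hyperv"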

Stopped/started the VM and then ran the test again. Here are the results: –

TEST RESULT : System achieved 40591 NOPM from 94406 SQL Server TPM

Wait, what!? That made a huge difference…it’s now 87% of the StatefulSet result! AND the privileged CPU time has come down.

But let’s not stop there…let’s keep going and see if we can get TPM parity between KubeVirt and SQL in a StatefulSet.

There are a bunch more flags that can be set for the Hyper-V feature and the overall VM, so let’s set some of those: –

        features:
          acpi: {} # ACPI support (power management, shutdown, reboot, device enumeration)
          apic: {} # Advanced Programmable Interrupt Controller (modern interrupt handling for Windows/SQL)
          hyperv: # turns on the Hyper-V vendor feature block so the guest “thinks” it’s running under Hyper-V - needs the Hyper-V clock timer too, otherwise the VM pod will not start
            reenlightenment: {} # Allows guest to update its TSC frequency after migrations or time adjustments
            ipi: {} # Hyper-V IPI acceleration - faster inter-processor interrupts between vCPUs
            synic: {} # Hyper-V Synthetic Interrupt Controller - improves interrupt delivery
            synictimer: {} # Hyper-V synthetic timer - stable high-resolution guest time source
            spinlocks:
              spinlocks: 8191 # Prevents Windows spinlock stalls on SMP systems - avoids boot/timeouts under load
            reset: {} # Hyper-V reset infrastructure - cleaner VM resets
            relaxed: {} # Relaxed timing - reduces overhead when timing deviations occur under virtualization
            vpindex: {} # Per-vCPU indexing - improves Windows scheduler awareness of vCPU layout
            runtime: {} # Hyper-V runtime page support - gives guest better insight into hypervisor behavior
            tlbflush: {} # Hyper-V accelerated TLB flush - improves scalability on multi-vCPU workloads
            frequencies: {} # Exposes host CPU frequency data - allows proper scaling & guest timing
            vapic: {} # Virtual APIC support - reduces interrupt latency and overhead
        clock:
          timer:
            hyperv: {} # Hyper-V clock/timer - stable time source, recommended when using Hyper-V enlightenments

And now, what’s the TPM we’re getting with HammerDB…

TEST RESULT : System achieved 40483 NOPM from 94051 SQL Server TPM

Ha ha! Nearly there…so what else can we do?

Memory- and CPU-wise…I went and added: –

        ioThreadsPolicy: auto # Automatically allocate IO threads for QEMU to reduce disk I/O contention
        cpu:
          cores: 4
          dedicatedCpuPlacement: true # Guarantees pinned physical CPUs for this VM to improve latency & stability
          isolateEmulatorThread: true # Pins QEMU’s emulator thread to a dedicated pCPU instead of sharing with vCPUs
          model: host-passthrough # Exposes all host CPU features directly to the VM
          numa:
            guestMappingPassthrough: {} # Mirrors host NUMA topology to the guest to reduce cross-node latency
        memory:
          hugepages:
            pageSize: 1Gi # Uses 1Gi hugepages for reduced TLB pressure

N.B. – this required configuring the host to reserve 1Gi hugepages at boot (and dedicatedCpuPlacement needs the kubelet CPU Manager set to the static policy).
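
A sketch of the hugepages side, assuming a GRUB-based host reserving 8 x 1GiB pages: –

# append to GRUB_CMDLINE_LINUX in /etc/default/grub, rebuild the config, then reboot:
#   default_hugepagesz=1G hugepagesz=1G hugepages=8
grub2-mkconfig -o /boot/grub2/grub.cfg

# after reboot, the node should advertise the pages as an allocatable resource
kubectl describe node <node-name> | grep hugepages-1Gi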

And then for disks…I installed the latest VirtIO drivers in the VM, switched the disks for the SQL system, data, and log files to use a virtio bus instead of SCSI, and then added the following for each disk: –

dedicatedIOThread: true

Other device settings added were: –

autoattachGraphicsDevice: false # Do not attach a virtual graphics/display device (VNC/SPICE) - removes unnecessary emulation
autoattachMemBalloon: false # Disable the VirtIO memory balloon - prevents dynamic memory changes, improves consistency
autoattachSerialConsole: true # Attach a serial console for debugging and virtctl console access
networkInterfaceMultiqueue: true # Enable multi-queue virtio-net so NIC traffic can use multiple RX/TX queues

All of this results in a bit of a monster manifest file for the VM: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        ioThreadsPolicy: auto # Automatically allocate IO threads for QEMU to reduce disk I/O contention
        cpu:
          cores: 4
          dedicatedCpuPlacement: true # Guarantees pinned physical CPUs for this VM - improves latency & stability
          isolateEmulatorThread: true # Pins QEMU’s emulator thread to a dedicated pCPU instead of sharing with vCPUs
          model: host-passthrough # Exposes host CPU features directly to the VM - best performance (but less portable)
          numa:
            guestMappingPassthrough: {} # Mirrors host NUMA topology to the guest - reduces cross-node memory latency
        memory:
          hugepages:
            pageSize: 1Gi # Uses 1Gi hugepages for reduced TLB pressure - better performance for large-memory SQL
        firmware:
          bootloader:
            efi:
              secureBoot: false # Disable Secure Boot (often required when using custom/older virtio drivers)
        features:
          acpi: {} # ACPI support (power management, shutdown, reboot, device enumeration)
          apic: {} # Advanced Programmable Interrupt Controller (modern interrupt handling for Windows/SQL)
          hyperv: # Enable Hyper-V enlightenment features for Windows guests to improve performance & timing
            reenlightenment: {} # Allows guest to update its TSC frequency after migrations or time adjustments
            ipi: {} # Hyper-V IPI acceleration - faster inter-processor interrupts between vCPUs
            synic: {} # Hyper-V Synthetic Interrupt Controller - improves interrupt delivery
            synictimer: {} # Hyper-V synthetic timer - stable high-resolution guest time source
            spinlocks:
              spinlocks: 8191 # Prevents Windows spinlock stalls on SMP systems - avoids boot/timeouts under load
            reset: {} # Hyper-V reset infrastructure - cleaner VM resets
            relaxed: {} # Relaxed timing - reduces overhead when timing deviations occur under virtualization
            vpindex: {} # Per-vCPU indexing - improves Windows scheduler awareness of vCPU layout
            runtime: {} # Hyper-V runtime page support - gives guest better insight into hypervisor behavior
            tlbflush: {} # Hyper-V accelerated TLB flush - improves scalability on multi-vCPU workloads
            frequencies: {} # Exposes host CPU frequency data - allows proper scaling & guest timing
            vapic: {} # Virtual APIC support - reduces interrupt latency and overhead
        clock:
          timer:
            hyperv: {} # Hyper-V clock/timer - stable time source, recommended when using Hyper-V enlightenments
        resources: # requests == limits for guaranteed QoS (exclusive CPU & memory reservation)
          requests:
            memory: "8Gi"
            cpu: "4"
            hugepages-1Gi: "8Gi"
          limits:
            memory: "8Gi"
            cpu: "4"
            hugepages-1Gi: "8Gi"
        devices:
          autoattachGraphicsDevice: false # Do not attach a virtual graphics/display device (VNC/SPICE) - removes unnecessary emulation
          autoattachMemBalloon: false # Disable the VirtIO memory balloon - prevents dynamic memory changes, improves consistency
          autoattachSerialConsole: true # Attach a serial console for debugging and virtctl console access
          networkInterfaceMultiqueue: true # Enable multi-queue virtio-net so NIC traffic can use multiple RX/TX queues
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi   # Keep OS disk on SCSI - simpler boot path once VirtIO storage is already in place
              cache: none
            # Disk 2: SQL System
            - name: sqlsystem
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Give this disk its own IO thread - reduces contention with other disks
            # Disk 3: SQL Data
            - name: sqldata
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Separate IO thread for data file I/O - improves parallelism under load
            # Disk 4: SQL Log
            - name: sqllog
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Separate IO thread for log writes - helps with low-latency sequential I/O
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2022-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio # High-performance paravirtualized NIC (requires NetKVM driver in the guest)
              bridge: {} # Bridge mode - VM gets an IP on the pod network (via the pod’s primary interface)
              ports:
                - port: 3389 # RDP
                - port: 1433 # SQL Server
      networks:
        - name: default
          pod: {} # Attach VM to the default Kubernetes pod network
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqlsystem
          persistentVolumeClaim:
            claimName: sqlsystem
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2022-iso
          persistentVolumeClaim:
            claimName: sql2022-pvc
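
Most of these are domain-level changes, so the VM needs a full stop/start (a reboot from inside the guest isn’t enough) to pick them up: –

virtctl stop win2025
virtctl start win2025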

And then I ran the tests again: –

# StatefulSet
TEST RESULT : System achieved 47200 NOPM from 109554 SQL Server TPM

# KubeVirt
TEST RESULT : System achieved 46563 NOPM from 108184 SQL Server TPM

BOOOOOOOOOM! OK, so that’s 98% of the TPM achieved in the StatefulSet. And given there’s a bit of variance in these results, they’re now pretty much the same!

OK, so it’s not the most robust performance testing ever done…and I’m fully aware that testing in a lab like this is one thing, whereas running SQL Server in KubeVirt, even in a dev/test environment, is a completely different situation. There are still questions over stability and resiliency, BUT I hope this shows that we shouldn’t be counting KubeVirt out as a platform for SQL Server based on performance.

Thanks for reading!
