I have two different Hyper-V 2012 R2 environments that use iSCSI to connect to their virtual machine storage. While the environments differ (one is all 10 Gb networking whereas the other is mixed 1 Gb / 10 Gb; one uses an SSD array in RAID 6 whereas the other uses RAID 10 spread across two arrays), the odd behavior I am seeing is the same.

Bottom line: when I run a disk I/O test directly on the host against the CSV, I get a particular average IOPS value. However, when I run the same test within the virtual machine against its "local" disk (the VHDX file that is stored on the CSV), I get a greatly reduced IOPS value.

To put things in perspective, here is the environment I am testing:

  • Host
    • Windows 2012 R2 Datacenter
    • 512 GB RAM
    • 48 logical processors
    • 10 Gb fiber for iSCSI traffic
    • One (1) virtual machine running
  • Storage
    • EqualLogic PS6210S
    • 24 x 800 GB SSD in RAID 6
    • One (1) 1 TB volume containing one (1) VM
    • 10 Gb fiber
    • Host and array are connected to dedicated network switches
  • Virtual Machine
    • Windows 2012 R2 Datacenter
    • 127 GB dynamic disk
    • Dynamic RAM
  • I/O Test
    • FIO 2.2.10 -- test software
    • 70/30 R/W mix against 500 MB test files (see below for actual test config file)

When I run the test against the CSV from the host (C:\ClusterStorage\VM-Infrastructure), I get read/write IOPS of about 22k/9k, respectively. However, when I run that same test within the VM against its C:\Temp folder (with the VM's VHDX file stored on the array in C:\ClusterStorage\VM-Infrastructure), I get numbers of 13k/6k.

Is this a known problem? Are there any particular host/VM settings I should be looking at to get the VM performance closer to what I get on the host? A drop in read performance from 22k to 13k IOPS is pretty dramatic. I figured there would be a slight hit within the VM, but not this much -- as high as 40% in some cases.

[iops-test]
rw=randrw ; Random mixed read/write workload
rwmixread=70 ; 70% reads / 30% writes
size=500m ; 500 MB test file per job
direct=1 ; 1 for direct IO, 0 for buffered IO
iodepth=32 ; For async IO, allow 'x' IOs in flight
invalidate=1 ; Invalidate page cache for file prior to doing IO
numjobs=16 ; Create 'x' similar entries for this job
group_reporting ; Aggregate statistics across all jobs in the group
thread ; Use pthreads instead of forked jobs
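
Assuming the job file above is saved as, say, iops-test.fio (the filename is my own placeholder, not from the thread), the test can be launched identically on the host and inside the VM:

```shell
# Run the fio job file; group_reporting aggregates all jobs into one summary,
# so the "read:" and "write:" IOPS lines are directly comparable between runs.
fio iops-test.fio
```

Running the exact same job file in both locations is what makes the host-vs-VM comparison meaningful.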

    • 1. Just to be sure, nothing else is running in this environment, right? 2. Try creating a VHDX on the host, mounting it, and testing it the same way; that will help narrow down where the problem might be. 3. Are you doing random R/W? If so, try testing sequential R/W instead. I'm not sure, but I have a hunch that randomness might be a problem for VHDX.
    • 1) I am positive nothing else is running. 2) You are suggesting that I create a vhdx on the local disks on the host as opposed to putting it on the cluster? I can test that. 3) As you can see in the config file with the rw=randrw, the testing is random and not sequential.
    • @EliadTech -- (1) The test from the host with the VHDX mounted gave numbers in line with the test from the host directly to the volume. That seems to point to a Hyper-V/VM issue. (2) Oddly enough, changing the test to perform sequential read/write gave very close (but not very good) numbers: testing from both the host and the VM gave about 7k/3k read/write IOPS.
    • I decided to perform a packet capture with my original tests and found some interesting differences. When performing the random r/w test from the VM, a good amount of the traffic captured was listed as using iSCSI protocol which makes sense since my hosts use iSCSI to connect to the storage. However, when performing the test from the host directly to the iSCSI connected volume, I see very little iSCSI protocol traffic. Instead I see a lot of TCP traffic from port 3260. (FYI -- That port is also iSCSI but displays differently it seems.) Not sure if this is relevant.

After further research and some discussions with storage experts, the culprit has been found.

Even though the host was running a single virtual machine, and that VM was the only client reading from and writing to the storage array, the built-in Hyper-V storage and networking load balancer was kicking in and throttling back the VM. When the load balancer was disabled, the virtual machine posted IOPS numbers very close to what we saw directly from the host.

For storage operations, the latency threshold is 83 ms; for networking it is 2 ms. As best we can tell, the default thresholds are overly aggressive, or simply not suited to iSCSI storage connections (iSCSI will of course add latency that you would not see with directly attached or local storage). The registry value that controls this for storage is HKLM\System\CurrentControlSet\Services\StorVsp\IOBalance\Enabled; setting it to 0 disables the balancer.
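
For reference, here is a minimal sketch of what that change looks like from an elevated PowerShell prompt on the host. The key path comes from the answer above; creating the key if it is absent, and the note about rebooting, are my own assumptions rather than something the thread spells out:

```powershell
# Disable the Hyper-V storage I/O balancer (0 = disabled).
$key = 'HKLM:\System\CurrentControlSet\Services\StorVsp\IOBalance'

# Create the IOBalance key if it does not already exist (assumption:
# on some hosts the key may not be present by default).
if (-not (Test-Path $key)) { New-Item -Path $key -Force | Out-Null }

Set-ItemProperty -Path $key -Name 'Enabled' -Value 0 -Type DWord
```

A host reboot (or at least a restart of the affected VMs) is reportedly needed before the change takes effect; treat that as an assumption and verify against the linked article.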

More information can be found at http://www.aidanfinn.com/?p=13232

We have not decided whether we will keep the balancer turned off. Obviously it exists, and kicks in, for a reason. While it is probably unnecessary with only a handful of virtual machines, it will be more beneficial once I start loading up the host. My main goal was understanding why my numbers were so disparate.
