The biggest Linked Clone “IO” Split Study – Part 1/2

An article by Andre Leibovici from myvirtualcloud.net

In my article Get hold of VDI IOPs, Read/Write Ratios and Storage Tiering I discussed the importance of understanding the virtual desktop IO pattern in VDI deployments. In the same article I briefly discussed the IO split between Replicas and Linked Clones.

The main idea is that at moment ‘zero’ after the creation of a linked clone, the Replica disk is responsible for 100% of the Read IO, and the Linked Clone disk (delta) is responsible for 100% of the Write IO.

When Windows boots for the first time, is customized, and users start to use the desktop, the Linked Clone disk will see increasing Read IO, not only Writes.

The concept is easy to understand, since we know that data committed to Linked Clone disks will eventually be read. Please note that Replica disks will always be 100% Read IO.

When I read performance benchmarks stating that a virtual desktop requires 30 IOPS, I ask:

  • What percentage of those 30 IOs are Read and what percentage are Write?
  • Of the Read IOs, how many are hitting the Replica disk and how many the Linked Clone disk?
  • If Persistent Disks (the old UDD, used for the User Profile) are in use, how many IOs are hitting them?

These are difficult questions to answer without running a Pilot. In fact, most of the Pilots I have seen do not even get to this level of detail. However, without these answers it is impossible to properly size and architect storage arrays for performance without adding extra fat to the solution. (FAT = $$$)

Technologies such as DRAM Cache and EMC FAST Cache help enormously to diminish IO contention. However, those technologies should only be taken into consideration after you have determined the real number of IOs per storage tier.

Storage tiers are used by VMware View to separate disk types in a VDI environment, and they often provide different RAID types for various performance or redundancy objectives. VMware View allows storage tiering for Replica, Linked Clone and Persistent disks.

The picture below illustrates a virtual desktop scenario where the overall Read/Write ratio of 60/40 breaks down into different ratios across each disk type.

 

[image: Read/Write ratio per disk type for an overall 60/40 split]
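To make the split concrete, here is a minimal sketch (in Python) of how per-tier IOPS roll up into an overall Read/Write ratio. The per-tier figures are hypothetical placeholders chosen to land near the 60/40 example, not measurements from this study.

```python
# Sketch: how per-tier read/write IOPS roll up into an overall R/W ratio.
# The tier figures below are hypothetical placeholders, not measured values.

def overall_rw_ratio(tiers):
    """tiers: dict of tier name -> (read_iops, write_iops)."""
    reads = sum(r for r, w in tiers.values())
    writes = sum(w for r, w in tiers.values())
    total = reads + writes
    return round(100 * reads / total), round(100 * writes / total)

tiers = {
    "replica":      (12, 0),   # replica disks are always 100% Read
    "linked_clone": (4, 10),   # delta disk: mostly Write, some Read
    "persistent":   (2, 2),    # user profile disk
}

print(overall_rw_ratio(tiers))  # (60, 40) for these illustrative inputs
```

The point of the exercise: the same overall 60/40 ratio can hide very different per-tier ratios, which is why each tier must be sized on its own numbers.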

 

These are the fundamentals behind a project I decided to execute myself. I wanted to find out how much the IO pattern changes over time as virtual desktops are utilized and Linked Clone disks grow in size.

 

Architecture

 

First and foremost I need to explain the architecture I am using to simulate and run the tests. The most important architectural choice is the Guest OS. The number of Read/Write IOs is different between Windows XP and Windows 7, and different between the 32-bit and 64-bit versions.

I have chosen Windows 7 64bit as I believe that’s what new deployments should be using.

The amount of vRAM and vCPU may also alter the IO pattern. I have chosen to use 2GB RAM and 1vCPU as I believe this is the most common configuration out there.

  • Windows 7 64bit
  • 2 GB vRAM
  • 1 vCPU
  • 40GB Disk (Approximately 10GB used in the NTFS partition)

The applications installed on the Parent VM also help to shape the IO profile of the virtual desktop, regardless of whether they are in use or not. For these tests I have chosen to install Microsoft Office 2007 64-bit only.

To make sure the analysis and measurements are valid and accurate, I created dedicated datastores, each hosting a single Linked Clone virtual desktop. This configuration is important to avoid IO intrusion from other virtual machines.

Another important piece was to remove the Windows user profile IO from the Linked Clone disk. This was accomplished by creating a Persistent Disk (UDD) and placing it on a dedicated datastore.

Ideally I would have also created a Disposable Disk for Windows temporary files; however, VMware View does not allow placement of Disposable Disks in a datastore other than the Linked Clone datastore. For the desired end results there was no benefit in creating a Disposable Disk.

The image below demonstrates how the test Linked Clone virtual machine was configured and also identifies the three datastores used. Each datastore uses a different LUN.

image

 

In VMware View the Datastore selection screen was set to:

image

 

Analysis Tools

 

Three tools were used to collect and analyze the IO: vCenter Performance Monitor, Veeam Monitor and Unisphere Analyzer.

 

Replica Creation Statistics

 

Replica disk creation is a one-off operation and should not cause an IO burden unless it is performed during production hours in an IO-constrained environment.

For the Parent VM with the configuration described above, an average total of 10,654 Write IOs was required to generate the Replica. The average peak was 412 IOPS.

image
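As a rough sanity check, those measured figures imply a lower bound on replica creation time; this back-of-the-envelope division is a sketch, not a measurement:

```python
# Back-of-the-envelope lower bound on replica creation time, using the
# measured figures above: 10,654 total Write IOs at a 412 IOPS average peak.
TOTAL_WRITE_IOS = 10_654
PEAK_IOPS = 412

seconds = TOTAL_WRITE_IOS / PEAK_IOPS
print(f"~{seconds:.0f} s if the peak rate were sustained throughout")  # ~26 s
```

Actual creation takes longer, since the 412 IOPS peak is not sustained for the whole operation.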

 

Creating Replica disks is a write-intensive operation. However, it is also read-intensive on the original Parent VM. I did not collect the number of Read IOs at the Parent VM for this test, but I recommend keeping Parent VMs in a storage pool that can provide the same number of IOs used to create the replica disk.

 

Virtual Desktop Customization and Boot Statistics

 

The initial objective for this study was not to identify IO during boot time; however, as I was already doing the monitoring, why not?! I have seen several boot storm studies, but I never really found any of them conclusive because none of them went to the level of detail I'm covering here.

Please keep in mind that this is the IO for a virtual desktop with the configuration specified above; the IO for desktops with different configurations will always be different. The aim of this article (more like a whitepaper) is to demonstrate how IOs should be used to help size VDI solutions. However, the numbers below can be used as a baseline.

 

PowerOn, Customization and 1st boot generated an average total of:

  • 2192 Replica Read IO
  • 549 Linked Clone Write IO
  • 203 Linked Clone Read IO

Average Peak IOs happened at:

  • 640 Replica Read IO
  • 93 Linked Clone Write IO
  • 51 Linked Clone Read IO
An averaged total of 2,944 IOs to have a linked clone virtual desktop ready for use by users. In a View Composer environment these values are applicable to all virtual desktops. Attention! This is not the so-called bootstorm and happens only once in the virtual machine's lifetime, as long as the virtual desktop is not deleted after use.
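The totals above can be cross-checked, and they also show how the first-boot IO splits between Reads and Writes; a quick sketch:

```python
# Cross-check the averaged first-boot totals and derive the Read/Write split.
replica_read, lc_write, lc_read = 2192, 549, 203  # measured averages above

total = replica_read + lc_write + lc_read
reads = replica_read + lc_read

print(total)                              # 2944, matching the averaged total
print(round(100 * reads / total))         # ~81% of first-boot IO is Read
print(round(100 * replica_read / reads))  # ~92% of those Reads hit the replica
```

In other words, the first boot is heavily Read-dominated, and the Replica tier absorbs the overwhelming majority of those Reads.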

 

image

 

After PowerOn, Customization and 1st boot, the virtual machine was shut down. A 2nd PowerOn is used here to identify the real number of IOs required to boot the virtual desktop without any customization process. Attention! This is the so-called bootstorm and will happen every time the virtual desktop is powered on again.

The 2nd boot generated an average total of:

  • 839 Replica Read IO
  • 108 Linked Clone Write IO
  • 83 Linked Clone Read IO

The average Peak IOs happened at:

  • 472 Replica Read IO
  • 53 Linked Clone Write IO
  • 51 Linked Clone Read IO
An averaged total of 1,030 IOs was necessary to get the virtual desktop ready for use after a reboot.

image

     

You should now know exactly what your storage array will be required to provide, from an IO perspective, for each storage tier.

For 100 virtual desktops booting at the same time the array would need to be able to respond to approximately:

  • 47,200 Replica Read IO
  • 5,300 Linked Clone Write IO
  • 5,100 Linked Clone Read IO

* The numbers above assume no storage latency.

In practice it is not possible to boot all virtual desktops at the same time. VMware View will only boot 5 virtual desktops simultaneously using the default settings. This value can be changed.

If you accept the default five simultaneous PowerOn operations, you need 2,360 Read IOs from the Replica tier, 415 Read IOs from the Linked Clone tier and 265 Write IOs from the Linked Clone tier.
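The arithmetic behind these power-on numbers can be sketched as a small function that multiplies the measured per-desktop 2nd-boot peaks by the number of desktops powering on at once (the function name and structure are mine, not anything from View):

```python
# Per-desktop peak IOs measured for a 2nd boot in this study.
BOOT_PEAKS = {
    "replica_read": 472,
    "linked_clone_write": 53,
    "linked_clone_read": 51,
}

def poweron_load(simultaneous_desktops):
    """Per-tier IO load for a given number of simultaneous power-ons."""
    return {tier: peak * simultaneous_desktops
            for tier, peak in BOOT_PEAKS.items()}

# View's default of 5 simultaneous PowerOn operations:
print(poweron_load(5))
# {'replica_read': 2360, 'linked_clone_write': 265, 'linked_clone_read': 255}
```

Note this simple peak-based sketch gives a slightly lower Linked Clone Read figure than quoted above, which uses the averaged total rather than the peak for that tier.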

As the virtual desktops boot they become ready for use and users start to log in. For this reason, the calculation of the total number of IOs required is a hybrid of the number of desktops that may boot simultaneously, plus the number of desktops in the logon process, plus the desktops in a steady idle state.

If desktop pools are configured to refresh or delete desktops after use, that should also be part of the calculation.
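A hypothetical sketch of that hybrid calculation: combine per-desktop IO figures for each state. The boot figure uses the measured 2nd-boot peaks (472 + 53 + 51); the logon and idle figures, and the state counts, are illustrative assumptions only.

```python
# Hybrid sizing sketch: desktops booting, logging on, and idling each
# contribute a different per-desktop IO load. Only boot_iops comes from the
# measurements in this study; logon_iops and idle_iops are assumptions.

def hybrid_load(booting, logging_on, idle,
                boot_iops=576,    # 472 + 53 + 51 measured 2nd-boot peaks
                logon_iops=40,    # assumed per-desktop logon load
                idle_iops=5):     # assumed per-desktop steady-state load
    return booting * boot_iops + logging_on * logon_iops + idle * idle_iops

# e.g. a 100-desktop pool: 5 booting (View default), 10 logging on, 85 idle
print(hybrid_load(booting=5, logging_on=10, idle=85))  # 3705 total IOs
```

Replace the assumed logon and idle figures with numbers from your own Pilot before using anything like this for real sizing.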

       

Persistent Disk and User Profile Statistics

       

To make the IO assessment accurate, the Windows user profile was redirected to another .vmdk through the use of a Linked Clone Persistent Disk. The Persistent Disk was also stored in a dedicated datastore for better IO tracking.

Because user profiles will often have different sizes and IO requirements, I decided to collect data only for the Virtual Desktop Customization and Boot Statistics. User login is not under consideration.

Based on the virtual desktop configuration described above, the numbers below are consistent across different desktops.

       

PowerOn, Customization and 1st boot generated an average total of:

  • 64 Persistent Disk Write IO
  • 51 Persistent Disk Read IO

Average Peak IOs happened at:

  • 43 Persistent Disk Write IO
  • 20 Persistent Disk Read IO
In a production environment you may choose not to use Persistent Disks. In that case the IOs demonstrated here are not applicable; however, upon user logon, IOs will be generated against the linked clone disk instead of the persistent disk.

If Roaming Profiles are in use, the IOs are still applicable and will be generated against the linked clone disk.

       

Summarizing the Numbers

The best way to understand how many IOs are required for the PowerOn, Customization and 1st boot process is to find the averaged maximum IO per datastore, because each storage tier will require different performance.

For this specific virtual desktop configuration the baseline for the PowerOn, Customization and 1st boot process is:

image

The same numbers from the picture above can be presented as percentages per storage tier. I think this spreadsheet gives you great visibility into what is happening with the virtual machine during the creation process.

The bottom part of the spreadsheet demonstrates the five default simultaneous PowerOn operations allowed by VMware View.

         

image

         

Now you understand the importance of sizing storage tiers independently.
Don't blindly accept the same old 15 IOPS per VM anymore.

         

This is Part 1 of 2 of this article. Part 2 discusses IO trending as a Linked Clone delta disk grows. I personally have not seen any reports or white papers on this subject; however, I may be wrong. If you know of such documentation, please forward it to me or let me know.

Your comments are very much appreciated.



         
