
PVS Internals #2 – How to properly size your memory

I’m really surprised myself, but it has already been 8 months since I wrote the first part of PVS Internals. I’ve been extremely busy recently, so please accept my apologies that it took so long to finish the second part of this blog post.

In the first part, I prepared the theoretical ground for the discussion about proper memory sizing. You should already understand the concept of the Windows cache manager.

Introduction

There are a lot of misunderstandings and misconceptions about Provisioning Services – and one of them is that PVS requires a huge amount of memory to work properly. Yes, PVS uses memory (as a cheap caching mechanism), but that doesn’t mean you need to build a supercomputer to use Provisioning Services. The reasons why PVS requires spare memory were already discussed in my previous article – it’s a service that is actually designed to leverage your unused memory.

When you ask any consultant about proper sizing of PVS memory, you will probably either get the answer “It depends” (don’t you love it?) or calculations based on safe estimates. If you don’t have the time or resources to do proper sizing, I highly recommend our consulting white paper describing advanced memory considerations for Provisioning Services (CTX125126). However (similar to the situation with pagefiles), it’s highly recommended to perform a proper sizing exercise – each project is different, and sizing requirements can be completely different when you publish desktops with tons of applications compared to simply publishing locked-down applications. And you want to identify whether more memory is required before the production rollout.

There are two utilities that I like to use whenever I work with PVS – one gives you a very nice overview of the environment, while the second can be considered a caching Swiss-army knife and can give you very specific details for your sizing.

Tools for memory sizing

There are two tools that I really love and use whenever I touch a PVS server, and they work perfectly together:

  * Resource Monitor – built into Windows, it provides a real-time overview of memory usage, including the size of the standby cache.
  * RamMap – a Sysinternals utility that breaks physical memory down in detail, including which files are occupying the cache.

When you combine them, you can see a global overview of your memory usage, but also the details required to make proper decisions. Be aware that while Resource Monitor is updated in real time, RamMap requires a manual refresh.

Recommended approach

For monitoring memory usage, we will use both utilities. While Resource Monitor will provide us with information about general memory consumption, we will use RamMap to determine how much memory we actually need for each vDisk.

The whole process can be defined as follows:

  1. Empty the standby cache to get clean results
  2. Boot a target device to the logon screen – 1st check
  3. Wait for Windows to fully load – 2nd check
  4. Log on the first user – 3rd check
  5. Run regular applications, ideally performed by regular users – 4th check
  6. Shut down the target device – 5th check
  7. Review memory requirements
  8. Boot as many devices as you want and let them run for a few days or weeks
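
If you want to keep track of the numbers from each check, a minimal sketch like the one below makes it easy to see how much memory each phase adds. The checkpoint names and values are just placeholders based on the example later in this article – replace them with whatever RamMap reports for your own vDisk:

    # Hypothetical checkpoint readings: MB attributed to the vDisk (.VHD) file
    # in RamMap's File Summary tab. Replace with your own measurements.
    checkpoints = [
        ("empty standby cache", 0),
        ("logon screen",        450),
        ("fully loaded",        735),
        ("first user logon",    790),
        ("regular user tests",  1000),
    ]

    # Print the running total and the delta contributed by each phase
    previous = 0
    for name, mb in checkpoints:
        print(f"{name:<22} {mb:>5} MB  (+{mb - previous} MB)")
        previous = mb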

Example sizing

Below you can find a step-by-step process to better understand memory usage of your PVS servers.

The environment consists of a single Windows Server 2008 R2 image that delivers only the Internet Explorer application. The size of the image is 40 GB and the vDisks are stored locally.

First, we shut down all devices that are using PVS to get a clear picture of our environment. Nothing should be reading from the PVS server, and make sure you’re not copying any data to or from the PVS server.

We use RamMap to clear the standby cache (option “Empty Standby List”):

At this moment, our PVS server is running with an empty standby cache. We can easily confirm this by looking at Resource Monitor:
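
If you prefer a script to the Resource Monitor GUI, the standby cache size can also be read from the Windows performance counters. Below is a minimal sketch – it assumes the standard English counter names available on Windows Server 2008 R2 and later:

    import subprocess

    # The standby cache is split across three priority buckets in the
    # "Memory" performance object (counter names are locale dependent).
    COUNTERS = [
        r"\Memory\Standby Cache Core Bytes",
        r"\Memory\Standby Cache Normal Priority Bytes",
        r"\Memory\Standby Cache Reserve Bytes",
    ]

    def standby_cache_mb():
        """Sample each counter once via typeperf and return the total in MB."""
        total = 0.0
        for counter in COUNTERS:
            output = subprocess.check_output(["typeperf", counter, "-sc", "1"], text=True)
            # typeperf prints a quoted CSV header line followed by a quoted data line
            data_line = [line for line in output.splitlines() if line.startswith('"')][-1]
            total += float(data_line.split(",")[-1].strip('"'))
        return total / (1024 * 1024)

    print(f"Standby cache: {standby_cache_mb():.0f} MB")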

Now it is time to start our VM(s). I usually boot up multiple VMs at the same time – since they’re booting at the same time from a standard image, there shouldn’t be much difference between booting one or multiple VMs.

As soon as we start booting the VMs, we can see that the standby cache is increasing:

If we switch to RamMap and select the “File Summary” tab, we can clearly see what is responsible for filling the cache:

Have you noticed something? The .VHD file is not only stored in the standby cache, but also in the active page pool. This is caused by the StreamProcess.exe process. It is important because some pages are active, so monitoring only the size of the standby cache wouldn’t be an accurate representation (almost 25% of the total is not stored in the standby cache).

As soon as we hit the logon screen (Press Ctrl + Alt + Delete to log on), we can see that the image occupies around 450 MB of memory:

The standby cache, on the other hand, is already at 561 MB. This is caused by the fact that Windows is not only caching our .VHD file, but also any other buffered read operation:

Now we could say that PVS needs to read ~450 MB to fully boot Windows Server 2008 R2; however, this statement would not be correct. Remember the “Lie-To-Children” section from my previous article? Windows is a lot more complicated than we like to admit, and there is a lot more under the hood than meets the eye. So while Windows is preparing for logon (and you can actually log on already), there are still a lot of operations going on in the background.

Therefore, it’s important to define the point at which we consider Windows fully booted. In my case, I always deliver farm configuration through group policy, so I will wait until XenApp joins the farm – and to be sure it’s really idle, I will wait until the reported load is 0 (Qfarm /Load). Just for your information, the difference between “farm join” and “zero load” is around 250 MB:

Now I can say that the fully loaded XenApp image requires around 735 MB. If you compare that to the simple logon screen check, you can see a difference of almost 300 MB.
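
The “wait until the reported load is 0” part of this check can be scripted on the XenApp target device itself. A rough sketch – it assumes the default qfarm /load output layout, with the server name in the first column and the load value in the second:

    import socket
    import subprocess
    import time

    def wait_for_zero_load(poll_seconds=30):
        """Poll 'qfarm /load' until this XenApp server reports a load of 0."""
        host = socket.gethostname().upper()
        while True:
            output = subprocess.check_output(["qfarm", "/load"], text=True)
            for line in output.splitlines():
                columns = line.split()
                # Look for our own row and check the reported load
                if len(columns) >= 2 and columns[0].upper() == host and columns[1] == "0":
                    return
            time.sleep(poll_seconds)

    wait_for_zero_load()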

The next important step is the user logon. On average, the first user logon will require an additional 50-60 MB (less with local accounts), even when your default profile is only 1 MB. The likely reason (I just take it as a fact and never spent too much time investigating) is that there are other OS components involved in the first logon – for example, calls to specific APIs (from libraries that were not downloaded yet):

After we log off this first user, we can see that the situation is very similar for the first logoff – again, it’s not a simple removal of the profile folder; additional APIs are involved:

Just to show you the difference between the standby cache and the actual file memory usage, here is the same screenshot from Resource Monitor. Notice that the standby cache is 1292 MB (as we have seen before, only 740 MB is actually used by our vDisk):

The difference between those two numbers can actually tell you how much memory you should allocate to the PVS server at a minimum (operational memory + default system cache + vDisk system cache). Our default recommendation is to reserve 512 MB of system cache for the operating system itself, and you can see that this number pretty much reflects real-life experience.
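
To put those numbers together, here is the same calculation spelled out, using the figures from this example:

    # Figures from the example above (in MB)
    standby_cache_total = 1292   # standby cache reported by Resource Monitor
    vdisk_pages         = 740    # pages attributed to the .VHD file in RamMap

    # What remains is cache used by the OS itself and other buffered reads;
    # it lines up with the default recommendation of ~512 MB of system cache.
    other_system_cache = standby_cache_total - vdisk_pages
    print(f"Non-vDisk system cache: ~{other_system_cache} MB")   # ~552 MB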

This is the penalty of the first logon/logoff – the rest of the users are not affected by it. It is similar to the first-user impact of application streaming, where the first launch builds the cache and streams the application. Can you pre-cache the first user? Hardly – the only potential solution would be an auto-logon with a strictly limited user, or initiating a process that also loads the user profile; however, this would require careful consideration for security reasons. The majority of data reads during profile creation actually come from the System32 folder and not from the C:\Users folder itself. The potential gain is probably very small for XenApp servers, but it could be more interesting for XenDesktop workloads (and could potentially lead to a PVS Internals 3 article).

Our server can be considered fully prepared for user load at this point – now it’s time to ask your pilot users to perform some tests. At the end, you might be very surprised – in my case, the average memory requirement after running a few servers for 4 days was only around 1 GB:

Does this mean that I need only 3 GB of memory for my PVS server (2 GB for the OS and 1 GB for the vDisk)? Definitely not – that would be a very bad decision. Let’s have a look at our sizing formula:

2GB + (#XA_vDisk * 4GB) + (#XD_vDisk * 2GB) + 15% (Buffer)

We’ve just proved that our vDisk doesn’t require more than the recommended memory, so in this case the recommendation would be to stick with 4 GB of memory per vDisk. If our testing had revealed higher memory requirements, we would need to increase this number; however, with very few exceptions, this formula should cover the majority of cases. Don’t forget that you also need to provide some system cache for non-PVS related pages (this is included in the 15% buffer).
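
To make the formula concrete, here is a small sketch that applies it to this example (one XenApp vDisk, no XenDesktop vDisks). The parameter names are mine, and I’ve assumed the 15% buffer applies to the whole sum – adjust if you read the formula differently:

    def pvs_server_memory_gb(xa_vdisks, xd_vdisks, os_gb=2, xa_gb=4, xd_gb=2, buffer=0.15):
        """Rule-of-thumb PVS server RAM: OS + per-vDisk allowance + 15% buffer."""
        base = os_gb + xa_vdisks * xa_gb + xd_vdisks * xd_gb
        return base * (1 + buffer)

    # One XenApp vDisk and no XenDesktop vDisks, as in this article
    print(f"{pvs_server_memory_gb(xa_vdisks=1, xd_vdisks=0):.1f} GB")   # 6.9 GB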

Be aware that this doesn’t mean you need to reserve exactly 1 GB of memory for this vDisk. The goal of PVS is to optimize ~80% of the disk read operations; if PVS needs to read from disk every now and then, it doesn’t mean your sizing is wrong.

Summary

There are a few general recommendations that you should follow when doing memory sizing for PVS; the starting point is the sizing formula, validated against your own measurements:

             2GB + (#XA_vDisk * 4GB) + (#XD_vDisk * 2GB) + 15% (Buffer)

I would also like to thank my colleagues Dan Allen, Nicholas Rintalan and Pablo Legorreta (in alphabetical order, so they don’t fight) for their tremendous help!

Martin Zugec
