As I planned out this blog series, my original intention was to cover the Microsoft Cloud Adoption Framework (CAF) for Azure in three parts: the first dedicated to introducing the design principles for enterprise-scale landing zones, and the second and third covering the critical design areas. If you’ve read the previous entries in the series (and I encourage you to), you were probably wondering, what about the critical design areas for Azure networking and monitoring?
As I started writing and the content piled up, I thought back on a few projects and the critical design areas where customers have commonly stumbled. So, I decided to mix up the order and add a fourth part for a more in-depth discussion of the remaining critical design areas: network topology and connectivity, and management and monitoring.
In this post, I will highlight the relevant Azure capabilities and considerations when applying these capabilities to Citrix design and operations. Like before, each of these areas could have enough information for their own blog posts, so I will focus primarily on key insights or lessons learned.
Network Topology and Connectivity
CAF recommendation: “Creating a virtual network to experiment with is easy enough, but chances are, you will deploy multiple virtual networks over time to support the production needs of your organization. With some planning, you will be able to deploy virtual networks and connect the resources you need more effectively.”
One of the most critical prerequisites to any Citrix on Azure deployment is coordinating with your cloud platform, network, and security teams in planning a virtual network. The maturity of this plan will be a key indicator for landing zone readiness and overall project deployment timelines. I typically ask customers the following questions to understand their overall readiness:
Have requirements and standards been defined for a cloud virtual network?
Multiple foundational decision points drive an Azure virtual network design: on-premises connectivity, network security baselines, cross-subscription connectivity, ingress/egress design, and DNS, among others. Often, until this is defined, an Azure datacenter may not yet exist or may be considered “untrusted” to existing datacenter resources.
What’s the impact on your Citrix project? Citrix is often the first major workload for Azure adoption to accomplish initiatives such as improving business continuity, increasing business agility, or reducing cost. When this is the case, there is often no existing network standard for IaaS. Without one, an Azure datacenter might not yet exist, or connecting IaaS resources to critical backend apps or services that reside on-premises might not be possible.
What should I do? I have seen multiple projects experience multi-month delays or be completely shuttered due to incomplete or rushed Azure network planning. Set expectations on the effectiveness of a POC or pilot until initial planning conversations are finished. Add these planning conversations to your timelines. For Citrix and Microsoft partners, consider expanding existing project proposals to guide these discussions.
Is connectivity available between Azure and your on-premises datacenter?
Connectivity between Azure and on-premises networks is often a crucial dependency for a Citrix on Azure architecture. Organizations have two different ways to create this interconnection: an Azure Site-to-Site VPN or ExpressRoute.
What’s the impact on your Citrix project? Datacenter-to-Azure connectivity will provide access to key resources such as directory services, apps not yet migrated to Azure, or on-premises web or storage infrastructure. If this is critical for your Citrix use case, it is a deployment prerequisite, and you will need to complete it before you can move forward. Some basic connectivity may exist, for example, for services such as Office 365, but it may not yet be scaled to support full enterprise needs. (For example, a site-to-site VPN may be in place while the projected Citrix scale requires ExpressRoute.)
What should I do? Follow the guidance I provided above if there is an incomplete Azure network design that you need to finalize before connectivity can be established or scaled. If connectivity exists and additional capacity or resiliency is needed, you may be in a scenario where you can complete a POC but not support a larger-scale pilot or production rollout. Outline these dependencies to your stakeholders and set expectations on the scale of what can be accomplished. Factor in time to update an existing Azure networking standard and your Citrix-on-Azure design prior to pilot or production rollout.
What is the design for the perimeter network?
Perimeter networks enable secure connectivity between an Azure network and on-premises, along with any connectivity to and from Azure and the internet.
What’s the impact on your Citrix project? Perimeter network design will have a direct impact on the scale and performance of the Citrix deployment, whether it be user ICA sessions or supporting applications such as Office 365 within a Citrix VDI. Perimeter networks are typically shared infrastructure for all services running in Azure, which makes their supporting components potential performance bottlenecks that should be evaluated.
What should I do? Review the existing perimeter network design and ask about the scalability of supporting network virtual appliances (NVAs), the datacenter connectivity method, and routing for ingress/egress traffic. Actively monitor supporting components for performance bottlenecks, such as NVA CPU and bandwidth utilization or ExpressRoute bandwidth utilization. Estimate overall Citrix bandwidth to provide guidance to the platform team if sessions will transit from Azure through an ExpressRoute and your datacenter prior to reaching end users. If all Azure egress internet traffic is routed through your datacenter, determine if known cloud services, such as Citrix Cloud or Office 365, can be routed more directly to reduce latency or mitigate the impact of bandwidth constraints.
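To make the bandwidth estimate concrete, here is a minimal Python sketch of the arithmetic. The user groups, per-session kbps figures, headroom factor, and circuit size are all hypothetical planning inputs that I made up for illustration; replace them with numbers measured in your own environment.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SessionProfile:
    """A group of users with a similar per-session bandwidth need."""
    name: str
    concurrent_sessions: int
    kbps_per_session: float  # hypothetical planning figure, not a published value


def estimate_peak_mbps(profiles: Iterable[SessionProfile], headroom: float = 0.3) -> float:
    """Sum per-profile bandwidth and add a safety margin for bursts."""
    total_kbps = sum(p.concurrent_sessions * p.kbps_per_session for p in profiles)
    return total_kbps * (1 + headroom) / 1000  # kbps -> Mbps


def circuit_utilization(required_mbps: float, circuit_mbps: float) -> float:
    """Fraction of the circuit that the Citrix traffic would occupy."""
    return required_mbps / circuit_mbps


# Hypothetical user groups for illustration only.
profiles = [
    SessionProfile("task workers", concurrent_sessions=800, kbps_per_session=150),
    SessionProfile("knowledge workers", concurrent_sessions=400, kbps_per_session=400),
]
peak = estimate_peak_mbps(profiles)
print(f"Estimated peak: {peak:.0f} Mbps")
print(f"Share of a 1 Gbps circuit: {circuit_utilization(peak, 1000):.0%}")
```

A number like this is only a starting point for the conversation with your platform team, since the same circuit and NVAs usually carry traffic for other workloads too.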
CAF recommendation: “Delegate subnet creation to the landing zone owner. This will enable them to define how to segment workloads across subnets (for example, a single large subnet, multitier application, or network-injected application).”
Virtual networking has a direct impact on Citrix resource management because it is a component of the Azure hosting unit leveraged by Citrix Machine Creation Services. The hosting unit defines “where” machines are placed from a network and storage perspective. When using Azure, this hosting unit is composed of a subscription, Azure region, virtual network, and series of subnets. A machine catalog can be associated with only a single hosting unit, so how you network your Citrix workloads has a direct impact on machine management and can increase total cost of ownership if you are overly segmented.
Network Security Groups (NSGs) can be applied at the Azure subnet level and are supplemental to a network virtual appliance that provides security capabilities such as firewalling and intrusion detection. NSGs act as simple, stateful packet inspection devices that use the five-tuple approach (source IP, source port, destination IP, destination port, and layer 4 protocol) to create allow/deny rules for network traffic. Using the common ports that Citrix components rely on, you can establish NSGs between Citrix infrastructure and workload subnets. Additionally, using Azure Policy you can enable NSG Flow Logs to log traffic flows into Azure Traffic Analytics. Using Traffic Analytics you can generate insights such as virtual network traffic distribution by network or geography, NSG rules hit, and virtual machines receiving traffic from the internet.
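To illustrate how five-tuple rules behave, here is a small Python sketch that evaluates a flow against a prioritized rule list, where the lowest priority number is checked first and the first match decides. It is a simplified model for planning discussions, not the Azure implementation. The subnet ranges are hypothetical, and I use TCP 1494 and 2598 as commonly documented ICA/HDX ports, so verify the full list against the Citrix ports documentation for your deployment.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int            # lower number is evaluated first
    source: str              # source CIDR prefix
    destination: str         # destination CIDR prefix
    dest_ports: frozenset    # allowed destination ports; empty set means any
    protocol: str            # "TCP", "UDP", or "*"
    allow: bool
    source_ports: frozenset = frozenset()  # empty set means any source port


def matches(rule, src_ip, src_port, dst_ip, dst_port, protocol):
    """Check all five tuple fields against one rule."""
    return (
        ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.source)
        and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(rule.destination)
        and (not rule.source_ports or src_port in rule.source_ports)
        and (not rule.dest_ports or dst_port in rule.dest_ports)
        and rule.protocol in ("*", protocol)
    )


def evaluate(rules, src_ip, src_port, dst_ip, dst_port, protocol):
    """Return True if the first matching rule (by priority) allows the flow."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(rule, src_ip, src_port, dst_ip, dst_port, protocol):
            return rule.allow
    return False  # nothing matched: this sketch treats that as a deny


# Hypothetical subnets: 10.10.1.0/24 = infrastructure, 10.10.2.0/24 = workloads.
rules = [
    Rule(100, "10.10.1.0/24", "10.10.2.0/24", frozenset({1494, 2598}), "TCP", True),
    Rule(4096, "0.0.0.0/0", "10.10.2.0/24", frozenset(), "*", False),
]

print(evaluate(rules, "10.10.1.5", 50000, "10.10.2.20", 1494, "TCP"))  # session traffic
print(evaluate(rules, "10.10.1.5", 50000, "10.10.2.20", 3389, "TCP"))  # not in allow list
```

Writing the rules down in this form before a design review makes it easier to agree on which flows between the infrastructure and workload subnets are actually required.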
Microsoft provides guidance on network segmentation and logically segmenting subnets, and you can also use the following guidance to help with initial planning on segmenting your Citrix workloads:
- Segmenting by Citrix Workload Types – Creating separate single-session and multi-session virtual networks or subnets enables growth of both workload types without impacting the scalability of the other. For example, if a shared multi-session and single-session subnet fills up with VDI, you will need to create a new hosting unit to support an app use case. This forces you to either use multiple machine catalogs to scale the application or migrate the existing app catalogs to a new subnet. If using workload subscriptions as part of a multi-subscription architecture, consider MCS limits on the number of VMs per Azure subscription as you plan for IP addressing and your virtual network design.
- Segmenting by Tenant / Business Unit / Security Zone – If you are running a multi-tenant deployment, such as a Citrix Service Provider architecture, isolating tenants between networks or subnets is a recommended practice. For separate business units or security zones within the same organization, consider this if your existing security standards require specific isolation at the network level. Segmenting specific business units outside of workload-specific networks should be weighed against the impact of increased complexity on the overall environment. This methodology should be the exception rather than the rule and applied with the right justification and projected scale. (For example, creating a network for 1,000 contractors supporting finance to accommodate security needs above and beyond the standard single-session VDI network.) Application security groups can be used to limit access to business unit application backends to specific VMs on a shared virtual network. For example, you could limit CRM backend access to the CRM machine catalog VMs used by marketing in the multi-session VDA network.
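When sizing subnets for these segments, a quick check like the following Python sketch can help. It relies on the fact that Azure reserves five IP addresses in every subnet; the address ranges and the 25 percent growth factor are hypothetical values chosen for illustration, with the VM count borrowed from the 1,000-contractor example above.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves five addresses in each subnet


def usable_addresses(cidr: str) -> int:
    """Addresses left for VMs after Azure's reserved addresses are removed."""
    return ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_PER_SUBNET


def fits(cidr: str, planned_vms: int, growth: float = 0.25) -> bool:
    """True if the subnet holds the planned VMs plus a growth allowance."""
    return usable_addresses(cidr) >= planned_vms * (1 + growth)


# Hypothetical candidate ranges for a 1,000-VM single-session subnet.
for candidate in ("10.20.0.0/22", "10.20.0.0/21"):
    print(candidate, usable_addresses(candidate), fits(candidate, planned_vms=1000))
```

Remember that MCS can temporarily need additional addresses during image updates and provisioning, so treat the growth allowance as a floor rather than a precise figure.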
Building a baseline understanding of Azure networking is essential to helping you align with key landing zone requirements from your platform, networking, and security teams. A solid Azure network is critical to enabling growth of your Citrix deployment, and I recommend the following resources for additional information:
- Microsoft Cloud Adoption Framework – Networking best practices for Azure readiness
- Microsoft Documentation – Azure Virtual Network concepts and best practices
- Microsoft Documentation – Azure security baseline for Virtual Network
Management and Monitoring
CAF recommendation: “Use a centralized Azure Monitor Log Analytics workspace to collect logs and metrics from IaaS and PaaS application resources and control log access with Azure RBAC.”
A comprehensive cloud monitoring strategy should include actively measuring the health of the Citrix deployment and its subsystems, defining a red, amber, green model to represent health, and planning to respond to failures across Citrix components. When managing a Citrix architecture on Azure, tools such as Azure Monitor can enable this cloud monitoring strategy.
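As a rough illustration of a red, amber, green model, the Python sketch below classifies individual metrics against thresholds and rolls them up so the overall status is the worst status of any component. The component names, metric values, and thresholds are hypothetical; define your own with your operations team.

```python
SEVERITY = {"green": 0, "amber": 1, "red": 2}


def classify(value: float, amber_at: float, red_at: float) -> str:
    """Map a metric to a status, assuming higher values are worse."""
    if value >= red_at:
        return "red"
    if value >= amber_at:
        return "amber"
    return "green"


def rollup(statuses) -> str:
    """Overall health is the worst status reported by any component."""
    return max(statuses, key=SEVERITY.__getitem__, default="green")


# Hypothetical readings: (current value, amber threshold, red threshold).
readings = {
    "Cloud Connector CPU %": (45, 70, 90),
    "StoreFront CPU %": (75, 70, 90),
    "Unregistered VDAs %": (2, 5, 15),
}
component_status = {name: classify(*values) for name, values in readings.items()}
print(component_status)
print("Overall:", rollup(component_status.values()))
```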
To make the most out of Azure Monitor, you need access to a Log Analytics workspace. While Log Analytics design may vary based on the needs of your organization, I will highlight a few key areas where Log Analytics can provide useful insights above and beyond the current capabilities within Citrix tools. If Log Analytics is centralized, this can help when having discussions with your platform team on why access would be beneficial, or it can be used to help justify a Log Analytics deployment for Citrix if you are following a decentralized design.
Using Log Analytics for the Citrix deployment, you can aggregate key data such as resource availability or missing updates to check if Citrix infrastructure is missing critical Windows security patches. Log Analytics has pre-built queries that can access this information and then summarize in a customizable dashboard. Custom queries can also be built to summarize relevant data. Microsoft has a series of examples for reference.
Some notable pre-built queries to get you started include:
- Track VM availability – Display the VMs’ reported availability during the last day. This can advise on overall uptime of key resources in the Citrix deployment.
- Not reporting VMs – VMs that have not reported a heartbeat in the last five minutes. This will separate out machines that are unavailable and can be used to visualize machines that may be down unexpectedly.
- Top 10 Virtual Machines by CPU utilization – Find the top 10 VMs by CPU utilization in the last seven days. This query can be used to track infrastructure CPU utilization and supplement the utilization reports for Citrix workloads provided by Citrix Director.
- Summary of updates available across machines – Count of updates available under various categories for each machine. This includes critical, security, and other updates to provide detail on what master images, Citrix infrastructure, or desktops require Windows patches.
- Missing security or critical updates – Count how many security or other critical updates are missing. This can be used to isolate and visualize missing Windows security updates specifically to help in the prioritization of updating Citrix components.
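To show the logic behind a query like “Not reporting VMs”, here is a minimal Python sketch that applies the same five-minute rule to in-memory data. It is not the pre-built Log Analytics query itself, and the machine names and timestamps are invented for the example.

```python
from datetime import datetime, timedelta, timezone


def not_reporting(last_heartbeats, now, threshold=timedelta(minutes=5)):
    """Return machines whose most recent heartbeat is older than the threshold."""
    return sorted(
        name for name, last_seen in last_heartbeats.items()
        if now - last_seen > threshold
    )


# Hypothetical machines and heartbeat times for illustration.
now = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
last_heartbeats = {
    "cc-01": now - timedelta(minutes=1),
    "cc-02": now - timedelta(minutes=12),
    "vda-0042": now - timedelta(hours=3),
}
print(not_reporting(last_heartbeats, now))
```

For pooled VDAs that are powered off on a schedule, expect machines to appear in a list like this by design, so filter on the infrastructure machines you expect to be running at all times.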
CAF recommendation: “Use Azure Monitor alerts for the generation of operational alerts. Azure Monitor alerts unify alerts for metrics and logs and use features such as action and smart groups for advanced management and remediation purposes.”
Another recommended capability for Citrix on Azure monitoring is Azure Alerts. These alerts can use metrics, logs, activity logs, or service health and can prompt notifications (email, SMS, etc.) or actions using an action group. Alert rules can also be created for the Log Analytics queries mentioned above to advise on changes in results or metric thresholds, for example, when machines are found to be missing security or critical updates.
Another example is a Metric Alert on “Remaining CPU Credits” when using burstable B-Series instances. Burstable VMs are ideal for workloads that do not need the full performance of the CPU continuously; they accumulate CPU credits while running below their baseline and spend them to use the full CPU of the VM. These VMs can be used to create a lower-cost single-session VDI offering, such as a “Basic” tier within a cloud-tiered services model for use cases with low resource intensity. This can offset a “Performance” VDI tier that uses D-Series or F-Series machines and reduce compute spend where applicable. These two tiers can be separate machine catalogs built from the same master image using Citrix Machine Creation Services.
For burstable machines, when CPU credits are consumed, the VM will be capped at baseline performance, impacting user experience. B-Series should only be used for single-session VDAs, and only once you have a detailed understanding of a use case’s performance needs based on historical CPU metrics. Metric Alerts can help mitigate risk once implemented. An alert can let you know when users consume all credits, informing Citrix platform owners that a user may be better suited for a “Performance” tier VDI.
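The following Python sketch is a simplified model of how a credit balance drains and when a “Remaining CPU Credits” alert would fire. It assumes one credit equals one vCPU at 100 percent for one minute and treats the VM as a single vCPU. The baseline, starting balance, credit cap, and alert threshold are hypothetical values, so check the B-Series sizing tables for the real figures for your VM size.

```python
def simulate_credits(usage_pct_per_minute, baseline_pct, start_credits, max_credits):
    """Track the credit balance minute by minute.

    Running below the baseline banks credits; running above it spends them.
    The balance never drops below zero or exceeds the cap.
    """
    credits = start_credits
    history = []
    for usage in usage_pct_per_minute:
        credits += (baseline_pct - usage) / 100
        credits = max(0.0, min(max_credits, credits))
        history.append(credits)
    return history


def first_alert_minute(history, threshold):
    """Return the first minute (1-based) where credits fall to the threshold."""
    for minute, credits in enumerate(history, start=1):
        if credits <= threshold:
            return minute
    return None


# Hypothetical user who sits at 70% CPU for an hour on a VM with a 20% baseline.
busy_hour = [70] * 60
history = simulate_credits(busy_hour, baseline_pct=20, start_credits=30, max_credits=300)
print("Alert fires at minute:", first_alert_minute(history, threshold=10))
print("Credits left after the hour:", history[-1])
```

Setting the alert threshold above zero, as in this sketch, gives platform owners time to react before the user is throttled to baseline performance.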
These examples of using Azure Monitor for your Citrix environment just scratch the surface of its capabilities for a Citrix deployment. I recommend the following resources for additional information:
- Microsoft Cloud Adoption Framework – Cloud monitoring guide
- Microsoft Cloud Adoption Framework – Enhanced management baseline in Azure
- GitHub – Azure Monitor Community
Don’t Stop Here
If you’ve read all four parts of our Citrix TIPs – Citrix on Azure: Enterprise-Scale Landing Zones series, thank you. I hope this series will help to set you and your team up for greater Citrix and Microsoft Azure success. Continue your journey by exploring the rest of Microsoft’s Cloud Adoption Framework and diving deeper into each of the topics covered. Also, leverage Citrix Tech Zone to get the latest leading practices, whitepapers, and reference architectures for all things Citrix Cloud, including updated Citrix on Azure content. Good luck and happy architecting.