Designing Azure Networks – Part 2 – Security, DMZ, Perimeter N/W, PaaS, web app

This article is part of a larger series of articles titled “Azure Networks for Architects“.

In this article, we will analyze the options available for building a DMZ or a perimeter network on Azure while using PaaS services like Azure Web App. Let’s start with a network design and see what’s wrong with it.


In the above design, users access a typical website hosted on Azure Web App (marked 1), which in turn needs to access a SQL Server VM hosted in an Azure vNet. The Azure Web App is connected to the Azure vNet using a P2S VPN (marked 2). The Azure vNet in turn is connected to on-premises resources using a VPN. As usual, there is a dedicated subnet on the Azure vNet for hosting the VPN gateway.

The above is a typical scenario recommended on Azure (and many have implemented it), where a PaaS service like Azure Web App can access backend resources (like a SQL DB) using vNet connectivity and even on-premises resources (when the vNet is further connected to the on-prem n/w via VPN).

However, this network design has a couple of security flaws that are unacceptable in many scenarios.

Flaw # 1: Unrestricted access to public facing resources

If you observe the above design, traffic from the internet to the Azure Web App goes uninterrupted. Azure Web App does not offer any mechanism where we can apply our own security policies to incoming traffic. For example, what if we don’t want to expose the Azure Web App to the internet? We don’t even have an option to turn off public endpoints. Threats like DDoS attacks or IP spoofing simply go unchecked. Wait… someone may say that Microsoft writes that they do have security infrastructure in place for attacks like DoS. Well, read again. The security checks that Microsoft provides are for “their resources” and not “your resources”. It is a very common misconception to misread Microsoft’s own security as Microsoft providing security for a customer’s endpoints! No, no one provides anything for free. Microsoft does provide security, but it is very basic. For example, Microsoft will monitor if another tenant is trying to attack your system from within Azure and prevent it (but it doesn’t protect against attacks originating from outside Azure). This is the statement you need to look for: “Windows Azure’s DDoS protection also benefits applications. However, it is still possible for applications to be targeted individually. As a result, customers should actively monitor their Windows Azure applications.“ See the term “also”. It means protecting customers’ apps is not its primary aim, but yes, you will definitely get some basic protection if the underlying infrastructure is secured.

If you really want to use Microsoft services to prevent attacks, you need to pay for those services. It’s called “Azure Security Center”. However, even Azure Security Center doesn’t offer protection for Azure Web Apps unless you are on the “Premium plan with ASE”. In a nutshell, Microsoft has clearly written in its documentation (so don’t blame them) that they don’t provide protection for tenants’ endpoints. So remember, you are very much prone to all kinds of attacks, and only you are responsible for defending against them… not Microsoft Azure.

Coming back to the main point, the first reason the above scenario is unacceptable is that “there is no control on inbound traffic for public facing resources“. Every organization wants to host at least a basic firewall that analyzes packets at different layers (especially the higher ones) and allows/rejects incoming traffic based on the policies its business demands, and Azure Web Apps offer none of this. Such public facing resources become the weakest link, or entry points, for attacks into your overall system.

Flaw # 2: Little or no cushion between public facing resources and backend resources

One of the important principles of security is called the “defense in depth” or “castle” approach. This approach says that we need to increase the distance between the external world and the protected resource, and provide security using multiple security controls at each layer. So, if your public facing web app accesses a SQL database directly, your SQL database is at risk. This is also the reason software design recommends a “service layer” (not necessarily a web service) where all access to backend resources happens via a dedicated set of services. However, we don’t want to go into the details of software design as it would dilute the topic, so let’s come back to networks.
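The castle idea can be sketched as a chain of independent checks that a request must pass before it touches a protected resource. This is a toy illustration, not Azure code; the layer names and request fields are hypothetical:

```python
# Defense in depth as a chain of independent controls: a request must pass
# every layer before it reaches the protected resource. (All names here are
# illustrative, not real Azure constructs.)
def edge_firewall(req):    return req.get("port") in (80, 443)
def dmz_gateway(req):      return req.get("authenticated", False)
def backend_gateway(req):  return req.get("source") == "dmz"

LAYERS = [edge_firewall, dmz_gateway, backend_gateway]

def reach_database(request):
    # One failed layer stops the request; compromising the database now
    # requires defeating every control, not just one.
    return all(layer(request) for layer in LAYERS)

print(reach_database({"port": 443, "authenticated": True, "source": "dmz"}))   # True
print(reach_database({"port": 443, "authenticated": False, "source": "dmz"}))  # False
```

The point of the sketch is only structural: each added layer multiplies the attacker's work, which is exactly what the flawed single-hop design above lacks.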

When security is the top priority, organizations maintain complete isolation between internal resources and the external world. Below is a very high-level network design that many organizations follow for hosting public facing resources, irrespective of the cloud platform’s capabilities.


Here are the characteristics of the above design.

  1. The public facing resources are within a dedicated network of their own, with firewalls and other security devices facing the internet. This means we have control over all of the traffic flowing in/out of the public facing resources. We even have the ability to block all public facing inbound traffic, or allow selective traffic based on the security policies the organization wants, without touching the web resource itself. These controls can be placed on the network itself.
  2. Protected resources are in their own networks (DBs, DCs etc.). Here you’ll see two such separate networks. Whether to use an entirely separate network or subnets comes down to the discussion of performance vs. security, and my previous post talks about it.
  3. Communication between resources of the two networks happens via the networks’ gateways. The resources themselves don’t connect directly. If we compare with the flawed design, there the Azure Web App connected to the SQL database via the backend vNet’s gateway, so there was only one entity between the web app and the database. Now there are two layers (the gateway and security policies of the DMZ vNet, and the security policies and gateway of the backend vNet). So we have essentially increased the distance between the public facing resource and the protected resource, and we get to deploy extra layers of security controls between them. This is a better design from a “defense in depth” point of view.

The above design addresses the two problems I mentioned (control over inbound traffic, and defense in depth).

The “How” on Azure?

Now, the question that remains is how to achieve the above level of isolation on Azure networks while using PaaS services like web apps. There are two ways to address this problem on Azure. The first approach involves extra cost, while the other involves extra configuration (and a little extra cost too).

Approach 1: Using Azure App Service Premium Plan
Azure App Service offers different plans/editions. The highest of these is the “Premium” plan. The premium plan can be used with two options:

  • using App Service Environment (ASE)
  • without using ASE

When we use ASE, the web apps are deployed within an Azure network. This could be either your own existing n/w or a new dedicated n/w. The App Service servers “actually” become part of the Azure network (without indirect mechanisms like P2S VPN). The main benefit of this approach over VPN is that you can now use all the network security concepts of Azure vNets that you are familiar with (NSG, NSA, UDR…). So you can control the inbound traffic at a fine-grained level. You can perform deep packet inspection using advanced security appliances connected to your network.

When your app wants to connect to backend resources, you can connect the two networks without actually connecting your web app to the backend n/w (you just need to connect the two networks via gateways). However, this is a costly option just for implementing security and will increase the TCO of your solution. ASE is really worth the investment only if you also need its other features (like much larger scalability, bigger storage space etc.).

Approach 2: Using vNet to vNet connectivity
In this approach, instead of hosting web apps in ASE, we simply create a new dedicated public facing network and connect the Azure Web App to this n/w via VPN, as shown below. Notice that in comparison to the previous design, the only difference is that the web app connects to the DMZ vNet via VPN (and is not actually part of it). But this small change makes a BIG difference. In this context, VPN is not the same as actually being part of the network, because this approach still doesn’t provide the ability to control public facing inbound traffic. The web app still faces the internet without any checks (well, at least not the ones we wanted).


However, this design definitely solves the problem of defense in depth. Maybe in the future Microsoft will place all web apps within a n/w (networks don’t cost us anything) or provide extra capability to filter public inbound traffic.

Till then, only ASE offers the uber features that an enterprise grade solution needs. Any non-ASE solution based on Azure Web App is not completely secure, at least by the book.

I hope you enjoyed this article. In the next article, we’ll take a look at other aspects of network designs.

This series is also mirrored at The links in this article below will be updated as and when new articles are posted.

-Rahul Gangwar

Designing Networks – Part 1 – Performance/Latency

This article is part of a larger series of articles titled “Azure Networks for Architects“.

In this article, we will discuss when we should create networks and how to organize resources within them to meet an application’s non-functional requirements like performance, security and management.

Networks are created to meet the “communication” needs of a solution. Every design decision we take is guided by “some” goal in mind, and based on that goal our network design changes. The first step is to decide how many networks we need and which components of the solution go where. There are two common approaches that most organizations use while creating a network.

  1. Create networks based on security and/or management requirements
  2. Create networks based on application design (generally for best performance)

In order to understand the above approaches, let us analyze some network designs. Every network design that we analyze in this series will be split into a separate article. In this part (part 1) we will analyze the network shown below and continue analyzing other network designs in upcoming parts.

Example 1:


Key design aspects of the network shown in the above figure (and their relation to Azure) are given below:

  1. In the above figure, resources are grouped based on “type” and “security” needs. For example, all storage resources (DBs, file servers etc.) are put within a single network (grouped by type). All internet facing resources (web servers, proxy servers…) are placed in a dedicated network (commonly called a DMZ or perimeter network). All supporting infrastructure resources (DC, DNS, ADFS…) are grouped in a separate network. All non-public facing, custom application resources (say Windows services…) are clubbed in a separate network.

Azure Design Implication:

The above grouping can introduce performance issues. If you observe, most of the resources (servers) grouped together do not talk to each other. In fact, they need to talk to resources placed in other networks. For example, a web server placed in the DMZ will call the corresponding app server in the app network, which in turn will access the app DB (in the storage n/w) and use supporting infrastructure (like the DC) from the corp n/w. Being in separate n/ws could lead to high latency, because communication between networks is slower than communication within the same n/w.

On Azure, this could be solved by placing all resources within a single n/w and enforcing the separation boundaries via subnets (similar to what is shown below… all security aspects omitted). On Azure, as long as two resources are within the same network (even if they belong to different subnets), there is no latency (performance) implication, because traffic between two subnets of the same n/w doesn’t go via any specific router. Instead, the Azure fabric does simple packet filtering (which it does anyway). So there is no extra overhead related to subnet separation. This is unlike on-premises scenarios, where your subnets can actually be separated by physical routers (which turn out to be the bottleneck in cross-subnet performance scenarios).


This is also the reason why, on Azure, there is practically no restriction on the number of subnets you can create. Always remember, limitations on the cloud are enforced when either (a) the feature consumes some resource OR (b) the feature has design limitations. In the case of subnets, having no restriction on their number clearly tells us that there is no design restriction, nor is Azure stressed with extra resources.
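The one-vNet-with-purpose-based-subnets layout discussed above can be sketched with Python's standard `ipaddress` module. The address ranges are illustrative only, chosen to mirror the DMZ/app/storage/corp grouping from the figure:

```python
import ipaddress

# One vNet address space split into purpose-based subnets (illustrative ranges).
vnet = ipaddress.ip_network("10.0.0.0/16")
subnets = {
    "dmz":     ipaddress.ip_network("10.0.1.0/24"),
    "app":     ipaddress.ip_network("10.0.2.0/24"),
    "storage": ipaddress.ip_network("10.0.3.0/24"),
    "corp":    ipaddress.ip_network("10.0.4.0/24"),
}

# Every subnet must fall inside the vNet's address space.
assert all(s.subnet_of(vnet) for s in subnets.values())

def locate(ip):
    """Return the name of the subnet an address belongs to, or None."""
    addr = ipaddress.ip_address(ip)
    for name, net in subnets.items():
        if addr in net:
            return name
    return None

print(locate("10.0.1.10"))  # a web server lands in "dmz"
print(locate("10.0.3.20"))  # a database lands in "storage"
```

Because all four subnets carve up a single vNet address space, traffic between them stays inside the vNet fabric, which is exactly why the subnet model avoids the cross-network latency discussed above.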

However, if you create separate networks on Azure (similar to the above design), there will be performance implications on Azure too, because (a) communication between two n/ws happens via dedicated gateways, which can be performance blockers, and (b) the resources themselves may be physically far away from each other, affecting cross-network communication latency.

2. The second thing you will notice in the above network design is that once resources have been organized into separate networks, security boundaries are enforced. For example, inbound internet traffic is allowed into the DMZ network (with inspection rules, of course), but direct access from the internet to any other network is not allowed. The public facing network (the DMZ) can access the application network but no other internal network (the app network provides a cushion between the DMZ and internal resources). The app network can access the internal networks.

Azure Design Implication:

After our discussion (in point 1) of separating resources via dedicated networks vs. subnets within the same n/w, I’m assuming we are going with the subnet model. Now, on Azure, these kinds of communication restrictions between subnets are generally implemented using two approaches:

Approach #1: Using Network Security Groups (NSG):

In this approach, we simply implement the rules of communication using NSGs and apply them to subnets. NSGs are simple packet filtering rules applied at the network and transport layers. They cannot handle advanced threats that happen at the higher layers of the network stack. For example, NSGs cannot detect that an intrusion has occurred or that malicious code has already entered your system (for example, a bot trying to mount a DoS attack on your server from inside the network). These advanced scenarios are handled by security appliances that work at higher layers, analyzing packets that are closer to your application. At the lower layers you might not even see any issue, because all you analyze there is source/destination IP address, source/destination port and the layer 4 protocol (the famous 5-tuple inspection done by NSGs).
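To make the 5-tuple idea concrete, here is a minimal sketch of NSG-style rule evaluation: rules are checked in priority order (lower number wins) and the first match decides. This is an illustration of the concept, not how NSGs are implemented, and the addresses and ports are made up:

```python
# Minimal sketch of NSG-style 5-tuple filtering. Each rule carries a priority;
# rules are evaluated lowest-priority-number first and the first match wins.
rules = [
    # (priority, src_prefix, dst_port, protocol, action)
    (100,  "10.0.1.", 1433, "TCP", "Allow"),  # DMZ subnet -> SQL allowed
    (200,  "",        1433, "TCP", "Deny"),   # everyone else -> SQL denied
    (4096, "",        None, None,  "Deny"),   # catch-all deny
]

def evaluate(src_ip, dst_port, protocol):
    for _prio, src_prefix, port, proto, action in sorted(rules):
        if not src_ip.startswith(src_prefix):
            continue
        if port is not None and port != dst_port:
            continue
        if proto is not None and proto != protocol:
            continue
        return action
    return "Deny"

print(evaluate("10.0.1.5", 1433, "TCP"))     # Allow (from DMZ subnet)
print(evaluate("203.0.113.9", 1433, "TCP"))  # Deny (anywhere else)
```

Notice that the decision never looks at the packet payload; that is precisely why the higher-layer threats described above sail through an NSG.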

Approach #2: Using Network Security Appliances (NSA) AND User Defined Routes (UDR)

In this approach, we install special virtual devices called NSAs. NSAs come from different vendors like Barracuda, Trend etc. These appliances come in different forms and with different features. Some are firewalls, some have Intrusion Detection and Intrusion Prevention capabilities, some are simple host antiviruses… so depending on your need you would buy one. Unless you bought a host based NSA (one that gets installed on the same machine as your apps), you will end up using a dedicated resource (maybe a dedicated VM). As an example, let’s say you want to use a firewall configured with your advanced rules for restricting your resources’ outside communication or inter-subnet communication. In order to make sure this NSA can apply those rules, you need to ensure that all communication goes via this NSA. This is implemented on Azure using UDRs. So, in a nutshell, you do the following:

–> Buy and install a Network Security Appliance (NSA)
–> Configure NSA for your needs
–> Create user defined routes (UDR) to make sure that all communication passes via NSA
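The effect of the UDR step can be sketched as a route lookup: Azure selects the route with the longest matching prefix, so a user-defined route for a destination subnet can redirect that traffic to the appliance's IP. The addresses and hop labels below are illustrative, not a real route table:

```python
import ipaddress

# Route table seen by the app subnet (illustrative). The user-defined route
# is more specific than the system route for storage-bound traffic, so it
# wins and funnels that traffic through the security appliance (NVA).
routes = [
    (ipaddress.ip_network("10.0.0.0/16"), "VirtualNetwork (system route)"),
    (ipaddress.ip_network("10.0.3.0/24"), "NVA at 10.0.5.4 (user-defined route)"),
]

def next_hop(dst_ip):
    """Pick the route with the longest matching prefix, as Azure does."""
    addr = ipaddress.ip_address(dst_ip)
    candidates = [(net, hop) for net, hop in routes if addr in net]
    return max(candidates, key=lambda c: c[0].prefixlen)[1]

print(next_hop("10.0.2.9"))   # other in-vNet traffic is unaffected
print(next_hop("10.0.3.20"))  # storage-bound traffic goes via the NSA
```

This also makes the performance caveat below visible: every storage-bound packet now takes an extra hop through one compute node.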

However, be aware that when we implement NSAs, we are effectively funneling the entire communication through a single compute node. So, make sure to evaluate performance criteria before implementing an NSA using UDRs. For example, make sure the NSA you buy has a scalability option, and make sure to perform load testing, especially for scenarios that involve inter-subnet communication that the NSA could be limiting.

Placing Resources Close Together on Azure

Let me detail a little more about Azure networks, subnets, distance between resources, and the performance/latency implications. As I mentioned earlier, as long as your resources belong to the same network, communication between them will be fast. However, there are a few tricks you can use to further enhance communication between resources. Earlier, Azure offered the concept of “affinity groups”, which allowed us to tell Azure to place all resources in a single “scale unit”. A scale unit was generally considered to be a compute cluster that is close together. So you might end up with all your resources within the same rack, or even within the same host machine (if you did not use availability sets). However, now that affinity groups are no longer recommended, it is important to understand what the replacement for that feature is.

In scenarios where latency between two server resources is “ultra, super… important”, you still need a concept similar to affinity groups. Unfortunately, however, there is no replacement for that feature. Your resources could now end up farther away from each other even if they are within the same network. A network is “regional” now, which means you would typically select, say, “Central US” for your network and resources. However, within a region there could be multiple datacenters hundreds of miles apart, and within a datacenter too we may have to traverse multiple routers between two racks (generally the case where the datacenter is very large… you cannot have all the resources within the same network, or you would end up with scenarios like packet flooding by a faulty node or similar situations affecting the entire datacenter).

So, now that affinity groups are not available (at least I don’t see them on the Azure portal, though they may be available via PowerShell), we can make sure that the VMs we create are of the same size and belong to the same region/network. Azure has compute units in its datacenters, and they are available in clusters at a single location. So, for example, if Microsoft introduces a new fancy VM in the Central US region, it is possible that it is available only in a particular datacenter. Now if you create two VMs, of which one is a plain old mid-sized VM and the other is the latest one Microsoft introduced, chances are higher that they will be placed far apart (though within the same region). So it is recommended that if intra-resource communication is super important, you stick to the same size for all individual resources. Of course this is not a guarantee, but it is still a more favorable chance, and will likely continue to be so in future. Microsoft itself tries to place compute units as close as possible, but if we don’t leave them a chance, we will end up with machines farther away. There are many other factors that influence the datacenter Microsoft chooses for your machine, for example, the availability of resources of a particular architecture in a datacenter.

An Example Scenario:

Without naming the organization, I want to give an example of when we actually used the above concepts. In one large scale Azure implementation, the customer knew the load they would expect on particular dates. The customer wanted to make sure that the machines were close together when we scaled out for the additional load. Now, if we had used the regular approach of dynamic scaling (increase compute units only when CPU% is ABC etc.), we would have ended up with compute units in different datacenters, as we knew that the datacenter closest to us was small. So, with the help of Microsoft Product Groups and Business Groups (since it was a very large scale deployment, we got extra support), we created VMs of a particular size (we were told the size that a specific datacenter supports) and created them a few days in advance. After that we did not shut down the VMs, because if we shut them down, next time we might get a different datacenter. After the high load period passed, we switched back to the regular auto scaling approach. One could ask what would have happened if those VMs had faced runtime issues and Azure had needed to re-provision them… after all, this is what we are taught (that cloud is commodity hardware and we should be ready for compute failures etc.). Well, the reality is a little different. In reality, all datacenters are not alike… some use commodity hardware while some don’t. And compute units failing is not as frequent as we are told, if we land on better datacenters. And even if a few compute units did fail, it would not have been catastrophic… we had actually overprovisioned compared to what we needed.

As mentioned earlier, I will continue to analyze other network design approaches in later parts; otherwise a single article would become too long to read.

Key Takeaway:

  • Resources are organized into networks or subnets based on different business needs like performance, security and management. It is important to evaluate the impact of your network design on these aspects.
  • If resources are too far away (geographically) or separated by different networks (via dedicated routers), then the performance of communication between them is affected.
  • On Azure, as long as two resources are within the same virtual network, the communication latency between them will be low (hence better performance).
  • On Azure, separating resources by subnets (in the same network) does not affect their communication performance. But be careful: an exception to this is using NSA appliances with UDRs, in which case communication performance is affected by the throughput offered by the NSA devices.
  • When we connect PaaS components to IaaS components (for example, Azure Web App to Azure vNets), this happens via VPNs. Make sure the PaaS component is within the same region as the IaaS component, otherwise there will be communication latency between the resources. One quick tip to ensure this is to keep all resources within the same resource group (RGs are tied to a specific region).
  • On Azure, we have control over resources being as close as being in a single “region”. Earlier we had the concept of an “affinity group”, which provided control at the datacenter level, but it is deprecated now. If you want resources to be really close to each other, try to create them with similar configuration (size, features used etc.). “Chances” are (not guaranteed) that your resources will be placed within the same scale unit, or at least within the same datacenter.
  • It is a common misconception that Azure NSGs are a “weaker” or less secure way to protect your resources on Azure. It is important to understand that NSGs offer a different capability set and operate at a different layer of the OSI model. If you need fine-grained protection, you need to intercept packets at a higher layer, which NSAs can do. Each component has its own place and purpose, and they complement each other in overall security (firewalls, antivirus, Security Center, UDR, NSG, NSA etc.). Expecting NSGs to meet NSA needs is not a good approach.
  • One of the key aspects of improving communication performance is to keep resources close to each other and close to the audience. Keeping them within the same network, having uniform resource sizes, scaling in advance and one unit at a time, keeping them within the same subnet (when using an NSA) or increasing the throughput of NSAs when in different subnets, and selecting the same region (when using VPN to connect to the network) are some of the design aspects that create a solid design with minimum communication latency from an infrastructure perspective. There are many other things we can do from an application design perspective (e.g., caching), but that would diverge from the topic of discussion.
  • Other tips to enhance communication performance are to use standard features offered by Azure, like high performance gateways for your VPNs, ExpressRoute, bigger compute units for more bandwidth, SSDs etc. (of course they mean extra cost). But those are “features” and not really “design” aspects, so I’ve skipped them for now. They may come as part of my other mini articles in future.

An ideal network on Azure incorporating the above designs may look like the one below:



  • “Common” resources that require “small” and “less frequent” data transfers are put in a separate network (the corp network). In this scenario, these resources were also protected, so we did not create any public endpoint and the network is inaccessible from the public internet.
  • All application specific resources are put within the same network and the same resource group (so that they are in the same region… not shown in the figure). These resources are separated by subnets for security needs. Usage of user defined routes with security appliances is optional but can be used. NSGs are used to filter packets between subnets before traffic even reaches the security appliance.
  • The corp network is connected to the application network via VPN to enhance security. PaaS resources also communicate with the Azure n/w via VPN. Public internet access is restricted to only specific resources in the DMZ subnet of the app network.


There may be scenarios where some “common/shared” resources require heavy data usage by the application. In such scenarios, the recommended option is to use development designs (like caching) to avoid hitting resources of another network. If that is not feasible, then we can use infrastructure solutions like replication, where we keep a copy of the shared resource in the app network and maintain a sync schedule based on business needs. Performing bandwidth heavy, frequent operations on resources of another network is the last thing you want to do.

I hope you enjoyed this article that talks about network design aspects for enhancing performance (communication latency) between different components of your network on Azure.

In the next article, we’ll take a look at network design from “security” perspective.

This series is also mirrored at The links in this article below will be updated as and when new articles are posted.

-Rahul Gangwar

What are Networks?

This article is part of a larger series of articles titled “Azure Networks for Architects“. This is a “foundation” article that explains a slightly different perspective on networks and their underlying protocol stack. The article also provides a mapping between Azure constructs and general networking terms.

Unless we are building a monolithic solution deployed standalone with no interaction with any other entity, we are talking about a solution that involves a “network”. Networks are very important while designing Azure solutions. I can’t imagine any decent solution on Azure that does not involve having your own network, unless we go for pure PaaS/SaaS based solutions.

“If two or more entities are connected with each other, they form a network”. Two things are important to understand here:
(1) “entity”: which means not necessarily computers
(2) “connected”: which means they are simply connected (they may not actually be communicating).

Avoiding much of the complexity, in a software application scenario, if you have two or more components deployed separately (typically presentation, service, business and data layers), you need to establish connectivity between them. You can do so by including them as part of a network. There are other, non-network oriented ways of connectivity too, but as I said, avoiding much complexity…

We see networks everywhere… at home (wifi), at offices (Ethernet), on cell phones etc. So, the fundamental process includes two steps:

  • Creating a network
  • Joining entities (typically compute devices) to the network

Since a network is a grouping of devices for communication purposes, it is only natural to think that networks must have a standard language that devices can use to communicate. Yes, you are correct. A network is incomplete unless we specify what language devices can speak in it. Every network supports a specific communication “protocol” that devices use to talk to each other. That also means every device that wishes to connect to a network must understand that communication protocol.

There are different kinds of network protocols, and you choose one based on the requirements. For example, we use standards like GSM and CDMA for creating networks for cell phones, while we use TCP/IP for creating computer networks (both wifi and Ethernet). These networks are different from each other not only because they use different protocols (from a software perspective), but also because they use different radio frequencies for communication. So we need network-specific hardware on any device that wants to join a given network. This is the reason your phone probably has an antenna for connecting to GSM, a wifi adapter for connecting to computer networks, and maybe another antenna for a CDMA network. Coming back to the point, computer networks are largely based on the TCP/IP protocol suite (just like cell phones are GSM/CDMA based) and so are Azure networks (did you really think of creating a GSM network on Azure 🙂 ). Going forward, I’ll use the term “device” instead of “entity” as we move closer to computer networks.

At this point I also want to give some very minimal details about how protocols are stacked. These are basic but important concepts. Many concepts are “derived” and may not be written explicitly in a product’s documentation. These are a few of those basic concepts from a network perspective.

From a user’s perspective, on Azure you work with protocols starting at layer 4 of the OSI model. You don’t need to go into the deep details of the OSI layers (though some people claim they understand OSI…). What this means is the following:

Layer 1:
We don’t have access to the physical layer (wires, wifi etc.)… it is managed by Microsoft.

Layer 2:
We don’t have access to the data link layer… the software that interacts directly with the physical layer. Networks are created on top of this layer, and Microsoft has implemented its own backbone network at this layer. But this network is inaccessible to cloud users.

Layer 3:
At layer 3 (the network layer) we get the protocol called IP. Protocols like TCP, UDP and ICMP are implemented on top of IP. When a “user” creates a network on Azure (called a vNet in Azure terminology), Microsoft creates it at layer 3 (the network (IP) layer). There are many versions of IP; version 4 (IPv4) and version 6 (IPv6) are of most interest. Microsoft creates vNets using IPv4, which means IPv6 is not supported on Azure vNets as of today.

A network on Azure (vNet) is called a virtual network because it is implemented in software. This means that supporting IPv6 in future is merely a matter of upgrading software. In fact, support for IPv6 will be one of the few major announcements you’ll see in future. One question that may come to mind is whether we can create VLANs on Azure or not. VLANs are also software created networks, but they are created at layer 2, and the main purpose of VLANs was to join separate physical networks into one using software. But on Azure we don’t have access to layer 2, so we cannot create VLANs. vNets are created using a technique called an overlay network. An overlay network is built on top of other networks, and it allows creating multiple networks (with conflicting address spaces) on top of the same underlying shared network. It’s done using technology similar to Hyper-V Network Virtualization (some say overlays are different from virtualization) using NVGRE, where each customer’s network ID is added to each frame (not packet, because a packet is a layer 3 concept and we need isolation at layer 2). I think if I go any further, it will dilute the topic. But the key thing is that you don’t get a dedicated network on Azure. It’s a shared space, and data from multiple tenants flows on the same infrastructure.
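A toy sketch of the overlay idea (loosely NVGRE-style, greatly simplified): each tenant's frames carry that tenant's virtual network ID, so two tenants with identical address spaces can share one wire without their traffic mixing. The IDs, addresses and dictionary layout are all illustrative:

```python
# Toy overlay encapsulation: two tenants use the SAME address space, but each
# frame carries a virtual network ID (as NVGRE's VSID does), keeping them apart.
def encapsulate(vnet_id, src, dst, payload):
    return {"vsid": vnet_id, "src": src, "dst": dst, "payload": payload}

wire = [
    encapsulate(5001, "10.0.0.4", "10.0.0.5", "tenant A data"),
    encapsulate(5002, "10.0.0.4", "10.0.0.5", "tenant B data"),  # same IPs!
]

def deliver(frames, vnet_id):
    """A tenant's virtual network only sees frames tagged with its own VSID."""
    return [f["payload"] for f in frames if f["vsid"] == vnet_id]

print(deliver(wire, 5001))  # only tenant A's payloads
print(deliver(wire, 5002))  # only tenant B's payloads
```

The sketch shows why conflicting address spaces are fine on Azure: isolation comes from the tenant ID stamped on every frame, not from unique IP ranges.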

Layer 4:
This is the layer that we get to see and work with as cloud users. At this layer, protocols like TCP, UDP and ICMP are implemented on top of IPv4 (strictly speaking, ICMP sits alongside IP at layer 3, but Azure groups it together with TCP and UDP). There are many protocols that applications can use on top of the IP layer; for each one, the IP header’s protocol field carries a standard number. You can see all common protocol numbers in the IANA protocol numbers registry.

Azure networks support only the TCP, UDP and ICMP protocols, and you need to keep this in mind before planning a solution to be deployed on Azure networks. Your solution may sometimes need protocols other than TCP, UDP and ICMP. Most protocols that software applications use are implemented on top of these; for example, HTTP is implemented on top of TCP.
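The protocol numbers mentioned above are standard IANA assignments, and a feasibility check against Azure’s three supported protocols can be sketched in a few lines of Python (the helper function is illustrative; the numeric constants come from the standard `socket` module):

```python
import socket

# IANA protocol numbers carried in the IP header's "protocol" field.
# Azure vNets only pass traffic for these three.
SUPPORTED = {
    "ICMP": socket.IPPROTO_ICMP,  # 1
    "TCP":  socket.IPPROTO_TCP,   # 6
    "UDP":  socket.IPPROTO_UDP,   # 17
}

def is_supported_on_azure_vnet(proto_number: int) -> bool:
    """Quick feasibility check: is this IP protocol number one Azure carries?"""
    return proto_number in SUPPORTED.values()

print(is_supported_on_azure_vnet(socket.IPPROTO_TCP))  # True
print(is_supported_on_azure_vnet(47))                  # GRE -> False
```

Running a checklist of your solution’s protocol numbers through a function like this is exactly the kind of technical feasibility analysis suggested in the takeaways below.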

Some protocols may run but are not supported. “Not supported” means that if anything goes wrong, you can’t call Microsoft, AND updates that Microsoft deploys on Azure “may” break your existing scenarios. Basically, you are on your own, even though some protocols may work technically.

Even though TCP, UDP and ICMP are supported, protocols implemented on top of them may not be “supported”… though technically possible. An example is SNMP. SNMP uses UDP, so it gives the impression that SNMP should be supported. The reality is that even though you can technically use SNMP, it will NOT be “supported” by Microsoft, because Microsoft ships SNMP as part of the Windows OS and this feature of Windows is not supported on Azure yet. So it’s more a matter of OS-feature supportability than technical supportability. If you are planning to use any product on Azure, you must do the following:

(1) If you plan to build your own product using a specific protocol “directly”, check whether the protocol itself is supported or not.

(2) If you plan to use a product or product feature (say, SNMP on Windows Server), check whether that feature is supported by the vendor (Microsoft, in the case of Windows Server SNMP) on Azure or not.

Some protocols are available for specific purposes only. As an example, IPSec is used to provide data security between device and device, between network and network, and between device and network. On Azure, IPSec can be used to provide secure connectivity between networks (and only between a non-Azure network and an Azure vNet; it is not used between two Azure vNets), but it is NOT supported for the other use cases.

Key Takeaways:

  • Networks have a “language” that every entity/device should know how to speak. This means devices must be equipped with both the software protocols (e.g., TCP/IP) and the necessary hardware (Ethernet/WiFi).
  • Azure provides multiple virtual hardware concepts to join networks. The primary ones are Network Interface Cards (NICs) and virtual network adapters (in the form of VPN).
  • Azure networks support only UDP, TCP and ICMP on top of IPv4. IPv6 is not supported yet. This is very important to keep in mind while designing solutions on Azure. Keep a checklist of the protocols your solution needs and verify that each one is supported as part of technical feasibility analysis.
  • Even if a protocol is supported, check whether the vendor of the product you wish to use on Azure supports that product there or not.
  • The actual underlying network on Azure is “shared” between tenants, and isolation is provided using overlay networks. So, in reality, packets of multiple customers flow on the same infrastructure, but they are still isolated and secure because users don’t get access to the lower layers of the networking stack. This is important from a compliance perspective. Some organizations want a separate, dedicated physical network, which is NOT possible on Azure. In those scenarios, we go for a hybrid approach and move to Azure only those components that can afford shared space.

I hope you enjoyed this small article that builds a foundation of networking concepts and shows how they relate to Azure from a supportability perspective.

In the next article, I’ll talk about “when should we create networks”.

This series is also mirrored at … The links in this article will be updated as and when new articles are posted.

-Rahul Gangwar

Azure Networks for Architects

Azure Networks for Architects is a series of articles where I’ll go over some of the aspects of designing networks on Microsoft Azure. This series will have multiple blog posts, and in each article I’ll go over one aspect of networks on Azure. The series aims to cover everything from very small, trivial concepts to important ones.

Though the title of this series says “Architects”, it will be helpful for multiple categories of people, including developers, IT professionals, project managers, technical leads and consultants… anyone who wants to understand the finer aspects of networking that go well beyond the “how to” level. In these articles you’ll also find very small things that we tend to overlook but that play important roles in the overall design of any enterprise-level deployment.

The concepts presented in this series come from a wide variety of sources including my experience while working with hundreds of customers in Microsoft as a Sr. Consultant, my takeaway while working with customers in ComTec as Solution Architect, my learning from some very knowledgeable people in the industry, my mistakes and my little power of visualization and analysis.

Why networks only? I’m aiming to cover multiple aspects of cloud computing, and I’m only “starting” with networks. In future you’ll see other components of designing cloud solutions like “storage”, “communication”, “security” and so on.

Why start with networks? Networks form the backbone of any solution. The first step in deploying any solution involves creating the underlying infrastructure upon which the entire solution is built. As you’ll see in the articles of this series, network design dictates many things about your solution, like performance, availability, scalability, security, maintainability and cost. I’ve not used these terms to impress; as you’ll see, these are indeed affected by network design. I’m ignoring a small set of solutions which are built on PaaS or SaaS and may not need a network as such.

Solution vs Application/Software: You’d have seen me using the term “solution” very often. My title at ComTec also says “Solution Architect”. This is also a buzzword that many people use because it sounds fancy and kind of “deep” and “better” than other terms. Some think a “solution” is a very large and complex software application. However, I’ve tried to take utmost care with the terms I use (I may still have used some terms without much thought). The term “solution” refers to the “answer” to a “problem” that we are trying to address. Sometimes the answer to a problem may be a software application, but not always. As we move beyond this networking series, you’ll see that the articles become more abstract (farther from “how to”)… more “strategic” than implementation oriented. Even a decision to not use software at all could be the solution to a problem.


List of Series Articles:

  1. What are Networks
  2. Designing Networks – Part 1 – Performance/Latency

Stay Tuned! Subscribe

Rahul Gangwar

Azure DocumentDB – Design Concepts

It was 2015, while working with Microsoft, that I delivered a small talk on DocumentDB at the “Great Indian Developer Summit” along with my friend and mentor Wriju Da : ). We talked about this technology as part of the polyglot persistence pattern and advocated using multiple persistence stores rather than sticking to an RDBMS every time.
It was a buzzword during those days. Since then I’ve worked with many customers, and I’ve seen the trend and the pace with which NoSQL grew. Today it is an actively sought-after database approach, and the upward trend continues.

Today, I work as a Solution Architect at ComTec, and even after NoSQL has gained wide popularity, every time I work with customers I see both confusion and enthusiasm in the eyes of developers and architects alike. So, I thought about sharing my bit.

This post is meant for anyone trying to understand NoSQL in general, and Azure DocumentDB specifically, from a design and architecture perspective. Unlike the traditional approach where we explain by keeping RDBMS as the base (which I find confusing), I’m presenting it as questions, answers and scenarios. I’ve used examples from the official Microsoft documentation instead of creating new ones.

What is DocumentDB?

It is a managed (meaning you don’t have to manage it; Microsoft does) NoSQL DB service from Microsoft.

When explaining NoSQL DBs, most articles talk about the fact that they don’t have the concept of schemas and hence are very flexible. While this is true, understanding DocumentDB only by virtue of it being “schema free” is not the correct approach (in my view) and often leads to confusion later. So, in order to explain DocumentDB (or NoSQL in general), I prefer explaining the following two pointers:

  1. DocumentDB stores your objects “as is”, generally using the JSON format (please read about the JSON format separately if you’re not familiar with it).
    The term “as is” is used loosely to signify the fact that there are no dedicated tables or any other kind of structured containers (of course the objects live within a container, but the container is not structured/schema bound). These stored objects in NoSQL databases are also known as “documents” (we will use this term going forward).
  2. There is absolutely no relationship between different objects from a storage perspective (you may have one in your head or within app logic).

If documents are stored “as is” and they don’t have any relationships between them, then how do I perform queries that involve relationship scenarios (like 1:1, 1:many, many:many)? After all, having such relations between two different entities is a reality.

The first design technique used to address the relationship challenge is “avoiding any relationships altogether”. Though you can technically store any kind of object from your app in DocumentDB, it is important to design your objects in such a way that they are “self-contained” (not having any dependency on any other object).
This can be explained with a very simple example. Imagine you had “Person” and “Address” tables in an RDBMS joined by a “Person ID” field. If you create corresponding classes in the application as well (a Person and an Address class) and then try to store objects of these types separately in DocumentDB, you’d be committing the mistake of thinking the RDBMS way. Instead, you need to think the NoSQL way, where you would still create classes called “Person” and “Address”, but this time you’d have “Addresses” as one of the properties of the “Person” class (maybe of type List<Address>). Now, when you create an object of “Person”, it will be “self-contained”. The object will be serialized to JSON format and stored in a DocumentDB collection, thereby completely avoiding the need for any relationship (and hence joins). So, essentially, what’s done here is that we “de-normalized” the data (by not storing objects of separate classes separately)… instead, we “embedded” “Address” within the “Person” object, and they will actually be serialized and stored in DocumentDB as a single document in JSON format with a structure similar to the one below:
{
  "id": "1",
  "firstName": "ABC",
  "lastName": "XYZ",
  "address": {
    "street": "My Street",
    "city": "MyCity"
  }
}


Remember: the embedding technique means we are de-normalizing, and it works well with 1:1 relationships.
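To make the embedding idea concrete, here is a minimal Python sketch (class and property names are illustrative, and Python stands in for whatever language your app uses): the Person object carries its addresses directly, so serializing it yields one self-contained JSON document with no reference to any other document:

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Address:
    street: str
    city: str

@dataclass
class Person:
    id: str
    firstName: str
    lastName: str
    addresses: list = field(default_factory=list)  # embedded, not referenced

p = Person("1", "ABC", "XYZ", [Address("My Street", "MyCity")])

# One self-contained document: no joins needed to read a person's addresses.
doc = json.dumps(asdict(p))
print(doc)
```

Storing `doc` as-is in a collection is the whole point: reading the person back gives you the addresses for free, with no second lookup.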

The above example looks good in a specific scenario. But we still have some scenarios where this may not be an option. For example, what if we want to store details about “Orders”? If we embed orders within “Person”, then how do I query based on “Orders” alone, without keeping “Person” in mind? Also, if we follow this approach, we’d end up storing ALL of our entities inside “Person”. That just doesn’t make sense, because (a) it will increase the size of individual documents (stored objects) and (b) it’s like going back to a forced hierarchical data design where everything stemmed out from a root entity (“Person” in this case).

YES, if you end up creating very large objects (because you are forcing the NoSQL way) then you have certainly messed up the DocumentDB design. Large objects are hard to move over the wire, and if you’re thinking about trimming them at your service level, then you are opening a Pandora’s box. You’d definitely end up with a solution design that is overcomplicated and costly.

The “embedding” technique must be used only when you know that it will not lead to large object designs AND when you know that a particular piece of data is almost always required. I read somewhere: “In NoSQL, you don’t design your database based on the relationships between data entities. You design your database based on the queries you will run against it.” This statement clearly tells us that if you frequently need to query “Address” along with “Person”, then you can embed; otherwise, not.

Another caution while using the “embedding” technique is to think about the cost associated with update operations. As an example, below we embed the “stock” that a person invested in.

{
  "id": "1",
  "firstName": "ABC",
  "lastName": "XYZ",
  "numberHeld": 100,
  "stock": { "symbol": "myStock1", "value": 0.5 }
}


Now, it may seem that embedding is good here, but imagine that myStock1 may change hundreds of times in a day. And if we have thousands of “Person” documents in our database with investments in this stock, then updating the value of this stock in the system will require you to update thousands of documents.

On top of that, you have to do that from your application logic. This means that if your app doesn’t handle the update logic carefully, you may end up in an “anomaly” scenario where the price of the same item is shown differently to different users. DocumentDB wouldn’t be able to help you avoid such anomalies. Azure DocumentDB does provide stored procedures and server-side triggers to lessen the design burden, but they still can’t match what we get with an RDBMS.
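The update fan-out described above can be sketched with plain dictionaries as an in-memory stand-in for the collection (this is illustrative only, not a DocumentDB API): one logical price change forces a write to every document that embeds the stock, and any document the loop misses keeps showing a stale price:

```python
# In-memory stand-in for a collection of "Person" documents that each
# embed a copy of the same stock (illustrative, not a DocumentDB API).
people = [
    {"id": str(i), "stock": {"symbol": "myStock1", "value": 0.5}}
    for i in range(1000)
]

def update_stock_price(docs, symbol, new_value):
    """Every embedded copy must be rewritten -- one logical change,
    N physical document updates, all driven by application code."""
    touched = 0
    for doc in docs:
        if doc["stock"]["symbol"] == symbol:
            doc["stock"]["value"] = new_value
            touched += 1
    return touched

print(update_stock_price(people, "myStock1", 0.75))  # 1000 documents rewritten
```

One price tick, a thousand writes: that is exactly the cost embedding imposes when the embedded piece of data is volatile and widely shared.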

In short, you are correct, embedding is not the answer to all scenarios.

So, what is the alternative? How do we handle such scenarios if embedding is not a preferred approach for them?

When “embedding” doesn’t work (and that depends on your app scenario), you use another technique called “referencing”. In the referencing technique, instead of actually embedding “investment” objects within “Person”, we store just the ID of the investment (which is stored separately). Wait… didn’t I just say that if I store objects separately and join them by IDs, then I’m committing the crime of thinking the RDBMS way? Yes, I said it. And I said it to point out the hard reality that even after coming to a NoSQL DB, you may end up using relational features. There is actually NO clean way to handle these “real” problems in NoSQL DBs.

In fact, the problems don’t end here… let’s say that for those “rare” scenarios you are ready to use “referencing”. Now we have really opened the Pandora’s box, because…

  1. DocumentDB doesn’t have any concept of “relationships”, so this “referencing” technique is purely application-logic driven. You need to be careful to design the application in such a way that whenever you want to fetch data that needs two objects consolidated, you do it programmatically. I am sure you understand the implication of performing a join in application memory vs. at the server side; I won’t go into those details. Azure DocumentDB does provide stored procedures so that you don’t eat your app server’s resources, but that support is limited by multiple factors, including design decisions.
  2. We are back to the normalization approach, but this time without any help from the underlying database.
  3. We also run into the problem of deciding which class should reference and which should be referenced. For example, should we reference “Book” within “Publisher” or “Publisher” within “Book”? Answer: if the number of books per publisher is small, then storing the book references inside the publisher document may be useful; otherwise, vice versa. So even the referencing technique changes per application scenario (unlike RDBMS).
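A sketch of the referencing technique (all names here are hypothetical, and dictionaries stand in for separately stored documents): the person document holds only investment IDs, and the “join” is plain application code with one extra lookup per reference:

```python
# Two separate "collections"; the person only holds IDs (references).
investments = {
    "inv1": {"id": "inv1", "symbol": "myStock1", "value": 0.5},
    "inv2": {"id": "inv2", "symbol": "myStock2", "value": 1.2},
}
person = {"id": "1", "firstName": "ABC", "investmentIds": ["inv1", "inv2"]}

def load_person_with_investments(p, investment_store):
    """The join is application logic: resolve each referenced ID by hand."""
    resolved = [investment_store[i] for i in p["investmentIds"]]
    return {**p, "investments": resolved}

full = load_person_with_investments(person, investments)
print([i["symbol"] for i in full["investments"]])  # ['myStock1', 'myStock2']
```

Note the trade: a stock price update now touches exactly one investment document, but every consolidated read pays for the in-memory join.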

What if there is a many:many relationship? For example, an author has many books and a book has many authors. Which class will reference, and which will be referenced?

To solve this many:many problem, we use the technique of creating a “joining document”. In this technique, neither the Author nor the Book class references the other. Instead, there is yet another class, say “AuthorBooks”, which contains an author ID and the IDs of all the books associated with it. But then the join becomes very complex, and we have also created a VERY RDBMS-oriented data design. To avoid this, we can instead “reference” author IDs in the Book class and book IDs in the Author class. Of course, you can always use a mix of embedding and referencing.
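Both options above can be sketched side by side (all IDs and names are illustrative): option A is the joining document, option B is mutual referencing, where each side carries the other’s IDs so either direction of the query is a simple lookup:

```python
# Option A: a joining document -- neither side references the other.
author_books = {"authorId": "a1", "bookIds": ["b1", "b2"]}

# Option B: mutual referencing -- each side carries the other's IDs.
authors = {"a1": {"id": "a1", "name": "Author One", "bookIds": ["b1", "b2"]}}
books = {
    "b1": {"id": "b1", "title": "Book One", "authorIds": ["a1"]},
    "b2": {"id": "b2", "title": "Book Two", "authorIds": ["a1"]},
}

def books_of(author_id):
    """Author -> books direction: one lookup plus ID resolution."""
    return [books[b]["title"] for b in authors[author_id]["bookIds"]]

def authors_of(book_id):
    """Book -> authors direction works just as directly."""
    return [authors[a]["name"] for a in books[book_id]["authorIds"]]

print(books_of("a1"))    # ['Book One', 'Book Two']
print(authors_of("b1"))  # ['Author One']
```

The price of option B is that adding or removing a book now means updating ID lists on both sides, again purely from application logic.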

How does DocumentDB infer the schema of objects, and how does indexing work?

The answer lies in the fact that JSON is self-describing (i.e., it carries both data and schema). DocumentDB creates a tree structure out of the schema of the JSON object. Each property of the JSON object represents a node in this tree, and each value in this tree gets a definite path. Every path in DocumentDB is indexed, and whenever we update any object, the index is updated. Both values and schema are treated equally when it comes to paths. The index of paths is itself a tree structure. Indexes are updated synchronously… which means that if you update a document, the result will not be returned until the index tree is also updated.
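The “every property becomes a path” idea can be illustrated by flattening a JSON document into its root-to-value paths. This is a deliberate simplification of what the index tree holds (the real DocumentDB index internals differ), but it shows how both keys and values end up addressable:

```python
def json_paths(obj, prefix=""):
    """Enumerate root-to-value paths of a JSON-like object -- roughly the
    set of paths a path-based index covers (simplified illustration)."""
    paths = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            paths += json_paths(v, f"{prefix}/{k}")
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            paths += json_paths(v, f"{prefix}/{i}")
    else:
        paths.append(f"{prefix} = {obj!r}")
    return paths

doc = {"id": "1", "address": {"city": "MyCity"}, "tags": ["a", "b"]}
for p in json_paths(doc):
    print(p)
# /id = '1'
# /address/city = 'MyCity'
# /tags/0 = 'a'
# /tags/1 = 'b'
```

Because every such path is indexed, any property is queryable without you declaring a schema up front, which is why index updates have to happen synchronously with writes.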

What are performance implications?


  • DocumentDB delivers predictable performance.
  • It does so by allocating request units (RUs) to your DB collection.
  • A request unit represents the cost of reading a single 1 KB JSON document having 10 unique properties.
  • Insert, replace, delete etc. are more costly than read operations.
  • Each query or update operation may have a different cost based on the size of the document and the query.
  • Determining the approximate number of RUs your application needs is difficult, so Microsoft has created an RU calculator where you can upload sample JSON documents and specify the read, create, update and delete operations per second your app needs for each document; the calculator then gives the RUs that you should reserve.
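As a back-of-the-envelope illustration only: taking the 1 KB read as the ~1 RU baseline from the bullets above, and assuming (purely for illustration; the real RU calculator accounts for much more) that writes cost a few times a read, a crude per-second estimate looks like this:

```python
def naive_ru_estimate(doc_kb, reads_per_s, writes_per_s, write_multiplier=5):
    """Very rough sketch: scale the ~1 RU per 1 KB read baseline by document
    size, and charge writes a multiplier. The multiplier is an assumed
    illustrative number, not Microsoft's actual pricing model."""
    read_cost = doc_kb * 1.0          # ~1 RU per 1 KB read (baseline)
    write_cost = doc_kb * write_multiplier
    return reads_per_s * read_cost + writes_per_s * write_cost

print(naive_ru_estimate(doc_kb=2, reads_per_s=100, writes_per_s=10))  # 300.0
```

For any real provisioning decision, use the official RU calculator with your actual sample documents; this sketch only shows why document size and write ratio dominate the estimate.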

Buzz me if you liked it.

-Rahul Gangwar

Azure SQL Database and Azure AD Authn Flow

SQL Azure DB now supports Azure Active Directory based authentication (preview), and this needs some detailing, as the official documentation is very high level. Below I am presenting the flow of this authentication mechanism so that you can create all sorts of permutations and combinations and deduce the behavior yourself for any scenario. I am taking the example of a typical ASP.NET based website (say, developed using C#) that accesses SQL Azure DB using Azure AD authentication.

The candidates involved in this scenario are described here:

  • Client: Anyone who tries to access SQL Azure DB. This could be SQL Management Studio, an ASP.NET web app running on IIS or a REST based Web API.
  • SQL DB: The resource that we are trying to access.
  • Azure AD: The Security Token Service whose tokens are needed by client to access SQL Azure DB

I will not go into the details of other well-documented requirements like setting up a contained DB, groups, admins, a .NET 4.6+ based client etc. You can read about them in the official documentation.

The Scenario:

I am taking a slightly bigger scenario involving both Windows AD and Azure AD and looking at different options at different stages of the flow. This will help you comprehend simpler scenarios easily.

The flow:


In the above scenario, we will see how a typical web app accesses Azure SQL DB in different ways.

(1) The user is authenticated with the web application. The user could have authenticated using claims authentication, Kerberos or any other mechanism… it doesn’t matter. One thing that must be clear in this step is that the user is actually a local AD user. The IIS website may or may not be using ADFS for authentication (this too doesn’t matter for our scenario).

Now, before the application can use SQL DB, it first needs to obtain a valid token from Azure AD. The further steps explain that process.

(2) Before the client/app can obtain a token from Azure AD, it will obtain a token from ADFS, which is federated with Azure AD. This is performed using the WS-Trust protocol, via one of the options below depending on the scenario.

  • Option 1: The IIS web app can prompt the user for his/her credentials, collect them and send them to ADFS to obtain a token (T1). This does not require the IIS web server to be domain joined.
  • Option 2: If the user was authenticated with the IIS web app using claims authentication, then the web app can use C2WTS (impersonation) to authenticate with ADFS and obtain a token (T1). This requires the IIS server to be domain joined.
  • Option 3: If the IIS web app is configured to use integrated Windows authentication with impersonation, then it can use Kerberos (constrained delegation) to authenticate with ADFS and obtain a token (T1). This requires the IIS web server to be domain joined (unless we used NTLM and made the solution ugly).
  • Option 4: The IIS web app can use a fixed Windows identity to access ADFS (irrespective of who the user is) and obtain a token (T1) from ADFS. If you hardcode the username/password, then the IIS web server doesn’t need to be domain joined; otherwise, it would need to be domain joined with the app pool running under that fixed AD user’s identity.

(3) No matter which option the IIS web app uses, it will ultimately get a token (T1) from ADFS. This token T1 is meant for Azure AD and contains the user’s attributes as defined by the claim rules in ADFS. The user could be either the end user or the fixed identity (in the case of Option 4), depending on which option we chose as described above.

(4) Now the IIS web app will send T1 to Azure AD. Azure AD will accept this token, as there already is a federation between ADFS and Azure AD.

(5) Azure AD will verify the token T1, look at the user’s attributes and then create a new token (T2) for the same user. This token (T2) will be returned to the application.

(6) The IIS web app will then use token T2 to access Azure SQL DB. Azure SQL DB trusts Azure AD issued tokens and allows access using that user’s identity (the user from step 1).
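The six steps above can be condensed into a sketch. Every function below is a hypothetical stand-in for the real ADFS/Azure AD/SQL endpoints (none of these are actual SDK calls); the point is only to show the token hops: T1 from ADFS via WS-Trust, T1 exchanged at Azure AD for T2, T2 presented to SQL DB:

```python
# Hypothetical stand-ins for the real ADFS / Azure AD / SQL endpoints --
# this sketches the token hops, not an actual SDK.
def adfs_issue_token(credentials):
    """Steps 2/3: ADFS authenticates the user (WS-Trust) and issues T1,
    scoped for Azure AD, carrying the user's claims."""
    return {"token": "T1", "audience": "AzureAD", "user": credentials["user"]}

def azure_ad_exchange(t1):
    """Steps 4/5: Azure AD accepts the federated ADFS token and issues T2,
    scoped for Azure SQL DB, for the same user."""
    assert t1["audience"] == "AzureAD"
    return {"token": "T2", "audience": "AzureSQL", "user": t1["user"]}

def sql_db_connect(t2):
    """Step 6: SQL DB trusts Azure AD issued tokens."""
    assert t2["audience"] == "AzureSQL"
    return f"connected as {t2['user']}"

t1 = adfs_issue_token({"user": "alice@contoso.local"})
t2 = azure_ad_exchange(t1)
print(sql_db_connect(t2))  # connected as alice@contoso.local
```

Whichever of Options 1-4 produced the credentials, the rest of the chain is identical: the identity SQL DB finally sees is whatever identity ADFS put into T1.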

Abrupt End!

-Rahul Gangwar
Cloud & Security Consultant



Azure IP Addresses


  • Virtual IP (VIP) means an IP that is visible externally but is not actually assigned to the actual resource. For example, if I have a web farm with two nodes, I don’t want the IPs of the actual resources. Instead, I need one IP that identifies my farm (both nodes/VMs). Since it is not an IP actually allocated to a node, it is called a virtual IP.
  • VIPs are always external facing. They are never internal and never assigned to the actual VM.
  • VIPs are always associated with a Cloud Service/Load Balancer. The documentation uses the terms CS and Load Balancer interchangeably; for example, one article is about the VIP on a CS, but inside it also says that it is associated with the Load Balancer. On a side note, from an IaaS perspective, “load balancer” is the more accurate term, while from a PaaS perspective “CS” is easier to visualize.


  • Dynamic IP (DIP) is the actual IP that a node/VM gets. These are actually assigned to the VM instance’s NIC.
  • However, it is always internal. These IPs are assigned to the VM instance and are never available outside the CS/vNet within which the VM is present.
  • These are called dynamic because, by default, the IP that your VM gets can change if you deallocate it.
  • In a typical scenario, let’s assume we have 10 VMs in a vNet; each VM will get an internal IP (DIP), and most of the time we don’t care even if it changes. Only in a few scenarios do we not want these internal IPs to change (say, when your VM is hosting DNS). In that case, we would want this internal IP (DIP) to be static rather than dynamic, so you can “reserve” these internal IPs as well. In a typical web farm scenario, you’d never care about reserving a DIP.


  • A public instance-level IP (earlier called PIP) is an IP that is visible to the external world AND is actually assigned to the VM instance.
  • So, they are the same as VIPs, except that unlike a VIP, they are actually assigned to the VM instance.
  • If you access a VM using its PIP, the traffic never goes through the load balancer. That is why you never configure/add/delete any public endpoint (HTTP/HTTPS etc.) on the cloud service when you are using a PIP: endpoints never come into the picture.
  • In most scenarios, you’d never want to use a PIP. Some scenarios (like passive FTP) may be eligible candidates for using one.
  • You’d never use a PIP for web farm scenarios, as you will definitely have more than one VM in your farm and you’d want traffic routed via the load balancer. So for web farms, you’d want to use VIPs only.

Reserved/Stable IP:

  • “Reserved” and “stable” are one and the same. When you reserve an IP, it becomes stable… it doesn’t change irrespective of deallocation of resources etc., though deletion does affect even reserved IPs.
  • You can reserve both VIPs and DIPs but not PIPs (again, unless something has changed).

-Rahul Gangwar