2013/04/04

Surebackup Network Deepdive

In this post I will take a deep dive into the networking part of Surebackup. I can imagine that this might be too deep for some people, but at least it should serve as a good reference for how the Surebackup vLab works.

But first things first. If you don't know what SNAT, DNAT or Netmap means, I advise you to read my previous post. Understanding that post is crucial for understanding how Surebackup works internally. Even if you know those concepts, I advise you to read the last part of that post, which describes how to connect overlapping subnet ranges, as this is the basis for Surebackup.

The simple network

We will start by creating a very simple Surebackup network. This particular Veeam customer has only one network, his production range. So let's put the network parameters in a clean table. This is very useful information, as it contains everything we need to set up the Virtual Lab.

Name | Portgroup | VLAN | Subnet | Netmask | Default Gateway | DNS
Production | VM Network | 0 | 192.168.149.x | 255.255.255.0 | 192.168.149.2 | 192.168.149.20

Now, for every production network you want to use in Surebackup, you will need a subnet range that is not being used and has the same subnet size. How do you know which production networks you are going to use? Take a look at the VMs you want to test in Surebackup and write down which networks they are connected to. In this case it is only one network.

Subnet Mask | Production Subnet | Subnet not in use anywhere on the network
255.255.255.0 | 192.168.149.x | 192.168.150.x

You could also use a range from a different private address space. This will allow you to make a clear distinction between production and your Surebackup network. For example:

Subnet Mask | Production Subnet | Subnet not in use anywhere on the network
255.255.255.0 | 192.168.149.x | 172.16.149.x
255.255.255.0 | 192.168.129.x | 172.16.129.x
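
The two requirements for a masking subnet (same size as the production subnet, and not in use anywhere) can be checked with a few lines of Python. This is only a sketch: the function name and the list of subnets in use are assumptions for illustration, and Veeam does not run this check for you in this form.

```python
import ipaddress

def is_valid_mask_subnet(production, candidate, in_use):
    """Return True if `candidate` has the same size as `production`
    and does not overlap any subnet already in use."""
    prod = ipaddress.ip_network(production)
    cand = ipaddress.ip_network(candidate)
    if cand.prefixlen != prod.prefixlen:
        return False  # the masking subnet must have the same subnet size
    return not any(cand.overlaps(ipaddress.ip_network(net)) for net in in_use)

# Hypothetical inventory of subnets already used on the network.
in_use = ["192.168.149.0/24", "192.168.129.0/24"]

print(is_valid_mask_subnet("192.168.149.0/24", "192.168.150.0/24", in_use))  # True
print(is_valid_mask_subnet("192.168.149.0/24", "192.168.149.0/24", in_use))  # False, already in use
print(is_valid_mask_subnet("192.168.149.0/24", "172.16.0.0/16", in_use))     # False, different size
```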

The Goal

Before I start explaining how to configure everything, let me first start by explaining what our final goal is. I think 99% of all problems with Virtual Labs occur because users don't know what the final result should be.

With Surebackup we will start up a virtual machine directly from a backup in an isolated network. So what is an isolated network? An isolated network is a network that mimics a production network. It means that VMs will reuse the same network settings as listed in our table earlier. This portgroup or network will be created automatically by Veeam on a vSwitch without any uplinks. You could say that Veeam creates a network sandbox.

Imagine a server SF0006 with IP 192.168.149.36. It is running in the production network. When Surebackup is configured, a copy of this network is created: the isolated network. The backup copy we want to test will be started in this isolated network. The result is something like this. Notice that I added the IP of the Veeam Backup Server.

VM Network (has uplinks on vSwitch)
  • SF0006 : 192.168.149.36
  • Default gateway : 192.168.149.2
  • Veeam Backup Server : 192.168.149.22

vLab VM Network (has no uplinks on vSwitch)
  • SF0006_Backup_Copy : 192.168.149.36

This is already a great start. The only thing is that we cannot talk to our "SF0006_Backup_Copy" machine, as it is isolated. This is where vLab networking comes into play. Veeam will deploy a small Linux NAT router. The router will sit between the production network and the isolated vLab VM Network.

In the production network the vLab router just needs any available IP. You can use a DHCP address, but I recommend using a fixed IP. In my case I chose 192.168.149.50. So let me update the VM Network:

VM Network (has uplinks on vSwitch)
  • SF0006 : 192.168.149.36
  • Default gateway : 192.168.149.2
  • Veeam Backup Server : 192.168.149.22
  • vLab router interface 0 : 192.168.149.50

Then in the isolated network we will also have to choose an IP address. However, the choice is very easy. If a virtual machine wants to contact the outside world, it will use its default gateway. The backup copy is not aware that it is started in an isolated environment. When it runs, it will also want to talk to the default gateway to send out traffic. So in the isolated environment our vLab router will mimic the default gateway that is running in production. Our vLab VM Network now looks like:

vLab VM Network (has no uplinks on vSwitch)
  • vLab router interface 1 : 192.168.149.2
  • SF0006_Backup_Copy : 192.168.149.36

Now if the Veeam server or a production machine wants to talk to the backup copy, it will just send packets to vLab router interface 0. The vLab router can then forward or route the packets to the isolated environment. What is more interesting is that you have overlapping subnets. So you will need to do some form of NAT to hide this isolated environment behind another subnet.

Let's see how it looks in VMware. First of all, you will see 3 running machines. The first one is the production machine SF0006. The second one is the backup copy. Notice that it won't use the name sf0006_backup_copy but rather sf0006_insertveryrandomhashhere. The last one is the vLab router.

Now let's take a look at SF0006. As stated before, it has the IP 192.168.149.36 and is connected to the VM Network.

The backup copy of SF0006 also uses the IP 192.168.149.36. However, it is connected to the vLab VM Network.

The vLab router itself has 2 IPs. One of those IPs is 192.168.149.50 in the VM Network. You can also see that it is connected to the vLab VM Network.

If we look at the ESX networking, you can clearly see that the isolated network has no way to talk to the outside world, except via the vLab router.

The Configuration

So now that we know what the result should be, let's see how you can configure it. You will need to go to the backup infrastructure and add a new virtual lab.

Give the Virtual Lab a name. I like vLab because it is short

The vLab router is a Linux appliance. It is a virtual machine, so you will need to select a vSphere host to run it on. Important: all the networking is created only on this vSphere host. All the machines that will be tested will be powered on on this host.

Select a datastore to store the Linux appliance on

Now configure the router so that it has an IP in your production network. Again, this is the free IP we reserved for the vLab router.

Instead of using DHCP, we choose the static IP 192.168.149.50. This IP will be set on interface 0 and will be the entry point for all packets coming from production.


Then in the next step choose manual configuration. If you made it this far, you know what you are doing.

Now let's create an isolated network. In our case we only need to mimic one portgroup called VM Network. However, if you have multiple networks (Production, DMZ, ...) and you need those to test your VMs, you will need to create a copy, or isolated network, of each production network. I will show you this in a later post.

For each isolated network you will have to define a vNIC or interface in that isolated environment. Remember that our Linux router will mimic the default gateway of production.

So in the settings for the isolated network vLab VM Network, configure the production default gateway, 192.168.149.2. The vLab router will then mimic the gateway in this isolated network. Interestingly enough, you will now have to configure a "Mask". Basically, you configure a subnet that does not exist in your environment and that will hide, or mask, your isolated environment. What you are actually doing is creating a NETMAP rule in the Virtual Lab router.

I chose 192.168.150.x because I have only one network. However, I cannot stress this enough: this mask should be unique in your network to avoid problems.

Just skip static mapping for now. I will cover it later.

And you are all set up.

The Result

When you are running a Surebackup job the result will be something like this (our goal)


VM Network (has uplinks on vSwitch)

  • SF0006 : 192.168.149.36
  • Default gateway : 192.168.149.2
  • Veeam Backup Server : 192.168.149.22
  • vLab router interface 0 : 192.168.149.50
vLab VM Network (has no uplinks on vSwitch)
  • vLab router interface 1 : 192.168.149.2
  • SF0006_Backup_Copy : 192.168.149.36

Remember that we masked, or did a NETMAP for, the vLab network. So any production VM in the VM Network that wants to talk to SF0006_Backup_Copy will not use its regular IP 192.168.149.36 but its masked version, 192.168.150.36. You can see in the screenshot that I am able to ping the machine successfully.
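
To make the masking concrete: a NETMAP translation swaps the network part of an address and keeps the host part. The following Python sketch illustrates that idea only; it is not the code running on the vLab router, and the function name is made up for this example.

```python
import ipaddress

def netmap(address, from_net, to_net):
    """Map `address` from `from_net` onto `to_net`, keeping the host bits.
    Addresses outside `from_net` are returned unchanged."""
    src = ipaddress.ip_network(from_net)
    dst = ipaddress.ip_network(to_net)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("both subnets must have the same size")
    ip = ipaddress.ip_address(address)
    if ip not in src:
        return str(ip)
    host_bits = int(ip) & int(src.hostmask)  # e.g. the ".36" part of a /24
    return str(ipaddress.ip_address(int(dst.network_address) | host_bits))

# Masked address used from production -> real address in the isolated network
print(netmap("192.168.150.36", "192.168.150.0/24", "192.168.149.0/24"))  # 192.168.149.36
```

Because only the network part changes, every host in the isolated network is reachable on the same last octet in the masking subnet.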

There is only one problem. The default gateway of the Veeam Backup Server (VBS) is set to 192.168.149.2. So if it wants to talk to 192.168.150.36, it would send the packets to that default gateway. The default gateway is not aware of the situation and just drops the packets.

So how do we fix this? Well, it is fixed automatically. If you run a Surebackup job or a U-AIR wizard, Veeam will automatically add static routes on the Veeam Backup Server (VBS) or on the machine running the U-AIR wizard. You can see this in a console window using the command "route print".

Veeam has added a static route saying that traffic for subnet 192.168.150.0/24 should be forwarded to 192.168.149.50, which is our vLab router interface 0. When the VBS wants to talk to 192.168.150.36, it will send the packet to our vLab router and the traffic will be translated.
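
The effect of that static route can be sketched as a longest-prefix lookup. The route lists below are a simplified assumption of what the VBS routing table contains before and after the job starts ("on-link" is just a label used in this sketch), not a dump of a real "route print".

```python
import ipaddress

def next_hop(destination, routes):
    """Return the gateway of the most specific route matching `destination`."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(net), gateway)
        for net, gateway in routes
        if dest in ipaddress.ip_network(net)
    ]
    if not matches:
        return None
    # The route with the longest prefix (most specific subnet) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

routes_before = [
    ("0.0.0.0/0", "192.168.149.2"),     # default gateway
    ("192.168.149.0/24", "on-link"),    # directly connected production subnet
]
# Static route added for the masking subnet, pointing at vLab router interface 0.
routes_after = routes_before + [("192.168.150.0/24", "192.168.149.50")]

print(next_hop("192.168.150.36", routes_before))  # 192.168.149.2
print(next_hop("192.168.150.36", routes_after))   # 192.168.149.50
```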

Deepdive

I've shown you how to configure the vLab router, but let's see what happens underneath. If you have not read my previous post, please do so.

One thing we discussed in the NAT post is how you can manage overlapping subnets. This is exactly how the vLab router solves the same challenge. So let's take a look under the hood. If we run ifconfig, we can see the IPs set on the interfaces. In this case:

  • Interface 0 : eth0 : 192.168.149.50
  • Interface 1 : eth1 : 192.168.149.2




First let's look at the NAT rules. There are two important NAT rules: the Netmap rule and the Masquerade rule.

The Netmap rule in Pre Routing stage
When? | Type | Coming from Interface | Exit Interface | Original Dest | Translation Dest
Pre Routing | NETMAP | eth0 / Interface 0 | * | 192.168.150.x/24 | 192.168.149.x/24

The Masquerade rule in Post Routing stage
When? | Type | Coming from Interface | Leaving on Interface | Original Source
Post Routing | Masquerade | * | eth1 / Interface 1 | 0.0.0.0/0 = everything

So let's follow a packet coming from our backup server 192.168.149.22. It enters on interface 0 and matches the Netmap rule. The destination is translated.
IP Packet
Source | 192.168.149.22
Destination | 192.168.150.36 -> 192.168.149.36
Message | Hello from Backup Server

Again, the routing will be solved by marking. I will show this later on. The packet is forwarded to interface 1. There the Post Routing Masquerade rule kicks in. It replaces the source IP with the IP set on interface 1.
IP Packet
Source | 192.168.149.22 -> 192.168.149.2
Destination | 192.168.150.36 -> 192.168.149.36
Message | Hello from Backup Server


SF0006_backup_copy will be able to receive the message and respond to the router. The router will reverse the whole sequence and the packet is delivered.
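
The full round trip can be simulated in a few lines of Python. This is a simplified model under stated assumptions: the class and function names are invented for this sketch, and the "reverse" step simply restores the original addresses, which is roughly what connection tracking does for reply packets on a real Linux router.

```python
import ipaddress
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str

MASK_NET = ipaddress.ip_network("192.168.150.0/24")      # masking subnet
ISOLATED_NET = ipaddress.ip_network("192.168.149.0/24")  # isolated copy of production
ETH1_IP = "192.168.149.2"                                # vLab router interface 1

def prerouting_netmap(pkt):
    """Rewrite a masked destination to the real isolated address."""
    dst = ipaddress.ip_address(pkt.dst)
    if dst not in MASK_NET:
        return pkt
    host_bits = int(dst) & int(MASK_NET.hostmask)
    real = ipaddress.ip_address(int(ISOLATED_NET.network_address) | host_bits)
    return replace(pkt, dst=str(real))

def postrouting_masquerade(pkt):
    """Replace the source with the IP of the interface the packet leaves on."""
    return replace(pkt, src=ETH1_IP)

def reverse(reply, original):
    """Undo both translations for the reply to `original`."""
    return replace(reply, src=original.dst, dst=original.src)

request = Packet("192.168.149.22", "192.168.150.36", "Hello from Backup Server")
inside = postrouting_masquerade(prerouting_netmap(request))
print(inside.src, "->", inside.dst)    # 192.168.149.2 -> 192.168.149.36

reply = Packet(inside.dst, inside.src, "Hello from SF0006_backup_copy")
outside = reverse(reply, request)
print(outside.src, "->", outside.dst)  # 192.168.150.36 -> 192.168.149.22
```

From the point of view of the backup copy, the request came from its default gateway; from the point of view of the backup server, the answer came from 192.168.150.36.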

So how is the marking set up? Well this is a bit tricky and I hope I am explaining it right.

When a packet enters on interface 0 (eth0), it is marked with the value 0x6 and bit mask 0xffffffff (32-bit) if the destination is 192.168.150.x/24. Remember, this rule is applied before the NAT rules are applied.

Then, if we look at the IP routing table, we see that all traffic should leave via interface 0 / eth0. The production router is configured as the default gateway.

If we look at the IP rules, we will see that an fwmark is set. It says that all traffic matching 0x2 with bitmask 0x2 should use an alternative table, table 2.

Now this is a bit tricky, as you would expect that the rule would match on mark 6. However, the fwmark rule specifies a bitmask of 0x2. This bitmask works like a filter, only letting through the bits that are set to 1 in the bitmask.

So let's convert the last byte to binary:
0x6            = 0000 0110
bitmask 0x2    = 0000 0010
Result 0x6/0x2 = 0000 0010

0x2            = 0000 0010
bitmask 0x2    = 0000 0010
Result 0x2/0x2 = 0000 0010


You can see that 0x6/0x2 will now match 0x2/0x2 and so the routing table 2 is chosen.
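
The comparison above boils down to a bitwise AND on both sides. The Python sketch below models it; the rule list and the table names are assumptions for illustration (only "table 2" comes from the router, "main" stands for the default routing table).

```python
def fwmark_matches(packet_mark, rule_value, rule_mask):
    """A rule 'fwmark value/mask' matches when the masked bits are equal."""
    return (packet_mark & rule_mask) == (rule_value & rule_mask)

def select_table(packet_mark, rules, default="main"):
    """Return the routing table of the first matching rule."""
    for value, mask, table in rules:
        if fwmark_matches(packet_mark, value, mask):
            return table
    return default

# One rule: fwmark 0x2/0x2 -> use alternative table 2
RULES = [(0x2, 0x2, "2")]

print(select_table(0x6, RULES))  # 2    (0x6 & 0x2 == 0x2)
print(select_table(0x0, RULES))  # main (unmarked traffic)
print(select_table(0x4, RULES))  # main (bit 0x2 is not set)
```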

If we take a look at this alternative table 2, we will see that it does indeed say to forward traffic for 192.168.149.0/24 via interface 1 / eth1 towards our isolated network.

Static Mapping

Now that we know how the basic settings work, we can look at static mapping. Static mapping is an additional option on top of the Netmap rule. Let's look, for example, at our server SF0006_backup_copy (192.168.149.36). If we want to reach it, we have to connect to 192.168.150.36. Our computer knows, thanks to the static routes, that it must send packets for 192.168.150.36 to our vLab router 192.168.149.50.

If other clients in the same subnet want to talk to this server, they will have to add the static routes manually. There is, however, another way. If, for example, you have a free IP 192.168.149.136 in production, you can map this IP to our server in the isolated environment. Other clients can then just connect to 192.168.149.136 and the router will do the translation to our SF0006_backup_copy (192.168.149.36).

Static mapping is part of the virtual lab configuration. You can see I have enabled it.

I added a mapping in VM Network so that the production IP 192.168.149.136 will be mapped to the isolated IP 192.168.149.36.

The result will be that SF0006_backup_copy is reachable on 2 addresses
  • 192.168.150.36 via the Netmap rule
  • 192.168.149.136 via Static mapping


If we look under the hood, a couple of extra rules will be added.

Static mapping adds 2 rules, one DNAT and one SNAT.

The DNAT rule in Pre Routing stage
When? | Type | Coming from Interface | Exit Interface | Original Dest | Translation Dest
Pre Routing | DNAT | eth0 / Interface 0 | * | 192.168.149.136 | 192.168.149.36

The SNAT rule in Post Routing stage
When? | Type | Enter Interface | Exit Interface | Original Source | Translated Source | Destination
Post Routing | SNAT | * | eth1 / Interface 1 | * | 192.168.149.2 | 192.168.149.36

In 2 stages the source and destination will be rewritten.
IP Packet
Source | 192.168.149.22 -> 192.168.149.2
Destination | 192.168.149.136 -> 192.168.149.36
Message | Hello from Backup Server
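
The two static mapping stages can be modelled the same way as a one-to-one lookup instead of a whole-subnet translation. Again, this is a sketch with invented names, not the rules as they exist on the appliance.

```python
# Production IP -> isolated IP, as configured in the static mapping step.
STATIC_MAP = {"192.168.149.136": "192.168.149.36"}
ETH1_IP = "192.168.149.2"  # vLab router interface 1

def prerouting_dnat(dst):
    """Rewrite the destination if it is a statically mapped production IP."""
    return STATIC_MAP.get(dst, dst)

def postrouting_snat(src, dst):
    """Rewrite the source for traffic heading to a mapped isolated IP."""
    return ETH1_IP if dst in STATIC_MAP.values() else src

def translate(src, dst):
    new_dst = prerouting_dnat(dst)
    new_src = postrouting_snat(src, new_dst)
    return new_src, new_dst

print(translate("192.168.149.22", "192.168.149.136"))  # ('192.168.149.2', '192.168.149.36')
```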

An additional mark rule will be created so that not only traffic going to 192.168.150.x is marked with 0x6, but also traffic going to 192.168.149.x.




Comments:

  1. I am very interested on this topic. I setup the surebackup in the same production network. It works fine. If I do it on different VLAN, it doesn't seem to work.

  2. my Veeam backup is sitting on 192.168.1.0/24 which is once of a production network. The surebackup running fine. Now, I want to use SureBackup on 192.168.9.0/24. It doesn't seem to work.

  3. Hi. I'm experiencing exactly what Daniel has noted. Even Veeam support hasn't been able to help me. I'm hoping a talented blogger like yourself knows the trick!

  4. What is important is that you Veeam server and your vlab access point (proxy section) are in the same network. For every network you have in production you should make an isolated network and add a NIC with the correct settings. Next to this blog, I also made a webex about this: http://www.veeam.com/videos/surebackup-deepdive-tdewin-eng-2657.html . It is less deep dive but could maybe help you as well

  5. Hi there. I am also experiencing the same problem. If the restored VM is in the same VLAN as the Veeam server and the Vlab proxy I can access remotely to the restored VM. If the restored VM is in a different VLAN as the Veeam server and the Vlab proxy I can not access nor ping the restored VM...

    There is a discussion in Veeam's Forum about the same problem:

    http://forums.veeam.com/vmware-vsphere-f24/surebackup-static-mapping-does-not-work-t16447.html

    Do you have any suggestion or comments about it? Thanks ;)

  6. Thank you very much, your blog give me Green Result for my vLAB with Veeam Backup and Virtual Lab.
    Best Wishes.

  7. In picture 15/29 can you explain where the Appliance IP of 192.168.1.1 suddenly comes from? It's not an IP that is mentioned in any part of the blog earlier.

    Thanks,
    Dan
