2014/02/27

Stop duct taping your IT infrastructure: Is Veeam enterprise ready?

FYI, this is going to be a rather long article. In fact, it could be thought of more as a white paper. However, I feel that a white paper format would block my creativity and would not allow me to share my not-so-unbiased opinion. So whatever opinion I express here is my personal opinion, not necessarily the opinion of my current employer.

So the question I am often facing is "Is Veeam enterprise ready?". Honestly, I think it is, in a modern environment. However, sometimes the honest answer is painful for customers to hear. So I'll discuss some statements I hear while talking to customers and my answers to them. But before I do that, let me start with a small issue I am facing in my daily life.


In my daily life I drive a Volkswagen Golf. 99% of the time it is the perfect car for me. The reasons are simple:
  • I drive a lot alone so don't need a lot of space
  • It can drive on highways and it can get me everywhere
  • It is reliable 
  • It is not too expensive and since I am not a car freak, I kinda like it. It's that good balance between "I need to compensate for something" and "I don't care what it looks like, if it drives, it drives"
  • It can seat 4 people comfortably
This week I am facing a challenge though. I need to transport a rather big package that will not fit in my Golf. This happens to me once a year, so the car doesn't actually fulfill my needs 100% of the time. That is why I have been thinking about this sweet baby below.


  • It can drive me
  • It can do highways
  • It should be reliable (I hope)
  • It can fit more than 4 people
Additionally
  • Once a year I can go camping in this baby! I love camping!
  • I can move stuff really easily
So on paper it seems the perfect car. It can do everything, so why not buy it? Well, there are lots of hidden daily costs for something I will only need 1% of the time:
  • Mileage per gallon will be horrible. Damn I hate physics and friction!
  • Taxes in Belgium are awful, imagine what this will cost me per year.
  • Although it will get me everywhere, visiting my customers in Brussels will be painful. Imagine having to park this monster.
So next time you buy a new car, don't buy a big car just because you need it once a year. In my case, I will have the package home delivered. It will cost me a bit more money, but nothing compared with this sucker. As a bonus, it's a service: I don't have to do anything. But what does this have to do with backup? Let me come back to that one :)

Veeam and physical machines?
The answer is simple: Veeam doesn't do them. Please start by virtualizing all your servers today! There are no real good arguments not to do it. Often the argument is performance. Well, honestly, performance is not the issue. The issue is that you are over-provisioning your virtual environment and mostly don't have the I/O on your storage to allow for good performance. Sometimes I get the argument that local disks have lower latency. True, it's all about physics again, but then again, there are great modern solutions like Nutanix, VSAN or PernixData that take away that pain by doing server-side caching.

So virtualize them all! Often I hear people want to set up private/public clouds (because it is the big buzzword). Well, IaaS is the only cross-platform, cross-application solution that can offer true cloud solutions. So if you want to be cloud ready, this is the way to go. Decouple your machines from physical boundaries. Not only will you free up wasted resources but, more importantly:
  • You don't have to set up horribly difficult cluster solutions to have HA. In fact, I recently had this discussion with a partner. Most of the time clusters are so badly managed and configured that they cause more downtime than the high availability they offer. They also bring a lot of nasty requirements that make administering them hell. On a side note, stop using physical RDMs. These arguments are invalid:
    • Clusters: Read above!
    • Performance: VMware released a white paper 5 years ago (ESX 3.0) saying you don't gain any performance by using RDMs
    • VMDK Size: 62 TB not big enough?
  • Your disaster recovery plan will be so much easier because you are not locked down to a physical server
  • Your physical migration projects will be so much easier. Nothing is easier than storage vMotioning a VM to that brand new SAN box you bought.
The best bonus? You can back up your machines with Veeam B&R and, if needed, restore them in 2 minutes via Instant VM Recovery! I challenge you to do that with a cold standby physical server. The most extreme example I ever heard was a bank that had a dedicated physical server for certain VMs (1 VM = 1 physical machine). They didn't care about optimizing resources but rather wanted all these business continuity features.

But I have this physical SQL server:
Let it dump a backup on a CIFS/NFS share that is hosted by a virtual machine. Even better, stop duct taping your infrastructure and virtualize that load already. You are trying to patch an old setup instead of letting it evolve.

But I have this physical SQL server and I don't want a workaround:
Use file to tape to copy your MDF/LDF files or your backup file to tape on a daily basis. In fact, this is a feature not a lot of people know exists: Veeam actually allows you to back up files to tape from (physical) machines, even with VSS integration.

However, tape is not made to handle a lot of small files; it rather likes big files that stream nicely to tape. That is why Veeam doesn't back up VMs directly to tape but rather puts backup files on tape. So if you want to use it for a physical server, only do it for the big files that contain the data you care about. Those Windows DLLs are really not that useful.

The most painful aspect of this approach? What if the server crashes? You will need to reinstall it, reinstall the application, import the data and hope it all works! Personally, I wish you good luck!

Or you could just virtualize the load and even test whether your backups can be successfully restored with SureBackup. In fact, for some enterprise customers, SureBackup is one of the main reasons they switch to Veeam. They have the requirement to do recovery tests, whether by law or by company policy. However, they have neither the storage nor the manpower to do it on a monthly basis. Well, guess what: Veeam can do it automatically for you without requiring any extra storage. That is what I call being enterprise ready.

When will you release a Veeam Explorer for Exchange 2003? (and other legacy application questions):
Well, this is the bad thing about virtualization: it has allowed legacy applications to stay around for waaaaaaaaaay too long. With physical servers, migrating the workload was often combined with executing an upgrade (add a new node to the domain, then demote the old node). Unfortunately, vMotion has made it so easy to migrate that people are just moving the VMs to newer physical servers.

Even worse, when people start virtualizing they use VMware Converter to P2V. Honestly, these are the worst migrations. You have no idea if things will work correctly afterwards, you get stuck with IDE drives, and you are duct taping again instead of evolving. I always tell customers to do a clean installation of the OS and application, then export and import the data (or add and demote). The one-time effort will be bigger, but later on you will reap the benefits.

So now, once in a while, I get the "is it supported" question about Exchange 2003 or SQL 2000. What about them? Why doesn't Veeam support them? Well, actually, the real question is: why are you still running them? Even Microsoft doesn't support the application itself, so why should a third-party tool support it?
But the real kicker is that you can actually back up these machines with Veeam. SQL 2000, for example, can use the MSDE writer, and you can use file level recovery or U-AIR to recover data. Is it the best solution? Of course not! So stop duct taping your infrastructure. If you still run SQL 2000, you are running on 14-year-old technology. I know administrators are lazy, but that means you haven't done any work on that machine for more than a decade...

The worst argument I heard was "Yes, but our application requires SQL 2000". Please replace your application then and stop duct taping! If the vendor doesn't support SQL 2008 or 2012, it is not worth your time. I'm sorry, but it's true!

What about tape?
We have it in v7. Did I get excited when Veeam announced tape support? Not really, but a lot of customers were asking for it because they need cheap long-term retention media or because they have "company requirements that state...". Well, sometimes you need to evolve your requirements. But anyway, Veeam has it now, making the product more enterprise ready!

Scalability?
Sometimes I see customers complaining that Veeam doesn't scale well. If I ask them about proxies, they have never heard of them or have only installed one. If I ask them about backup storage, they have a cheap NAS solution with 5 disks in RAID 6. Well, I'm sorry to tell you: your backup storage is the bottleneck, not Veeam.

Veeam is an easy solution but in enterprise environments, you really need to think about architecture, even for Veeam. If you want fast backups, you need the hardware to support it.

Besides, in many enterprise environments that use legacy agent-based backups, Veeam will drastically reduce backup windows. There are even extreme examples of environments going from a 48-hour backup window (which doesn't even fit in one day) to 2 hours. That's what I call enterprise ready.

Mixed environments: What about my AS400 or HPUX?
This argument I see a lot. Often a customer has one legacy device running a core application. Somebody set it up 20 years ago, nobody knows how it works, but it is really critical. Well, maybe it is time to migrate that bad boy to an x86 VM. Yes, I know it is expensive, but you will have to do it one day.


Essentially, what you are doing is ignoring the problem and making it worse. Sure, add more duct tape and it will hold a bit longer. But one day it will fail, and it might be the end of your company. So before all the COBOL experts are dead, please migrate those critical applications while you still can.

Mixed environments: But I don't want to use 2 solutions
Well, this argument is the worst. Often people will argue they have to install 2 applications. Well, let's say you buy product X, which has a plugin for VMs. In this case you will have to install and configure the main application and the plugin. Veeam is so easy to install and configure that I bet you it is easier to set up than the plugin itself. In this case you will end up with:
  • Veeam for 95% of your machines
  • Product X for the remaining 5% of physical legacy machines, which you will migrate eventually anyway, and no plugin to configure
Also, from a daily management perspective, this won't create extra management. The time you would invest in configuring and maintaining that horrible plugin, you can use to manage Veeam.

In fact, Veeam can be configured to seamlessly back up all the VMs in your environment, because instead of selecting individual VMs, you can just back up resource pools, hosts, folders or datastores. When new VMs are created, they are automatically backed up by an existing job. What, automagically?! You heard that right. It means you can refocus your time on innovative stuff instead of duct taping your daily backups. In environments with hundreds of VMs, this is a real game changer.

So daily cost (OPEX/management) is not the issue. But what about licensing?

Well, all backup products are capacity based in some way (node, application, socket, TB, ...). Because you only need to license the 5% of your environment that is legacy, 95% of the budget is free to buy Veeam licenses. And boy, are you in luck! Veeam is affordable and doesn't have any hidden costs. You have 3 ESXi hosts? You license the ESXi sockets! That's it.
  • If you want to run 1 VM, 100 VMs or 1000 VMs on those hosts, Veeam doesn't care.
  • If you need to deploy new VMs to meet new business challenges, you don't have to worry about backup licenses; you will always be compliant.
  • You want to deploy 5 SQL servers instead of 1 to split the load? No need to count and report those nodes, because it doesn't matter. You have Exchange and/or SharePoint? That's okay, you don't need to license any applications separately.
That's what I call cloud and enterprise ready licensing. Far too often licensing will kill innovation, but no longer with Veeam. Some vendors have started making a sport of doing compliance checks. If they focused that effort on building a better product, Veeam might actually get some competition worth investigating.

What is the reality? CxO-level people look at "requirements" and would rather buy the "truck/van" than the Golf. The result? A lot of frustration and OPEX costs afterwards. RFPs and tenders are the worst tools ever for IT decision makers, because everybody can offer some form of "instant recovery", like bare metal recoveries. The reality is that these features are most of the time horribly implemented and just not working. Test driving Veeam really helps sell the product, because people just can't believe it actually does what it promises.

Finally, because Veeam is cut out for the job, it will do fast backups and offer advanced functionality that is just not possible with an agent-based approach. The reason is simple: Veeam is not a duct-taped backup solution but rather a modern data protection product. It can keep up with VMware (ESX) and Microsoft (Hyper-V) release cycles because that is its main focus.

So don't buy a product that focuses 95% of its effort on 5% of your environment, but rather buy one that focuses 100% on 95% of your environment and try to evolve that other 5%.

So is Veeam Enterprise Ready?
Well, Veeam is enterprise ready! If it can't fit in your environment, you need to rethink your architecture and evolve it so that it is up to par with the latest standards (or at least stop running 14-year-old technology).


So stop duct taping today and get ready for some real modern data protection!

2014/02/17

Veeam improves backing up VMs running on NFS

Working with Veeam Backup & Replication allows you to protect VMs in a very efficient way. Reading the data is done completely agentless, so all data is fetched via the hypervisor. For VMware, there are 3 ways of reading the data (transport modes):
  • Direct SAN: Reading the data at the block level, or in other words, reading the data directly from the VMFS volumes. This is the best method, as you don't consume any CPU or memory of the production environment.
  • Hot-Add or virtual appliance mode: Adding the VMDK of the VM you are trying to back up to a proxy VM. In this case your proxy must be a VM residing in the same cluster as the original VM. Notice that the proxy will consume CPU and memory of the production cluster, as it is a VM in the end. Most of the time this is not an issue because the backup process runs during the night.
  • NBD or network mode: Reading the data via the VMkernel port. This is the least efficient method. However, it is popular because:
    • It's super easy to set up. If the proxy can talk to the ESXi management kernel, it is set up.
    • It works well on 10 GbE networks.
Roughly, the order of preference for picking the most efficient transport mode with Veeam is SAN > Hot-Add > NBD, for obvious reasons.

When you are looking at NFS, there are only 2 options left: Hot-Add and NBD. However, due to NFS v3 locking mechanisms, customers are left with NBD as the only viable solution. What's the problem? Well, if the proxy and the VM you are trying to back up are not on the same host, you might experience stuns when the VMDK is released after the backup. This is not a Veeam problem. In fact, VMware has a nice KB article, which you can find here: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2010953

VMware suggests 2 solutions:
  • To work around this issue, ensure that the virtual machine resides in the same host where the backup appliance is installed.
  • Alternatively, you can use LAN/NBD transport that uses the NFC (Network File Copy) in your backup solution or disable SCSI hot-add through the backup software.
So most customers forced their proxies to use NBD: no stuns, but less efficient backups.

They could also make one job, tie it to one or two VMs and couple it to one proxy. Then, using DRS affinity rules, they could keep the VMs and the proxy together on one host. Needless to say, this approach is horrible from a management perspective.

With the release of patch 3 (v7), however, Veeam has worked around this VMware issue. If you have NFS (NetApp, Nutanix, etc.), please go to the patch page and download it now! :D

After you have installed it, you can use the following registry key:
EnableSameHostHotaddMode (DWORD): Intelligent load balancing can now be configured to give preference to a backup proxy located on the same host.

So you will need to enable the key in the registry. All our keys are stored under:
HKLM\SOFTWARE\Veeam\Veeam Backup and Replication

Create the DWORD value there and set it to 1.
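If you prefer the command line over regedit, the value can also be created with reg.exe from an elevated prompt on the backup server (a sketch; double-check the value name against the patch release notes for your version):

```bat
REM Create the DWORD value under the Veeam hive and set it to 1.
reg add "HKLM\SOFTWARE\Veeam\Veeam Backup and Replication" ^
    /v EnableSameHostHotaddMode /t REG_DWORD /d 1 /f

REM Verify the value was written:
reg query "HKLM\SOFTWARE\Veeam\Veeam Backup and Replication" /v EnableSameHostHotaddMode
```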

Once you have done this, you should create a proxy VM on each host in your cluster. Personally, I have always felt that Windows Core edition is perfect for this job: it installs quickly, doesn't consume any unnecessary resources, and the Veeam proxy doesn't have a GUI anyway. Also, disable DRS for this VM so it stays on its corresponding host. Alternatively, install it on a local drive. Most servers have a local disk that you are not using because you want your VMs to be mobile; in this case, staying put is exactly what you want.

I also wondered what would happen if the proxy is already busy with another task. The feedback I got is that Backup & Replication will wait for the same-host proxy to become available, as long as there is a Hot-Add-capable proxy online. Only if there is no Hot-Add proxy, or they are all offline, will Veeam switch over to NBD.

2014/02/10

Linux File Level Recovery: mounting instead of restoring!

If you have ever worked with the "other OS" file level restore wizard in Veeam, you know that restoring to the original server requires you to add that server to the console. This is because Veeam is agentless, but it needs to be able to talk to that server in an efficient way.

A couple of weeks ago I got the question whether you could mount the files without copying them. It's a typical problem if you have an in-guest application making its own backups and you don't want to copy everything, but just extract a bit of data. I was thinking about a helper appliance that would use Instant VM Recovery to attach the backed-up disks to recover files. It would then export the VMDK via NFS in-guest so that the original server could mount it.

This seemed like a big workaround, so I thought: let's see if I can export it from the FLR appliance itself. Well, it's not possible, since the appliance uses BusyBox and there is no NFS server in there.

After looking around, I came up with a cool workaround (well, it got me excited, but I'm just a techie). Since the FLR appliance does open up SSH, you can actually use SSHFS, a filesystem driver over SSH.

Before we start, know that, starting from version 7, you can set the password for the FLR appliance in the credentials manager. Just go to the main menu (blue button in the top left corner) and choose "Manage Credentials".


Once you have started the FLR wizard, you should now be able to log in. All partitions shown in the Linux FLR appliance are mounted under /media. In my case, the partition (or logical volume) I am interested in is under /media/dm-0, but you can just mount /media if you don't want to log in to the appliance.

Finally, write down the IP of the FLR appliance. You can see it via the VMware Tools summary; this is handy when you used DHCP like I did. In my case it is 172.20.2.66.

Now, on your main system, you need to have SSHFS support installed (as part of the FUSE libraries). This depends on the Linux version you are running. I'm running CentOS and needed to install the EPEL repo. I "borrowed" some of the info from this blog: http://www.centosblog.com/sshfs-how-to-mount-remote-partition-via-ssh-on-centos/

If you don't want to read it, here are the commands to add the EPEL repo (find the correct EPEL RPM here: https://fedoraproject.org/wiki/EPEL):

yum install wget -y
cd /tmp
wget http://epel.mirror.nucleus.be/6/i386/epel-release-6-8.noarch.rpm
yum install epel-release-6-8.noarch.rpm

After it is added, you can add the sshfs and fuse library

yum install fuse sshfs
modprobe fuse

I created a directory /recovery for mounting the FLR directory. The mount can now be done via the sshfs command:

mkdir /recovery
sshfs root@172.20.2.66:/media/dm-0 /recovery

After this, you should be able to see the files and directory structure you have in your FLR window:

When you are done, just use the umount command to unmount the volume. Make sure you are no longer browsing the directory first.

umount /recovery

While testing, I used rsync to copy a lot of small files to see how fast it went (slow handling of many small files was a typical problem in v6.5, but not in v7). It seems to work like a charm!


2014/01/29

What the cloud?!!

Recently I have noticed a lot of people using Veeam B&R for "cloud" solutions. Cloud, for me personally, has become something indefinable, because people are using it for everything that is even remotely connected to the Internet (literally and figuratively speaking).

For Veeam users, "cloud" seems to be the challenge of getting the data to a partner or a second location. But the possibilities here are not one-size-fits-all. Furthermore, the naming is sometimes a bit confusing -I admit- and causes people to use the terms incorrectly. So here is my attempt to clarify the different possibilities and the naming.

First of all, I made this diagram to clarify some things. I suggest you take a look before reading on.

Replication Job (Part of Backup & Replication)

The replication job is what I have started calling "hypervisor replication", as it is comparable to "storage replication" at the SAN level. With replication, Veeam will back up a VM and restore it on-the-fly to a second vSphere (or Hyper-V, but not cross-hypervisor) environment. The effect is that data is stored directly at the hypervisor volume level (VMFS). The great advantage is that, because you are replicating at a different layer in the infrastructure stack, Veeam is hardware independent, which is the promise of virtualization in the first place. Starting up the VM doesn't require you to implement a solution or scripting to resignature LUNs or re-register VMs. The VMs are just there, ready to be started. The Veeam Backup & Replication console can assist you in changing the network settings to match the target side. Failover and failback can also be done from the console; in this case, failback will only need to sync back the changes.

The great thing about hypervisor replication is that, while copying, Veeam can make the data consistent (VSS consistency). You can even have multiple restore points, so that you don't have to fail over to the latest replica (which might contain the initial corruption). These restore points are implemented as snapshots at the target side.

For replication it is advised to have a proxy at both sides. These will set up a stable TCP/IP connection to each other to transfer the data. The primary proxy will do disk-level deduplication and some compression before sending the data over the wire to the receiving proxy. That proxy will "unpack" the data and inject it back into the hypervisor environment. So for replication, Veeam doesn't offer WAN acceleration, but it does offer WAN optimization. This is important, as some solutions sell WAN optimization as WAN acceleration.

So replication is mostly used for DR and does not create backup files, and although multiple restore points are possible, they are mostly limited by VMware (28 snapshots) and disk space. What I have noticed recently is that people use the term replication to describe the backup copy job.

Backup Job (Part of Backup & Replication)

The backup job is what people get right 99% of the time. It is a job that will take the data and store it in the Veeam proprietary format. Why not store it in the native format? Well, it is all about disk savings!

With Veeam there are 2 strategies: (forward) incremental and reverse incremental. Both have their advantages and disadvantages. This is a sometimes overlooked setting which can have an impact on offloading jobs. These are also job-level settings, so if you look in a repository, each job will create a folder which contains the proprietary backup chain.

A job can have multiple VMs, and this is good because deduplication is done at the level of the job's backup chain files. So try to group similar VMs together to get better storage savings.

With forward incremental, you start the first day by creating a full backup of all VMs and storing that data in a VBK file. Inside this file, compression and deduplication are applied. The next day, an incremental VIB file is created, only saving the data that has changed. The process continues the following days and more VIB files are created.

There is a catch, however: these VIB files are linked to each other and to the first full. So imagine you set up a retention policy of 3 restore points: you can't throw away anything after 4 backups, because the last VIB is still dependent on that first full. To solve this issue, Veeam forces you to run an active full or a synthetic full, effectively discontinuing the old chain and starting a new one. Active full means you read all the data from the production storage again and create a new full VBK, like you did the very first time. Synthetic full means you first run an incremental backup; then, as a post process, a full VBK is created based on the data that is already in the repository, allowing you to create forever incremental backups. This post process is quite I/O intensive, as it needs to read and write all the data. You could therefore spread the load of creating these synthetic fulls over multiple days by creating multiple jobs and letting one job run its synthetic full on Friday, another on Saturday, another on Sunday, etc.
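To make the retention catch concrete, here is a toy shell sketch. The file names and the retention number are made up for illustration; this is not how Veeam names or deletes files internally:

```shell
#!/bin/sh
# Toy model: forward incremental chains, retention = 3 restore points.
# A VIB depends on everything before it back to its VBK, so an old chain
# can only be deleted once a NEWER chain alone satisfies the retention.

chain1="day1.vbk day2.vib day3.vib day4.vib"   # first chain: 4 restore points
chain2="day5.vbk day6.vib day7.vib"            # a (synthetic) full starts chain 2

count() { echo $#; }

# With only chain1 on disk, nothing is deletable: day4.vib needs day1.vbk.
# Once chain2 holds >= 3 restore points on its own, chain1 can go entirely.
if [ "$(count $chain2)" -ge 3 ]; then
  echo "deletable: $chain1"
else
  echo "deletable: (nothing)"
fi
```

Running it prints the whole first chain as deletable, which is exactly the "retention of 3 but 7 files on disk" situation described above.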

With reverse incremental, the strategy is different. The first day you create a VBK file, just like with forward incremental. The second day, however, you update this VBK file. I always explain it as "copy-on-write": before Veeam changes a block in the VBK, it first reads that block, stores it in a VRB (reverse increment) file, and then overwrites it. The effect is that you get incremental backups, but your latest backup is always the full backup; in effect, you create a reverse backup chain. So, if you configured 3 restore points, you can throw away stuff after the 4th run, because nothing depends on the oldest VRB file. Rather, the last VRB depends on the newer VRB/VBK files. With reverse incremental you can only run an active full, as a synthetic full wouldn't make sense. This will also discontinue the chain, as it creates a new VBK from scratch, leaving the old one in place.

Veeam support has made some great animations to make everything a bit clearer, if you got lost in translation: http://www.veeam.com/kb1799

I always advise people to run an active full every month, refreshing all the data and making sure that a single corruption does not live on forever in your incremental backups. If you have multiple jobs, you can spread the load over multiple weekends: one job creates a full the first week, another one the second week, etc.

So when to use what? Use reverse incremental if you only back up to disk and optionally use the backup copy job. It saves space, as it doesn't force you to create weekly (synthetic) fulls. However, if you want to copy the data, the copy process will have to process a full VBK on a daily basis, because Veeam updates the VBK. The only exception is the backup copy job, because it has some built-in intelligence, being native to Backup & Replication.

Furthermore, the copy-on-write process requires more I/O because of the read/write/update cycle. Finally, this I/O is fairly random, albeit with a relatively big block size, so some deduplication devices might not like it so much. In those cases, use the default forward incremental, sacrificing disk space but only needing to copy a VBK once a week with any other copy process.

Backup Copy Job (Part of Backup & Replication)

First of all, the backup copy job is not replication :D. I can't stress this enough. The backup copy job does more or less what its name suggests, except that it doesn't: it copies backup data, but not the VBK/VIB/VRB files themselves. This is confusing for people, but it adds great flexibility. When you create a new backup copy job and link a primary job, it won't actually link the primary job, but rather add all the VMs that you configured in the primary job to the backup copy job. When the backup copy job runs, it looks for restore points for each individual VM in all backups, copies the latest restore point and creates its own backup chain in the second repository. This looks a bit like forward incremental, but once the retention policy is fulfilled, the oldest increment (VIB) is rolled into the VBK file. This is sometimes confusing when people use reverse incremental on the primary job, as they don't see VRB files but VIB files. The great thing is that the backup copy job is granular: you can pick out only a subset of VMs to copy.

A GFS policy is part of the backup copy job because you could use it to tier between fast and slower storage for longer retention. In this case, it is important to understand that the backup copy job will run almost in sync with the primary job. So if you configure 7 restore points on the primary job and 14 on the copy job, you won't have 21 unique restore points. Rather, you will have 7 restore points that have a duplicate twin because of the backup copy, plus 7 older unique restore points once the backup copy job has synced.
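The arithmetic, using the illustrative retention numbers from the example above:

```shell
#!/bin/sh
# Primary job keeps 7 restore points; the copy job keeps 14 (example numbers).
primary=7
copy=14

shared=$primary                      # the newest 7 points exist on both sides
older_unique=$(( copy - primary ))   # 7 older points live only on the copy target
total_unique=$(( shared + older_unique ))

echo "$total_unique"                 # 14 unique restore points, not 21
```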

The great thing about the backup copy job is that it is the first and only job in v7 that can make use of WAN acceleration (so not only optimization). This is done by adding a WAN accelerator role (a Windows service) at both sides. These WAN accelerators keep a global cache which can be shared by multiple backup copy jobs and multiple runs. Furthermore, fingerprinting is done at a very small block level, to give you real WAN acceleration. The downside is that processing takes longer, which is why it is part of the backup copy job and not of the primary job.
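To get a feel for what block-level fingerprinting buys you, here is a toy sketch. The 4-byte block size and the naive deduplication are made up for illustration; Veeam's actual block sizes and cache design are of course different:

```shell
#!/bin/sh
# Pretend this 16-byte file is the data to ship over the WAN.
printf 'AAAABBBBAAAACCCC' > /tmp/wan-demo.bin

# Cut it into fixed-size "blocks" (4 bytes here) and fingerprint each one.
blocks=$(fold -w4 < /tmp/wan-demo.bin)

total=$(printf '%s\n' $blocks | wc -l)
unique=$(printf '%s\n' $blocks | sort -u | wc -l)

# Only blocks not already seen (i.e. not in the cache) cross the wire.
echo "send $unique of $total blocks"
```

Because one block repeats, only 3 of the 4 blocks would need to be sent; with a global cache shared across jobs and runs, the savings compound over time.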

An important thing about the backup copy job is that the target is a repository. So if you want to copy the data to another location, Veeam needs to have a data mover/agent at the other side. This agent can run on a Linux or a Windows server. However, if you want to use WAN acceleration, you will need a Windows server, as this service is Windows x64 only. The good news is that you can have both the WAN accelerator and the repository role installed on the same Windows machine. Furthermore, the source backup infrastructure will need to be able to talk to this Windows server over IP, so some kind of VPN/MPLS solution has to be in place.

The backup copy job and cloud backup are more or less complementary "jobs"; however, the backup copy job requires you to have CPU, memory and storage at the "cloud side". Cloud backup does not, but then again it can't do any WAN acceleration.

Backup to Tape (Part of Backup & Replication)

I am going to be fairly short about this one. It does exactly what you think it does: it copies the backup files (VBK/VIB) to tape. Here it makes sense to use forward incremental if you do a daily backup to tape. With reverse incremental, it has to copy a full VBK file to tape each day, potentially consuming a lot of tapes.

An important notice: Veeam will be restore point aware if you use backup to tape. That means it knows where which restore point for which VM is located. However, since it copies VBK/VIB files, restoring means it first has to stage those files to a repository before it allows you to do the final restore. This happens automagically, but it might take time. It is one good reason to split up jobs: restoring a VM of 50GB that is stored in a VBK of 4TB (because you added a lot of VMs to the job) requires first staging the whole VBK (and additional VIBs if required).

If you are thinking about vaulting, you can create a second media pool/backup to tape job next to your local backups to tape. This job can be configured so that at the end of the run it will export the tapes to the I/O slots, ready for some field engineer to get them physically out of the library and bring them to a remote location.

Cloud backup (Not part of Backup & Replication)

Cloud backup is part of the Cloud Edition Veeam offer. It is an add-on solution which cannot be bought separately. When you buy Cloud Edition, you get Backup & Replication plus cloud backup. Also, you don't really buy Cloud Edition but rather rent it on a yearly basis (subscription fee). This makes Cloud Edition a good OPEX solution instead of a buy-it-all-at-once CAPEX solution.

Basically what Cloud Edition does is copy backup files (VBK/VIB) to the cloud. In this case, "cloud" can be Amazon, Azure, HP or even local players offering OpenStack Swift storage. What is important is that Cloud Edition talks to storage APIs to store bits and bytes. This means those APIs don't need to be Veeam aware, as it doesn't process those bytes but just stores them. So no Veeam components need to be installed at the other end, and there are already a lot of partners offering potential "Veeam backup storage space". Furthermore, since the APIs are HTTP based (mostly some form of S3), no complicated VPN has to be set up for each customer. Finally, solutions like OpenStack are multi-tenant at their core, so a partner does not need to set up different OpenStack servers for each customer.

One of the cool features in Cloud Edition is that you can also use encryption. In this case the backups will be encrypted before they are sent to the cloud.

The downside is that features like WAN acceleration are not possible, as Veeam cannot install any service at the other side. This means that copying might consume more bandwidth. Furthermore, an FLR can't be done in the cloud by importing the backups there; you need to copy back the whole VBK and import it back into Backup & Replication. For Amazon there is a scenario where you can use an EC2 instance and then do an extract in the cloud.

With cloud backup it is also advised to use forward incremental instead of reverse incremental.

Again, backup copy jobs and cloud backup are more or less complementary "jobs". However, the backup copy job requires you to have CPU, memory and storage at the "cloud side". Cloud backup does not, but then again it can't do any WAN acceleration and does require a different license.


2014/01/14

Miniblog: Powershell Out-Gridview cmdlet

Normally I'm not so enthusiastic about a simple PowerShell cmdlet, but in this case I wonder why I didn't find it earlier.

With PowerShell you end up with arrays of objects all the time. Visualizing them on the console can be tricky. Worse, if you want to select a certain object, you need to display them all and then let the user input some number that corresponds to the object you want to select. Most of the time I end up using an in-line Where clause to filter out objects, but that is not really dynamic.

That was until this morning, when I discovered Out-GridView. This cmdlet takes a stream of objects and shows them in a GUI grid view. If you use the -PassThru parameter it even allows you to select an object. To give you an example, here is a "process killer" script in 1 line of code:

Get-Process | Out-GridView -PassThru -Title "Kill me now" | Stop-Process
When you run it, a grid view will appear.

You can then use the filter box to filter for, say, notepad. If you select notepad and click OK, the selected object(s) are sent down the pipeline, where the processes are stopped.
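The same trick works for any object stream; for example, here is a one-liner service restarter along the same lines (like the process example, it needs an interactive session since Out-GridView opens a window):

```powershell
# Pick one or more services in the grid, click OK, and the selection
# is piped straight into Restart-Service
Get-Service | Out-GridView -PassThru -Title "Restart services" | Restart-Service
```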

2014/01/10

Track my Veeam Restore Point

When using Veeam Backup & Replication, you will see that backup files are created at job level. The good thing is that this introduces job-level dedup. However, when you look at the backup files, it is hard to identify which file includes which VM. That's not really a problem, as the Veeam DB keeps track of this.

But what if you want to copy restore points with another tool? Some companies still run 2 backup products: one for physical servers, one for Veeam. Typically the first application also has the tape library connected. Or what if you just want to copy your backups to the cloud? How are you supposed to know which VM is stored in which files if it is no longer being tracked by Backup & Replication?

Well, this pet project might help a bit. First of all, I just wrote it and it is not tested. The reason is simply not to waste too much time developing something nobody uses. For me the goal was rather to learn some more about PowerShell (I am planning to do some PowerShell webinars soon, so I need to practice and get some cool examples made). So remember:
  • It's not tested, be careful with it
  • It potentially can generate a lot of data because it is not optimized
  • It won't truncate any data :)
  • Enjoy!
You can find it here

In essence, this PowerShell script just queries the restore points that are currently active on the Veeam backup server. Then it figures out which files are linked to them and inserts that data in a SQL db.
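A stripped-down sketch of that flow looks like this (the table and column names here are illustrative, not necessarily what the script uses; the connection details match the defaults below):

```powershell
# Load the Veeam snap-in so we can query restore points
Add-PSSnapin VeeamPSSnapIn

# Open the tracking database
$conn = New-Object System.Data.SqlClient.SqlConnection(
    "Server=localhost;Database=rpdb;User Id=rpdbuser;Password=rpdbpass")
$conn.Open()

# Insert one row per restore point currently known to the backup server
foreach ($rp in Get-VBRRestorePoint) {
    $cmd = $conn.CreateCommand()
    $cmd.CommandText = "INSERT INTO restorepoints (vmname, created, rptype) " +
                       "VALUES (@vm, @created, @rptype)"
    [void]$cmd.Parameters.AddWithValue("@vm", $rp.VmName)
    [void]$cmd.Parameters.AddWithValue("@created", $rp.CreationTime)
    [void]$cmd.Parameters.AddWithValue("@rptype", $rp.Type.ToString())
    [void]$cmd.ExecuteNonQuery()
}
$conn.Close()
```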

So the first thing you need to do is create a database in your SQL Server instance. The default settings are:
  • instance: localhost
  • db : rpdb
  • user : rpdbuser
  • password : rpdbpass
In my test setup I made rpdbuser the db owner of rpdb. You can change these parameters in the script.
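If you want to script that database setup too, something like this should do it (sqlcmd ships with SQL Server; the names match the defaults above, and your instance needs mixed-mode authentication for the SQL login to work):

```powershell
# Create the database, a SQL login, and make that login db owner
$sql = @"
CREATE DATABASE rpdb;
CREATE LOGIN rpdbuser WITH PASSWORD = 'rpdbpass';
USE rpdb;
CREATE USER rpdbuser FOR LOGIN rpdbuser;
EXEC sp_addrolemember 'db_owner', 'rpdbuser';
"@
sqlcmd -S localhost -Q $sql
```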
Make sure you have the database set up. Then you can run the script with the -createdb flag; it should create all the tables.
From that moment you can start collecting restore point metadata. If you run the script without a filter, all the restore points at that moment will be inserted into the db. If you do this daily, you will see duplicates popping up. This is certainly the case for forward incremental. With reverse incremental, however, it makes sense to have duplicate restore points, as they have probably shifted filenames.

To get more control over what is stored and what is not, I recommend using filters to only store the restore point data that you are shipping off site, and running the collect script as a post-job step or after your copy is done by the third-party application.

Collecting data is done with the -collect switch. However, I added a -dry switch so that you can test before you let the PowerShell script insert data into the db. Here are some examples of filtering:

  • No filter: all restore points currently on the backup server are stored
  • -Collectfiles "file1.vbk,file2.vib": only stores the restore points related to backup files you might have shipped
  • -Collectonlyfull: only stores info about full VBKs
  • -Collectjob "jobname": makes sense if you run it as a post-script on a job
  • Or just a combination of -Collectjob and -Collectonlyfull
If you are happy with the result, remove the -dry switch and let it run whenever you want to track the state of your restore points.
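Putting the switches together, a run could look like this (I'm assuming the script was saved as track-rp.ps1; pick your own name):

```powershell
# Dry run first: show what would be inserted for the full restore points of one job
.\track-rp.ps1 -collect -dry -Collectjob "Daily Backup" -Collectonlyfull

# Happy with the output? Drop -dry and collect for real
.\track-rp.ps1 -collect -Collectjob "Daily Backup" -Collectonlyfull
```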

Then you can use the -vm parameter and the -begin and -end parameters (e.g. -vm sharepoint -begin "2013-01-01" -end "2014-02-01") to find a restore point for a certain VM. Remember, the date is not when you tracked it but rather when the restore point was created. If you find the restore point you like, you can use the -id parameter to understand which files you need to get back to the repository to restore a certain file.

Finally, could you use this functionality in combination with files to tape instead of backup to tape? You could, but it still requires a lot of manual setup, manual querying and manual restoring. So if you can, let Veeam track everything and you will have nothing to do and nothing to worry about setting up!

2013/12/19

1-Click Veeam Install on Windows 2012

With v7 there were a lot of improvements. One of those is the ability to do a silent or unattended install of Veeam Backup & Replication. There is actually a good KB article that describes how to install each individual component, but it requires you to figure out the dependencies yourself, or how to install SQL Express. You can find it here.

One of the interesting scenarios for an automated install is when you want to install a backup server at each branch office to use as a "restore" installation, as described in my previous article "after the backup copy job the auto import". For me the trigger was actually a partner that just wanted an installer he could use at a customer to set up a very simple PoC very quickly.

After some consideration, I decided to install everything with PowerShell. Not only is it the preferred language for Windows 2012 scripting, it is also your primary scripting tool for Veeam Backup & Replication. The target platform is 2012, so I have no idea if my findings will work on 2008 and so on (for example, the script will enable .NET 3.5 on Windows 2012). So feedback is really appreciated.

You can find the automation script here

Now there are some dependencies that you need to fulfill before you can use it:
  • Create a Windows 2012 machine
  • Add it to a domain
  • Create a user srvveeam in your domain (or change the param section at the start of the script)
  • Download the Veeam Backup & Replication ISO. Then extract the following folders into a new empty folder: Backup, Catalog, Explorers and Redistr. Then add the script, the bat file and your license (veeam_backup.lic) to that same folder.
  • Give the srvveeam user the required permissions for backup and replication in vCenter. You can find the fine-grained permissions in the doc section of Veeam. Also alter the vCenter address in the script.
  • I recommend scrolling through it and fine tweaking it to your environment.
Now installation is quite easy: right-click the runmeasadmin bat file and run it as an administrator.

Installation should start and install all necessary requirements for Windows 2012, including SQL Express. If one of the components does not install successfully, the script should fail. If something is already installed, the script should skip that specific item.
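The core of the install sequence boils down to a few silent installers. This is only a sketch: the SQL setup switches are the standard SQL Server ones, but the instance name, MSI filename and MSI property names are assumptions you should verify against the Veeam unattended install KB and your own ISO:

```powershell
# Enable .NET 3.5, required by SQL Express on Windows 2012
# (may need -Source pointing at the sources\sxs folder of the Windows media)
Install-WindowsFeature NET-Framework-Core

# SQL Express, fully silent; instance name is an assumption
.\Redistr\SQLEXPR_x64_ENU.exe /Q /ACTION=Install /FEATURES=SQLEngine `
    /INSTANCENAME=VEEAMSQL /SQLSYSADMINACCOUNTS="BUILTIN\Administrators" `
    /IACCEPTSQLSERVERLICENSETERMS

# Veeam server MSI, silent; adjust the MSI path/name to what is on your ISO,
# and check the property names against the Veeam KB before trusting them
$serverMsi = ".\Backup\Shell.x64.msi"
msiexec /qn /i $serverMsi ACCEPT_EULA=1 `
    VBR_LICENSE_FILE="$PSScriptRoot\veeam_backup.lic" `
    VBR_SERVICE_USER="DOMAIN\srvveeam" VBR_SERVICE_PASSWORD="..."
```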

 

A cool thing about using PowerShell is that you can load the PowerShell snap-in right after you install it. This means you can autoconfigure Veeam, for example adding proxies or repositories to the new installation automatically. This script will do the following in B&R after installation:
  • Create credentials
  • Create a new repository
  • Add vCenter
  • Create a single job that includes all the VMs (maybe test and tweak before you use it in your 1000 VM environment ;)). It will use the credentials created before to enable Application Aware Image Processing.
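Those four post-install steps look roughly like this. The cmdlet names below come from the Veeam snap-in documentation and may differ slightly in v7 (for instance, registering vCenter used to be done with Add-VBRServer); the server names, paths and passwords are placeholders:

```powershell
# The snap-in becomes available once Backup & Replication is installed
Add-PSSnapin VeeamPSSnapIn

# 1. Credentials, later used for Application Aware Image Processing
Add-VBRCredentials -User "DOMAIN\srvveeam" -Password "..." `
    -Description "Veeam guest processing"

# 2. A repository on a local disk
Add-VBRBackupRepository -Name "LocalRepo" -Folder "D:\Backups" -Type WinLocal

# 3. Register the vCenter server
Add-VBRvCenter -Name "vcenter.domain.local" -User "DOMAIN\srvveeam" -Password "..."

# 4. One catch-all job pointed at the whole datacenter
$entity = Find-VBRViEntity -Name "Datacenter"
Add-VBRViBackupJob -Name "All VMs" -Entity $entity `
    -BackupRepository (Get-VBRBackupRepository -Name "LocalRepo")
```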

But of course your imagination is your only limitation when using PowerShell :D.