2014/01/29

What the cloud?!!

Recently I have noticed a lot of people using Veeam B&R for "Cloud" solutions. Cloud for me personally has become something indefinable because people are just using it for everything that is even remotely connected to the Internet (literally and figuratively speaking).

For Veeam users, "cloud" seems to mean the challenge of getting the data to a partner or a second location. But there is no one-size-fits-all solution here. Furthermore, the naming is sometimes a bit confusing (I admit) and causes people to use the terms incorrectly. So here is my attempt to clarify the different possibilities and the naming.

First of all I made this diagram to clarify some stuff. I suggest you take a look before continuing reading.

Replication Job (Part of Backup & Replication)

A replication job is what I started calling "Hypervisor Replication", as it is comparable to "Storage Replication" at the SAN level. With replication, Veeam will back up a VM and restore it on the fly to a second vSphere environment (or Hyper-V, but not across hypervisor types). The effect is that the data is stored directly at the hypervisor volume level (VMFS). The great advantage is that, because you are replicating at a different layer in the infrastructure stack, Veeam is hardware independent, which is the promise of virtualisation in the first place. Starting up the VM doesn't require you to implement a solution or scripting to resignature LUNs or re-register VMs. The VMs are just there, ready to be started. The Veeam Backup & Replication console can assist you in changing the network settings to match the target side. Failover and Failback can also be done from the console. In this case, Failback only needs to sync back the changes.

The great thing about hypervisor replication is that, while copying, Veeam can make the data consistent (VSS consistency). You can even keep multiple restore points so that you don't have to fail over to the latest replica (which might contain the initial corruption). These restore points are implemented as snapshots at the target side.

For replication it is advised to have a proxy at both sides. These will set up a stable TCP/IP connection to each other to transfer the data. The primary proxy will do disk-level deduplication and some compression before sending the data over the wire to the receiving proxy. That proxy will "unpack" the data and inject it back into the hypervisor environment. So for replication Veeam doesn't offer WAN Acceleration but does offer WAN Optimization. This is important as some solutions sell WAN Optimization as WAN Acceleration.

So replication is mostly used for DR and does not create backup files. Although multiple restore points are possible, they are mostly limited by VMware (28 snapshots) and disk space. What I noticed recently is that people are using the term replication to describe the backup copy job.

Backup Job (Part of Backup & Replication)

The backup job is what people get right 99% of the time. It is a job that will take the data and store it in the Veeam proprietary format. Why not store it in the native format? Well, it is all about disk savings!

With Veeam there are 2 strategies: (Forward) Incremental and Reverse Incremental. Both have their advantages and disadvantages. This is sometimes an overlooked setting which can have an impact on off-loading jobs. These are also job level settings and so if you look in a repository, each job will create a folder which contains the proprietary backup chain.

A job can have multiple VMs, and this is good because deduplication is done at the level of the job's backup chain files. So try to group similar VMs together to get better storage savings.

With Forward Incremental you will start the first day by creating a full backup of all VMs and store that data in the VBK format. Inside this file/format, compression and deduplication are applied. The next day an incremental VIB file is created, only saving the data that has changed. The process continues the following days and more VIB files are created. There is a catch, however: these VIB files are linked to each other and to the first full. So imagine you set up a retention policy of 3 restore points: you can't throw away anything after 4 backups because the last VIB still depends on that first full.

To solve this issue, Veeam forces you to run an active full or a synthetic full, effectively discontinuing the old chain and starting a new one. Active full means you will just read all the data from the production storage and create a new full VBK like you did the very first time. Synthetic full means that you will first run an incremental backup. Then, as a post process, you will create a full VBK based on the data that is already in the repository, thus allowing you to create forever incremental backups. This post process is quite I/O intensive as it needs to read and write all the data. So you could, for example, spread the load of creating these synthetic fulls over multiple days by creating multiple jobs and letting one job run its synthetic full on Friday, another one on Saturday, another on Sunday, etc.
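To make the dependency problem a bit more tangible, here is a minimal Python sketch of a forward incremental chain. It is only a model of the idea described above, not how Veeam implements retention; the file names and the function name are made up for the illustration.

```python
def forward_incremental_deletable(chain, keep):
    """Return the files that can be deleted from a forward incremental chain.

    chain: file names in creation order (hypothetical names);
           a ".vbk" is a full, a ".vib" is an increment.
    keep:  number of newest restore points that must stay restorable.
    """
    needed = set()
    # Every restore point we want to keep needs all files back to its full.
    for idx in range(max(0, len(chain) - keep), len(chain)):
        j = idx
        while j >= 0:
            needed.add(j)
            if chain[j].endswith(".vbk"):
                break  # reached the full this increment depends on
            j -= 1
    return [name for i, name in enumerate(chain) if i not in needed]


first_runs = ["day1.vbk", "day2.vib", "day3.vib", "day4.vib"]
print(forward_incremental_deletable(first_runs, keep=3))
# [] : 4 files for 3 restore points, yet nothing can go

after_new_full = first_runs + ["day5.vbk", "day6.vib", "day7.vib"]
print(forward_incremental_deletable(after_new_full, keep=3))
# ['day1.vbk', 'day2.vib', 'day3.vib', 'day4.vib']
```

Only once the new chain holds enough restore points on its own can the whole old chain be dropped in one go, which is why this method temporarily uses more disk space than the retention setting suggests.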

With Reverse Incremental the strategy is different. The first day you will create a VBK file, like with Forward Incremental. The second day, however, you will update this VBK file. I always explain it as "copy-on-write". Before Veeam makes a change to the VBK data, it will first read that data, store it in a VRB (Reverse Increment file) and then overwrite it. The effect is that you get incremental backups but your latest backup is always the full backup. In effect you create a reverse backup chain. So in case you configured 3 restore points, you can throw away the oldest file after the 4th run because nothing depends on the oldest VRB file. Rather, each VRB depends on the newer VRB/VBK files. With Reverse Incremental you can only run an active full, as a synthetic full wouldn't make sense. This will also discontinue the chain, as it will create a new VBK from scratch, leaving the old one in place.
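The same kind of toy model for reverse incremental shows why retention is simpler here. Again, this is just a sketch under my own assumptions (one run per day, hypothetical file names), not Veeam's real logic.

```python
def run_reverse_incremental(days, keep):
    """Simulate daily reverse incremental runs.

    Returns (files_on_disk, deleted_files), both oldest first.
    The newest restore point is always the full (.vbk); older points
    are reverse increments (.vrb) that depend on the newer files.
    """
    vrbs = []      # reverse increments, oldest first
    deleted = []
    for day in range(2, days + 1):
        # The blocks about to be overwritten in the full are saved to a VRB.
        vrbs.append(f"day{day - 1}.vrb")
        # Restore points on disk = all VRBs + the full itself.
        while len(vrbs) + 1 > keep:
            deleted.append(vrbs.pop(0))  # nothing depends on the oldest VRB
    return vrbs + [f"day{days}.vbk"], deleted


print(run_reverse_incremental(days=4, keep=3))
# (['day2.vrb', 'day3.vrb', 'day4.vbk'], ['day1.vrb'])
```

After the 4th run the oldest VRB goes away immediately, so the number of files on disk matches the retention setting.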

Veeam Support has made some great animations to make everything a bit more clear, if you got lost in translation: http://www.veeam.com/kb1799

I always advise people to run an active full once a month, refreshing all the data and making sure that a single corruption does not live on forever in your incremental backups. If you have multiple jobs, you can spread the load over multiple weekends: one job can create a full the first week, another one the second week, etc.

So when to use what? Use Reverse Incremental if you only back up to disk and optionally use the backup copy job. It will save space as it doesn't force you to create weekly (synthetic) fulls. However, if you want to copy the data with another tool, the copy process will have to process a full VBK on a daily basis because Veeam updates the VBK on every run. The only exception is the backup copy job, because it has some built-in intelligence as it is native to Backup & Replication.

Furthermore, the copy-on-write process requires more I/O because of the read/write/update cycle. This I/O is also fairly random, although a relatively big block size is used, so some deduplication devices might not like it so much. In these cases, use the default Forward Incremental: you sacrifice disk space, but any other copy process only needs to copy a VBK once a week.

Backup Copy Job (Part of Backup & Replication)

First of all, the backup copy job is not replication :D. I can't stress this enough. The backup copy job does more or less what its name suggests, except that it does not: it will copy backup data, but not the VBK/VIB/VRB files themselves. This is confusing for people but it adds great flexibility. When you create a new backup copy job and you link a primary job, it won't actually link the primary job but rather add all the VMs that you configured in the primary job to the backup copy job. When the backup copy job runs, it will look for restore points for each individual VM in all backups, copy the latest restore point and create its own backup chain in the second repository. This looks a bit like forward incremental, but once the retention policy is fulfilled, the oldest increment (VIB) will be rolled into the VBK file. This is sometimes confusing when people use Reverse Incremental on the primary job, as they don't see VRB files but VIB files at the target. The great thing is that the backup copy job is granular, so you can pick only a subset of VMs to copy.
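The "oldest increment gets rolled into the VBK" behaviour can be sketched like this. It is a simplified Python model with day numbers instead of files and one copy run per day; the function name is mine and the real job of course does a lot more.

```python
def backup_copy_chain(runs, keep):
    """Model the backup copy job's own chain in the second repository.

    Returns (full_day, increment_days): the day the full (VBK) currently
    represents and the days covered by the remaining increments (VIBs).
    """
    full_day = None
    increments = []
    for day in range(1, runs + 1):
        if full_day is None:
            full_day = day           # first run creates the full
        else:
            increments.append(day)   # later runs add an increment
        while 1 + len(increments) > keep:
            # Retention exceeded: merge the oldest increment into the full,
            # so the full moves forward in time instead of starting a new chain.
            full_day = increments.pop(0)
    return full_day, increments


print(backup_copy_chain(runs=5, keep=3))
# (3, [4, 5])
```

So with 3 restore points you always end up with one VBK and two VIBs, and no periodic full is needed to clean up the chain.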

The GFS policy is part of the backup copy job because you can use the backup copy job to tier between fast and slower storage for longer retention. In this case it is important to understand that the backup copy job will run almost in sync with the primary job. So if you configure 7 restore points on the primary job and 14 restore points on the copy job, you won't have 21 unique restore points. Rather, you will have 7 restore points that have a duplicate twin because of the backup copy, and 7 older unique restore points once the backup copy job has synced.
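The arithmetic behind the 7 + 14 example, as a tiny Python sketch. It assumes one restore point per day and that both jobs are fully in sync, which is a simplification.

```python
def restore_point_overlap(primary_keep, copy_keep):
    """Count duplicated and unique restore points across both repositories.

    Restore points are identified by their age in days (0 = newest).
    """
    primary = set(range(primary_keep))
    copy = set(range(copy_keep))
    return {
        "duplicated": len(primary & copy),    # present in both repositories
        "only_on_copy": len(copy - primary),  # older points kept by the copy job
        "unique_total": len(primary | copy),  # distinct points you can restore
    }


print(restore_point_overlap(7, 14))
# {'duplicated': 7, 'only_on_copy': 7, 'unique_total': 14}
```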

The great thing about the backup copy job is that it is the first and only job in v7 that can make use of WAN Acceleration (so not only optimization). This is done by adding a WAN accelerator role (a Windows service) at both sides. These WAN accelerators keep a global cache which can be shared by multiple backup copy jobs and multiple runs. Furthermore, fingerprinting is done at a very small block level to give you real WAN acceleration. The downside is that processing will take longer, and this is the reason why it is part of the backup copy job and not of the primary job.

An important thing about the backup copy job is that the target is a repository. So if you want to copy the data to another location, Veeam will need to have a data mover/agent at the other side. This agent can run on a Linux server or a Windows server. However, if you want to use WAN acceleration, you will require a Windows server as this service is Windows x64 only. The good news is that you can have both the WAN Acceleration and the Repository role installed on the same Windows machine. Furthermore, the source backup architecture will need to be able to talk to this Windows server over IP, so some kind of VPN/MPLS solution has to be in place.

Backup Copy Job and Cloud backup are more or less complementary "jobs"; however, the backup copy job requires you to have CPU, memory and storage at the "cloud side". Cloud backup does not, but then again it can't do any WAN Acceleration.

Backup to Tape (Part of Backup & Replication)

Going to be fairly short about it. It does exactly what you think it does. It copies the backup files (VBK/VIB) to tape. In this case it makes sense to use forward incremental if you do a daily backup to tape. With Reverse Incremental, it will have to copy a VBK file each day to tape potentially consuming a lot of tapes.

Important notice: Veeam is restore point aware if you use backup to tape. That means it knows which restore point for which VM is located where. However, since it copies VBK/VIB files, restoring means it first has to stage those files to a repository and then allow you to do the final restore. This happens automagically but it might take time. It is one good reason to split up jobs, because restoring a VM that is 50GB but is stored in a VBK of 4TB (because you added a lot of VMs to the job) requires staging the whole VBK first (and additional VIBs if required).

If you are thinking about vaulting, you can create a second media pool/backup to tape job next to your local backups to tape. This job can be configured so that at the end of the run it will export the tapes to the I/O slots, ready for some field engineer to get them physically out of the library and bring them to a remote location.

Cloud backup (Not part of Backup & Replication)

Cloud backup is part of the Cloud Edition offering from Veeam. It is an add-on solution which cannot be bought separately. When you buy Cloud Edition, you get Backup & Replication and Cloud backup. Also, you don't really buy Cloud Edition but rather rent it on a yearly basis (subscription fee). This makes Cloud Edition a good OPEX solution instead of a CAPEX, buy-it-all-at-once solution.

Basically, what Cloud Edition does is copy backup files (VBK/VIB) to the cloud. In this case, cloud can be Amazon, Azure, HP or even local players offering OpenStack Swift storage. What is important is that Cloud Edition talks to a storage API to store bits and bytes. This means those APIs don't need to be Veeam aware, as the storage doesn't process those bytes but just stores them. So no Veeam components need to be installed at the other end, and there are already a lot of partners offering potential "Veeam backup storage space". Furthermore, since the APIs are HTTP based (mostly some form of S3), no complicated VPN has to be set up for each customer. Finally, solutions like OpenStack are multi-tenant at their core, so a partner does not need to set up different OpenStack servers for each customer.

One of the cool features that is in cloud edition is that you can also use encryption. In this case the backups will be encrypted before sending them to the cloud.

The downside is that features like WAN acceleration are not possible, as Veeam cannot install any service at the other side. This means that copying might consume more bandwidth. Furthermore, an FLR can't be done in the cloud by importing the backups there; you need to copy back the whole VBK and import it in Backup & Replication again. For Amazon there is a scenario where you can use an EC2 instance and then do an extract in the cloud.

With Cloud it is also advised to use Forward Incremental instead of Reverse Incremental.

Again, Backup Copy Job and Cloud backup are more or less complementary "jobs"; however, the backup copy job requires you to have CPU, memory and storage at the "cloud side". Cloud backup does not, but then again it can't do any WAN Acceleration and does require a different license.


2014/01/14

Miniblog: Powershell Out-Gridview cmdlet

Normally I'm not so enthusiastic about a simple powershell cmdlet but in this case I wonder why I didn't find it earlier.

With Powershell you will end up with arrays of objects all the time. Visualizing them on the console can be tricky. Worse if you want to select a certain object, you will need to display them and then let the user input some number that corresponds with the object you want to select. Most of the time I will end up with using the in-line where clause to filter out objects but it is not really dynamic.

That was until this morning, when I discovered Out-GridView. This cmdlet takes a stream of objects and shows them in a GUI grid view. If you use the -PassThru parameter, it even allows you to select objects and pass them on down the pipeline. To give you an example, here is a "Process killer" script in 1 line of code:

Get-Process | Out-GridView -PassThru -Title "Kill me now" | ForEach-Object { Stop-Process -InputObject $_ }

When you run it, a grid view will appear:
 
Then you can use the filter box to narrow the list down to, for example, notepad. If you then select notepad and click OK, it will send the selected object(s) down the pipeline, where ForEach-Object will iterate over the objects and stop the processes.

2014/01/10

Track my Veeam Restore Point

When using Veeam Backup & Replication, you will see that backup files are created at job level. The good thing is that this introduces job level dedup. However when you look at the backup files, it is hard to identify which file includes which VM. It's not really a problem as the Veeam DB will keep track of this.

But what if you want to copy restore points with another tool? Some companies are still running 2 backup products, one for physical and Veeam for virtual. Typically the first application also has the tape library connected. Or what if you just want to copy your backups to the cloud? How are you supposed to know which VM is stored in which files if it is no longer being tracked by Backup & Replication?

Well this pet project might help a bit. First of all, I just wrote it and it is not tested. The reason is simply not to waste too much time developing something nobody uses. For me the goal was rather to learn some more about PowerShell (I am planning to do some Powershell webinars soon, so need to practice and get some cool examples made). So remember:
  • It's not tested, be careful with it
  • It potentially can generate a lot of data because it is not optimized
  • It won't truncate any data :)
  • Enjoy!
You can find it here

In essence, this powershell script will just query the restore points that are currently active on the Veeam Backup Server. Then it will understand which files are linked to it and insert that data in a SQL db.

So first thing you need to do is to create a database in your SQL server instance.  The default settings will be
  • instance: localhost
  • db : rpdb
  • user : rpdbuser
  • password : rpdbpass
In my test setup I made the rpdbuser db owner of the rpdb. You can change the parameters in the script:
Make sure you have the database setup. Then you can run the script with the -createdb flag. It should create all the tables:
From that moment you can start collecting restore point metadata. If you run the script without a filter, all the restore points that exist at that point will be inserted in the db. If you do this daily, you will see duplicates popping up. This is certainly the case for forward incremental. However, if you are using reverse incremental, it makes sense to have duplicate restore points, as they have probably shifted to a different filename.

To get more control over what should be stored and what not, I would recommend using filters to only store the data of the restore points that you are shipping off site. Then run the collect script as a post-job script, or after your third-party application has finished its copy.

Collecting data will be done with the -collect switch. However I added a -dry switch so that you can test before you let the powershell script insert data in db. Here are some examples of filtering:

No filter:

-Collectfiles "file1.vbk,file2.vib" which will only store those restore points related to backup files you might have shipped

-Collectonlyfull switch which will only store info about full vbk
-Collectjob "jobname" which makes sense if you run as a postscript on a job
Or just a combination of collectjob and collectonlyfull
If you are happy with the result, remove the -dry switch and let it run whenever you want to track the state of restore points:

Then you can use the -vm parameter together with the -begin and -end parameters (e.g. -vm sharepoint -begin "2013-01-01" -end "2014-02-01") to find a restore point for a certain VM. Remember that the date is not when you tracked it but rather the date when the restore point was created. Then, if you find the restore point you like, you can use the -id parameter to see which files you need to get back to the repository to do a restore from that restore point.
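To give an idea of what such a lookup boils down to, here is a small self-contained Python sketch. Be aware that everything in it is an assumption for illustration: the table name, the columns and the sample rows are made up and are not the schema the script creates, and it uses an in-memory SQLite database instead of SQL Server just so it runs anywhere. Whether the real -end parameter is inclusive is also something I am not claiming here; the sketch treats it as exclusive.

```python
import sqlite3


def find_restore_points(conn, vm, begin, end):
    """Return (id, vm, created, file) rows for a VM created in [begin, end)."""
    cursor = conn.execute(
        "SELECT id, vm, created, file FROM restorepoints "
        "WHERE vm = ? AND created >= ? AND created < ? ORDER BY created",
        (vm, begin, end),
    )
    return cursor.fetchall()


# Hypothetical schema and sample data, only for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restorepoints "
    "(id INTEGER PRIMARY KEY, vm TEXT, created TEXT, file TEXT)"
)
conn.executemany(
    "INSERT INTO restorepoints (vm, created, file) VALUES (?, ?, ?)",
    [
        ("sharepoint", "2013-12-30T22:00:00", "job1-full.vbk"),
        ("sharepoint", "2014-01-06T22:00:00", "job1-inc1.vib"),
        ("exchange", "2014-01-06T22:00:00", "job1-inc1.vib"),
        ("sharepoint", "2014-02-03T22:00:00", "job1-inc2.vib"),
    ],
)

for row in find_restore_points(conn, "sharepoint", "2013-01-01", "2014-02-01"):
    print(row)
```

Storing the creation date as an ISO 8601 string keeps the date filter a plain string comparison, which is why the begin and end values can be passed in exactly as typed.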

Finally, could you use this functionality in combination with files to tape instead of backup to tape? You could, but it still requires a lot of manual setup, manual querying and manual restoring. So if you can, let Veeam track everything and you will have nothing to do and won't have to worry about setting anything up!

2013/12/19

1-Click Veeam Install on Windows 2012

With v7 there were a lot of improvements. One of those is the ability to do a silent or unattended install of Veeam Backup & Replication. There is actually a good KB article that describes how to install each individual component, but it requires you to figure out the dependencies and how to install SQL Express yourself. You can find it here.

One of the interesting scenarios where you could use an automated install is if you want to install a backup server at each branch office to use as a "restore" installation, as described in my previous article "after the backup copy job the auto import". For me the trigger was actually a partner that just wanted an installer he could use at a customer to set up a very simple PoC very quickly.

After some considerations, I decided to install everything with powershell. Not only is it the preferred language for Windows 2012 scripting but it is also your primary scripting tool for Veeam Backup & Replication. The target platform is 2012, so I have no idea if my findings will work for 2008 and so on (for example the script will enable dotnet 3.5 on Windows 2012). So feedback is really appreciated.

You can find the automation script here

Now there are some dependencies that you need to fulfill before you can use it:
  • Create a Windows 2012 machine
  • Add it to a domain
  • Create a user srvveeam in your domain (or change the param section at the start of the script)
  • Download the Veeam Backup & Replication ISO. Then extract the following folders in a new empty folder: Backup, Catalog, Explorers and Redistr. Then add the script, bat file and your license (veeam_backup.lic) in the same folder as shown below.
  • Give the srvveeam user the required permissions for Backup & Replication in vCenter. You can find the fine-grained permissions in the documentation section of the Veeam website. Also alter the vCenter address in the script.
  • I recommend scrolling through it and fine tweaking it to your environment.
Now installation will be quite easy, just right click runmeasadmin and run the bat file as an administrator.

Installation should start and install all necessary requirements for Windows 2012 including SQL Express. If one of the components is not installed successfully it  should fail. If something is already installed, the script should skip installing the specific item.

 

The cool thing about using powershell is that you can load the Veeam powershell snapin after you install it. This means you can autoconfigure Veeam, for example by adding proxies or repositories automatically to the new installation. This script will do the following in B&R after installation:
  • Create credentials
  • Create a new repository
  • Add vCenter
  • Create a single job that includes all the VMs (Maybe test and tweak before you use it in your 1000VM environment ;)). It will use the credentials created before to enable Application Aware Image Processing

 But of course your imagination is your only limitation when using Powershell :D.

2013/12/13

Miniblog Series P003: When are my Veeam socket licenses being used/assigned?

A lot of times I get the question from people: "Do I need to license my target ESXi hosts for Veeam Backup & Replication in case I want to use the replication feature?" From a Veeam perspective, you don't; you only need licenses for the source hosts. However, you do require VMware licenses at the source and target side because we need to be able to talk to the VMware API (which is not possible with the ESXi free edition).

A lot of companies also have separate development and production environments but both are typically connected to one vCenter. In this case, they get worried too if they only want Veeam for production. Because what will happen if you add vCenter to Veeam? Will it start complaining about not having enough sockets? No it won't!

If you read our FAQ it will actually tell you this info:
http://forums.veeam.com/viewtopic.php?f=2&t=17633#p85109

Q: How the product is licensed?
A: Per physical CPU socket of "source" hypervisor host (where protected virtual machines reside). Destination hosts for replication and migration jobs do not need to be licensed. Hosts running virtual machines which are not being processed by Veeam do not need to be licensed, even if they are a part of the same cluster.

Then how do we do the licensing? Actually, this info is also in the FAQ.

Q: At what specific moment do the source host sockets get counted towards the licensed sockets pool?
A: Upon first backup, replication or copy of a VM that is running on the given host.

So when you start a new backup job, it will first validate whether the host running the source VM has the necessary sockets assigned to it. If it has, Veeam will continue to back up the VM. If there are no sockets assigned to the host, Veeam will check if you still have licenses available to assign. If yes, Veeam will assign them dynamically and continue with the backup. If there are no sockets left, the backup will fail.
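The decision flow above can be written down as a small Python sketch. This is just my model of the behaviour described in the FAQ, with made-up class, method and host names; it is not Veeam code. The revoke method mirrors the revoke action that is available in the licensing dialog.

```python
class SocketLicensePool:
    """Toy model of per-socket licensing with dynamic assignment."""

    def __init__(self, total_sockets):
        self.total = total_sockets
        self.assigned = {}  # host name -> number of sockets assigned

    @property
    def remaining(self):
        return self.total - sum(self.assigned.values())

    def authorize_backup(self, host, host_sockets):
        """Return True if a VM on this host may be processed."""
        if host in self.assigned:
            return True                         # host is already licensed
        if host_sockets <= self.remaining:
            self.assigned[host] = host_sockets  # assign on first use
            return True
        return False                            # no sockets left: job fails

    def revoke(self, host):
        """Give the sockets of a host back to the pool."""
        self.assigned.pop(host, None)


pool = SocketLicensePool(total_sockets=4)
print(pool.authorize_backup("esx-prod1", 2))  # True
print(pool.authorize_backup("esx-prod2", 2))  # True
print(pool.authorize_backup("esx-dev1", 2))   # False, the pool is empty
pool.revoke("esx-prod2")
print(pool.authorize_backup("esx-dev1", 2))   # True
```

Note that simply adding a vCenter never touches the pool in this model; sockets are only consumed the first time a VM on a host is actually processed.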

Now how do you check which hosts have which licenses assigned? Well, just go to the main menu. There, select help and then licensing information.


This will open up the license information dialog. If you click the "Licensed Hosts" button in the bottom left corner of this dialog, you can check your socket assignment. If a socket was accidentally assigned to a development host you can select it and then click the revoke button to remove the socket.


In this dialog you can also check the total/remaining amount of sockets.

In case you added your backup server to the Enterprise Manager, it will act as the central licensing server. In this case you execute the same actions from this central console. Just go to configuration tab and then select licensing.


 

Miniblog Series P002 : How do I successfully open a call at Veeam?

For most vendors, opening a call is a very annoying and tedious process. Well, for Veeam it is actually really easy and I can only recommend opening a call for every problem you have. Sometimes I visit people just to hear that they have been having a very frustrating problem for months, but when I ask them to send me a call number, it turns out they have never actually opened a call. To be honest, this is a chicken/egg situation. If you are not telling Veeam that you have a problem, it is impossible for Veeam to solve it.

Collecting Logs

First of all start by collecting the logs. You can do this by going to the main menu. There under help > support information you can start the collection wizard.


When you first start the collection wizard, it will ask you to define the scope. I strongly recommend doing this as it will decrease the size of your log package dramatically. In case you have an error with a certain job, pick the job that is having trouble:


The next step is the date range. Again, limit the number of days. I would suggest not only including the days with errors but also at least one day where the job ran successfully. This way you can show that the job has run successfully in the past.


Now just enter a location where you want to save your logs:


In the last step the logs are collected:


Once it is done, you can click the open folder to find your logs:



If the log bundle is smaller than 15 MB you can upload it while opening the call. If the log is bigger there are a couple of things you can do:
  • Just try to upload it during opening the call but you might experience http timeouts. While writing the article I was able to upload a file of 22MB but this really depends on your connection.
  • Don't upload any logs but ask support for an FTP link to transfer your logs when you open the call
  • Open up the log bundle and only upload specific logs. Alternatively split the logs in multiple zip files.

Who is the license owner

I often get the question, which login should I use to open the call. I can recommend to check the license owner and use his account as your support will be associated with it. If you can't find the login you can create a new account and just open a call. Then refer to the license owner so that support can validate your support information.  To find the person who is associated with your license just go to the main menu and click "about" under help.

In the licensee field you see the email address associated with it. It might also be a good time to write down the version number you are running, in this case Version : 7.0.764

Alternatively you can find the name by going to the main menu and clicking support information but this won't show the actual email address.

Opening the call

Opening the call can be done via the phone, but actually I prefer to do it via the web portal because it allows you to add additional information or to follow up the status.

First of all you should go to the support portal and sign in with the license owner account or create a new account. The portal can be found here:

Once you log in, validate that your contact details are correct so that support will be able to contact you successfully. You can do this by clicking edit profile in the right column:

Then on the next page you will be able to edit the email address or your phone number. Please double check before you start the process of opening a call.


Once you are happy with the contact details, just click the get support button. It will take you back to the main page where you will be able to click the open ticket button. You can also select the right product directly by clicking "open ticket" next to the licenses you have listed below.


First step is to fill in a title and describe the problem you have. Please also select the correct area where you are living and the severity. You can see response time associated with the severity in the right column.


Now select the product you are having problems with

Then select the version number. You can find the version number in the about window. Check the "Who is the license owner" section if you can't find the about window. I always select English support but there are some other languages available as well. However, if you don't mind English, it is good to understand that there are more technical people speaking English than any other language. Then click next.

The portal will now suggest some knowledge base articles. Please consider them. If nothing useful is there, click continue
Now you can attach your logs. First select the kind of issue you have. It will then suggest you to upload the correct/related log files. If your log bundle is big, you could opt to extract the specific file(s) and upload only the suggested log(s). I like to send the bundle because then support will have all the info at once. So select your log/bundle

Once you have selected the logs, don't forget to click the upload button. I have made this mistake (not doing it) a hundred times. If you click next without uploading, it will just tell you that you haven't attached any logs and will ask whether you are sure you want to proceed.

Once the logs are successfully submitted you will see the file appear in the uploaded file list

Then click next or upload additional log files. You will get a summary. Validate the fields and push the submit button to open a call.

You should receive a mail now with your ticket details and your call should be open. You should also see it in your support dashboard



Adding extra notes and viewing the status

If you want to add additional notes, you can do this via the support dashboard.

Just go to the open cases tab at the bottom and click details next to your ticket.
In the next window you should be able to add more details. Also you will be able to follow the status of your case


Escalating the problem

Not a lot of people seem to know this but you can actually escalate a problem yourself. Only do it when you feel it is really necessary. First of all, you can request an update via the web portal. Just click request update next to your call.

You can also call support. Before you do, first write down the case number. To find the case number, again look at your open case.

Then click the phone support button to find the number you can call for your country


Finally if you feel like support is not helping you can escalate the problem to the support manager. Refer to your case number when contacting the support manager. By clicking the "Talk to a manager" button, your email client should open, creating a new email to the support manager distribution list.