2016/06/10

Veeam repository in an LXC container

In the past, I have always wanted to investigate the concept of a Linux repository with extra safety like a chroot. However, I never took the time to work on an example. With the rise of Docker, I was thinking: could I make a repository out of a Docker container? After some research, I understood that Docker brings a whole storage abstraction layer which I really didn't need. Next to that, Docker containers run in privileged mode. So I decided to try out LXC and go a bit deeper into the container technology.

Now before we start, this is just an experimental setup in a lab, so do proper testing before you put this in production. It is just an idea, not a "reference architecture", so I imagine improvements can be made. Here is a diagram of the setup which we will discuss in this blog post.


The first thing you might notice is that we will set up a bridge between the host ethernet card and the containers' virtual network ports. In some LXC setups, NAT is preferred between the real world and a disconnected virtual bridge. I guess that works for web servers, but in this example Veeam needs to be able to connect to each container via SSH, and you might need to open up multiple ports. When you bridge your containers' virtual network ports and your outside ethernet port, you basically create a software switch. This gives the advantage that you can assign an individual IP to every container, as if it were a standalone machine.

Next to that, we will make a separate LVM volume group. You might notice that the root of every tenant is colored green. That's because we will use lxc-clone with its snapshot functionality. That means: set up the root (container) once, and then clone it via an LVM snapshot so you can instantly have a new container/tenant. Finally, an LVM volume called "_repo" is assigned to each individual container and mounted under /mnt. This is where you will store the backups themselves, separated from the root system.

The first thing is of course to install Debian. I'm not going to show it, as it is basically following the wizard, which is straightforward. I do want to say that I assigned 5GiB for the root, but it turns out that after all the configuration I only use 1.5GiB. So if you want to save some GBs, you could assign for example only 3GiB.

Maybe one important note: the volume group for storing the container roots needs to be named differently from the container name in order for lxc-clone to work correctly. I ran into the issue where cloning did not work because of it. So for example, call the volume group "vg_tenstore" and the containers/logical volumes "tenant". During the initial install, only set up the volume group. The logical volumes will be made by LXC during configuration.

So after the install, I just installed drivers and updates by executing the following. If you don't run it in VMware, you of course do not need the tools. You might also go lightweight by not installing the dkms version.
apt-get update
apt-get upgrade
apt-get install open-vm-tools-dkms
apt-get install openssh-server
reboot
After the system has rebooted, you are able to start an SSH session to it. Now let's install the LXC software. (based on https://wiki.debian.org/LXC)
apt-get install lxc bridge-utils libvirt-bin debootstrap
Once that is done, let's set up the bridge. For this, edit /etc/network/interfaces. Here is a copy of my configuration:
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
source /etc/network/interfaces.d/*
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
allow-hotplug eth0
#auto eth0
#iface eth0 inet static
#       address 192.168.204.7
#       netmask 255.255.255.0
#       network 192.168.204.0
#       broadcast 192.168.204.255
#       gateway 192.168.204.1
#       # dns-* options are implemented by the resolvconf package, if installed
#       dns-nameservers 192.168.204.2
#       dns-search t.lab
auto br0
iface br0 inet static
        bridge_fd 0
        bridge_stp off
        bridge_maxwait 0
        bridge_ports eth0
        address 192.168.204.7
        netmask 255.255.255.0
        broadcast 192.168.204.255
        gateway 192.168.204.1
        # dns-* options are implemented by the resolvconf package, if installed
        dns-nameservers 192.168.204.2
        dns-search t.lab
Notice that I kept the old configuration in comments. You can see that the whole address configuration is assigned to the bridge (br0), so the bridge literally gets the host IP. After modifying the file, I restarted the OS just to check if the network settings would stick, but you can also restart the networking via "/etc/init.d/networking restart". The result should be something like this:



Ok, so that was rather easy. Now let's add some lines to the default config so that every container will be connected to this bridge by default. To do so, edit /etc/lxc/default.conf and add:
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
In fact, if you look at the previous screenshot, you can see that there are already 2 containers running, as 2 "vethXXXXX" interfaces already exist.

Now let's set up the root for our container. If you forgot your volume group name, use "lvm vgdisplay" to display all your volume groups. Then execute the create command:
lxc-create -n tenant -t debian -B lvm --lvname tenant --vgname vg_tenstore
This will create a new container "tenant" based on the debian template. A template is a preconfigured script to make a certain flavor of container. In this case I used a debian flavor, but you can find other flavors under "/usr/share/lxc/templates". The backing store is LVM, and the logical volume that will be created is called "tenant" and will be created in the volume group "vg_tenstore". Keep the name of the container and the logical volume the same, as mentioned before.

Now that the container is created, you can actually edit its environment before starting it. I did a couple of edits so that everything boots smoothly. First mount the filesystem:
mount /dev/vg_tenstore/tenant /mnt
First I edited the networking file /mnt/etc/network/interfaces to set a static IP:
#....not the whole file
iface eth0 inet static
address 192.168.204.6
netmask 255.255.255.0
gateway 192.168.204.1
dns-nameservers 192.168.204.2
Then I edited /mnt/etc/resolv.conf
search t.lab
nameserver 192.168.204.2
Finally I made a huge security hole by allowing root login via SSH in /mnt/etc/ssh/sshd_config
PermitRootLogin yes
You might want to avoid this, but I wanted something quick and dirty for testing. Now unmount the filesystem and let's boot the parent container. I used -d to daemonize so you don't get an interactive container you can't escape from:
umount /dev/vg_tenstore/tenant
lxc-start --name tenant -d
Once started, use lxc-attach to attach directly to the console as root and execute passwd to set up a password. You can get out by typing exit after you have set up the password:
lxc-attach --name=tenant
passwd
exit
Now test your password by going into the console. While you are there, you can check if everything is ok and then halt the container:
lxc-console --name tenant
#test what you want after login
halt
You can also halt a container from the host by using "lxc-stop -n tenant". Now that this is done, we can actually create a clone. I made a small wrapper script, but you can of course run the commands manually. First I made a file "maketenant" and used "chmod +x maketenant" to make it executable. Here is the content of the script:
#!/bin/bash
tenant=$1
ipend=$2

if [ ! -z "$tenant" -a "$tenant" != "tenstore" -a ! -z "$ipend" ]; then
        trepo=$tenant"_repo"
        echo "Creating $tenant"
        lxc-clone -s tenant $tenant
        lvm lvcreate -L 10g -n"$trepo" vg_tenstore
        mkfs.ext4 "/dev/vg_tenstore/$trepo"
        echo "/dev/vg_tenstore/$trepo mnt ext4 defaults 0 0" >> /var/lib/lxc/$tenant/fstab
        mount "/dev/vg_tenstore/$tenant" /mnt
        sed -i "s/address [0-9.]*/address 192.168.204.$ipend/" /mnt/etc/network/interfaces
        umount /mnt
        echo "lxc.start.auto = 1" >> /var/lib/lxc/$tenant/config
        echo "Starting"
        lxc-start --name $tenant -d
        #check if repo is mounted
        echo "Check if repo is mounted"
        lxc-attach --name $tenant -- df -h | grep repo
        echo "Check ip"
        lxc-attach --name $tenant -- cat /etc/network/interfaces | grep "address"
else
        echo "Usage: maketenant <tenantname> <last-ip-octet>"
fi
Ok so what does it do:
  • lxc-clone -s tenant tenantname : makes a clone of our container based on a (LVM) snapshot
  • lvm lvcreate -L 10g -n"tenantname_repo" vg_tenstore : creates a new logical volume for the tenant to store its backups in
  • mkfs.ext4 /dev/vg_tenstore/tenantname_repo : format that volume with ext4
  •  echo "/dev/vg_tenstore/tenantname_repo mnt ext4 defaults 0 0" >> /var/lib/lxc/tenantname/fstab : Tells LXC to mount the volume to the container under /mnt
  • We then mount the FS of the new tenant again to change the IP
    • sed -i "s/address [0-9.]*/address 192.168.204.$ipend/" /mnt/etc/network/interfaces : replaces the IP with a unique one for the tenant
  • echo "lxc.start.auto = 1" >> /var/lib/lxc/tenantname/config : tells LXC that we want to start the container at boot time
Finally we start the container and check if the repository was indeed mounted and the IP was correctly edited. The result:

Our new tenant (tenant2) is up and running. Let's check the logical volumes. You can see that tenant2 is running from a snapshot and that a new logical volume was made as a repository.


And the container is properly bridged to our br0


Now you can add it to Veeam based on the IP. If you populate, you will only see the storage you assigned to the container.


You can then select the /mnt mount point


And you are all set up. Let's do a backup copy job to it


And let's check the result with "lxc-attach --name tenant2 -- find /mnt/"


That's it, you can now make more containers to have separate chroots. You can also think about additional security like iptables or AppArmor, but since the container is not privileged, it should already be a huge step up from just giving tenants separate users/home directories with sudo rights.

2016/06/06

RPS 0.3.3

It took some time to release this version because it packs a lot of changes which hopefully makes the tool more useful. This release focuses on more "export" functionality.

So the first major change is that the "Simulate" button has been moved down. This makes more sense, as you will probably first put in the info and then run the simulation. But the main reason why it was moved is the additional export functionality. You will now see a couple of checkboxes next to the simulate button.


The first checkbox is the "export" functionality. When you check it and run the simulation, an input field will appear with a URL. If you click somewhere in the field, the complete text should be selected, which you can then easily copy with for example ctrl+c. When you reuse the URL, your simulation will automatically execute with all the previous inputs. This way you can share your simulation without having to screenshot everything. Make sure to push the Simulate button before copy-pasting, as this will refresh the URL field.

But what if you still want a clean screenshot of the result? I cannot tell you how many screenshots of the RPS output I have already seen in mails/documentation/etc. However, screenshotting the output can be challenging. First of all, you need specific software to cut it out. Next, if the simulation is longer, you have to screenshot multiple times and then concatenate the result. So in this release I'm introducing "canvas rendering". This renders the result in a hidden HTML5 canvas and then replaces a visible image with a copy of the HTML5 output. The result should be a cleaner output that you can use in documentation. I also opted to reduce the amount of info on the output, as dates etc. make little sense when a partner wants to send a result to an end customer.



If you are running Firefox or Chrome, you will be able to save the picture by clicking the Download link. The advantage is that the link will push a formatted name with the current date and time. However, if you are not using one of those (but IE or Safari, for example), you should still be able to right-click it and save it as a picture. The reason why the download link does not work in every browser is that I'm using an unofficial "download" attribute. Let's hope that in the future Edge, IE, Safari, etc. will also support it.


Another new feature is support for the Active Full option that was added in Veeam v9. This feature required some testing to check if the result made sense. Hopefully I got it right, and I would love to hear your feedback!

Finally, a very small request that has been implemented is replica support. In this first version I only added support for VMware, but this might change in the future.


2016/05/10

BytesPerSec from WMI Raw class via Powershell

If you ever tried to query data from WMI, you might have noticed that there are preformatted data and raw data classes. Recently I tried to make do with the preformatted data, but after a while I saw it was pretty inaccurate for monitoring, as it only shows you that specific moment. Especially for disk writing, which seems to be buffered, meaning you get huge spikes and huge dips because of buffers emptying all at once.

Well, I tried the raw classes and couldn't make sense of them at first. After my google-kungfu did not seem to yield any results (mostly people asking the same question), I tried to make sense of the data via the good old "trial-and-error" method to see if I could squeeze some reasonable results out of it.

The biggest issue with raw classes is that you take samples, and the values are just augmented with new values over time by the system. So you get a huge number that doesn't mean anything on its own. What you need to do is take 2 samples, let's call them "old" and "new". Your real value over the interval would be:
(new val - old val) / (new timestamp - old timestamp)

Well, with "BytesPerSec" I could not get it to work until I realised that bytes per sec is already written per interval. So for "BytesPerSec", it seems you have to look at "Timestamp_Sys100NS". To convert this to seconds, you multiply it with "*1e-7" (google "1*1e-7 seconds to nanoseconds" to understand why). So what you get is:
(New BytesPerSec - Old BytesPerSec) /((new Timestamp_Sys100NS - old Timestamp_Sys100NS)*1e-7)
That seems strange, because BytesPerSec is already in seconds. On a 1-second interval you would not need to divide, because the difference between the timestamps would be around 1 anyway. However, consider a 5-second sample interval. In this case, the system seems to add 5 samples of "bytespersec"; by dividing by 5, you get an average over the interval. Well, it seems to be more complex than that. If you put the sample interval at 100ms, the formula still seems to work, which basically tells me the system is constantly adding to the number but adjusting it to "bytes per second". For example, in the script below I sleep 900ms, because that allows PowerShell to do around 100ms of querying/calculations.
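To make the formula concrete, here is a minimal standalone sketch of the idea (separate from the full script linked below; the "_Total" instance and the 1-second sleep are just example choices):
# take two raw samples of the physical disk counters, roughly a second apart
$old = Get-WmiObject Win32_PerfRawData_PerfDisk_PhysicalDisk -Filter "Name='_Total'"
Start-Sleep -Seconds 1
$new = Get-WmiObject Win32_PerfRawData_PerfDisk_PhysicalDisk -Filter "Name='_Total'"
# Timestamp_Sys100NS is in 100ns units, so *1e-7 gives the elapsed time in seconds
$seconds = ($new.Timestamp_Sys100NS - $old.Timestamp_Sys100NS) * 1e-7
# (new BytesPerSec - old BytesPerSec) divided by the elapsed seconds
($new.DiskBytesPersec - $old.DiskBytesPersec) / $seconds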

Now, my method of discovery is not very scientific (I could not correlate it to any doc), but it does seem to add up if you compare it live to the task manager. So below is a link to a sample PowerShell script you can use to check the values. Although I'm writing a small program in C#, I can only recommend PowerShell to play with WMI, as it allows you to experiment without having to recompile all the time and to discover the values via, for example, the PowerShell ISE.

https://github.com/tdewin/veeampowershell/blob/master/wmi_raw_data_bytespersec.ps1

The script queries info from "Win32_PerfRawData_PerfDisk_PhysicalDisk" and "Win32_PerfRawData_Tcpip_NetworkInterface". If you are looking for a fitting class, you can actually use this oneliner:
$lookfor = "network";gwmi -List | ? { $_.name -imatch $lookfor } | select name,properties | fl
Adjust $lookfor to whatever you are actually looking for.


Update: 
There is actually a "scientific method" to do it. More playing and googling turned up interesting results.

First, look up your class and counter. So let's say I want "BytesPerSec" from "Win32_PerfRawData_PerfDisk_PhysicalDisk". You would google it, and hopefully you get to this class page:

https://msdn.microsoft.com/en-us/library/aa394308%28v=vs.85%29.aspx

This would tell you about the counter:
DiskBytesPerSec
Data type: uint64
Access type: Read-only
Qualifiers: CounterType (272696576) , DefaultScale (-4) , PerfDetail (200)

If you click enough, you will end up on this page:
https://msdn.microsoft.com/en-us/library/aa389383%28v=vs.85%29.aspx

That type, 272696576, is actually PERF_COUNTER_BULK_COUNT. If you google "PERF_COUNTER_BULK_COUNT", you might end up here:

https://msdn.microsoft.com/en-us/library/ms804018.aspx

This would tell you to use the following formula:
(N1 - N0) / ((D1 - D0) / F), where the numerator (N) represents the number of operations performed during the last sample interval, the denominator (D) represents the number of ticks elapsed during the last sample interval, and the variable F is the frequency of the ticks.

This counter type shows the average number of operations completed during each second of the sample interval. Counters of this type measure time in ticks of the system clock. The variable F represents the number of ticks per second. The value of F is factored into the equation so that the result is displayed in seconds. This counter type is the same as the PERF_COUNTER_COUNTER type, but it uses larger fields to accommodate larger values.

This might still not be so trivial, but the numerator should be fairly clear. It is the same value we used before, namely newvalue - oldvalue. The denominator is actually ((D1 - D0) / F), which would be (newticks - oldticks)/frequency. This turns out to translate to:
($new.Timestamp_PerfTime - $old.Timestamp_PerfTime)/($new.Frequency_PerfTime).

Interestingly enough, "$new.Frequency_PerfTime" is always the same because it is actually the Hz speed of your processor. So it basically tells you how many ticks it can handle per second. Timestamp_PerfTime is, I guess, how many ticks have already passed. So by subtracting the old from the new, you get the number of ticks that have passed between your samples. If you divide that by the Hz value, you get how many "seconds" have passed (which can be a float). That means you don't have to convert to nanoseconds, and you can use the formula directly like this:
$time = ($new.Timestamp_PerfTime - $old.Timestamp_PerfTime)/($new.Frequency_PerfTime)

So the total formula would be
$dbs = $new.DiskBytesPersec - $old.DiskBytesPersec
$time = ($new.Timestamp_PerfTime - $old.Timestamp_PerfTime)/($new.Frequency_PerfTime)
$cookeddbs = $dbs/$time
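If you want to try this end to end, a minimal sketch could look like the following (the "_Total" instance and the 1-second sleep are again just assumptions for the example, not taken from the linked script):
$old = Get-WmiObject Win32_PerfRawData_PerfDisk_PhysicalDisk -Filter "Name='_Total'"
Start-Sleep -Seconds 1
$new = Get-WmiObject Win32_PerfRawData_PerfDisk_PhysicalDisk -Filter "Name='_Total'"
# (N1 - N0) / ((D1 - D0) / F)
$dbs = $new.DiskBytesPersec - $old.DiskBytesPersec
$time = ($new.Timestamp_PerfTime - $old.Timestamp_PerfTime)/($new.Frequency_PerfTime)
"{0:N0} bytes/sec" -f ($dbs/$time)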

Running the method from the script and this method gives you almost the same results; I guess the tiny differences have to do with rounding. Anyway, the method in this update should be the most accurate, as this is what Microsoft describes as using themselves for cooking up the data. It should also give you a more "stable" way to calculate other values, instead of trial and error.

2016/04/13

Veeam RESTful API via Powershell

In this blog post I'll show you how you can play around with the Veeam RESTful API via PowerShell. This post will show you how to find a job and start it. You might wonder why you would do such a thing. Well, in my case it is to showcase the interaction with the API (line of code per line of code), very similar to what you would do with wget or curl. If you want an interactive way of playing with the API, know that you can always replace the /api with /web/#api/ (for example http://localhost:9399/web/#/api/) to get an interactive browser. However, via PowerShell you get the real sense that you are interacting with the API, and all methods used here should be portable to any other language. That is why I've not chosen to use "invoke-restmethod", but rather a raw HTTP call.


So the first thing (which might not be required) is to ignore the self-signed certificate. If you access the API via FQDN on the server itself, the certificate should be trusted, but that would make my code less generic.
add-type @"
    using System.Net;
    using System.Security.Cryptography.X509Certificates;
    public class TrustAllCertsPolicy : ICertificatePolicy {
        public bool CheckValidationResult(
            ServicePoint srvPoint, X509Certificate certificate,
            WebRequest request, int certificateProblem) {
            return true;
        }
    }
"@
[System.Net.ServicePointManager]::CertificatePolicy = New-Object TrustAllCertsPolicy
So with that code executed, you have told dotnet to trust everything. The next step is to get the API version:
$r_api = Invoke-WebRequest -Method Get -Uri "https://localhost:9398/api/"
$r_api_xml = [xml]$r_api.Content
$r_api_links = @($r_api_xml.EnterpriseManager.SupportedVersions.SupportedVersion | ? { $_.Name -eq "v1_2" })[0].Links
With the first request, we basically do a GET request to the API page. The Veeam REST API uses XML in favor of JSON, so we can just convert the content itself to XML. Once that is done, we can browse the XML. The cool thing about PowerShell is that it allows you to browse the structure and autocompletes. Just execute $r_api_xml and you will get the root element. By adding a dot and an element name, you can see what's underneath that node. You can repeat this process to "explore" the XML (or you can just print out $r_api.Content without conversion to see the plain XML).

Under the root container (EnterpriseManager), we have a list of all supported versions. By applying a filter we get the v1_2 (v9) API version. This one has one Link, which indicates how you can log on:
PS C:\Users\Timothy> $r_api_links.Link | fl

Href : https://localhost:9398/api/sessionMngr/?v=v1_2
Type : LogonSession
Rel  : Create
So the Href shows the link we have to follow, the Type tells us the name of the link, and finally Rel tells us which HTTP method we should use. Create means we need to do a POST.

Most of the time:
  • the GET method is used if you want to get details but don't want to do a real action
  • the POST method is used if you want to do a real action
  • the PUT method is used if you want to update
  • the DELETE method is used if you want to destroy something
When in doubt, check the manual: https://helpcenter.veeam.com/backup/rest/requests.html

Ok, for authentication we have to do something special: we need to send the credentials via basic authentication. This is a pure HTTP standard, so I'll show you two ways to do it:
$r_login = Invoke-WebRequest -method Post -Uri $r_api_links.Link.Href -Credential (Get-Credential -Message "Basic Auth" -UserName "rest")

#even more raw

$auth = "Basic " + [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("mylogin:myadvancedpassword"))
$r_login = Invoke-WebRequest -method Post -Uri $r_api_links.Link.Href -Headers @{"Authorization"=$auth}
The first method uses PowerShell's built-in functionality for doing basic authentication. The second method actually shows what is really going on in the HTTP request: "username:password" is encoded as base64, then "Basic " and this encoded string are concatenated. The result is then set in the Authorization header of the request.

The result is that (if we logged on successfully) we get a logon session, which has links to almost all main resources. Before we go any further, we do need to analyze the response a bit.
if ($r_login.StatusCode -lt 400) {
If you do a call, you can check the StatusCode or return code. You are expecting a number between 200 and 204, which means success. If you want to know the exact meaning of the HTTP return codes in the Veeam REST API: https://helpcenter.veeam.com/backup/rest/http_response_codes.html

The next thing is to extract the REST session id. Instead of sending over the username and password the whole time, you send over this header to authenticate. The header is returned after you successfully logged in.
    #get session id which we need to do subsequent request
    $sessionheadername = "X-RestSvcSessionId"
    $sessionid = $r_login.Headers[$sessionheadername]
So now that we have the session id extracted, let's do something useful:
    #content
    $r_login_xml = [xml]$r_login.Content
    $r_login_links = $r_login_xml.LogonSession.Links.Link
    $joblink = $r_login_links | ? { $_.Type -eq "JobReferenceList" }
First we take the logon session, convert it to XML and browse the links. We are looking for the link with the name "JobReferenceList". Let's follow this link. In the process, don't forget to put the session id in your headers.
    #get jobs with id we have
    $r_jobs = Invoke-WebRequest -Method Get -Headers @{$sessionheadername=$sessionid} -Uri $joblink.Href
    $r_jobs_xml = [xml]$r_jobs.Content
    $r_job = $r_jobs_xml.EntityReferences.Ref | ? { $_.Name -Match "myjob" }
    $r_job_alt = $r_job.Links.Link | ? { $_.Rel -eq "Alternate" }
So the first line is just getting and converting the XML. The page we requested is a list of the jobs in reference format. The reference format is a compacted way of representing the object that you requested, basically showing the name and the ID of the job and some links. If you add "?format=Entity" to such a list (or to a request for an object/object list), you get the full details of the job.

So why the reference representation? Well, it is a pretty similar concept to the GUI. If you open the Backup & Replication GUI and you select the job list, you don't get the complete details of all the jobs. That would be kind of overwhelming. But when you click a specific job and try to edit it, you get all the details. Similarly, if you want to build an overview of all the jobs, you wouldn't want the API to give you all the unnecessary details, as this would make the "processing time" of the request much bigger (downloading the data, parsing it, extracting what you need, ...).

So in the 3rd line, what we do is look for the job (or rather the reference to a job) whose name matches "myjob". We then take a look at the links of this job and look for the alternate link. Basically, this is the job id + "?format=Entity" to get the complete details of the job. Here is the output of $r_job_alt:
PS C:\Users\Timothy> $r_job_alt | fl
Href : https://localhost:9398/api/jobs/f7d731be-53f7-40ca-9c45-cbdaf29e2d99?format=Entity
Name : myjob
Type : Job
Rel  : Alternate
Now ask for the details of the job
    $r_job_entity = Invoke-WebRequest -Method Get -Headers @{$sessionheadername=$sessionid} -Uri $r_job_alt.Href
    $r_job_entity_xml = [xml]$r_job_entity.Content
    $r_job_start = $r_job_entity_xml.Job.Links.Link | ? { $_.Rel -eq "Start" }
By now, the first 2 lines should be well understood. In the third line we are looking for a link on this object with the name Start. This is basically the method we need to execute to start the job. Start is a real action, and if you look it up, you will see that you need to use the POST method to call it:
 #start the job
    $r_start = Invoke-WebRequest -Method Post -Headers @{$sessionheadername=$sessionid} -Uri $r_job_start.Href
    $r_start_xml =  [xml]$r_start.Content

    #check of command is succesfully delegated
    while ( $r_start_xml.Task.State -eq "Running") {
        $r_start = Invoke-WebRequest -Method Get -Headers @{$sessionheadername=$sessionid} -Uri $r_start_xml.Task.Href
        $r_start_xml =  [xml]$r_start.Content
        write-host $r_start_xml.Task.State
        Start-Sleep -Seconds 1
    }
    write-host $r_start_xml.Task.Result
Ok, so that's a bunch of code, but I still wanted to post it. First we follow the Start method and parse the XML. The result is actually a "Task". A Task in the end represents a process that is running on the RESTful API, which you can refresh to check the actual status of the process. What is important: it is the REST process, not the backup server process. That means that if a task is finished for REST, it doesn't necessarily mean that the action on the backup server is finished. What is finished is that the API has passed your command to the backup server.

So in this example, while the task is in the "Running" state, we refresh the task, write out the State, sleep 1 second, and then check again. Once it is finished, we write out the Task.Result. Again, if this task is Finished, it does not mean the job is finished, but that the backup server has (hopefully successfully) started the job.

So finally we need to log out. That is rather easy. In the logon session, you will find the URL to do so. Since you are logging out, or rather deleting your session, you need to use the DELETE method. You can check that by checking the relationship (Rel) of the link.
    $logofflink = $r_login_xml.LogonSession.Links.Link | ? { $_.type -match "LogonSession" }
    Invoke-WebRequest -Method Delete -Headers @{$sessionheadername=$sessionid} -Uri $logofflink.Href
I have uploaded the complete code here:
https://github.com/tdewin/veeampowershell/blob/master/restdemo.ps1 
There is some more code in that example that does the actual follow-up on the process, but you can skip or analyze that code if you want.

2016/01/14

Extending Surebackup in v9

Now that everybody has posted their favorite new features in Veeam v9, I want to take the time to highlight one particular feature: the credential manager part of Surebackup. This extra tab can be found when you configure your VM in the application group.


So why this extra tab? Well, you can read my Surebackup Sharepoint validation script post and instantly see the biggest problem: storing your credentials in a secure way is 60% of the whole blog article. This is because, in v8, all scripts are started by the backup service and thus inherit its account and permissions.

Enter v9, where the credentials tab is added. My first assumption was that all scripts would run under the configured account. That turned out to be incorrect. The script is still started with the backup service account, but the network credentials are changed. This has one big advantage: even if your backup server is not in the domain, you can still use these credentials. Think of it as using "runas /netonly" to start up an application (this is how Product Management explained it to me). The credentials are only applied when connecting to a remote server.

So for the fun of it, I have already looked into some example scripts. They might not be all that stable, and it is better to change them to your liking, but they should give you an idea of where to start.

First of all, you can find an updated version of the Sharepoint script. The only parameters to set are:
  • -server [your server, %vm_ip% for example]
  • -path [path to the content you want to check, by default: /Shared%20Documents/contenttest.txt]
  • -content [the content to check, by default "working succesfully"]
If you then set up the account correctly so that it can access the web service, it will authenticate successfully with the network credentials, download the file and match the content. The real magic? "$web.UseDefaultCredentials = $true"
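Stripped down to its essence, the check boils down to something like this (a rough sketch, not the actual script from the repo; parameter handling and logging are reduced to a minimum):
param(
    $server,                                           # pass %vm_ip% from Surebackup via -server
    $path = "/Shared%20Documents/contenttest.txt",
    $content = "working succesfully"
)
$web = New-Object System.Net.WebClient
$web.UseDefaultCredentials = $true                     # the "real magic": reuse the network credentials
$result = $web.DownloadString("http://$server$path")
# Surebackup treats a non-zero exit code as a failed test
if ($result -match $content) { exit 0 } else { exit 1 }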

But the fun doesn't stop there. I also tried to make a SQL script. You only need to pass:
  • -server [yourserver, %vm_ip% for example]
  • -instance [by default MSSQLSERVER]
It will log on to the instance, issue a "use" and query the tables for all databases. Finally, it checks the state of the databases in "sys.databases". The "use" makes sure that SQL Server actually tries to mount the database. But the cool thing is, you can easily alter the example to execute a full-blown SQL query and then check if the output satisfies your needs. The real magic? "Server=$instancefull;Integrated Security=True;"
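Again, the core of it boils down to something like the sketch below (simplified: it only checks sys.databases and skips the per-database "use" that the real script does; parameter handling is minimal):
param($server, $instance = "MSSQLSERVER")              # pass %vm_ip% via -server
$instancefull = if ($instance -eq "MSSQLSERVER") { $server } else { "$server\$instance" }
# the "real magic": Integrated Security makes the connection use the network credentials
$conn = New-Object System.Data.SqlClient.SqlConnection "Server=$instancefull;Integrated Security=True;"
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandText = "SELECT name, state_desc FROM sys.databases"
$reader = $cmd.ExecuteReader()
$failed = 0
while ($reader.Read()) {
    write-host ("{0} : {1}" -f $reader["state_desc"], $reader["name"])
    if ($reader["state_desc"] -ne "ONLINE") { $failed = 1 }
}
$conn.Close()
exit $failed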

I also added a template for just any plain PowerShell script. This might look trivial (it doesn't do anything but log on and write the hostname), but I spent some time figuring out that you need "-Authentication Negotiate" and that there is no need to set up SSL. However, do check if the firewall allows remote connections from outside the domain if you want to use this one.
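For reference, such a template can be as small as this (a hedged sketch with simplified error handling, not the actual template from the repo):
param($server)                                         # pass %vm_ip% via -server
try {
    # the two things that took figuring out: -Authentication Negotiate, and no SSL setup needed
    $name = Invoke-Command -ComputerName $server -Authentication Negotiate -ScriptBlock { hostname } -ErrorAction Stop
    write-host "Logged on, remote hostname is $name"
    exit 0
} catch {
    write-host $_
    exit 1
}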

So no more excuses for writing those extensive application test scripts!

Final tip: if you are customizing these examples, you can do a PowerShell "write-host" at any time. The output can be found in the matching Surebackup log. By default in:
  %programdata%\veeam\Backup\\Job..log

For example, for the SQL script, you would find something like:
[08.01.2016 15:57:59]  Info     [SureBackup] [SQLandSP] [ScriptTests] [Console] online : master

2015/11/30

Extending Surebackup with custom scripts : Sharepoint

Often I visit customers and ask them about their restore tests. The most common answer? We test the backups when we do the actual restores. To the question why not test more frequently, the most common answer would be "time and resources".

A couple of months ago, I actually visited a customer that tried to do a restore from backup. It failed: B&R was able to restore the backup, but the data inside seemed to be corrupt. The SQL server refused to mount the database. Exploring multiple restore points gave the same issue. It was a strange issue, because all backup files were consistent (no storage corruption) and the backup job did not have any failed states. The conclusion was Changed Block Tracking corruption. In light of the recent bugs in CBT, I wanted to emphasize again how critical it is to validate your backups. If the customer had tested his backups with, for example, the SQL test script included in v8, they might have caught the error before the actual restore failed.

This shows another thing I want to highlight. Surebackup is a framework, but your "verification" is only as good as your test. By default, Surebackup application tests are just port scans. This tells you that the service has started (it was able to bind to the port and it is answering), but it doesn't tell you anything about how well the service is performing. For example, the SQL service/instance could start, but maybe some databases could not be mounted inside the instance.

Few people visit this topic, but you can actually extend the framework. The fact that it supports PowerShell makes it quite simple to write more extensive tests.

So here is a small test for Sharepoint. I hacked it together today, so please reread the whole script to "Surebackup" my coding skills. It is rather basic, but you could actually use it for any kind of web service. It simply reads the content of a txt file in a certain site. If the content matches a predefined value, you know that:
a) The database was mounted inside the instance
b) Sharepoint is able to talk to the instance and query it
c) The webservice is responding to requests

So how do you get started? Well, first upload a txt file with some content. In my case, I uploaded the file contesttest.txt with the content "sharepoint is working succesfully", as shown below:


You can right-click the link and copy its location. Test if you can really access it this way, as shown below:


Now get the PowerShell goodness from https://github.com/tdewin/veeampowershell/blob/master/suresharepoint.ps1 and put it somewhere on the backup server. Then edit the file.



First of all, you can see that everything can be passed as a parameter (e.g. on the command line, use -server "ip" to change the IP address). Change the username and plaintext password to the user that will be used to authenticate against Sharepoint. Preferably an account with read-only rights and not the administrator as in my screenshot; this way you are sure it doesn't break anything ;).

You might wonder, do I need to provide the password in plaintext? No, you don't have to, actually; you can also follow this procedure, but it might make things more complex. Instead of plaintext passwords, you can use PowerShell encrypted passwords, but understand that if you want to decrypt the password, you need to be the same user as the one that encrypted it (the whole point of encrypting it, right?). When Surebackup runs, the script is actually being run by the backup service. So the account that is being used to decrypt the password is the service account used to run this service (as shown in the screenshot below).



If this is not the Local System account but a service account, you can use the following cmd script to create an encrypted password:
https://github.com/tdewin/veeampowershell/blob/master/encryptedpasstoclip/encryptedpasstoclip.cmd

Change the username in the bat file, run it, enter the password for the service account and finally enter the password for the account you want to use to authenticate to Sharepoint. The result should be that an encrypted password is put on your clipboard. Replace the whole password statement in the file, for example:
$pass = "01000000d08c9ddf0115d1118c7a00c04fc297eb01000000c9b320ead0059d409978380353923e8000000000020000000000106600000001000020000000b1816dffef13bc70672b55dfcee25a41488d5bb395ae28242b70afeb90938db9000000000e8000000002000020000000bd7da1d0d06893bed8b035c411c34f181b000aa9f0e4f46658eb3efe3e73c06840000000948652774f7f82848ba3065af8193c23fe25b773cea3ecf65957bdc12cdcc71868a82ba11d0475e65b321056a900d0571a05184b89132c0f21452642033c918340000000e8fcabb194c06c78ad01ee2192b73bf7ba799630adfedb6091dc1a629dc9d5a2a6025a64fcf74fe8a89d4a579a54c3538928ee0d22a57f22f6e50da240deaa62"
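For reference, this is roughly what the encrypted-password round trip looks like in plain PowerShell (a sketch; the account name is just an example, and the actual script may wire this up differently):
# encrypt: run this once as the service account and paste the output into the script
read-host -assecurestring -prompt "Password to encrypt" | convertfrom-securestring

# decrypt inside the script: this only works for the same user that encrypted it
$secpass = $pass | convertto-securestring
$cred = New-Object System.Management.Automation.PSCredential("DOMAIN\spreadonly", $secpass)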
If you got this far (or you skipped the whole password encryption part because your backup server is a Fort Knox anyway), we can now configure the script. Go to Veeam B&R and configure the test script as shown below in the application group (or in the linked jobs part):



Notice that I also configured "Arguments" as "-server %vm_ip%". This will pass the sandbox IP to the script directly.

Before you actually start up Surebackup, you can test the script against your production environment. If it doesn't work against the production environment, it will probably also fail against your lab environment. In case you configured an encrypted password with another account, you can temporarily override it with the following command (in case you did not, you can just run script.ps1 -server )
PS C:\Users\demo2> C:\scripts\suresp.ps1 -server -password (read-host -assecurestring -prompt "pass" | convertfrom-securestring)


Now if everything is green and you got a match, run Surebackup, and validate if you can get the same output in your lab


If it failed, you can actually check the logs for the output the script gave. Go to "%programdata%\Veeam\Backup\". It should contain a folder with your name. In this folder, there should be a log called Job.. You can open it with notepad


Scroll all the way down in the log and look for "[console]"


This should give you the output of the console. In this case, everything was ok!

2015/11/27

Veeam Application Report

As many of you know, although Veeam B&R has an agentless approach, it still makes sure that all the applications are consistently flushed just before the backup starts. To do this, Veeam B&R leverages VSS. One thing it also does is try to detect which applications are installed in which VM. This data is collected so that during a restore, you don't have to figure out which VM is holding what application and where exactly the application database is stored inside the VM (for example for Exchange, it will detect the path(s) leading to the EDB(s)).

Now a fellow SE colleague requested to add this "application detection" to the main GUI. They wanted to leverage the detection to sort out which VMs have what application installed. Adding it to the main GUI would however make it more complex, but you can actually leverage the data via PowerShell.

So here is a sample script you can use as a starting point:
https://raw.githubusercontent.com/tdewin/veeampowershell/master/veeam-per-app-detect.ps1

It generates a nice clean report with all the VMs that have detected applications (yes even Oracle so it is v9 ready), grouped per application. The output should look something like shown in the screenshot below:


Enjoy!