2014/12/08

Instant prereq for SCOM 2012 R2 on Windows 2012 R2

If you really don't want to figure out all the individual roles and features, run:
import-module servermanager
add-windowsfeature Web-Server,NET-Framework-45-ASPNET
add-windowsfeature Web-Asp-Net,Web-Asp-Net45,Web-Metabase,Web-Windows-Auth,Web-Request-Monitor,Web-Mgmt-Console,NET-WCF-HTTP-Activation45
This will install all the necessary roles and features for SCOM 2012 R2. If you get errors about CGI/ASP handlers not being registered, restart the server.
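
If you want to double-check the result afterwards, here is a quick sketch that lists any required feature that is still missing:
Get-WindowsFeature Web-Server,NET-Framework-45-ASPNET,Web-Asp-Net,Web-Asp-Net45,Web-Metabase,Web-Windows-Auth,Web-Request-Monitor,Web-Mgmt-Console,NET-WCF-HTTP-Activation45 | Where-Object { -not $_.Installed }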

2014/08/29

Veeam Explorer For Exchange without logs

So you made a backup of your Exchange server with Veeam and want to recover Exchange items. Well, that is quite easy with the Veeam Explorer for Exchange. But what if you have the log files on a different VMDK than the EDB file, and you excluded that disk? Will you be able to recover from the EDB alone?

That's a question that came up on our internal forums. Well, at first it looks like it is not possible. You will get this kind of message:


It says that you can't open the EDB because "Online Exchange backup detected, log replay is required".

So what can you do? Well, first start a Windows file-level recovery of your Exchange server.


This should mount the server disks under c:\veeamflr\exchange\ (depending on the VM name). Now start by extracting eseutil to a dedicated directory on your Veeam server. By default you can find the tools and DLLs under:
 c:\veeamflr\exchange\volume1\Program Files\Microsoft\Exchange Server\V15\Bin


Personally, I just copied everything that starts with ese, like so:
cp  "c:\veeamflr\exchange\volume1\Program Files\Microsoft\Exchange Server\V15\Bin\ese*" .

Alternatively, you can also copy them from your live Exchange server.

Now let's query the DB using eseutil with the /mh parameter, like so:
PS C:\eseextract> .\eseutil.exe /mh "C:\veeamflr\exchange\volume1\Program Files\Microsoft\Exchange Server\V15\Mailbox\Mailbox Database 1821327848\Mailbox Database 1821327848.edb"

 

It shows that the database is in a Dirty Shutdown state, matching the message from the explorer. So let's hard repair it without logs.
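
If you only care about that one line, a small sketch to filter the /mh output (run from the directory where you extracted eseutil):
# Show only the State line; expect "State: Dirty Shutdown" before the repair
.\eseutil.exe /mh "C:\veeamflr\exchange\volume1\Program Files\Microsoft\Exchange Server\V15\Mailbox\Mailbox Database 1821327848\Mailbox Database 1821327848.edb" | Select-String "State:"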

Now here is the tricky bit. When you start a File Level Recovery, a cache file holding all writes will be created under:
C:\Windows\system32\config\systemprofile\AppData\Local\mount_cache{}


The cache will be deleted automatically, but it could grow while you are repairing and fill up your whole C: drive. If you are not sure, copy the EDB to a second location where you have plenty of space. Also, you will see that the recovery process might need up to 2x the space of the original EDB. This is because it will create a TMP file to work on. So plan for that as well.
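
A trivial sketch to eyeball the free space per drive before you start:
# Free space in GB for the drives involved (adjust the drive letters to your setup)
Get-PSDrive C,E | Select-Object Name,@{n='FreeGB';e={[math]::Round($_.Free/1GB,1)}}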

In my scenario, I kept the file in the original location but specified that the TMP file should go on another drive. To recover, use eseutil.exe /p (optionally specifying the /t parameter for the TMP file):
PS C:\eseextract> .\eseutil.exe /p "C:\veeamflr\exchange\volume1\Program Files\Microsoft\Exchange Server\V15\Mailbox\Mailbox Database 1821327848\Mailbox Database 1821327848.edb" /t "E:\tmp\tmp.edb"


It will warn you that you might potentially lose data. However, remember we are reading the backup read-only and redirecting writes to the mount_cache file, so no harm done.



After some time it should be recovered. You can then validate it again with the /mh parameter, like so:
PS C:\eseextract> .\eseutil.exe /mh "C:\veeamflr\exchange\volume1\Program Files\Microsoft\Exchange Server\V15\Mailbox\Mailbox Database 1821327848\Mailbox Database 1821327848.edb"


Your EDB should now be in Clean Shutdown. Now open up the Veeam Explorer for Exchange from the start menu. (If you can't find it, it's under "C:\Program Files\Veeam\Backup and Replication\ExchangeExplorer\Veeam.Exchange.Explorer.exe" by default.)

Then push "add store" and point to your EDB which is under the original EDB path we used with eseutil. In my case:
C:\VeeamFLR\exchange\Volume1\Program Files\Microsoft\Exchange Server\V15\Mailbox\Mailbox Database 1821327848\Mailbox Database 1821327848.edb


For the log directory, point to the directory holding the EDB. You should now be able to click open and get it to work.


2014/08/21

Removing the SCOM 2012 R2 agent on a core edition

Recently I reinstalled the whole SCOM setup in my lab, just because I wanted to test with the latest versions like R2 and needed to use the 180-day trial license for that. This left me with a problem: some of my servers kept reporting to my old SCOM server although it was obviously down. Re-adding the servers to the current SCOM server didn't work. In the logs, I saw the following messages reappear ("scom" being my old server):
The description for Event ID 21006 from source OpsMgr Connector cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

scom

5723
10060
A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

So I decided to remove it from SCOM and then manually uninstall the agent. One problem: one of the servers is a core edition. Good luck launching appwiz.cpl on that one. Luckily, it is quite easy to find out how you need to uninstall it. Launch regedit and go to:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\
For every installed program, there should be a subkey under which you can see the DisplayName and the UninstallString.

Alternatively, you can use the following script:
Get-ChildItem 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\' | ForEach-Object { Write-Host ("Soft : {0} `n`t {1}" -f $_.GetValue("DisplayName"), $_.GetValue("UninstallString")) }
It should output something like this
Soft :

Soft : Microsoft Visual C++ 2008 Redistributable - x64 9.0.30729.4148
         MsiExec.exe /X{4B6C7001-C7D6-3710-913E-5BC23FCE91E6}
Soft : VMware Tools
         MsiExec.exe /X{4D80C805-67C3-4525-A7BA-DC43215E9167}
Soft : Microsoft Monitoring Agent
         MsiExec.exe /I{786970C5-E6F6-4A41-B238-AE25D4B91EEA}
So to uninstall the agent, first stop the service, just to be sure:
net stop healthservice
Then uninstall the agent. I used the /X flag (uninstall) instead of the /I flag (install/repair):
MsiExec.exe /X{786970C5-E6F6-4A41-B238-AE25D4B91EEA}
Btw, at first the command didn't want to do anything; rebooting the server helped. A GUI should appear asking if you are sure you want to uninstall.
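
If you'd rather script the GUID lookup than copy it by hand, something along these lines should work (a sketch, using the display name from the output above):
# Find the uninstall key by DisplayName; the key name itself is the product GUID
$app = Get-ChildItem 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\' | Where-Object { $_.GetValue('DisplayName') -eq 'Microsoft Monitoring Agent' }
Start-Process msiexec.exe -ArgumentList "/X$($app.PSChildName)" -Wait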

Then I deleted the directories under program files just to be sure that no residue was left on the filesystem:
rmdir "C:\Program Files\System Center Operations Manager" /s
rmdir "C:\Program Files\Microsoft Monitoring Agent" /s
Redeployed, and let's hope I won't see those nasty error messages reappear.

2014/05/21

What is the buzz around Backup from Storage Snapshots anyway?

It is always great when vendors announce new features, because most likely they are solving issues existing customers have. One of the better features Veeam released is Backup from Storage Snapshots. In v7, this feature supports the HP StoreServ and HP StoreVirtual storage platforms. As recently announced, in v8 this feature will be extended to NetApp.

But what problem does it really solve? When I am talking to customers, I see two kinds: the ones that have the actual problem and the ones that don't. You can easily recognize them, because the first category immediately says: "We need this!"

So let's look at the problem first, and then explain how Backup from Storage Snapshots (BfSS) works.

With the introduction of virtualisation, there are actually more layers that have to be made consistent before you can make a backup. In the old days, it was just the application, the operating system, and then the hardware (SAN) underneath that. Now a new layer has been introduced: the virtualisation layer itself.

Since Veeam backs up at the VM level, it makes sense to take this layer into account. The way Veeam does it is by taking VM snapshots (for VMware). To make everything consistent in the VM, there are a couple of possibilities:
  • Use Veeam Application Aware Image Processing: basically talk to VSS directly via a runtime component. If necessary, it can also truncate logs for Exchange or SQL
  • Use VMware Tools: For Windows it will also do a (filesystem level) integration with VSS. For other platforms (or if you prefer), you can use pre-snapshot/post-snapshot scripts.
Once everything is consistent in the VM, Veeam then triggers a VMware snapshot. When that snapshot is created, everything can be released in the guest because you have a "consistent photo" of your VM. But what happens underneath?


Before the snapshot is created, the VM is happily reading from and writing to the VMDK.


After a snapshot has been created, VMware will create a delta disk. This disk will be very small in the beginning. However, while the snapshot (and thus the delta disk) exists, writes are redirected to this delta disk. The great advantage is that blocks that have not been overwritten are still read from the original VMDK, which itself is no longer modified. This means that we can back up the original VMDK knowing that it is in a consistent state and won't be altered during the backup.

Important: VMware snapshots are not "transaction logs". If a block is updated for a second time, the block in the delta disk is updated in place, thus not taking extra space. That means the delta can grow at most to the size of the original VMDK.

Well so far, so good. But what is the problem with this?


If you have a VM that is not very I/O active, there is not really a problem. Because of the changed block tracking (CBT) feature, Veeam only has to back up the blocks that have changed between backups. That means fast backups, and due to the low I/O, the delta won't grow very fast.

But what if you have an I/O-active VM? Well, then you have a couple of problems. First of all, your snapshots will grow in 16MB extents (or at least that is what I could find on the net). But every time the delta grows, VMware needs to lock the clustered volume (VMFS) to allocate more space for the VMDK (metadata updates). That means extra I/O, but also a possible impact on other VMs that run on the same volume due to these locks. This problem also occurs with thin provisioning.

Secondly, if you are using thin-provisioned VMFS volumes, the VMFS volumes will consume more and more space on the SAN. When you delete the snapshot, that space won't be automatically reclaimed. VMware now supports the UNMAP VAAI primitive, but as far as I know, it is not an automatic process:
http://cormachogan.com/2013/11/27/vsphere-5-5-storage-enhancements-part-4-unmap/

Finally, because it is an I/O-active VM, it has probably changed a lot of blocks between backups, meaning that the VM backup might take a long time.

So if you can reduce the time the snapshot is active, the snapshot won't have the chance to grow that big. You might not avoid the problems completely, but at least the impact will be a lot smaller.

But it can get worse. What happens when you delete (commit) the snapshot? Of course your data is not just discarded but needs to be re-applied to the original VMDK. However, writes are still being made to that snapshot, so you cannot just start merging. Because what happens to a block you are committing back and updating at the same time? Well, for that VMware uses a consolidate helper snapshot.


Basically, VMware creates a second snapshot. All writes are redirected to this helper. Then it can start committing the data back to the original VMDK.


Once that is done, the hope is of course that the consolidate helper snapshot is smaller than the original snapshot. So for example, if the backup took 4 hours but consolidating only took 10 minutes, the helper snapshot might be only a fraction of the size of the original snapshot.

What is important to notice is that the bigger the snapshot, the more extra I/O will be generated during the commit. You need to read the blocks from the snapshot and then overwrite them in the original VMDK. That means that during a commit, you might notice a performance impact on the volume and thus on your original VM as well.

But what happens after that commit? You are left with the consolidate helper, so you need to commit that too. In 3.5, VMware just froze the VM (holding off all I/O) and committed the delta file to the VMDK (called a synchronous commit). That means you could have huge freeze times (stuns). At one point, VMware improved this process by creating additional helper snapshots and going through the same process over and over again until it is confident that it can commit the snapshot in a short time.

There are actually two parameters that impact this process (see the PowerCLI sketch after the KB link below):
  • snapshot.maxIterations : how many times VMware will repeat this process of creating helper snapshots and committing them. After all iterations are over, the VM will be stunned anyway and the commit will be forced. By default, VMware goes through at most 10 iterations
  • snapshot.maxConsolidateTime : the estimated time VMware is allowed to stun your VM. The default is 6 seconds. For example, if after 3 iterations VMware is confident it can commit the blocks of the helper snapshot in less than 6 seconds, it will freeze all I/O (stun), commit the snapshot, continue I/O (unstun), and not go through any additional iterations.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2039754
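
These are per-VM advanced settings, so with VMware PowerCLI you could set them roughly like this (a sketch; the VM name is an example, and as noted below, only change these together with support):
# Lower the acceptable estimated stun time for one VM (default is 6 seconds)
$vm = Get-VM -Name "exchange01"
New-AdvancedSetting -Entity $vm -Name "snapshot.maxConsolidateTime" -Value 3 -Force -Confirm:$false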

So if you are running an I/O-intensive program, the impact might be huge if you have to go through several iterations. Also imagine that, instead of getting smaller consolidate helpers, you get bigger helpers after several iterations; the stun time might become huge instead of smaller. In the KB article there is an example where, if you start with a 5-minute stun time, you might actually end up with a 30-minute stun time.

As a side note, I have to thank my colleague Andreas for pointing us to these parameters. While they were undocumented back then, they helped me find the info I needed. His article describes the process of lowering the max consolidate time for an Exchange DAG cluster. Granted, VMware might go through additional iterations, but the result might be that the stun time is small enough not to cause any failover. Like he suggests as well, only do this together with support. If your I/O is too high, you might actually amplify the problem as described above.

The conclusion is that, if you keep the delta file small, the commit will be much faster, will need a lot fewer iterations, and the stun time might be minimized (even if you go through the max iterations).

So how does BfSS help then? Well, when you use BfSS, a storage snapshot is created right after the snapshot on the VM level is created. That means you can then instantly delete the snapshot on the VM level.


So as you can see the start is the same.
 

But then you create a snapshot by talking to the SAN/NAS device that is hosting the VMFS volume / NFS share. This means your VM snapshot is "included" in the SAN snapshot, and this allows you to instantly commit the snapshot on the VM level.


Afterwards, the Veeam Backup & Replication proxy can read the data directly via the storage controller. Granted, Veeam will still create a VM snapshot, but you can imagine that a delta of 2 minutes will be 100x smaller than a delta of 3 hours.

Sometimes customers ask me if you are not just shifting the problem. From a thin provisioning perspective, no, because the SAN is aware of the blocks it deletes. From a performance perspective, SAN arrays are designed to do this. In fact, snapshots are NetApp's bread and butter. They just redirect pointers, so deleting a snapshot is just throwing away the pointers. So no nasty commit times there.

But there is another bonus with storage snapshots that will be exclusively available for NetApp. VMware has still not solved the stun problem that you can have with VMs hosted on NFS volumes when using the Hot-Add backup mode. Backup & Replication has a way around this, but it still requires you to deploy a proxy on each host.

With BfSS, v8 will also implement an NFS client in the proxy component for NetApp. That means that even though you use NFS, you can use a "Direct SAN" approach (or, as I like to call it, Direct NAS). First of all, it means you won't have those nasty stuns, but more importantly, you will read the data where it resides. That means no extra overhead on the ESXi side (no CPU/MEM needed!) when you are running your backups.

So although demoing this feature might not look impressive (unless you have this problem, of course), you can see that it is a major feature that has been added to Veeam Backup & Replication. The impact of backing up I/O-intensive VMs will be drastically lower, allowing you to do more backups and thus yielding a better RPO.

*Edit* I also found that VMware has added a new parameter in one of its patches, but what snapshot.asyncConsolidate.forceSync = "FALSE" does is not described.

2014/04/16

Powershell wrapper for Beta Veeam Explorer for Active Directory

The new Veeam Explorer for Active Directory is cool stuff. I blogged about it earlier, showing how you can use it today. However, it also shows that some manual steps have to be taken. Well, if you work as a sales engineer, you have to do these demos a lot, meaning a lot of repetitive steps.

Then today, something on the Veeam forum inspired me. A guy was trying to start a Windows FLR via PowerShell. So I decided to make a small wrapper to start the FLR and automate all those manual steps... well, it sorta got "out of hand".

You can get the wrapper script here. Save it on the backup server. Make sure to unblock the PowerShell script (go to the file's properties; under the General tab, just above the OK button, there should be a warning about downloaded content). Also make sure you have the correct execution policy set up.

Then create a new shortcut. In this shortcut, specify the following parameter:
powershell.exe -file "[path\to\script]\start-vbradrestorefromlatestbackup.ps1"

You can notice in the screenshot that I added some parameters. This is where things got "out of hand".

-server [server] : auto-select a certain VM. If you don't specify it, the wizard should propose all the possible VMs in the backup files known to your Backup & Replication instance


-latest : auto-select the latest restore point. If you don't specify it, the wizard should propose the available restore points for the VM you selected


-autodiscovery : try to connect to the production server to learn where the ntds.dit file is stored. By default this is disabled and the wizard will use the default path "c:\windows\ntds\ntds.dit". I felt it was safer not to automatically connect to production. Note that WinRM should be enabled, as the script uses Invoke-Command to read the registry key on the production server.

-autodiscoveryserver [dns production ad] : give the IP or DNS name to connect to for the auto-discovery. If not specified but autodiscovery is on, the wizard will try to extract the DNS name from the restore point or use the VM name as the DNS name

-askcredentials : ask for credentials to do the autodiscovery. If you don't specify it, the script will just use Invoke-Command with your current credentials.


-filepath : if you want to manually specify the path to the ntds.dit file (assuming you didn't enable autodiscovery)

-adexplorer : if you didn't install the explorer in the default path
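
Putting it all together, a fully parameterized shortcut target could look like this (the path and server name are just examples):
powershell.exe -file "e:\scripts\start-vbradrestorefromlatestbackup.ps1" -server ad02 -latest -autodiscovery -askcredentials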

Once you have the shortcut, right-click it and make sure to run it as an administrator.







If you want to give it a shiny icon, you can do that in the shortcut settings as well. Change the icon and browse to the explorer path. By default it is under "C:\Program Files\Veeam\Backup and Replication\ActiveDirectoryExplorer\Veeam.ActiveDirectory.Explorer.exe"


Now you should be up and running. Just click your shiny new shortcut. It should launch the wizard, perform the FLR, and automatically load the ntds.dit file into the VEAD.


You will notice the PowerShell window stays open. That is because it is waiting for you to close the VEAD so it can automatically stop the FLR and clean everything up.



2014/03/18

Get even more control over your Veeam schedules

Veeam has a pretty extensive scheduler for jobs. However, sometimes customers really want strange schedules to run their backups. I always try to change the mindset. Sometimes they want alternating backups just because they don't know the backup copy job exists in v7. In that case, it's like Santa bringing them a Christmas present when you explain that they can actually copy their backups really easily from repository to repository.

However, sometimes they have exotic requests. For example: "we want to run an active full every 2 weeks. Not every week or every month, no, every 2 weeks." So what can you do in this case? Well, use the Windows Task Scheduler and a simple PowerShell script that executes your logic.

If you want to use Veeam's PowerShell snap-in, make sure you explicitly install it. It is provided as part of the Veeam Backup & Replication installer, but is not selected by default. To validate, just check whether you can find the PowerShell option in the main menu.


If you don't have it installed, you can fire up the main installer or just locate the corresponding MSI on the ISO (":\Backup\BPS_x64.msi").



The simplest script can be found here:
simplestartjob.ps1
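
If you just want an idea of what it boils down to, a minimal sketch (the job name is an example):
# Load the Veeam snap-in and start the job by name
Add-PSSnapin VeeamPSSnapin
Start-VBRJob -Job (Get-VBRJob -Name "Backup Job 1")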

Change the name of the job in the script to match the job you want to start. Then go to the Windows Task Scheduler and make a new task:


Then create a new task. Personally, I like to add my Veeam tasks in a custom folder so that they are all grouped together.


On the General page, enable:
  • "Run whether user is logged on or not"
  • "Run with highest privileges" : If you want to know why you need to enable this, find more info at the end of the blog article. If you don't care, go ahead and continue

On the action tab, add a new action


Fill in the correct settings:
  • Program : powershell
  • Arguments : "e:\scripts\simplestartjob.ps1"
Use quotes to be safe (for example, if you have spaces in your path).


The result should be something like this:


Now create a trigger


You can schedule it daily, weekly or monthly



Finally click ok and enter the credentials.


When the script runs it should start your job



Also I created some scripts in the past for customers:
  • Activefull.ps1 : Active full which should be run at a special time. With the Windows Task Scheduler you can say, for example, the first and third week of the month. Alternatively, you can do the alternation in the script itself (see the sketch after this list):
    • alternate weeks: if( ([int]$(get-date -uformat "%V"))%2 -eq 1) { do_something }
    • alternate days: if( ([int]$(get-date -uformat "%j"))%2 -eq 1) { do_something }
  • Startjob.ps1 : Launcher script. Instead of making a different script for each job, you can reuse the script and pass the job name via a parameter
    • Please note that the argument needs single quotes around the job name. Correct quotes are important: "c:\path\to\script\startjob.ps1 'my job name'"
  • Stoptapejob : Stop a tape job after a certain time. A customer had a single drive, and tape jobs would "hang" if there was no tape in the slots, so he was canceling them manually every day.
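
As a sketch, the alternating-week active full could be as short as this (the job name is an example; Start-VBRJob should accept a -FullBackup switch in v7, but verify on your version):
Add-PSSnapin VeeamPSSnapin
# Run an active full only in odd ISO weeks
if( ([int]$(get-date -uformat "%V"))%2 -eq 1) {
    Start-VBRJob -Job (Get-VBRJob -Name "Backup Job 1") -FullBackup
}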

Why you need to run with highest privileges
Actually, if you just open a PowerShell prompt in non-admin mode and in admin mode, you will see why:


When you run in non-admin mode, you will get a SQL error. Actually, it is not really about admin mode, but about the fact that the current user doesn't have access to the database. It is actually the same requirement as when you want to give users fine-grained access to the GUI. So first of all, make sure you set up the correct permissions in B&R itself (Main menu > Users and roles).



Next to that, users also need permissions on the database behind B&R.

If you give the user db_owner on the Veeam DB, it should work as well.
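
For reference, a sketch of that grant via Invoke-Sqlcmd (the instance and database names here are assumptions based on the v7 defaults, and the login must already exist as a user in the database; adjust everything to your setup):
# Add the account to the db_owner role on the Veeam configuration database
Invoke-Sqlcmd -ServerInstance ".\VEEAMSQL2008R2" -Database "VeeamBackup" -Query "EXEC sp_addrolemember 'db_owner', 'DOMAIN\user';"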





2014/03/17

Test driving the new Veeam Explorer for Active Directory

If you are a Veeam fan, you certainly need to read the word of Gostev. Basically, you enroll on the forum and every week you will get a mail from the forum containing the word of Gostev. He is the product manager for Backup & Replication, and if you want to be the first to know all about the new stuff in IT (not only Veeam), you'll see that this weekly mail saves you from reading 25 blog articles a day (a quote I borrowed from one of my colleagues ;)).

This week wasn't any different. Veeam launched the public beta of the Veeam Explorer for Active Directory. So what is the big difference between the AD AIR wizard and this one? Well, you don't need to power on a virtual lab to extract a single user. Instead, Veeam reads the ntds.dit (the AD database file) directly via a file-level recovery. This reduces recovery time drastically. The coolest part? It's a public beta, so everybody can test it!

So how do you get started? Well, you will need a Veeam Backup & Replication v7 server installed. Then you can download the beta via the forums:
http://forums.veeam.com/veeam-backup-replication-f2/veeam-explorer-for-active-directory-t21038.html

Basically, you get a zip file containing an MSI installer you can just next-next-install.


After that you should find the VEAD in your start menu


Now how do you get started? Well, like with all explorer beta versions (Exchange, SharePoint), you will need to start a file-level recovery and point to the database. So let's start with the easy part, the file-level recovery. Go to the main menu and choose restore.


Then choose to do a Windows guest file-level recovery


Find your Active Directory server


And start the guest files recovery wizard


Remember, when you click finish, the recovery wizard will start, but no files will actually be recovered to your original machine.


Now you should see the file browser. But actually, Veeam mounts the file-level recovery under:
C:\VeeamFLR\[vmname]\Volume[n]

In my case the c: drive is
C:\VeeamFLR\ad02\Volume1

Keep the FLR wizard open during the whole process; otherwise Veeam will dismount the disk.

Now start the VEAD and click add database


Now you should point to the ntds.dit file


I first tried to do a search through "C:\VeeamFLR\ad02\Volume1". This gave me 2 results:


But when I tried to mount the one in system32 I got the following error:


Luckily, somebody had already posted the solution on the forum. To find the correct path, go to your production server and look at:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters

However, since my ad02 server is a core server, I created this small script you can invoke remotely to find the path:
$server = "ad02"
# Requires WinRM on the target; reads the NTDS parameters remotely
Invoke-Command -ComputerName $server { Get-ItemProperty -Path "Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\" }
This should show you the correct path under "DSA Database file"



So this says the file should be under C:\Windows\NTDS\ntds.dit. If we match this to the file-level recovery path, we get something like this:
C:\VeeamFLR\ad02\Volume1\Windows\NTDS\ntds.dit
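
A tiny sketch of that mapping, if you want to script it (the mount root comes from the FLR session above):
# Translate the production path into the FLR mount path
$prodPath = "C:\Windows\NTDS\ntds.dit"
$prodPath -replace '^C:\\', 'C:\VeeamFLR\ad02\Volume1\'   # C:\VeeamFLR\ad02\Volume1\Windows\NTDS\ntds.dit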

Now again try to add the database file and point to the correct file:


After loading the db, the VEAD should show you your AD structure:


So let's try to restore a user. First of all, you can easily see the attributes by right-clicking the user:


Opening the attributes, I was wondering where they hid the recovery of individual attributes. Well, it is actually part of the user recovery wizard. So let's follow the restore wizard:


First define the AD to restore to and the credentials to use


Then specify where to restore the user


In the next step you can decide what to restore


This is the "twist" about the wizard, you can actually recover the user with the password. So if by accident you deleted a user during the night, you can recover him and next day, the end user won't even notice that he was deleted.


 Do you want to enable the account? :)


Final step is to click the restore button.

What is also cool is that you can restore not only an individual user, but also a whole OU if required:


Another great thing is that it should be Exchange-aware. So if you restore a user, it should connect him to the correct mailbox as well.

So have fun playing with the beta! And remember kids, it's a beta, don't test it in production :)