Using Veeam metadata for efficient extraction of Backup artefacts (1/3)

Written by Maxence Fossat - 08/02/2024 - in CSIRT

Veeam Backup & Replication is a widely used software suite for creating and managing backups of virtual, physical and cloud machines. During remote incident response, where efficient data access is key, Veeam metadata files can be used to list and search for Backup objects. This article explores the structure of Veeam metadata and shows how to use a Velociraptor artifact to restructure this data.

Introduction

Data protection and disaster recovery are key aspects of any mature information system. As a powerful backup management solution, Veeam Backup & Replication1 has therefore been widely adopted by IT administrators around the world. It allows creating compressed and/or encrypted image-level backups of virtual, physical and cloud machines, and restoring from them.

In case of a security incident, backups might be the last resort for forensic investigators. Yet there is still a lack of forensic tools or methodologies to remotely access and analyse Veeam backups. Forensic analysts usually need either to revert the machines to the desired state to perform their analysis, or to connect remotely to the Veeam server and use Veeam's proprietary software to find the desired backup and extract its contents.

This article is the first in a series that explores combining open-source software and Veeam's publicly accessible software to efficiently list, choose and extract from backups remotely with minimal network bandwidth consumption. It starts with an exploration of Veeam's .vbm files, the Veeam backup chain metadata files, and shows how they can be used to remotely list backup objects of unencrypted backups with the Velociraptor2 open-source tool.

Throughout this article, we will refer to Veeam Backup & Replication as VBR.

The Lab

The initial research for this article featured a lab environment composed of the following elements:

  • A Hyper-V3 standalone server, hosting:
    • A Windows 10 virtual machine
    • A Debian 12 virtual machine
  • A Windows Server 2022 virtual machine, configured as a Domain Controller, with Veeam Agent for Microsoft Windows4 installed
  • A Windows 10 virtual machine, joined to the domain, with Veeam Agent for Microsoft Windows installed

Both of the virtual machines that were not hosted on the Hyper-V server were used to simulate physical machines where the Veeam Agent is responsible for copying raw disk images.

Limitations

The only Backup Method used throughout the research was Forever Forward Incremental (FFI)5. As such, some concepts in this article might not apply to the Forward Incremental (FI)6 and Reverse Incremental (RI) Backup Methods7; confirming this would require further testing. The concept of backup methods is briefly explained in the "Backup Chains" section of this article and in greater detail in VBR's documentation8.

The Backup Chain Format9 used is the default one for VBR version 12 and onward. This article makes no assumptions about legacy Backup Chain Formats or single-file backups.

Since there was only one Hyper-V host, the Replication10 feature of VBR was not tested. This article thus focuses entirely on the Backup11 feature. Because the lab environment did not include any VMware vSphere hypervisor, this article may also contain inaccurate or insufficient information on backups of VMware vSphere VMs, and it does not cover cloud VMs (AWS EC2, Microsoft Azure VMs, Google Cloud VMs).

Fundamental concepts of Veeam backups

Objects

The first step in backing up machines using VBR is to add Objects to the Inventory. The Inventory is a list of assets which are potential sources and/or targets for backup jobs, replication jobs and other activities. Different types of Objects can be registered to the Inventory. For virtual infrastructures, VMware vSphere and Microsoft Hyper-V hypervisors can be added, either as standalone servers (ESXi or Hyper-V hosts) or via clustering/management servers (vCenter Server, SCVMM Servers, Hyper-V clusters)12. These hypervisors will then be scanned to establish a list of VMs, each VM represented as an Object in the backup infrastructure.

In the following example, two virtual machines were added by scanning the Microsoft Hyper-V server at IP address 192.168.122.35:

The "Inventory" menu of Veeam Backup and Replication showing the following hierarchy: Virtual Infrastructure, Microsoft Hyper-V, Standalone Hosts. The Hyper-V host is selected and the virtual machines inside it are listed.
Microsoft Hyper-V standalone host registered in the Inventory

For physical infrastructures, a Protection Group13 must be created. It is a set of computers on which the Veeam Agent will be remotely pushed and installed to perform backup jobs. Protection Groups can be populated with a list of individual computers, Active Directory containers (Domains, OUs, etc.) or a dynamic CSV file exported from an asset management system. Each item in the Protection Group is an Object.

In the following example, two Objects were added by targeting the corporation.local domain from the LAB-DC server:

The "Inventory" menu of Veeam Backup and Replication showing the following hierarchy: Physical Infrastructure, Protection Group 1. Protection Group 1 is selected and the computers inside it are listed.
Protection Groups in the Inventory

Storages

In order to know where to store backups, VBR needs to have at least one Backup Repository14. A default one is chosen at installation by the administrator. It can be physical storage mounted directly on the Veeam Server or a Network Attached Storage (NAS). Directories specified as Backup Repositories are used as containers for Storages, which are the files where the compressed (and sometimes encrypted) backed up data resides.

In this example, the Default Backup Repository is set to the C:\Backup folder directly on the Veeam server:

The "Backup Infrastructure" menu of Veeam showing the Default Backup Repository.
Default Backup Repository

Hosts

In VBR terminology, servers that host or manage Objects (VMs, physical machines, etc.) are called Hosts. These Hosts, also known as Managed Servers, are usually the hypervisors that were added to the Inventory as well as the Veeam servers that manage physical machines backed up via the Veeam Agent.

In the following example, two Hosts are part of the Backup Infrastructure: the Veeam server (VEEAM-SRV) and the Hyper-V hypervisor.

The "Backup Infrastructure" menu of Veeam showing the list of Managed Servers.
List of Hosts or Managed Servers

Backup Jobs and Policies

Specific options for backups, such as compression level, encryption, targets to back up, Backup repository, retention policy, scheduling of automatic launch, etc. are configured in Backup Jobs15 and Backup Policies16. The difference between Jobs and Policies is that Jobs are managed by the backup server which controls scheduling and executes the backup remotely, whereas Policies configure Veeam agents to schedule and execute backups independently.

In this example, a Backup Job has been created for the Hyper-V VMs and a Backup Policy has been pushed to the Veeam Agents.

List of Backup Jobs and Policies

Restore Points

Every time a Backup Job or Policy is run, it creates a Restore Point in a particular Storage inside the Backup repository. Every Restore Point can be used to revert the machine state to what it was when the Backup Job or Policy was run. You can think of Restore Points as "snapshots" of data at specific points in time.

Here, the Objects that were backed up due to a Job or Policy are listed, with the number of Restore Points available:

A list of machines which were backed up due to a Backup Job or Backup Policy.
List of Objects that were backed up

Looking at each Object in more detail provides the name (Name) and size (Backup Size) of the Storage file, the size of the data that was actually copied (Data Size), the date of the backup and its type (Full, Increment, etc.). The types of Restore Points are explained in the next section on Backup Chains.

The properties of the Hyper-V Backup Job, listing the Objects and for each Object the Restore Points with dates, Storage files and details.
Properties of the Hyper-V Backup Job

Backup Chains

Backup Chains17 are sequences of backup files created by Jobs. Each Backup Chain provides the ability to recover data for a particular Job on an Object, by walking back files in the chain up to the desired Restore Point. The main types of backup files are:

  • VBK: Veeam full backup files. They store copies of full disk images.
  • VIB: Veeam incremental backup files. They store incremental changes of disk images.
  • VRB: Veeam reverse incremental backup files. They store incremental changes of disk images in reverse order. This is only used in the Reverse Incremental (RI) backup method.
  • VBM: Veeam backup chain metadata files. They store information about the Backup Job, the Objects processed by the Backup Job, the number and structure of backup files, the Restore Points, and so on.

In VBR terminology, .vbk, .vib and .vrb files can be considered as Storages.
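As a small illustration of this naming scheme, the following Python sketch maps file extensions to the backup file types listed above. The classify and is_storage helpers are hypothetical names introduced for this example; the two sample file names are taken from the lab metadata shown later in this article.

```python
from pathlib import PureWindowsPath

# Extension-to-type mapping taken from the list above.
BACKUP_FILE_TYPES = {
    ".vbk": "full backup",
    ".vib": "incremental backup",
    ".vrb": "reverse incremental backup",
    ".vbm": "backup chain metadata",
}

# Extensions that VBR terminology treats as Storages.
STORAGE_EXTENSIONS = {".vbk", ".vib", ".vrb"}


def classify(path):
    """Return the backup file type for a Windows-style path (hypothetical helper)."""
    suffix = PureWindowsPath(path).suffix.lower()
    return BACKUP_FILE_TYPES.get(suffix, "unknown")


def is_storage(path):
    """Tell whether the file holds backed-up data rather than metadata."""
    return PureWindowsPath(path).suffix.lower() in STORAGE_EXTENSIONS


full = r"C:\Backup\Backup Job Hyper-V VMs\srv-web.3568f913-2f5d-419d-829f-810839ab6e11D2024-01-03T164550_748D.vbk"
meta = r"C:\Backup\Backup Job Hyper-V VMs\srv-web_FF4FA.vbm"

print(classify(full), is_storage(full))
print(classify(meta), is_storage(meta))
```

PureWindowsPath is used so that the sketch handles Windows paths correctly even when the analysis workstation runs another operating system.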

All backup files created by the backup job reside in a dedicated job folder in the backup repository. For example, the first backup job that was created in the lab was named Backup Job Hyper-V VMs and configured to be stored on the default backup repository C:\Backup\. VBR created the folder C:\Backup\Backup Job Hyper-V VMs\ and stored all backup files produced by the job in this folder.

Windows file explorer showing a list of VBK and VIB files inside the folder C:\Backup\Backup Job Hyper-V VMs
Files generated by the "Backup Job Hyper-V VMs" backup job

The types of backup files and how VBR orders them in the backup chain depend on the chosen backup method. During this research, only the FFI (Forever Forward Incremental) backup method was tested. This method creates a backup chain that consists of the first full backup file followed by increments.

Diagram showing a first full backup created on Sunday, then incremental backups created each day of the week.
Source: https://helpcenter.veeam.com/docs/backup/hyperv/incremental_forever_bac…

In most cases, the first run of the Backup Job or Policy will be the only one to create full backups. However, certain configuration changes can trigger a new full backup. For example, in our lab, changing the Backup Policy to remove encryption caused it to recreate a full backup.

Properties of the Agent Backup Policy. Shows that there are 3 Restore Points, the first is a Full backup, then an Increment, then another Full backup.
Properties of the Agent Backup Policy

Veeam Configuration Database

VBR stores all information about backup infrastructure, jobs settings, job history, sessions and other configuration data in a Database server often referred to as the Configuration Database18. When deploying VBR, you must choose the placement of the configuration database. It may be either a local or a remote Database Server.

Exploring Veeam backup chain metadata files

Veeam backup chain metadata files are XML files containing all necessary information to list recoverable/extractable images. Starting from VBR version 12, the default behaviour of a backup job is to create a metadata file for each "workload" (each Object is usually a workload, so a metadata file would be created for each Object). When the Backup Job is run again, the metadata file is updated accordingly.

This exploration will start with unencrypted backups, since their metadata is entirely in cleartext. Stripping the XML data from a VBM file of all its values and attributes gives the following hierarchy for an unencrypted backup:

<BackupMeta>
    <Backup/>           <!-- Backup Chain information -->
    <BackupMetaInfo>
        <Hosts>         <!-- List of Hosts -->
            <Host/>
        </Hosts>
        <Storages>      <!-- List of Storages -->
            <Storage/>
            <Storage/>
            <Storage/>
        </Storages>
        <Points>        <!-- List of Restore Points -->
            <Point/>
            <Point/>
            <Point/>
        </Points>
        <Objects>       <!-- List of backed up Objects -->
            <Object></Object>
        </Objects>
        <Oibs>          <!-- Objects In Backup (special Veeam Configuration Database structure) -->
            <OIB/>
            <OIB/>
            <OIB/>
        </Oibs>
        <LogBackupInfo/>
        <CustomMetaOptions>
            <BackupOptionsInfo>
                <VmObjectId/>
                <BackupGfsOption/>
                <BackupRetentionOption/>
            </BackupOptionsInfo>
        </CustomMetaOptions>
    </BackupMetaInfo>
</BackupMeta>
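As a minimal sketch of how this hierarchy can be walked, the following Python snippet uses the standard library's xml.etree.ElementTree module to count the entries of each list under BackupMetaInfo. The summarize_vbm helper and the inline sample are illustrative assumptions: a real .vbm file would be read from disk and carries many attributes that are omitted here.

```python
import xml.etree.ElementTree as ET

# Stripped-down sample mirroring the hierarchy above (all values omitted).
SAMPLE_VBM = """
<BackupMeta>
    <Backup/>
    <BackupMetaInfo>
        <Hosts><Host/></Hosts>
        <Storages><Storage/><Storage/><Storage/></Storages>
        <Points><Point/><Point/><Point/></Points>
        <Objects><Object></Object></Objects>
        <Oibs><OIB/><OIB/><OIB/></Oibs>
    </BackupMetaInfo>
</BackupMeta>
"""


def summarize_vbm(xml_text):
    """Count the child elements of each list found under BackupMetaInfo."""
    root = ET.fromstring(xml_text)
    info = root.find("BackupMetaInfo")
    counts = {}
    for list_name in ("Hosts", "Storages", "Points", "Objects", "Oibs"):
        container = info.find(list_name)
        # len() on an Element returns its number of direct children.
        counts[list_name] = 0 if container is None else len(container)
    return counts


print(summarize_vbm(SAMPLE_VBM))
```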

OIBs (Objects In Backup)

One of the interesting artefacts found in the VBM files is the list of OIBs (Objects In Backup)19. OIB is an undocumented structure that is used at the Veeam configuration database's level for each backup. In theory, each Object that is backed up to a Restore Point would correlate to an OIB, although this has not been thoroughly tested.

This is where things become interesting! XML attributes of the OIB elements contain a plethora of interesting information. In the following data, some attributes were truncated for readability.

<OIB Format="0" Id="79e2b1b9-3373-4b21-9fa2-48f29053f693"
    OriginalOibId="79e2b1b9-3373-4b21-9fa2-48f29053f693"
    ObjectId="1f025505-ceea-4c2b-a467-1c0b202208e5"
    PointId="b924914f-b3cf-426f-be54-fdb8f10ca374"
    StorageId="7599dcfb-ee09-415e-ac17-f558b955daec"
    LinkId="5e3b62c2-1175-442b-8f83-52063176343c" IsCorrupted="False"
    IsRecheckCorrupted="False" IsConsistent="True" State="0" Type="2" Algorithm="2"
    InsideDir="3d0dc48d-042d-4ab0-994b-ee0146518814 (3568f913-2f5d-419d-829f-810839ab6e11)"
    CreationTime="01/04/2024 14:54:54" CreationTimeUtc="01/04/2024 14:54:54"
    VmName="srv-web" ApproxSize="5003804672" EffectiveMemoryMb="1024" HasIndex="False"
    HasExchange="False" HasSharePoint="False" HasSql="False" HasAd="False"
    HasOracle="False" HasPostgreSql="False" HasVeeamArchiver="False"
    AuxData="&lt;COibAuxData&gt;   [...]   &lt;/COibAuxData&gt;"
    ParentId="00000000-0000-0000-0000-000000000000"
    ParentOriginalOibId="00000000-0000-0000-0000-000000000000" DisplayName="srv-web"
    Fqdn=""
    GuestInfo="&lt;GuestInfo&gt;   [...]   &lt;/GuestInfo&gt;"
    CreationUsn="0" NeedHealthCheckRepair="False"
    CompletionTimeUtc="01/04/2024 14:55:26"
    SnapshotId="00000000-0000-0000-0000-000000000000"
    JobRunId="a747a148-e675-495e-835a-1d7f1a655328" HealthStatus="0"
    IsPartialActiveFull="False" ProductId="b1e61d9b-8d78-4419-8f63-d21279f71a56"
    ProductVersion="12.1.0.2131" ProductVersionFlags="0" ProductIsRentalLicense="False" />

Apart from state information (IsCorrupted, IsConsistent, NeedHealthCheckRepair, and so on), there are attributes for the name of the backed up machine (VmName), its memory in MiB (EffectiveMemoryMb) and the approximate size in bytes of extractable disk images (ApproxSize). Most importantly, the creation and completion times of the backup job in UTC (CreationTimeUtc / CompletionTimeUtc) are recorded. The creation time in local time is also recorded (CreationTime).
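The attributes discussed above can be read with a few lines of Python. This is only a sketch: the describe_oib helper is a hypothetical name, the sample keeps just a handful of attributes from the example, and the timestamp format is assumed to be month/day/year, which the lab dates suggest but which has not been verified across locales.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Shortened OIB element: only the attributes discussed above are kept.
SAMPLE_OIB = (
    '<OIB VmName="srv-web" ApproxSize="5003804672" EffectiveMemoryMb="1024" '
    'IsCorrupted="False" IsConsistent="True" '
    'CreationTimeUtc="01/04/2024 14:54:54" '
    'CompletionTimeUtc="01/04/2024 14:55:26" />'
)

# Assumption: timestamps use month/day/year ordering.
TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_utc(value):
    """Parse a *Utc attribute into a timezone-aware datetime."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def describe_oib(oib):
    """Extract the fields of interest from an OIB element."""
    created = parse_utc(oib.get("CreationTimeUtc"))
    completed = parse_utc(oib.get("CompletionTimeUtc"))
    return {
        "name": oib.get("VmName"),
        "approx_size_bytes": int(oib.get("ApproxSize")),
        "memory_mib": int(oib.get("EffectiveMemoryMb")),
        "is_corrupted": oib.get("IsCorrupted") == "True",
        "created_utc": created,
        "duration_seconds": (completed - created).total_seconds(),
    }


summary = describe_oib(ET.fromstring(SAMPLE_OIB))
print(summary["name"], summary["approx_size_bytes"], summary["duration_seconds"])
```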

GuestInfo

The GuestInfo attribute of the OIB element contains escaped XML data. When extracted, this data provides more information on the backed up object:

<GuestInfo>
    <Property Name="GuestOsName">
        <Value>Debian GNU/Linux</Value>
    </Property>
    <Property Name="GuestOsType">
        <Value>debian4_64Guest</Value>
    </Property>
    <Property Name="DnsName">
        <Value>web-srv</Value>
    </Property>
    <Property Name="ToolsStatus">
        <Value></Value>
    </Property>
    <Property Name="ToolsVersionStatus">
        <Value></Value>
    </Property>
    <Property Name="Ip">
        <Value>fe80::215:5dff:fe7a:2301</Value>
        <Value>192.168.122.216</Value>
    </Property>
</GuestInfo>

Most of these properties are self-explanatory. In our testing, we have never seen a value set for ToolsStatus and ToolsVersionStatus, but online research shows it could be information on VMware tools20.
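Because XML parsers unescape attribute values, extracting GuestInfo only requires parsing the attribute value a second time. The sketch below illustrates this with Python's standard library. The sample OIB is built programmatically from a shortened copy of the data above, and guest_properties is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the GuestInfo data shown above.
GUEST_INFO_XML = """<GuestInfo>
    <Property Name="GuestOsName"><Value>Debian GNU/Linux</Value></Property>
    <Property Name="DnsName"><Value>web-srv</Value></Property>
    <Property Name="ToolsStatus"><Value></Value></Property>
    <Property Name="Ip">
        <Value>fe80::215:5dff:fe7a:2301</Value>
        <Value>192.168.122.216</Value>
    </Property>
</GuestInfo>"""

# Build a sample OIB. ElementTree escapes the nested XML on serialisation,
# which mimics how the attribute appears inside a .vbm file.
sample = ET.Element("OIB", {"VmName": "srv-web", "GuestInfo": GUEST_INFO_XML})
SAMPLE_OIB = ET.tostring(sample, encoding="unicode")


def guest_properties(oib):
    """Return {property name: list of non-empty values} from GuestInfo."""
    # The parser has already unescaped the attribute value at this point.
    guest_info = ET.fromstring(oib.get("GuestInfo"))
    props = {}
    for prop in guest_info.findall("Property"):
        props[prop.get("Name")] = [v.text for v in prop.findall("Value") if v.text]
    return props


props = guest_properties(ET.fromstring(SAMPLE_OIB))
print(props["GuestOsName"], props["Ip"])
```

Returning a list for every property keeps multi-valued entries such as Ip and empty entries such as ToolsStatus in the same shape.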

AuxData

The AuxData attribute of the OIB element also contains escaped XML data. This data, in turn, contains the most detailed information available in the metadata on the backed up objects. The structure and content of this data depend heavily on the underlying Host that managed the Object. Given the layout of the research lab, this article covers the structure of both Hyper-V AuxData and Veeam Agent AuxData.

Hyper-V

Here is an excerpt of the AuxData structure for the backup of the Debian 12 VM hosted on the Hyper-V hypervisor:

<COibAuxData>
    <HasVssMetadata>False</HasVssMetadata>
    <CreationTimeUtc>01/04/2024 14:54:54</CreationTimeUtc>
    <HvAuxData vmID="3568f913-2f5d-419d-829f-810839ab6e11" vmName="srv-web"
        vmStore="C:\ProgramData\Microsoft\Windows\Hyper-V"
        vmSnapshotFolder="C:\ProgramData\Microsoft\Windows\Hyper-V"
        vmSwapFilesFolder="C:\ProgramData\Microsoft\Windows\Hyper-V" userSnapshotType="3"
        isClusteredVm="False" isHv2015="True" VmConfigVersion="10.0" isShieldedVm="False"
        kdsEnabled="False" processorsCount="1" processorsLimitMhz="2688" memoryInMb="1024"
        memoryReservationInMb="512" memoryLimitInMb="1048576" dynamicMemoryEnabled="False"
        targetMemoryBuffer="0" weight="5000" vmGeneration="0" hasSharedDisks="False">
        <Host name="192.168.122.35" path="192.168.122.35" generation="17">
            <networks>
                <network>
                    <switch_name>B71EAEB2-D2CF-4174-BEC9-A37A3F921F34</switch_name>
                    <network_name>Intel(R) 82574L Gigabit Network Connection - Virtual Switch</network_name>
                </network>
            </networks>
        </Host>
        <GuestInfo>
            <!-- Truncated: identical to GuestInfo inside the OIB element -->
        </GuestInfo>
        <tgt_nics />
        <NicMappings />
        <disks>
            <disk is_excluded="False" use_cbt="True" use_rct="True" use_swap_filter="True"
                is_shared_disk="False">
                <disk_info>
                    <!-- Truncated: Virtual Hard Disk (VHDX) information covered in next sections -->
                </disk_info>
            </disk>
        </disks>
        <raw_disks>    <!-- Secondary files used by Hyper-V, covered in next sections -->
            <CRawDiskBackupObject SignatureId="HVCONFIG_VMCX">
                <CRawDiskInfo>
                    <!-- Truncated : VM Configuration file (VMCX) -->
                </CRawDiskInfo>
            </CRawDiskBackupObject>
            <CRawDiskBackupObject SignatureId="HVCONFIG_VMRS">
                <CRawDiskInfo>
                    <!-- Truncated : VM Runtime State (memory, etc.) file (VMRS) -->
                </CRawDiskInfo>
            </CRawDiskBackupObject>
            <CRawDiskBackupObject SignatureId="HVCONFIG_VMGS">
                <CRawDiskInfo>
                    <!-- Truncated : VM Guest State file (VMGS) -->
                </CRawDiskInfo>
            </CRawDiskBackupObject>
        </raw_disks>
        <excluded_src_disks />
        <vmSrcSwitchPorts>
            <VmSwitchPortInfo />    <!-- Truncated: Information on virtual switch used by the VM -->
        </vmSrcSwitchPorts>
        <vmTgtSwitchPorts />
        <oijId>34df77a8-aede-4fec-86a7-56b92e87a1b0</oijId>
    </HvAuxData>
</COibAuxData>

Some of this information duplicates OIB attributes and seems to be the authoritative source for them. For example, the GuestInfo element inside HvAuxData is identical to the one in OIB, and the memoryInMb attribute of HvAuxData is equivalent to the EffectiveMemoryMb attribute of OIB.

However, the real power of the AuxData structure comes from the list of Virtual Hard Disks (disks) and the list of secondary files used by Hyper-V (raw_disks).

disks

The disks element contains a list of disk structures which provide details on the Virtual Hard Disks (VHD or VHDX)21 that were backed up. Examining the disk_info element gives plenty of crucial information on the backed up data.

<disk_info
    disk_id="e75a6a85-947e-4433-ab03-26c6719cfba6:1f025505-ceea-4c2b-a467-1c0b202208e5:4B7E4C5702BE54A61A4D094E938D1445:79e2b1b9-3373-4b21-9fa2-48f29053f693"
    sync_task_id="" state="Processed" valid_processed_offset="21474836480"
    capacity="21474836480" logical_sector_size="512" physical_sector_size="4096"
    use_block_exclude="True" ransomware_index_file_name=""
    src_disk_folder_path="C:\Hyper-V\" inside_dir="Ide0-0">
    <port_info BusType="1" Channel="0" Port="0" />
    <CHvVmRctIdentifier Type="RefencePoint"
        FileId="{b71a0d72-d6e1-4172-9d06-64a782c11356}" Generation="1"
        RctId="rctX:b71a0d72:d6e1:4172:9d06:64a782c11356:00000001" />
    <extent num="0" filename="srv-web.vhdx" folderpath="C:\Hyper-V\"
        size="5003804672" modified="01/04/2024 14:54:55" is_shadow="False">
        <snapshotCtp format="6.5" trackedFilePath="" blockSize="1048576" />
    </extent>
    <recovery_extent num="1"
        filename="srv-web_045CBADB-986E-413D-8C2B-33A9AA5E0492.avhdx"
        folderpath="C:\Hyper-V\" size="37748736" modified="01/04/2024 14:54:56"
        is_shadow="False" />
</disk_info>

Three pieces of information can be of particular interest to an analyst:

  • The disk's maximum capacity: this can be extracted from the capacity attribute of disk_info. In this example, the capacity of the disk is 20 GiB, which corresponds exactly to 21474836480 bytes. From our testing, when the state attribute is Processed, the valid_processed_offset attribute is also equal to the capacity of the disk in bytes.
  • The Virtual Hard Disk filename: this is present in the filename attribute of the extent element. In this example, it is srv-web.vhdx.
  • The Virtual Hard Disk size: this is present in the size attribute of the extent element. This attribute is particularly useful because it tells us how much space each .vhdx file will take on disk once extracted from the backup file.
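The three values listed above can be pulled out of a disk_info element as follows. This is a minimal sketch on a shortened copy of the example, where describe_disk is a hypothetical helper name. It also checks the relation observed in our testing between state, valid_processed_offset and capacity.

```python
import xml.etree.ElementTree as ET

# Shortened disk_info element using values from the example above.
SAMPLE_DISK_INFO = """
<disk_info state="Processed" valid_processed_offset="21474836480"
    capacity="21474836480" inside_dir="Ide0-0">
    <extent num="0" filename="srv-web.vhdx" size="5003804672" />
</disk_info>
"""

GIB = 1024 ** 3


def describe_disk(disk_info):
    """Return capacity, processing status and extents of a disk_info element."""
    capacity = int(disk_info.get("capacity"))
    processed = int(disk_info.get("valid_processed_offset"))
    extents = [
        {"filename": e.get("filename"), "size_bytes": int(e.get("size"))}
        for e in disk_info.findall("extent")
    ]
    fully_processed = disk_info.get("state") == "Processed" and processed == capacity
    return {
        "capacity_bytes": capacity,
        "capacity_gib": capacity / GIB,
        "fully_processed": fully_processed,
        "extents": extents,
    }


disk = describe_disk(ET.fromstring(SAMPLE_DISK_INFO))
print(disk["capacity_gib"], disk["fully_processed"], disk["extents"])
```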

raw_disks

The raw_disks element lists CRawDiskBackupObject structures which are the details of secondary files of Hyper-V VMs. These structures will mainly contain information for VM Configuration files (VMCX), VM Runtime State files (VMRS) and VM Guest State files (VMGS)22. Let's examine a CRawDiskBackupObject in more detail:

<CRawDiskBackupObject SignatureId="HVCONFIG_VMCX">
    <CRawDiskInfo>
        <DiskId>
            e75a6a85-947e-4433-ab03-26c6719cfba6:1f025505-ceea-4c2b-a467-1c0b202208e5:B3A5A69C8BE337B3966AFC0D25DB0565:79e2b1b9-3373-4b21-9fa2-48f29053f693
        </DiskId>
        <SyncTaskId></SyncTaskId>
        <SourceFileName>766C1A2A-1A87-41D5-BB99-560161FBEAE3.vmcx</SourceFileName>
        <SourceFolderName>C:\ProgramData\Microsoft\Windows\Hyper-V\Snapshots</SourceFolderName>
        <SourceShadowFolderName>C:\ProgramData\Microsoft\Windows\Hyper-V\Snapshots</SourceShadowFolderName>
        <RepositoryFolderName>Config</RepositoryFolderName>
        <Capacity>57574</Capacity>
        <ValidProcessedOffset>57574</ValidProcessedOffset>
        <State>Processed</State>
        <RansomwareIndexFileName></RansomwareIndexFileName>
    </CRawDiskInfo>
</CRawDiskBackupObject>

Interesting fields in this structure are the name of the extractable file (SourceFileName) and its size in bytes (Capacity). Similarly to the disks structure, if the State is Processed, the ValidProcessedOffset will be equal to the file's size in bytes.
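Unlike disk_info, this structure stores its values in child elements rather than attributes, so they are read with findtext. The following sketch lists the secondary files of a raw_disks element; list_raw_files is a hypothetical helper name and the sample is a shortened copy of the VMCX entry above.

```python
import xml.etree.ElementTree as ET

# Shortened raw_disks element containing only the VMCX entry shown above.
SAMPLE_RAW_DISKS = """
<raw_disks>
    <CRawDiskBackupObject SignatureId="HVCONFIG_VMCX">
        <CRawDiskInfo>
            <SourceFileName>766C1A2A-1A87-41D5-BB99-560161FBEAE3.vmcx</SourceFileName>
            <RepositoryFolderName>Config</RepositoryFolderName>
            <Capacity>57574</Capacity>
            <ValidProcessedOffset>57574</ValidProcessedOffset>
            <State>Processed</State>
        </CRawDiskInfo>
    </CRawDiskBackupObject>
</raw_disks>
"""


def list_raw_files(raw_disks):
    """Return name, size and state of each secondary Hyper-V file."""
    files = []
    for obj in raw_disks.findall("CRawDiskBackupObject"):
        info = obj.find("CRawDiskInfo")
        files.append({
            "signature": obj.get("SignatureId"),
            # strip() tolerates values that are wrapped onto their own line.
            "filename": info.findtext("SourceFileName", "").strip(),
            "size_bytes": int(info.findtext("Capacity", "0")),
            "state": info.findtext("State", "").strip(),
        })
    return files


raw_files = list_raw_files(ET.fromstring(SAMPLE_RAW_DISKS))
print(raw_files)
```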

Veeam Agent

Here is an excerpt of the AuxData structure for the backup of the domain controller where the Veeam Agent was installed.

<COibAuxData>
    <os platform="EVeeamAmd64" type="EVeeamOs2022" fqdn="LAB-DC.corporation.local" netbios="LAB-DC" />
    <AdInfo NtdsPath="C:\Windows\NTDS" DatabaseLogFolder="C:\Windows\NTDS"
        DatabaseFilePath="C:\Windows\NTDS\ntds.dit" SystemHivePath="C:\Windows\system32\config" />
    <HasVssMetadata>True</HasVssMetadata>
    <CreationUsn value="3" />
    <CreationTimeUtc>01/12/2024 22:11:21</CreationTimeUtc>
    <DesktopOibAuxData OsVersion="10.0.20348" Is64BitOperatingSystem="True">
        <OsName>Microsoft Windows Server 2022 Standard</OsName>
        <SnapshotCreationTime>-8584965094046102426</SnapshotCreationTime>
        <DirPath></DirPath>
        <Location></Location>
        <OijId>00000000-0000-0000-0000-000000000000</OijId>
        <burManifest></burManifest>
        <RealVmSize>14648324096</RealVmSize>
        <ContainsRecoveryMediaFiles>True</ContainsRecoveryMediaFiles>
        <Disk>
            <!-- Truncated: Backed up disk, discussed in next section -->
        </Disk>
        <ShadowVolumesLayout>
            <!-- Truncated: Shadow volumes information -->
        </ShadowVolumesLayout>
        <BackupLayout>
            <!-- Truncated: Layout of volumes and disks inside the backup -->
        </BackupLayout>
        <SystemConfiguration>
            <RAMInfo TotalSizeMB="4096" />
            <CPUInfo CoresCount="2">
                <Core FrequencyMHz="2688" />
                <Core FrequencyMHz="2688" />
            </CPUInfo>
        </SystemConfiguration>
        <NetworkAdapters>
            <NetAdapter Name="{F2034360-8D08-41C6-93AB-20B9865FD01C}"
                FriendlyName="Intel(R) 82574L Gigabit Network Connection"
                ConnectionFriendlyName="Ethernet" UiNumber="0"
                PnpId="00000000-0000-0000-0000-000000000000" DeviceInstanceIndex="0"
                Gateway="192.168.122.1" DhcpEnabled="False" MACAddress="52:54:00:4d:fb:f0">
                <IpAddresses>
                    <IpAddress Ip="192.168.122.50" NetMask="255.255.255.0" />
                </IpAddresses>
                <DnsServers>
                    <DnsServer Ip="127.0.0.1" />
                </DnsServers>
                <Gateways />
            </NetAdapter>
        </NetworkAdapters>
        <AppAwareProcessed>True</AppAwareProcessed>
        <DiskFilter>
            <!-- Truncated: Exclusions of disks from the backup -->
        </DiskFilter>
        <JobOptions>
            <!-- Truncated: Details of the Backup Job -->
        </JobOptions>
        <VMwareToolsInfo Version="" ServiceState="NotInstalled" />
    </DesktopOibAuxData>
</COibAuxData>

Just as with HvAuxData, some information on the guest machine can be found in DesktopOibAuxData. The amount of memory in MiB is in the TotalSizeMB attribute of SystemConfiguration > RAMInfo and the local IP addresses are in NetworkAdapters > NetAdapter > IpAddresses.

Information about the backed up disk is present in the Disk element:

<Disk EmulatedVolume="False" DiskNumber="0" BusType="SATA" Capacity="107374182400"
    StartingUsableOffset="0" ReadOnlyState="2" IsVirtualDisk="False" IsRemovable="False"
    LogicalSectorSize="512" FriendlyName="QEMU HARDDISK" PackStateFlags="1"
    IsClustered="False" IsQuorum="False" IsClusterGroup="False" DiskSystemId="">
    <DevSetupInfo
        DevPath="\\?\scsi#disk&amp;ven_qemu&amp;prod_harddisk#4&amp;35424867&amp;0&amp;000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}"
        IsExternalDisk="False" />
    <DriveLayout PartitionStyle="MBR" Signature="-6993338">
        <Partitions>
            <!-- Truncated: List of partitions in the disk -->
        </Partitions>
    </DriveLayout>
    <PossibleOwners />
    <ExistsInBackup>True</ExistsInBackup>
    <TaskId />
    <DiskId>
        bf7142c6-8acd-4cd7-84a4-350a7ac62610:82663d8b-2db6-480e-94f7-94cb32b8567f:FF954A46:6aa49b80-8533-4adf-a407-0f92dda4c3e3</DiskId>
    <State>Processed</State>
    <Capacity>107372085248</Capacity>
    <IsFileBackup>False</IsFileBackup>
    <IndexedDiskInfo>
        <VeeamDiskId Vendor="EndPoint" BusType="Sata" Id="0" BusNumber="0" SlotNumber="0"
            ControllerId="00000000-0000-0000-0000-000000000000" />
        <os platform="EVeeamAmd64" type="EVeeamOs2022" fqdn="LAB-DC.corporation.local"
            netbios="LAB-DC" />
        <GuestVolumeInfos>
            <volume mountpoint="c:\">
                <server type="System" />
            </volume>
        </GuestVolumeInfos>
    </IndexedDiskInfo>
    <ValidProcessedOffset>0</ValidProcessedOffset>
    <FailoveredToBackupCacheOffset>0</FailoveredToBackupCacheOffset>
    <OriginalDiskUniqueId>FF954A46</OriginalDiskUniqueId>
</Disk>

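The size of the disk in bytes can be found in both the Capacity attribute of Disk and its Capacity element, although in this example the two values differ slightly (107374182400 and 107372085248 bytes respectively). The name of the raw disk image that will be extracted from the backup is in OriginalDiskUniqueId.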
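The sketch below reads both capacity values and the image name from a shortened copy of the Disk element above. The describe_agent_disk helper is a hypothetical name. The difference between the two capacities is computed only for information: the source data does not explain it, so no interpretation is attempted here.

```python
import xml.etree.ElementTree as ET

# Shortened Disk element using values from the example above.
SAMPLE_DISK = """
<Disk DiskNumber="0" BusType="SATA" Capacity="107374182400" FriendlyName="QEMU HARDDISK">
    <State>Processed</State>
    <Capacity>107372085248</Capacity>
    <OriginalDiskUniqueId>FF954A46</OriginalDiskUniqueId>
</Disk>
"""


def describe_agent_disk(disk):
    """Return the image name and both capacity values of a Disk element."""
    attribute_capacity = int(disk.get("Capacity"))       # Capacity attribute
    element_capacity = int(disk.findtext("Capacity"))    # Capacity child element
    return {
        "image_name": disk.findtext("OriginalDiskUniqueId"),
        "friendly_name": disk.get("FriendlyName"),
        "attribute_capacity": attribute_capacity,
        "element_capacity": element_capacity,
        "difference_bytes": attribute_capacity - element_capacity,
    }


agent_disk = describe_agent_disk(ET.fromstring(SAMPLE_DISK))
print(agent_disk)
```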

Correlating identifiers found in OIBs

In the OIB element, there are the following identifiers:

  • ObjectId: Identifier of the backed up Object. This correlates to an Object element inside BackupMetaInfo.
  • StorageId: Identifier of the Storage used. This correlates to a Storage element inside BackupMetaInfo.
  • PointId: Identifier of the Restore Point linked to this OIB. This correlates to a Point element inside BackupMetaInfo.

Using these identifiers, we can correlate data from the backup chain metadata to complete the information we want to list.
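This correlation amounts to a join on the Id attribute of each list. The following Python sketch illustrates the idea on a tiny hand-written sample. The short identifiers (obj-1, pt-1, and so on) and file names are placeholders rather than real Veeam values, and index_by_id and correlate are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hand-written sample: identifiers and file names are placeholders, not real GUIDs.
SAMPLE_VBM = """
<BackupMeta>
    <BackupMetaInfo>
        <Storages>
            <Storage Id="st-1" FilePath="full.vbk" />
            <Storage Id="st-2" FilePath="incr.vib" />
        </Storages>
        <Points>
            <Point Id="pt-1" Num="1.0000000000" Type="0" />
            <Point Id="pt-2" Num="2.0000000000" Type="1" />
        </Points>
        <Objects>
            <Object Id="obj-1" Name="srv-web" />
        </Objects>
        <Oibs>
            <OIB Id="oib-1" ObjectId="obj-1" PointId="pt-1" StorageId="st-1" />
            <OIB Id="oib-2" ObjectId="obj-1" PointId="pt-2" StorageId="st-2" />
        </Oibs>
    </BackupMetaInfo>
</BackupMeta>
"""


def index_by_id(info, path):
    """Map the Id attribute of each element matched by path to the element."""
    return {el.get("Id"): el for el in info.findall(path)}


def attr(element, name):
    """Read an attribute, tolerating a missing element."""
    return None if element is None else element.get(name)


def correlate(xml_text):
    """Join each OIB with its Object, Restore Point and Storage."""
    info = ET.fromstring(xml_text).find("BackupMetaInfo")
    objects = index_by_id(info, "Objects/Object")
    points = index_by_id(info, "Points/Point")
    storages = index_by_id(info, "Storages/Storage")
    rows = []
    for oib in info.findall("Oibs/OIB"):
        rows.append({
            "oib": oib.get("Id"),
            "object": attr(objects.get(oib.get("ObjectId")), "Name"),
            "point_num": attr(points.get(oib.get("PointId")), "Num"),
            "storage": attr(storages.get(oib.get("StorageId")), "FilePath"),
        })
    return rows


for row in correlate(SAMPLE_VBM):
    print(row)
```

Building dictionaries first keeps each lookup constant-time, which matters little for a single .vbm file but helps when many metadata files are processed in one pass.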

Object

<Object Id="1f025505-ceea-4c2b-a467-1c0b202208e5" Type="1"
    HostId="3d0dc48d-042d-4ab0-994b-ee0146518814" Name="srv-web"
    ObjectId="3568f913-2f5d-419d-829f-810839ab6e11" ViType="Virtual machine"
    Path="192.168.122.35\srv-web" Uuid="822fab35-581f-4390-96ad-e69e069bebf0"
    Platform="1" PlatformId="00000000-0000-0000-0000-000000000000"
    ParentId="00000000-0000-0000-0000-000000000000"
    Tag="veeam-hvlab2.local3568f913-2f5d-419d-829f-810839ab6e11"
    HashV2="8c7748f1-73f9-cc28-50c8-aa25f16763f5" DisplayName="srv-web">
        <!-- Truncated: Identical to GuestInfo -->
</Object>

From the Object structure, the only two attributes that are interesting for now are ViType and HostId. ViType has a value of Virtual machine if the Object is a VM and an empty value if it is considered a physical machine by VBR. HostId is an identifier that correlates to a Host element inside BackupMetaInfo.

Host

<Host Id="3d0dc48d-042d-4ab0-994b-ee0146518814" Moref="" Name="192.168.122.35" Type="7"
    Options="" HostInstanceId="veeam-hvlab2.local"
    HostInstanceIdV2="46412451-8f9a-45ee-8afd-8987c57ede61" HostUniqueId="" />

The Host structure contains the name of the Host that managed the backed up object (Name) and sometimes a friendly name for it (HostInstanceId).

Storage

<Storage Id="da533706-9c8e-4706-b59e-2a509f1ff2c5"
    BackupId="4c26199b-f31f-4b71-930b-45838affc6ba"
    HostId="6745a759-2205-4cd2-b172-8ec8f7e60ef8"
    FilePath="C:\Backup\Backup Job Hyper-V VMs\srv-web.3568f913-2f5d-419d-829f-810839ab6e11D2024-01-03T164550_748D.vbk"
    Version="1" CreationTime="01/03/2024 16:45:50" CreationTimeUtc="01/03/2024 16:45:50"
    ModificationTime="01/03/2024 16:48:03"
    Stats="&lt;CBackupStats&gt;   [...]   &lt;/CBackupStats&gt;"
    State="1" Availability="0" BlockSize="3" BlockAlignmentSize="4096" GfsPeriod="0"
    CreationMode="0" PartialIncrement="False"
    MetaCryptoKeyIdTag="00000000-0000-0000-0000-000000000000"
    StorageCryptoKeyIdTag="00000000-0000-0000-0000-000000000000"
    ObjectId="1f025505-ceea-4c2b-a467-1c0b202208e5"
    OriginalId="da533706-9c8e-4706-b59e-2a509f1ff2c5" ExternalContentMode="0"
    ChangeVersion="2"
    PartialPath="&lt;Path   [...]   &lt;/Path&gt;"
    LogMetaTag="0" IsExported="False" />

In the Storage structure, we find the path where the Storage file was first created (FilePath). In this case, it is a .vbk file stored in the backup job folder. The last part of this path, which is simply the Storage file's name, is also present in a structure called PartialPath:

<Path IsFilePath="True">
    <Elements>
        srv-web.3568f913-2f5d-419d-829f-810839ab6e11D2024-01-03T164550_748D.vbk
    </Elements>
</Path>

Another interesting attribute of Storage is Stats. It contains escaped XML which gives the size in bytes of the Storage file (BackupSize), the size of the data that was copied inside it (DataSize), the deduplication ratio (DedupRatio) and the compression ratio (CompressRatio):

<CBackupStats>
    <BackupSize>1496686592</BackupSize>
    <DataSize>21479214806</DataSize>
    <DedupRatio>16</DedupRatio>
    <CompressRatio>43</CompressRatio>
</CBackupStats>
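
To give an idea of how these values can be consumed, here is a minimal Python sketch that reads the four fields named above and converts the sizes to GiB. The `parse_stats` and `to_gib` helpers are illustrative names of our own. The sketch assumes the Stats attribute has already been unescaped, and it ignores any other element that may be present in the elided part of the structure.

```python
import xml.etree.ElementTree as ET

# Decoded Stats structure, as shown above.
STATS_XML = """<CBackupStats>
    <BackupSize>1496686592</BackupSize>
    <DataSize>21479214806</DataSize>
    <DedupRatio>16</DedupRatio>
    <CompressRatio>43</CompressRatio>
</CBackupStats>"""

FIELDS = ("BackupSize", "DataSize", "DedupRatio", "CompressRatio")


def parse_stats(stats_xml: str) -> dict:
    """Return the four documented fields as integers."""
    root = ET.fromstring(stats_xml)
    return {name: int(root.findtext(name)) for name in FIELDS}


def to_gib(size_in_bytes: int) -> float:
    """Convert a size in bytes to GiB (2**30 bytes)."""
    return size_in_bytes / 2**30


stats = parse_stats(STATS_XML)
print(f"Storage file: {to_gib(stats['BackupSize']):.2f} GiB")
print(f"Source data:  {to_gib(stats['DataSize']):.2f} GiB")
```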

Restore Point

<Point Id="e66e8fa2-70e6-4880-8790-f04fa96590e3" OriginalId="e66e8fa2-70e6-4880-8790-f04fa96590e3"
    LinkId="00000000-0000-0000-0000-000000000000" Num="1.0000000000"
    GroupId="a3fca43f-864f-4c80-8b55-02dd3fa0cde3" CreationTime="01/03/2024 16:45:50"
    CreationTimeUtc="01/03/2024 16:45:50" Type="0" Algorithm="0"
    BackupId="4c26199b-f31f-4b71-930b-45838affc6ba" />

The Point structure contains a number which increments for each new Restore Point in the Backup chain (Num). The Type attribute is the type of backup that was generated: from our testing, 0 corresponds to a Full backup (.vbk) and 1 corresponds to an Increment (.vib).

This structure also contains the creation time of the Restore Point in local time (CreationTime) and in UTC (CreationTimeUtc) which can be verified against the OIB. The BackupId can also be checked against the identifier of the Backup element inside BackupMeta to see if they match.
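
The checks described above can be sketched in a few lines of Python. The Type mapping only covers the two values observed during our testing. The month/day/year timestamp format is an assumption inferred from the Storage file name (D2024-01-03), and `summarize_point` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from decimal import Decimal

# Point element copied from the example above.
POINT_XML = """<Point Id="e66e8fa2-70e6-4880-8790-f04fa96590e3" OriginalId="e66e8fa2-70e6-4880-8790-f04fa96590e3"
    LinkId="00000000-0000-0000-0000-000000000000" Num="1.0000000000"
    GroupId="a3fca43f-864f-4c80-8b55-02dd3fa0cde3" CreationTime="01/03/2024 16:45:50"
    CreationTimeUtc="01/03/2024 16:45:50" Type="0" Algorithm="0"
    BackupId="4c26199b-f31f-4b71-930b-45838affc6ba" />"""

# Identifier of the Backup element at the root of BackupMeta.
BACKUP_ID = "4c26199b-f31f-4b71-930b-45838affc6ba"

# Values observed during testing only; other values may exist.
POINT_TYPES = {"0": "Full (.vbk)", "1": "Increment (.vib)"}

# Assumption: month/day/year, inferred from the Storage file name.
TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def summarize_point(point_xml: str, backup_id: str) -> dict:
    """Extract the fields of a Restore Point that matter for triage."""
    point = ET.fromstring(point_xml)
    return {
        # Num is stored as a decimal string such as "1.0000000000".
        "num": Decimal(point.get("Num")),
        "type": POINT_TYPES.get(point.get("Type"), "Unknown"),
        "created_utc": datetime.strptime(point.get("CreationTimeUtc"), TIME_FORMAT),
        "backup_id_matches": point.get("BackupId") == backup_id,
    }


point_summary = summarize_point(POINT_XML, BACKUP_ID)
print(point_summary)
```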

Backup

<Backup Id="4c26199b-f31f-4b71-930b-45838affc6ba"
    OriginalId="4c26199b-f31f-4b71-930b-45838affc6ba"
    JobId="e75a6a85-947e-4433-ab03-26c6719cfba6" JobName="Backup Job Hyper-V VMs - srv-web"
    PolicyName="Backup Job Hyper-V VMs" JobType="0" SourceType="4" TargetType="0"
    JobTargetHostId="6745a759-2205-4cd2-b172-8ec8f7e60ef8" JobTargetHostProtocol="0"
    RepositoryId="88788f9e-d8f5-4eb4-bc4f-9b3f5403bcec"
    DirPath="C:\Backup\Backup Job Hyper-V VMs"
    PartialPath="&lt;Path   [...]   &lt;/Path&gt;"
    MetaFileName="srv-web_FF4FA.vbm" MetaVersion="11" MetaUpdateTime="01/05/2024 10:01:53"
    BackupPlatform="1" BackupPlatformId="00000000-0000-0000-0000-000000000000"
    BackupPolicyTag="e75a6a85-947e-4433-ab03-26c6719cfba6" CreationTime="01/03/2024 16:45:58"
    CreationTimeUtc="01/03/2024 16:45:58" ParentBackupId="5b1fccac-ed0b-4a93-afbd-49410f030480"
    IsImported="False" IsExported="False" IsJustMigratedToSobr="False" EncryptionState="0"
    DeletedRetentionPeriodDays="0" EnableDeletedRetention="False" UsedMetaType="0"
    AttachedJobType="0" IsShadow="False" IsShadowArchiveTier="False" />

Finally, in the Backup structure at the root of the XML data, some attributes can be of interest, mainly the name of the Backup Job (JobName) or Policy (PolicyName), the path to the Backup directory where all the files related to the backup job are stored (DirPath) and the encryption state (EncryptionState). From our testing, a backup that was unencrypted at the time of creation will have a value of 0 in EncryptionState whereas a backup that was encrypted at the time of creation will have a value of 2.
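
As a rough illustration, the attributes of interest can be pulled out of the Backup element as follows. The element is trimmed to the attributes discussed above, the EncryptionState mapping only reflects the two values we observed (0 and 2), and `summarize_backup` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Backup element trimmed to the attributes discussed above.
BACKUP_XML = r"""<Backup Id="4c26199b-f31f-4b71-930b-45838affc6ba"
    JobName="Backup Job Hyper-V VMs - srv-web"
    PolicyName="Backup Job Hyper-V VMs"
    DirPath="C:\Backup\Backup Job Hyper-V VMs"
    EncryptionState="0" />"""

# Values observed during testing only: 0 = unencrypted, 2 = encrypted.
ENCRYPTED_BY_STATE = {"0": False, "2": True}


def summarize_backup(backup_xml: str) -> dict:
    """Return the job name, policy name, directory and encryption status."""
    backup = ET.fromstring(backup_xml)
    return {
        "job_name": backup.get("JobName"),
        "policy_name": backup.get("PolicyName"),
        "dir_path": backup.get("DirPath"),
        # None means the value was not one of those observed during testing.
        "encrypted": ENCRYPTED_BY_STATE.get(backup.get("EncryptionState")),
    }


backup_summary = summarize_backup(BACKUP_XML)
print(backup_summary)
```

Returning None for unknown states, rather than guessing, keeps the analyst aware that the mapping is incomplete.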

Parsing and filtering with a Velociraptor artifact

Velociraptor is an open-source agent-based solution for digital forensics and incident response. It uses a client/server model where analysts can remotely launch "Artifacts"23 to query data and initiate response actions on the endpoint. Based on all the preliminary research on Veeam metadata, Synacktiv has developed and pushed a Velociraptor Artifact to the Artifact Exchange24 (a repository of Artifacts developed by the community).

Windows.Veeam.RestorePoints.MetadataFiles takes as input a list of paths to Backup Repositories and parses each Veeam backup chain metadata file found. The information within is selected and correlated to present only relevant fields to the analyst.

Generic information on Restore Points

Some of the extracted data is generic information on each Restore Point. For example, an analyst may want to know the creation time and completion time of the Restore Point, the approximate size of data inside it and the capacity of each backed up disk:

[Figure: List of Restore Points with creation and completion times, approximate data size and disk capacity]

The Artifact can also be used to list all Restore Points for a particular machine name:

[Figure: List of Restore Points for the "srv-web" virtual machine, showing the Restore Point number, its type and the name of the Veeam Host]

Filtering with Notebooks

Efficient filtering can be accomplished using Velociraptor Notebooks25 on the Artifact's results. These results contain many items useful for filtering:

  • The Operating System of the backed up machine
  • Its type: is it a virtual or physical machine?
  • Its DNS name
  • Its IP addresses
[Figure: Selection of useful items for Restore Points filtering (virtual or physical machine, size of memory in MiB, name of operating system, IP addresses)]

The Notebooks use a query language, the Velociraptor Query Language (VQL)26, that provides some formatting functions. Combining all these concepts, we can, for example, select and format backup stats of Restore Points from virtual Windows 10 machines only:

SELECT VMName,
       CreationTimeUTC,
       humanize(bytes=int(int=BackupSize)) AS BackupSize,
       humanize(bytes=int(int=DataSize)) AS DataSize,
       DeduplicationRatio,
       CompressionRatio
  FROM source(artifact="Windows.Veeam.RestorePoints.MetadataFiles")
  WHERE GuestOSName =~ 'Windows 10' AND VirtualType = 'Virtual machine'
  ORDER BY CreationTimeUTC
[Figure: Result of the previous VQL query, a table containing stats for each Restore Point of a Windows 10 virtual machine]
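
For analysts who prefer to post-process exported results outside Velociraptor, the same selection logic can be approximated in Python. This is only a sketch: the rows below are made-up sample data shaped like a JSON export of the Artifact's results, the column names are taken from the query above, and the `=~` operator is assumed to be a case-insensitive regular expression match.

```python
import re

# Made-up sample rows (hypothetical machines, not real results).
rows = [
    {"VMName": "vm-b", "GuestOSName": "Windows 10 (64-bit)",
     "VirtualType": "Virtual machine", "CreationTimeUTC": "2024-01-04T10:00:00Z"},
    {"VMName": "vm-a", "GuestOSName": "Windows 10 (64-bit)",
     "VirtualType": "Virtual machine", "CreationTimeUTC": "2024-01-03T10:00:00Z"},
    {"VMName": "vm-c", "GuestOSName": "Windows Server 2019",
     "VirtualType": "Virtual machine", "CreationTimeUTC": "2024-01-02T10:00:00Z"},
    {"VMName": "pc-d", "GuestOSName": "Windows 10 (64-bit)",
     "VirtualType": "Physical machine", "CreationTimeUTC": "2024-01-01T10:00:00Z"},
]


def select_windows10_virtual(rows: list) -> list:
    """Keep virtual Windows 10 machines, ordered by creation time."""
    pattern = re.compile("Windows 10", re.IGNORECASE)
    selected = [
        row for row in rows
        if pattern.search(row["GuestOSName"])
        and row["VirtualType"] == "Virtual machine"
    ]
    # ISO 8601 timestamps in the same time zone sort correctly as strings.
    return sorted(selected, key=lambda row: row["CreationTimeUTC"])


selected_rows = select_windows10_virtual(rows)
for row in selected_rows:
    print(row["VMName"], row["CreationTimeUTC"])
```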

Practical use case: preparing extraction

In a follow-up blog post, we will explore how to extract files from a Storage file. To predict which files can be extracted for a particular Restore Point, the Artifact parses the size of each extractable file and the path to the corresponding Storage file:

[Figure: List of extractable files and their respective size in bytes for each Restore Point]
[Figure: JSON listing of Restore Points with the name of the Storage file and its path highlighted]

We encourage analysts to experiment with this Artifact and to report any problems they encounter. This Artifact is still in the early stages of development and we would greatly appreciate any feedback.

Conclusion

Through a deep dive into Veeam Backup & Replication's metadata, we have paved the way for efficient extraction of Veeam backups. Using the Windows.Veeam.RestorePoints.MetadataFiles Artifact as a foundation, we can now work remotely on VBR backups using only Velociraptor, which is open-source software, and small metadata files that are only a few MiB in size.

The next articles in this series will continue to explore VBR's metadata in different use cases and will provide options to precisely select relevant case data, even in terabytes of backup files, using Velociraptor and some tools provided publicly by Veeam.

This article would not have been possible without the help of the Velociraptor Discord. We would like to thank mike.cohen (@scudette), predictiple and andreas for their help in understanding the Velociraptor Query Language and its intricacies.