How to create VHD disk image from a Linux live system?

06
2014-04
  • Federico

    Once more, I have to turn to the experts here at SuperUser, as my other sources (mainly Google ;-)) didn't prove very helpful...

    So basically, I would like to create a VHD image of a physical disk to be archived/accessed/maybe even mounted in a virtual machine. Now, there are dozens of articles and tutorials on how to do that on the web, but none that meets exactly the conditions I would like to achieve:

    • I would like the destination file to be a VHD image, as Windows 7 can mount it natively, even over the network and many other programs can use it (VirtualBox, ...)
    • The disk I'm trying to image contains a Windows XP install, so in theory I could use the disk2vhd utility, but I would like to find a solution that doesn't require booting that Windows XP install (i.e. keeps the disk read-only)
    • Thus I was searching for a solution involving some sort of live system (running from a USB stick or the network)

    However, all the solutions that I've come across either make use of disk2vhd or use the dd command under Linux, which does a complete copy of the disk (i.e. even empty blocks) and does not output a VHD file. Is there a tool/program under Linux that can directly create a VHD file? Or is it possible to convert a raw disk image created using dd to a VHD file, without allocating space for the empty blocks? How would you proceed?

    As always, any advice or comment is highly appreciated!!

  • Answers
  • Federico

    For future reference, here is how I finally proceeded, with a few comments on the various issues or pitfalls encountered:

    1. Boot the machine with a Linux live system

    First step was to boot the machine containing the disk to image, using a Linux live system.

    NOTE: My first idea was to use an Ubuntu Live USB disk, but the machine did not support booting from USB, so I found it easier to use an old Knoppix live CD.

    2. Image the disk using dd and pipe the data through ssh

    Then, I copied all the disk content to a file image on my local server using dd and piping the data through ssh:

    $ dd if=/dev/hdX bs=4k conv=noerror,sync | ssh -c blowfish myuser@myserver 'dd of=myfile.dd'

    A few comments here: this method reads the entire disk, so it can take a very long time (it took me 5 hours for an 80 GB disk). The bottleneck isn't the network but the disk read speed. Before launching the copy, I advise checking the BIOS/disk/system parameters to ensure that the disk and the motherboard are working at their highest possible speed (the current settings can be checked with hdparm -i /dev/hdX, and the read speed tested with hdparm -Tt /dev/hdX).
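
    Since the copy takes hours, it can be worth confirming afterwards that the image matches the source. The following is only a local sketch of that check, not part of what I originally ran: a small temporary file stands in for /dev/hdX, a plain pipe stands in for the ssh hop, and all file names are made up for the demo. Note also that current OpenSSH releases no longer ship the blowfish cipher, so the -c blowfish option would probably have to be dropped on a modern system.

```shell
#!/bin/sh
# Local stand-in for the disk-to-server copy; every path here is hypothetical.
workdir=$(mktemp -d)
src="$workdir/source.img"   # plays the role of /dev/hdX
dst="$workdir/copy.img"     # plays the role of myfile.dd on the server

# 1 MiB of sample data (a whole number of 4k blocks)
dd if=/dev/urandom of="$src" bs=4k count=256 2>/dev/null

# Same dd options as in step 2, with ssh replaced by a plain pipe
dd if="$src" bs=4k conv=noerror,sync 2>/dev/null | dd of="$dst" 2>/dev/null

# Compare checksums of source and copy
src_sum=$(sha256sum < "$src" | cut -d' ' -f1)
dst_sum=$(sha256sum < "$dst" | cut -d' ' -f1)

if [ "$src_sum" = "$dst_sum" ]; then
    echo "copy verified: $dst_sum"
else
    echo "checksum mismatch" >&2
fi
```

    On the real machines, the same idea would be to run sha256sum on /dev/hdX on the source and on myfile.dd on the server, then compare the two values by eye. If dd hit read errors, conv=noerror,sync pads the affected blocks with zeros, so a mismatch is then expected.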

    NOTE: dd does not output progress of the operation, but we can force it to do so by sending the USR1 signal to the dd process PID from another terminal:

    $ kill -USR1 PIDofdd
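
    A small self-contained demo of that signal mechanism, assuming GNU dd on Linux (other dd implementations may react to USR1 differently, and may even terminate). The /dev/zero to /dev/null copy is just a harmless stand-in for the real imaging job. Recent GNU coreutils releases also accept status=progress on the dd command line, which prints a running total without any signal.

```shell
#!/bin/sh
# Demo only: copy zeros to /dev/null in the background and ask dd for a report.
log=$(mktemp)

dd if=/dev/zero of=/dev/null bs=1M 2>"$log" &
dd_pid=$!

sleep 1                  # give dd time to install its signal handler
kill -USR1 "$dd_pid"     # GNU dd writes its current statistics to stderr
sleep 1
kill "$dd_pid"           # stop the demo copy
wait "$dd_pid" 2>/dev/null || true

cat "$log"               # shows the "records in/out" and bytes-copied lines
```

    For a real copy, the PID can be looked up from another terminal with pgrep -x dd instead of relying on $!.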

    3. Reclaim the unused space

    At this point, the source machine is no longer needed and we will work exclusively on the destination server (running Linux as well). VirtualBox will be used to convert the raw disk image to the VHD format, but before doing so, we can zero out the unused blocks, so that VirtualBox does not allocate space for them in the final file.

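    In order to do so, I mounted the partition inside the image as a loop device, filled its free space with a file of zeros, and deleted that file again. The cat command stops with a "No space left on device" error once the free space is used up, which is expected:

    $ mount -o loop,rw,offset=26608813056 -t ntfs-3g /mnt/mydisk/myfile.dd /mnt/tmp_mnt
    $ cat /dev/zero > /mnt/tmp_mnt/zero.file
    $ sync
    $ rm /mnt/tmp_mnt/zero.file
    $ umount /mnt/tmp_mnt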
    

    NOTE: The offset indicating the beginning of the partition within the disk image can be obtained by using parted on the image file:

    $ parted /mnt/mydisk/myfile.dd
    (parted) unit
    Unit?  [compact]? B
    (parted) print
    Model:  (file)
    Disk /mnt/mydisk/myfile.dd: 80026361856B
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    
    Number  Start         End           Size          Type      File system  Flags
     1      32256B        21936821759B  21936789504B  primary   ntfs         boot
     2      21936821760B  80023749119B  58086927360B  extended               lba
     5      26608813056B  80023749119B  53414936064B  logical   ntfs
    

    NOTE2: The default Linux kernel NTFS driver provides read-only access, thus it is necessary to install and use the userspace ntfs-3g driver or writing to the disk will raise an error!
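
    As a cross-check on the offset passed to mount: some tools (fdisk -l, for instance) report partition starts in sectors rather than bytes, in which case the byte offset is the start sector multiplied by the sector size. The sketch below redoes that arithmetic for the table above; the function name is made up, and the sector numbers 63 and 51970338 are simply the byte values from the parted output divided by 512.

```shell
#!/bin/sh
# Byte offset of a partition = start sector * sector size (512 bytes by default).
# sector_to_offset is a hypothetical helper, not part of parted or fdisk.
sector_to_offset() {
    start_sector=$1
    sector_size=${2:-512}
    echo $((start_sector * sector_size))
}

sector_to_offset 63          # partition 1 in the table above
sector_to_offset 51970338    # logical partition 5, the one mounted in step 3

# The same table can also be printed without the interactive prompt:
#   parted -s /mnt/mydisk/myfile.dd unit B print
```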

    4. Create the VHD image using VBoxManage

    At this point, we can use the VirtualBox utilities to convert the raw image to a VHD file:

    $ VBoxManage convertfromraw myfile.dd myfile.vhd --format VHD
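
    If VirtualBox is not available on the server, qemu-img should be able to do the same conversion; it calls the VHD format "vpc". I did not use it myself, so treat the following as a sketch: it builds a small all-zero raw image as a stand-in for myfile.dd (the paths are made up) and converts it only if qemu-img is installed.

```shell
#!/bin/sh
# Stand-in raw image: 16 MiB of zeros instead of the real myfile.dd.
workdir=$(mktemp -d)
raw="$workdir/test.dd"
vhd="$workdir/test.vhd"
dd if=/dev/zero of="$raw" bs=1M count=16 2>/dev/null

if command -v qemu-img >/dev/null 2>&1; then
    # -f raw: input format, -O vpc: output format (VHD)
    qemu-img convert -f raw -O vpc "$raw" "$vhd"
    ls -l "$raw" "$vhd"      # the VHD should be much smaller than the raw file
else
    echo "qemu-img not installed; skipping the conversion"
fi
```

    Because the zeroed blocks are not stored in a dynamic VHD, the size difference shown by ls is the same effect that step 3 aims for on the real image.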
    
  • Steven Monday

    One approach is to use a couple of handy technologies: VirtualBox, and the ntfsprogs package.

    Recent versions of VirtualBox allow you to create VHD hard disk files, while ntfsprogs provides the ntfsclone utility. As its name suggests, ntfsclone clones NTFS filesystems, and I believe that it does it at the filesystem level, skipping over unused disk blocks.

    So, to begin, create a new VM in VirtualBox, and provision a new, empty VHD-file drive for it. The VHD drive need only be as large as the size of data in use on the physical drive you want to clone (well actually, make it a little bit larger, to allow for some wiggle room).

    Next, find a Linux live CD that contains the ntfsprogs package, as well as openssh-server. I like System Rescue CD for this, but pretty much any Debian- or Ubuntu-based live CD should work as well.

    Boot the VirtualBox VM with the Linux live CD, and start sshd within the VM so that you will be able to execute commands on it remotely. Partition the empty VHD drive appropriately, using whatever partitioning tool you prefer (I like plain old fdisk, but I'm somewhat old school).

    With another copy of the Linux live CD, boot the machine containing the physical disk you want to clone. I assume that the VirtualBox VM and this machine are accessible to each other over the network. On this machine, execute the following command (all on one line):

    ntfsclone --save-image -o - /dev/sdXX |
        ssh root@VirtualBox-VM 'ntfsclone --restore-image --overwrite /dev/sdYY -'
    

    where:

    • /dev/sdXX is the device name (on the local machine) of the physical drive you want to clone, and
    • /dev/sdYY is the device name (in the VM) of the VHD destination drive.

    Explanation: The first ntfsclone command in the pipeline extracts an image of the source NTFS filesystem and sends it out through the ssh tunnel, while the second ntfsclone command receives the image and restores it to the VHD drive.

    Once the operation completes, the VHD file should contain a file-for-file exact clone of the original physical disk (barring any hardware errors, like bad sectors, that might cause the process to abort prematurely).

    One last thing you may want to do is to run a Windows chkdsk on the VHD drive, just to ensure the cloning didn't introduce any problems (it shouldn't have, but hey, I'm a bit paranoid about these things).

  • Bernard

    (I know this is an old post, but maybe this will at least help others.)

    Do you really need to be able to create the image under Linux? I think you are trying to back up Windows hard drives to VHD. I don't think that Linux file systems can be encapsulated in a VHD container.

    A free SysInternals utility called Disk2vhd will create VHDs of the currently running Windows XP or newer system. This should also work for imaging USB mounted drives.

  • NelsonT

    I also tried to clone a live Linux system recently.

    I still have no perfect solution, but here are two approaches you could try:

    (1) VMware vConverter (free)

    You need to install vsphere hypervisor (free) as the destination.

    I tried this a few days ago without success, possibly because I had installed vSphere inside VirtualBox.

    Ref here: http://xrubenx.blogspot.com.au/2010/01/vmware-converter-standalone.html

    (2) Use this one http://www.r1soft.com/tools/linux-hot-copy/

    I will try R1Soft hotcopy this weekend.


  • Related Question

    compression - Compressed disk image on Linux
  • Aaron Digulla

    I just got my new computer with a much bigger harddisk. I think I copied all important files over but just to be sure, I'd like to keep a disk image of my old disk. To save space, I'd like to compress it but I didn't find an option to mount a compressed image.

    My goals:

    • Result must be easy to access
    • No need to decompress the whole thing before I can access anything
    • Files should be quick to locate - no TAR/CPIO archive
    • Necessary space should be less than just copying the files over

    So ideally, I'm looking for a read-only, compressed file system which I can create in a file and which grows automatically.


  • Related Answers
  • Aaron Digulla

    Use forensic analysis software such as GUYMAGER (open source, sourceforge.net). It has a nice UI which lets you quickly create a compressed disk image of an entire hard disk.

    Use "Advanced forensic image (.aff)". This creates a single, compressed file (well, it also creates an .info file).

    The default compression level is 1 (fastest, but least compression). If you have a fast computer with lots of cores, you can change this by creating /etc/guymager/local.cfg:

    AffCompression = 3
    

    9 is the best but slowest compression. 3 gives a good compression with good performance.

    Update

    Mounting isn't as simple as it seems. First of all, you need AFFLIB (Ubuntu: aptitude install afflib-tools). Now you can get the raw disk image with affuse <image> <mount-point>

    But for some reason, mounting the raw image fails. parted says the first partition starts at 1048576B but

    mount -t ext4 -o loop,ro,offset=1048576 /mnt/backup.raw /mnt/backup
    

    fails with the usual useless mount error:

    mount: wrong fs type, bad option, bad superblock on /dev/loop0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
    

    and dmesg says:

    EXT4-fs (loop0): VFS: Can't find ext4 filesystem
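
    One way to narrow this down is to check whether the offset really points at an ext filesystem. An ext2/3/4 superblock starts 1024 bytes into the partition and stores its magic number 56 bytes further in, so a genuine ext filesystem has the bytes 53 ef at partition offset + 1080. The helper below is only a diagnostic sketch with a made-up name; it does not explain why the mount fails, it just shows whether the magic is where mount expects it.

```shell
#!/bin/sh
# Print, as hex, the two bytes where an ext2/3/4 superblock keeps its magic.
# ext_magic_at is a hypothetical helper; "53ef" means an ext superblock is there.
ext_magic_at() {
    image=$1
    part_offset=$2
    dd if="$image" bs=1 skip=$((part_offset + 1080)) count=2 2>/dev/null |
        od -An -tx1 | tr -d ' \n'
}

# Intended use with the values from above:
#   ext_magic_at /mnt/backup.raw 1048576
```

    Anything other than 53ef suggests that the partition at that offset is not a plain ext filesystem (or that the offset is wrong), which would be consistent with the dmesg message.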