On a Win7 NTFS volume, I'm using cwrsync which supports --link-dest correctly to create "snapshot" type backups. So I have:
The content of 2010-12-02 is mostly hardlinks back to files in the 2010-11-28 directory, but there are a few new or changed files only in 2010-12-02. On Linux, the du utility will tell me the actual size taken by each incremental snapshot. On Windows, Explorer and du under Cygwin are both fooled by hardlinks and show 2010-12-02 taking up a little more space than 2010-11-28.
Is there a Windows utility that will show the correct space actually used?
Try Sysinternals Disk Usage (otherwise known as du). With the -u and -v flags it will only count unique occurrences, and will show the usage of each folder as it goes along.
As far as I know the file system doesn't show the difference between the original file and a hard link (that is really the point of a hard link) so you can't discount them on a folder-by-folder basis, but need to do this comparatively.
To test, I created a folder with 6 files in it and cloned the whole thing. I then created several hard and soft links inside the first folder that reference other files in the first folder, and also some in the second.
Running du -u -v testFld results in (note the values next to the folders are in KiB):
du -u -v testFld
Size: 162,794 bytes
Size on disk: 162,794 bytes
Running du -u -v testFld\a results in:
du -u -v testFld\a
Running du -u -v testFld\b results in:
du -u -v testFld\b
Notice the mismatch?
The symlinks in A that refer to files in B are only counted against A during the "full" run, and B only returns 54 (even though the files were originally in B and hard-linked from A). When you measure B separately (or if you don't use the -u unique flag) it will count its "full" measure of 74.
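The "first folder to reach a file gets charged for it" behaviour described above can be sketched in Python. This is only an illustration of the accounting, not how Sysinternals du is implemented; the function name unique_usage is made up for the example, and it assumes a Python 3 version whose os.lstat reports a usable st_ino on NTFS.

```python
import os
import stat

def unique_usage(dirs):
    """Return {dir: bytes}, charging each file to the first directory that reaches it."""
    seen = set()      # (st_dev, st_ino) pairs that have already been counted
    totals = {}
    for top in dirs:
        total = 0
        for root, _subdirs, files in os.walk(top):
            for name in files:
                st = os.lstat(os.path.join(root, name))  # lstat: do not follow symlinks
                if not stat.S_ISREG(st.st_mode):
                    continue                             # skip symlinks and special files
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue                             # another name for data already counted
                seen.add(key)
                total += st.st_size                      # logical size, not allocated size
        totals[top] = total
    return totals
```

Calling unique_usage(["testFld\\a", "testFld\\b"]) charges shared files to a, while unique_usage(["testFld\\b"]) on its own charges them to b, which is the same kind of mismatch as in the runs above.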
TreeSize Professional (~$55, 30 day trial) claims to distinguish NTFS hardlink disk space. A quick trial seems to bear this out.
Hardlink support is not turned on out of the box: go to Tools > Options > Scan, re-scan, then use Ctrl-1 and Ctrl-2 to switch between Size and Allocated space. Allocated is actual space used, while Size is the statistic normally reported by other programs.
There is a performance penalty for turning on hardlink support (and symlinks and mounts too if you want that also). The colour palette is garish for my taste, but that seems to be par for the course in this genre. Also be careful when clicking around in the box chart area -- it's easy to accidentally move a folder with a mistaken drag-n-drop when you only meant to expand it.
I think some facts need to be set right here.
Windows cannot "detect" hardlinks, since every file is actually a hardlink to a bunch of bytes on the disk.
The claim that the du tool detects duplicates is misleading too: if folder A contains files and B contains only hardlinks to the files in A, then du of A and du of B will return the same answer, namely the size of the files that originally came from A and are now also in B.
This is actually correct, since if you deleted A, for example, its files would not be deleted from the disk, because they are still referenced by B. With hard links, which file is the source and which one is the hard link is quite arbitrary and meaningless.
Products such as du will list a directory while discounting duplicates.
This will only work if all files and hard-links are contained in one directory.
Many folder-list products do that.
Conclusion: With hard-links, the question of "the actual size used in an NTFS directory" is meaningless.
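The deletion argument can be made concrete with a small Python sketch: it adds up only those files all of whose links live under the folder, i.e. the bytes that deleting the folder would actually release. The name reclaimable_bytes is hypothetical, the sizes are logical sizes rather than allocated clusters, and it assumes st_nlink and st_ino are reported correctly on the platform.

```python
import os
import stat
from collections import defaultdict

def reclaimable_bytes(top):
    """Bytes freed if `top` were deleted: files whose every link is under `top`."""
    links_here = defaultdict(int)   # (dev, ino) -> number of links found under top
    info = {}                       # (dev, ino) -> (total link count, size)
    for root, _subdirs, files in os.walk(top):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            links_here[key] += 1
            info[key] = (st.st_nlink, st.st_size)
    # A file's data goes away only when no link outside `top` remains.
    return sum(size for key, (nlink, size) in info.items()
               if links_here[key] == nlink)
```

With A holding files and B holding only hardlinks to them, this returns 0 for A and 0 for B, but the full size for their common parent.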
I foolishly used Dupemerge to change all my duplicate files into hard links. Now Windows XP is not running right, e.g., Explorer won't start.
Is there a utility which would traverse the filesystem looking for hard links, copy the file, delete the original link, and rename the copy, keeping the original attributes and name?
I doubt that there's a utility for undoing what was done. You can search for duplicates again, check their link counts and attributes (or maybe Dupemerge can help identify hard links to the same files) and do the copying by hand. This may at least help you find out whether hard links are the cause of problems.
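The copy-and-rename procedure can be sketched in Python as below. Treat it as a rough outline rather than a tested repair tool: break_hard_links and the ".unlink-tmp" suffix are invented for the example, shutil.copy2 carries over timestamps and basic permission bits but not NTFS ACLs or alternate data streams, and os.replace will fail on files that are currently in use, so it is safest to try it on a copy of the data first.

```python
import os
import shutil
import stat

def break_hard_links(top, dry_run=True):
    """Replace each multiply-linked regular file under `top` with a private copy.

    Returns the list of paths that were (or, in dry-run mode, would be) rewritten.
    """
    changed = []
    for root, _subdirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode) or st.st_nlink < 2:
                continue                    # not a regular file, or only one name left
            changed.append(path)
            if dry_run:
                continue
            tmp = path + ".unlink-tmp"      # hypothetical temporary name
            shutil.copy2(path, tmp)         # new file with its own data and timestamps
            os.replace(tmp, path)           # swap the copy in under the original name
    return changed
```

Running it with dry_run=True first lists every affected path without touching anything. In a real run the last remaining name of each file is left alone, since by then its link count has dropped back to 1.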
Since you've converted them into hard links, you might be in luck and they might still show up as duplicates using something like DoubleKiller.
Either way, I doubt there's a utility for this exact task.
If all else fails I recommend a re-install...
To fix the operating system, use the System File Checker:
Insert the Windows XP installation CD.
Press Ctrl+Alt+Del to bring up Task Manager, go to File > Run (New Task), type sfc /scannow and click OK.
Note: this will only restore the system files, but it will get you going again. As for other affected software, you'll have to re-install or repair-install it where necessary.
SameFiles Assistant 3.1 might work:
Same Files Assistant is the hard links managing utility.
Specifically one feature it has:
You can roll back hard links to the regular files at any time.
Try Hard Link Magic, it might help.
Also Microsoft's Junction has the ability to recursively traverse directories and list/delete junction-points.
Just be careful to create a system restore point before you do these manipulations.
I have written a Perl script that identifies all regular files that are hard links to the same data. The script works fine on UNIX and Cygwin. I haven’t tested it with Strawberry Perl or any other Windows port of Perl, but I thought I’d share it anyhow. On Windows (Cygwin) I would open a terminal and do ./list-dup-hard-links /cygdrive/c/.
# list-dup-hard-links - list regular file names pointing to the same inode
# list-dup-hard-links DIRECTORY
# For each inode that is referred to by more than one regular file, print
# the inode number and the list of corresponding files.
# Peter John Acklam <[email protected]
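The approach this header describes, grouping regular files by inode and reporting every inode that has more than one name, can be sketched in Python. This is not the author's Perl code; hard_link_groups is a name chosen for the example, and on Windows it assumes a Python 3 version whose os.lstat fills in st_ino and st_nlink for NTFS files.

```python
import os
import stat
from collections import defaultdict

def hard_link_groups(top):
    """Map (device, inode) -> sorted list of paths under `top` that share that inode."""
    groups = defaultdict(list)
    for root, _subdirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            # Only regular files with more than one link can have duplicates.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    # Keep only inodes for which at least two names were found under `top`.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Printing each key's inode number followed by its paths gives a listing in the spirit of the script's description.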
On Unix, all hard links to one file share the same inode number. Perl's stat function returns file properties such as size, mode, timestamps and inode number, but on Windows it returns inode number 0 for every file. Use the Perl module Win32::IdentifyFile (CPAN) to get a file's on-disk identity instead: hard links to the same file report the same identity.