osx - Attempting to combine diff command and display last modified between dupes

06
2014-04
  • New-2-Terminal

    I am attempting to do something a bit unique, and cannot find the right terminal command line to use.

    I am attempting to essentially solve for the following problem:

    I have two volumes of data that exist on separate servers. These volumes presumably have many duplicate files. However, due to the nature of these servers, I cannot programmatically delete anything, as each file is its own edge case. Therefore, I must find the duplicates between the two volumes, then work out which of the two volumes has the "latest and greatest" last-modified version of each dupe, and manually investigate each one.

    For the sake of simplicity, let's just call the volumes "Folder1" and "Folder2".

    I've gotten this far via the Terminal on my Mac machine:

    diff -rs /Folder1 /Folder2 > diff-test1.txt 
    

    This gives me the identities of the duplicates between the two, but does not tell me which volume has the 'latest and greatest'. Can anyone help?

  • Answers
  • Deditos

    I would use two rsync --dry-run commands rather than a single diff. E.g., given five files where some are newer in Folder1 and some are newer in Folder2,

    $ ls -port Folder?/*/*
    -rw-r--r--  1 Deditos   0 25 Jan 12:39 Folder2/subfolder/file03.txt
    -rw-r--r--  1 Deditos   0 25 Jan 12:39 Folder2/subfolder/file02.txt
    -rw-r--r--  1 Deditos   0 25 Jan 12:39 Folder2/subfolder/file01.txt
    -rw-r--r--  1 Deditos   0 25 Jan 12:39 Folder1/subfolder/file05.txt
    -rw-r--r--  1 Deditos   0 25 Jan 12:39 Folder1/subfolder/file04.txt
    -rw-r--r--  1 Deditos   0 25 Jan 12:39 Folder1/subfolder/file03.txt
    -rw-r--r--  1 Deditos  29 25 Jan 12:44 Folder1/subfolder/file01.txt
    -rw-r--r--  1 Deditos  29 25 Jan 12:44 Folder1/subfolder/file02.txt
    -rw-r--r--  1 Deditos  29 25 Jan 13:13 Folder2/subfolder/file04.txt
    -rw-r--r--  1 Deditos  29 25 Jan 13:13 Folder2/subfolder/file05.txt
    

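    You can check what rsync [SRC] [DEST] would copy from a source directory to update a destination directory. With -u (--update), rsync skips any file that is newer in the destination, so each run lists only the files whose newer copy is in the source: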

    $ rsync --dry-run -ariu ./Folder1/ ./Folder2/
    .d..t.... ./
    .d..t.... subfolder/
    >f.st.... subfolder/file01.txt
    >f.st.... subfolder/file02.txt
    $ rsync --dry-run -ariu ./Folder2/ ./Folder1/
    .d..t.... ./
    .d..t.... subfolder/
    >f.st.... subfolder/file04.txt
    >f.st.... subfolder/file05.txt
    

    This also gives you some info about the nature of the difference between the files, e.g., >f.st.... means that the size (s) and the time stamps (t) are different.
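
    As a reading aid for those codes, here is a minimal sketch of a decoder for the leading characters of an itemized string. The function name describe_itemized is hypothetical (it is not part of rsync), and it only looks at the first five positions (update type, file type, checksum, size, time), since the later positions vary between rsync versions.

```shell
# Hypothetical helper, not part of rsync: decode the first five
# characters of an --itemize-changes string such as ">f.st....".
# The body runs in a subshell "( ... )" so its variables do not leak.
describe_itemized() (
    code=$1
    # Position 1: update type
    case $code in
        '>'*) out="received" ;;
        '<'*) out="sent" ;;
        c*)   out="created locally" ;;
        .*)   out="no content update" ;;
        *)    out="other update type" ;;
    esac
    # Position 2: file type
    case $code in
        ?f*) out="$out, file" ;;
        ?d*) out="$out, directory" ;;
        *)   out="$out, other type" ;;
    esac
    # Positions 3-5: checksum, size, modification time
    case $code in ??c*)   out="$out, checksum differs" ;; esac
    case $code in ???s*)  out="$out, size differs" ;; esac
    case $code in ????t*) out="$out, mtime differs" ;; esac
    printf '%s\n' "$out"
)
```

    For example, describe_itemized '>f.st....' prints "received, file, size differs, mtime differs".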

    NB: If you omit the --dry-run flag, rsync will actually perform the transfer, so be careful.
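
    If rsync is not an option, the same question (which side holds the newer copy) can be approximated with find and the shell's file-age tests. This is only a sketch under some assumptions: the function name compare_mtimes is made up for illustration, files are paired by identical relative path rather than by content, file names containing newlines are not handled, and the -nt/-ot tests are widely supported shell extensions rather than strict POSIX.

```shell
# Hypothetical helper: for every regular file under $1 that also exists
# at the same relative path under $2, report which copy is newer.
# The body runs in a subshell "( ... )" so its variables do not leak.
compare_mtimes() (
    dir1=${1%/}
    dir2=${2%/}
    # Assumes no newlines in file names (one path per line from find).
    find "$dir1" -type f | while IFS= read -r path; do
        rel=${path#"$dir1"/}           # path relative to the first folder
        other=$dir2/$rel
        [ -f "$other" ] || continue    # skip files present on one side only
        if [ "$path" -nt "$other" ]; then
            printf 'newer in %s: %s\n' "$dir1" "$rel"
        elif [ "$path" -ot "$other" ]; then
            printf 'newer in %s: %s\n' "$dir2" "$rel"
        else
            printf 'same mtime: %s\n' "$rel"
        fi
    done
)
```

    Running compare_mtimes Folder1 Folder2 | sort > mtime-report.txt would then give a list to work through by hand; nothing is copied or deleted.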


  • Related Question

    Tool to make a DIFF between HTML tables?
  • Joannes Vermorel

    I am seeking a tool to make a DIFF between HTML tables, typically tables with identical layout, filled only with numbers, where the numbers differ from one version to another.

    Raw diff tools at the HTML level aren't readable enough for my purpose. I am looking for something in the spirit of TableTools, but with DIFF support.

    Does anyone know a solution for that?


  • Related Answers
  • Sualeh Fatehi

    Beyond Compare is by far the best visual diff tool for Windows. It is not free, but not expensive. From the website: "Compare .csv data or HTML tables in a Data Compare session"