command line - Compare two directories (each with multiple subdirectories) and find files that are present in only one of them on Windows

18
2014-05
  • rdorlearn

    I want to compare two directories (including subfolders) for files (for example, pictures) on two separate drives. I have thousands of files, and they are buried inside subdirectories.

    Small example:

    Let's say C: contains the following folders and files:

    C:\folder1
              file01.jpeg
              file02.jpeg
    
    C:\folder1\folder2 
              file10.jpeg
    
    C:\folder1\folder2\folder3 
              file04.jpeg
              file05.jpeg
    
    C:\folder1\folder4
    
              file06.jpeg
              file07.jpeg
              file03.jpeg
    

    D: has a similar folder structure (it may not be exact, except that folder1 is the same), but some files may be missing or additional:

    D:\folder1
              file01.jpeg
              file08.jpeg
    
    D:\folder1\folder2 
              file03.jpeg
    
    D:\folder1\folder2\folder3 
              file04.jpeg
    
    D:\folder1\folder4
              file06.jpeg
              file07.jpeg
              file09.jpeg
    

    Now I need a quick way (maybe a DOS command line or software) to find the files that are missing or additional and put them in a new directory, say D:\difference:

    D:\difference
        file02.jpeg 
        file05.jpeg
        file08.jpeg 
        file09.jpeg
        file10.jpeg     
    
  • Answers
  • Kevin Fegan

    A script to move the files is below...

    If you just want to see which files are the same/different, you can use windiff. This might help with troubleshooting problems with the script.

    So, for your example:

    C:> windiff c:\folder1 d:\folder1
    

    Windiff will open and show which files are:

    • Identical
    • Different (indicating which file is newer)
    • Left-only (file exists only in the "first path", C:\folder1)
    • Right-only (file exists only in the "second path", D:\folder1)

    You can save the findings to a file using the -S command-line option:

    -SS N:\path\filename.ext   [save list of identical files to filename.ext]
    -SD N:\path\filename.ext   [save list of different files to filename.ext]
    -SL N:\path\filename.ext   [save list of left-only files to filename.ext]
    -SR N:\path\filename.ext   [save list of right-only files to filename.ext]
    

    Also, you can include X with -S to close windiff after writing the list, like this:

    -SRX N:\path\filename.ext   [save list of right-only files to filename.ext, then close windiff]
    

    You can combine the lists, so if you want a list of the files that exist in only (any) one of the paths:

    C:> windiff -SLRX leftrightonly.txt c:\folder1 d:\folder1
    

    You can only generate one "log" file at a time, so if you wanted to generate all 4 individual "log" files you would have to run windiff 4 times:

    C:> windiff -SSX same.txt c:\folder1 d:\folder1
    C:> windiff -SDX different.txt c:\folder1 d:\folder1
    C:> windiff -SLX leftonly.txt c:\folder1 d:\folder1
    C:> windiff -SRX rightonly.txt c:\folder1 d:\folder1
    

    Note: Files that exist in both paths, but located in different folders, will be shown as "leftonly" or "rightonly".
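
    For a scriptable listing of the same four categories, a minimal Python sketch (not part of windiff, and only an approximation of what it reports) could walk both trees and compare files by their path relative to each root. The function names relative_files and classify are hypothetical, and "identical" versus "different" is decided here by a byte-for-byte comparison rather than by which file is newer. As in the note above, a file that sits in a different subfolder on each side shows up as both left-only and right-only.

```python
import filecmp
import os


def relative_files(root):
    """Return the set of file paths under root, expressed relative to root."""
    found = set()
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            full = os.path.join(folder, name)
            # normcase lower-cases paths on Windows so the comparison
            # ignores case there; on other systems it leaves them unchanged.
            found.add(os.path.normcase(os.path.relpath(full, root)))
    return found


def classify(left, right):
    """Sort files into left-only, right-only, identical and different."""
    left_files = relative_files(left)
    right_files = relative_files(right)
    report = {
        "left_only": sorted(left_files - right_files),
        "right_only": sorted(right_files - left_files),
        "identical": [],
        "different": [],
    }
    for rel in sorted(left_files & right_files):
        # shallow=False forces a content comparison instead of
        # trusting size and modification time alone.
        same = filecmp.cmp(
            os.path.join(left, rel), os.path.join(right, rel), shallow=False
        )
        report["identical" if same else "different"].append(rel)
    return report
```

    Calling classify(r"C:\folder1", r"D:\folder1") would return a dictionary with one sorted list of relative paths per category, which can then be printed or written to a file.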



    If you want a script to move the files that exist in only one of the paths to a different folder, you can use the batch script below.

    Notes:

    • I'm calling a file that exists in only one of the paths a "lonely" file.
    • The script below (with "domove=0") only displays "lonely" files without moving them. After you have tested the script and are confident that the correct files will be moved, set "domove=1" to have the "lonely" files both displayed and moved.
    • In the script, set sdrive1, sdrive2, sfolder, and sdifffolder as necessary.
    • Alternatively, set spath1, spath2, and spathdiff if that is more appropriate for your use.
    • If desired, the script could easily be modified to accept these paths from the command line.

    I have made the following assumptions:

    • For each file in the "first path", the entire "second path" is searched for a matching "filename.ext".
    • If the file is NOT found in the "second path" (lonely file), it is moved to a "difference" folder.
    • No attempt is made to compare files that have matching filenames, but that functionality could easily be added.
    • No attempt is made to account for the possibility that multiple files could have the same name and be located in different subfolders of the "first path" (same for "second path"). If this happens for a file that is a "lonely" file, each of those files will be moved to the "difference" folder, overwriting any previously moved files of the same name.
    • After each file in the "first path" is searched for within the "second path", the process is repeated in the other direction, and for each file in the "second path", the entire "first path" is searched for a matching "filename.ext".

    Here is the script:

    @echo off
    
    rem use "domove" for testing
    rem if "%domove%" !=1, "lonely" files found will only be displayed (not moved).
    rem if "%domove%"  =1, "lonely" files found will be displayed and moved.
    set "domove=0"
    
    set "sdrive1=C:\"
    set "sdrive2=D:\"
    set "sfolder=folder1"
    set "sdifffolder=difference"
    
    set "spath1=%sdrive1%%sfolder%"
    set "spath2=%sdrive2%%sfolder%"
    set "spathdiff=%sdrive2%%sdifffolder%"
    rem spath1=C:\folder1, spath2=D:\folder1, spathdiff=D:\difference
    
    rem ***************************************************
    
    rem check if "path1" and "path2" exist
    if exist "%spath1%" if exist "%spath2%" goto :check2
    if not exist "%spath1%" echo Error: Path1:"%spath1%" does not exist.>&2
    if not exist "%spath2%" echo Error: Path2:"%spath2%" does not exist.>&2
    goto :EOF
    
    
    
    :check2
    
        rem check if "path1" is empty (no files)
        dir /a-d /s /b "%spath1%">nul 2>&1
        if %errorlevel% EQU 0 goto :check3
        echo Error: Path1:"%spath1%" is empty (no files).>&2
        goto :EOF
    
    
    
    :check3
    
        rem check if "path2" is empty (no files)
        dir /a-d /s /b "%spath2%">nul 2>&1
        if %errorlevel% EQU 0 goto :check4
        echo Error: Path2:"%spath2%" is empty (no files).>&2
        goto :EOF
    
    
    
    :check4
    
        rem check if "%spathdiff%" exists, but is a file (error)
        if not exist "%spathdiff%" goto :start
        if exist "%spathdiff%\*" goto :start
        echo Error: Folder "%spathdiff%" conflicts with a file with the same name.>&2
        goto :EOF
    
    
    
    :start
    
        rem get a list of all files in "first path", call :work1
        rem passing "(path1:)C:\path\...\filename.ext", "filename.ext", and "D:\path2"
        for /f "usebackq delims=" %%f in (`dir /s /b /a-d "%spath1%"`) do call :work1 "%%~f" "%%~nxf" "%spath2%"
    
        rem reverse the paths:
        rem get a list of all files in "second path", call :work1
        rem passing "(path2:)D:\path\...\filename.ext", "filename.ext", and "C:\path1"
        for /f "usebackq delims=" %%f in (`dir /s /b /a-d "%spath2%"`) do call :work1 "%%~f" "%%~nxf" "%spath1%"
    
        rem done, exit
        goto :EOF
    
    
    
    :work1
    
        set "w1full=%~1"
        set "w1file=%~2"
        set "wpath2=%~3"
    
        rem "%w1file%" is the "target" "filename.ext" from "first path" to look for (in "second path").
        for /f "usebackq delims=" %%g in (`dir /s /b /a-d "%wpath2%"`) do call :work2 "%%~nxg" "%w1file%" "w1file"
    
        rem if "target" "filename.ext" from "first path" was found in the "second path",
        rem it is now empty. it means:
        rem file is somewhere in both paths... no action. go get next file from "first path"
        if "%w1file%."=="." goto :EOF
    
        rem at this point, "%w1file%" ("%w1full%") is a "lonely" file
        rem "%w1file%" only exists in "path1" move it to "difference" path
        rem additional checks might be necessary here
        rem to see if this file already exists in "difference" path
        if not exist "%spathdiff%" md "%spathdiff%">nul 2>&1
        echo Found "lonely" file %w1file%:   move "%w1full%" "%spathdiff%"
        if %domove% EQU 1 move /y "%w1full%" "%spathdiff%">nul 2>&1
        rem you can test %errorlevel% here for error: %errorlevel%=0 if no error
    
        rem go get next file from "first path"
        goto :EOF
    
    
    
    :work2
    
        rem %1 is "current" "filename.ext" from "second path"
        rem %2 is "target" "filename.ext" from "first path"
    
        rem if "target" "filename.ext" is empty, return for more
        if "%~2."=="." goto :EOF
    
        rem if file from "first path" is found in "second path", 
        rem "clear" the variable holding the filename of the "target" "filename.ext" from "first path"
        rem additional checks might be necessary here
        rem to "compare" the two files
        if /I "%~1"=="%~2" set "%~3="
        goto :EOF
    

    Here is the output testing the script with the sample fileset you described:

    Found "lonely" file file02.jpeg:   move "c:\folder1\file02.jpeg" "d:\difference"
    Found "lonely" file file10.jpeg:   move "c:\folder1\folder2\file10.jpeg" "d:\difference"
    Found "lonely" file file05.jpeg:   move "c:\folder1\folder2\folder3\file05.jpeg" "d:\difference"
    Found "lonely" file file08.jpeg:   move "d:\folder1\file08.jpeg" "d:\difference"
    Found "lonely" file file09.jpeg:   move "d:\folder1\folder4\file09.jpeg" "d:\difference"
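
    For comparison, the same matching rule (a file is "lonely" when its filename.ext appears anywhere under only one of the two paths, ignoring case) can be sketched in Python. This is an illustrative alternative, not a port of the batch script: the names names_by_lowercase, find_lonely, and move_lonely are hypothetical, and do_move plays the role of "domove". It indexes each tree once instead of rescanning the other path for every file, and like the batch script it makes no attempt to compare file contents or to protect same-named "lonely" files from overwriting each other in the "difference" folder.

```python
import os
import shutil


def names_by_lowercase(root):
    """Map lower-cased file name -> full paths found anywhere under root."""
    index = {}
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            index.setdefault(name.lower(), []).append(os.path.join(folder, name))
    return index


def find_lonely(path1, path2):
    """Return full paths of files whose name occurs under only one tree."""
    first = names_by_lowercase(path1)
    second = names_by_lowercase(path2)
    lonely = []
    # Symmetric difference: names present in exactly one of the two indexes.
    for key in sorted(first.keys() ^ second.keys()):
        lonely.extend(first.get(key, []) + second.get(key, []))
    return lonely


def move_lonely(path1, path2, diff_folder, do_move=False):
    """Display lonely files; move them to diff_folder only if do_move is True."""
    lonely = find_lonely(path1, path2)
    for full in lonely:
        name = os.path.basename(full)
        print(f'Found "lonely" file {name}: "{full}" -> "{diff_folder}"')
        if do_move:
            os.makedirs(diff_folder, exist_ok=True)
            shutil.move(full, os.path.join(diff_folder, name))
    return lonely
```

    As with the batch script, a dry run such as move_lonely(r"C:\folder1", r"D:\folder1", r"D:\difference") only lists the files; passing do_move=True moves them. The "difference" folder should sit outside both compared paths so that moved files are not picked up by a later run.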
    
  • Synetech

    If your goal is simply to detect which files are present in only one folder and copy them, then you can use a file-sync program. There are plenty that can do the job.

    A folder-comparison program gives you better control because you can review the differences, and such programs usually also offer more refined comparison parameters. You indicated that you simply want to copy the files rather than getting a list or integrating it into a script, so a GUI program is probably ideal for your situation.

    One of the best for this purpose would be WinMerge. Not only does it let you compare the contents of two directories and provide various filters like hiding identical files, but it makes it extremely easy to copy over the ones you want (figure 1).

    It’s also actively developed, so you can file bug-reports and feature-requests. Best of all, it’s free.


    Figure 1: WinMerge has several methods to synchronize files

    Screenshot of WinMerge with several methods of synchronizing files shown


  • Related Question

    windows - Script to create folders in multiple directories using YYYYMMDD date as the folder name
  • Questioner

    At work every morning I have to create multiple file folders (using a YYYYMMDD date format as the folder name) in different directories across our network for various departments. This is a real pain and a time waster, and I would like to automate the process. So my question is: does anyone know how I can write a script that uses the current system date in YYYYMMDD format and creates multiple folders in different network directories, with each folder named as the date in YYYYMMDD format? Thanks in advance for your answers.


  • Related Answers
  • Snark

    Create a batch file that looks like this:

    @echo off
    for /F "tokens=2-4 delims=/ " %%i in ('date /t') do set yyyymmdd=%%k%%j%%i
    echo Date: %yyyymmdd%
    
    mkdir \\server1\share1\subdir1\%yyyymmdd%
    mkdir \\server1\share2\subdir2\%yyyymmdd%
    mkdir \\server2\share3\subdir3\%yyyymmdd%
    ...
    

    Warning: the format of the date (yyyymmdd=%%k%%j%%i) depends on your regional settings. Because I use the French date format (dd/mm/yyyy), I have to use "%%k%%j%%i" as the format (%%i = day, %%j = month, %%k = year).

    If your regional settings are set to US style (mm/dd/yyyy), you should use "%%k%%i%%j" (%%i = month, %%j = day, %%k = year).


    If you want to include the time as well, use this:

    @echo off
    for /F "tokens=2-4 delims=/ " %%i in ('date /t') do set yyyymmdd=%%k%%j%%i
    echo Date: %yyyymmdd%
    rem "." and "," are added to the delimiters so fractional seconds in %time% are dropped
    for /F "tokens=1-3 delims=:., " %%i in ('echo %time%') do set hhmmss=%%i%%j%%k
    echo Time: %hhmmss%
    
    mkdir \\server1\share1\subdir1\%yyyymmdd%%hhmmss%
    

    The date is stored in the variable %yyyymmdd% and the time in %hhmmss%. The remark above about regional settings applies to the date, but not to the time.

    You could use a separator between the date and time: %yyyymmdd%_%hhmmss% for instance.
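
    If a scripting language is available, the dependence on regional settings can be avoided altogether. The following Python sketch is an illustrative alternative to the batch file, not a replacement suggested by it: dated_name and make_dated_folders are hypothetical names, and the list of parent directories is whatever share paths apply on your network. The date and time are formatted explicitly, so the result is YYYYMMDD (optionally followed by a separator and HHMMSS) whatever the system locale is.

```python
import os
from datetime import datetime


def dated_name(now=None, with_time=False, separator="_"):
    """Build a folder name such as 20140507 or 20140507_090305."""
    now = now or datetime.now()
    name = now.strftime("%Y%m%d")
    if with_time:
        name += separator + now.strftime("%H%M%S")
    return name


def make_dated_folders(parents, now=None, with_time=False):
    """Create one dated folder inside each parent directory."""
    name = dated_name(now, with_time)
    created = []
    for parent in parents:
        target = os.path.join(parent, name)
        # exist_ok=True makes a second run on the same day harmless.
        os.makedirs(target, exist_ok=True)
        created.append(target)
    return created
```

    For example, make_dated_folders([r"\\server1\share1\subdir1", r"\\server1\share2\subdir2"]) would create today's folder under both shares and return the full paths it created.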

  • user7963

    Another way, uglier but much more flexible, is to generate a separate batch file for every directory that needs to be created. Each batch file (a) creates its directory and (b) renames the next batch file to be executed to a previously chosen common name. You then just run the batch file with that common name every day.