windows - Command line solution for removing parts from a binary file?

06
2013-09
  • zsero

    I have a binary file that I would like to remove parts from. By removing I mean deleting those parts, thus making the file smaller.

    The parts are delimited by two ASCII strings. For example, the file would look like this:

    ........ start ABCD end ..... start EFGH end ..... start IJKL end ...........
    

    So in this file, I would like to search for strings "start" and "end" and remove the parts between them.

    The way I think I can do it is to

    1. lookup all the locations for "start" and "end"
    2. calculate ranges from that
    3. delete those parts
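
    The three steps above can be sketched in Python on raw bytes. This is only an illustration: the helper names find_ranges and delete_ranges are made up here, the markers are assumed to be the literal bytes "start" and "end", the markers themselves are removed along with the content between them, and the whole file is assumed to fit in memory.

```python
def find_ranges(data: bytes, start: bytes = b"start", end: bytes = b"end"):
    """Steps 1 and 2: locate each start...end region as (begin, stop) offsets."""
    ranges = []
    pos = 0
    while True:
        begin = data.find(start, pos)
        if begin == -1:
            break
        stop = data.find(end, begin + len(start))
        if stop == -1:
            break  # unmatched start marker: leave the rest untouched
        stop += len(end)  # include the end marker in the range
        ranges.append((begin, stop))
        pos = stop
    return ranges


def delete_ranges(data: bytes, ranges):
    """Step 3: return a copy of data with the given byte ranges cut out."""
    kept = []
    pos = 0
    for begin, stop in ranges:
        kept.append(data[pos:begin])
        pos = stop
    kept.append(data[pos:])
    return b"".join(kept)


# Small in-memory sample containing non-text bytes and a newline.
sample = b"\x00\x01 start ABCD end \xff\n start EFGH end \x02"
ranges = find_ranges(sample)
print(ranges)
print(delete_ranges(sample, ranges))
```

    To keep the markers and drop only what lies between them, the range would instead run from begin + len(start) to the offset where the end marker was found.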

    Now I am using some GUI based Hex editor and I use the "Search All", "Select Range" and "Delete" commands, but I am sure it would be possible to solve it using some powerful command line hex/text editors.

    Do you know any solution for this problem which doesn't require using a GUI for looking up, copy & paste on clipboard, select range and delete commands but is just a few lines of command line?

    I am interested in both Linux shell scripts and command line hex editors under Windows; Python scripts are welcome too.

    Do you think it is possible to solve this problem with a simple regex replace? Is there any regex replace utility that handles binary files well?

  • Answers
  • Kerrek SB

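    This sounds like a job for perl. The -0777 switch reads the whole file at once instead of line by line, and the /s modifier lets . match newline bytes, so a region that spans a newline is still removed. Under cmd.exe, use double quotes around the script; since Perl on Windows opens STDIN and STDOUT in text mode by default, also start the script with BEGIN { binmode STDIN; binmode STDOUT } there to avoid newline translation.

    perl -0777 -pe 's/start.*?end//gs' < inputfile.bin > outputfile.bin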
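
    Since the question also welcomes Python, the same regex replacement can be sketched with the re module on bytes. The function names strip_regions and strip_file are hypothetical, and the sketch assumes the file fits in memory.

```python
import re

# Bytes pattern; re.DOTALL lets "." match newline bytes as well,
# and ".*?" is non-greedy so each start...end pair is matched separately.
PATTERN = re.compile(rb"start.*?end", re.DOTALL)


def strip_regions(data: bytes) -> bytes:
    """Remove every start...end region (markers included) from data."""
    return PATTERN.sub(b"", data)


def strip_file(src: str, dst: str) -> None:
    """Read src in binary mode, strip the regions, write the result to dst."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(strip_regions(data))
```

    Calling strip_file("inputfile.bin", "outputfile.bin") would mirror the perl one-liner. Opening both files in binary mode ("rb"/"wb") avoids any newline translation on Windows.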
    

  • Related Question

    Deleting all files that do not match a certain pattern - Windows command line
  • KdgDev

    Deleting items via the command-line is pretty easy.

    del /options filename.extension
    

    Now, suppose I want to delete all files in a folder which do not end with .jpg. How would I do that?

    The thing is, I have a piece of software that converts all specified images to .jpg, but it leaves the originals, which I don't need anymore.

    It would be much more efficient to execute a single statement, compared to doing multiple statements for every different filetype.


  • Related Answers
  • bobbymcr

    I would do it like this:

    attrib +r *.jpg
    del /q *
    attrib -r *.jpg
    

    This will first make all JPG files read-only, delete everything else (it will automatically skip read-only files), and then make the JPG files writeable again.

  • Joey

    That's actually pretty easy.

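    You'll need the for command to iterate over the files and then simply check the extension:

    for %f in (*) do if /i not "%~xf"==".jpg" del "%f"
    

    should do the trick. The /i switch makes the comparison case-insensitive, and the quotes keep the test valid for files that have no extension. In a batch file, write %%f instead of %f.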
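
    For comparison, the same loop can be sketched in Python. The function name delete_except and its dry_run parameter are made up for this illustration; by default it only reports what it would delete.

```python
from pathlib import Path


def delete_except(folder, keep_ext=".jpg", dry_run=True):
    """Delete every regular file in folder whose extension is not keep_ext.

    The comparison is case-insensitive. With dry_run=True nothing is
    deleted; the names that would be removed are returned, sorted.
    """
    removed = []
    for path in Path(folder).iterdir():
        if path.is_file() and path.suffix.lower() != keep_ext.lower():
            if not dry_run:
                path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

    Running delete_except(".") first and checking the returned list before repeating the call with dry_run=False is a cheap safeguard, since deleted files do not go to the Recycle Bin.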

  • ChrisF

    I know it's not answering your question directly, but have you looked at the options on your converter to see if:

    a) It can delete the originals itself

    or

    b) Write the .jpg's to a new folder?

  • Simon Sheehan

    I was looking for a way to find all files that did NOT have the extension ".mp3" in a directory TREE on Windows 7 (NTFS volume) containing perhaps 20,000 files in several hundred directories of various depths... so after a bit of angst, I used:

    cd <theplace>
    dir /S | find /V "<DIR>" | find /V "Total" | find /V "bytes" | find /V "Directory" | find /V "Volume" | find /V ".mp3" | more /S
    

    This lists the files that do not match .mp3 after stripping out everything else in the DIR output. It works about 99% of the time: it fails only if a non-matching file's name contains one of the keywords from the standard DIR output. Perhaps there's a way to make DIR report less header/summary info, but I didn't bother as this got most of the way there.
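
    A way to sidestep parsing the DIR output altogether is to walk the tree in a script. The following Python sketch uses os.walk; the function name files_without_ext is made up for this example, and the extension comparison is assumed to be case-insensitive.

```python
import os


def files_without_ext(root, ext=".mp3"):
    """Yield the path of every file under root whose name does not end with ext."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(ext.lower()):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Print every non-.mp3 file below the current directory.
    for path in files_without_ext("."):
        print(path)
```

    Because only real file names are examined, a file called "Total" or "Volume" is reported like any other.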