bash - How can I encode a URL for wget?

2014-07
  • aloisdg

    I am looking for a way to convert a string into a clean URL.

    For example:

     wget http://myurl.com/toto/foo bar.jpg
    

    This is going to download http://myurl.com/toto/foo and http://bar.jpg.

    I want to download http://myurl.com/toto/foo%20bar.jpg.

    I tried some flags like --restrict-file-names=ascii, but without success.

    I want a proper way to encode the URL, not a one-by-one character replacement.

    Any idea?

  • Answers
  • Bendoh

    Enclose the URL in quotes:

    wget "http://myurl.com/toto/foo bar.jpg"
    

    This is the general way of passing a string that contains spaces as a single argument; wget will then percent-encode the space itself when it makes the request.
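
    If you do need the percent-encoded URL itself (as the question asks), one option is to let Python do the encoding before handing the result to wget. A minimal sketch, assuming python3 is available on the system:

     url='http://myurl.com/toto/foo bar.jpg'
     # urllib.parse.quote percent-encodes unsafe characters; keep ':' and '/' as-is
     encoded=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=":/"))' "$url")
     wget "$encoded"   # fetches http://myurl.com/toto/foo%20bar.jpg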


  • Related Question

    url - Xargs and Wget stop working after an hour
  • Jake

    Running a script with Cygwin on Windows XP, on a dual-core machine with 4 GB of RAM

    cat url_list.txt | xargs -P50 wget -i
    

    I am trying to trawl through 4 GB of URLs to download (approximately 43 million).

    It works okay for about the first hour, then the Bash shell and the downloads stop, even though it is only 2% of the way through the URL list.

    Any ideas as to what could be wrong?

    What is the best way to debug why this stops after an hour?


  • Related Answers
  • Matrix Mole

    It's possible that wget is simply taking time to download some of the files. Are there any wget/xargs processes in memory during the period when it appears to be hung? If so, is it the full 50 processes you allocated with the -P50 flag to xargs, or has the count somehow crept above or below that number without new instances being spawned properly? Although it's being run under Cygwin, take a look at the process list in Windows itself, as each wget download should show up as a process in Task Manager.
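
    As a quick check, assuming Cygwin's ps is available, you can count how many wget instances are actually alive (the bracketed pattern keeps grep from matching itself):

     # -W also lists native Windows processes started outside Cygwin
     ps -W | grep -c '[w]get'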

  • Ole Tange

    I assume the URLs are for different sites. In that case you may hit sites that are slow to respond, and each of those will hang one of your wgets. Since you have 50 running, you only have to hit 50 such sites before progress stops entirely.

    To see whether this is the case, try killing one of the hanging wgets and check whether things then get unstuck.

    To skip URLs that hang you can give wget a timeout:

    wget -T 60
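
    Putting that back into the original pipeline might look something like the sketch below. It drops wget's -i (each URL is passed as an argument instead), sets a 60-second timeout with -T, and limits retries with -t; the exact numbers are only an illustration:

     # 50 wgets in parallel, one URL each, 60 s timeouts, at most 2 tries per URL
     xargs -P 50 -n 1 wget -T 60 -t 2 < url_list.txt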