wget Output Conditionals
Do any of my techie readers have an idea about how to solve the following problem in a simple manner? (Or what would be a good place to ask, e.g. a news group that hasn't fallen out of use or something like Perl Monks?)
Once a day, I back up my del.icio.us bookmarks with the following:
wget -q -t1 --http-user=josephgrossberg --http-passwd=password -O foo.txt http://del.icio.us/api/posts/all
However if, for some reason, there is an empty response (e.g. del.icio.us is down), my file is overwritten with nothing.
After reading through some man pages, I found the promising --no-run-if-empty option for xargs:
If the standard input does not contain any non blanks, do not run the command. Normally, the command is run once even if there is no input.
I tested it out and something like echo "joe" | xargs --no-run-if-empty echo > foo.txt behaved correctly (it wrote to the file), as did echo "" | xargs --no-run-if-empty echo > foo.txt (it did not write to the file).
However, this still doesn't work with wget:
wget -q -t1 --http-user=josephgrossberg --http-passwd=password http://del.icio.us/api/posts/all | xargs --no-run-if-empty echo > foo.txt
The foo.txt file gets overwritten with 0 bytes.
Now, I suppose there could be something "obvious" I'm missing, but is there any canonical UNIX way to solve this problem other than writing a full-fledged bash script or piping it to Perl or the like?
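One likely explanation, offered as a sketch rather than a diagnosis of this exact setup: the shell opens and truncates the target of ">" before any command in the pipeline runs, so foo.txt is emptied whether or not xargs ever invokes echo. Separately, wget without "-O -" saves the response to a file instead of printing it to standard output, so the pipe receives nothing. The snippet below demonstrates only the truncation, using "true" as a stand-in for a command that prints nothing; the scratch directory and the variable names are made up for illustration.

```shell
# Hypothetical scratch directory so that no real backup is touched.
demo_dir=$(mktemp -d)
printf 'yesterday\n' > "$demo_dir/foo.txt"

# "true" prints nothing, yet the shell has already opened and
# truncated foo.txt by the time the command starts.
true > "$demo_dir/foo.txt"

size_after=$(wc -c < "$demo_dir/foo.txt")
echo "bytes left in foo.txt: $size_after"
```

The same thing happens with any command on the left of the redirect, which is why writing to a separate file first is the safer pattern.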
If you want portable, I'd go with this:
$ wget -O foo.txt.new [...]
$ if [ `wc -c < foo.txt.new` -gt 0 ]; then mv foo.txt.new foo.txt; else rm foo.txt.new; fi
But this only protects against a zero-byte response. Wouldn't it be better to save your backups with YYYYMMDD embedded in the filename, and keep only the last N days' worth of backups?
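The two shell lines above can be folded into a small helper. This is only a sketch: the function name replace_if_nonempty and the scratch files in the usage lines are invented for illustration, and it uses "test -s" in place of the "wc -c" comparison, which checks the same thing (the file exists and is larger than zero bytes).

```shell
# replace_if_nonempty NEW TARGET
# Move NEW over TARGET only when NEW is non-empty; otherwise discard NEW
# and return a non-zero status. (Hypothetical helper, not a standard tool.)
replace_if_nonempty() {
    if [ -s "$1" ]; then
        mv -f "$1" "$2"
    else
        rm -f "$1"
        return 1
    fi
}

# Usage with stand-in files instead of a real download:
work_dir=$(mktemp -d)
printf 'old bookmarks\n' > "$work_dir/foo.txt"
: > "$work_dir/foo.txt.new"    # simulate an empty response
replace_if_nonempty "$work_dir/foo.txt.new" "$work_dir/foo.txt" || echo "kept previous copy"
```

In the real job, the first argument would be whatever file wget -O wrote, and the non-zero return status could be used to log or mail a warning.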
Dossy:
Thanks for the response.
FWIW, I download the XML to the same filename every day, but have it all under CVS. This gives me the daily snapshots, without a whole mess of files to worry about.
Posted by: Joe Grossberg on September 11, 2005 12:14 PM
Ah, see -- if you're using CVS, then I say just fetch with wget and delete it if it's empty. Then, you follow with "cvs ci" then "cvs up" -- if the file was empty, it'll be deleted and "cvs ci" won't check in an update. "cvs up" will be a no-op if a new file was fetched, and it will pull out the last repository copy if the fetch was zero-byte since you rm'ed it.
$ wget ...
$ test \! -s filename.txt && rm filename.txt
$ cvs ci -mblah filename.txt
$ cvs up filename.txt