Cannot get wget to run in shell script run by launchd
I have a shell script that downloads any updated files on my website to my computer. When run from terminal everything works as expected. The backup folders are created, the website is downloaded and copied to the appropriate folders.
I added the script to launchd, to run once a day. The script is definitely executed through launchd, because the folders are created as expected. The only thing that doesn't work is the wget command. wget apparently doesn't run at all, because it doesn't write any log files. Is wget special compared to other commands like mkdir or cp? Do I need to call it in a special way when running the script through launchd? I would greatly appreciate any hints!

A short version of the script is below (my complete script has a loop for downloading several sites): Code:
#!/bin/bash
Try using the full path to wget, e.g. /usr/local/bin/wget.
The other commands are included in a Mac OS X installation, but wget is not. You must have installed it with Fink, MacPorts, source compilation, etc. Try adding the full path to the wget binary.
The PATH you have in a Terminal session is different from the PATH used by launchd jobs. You could alter the PATH environment variable, but an easier way is to tell the script exactly where the command is. It'd probably be a good idea to use the full path for the other commands too, but it's essential for any you've custom-installed.
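To make that concrete, here is a minimal sketch of both approaches; the /opt/local/bin path is an assumption for a MacPorts install, so check yours with `which wget` first:

```shell
#!/bin/bash
# launchd runs jobs with a minimal PATH, so a script that works fine in
# Terminal can fail to find custom-installed tools under launchd.

# Option 1: prepend the install directory to PATH for the whole script
# (MacPorts installs to /opt/local/bin, Fink and manual builds often to
# /usr/local/bin -- adjust to your setup).
export PATH="/opt/local/bin:/usr/local/bin:$PATH"

# Option 2: call the binary by its absolute path.
WGET="/opt/local/bin/wget"

echo "$PATH" | cut -d: -f1    # first PATH entry is now /opt/local/bin
```

Option 2 is the more robust of the two, since it doesn't depend on how launchd sets up the environment at all.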
Thanks, I'll try this right away. I haven't figured out where wget is located yet, though.
To find where you have wget installed, type:
"which wget" |
I didn't know about "which", very useful. Thanks!
I have an old version of wget located at /usr/local/bin/wget, but the newest one is located at /opt/local/bin/wget
wget is not installed by default; OS X uses a binary called curl, which is essentially the same thing.
Code:
bash-3.2# whereis wget
I have wget installed, so that's not the issue:
Code:
bjarte-mac:Sites bjarte$ which wget
/opt/local/bin/wget
It might be easier if I used curl, though. Thanks for the tip.
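For single-file fetches the switch is simple; a rough sketch of the curl equivalent (the URL is a placeholder):

```shell
# curl ships with OS X, so there is nothing to install. This is roughly
# equivalent to `wget http://example.com/file.txt`:
curl -O http://example.com/file.txt   # -O saves under the remote file name

# Note: curl has no counterpart to wget's recursive mirroring (-m / -r),
# so for whole-site backups wget remains the easier tool.
```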
I think (as others have said a few times above) that 'wget' is not in the PATH that is used by 'launchd'. You should (as has been said) use the full path.
Thanks to everyone who helped me out. The wget issue is solved. At the moment I'm trying to figure out the best way to copy files recursively while keeping the folder structure of the files I'm copying. As soon as I've got that sorted, I'll post the complete backup script.
I'd suggest using rsync if possible. If you have ftp access I presume that you might also have ssh access? If so, and if rsync is available on the other end as well, you can use the two for much quicker and complete syncing.
Code:
rsync -e ssh
Of course, then you need a way to authenticate and you'll get into using certificates. But, if you can, it's probably a "better" option.
I do MySQL and webroot backups via cron on my Linux web server. I have the scripts run locally, and each one is on a different timer. I also have a cron job to run the con.php script off my site.
Summary: The following script will let you download updated files from your websites once a day, and create a separate folder for each day's files. In addition, once a month a complete backup of your website will be downloaded, zipped and stored separately. This script downloads 3 separate websites, but you can easily add more websites.
I cannot use rsync; it's not installed on the servers, and my webhost will not install it. I finally figured out a way to download the new files from my websites using wget, then copy only the files changed in the last 24 hours to a new folder. It's important for me to see which files changed each day. That way, as soon as I detect an error, I can find out when it happened and restore only the files with errors in them.

I initially had one long script that did both the fetching (wget) and the copying (cp), but when I ran the script through launchd, the script would not wait for wget to finish downloading, and the copy commands would be executed with nothing to copy. I solved this by making two separate scripts, both run through launchd with a 3-hour pause between them, so I could be certain wget was finished downloading before the copying started.

Below are the two scripts and the launchd job files. You have to edit the variables to suit your computer (for example where you keep your shell scripts, where you store your backups, and your website's URL, user name and password).

Add the jobs to launchd by using this command: Code:
launchctl load backup-download.xml
Make the scripts executable: Code:
chmod 777 backup-download.sh
Copy this text and save as backup-download.sh: Code:
#!/bin/bash
Copy this text and save as backup-copy.sh: Code:
#!/bin/bash
Copy this text and save as backup-download.xml: Code:
<?xml version="1.0" encoding="UTF-8"?>
Copy this text and save as backup-copy.xml: Code:
<?xml version="1.0" encoding="UTF-8"?>
Launchd and shell scripting for OS X is new to me, I'm used to Linux and cron, and there are most likely better ways to do this. Thanks to everyone who responded to my post.
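The script bodies above came through truncated, so here is a hedged sketch of the general shape such a pair of steps could take. All URLs, credentials and paths are placeholders, and wget's -m/-N mirroring flags are one way to fetch only changed files, not necessarily what the original scripts used:

```shell
#!/bin/bash
# Download step (sketch): mirror the site, fetching only files that are
# newer than the local copies. Names below are placeholders.
WGET=/opt/local/bin/wget
MIRROR="$HOME/backups/mirror"
mkdir -p "$MIRROR"
"$WGET" -m -N --ftp-user=user1 --ftp-password=password1 \
        -P "$MIRROR" ftp://website1.example.com/

# Copy step (sketch): collect files modified in the last 24 hours into a
# dated folder. Note this flattens the folder structure, which the thread
# discusses working around.
TODAY=$(date +%Y-%m-%d)
DAILY="$HOME/backups/daily/$TODAY"
mkdir -p "$DAILY"
find "$MIRROR" -type f -mtime -1 -exec cp {} "$DAILY" \;
```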
Nicer ways to enumerate the sites
Code:
for ((i=0;i<=2;i++)); do echo $i; done
Or even nicer: Code:
sites=("first" "next")
@baf: I didn't know about this way to write a for-loop in bash, that's clever:
Code:
for ((i=0;i<=2;i++)); do echo $i; done
Something like this: Code:
sites=( ("website1" "user1" "password1") ("website2" "user2" "password2") )
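bash has no nested arrays, so the syntax just above won't actually parse. One common workaround is parallel arrays indexed together; the values here are placeholders:

```shell
#!/bin/bash
# Parallel arrays: element i of each array describes the same site.
sites=("website1" "website2")
users=("user1" "user2")
passwords=("password1" "password2")

# "${!sites[@]}" expands to the list of indices: 0 1 ...
for i in "${!sites[@]}"; do
    echo "site=${sites[$i]} user=${users[$i]} pass=${passwords[$i]}"
done
```

As long as the arrays are edited together, this gives the same effect as a two-dimensional table.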
Ahhh I only looked at the second script where you only had one array.
So use version 1 in the first script. But now you know that version too, next time you need it :D.
Maybe I should have written this whole thing in Perl. As far as I can see from a quick Google search, you cannot use multidimensional arrays in bash scripts.
Anyway, my script works, so my biggest question at the moment is why the script, when run through launchd, doesn't wait for wget to finish before executing the rest of the commands. Is there some kind of wait command I can use?
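bash does have a `wait` builtin, but it only matters for jobs started in the background with &; a foreground wget already blocks the script on its own, which is what makes the launchd behaviour described in this thread so odd. A small demo, with sleep standing in for wget:

```shell
#!/bin/bash
# `wait` blocks until all background jobs (started with &) have exited.
(sleep 1; echo "download finished") &   # stands in for a backgrounded wget
wait                                    # block here until the job is done
echo "copying starts"                   # always runs after the job above
```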
If I remember correctly, this happens:
When I run the script from a terminal, this happens (the desired effect):
- wget downloads files (I can see the output folder filling up)
- when no more files are downloaded (wget is finished), the backup directory is created
- some files are copied from the output directory to the backup directory

When I run the script using launchd, this happens (not the desired effect):
- wget starts downloading files (I can see the output folder filling up)
- immediately after wget starts, the backup directory is created
- no files are copied, presumably because no files are downloaded yet
- wget finishes downloading

I have another problem: the launchd job disappears after reboot. How do I make it stay loaded?
Sorry about the confusion: The reason why I created two separate scripts was to force the copying to happen after the wget downloading was finished.
In my original script, I had both the downloading (wget) and the copying (cp) in the same script, and the events described above happened when I ran that script first in a terminal and then as a launchd job. Please reply if it is still unclear. As you understand, I solved my problem by creating two separate scripts, but I find it very odd that this happens. At the moment, the problem with jobs disappearing after reboot is more important for me to solve.
I also have to ask one question: is there any reason you aren't running these scripts locally on the server via cron? I just think it's more efficient. My hosting company does daily tape backups of my webroot and home directory on their server. I have a cron job that dumps MySQL into a backup file and another that archives all files and tosses them into another directory in my home folder on my server.

That way I have backups, they have backups of my backups, and I don't have to worry about maintaining that many backups on my own. Plus, if WAN traffic is down between your client machine and your web hosting server, your scripts won't even run. Most hosting companies also provide built-in tools for this. So, is there a reason you have to do it this way?
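For reference, server-side cron entries for that kind of setup might look like the following; the database name, user, password and paths are invented for illustration:

```shell
# Edited with `crontab -e` on the server.
# Fields: minute hour day-of-month month day-of-week command

# 02:00 daily: dump the database to a file
0 2 * * *   mysqldump -u backupuser -pSECRET mydb > "$HOME/backups/mydb.sql"

# 02:30 daily: archive the webroot (note: % must be escaped as \% in crontab)
30 2 * * *  tar czf "$HOME/backups/webroot-$(date +\%F).tar.gz" /var/www/html
```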
@tlarkin: That's a good point, I might be able to run the script on the server.
Keeping all the backups on the server is not a solution, though, as I'd quickly run out of space. So I do need to download them to my computer at some point. My hosting company does back up my site nightly, but I want to see which files changed when, to make it easier to roll the site back to the state it was in before a specific accident happened.
However, you could have them rsync to your FTP at home on a daily basis; that way everything runs off the server and not your machine. Launchd items will take up CPU cycles.
rsync is not installed on the server. I'm pretty sure I tried using rsync before (which, I think, requires rsync installed on both ends). That's why I ended up using wget instead.

Anyway, I had some hope of learning to use launchd, but it seems it is trickier than I first imagined. My Mac mini doesn't do much during the day except serve torrent files and store backups from my other computers, so I don't really mind spending some CPU time on this backup script.
I don't think I can be bothered to get my host to install rsync and then recreate the script to work on the server. My script is very close to doing what I want; I just need to figure out how to reload my launchd jobs after reboot.
Solution: Place the job files (backup-download.xml and backup-copy.xml) in the folder ~/Library/LaunchAgents
After a reboot, both jobs are loaded in launchd, visible by running this command: Code:
$ launchctl list
@tlarkin: You beat me to it. Your solution enables me to keep all the files in the same place as well, which I like.
There are three places launchd items can load from:

1) /Library/LaunchAgents
2) /Library/LaunchDaemons

If your agent is in #1, it will load when any user logs into the system. If you put it in #2, it will load at boot, globally, without any user logging in. Both of these run globally, either at login or at boot. The third place you can load them from is:

3) ~/Library/LaunchAgents

These are located in the specific user's home folder and run only when that user logs in. If you use launchctl to load your agent with the -w switch, it loads into launchd permanently, meaning it will run every time you reboot or log in.
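Putting that together, loading the per-user agent with -w would look something like this; the file name is taken from earlier in the thread, and the grep pattern assumes the job's label contains "backup":

```shell
# Load the job and clear its Disabled key so it persists across
# logins/reboots (per-user agent, so it runs when this user logs in).
launchctl load -w ~/Library/LaunchAgents/backup-copy.xml

# Confirm it is loaded by filtering the job list:
launchctl list | grep backup
```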
That's great, thank you.
I cannot get the -w option to work, though. The man page says this: Code:
launchctl load -w    Remove the disabled key and write the configuration files back out to disk.
If I place the job files in ~/Library/LaunchAgents, they are reloaded after reboot, though.
Ok, sorry, then I misunderstood.
I thought the -w option meant I didn't have to place them in one of the 3 folders. If I place them in ~/Library/LaunchAgents, I don't have to use the -w option to get them to reload after reboot. On my Mac, my user is logged in automatically on reboot, if that matters.

I ended up using hard links, so I could still keep the xml files in the same folder as the script and backup files. To create a hard link to the file backup-copy.xml (note that ln takes the source file first, then the target directory): Code:
$ ln backup-copy.xml ~/Library/LaunchAgents/
Thanks. Now it works. Relief :-)
http://developer.apple.com/MacOsX/launchd.html

You can accomplish some really powerful solutions with launchd. I have created several post-config, run-once scripts in my images: the second the machine boots up, it runs the script, then the script deletes itself along with the launch item, so it only runs once at first boot and never again. A great way to automate post-image configuration on the Mac platform.