The macosxhints Forums

The macosxhints Forums (http://hintsforums.macworld.com/index.php)
-   UNIX - General (http://hintsforums.macworld.com/forumdisplay.php?f=16)
-   -   Cannot get wget to run in shell script run by launchd (http://hintsforums.macworld.com/showthread.php?t=104537)

bjarte 08-21-2009 04:07 AM

Cannot get wget to run in shell script run by launchd
 
I have a shell script that downloads any updated files on my website to my computer. When run from terminal everything works as expected. The backup folders are created, the website is downloaded and copied to the appropriate folders.

I added the script to launchd, to run once a day.

The script is definitely executed through launchd, because the folders are created as expected. The only thing that doesn't work is the wget command. wget apparently doesn't run at all, because it doesn't write any log files.

Is wget special compared to other commands like mkdir or cp? Do I need to call it in a special way when running the script through launchd?

I would greatly appreciate any hints!

A short version of the script is below (my complete script has a loop for downloading several sites):

Code:

#!/bin/bash
# Assign variables
site="mywebsite.com"
user="username"
pwd="mysecretpassword"
dato=`date "+%Y-%m-%d"`
today=`date "+%d"`
path="/Users/bjarte/Backup/Sites"

# Create log directory if it doesn't exist
mkdir -p "$path/logs"

# Create site's directory if it doesn't exist
mkdir -p "$path/$site"
mkdir -p "$path/$site/0-current"

# Jump to site's directory
cd "$path/$site/0-current"

# Download files changed or added since last backup
wget -N -r --output-file="../../logs/$dato-$site.txt" --user="$user" --password="$pwd" --directory-prefix="$path/$site/0-current" --no-host-directories --preserve-permissions "ftp://$site"

# Copy all files newer than 24 hours to a folder named after today's date
mkdir -p "../$dato"
find . -mtime 0 -type f ! -name ".listing" | cpio -pvdmu "../$dato"

cd "$path/$site"

# Once a month, do a complete backup of the whole site, and zip this backup
if [ "$today" == "21" ]; then
        mkdir -p "$dato-complete"
        cp -r "0-current" "$dato-complete"
        tar -cjf "$dato-complete.tar.bz2" "$dato-complete"
        rm -r "$dato-complete"
fi


SirDice 08-21-2009 10:56 AM

Try using the full path to wget, e.g. /usr/local/bin/wget.

fracai 08-21-2009 10:56 AM

The other commands are included in a Mac OS X installation, but wget is not. You must have installed it with Fink, MacPorts, compilation from source, etc. Try using the full path to the wget binary.

The PATH that you have in a Terminal session is different from the PATH used by launchd jobs. You could alter the PATH environment variable, but an easier way is to tell the script exactly where the command is.

It'd probably be a good idea to use full paths for the other commands as well, but for any that you've custom-installed it's required.
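
For example (a sketch; the MacPorts location /opt/local/bin is an assumption), you could either call the binary by its full path or prepend its directory to PATH at the top of the script:

Code:

#!/bin/bash
# Option 1: call wget by its full path (assumed MacPorts location)
/opt/local/bin/wget --version

# Option 2: prepend its directory to PATH for the rest of the script
export PATH="/opt/local/bin:$PATH"
wget --version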

bjarte 08-21-2009 11:15 AM

Thanks, I'll try this right away. I haven't figured out where wget is located yet, though.

fracai 08-21-2009 11:24 AM

To find where you have wget installed, type:
"which wget"

bjarte 08-21-2009 11:35 AM

I didn't know about "which", very useful. Thanks!

I have an old version of wget located at /usr/local/bin/wget, but the newest one is located at /opt/local/bin/wget

tlarkin 08-21-2009 12:49 PM

wget is not installed; OS X ships with a binary called curl, which is essentially the same thing.

Code:

bash-3.2# whereis wget
bash-3.2# which wget
bash-3.2# whereis curl
/usr/bin/curl
bash-3.2#
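
For a single file over FTP, a roughly equivalent curl call might look like this (a sketch, using the site name from earlier in the thread and a hypothetical file name; note that curl doesn't mirror recursively the way wget -r does, so it replaces wget per file, not per site):

Code:

curl --user username:mysecretpassword -O ftp://mywebsite.com/somefile.html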


bjarte 08-21-2009 03:00 PM

I have wget installed, so that's not the issue:

bjarte-mac:Sites bjarte$ which wget
/opt/local/bin/wget

It might be easier if I used curl, though. Thanks for the tip.

hayne 08-21-2009 09:57 PM

I think (as others have said a few times above) that 'wget' is not in the PATH that is used by 'launchd'. You should (as has been said) use the full path.

bjarte 08-22-2009 02:45 PM

Thanks to everyone who helped me out. The wget issue is solved. At the moment I'm trying to figure out the best way to copy files recursively while keeping the folder structure of the files I'm copying. As soon as I've got that sorted, I'll post the complete backup script.

fracai 08-22-2009 06:46 PM

I'd suggest using rsync if possible. If you have FTP access, I presume you might also have SSH access? If so, and if rsync is available on the other end as well, you can use the two for much quicker and more complete syncing.

rsync -e ssh

Of course, then you need a way to authenticate, and you'll get into setting up SSH keys. But, if you can, it's probably a "better" option.
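
A fuller invocation might look something like this (a sketch with a hypothetical remote path; -a preserves permissions and timestamps, -z compresses in transit, and the trailing slashes copy directory contents rather than the directories themselves):

Code:

rsync -az -e ssh username@mywebsite.com:/path/to/site/ /Users/bjarte/Backup/Sites/mywebsite.com/0-current/

For passwordless logins, you would generate a key pair with ssh-keygen and append the public key to ~/.ssh/authorized_keys on the server.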

tlarkin 08-22-2009 06:54 PM

I do MySQL and web root backups via cron on my Linux web server. I have the scripts run locally, and each one is on a different timer. I also have a cron job that runs the con.php script off my site.

bjarte 08-23-2009 10:10 AM

Summary: The following script will let you download updated files from your websites once a day and create a separate folder for each day's files. In addition, once a month a complete backup of each website will be downloaded, zipped and stored separately. This script downloads 3 separate websites, but you can easily add more.


I cannot use rsync; it's not installed on the servers, and my webhost will not install it.

I finally figured out a way to download the new files from my websites using wget, then copy only the files changed in the last 24 hours to a new folder. It's important for me to see which files are changed each day. That way, as soon as I detect an error, I can find out when it happened and restore only the files with errors in them.

I initially had one long script that did both the fetching (wget) and the copying (cp), but when I ran the script through launchd, the script would not wait for wget to finish downloading, and the copy commands would be executed with nothing to copy.

I solved this by making two separate scripts, both run through launchd with a three-hour pause between them, so I could be certain wget had finished downloading before the copying started.

Below are the two scripts and the launchd job files.

You have to edit the variables to suit your computer (for example, where you keep your shell scripts, where you store your backups, and your websites' URLs, user names and passwords).

Add the jobs to launchd by using this command:
Code:

launchctl load backup-download.xml
launchctl load backup-copy.xml

Make your shell scripts executable using this command:
Code:

chmod 755 backup-download.sh
chmod 755 backup-copy.sh

First shell script (downloading websites)

Copy this text and save as backup-download.sh:

Code:

#!/bin/bash
clear
echo "### Welcome to the backup script - part 1 (downloading) ###"

# Change these variables to suit your website
site=( "website1.com" "website2.com" "website3.com" )
user=( "username1" "username2" "username3" )
pwd=( "password1" "password2" "password3" )
dato=`date "+%Y-%m-%d"`
today=`date "+%d"`
# Change this to the path where you want to store your backups
path="/Users/bjarte/Backup/Sites"

# Create log directory if it doesn't exist
mkdir -p "$path/logs"

# Run through the 3 sites in our array
# If you have fewer or more websites, change the numbers
# For example, for 6 websites, the line would be
# for i in 0 1 2 3 4 5; do
for i in 0 1 2; do

        # Create site's directory if it doesn't exist
        mkdir -p "$path/${site[$i]}"
        mkdir -p "$path/${site[$i]}/0-current"
       
        # Jump to site's directory
        cd "$path/${site[$i]}/0-current/"

        # Download files changed or added since last backup
        echo
        echo "Backing up site ${site[$i]}"
        echo
        echo "From: ftp://${site[$i]}"
        echo "To:   ${site[$i]}/0-current"
        /opt/local/bin/wget -N -r --output-file="$path/logs/$dato-${site[$i]}.txt" --user="${user[$i]}" --password="${pwd[$i]}" --directory-prefix="$path/${site[$i]}/0-current" --no-host-directories --preserve-permissions "ftp://${site[$i]}"

# End of for loop       
done

echo
echo "Loop complete, all websites downloaded"
echo

Second shell script (copying websites)

Copy this text and save as backup-copy.sh:

Code:

#!/bin/bash
clear
echo "### Welcome to the backup script - part 2 (copying) ###"

# Change these variables to suit your website
site=( "website1.com" "website2.com" "website3.com" )
dato=`date "+%Y-%m-%d"`
today=`date "+%d"`
# Change this to the path where you want to store your backups
path="/Users/bjarte/Backup/Sites"

# Run through the 3 sites in our array
# If you have fewer or more websites, change the numbers
# For example, for 6 websites, the line would be
# for i in 0 1 2 3 4 5; do
for i in 0 1 2; do

        # Jump to site's directory
        cd "$path/${site[$i]}/0-current/"

        # Copy all files newer than 24 hours to a folder called today's date
        echo
        echo "Backing up all today's files"
        mkdir -p "../$dato"
        find . -mtime 0 -type f ! -name ".listing" | cpio -pdmu "../$dato"
        echo "Finished backing up all today's files"

        # Once a month, do a complete backup of the whole site
        if [ "$today" == "23" ]; then
                echo
                echo "Backing up all files (monthly)"
                cd ..
                cp -r "0-current" "$dato-complete"
                tar --create --bzip2 --file="$dato-complete.tar.bz2" "$dato-complete"
                rm -r "$dato-complete"
                echo "Finished backing up all files"
                echo
        fi

        echo "Backup of ${site[$i]} finished"

# End of for loop       
done

echo
echo "Loop complete, all backups copied"
echo

First launchd job (downloading websites)

Copy this text and save as backup-download.xml:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict> 
        <key>Label</key>
        <string>com.bjarte.backup.websites.download</string>
        <key>Program</key>
        <string>/Users/bjarte/Backup/Sites/backup-download.sh</string>
        <key>ProgramArguments</key>
        <array>
                <string>backup-download.sh</string>
                <string>daily</string>
        </array>
        <key>Nice</key>
        <integer>1</integer>
        <key>StartCalendarInterval</key>
        <dict> 
                <key>Hour</key>
                <integer>12</integer>
                <key>Minute</key>
                <integer>0</integer>
        </dict>
</dict>
</plist>

Second launchd job (copying websites)

Copy this text and save as backup-copy.xml:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict> 
        <key>Label</key>
        <string>com.bjarte.backup.websites.copy</string>
        <key>Program</key>
        <string>/Users/bjarte/Backup/Sites/backup-copy.sh</string>
        <key>ProgramArguments</key>
        <array>
                <string>backup-copy.sh</string>
                <string>daily</string>
        </array>
        <key>Nice</key>
        <integer>1</integer>
        <key>StartCalendarInterval</key>
        <dict> 
                <key>Hour</key>
                <integer>15</integer>
                <key>Minute</key>
                <integer>00</integer>
        </dict>
</dict>
</plist>

I hope this is useful for someone, and please reply if you see any errors or any smarter ways to do this.

Launchd and shell scripting on OS X are new to me; I'm used to Linux and cron, and there are most likely better ways to do this.

Thanks to everyone who responded to my post.

baf 08-23-2009 10:32 AM

Nicer ways to enumerate the sites
Code:

for ((i=0; i<=2; i++)); do echo $i; done

That gives 0 1 2, if you want it as numbers.

Or even nicer:
Code:

sites=("first" "next")
for site in ${sites[*]}; do
echo $site
done


bjarte 08-23-2009 02:22 PM

@baf: I didn't know about this way to write a for-loop in bash, that's clever:
Code:

for ((i=0; i<=2; i++)); do echo $i; done
The other loop is even shorter, but I can't immediately see a way to use it when I've got 3 arrays. Is it possible to use two-dimensional arrays in a shell script?

Something like this:

Code:

sites=( ("website1" "user1" "password1") ("website2" "user2" "password2") )
for site in ${sites[*]}; do
echo ${site[0]} # prints website
echo ${site[1]} # prints user
echo ${site[2]} # prints password
done


baf 08-23-2009 02:27 PM

Ahhh, I only looked at the second script, where you only had one array.
So use version 1 in the first script.
But now you know that version too, for the next time you need it :D

bjarte 08-23-2009 02:46 PM

Maybe I should have written this whole thing in Perl. As far as I can see from a quick Google search, you cannot use multidimensional arrays in bash scripts.
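
A common workaround is to pack the fields for each site into one string and split on a delimiter inside the loop; a minimal sketch (the pipe delimiter is an arbitrary choice, and it must not appear in any password):

Code:

sites=( "website1.com|username1|password1" "website2.com|username2|password2" )
for entry in "${sites[@]}"; do
        # split each entry on "|" into three variables
        IFS='|' read -r site user pass <<< "$entry"
        echo "$site $user $pass"
done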

Anyway, my script works, so my biggest question at the moment is why the script, when run through launchd, doesn't wait for wget to finish before it continues with the rest of the commands. Is there some kind of wait command I can use?

hayne 08-23-2009 05:21 PM

Quote:

Originally Posted by bjarte (Post 548350)
my biggest question at the moment is why the script, when run through launchd, doesn't wait for wget to finish before it continues with the rest of the commands

Please tell us what you see that makes you think that this is happening.

bjarte 08-24-2009 11:56 AM

If I remember correctly, this happens:

When I run the script from a terminal, this happens (the desired effect):
- wget downloads files (I can see the output folder filling up)
- when no more files are being downloaded (wget is finished), the backup directory is created
- some files are copied from the output directory to the backup directory

When I run the script using launchd, this happens (not the desired effect):
- wget starts downloading files (I can see the output folder filling up)
- immediately after wget starts, the backup directory is created
- no files are copied, presumably because no files have been downloaded yet
- wget finishes downloading


I have another problem: the launchd job disappears after a reboot. How do I make it stay loaded?

hayne 08-24-2009 12:26 PM

Quote:

Originally Posted by bjarte (Post 548439)
When I run the script using launchd, this happens (not the desired effect):
- wget starts downloading files (I can see the output folder filling up)
- immediately after wget starts, the backup directory is created
- no files are copied, presumably because no files have been downloaded yet
- wget finishes downloading

I'm not sure if I'm following, but from the above posts it looks like you have two scripts, each of which is run via its own launchd job. The launchd jobs run in parallel. If you want things to run in sequence, make it one script instead of two (a sketch follows).
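
For instance, a single wrapper run by one launchd job guarantees the ordering, because the shell waits for each command to exit before starting the next (a minimal sketch, reusing the script names from this thread):

Code:

#!/bin/bash
# run the download first; only start the copy if the download succeeded
/Users/bjarte/Backup/Sites/backup-download.sh && /Users/bjarte/Backup/Sites/backup-copy.sh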

bjarte 08-24-2009 12:34 PM

Sorry about the confusion: the reason I created two separate scripts was to force the copying to happen after the wget download had finished.

In my original script, I had both the downloading (wget) and the copying (cp) in the same script, and the events described above happened when I ran this script first in a terminal and then as a launchd job.

Please reply if it is still unclear.

As you understand, I solved my problem by creating two separate scripts, but I find it very odd that this happens.

At the moment, the problem with disappearing jobs after reboot is more important for me to solve.

tlarkin 08-24-2009 12:35 PM

I also have to ask one question. Is there any reason you aren't running these scripts locally on the server via cron? I just think it is more efficient. My hosting company does daily tape backups of my webroot and home directory on their server. I have a cron job that dumps MySQL into a backup file and another one that archives all files and tosses them into another directory in my home folder on my server.
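
Such a dump job might look something like this in a crontab (a sketch; the database name and credentials are hypothetical, and % characters have to be escaped in crontab entries):

Code:

# dump the database every night at 03:00 into a date-stamped file
0 3 * * * mysqldump -u backupuser -pSECRET mydb > "$HOME/backups/mydb-$(date +\%F).sql"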

That way I have backups, and they have backups of my backups, so I don't have to worry about maintaining that many backups on my own. Plus, if WAN traffic is down somehow between your client machine and your web hosting server, your scripts won't even run.

Most hosting companies also provide built-in tools for this.

So, is there a reason you have to do it this way?

bjarte 08-24-2009 12:50 PM

@tlarkin: That's a good point, I might be able to run the script on the server.

Keeping all the backups on the server is not a solution, though, as I'd quickly run out of space. So I do need to download them to my computer at some point.

My hosting company does back up my site nightly, but I want to see which files changed when, to make it easier to roll back the site to the state it was in before a specific accident happened.

tlarkin 08-24-2009 01:27 PM

Quote:

Originally Posted by bjarte (Post 548449)
@tlarkin: That's a good point, I might be able to run the script on the server.

Keeping all the backups on the server is not a solution, though, as I'd quickly run out of space. So I do need to download them to my computer at some point.

My hosting company does back up my site nightly, but I want to see which files changed when, to make it easier to roll back the site to the state it was in before a specific accident happened.

I understand. You can have them save date-stamped files, then set cron to keep only 30 days' worth of backups: when day 31 hits, it erases the day 1 backup; when backup 32 hits, backup 2 is erased; and so forth.
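
A rotation like that can be scripted with a nightly find (a sketch, assuming the dumps live in ~/backups):

Code:

# delete date-stamped backups older than 30 days
find "$HOME/backups" -name "*.sql" -mtime +30 -delete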

However, you could have them rsync to your FTP at home on a daily basis; that way everything is run off the server and not your machine. Launchd items will take up CPU cycles.

bjarte 08-24-2009 01:34 PM

rsync is not installed on the server. I'm pretty sure I tried using rsync before (which I think requires rsync installed on both host and recipient). That's why I ended up using wget instead.

Anyway, I had some hope of learning to use launchd, but it seems it is trickier than I first imagined. My Mac mini doesn't do much during the day except serve torrent files and store backups from my other computers, so I don't really mind spending some CPU time on this backup script.

tlarkin 08-24-2009 01:48 PM

Quote:

Originally Posted by bjarte (Post 548461)
rsync is not installed on the server. I'm pretty sure I tried using rsync before (which I think requires rsync installed on both host and recipient). That's why I ended up using wget instead.

Anyway, I had some hope of learning to use launchd, but it seems it is trickier than I first imagined. My Mac mini doesn't do much during the day except serve torrent files and store backups from my other computers, so I don't really mind spending some CPU time on this backup script.

Not sure about your host, but if you aren't the admin of it, you can request that it be installed and aliased in your bash profile. I requested they install cron for me and they did.

bjarte 08-24-2009 01:56 PM

I don't think I can be bothered to get my host to install rsync and then recreate the script to work on the server. My script is very close to doing what I want it to do; I just need to figure out how to reload my launchd jobs after a reboot.

tlarkin 08-24-2009 02:17 PM

Quote:

Originally Posted by bjarte (Post 548463)
I don't think I can be bothered to get my host to install rsync and then recreate the script to work on the server. My script is very close to doing what I want it to do; I just need to figure out how to reload my launchd jobs after a reboot.

That is easy: launchctl load -w /path/to/agent. The -w switch makes it load permanently.
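
For example, with the job files from this thread (assuming they have been placed in ~/Library/LaunchAgents, as discussed in the following posts):

Code:

launchctl load -w ~/Library/LaunchAgents/backup-download.xml
launchctl load -w ~/Library/LaunchAgents/backup-copy.xml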

bjarte 08-24-2009 02:18 PM

Solution: Place the job files (backup-download.xml and backup-copy.xml) in the folder ~/Library/LaunchAgents

After a reboot, both jobs are loaded in launchd, visible by running this command:
Code:

$ launchctl list

bjarte 08-24-2009 02:23 PM

@tlarkin: You beat me to it. Your solution enables me to keep all the files in the same place as well, which I like.

tlarkin 08-24-2009 02:26 PM

Quote:

Originally Posted by bjarte (Post 548468)
Solution: Place the job files (backup-download.xml and backup-copy.xml) in the folder ~/Library/LaunchAgents

After a reboot, both jobs are loaded in launchd, visible by running this command:
Code:

$ launchctl list

launchd items run on several levels from several locations. The first two are:

1) /Library/LaunchAgents

2) /Library/LaunchDaemons

If your agent is loaded from #1 it will load when any user logs into the system. If you put it in #2 it will load at boot, globally, without any user logging in. Both of these run globally, either at login or at boot.

The other place you can load them from is:

~/Library/LaunchAgents

These are located in the specific user's home folder and they run only when that user logs in.

If you use launchctl to load your agent with the -w switch, it loads it into launchd permanently, meaning it will run every time you reboot or log in.

bjarte 08-24-2009 02:41 PM

That's great, thank you.

I cannot get the -w option to work, though. The man page says this:
Code:

launchctl load -w    Remove the disabled key and write the configuration files back out to disk.
I don't understand what that means, but as far as I can tell, the jobs are not reloaded after reboot.

If I place the job files in ~/Library/LaunchAgents, they are reloaded after reboot, though.

tlarkin 08-24-2009 02:57 PM

Quote:

Originally Posted by bjarte (Post 548474)
That's great, thank you.

I cannot get the -w option to work, though. The man page says this:
Code:

launchctl load -w    Remove the disabled key and write the configuration files back out to disk.
I don't understand what that means, but as far as I can tell, the jobs are not reloaded after reboot.

If I place the job files in ~/Library/LaunchAgents, they are reloaded after reboot, though.

Where are you placing the launchd items? They have to be placed in one of the three directories I listed in my post above.

bjarte 08-24-2009 03:05 PM

Ok, sorry, then I misunderstood.

I thought the -w option meant I didn't have to place them in one of the 3 folders. If I place them in ~/Library/LaunchAgents I don't have to use the -w option to get them to reload after reboot. On my Mac, my user is logged in automatically on reboot, if that matters.

I ended up using hard links, so I could still keep the xml files in the same folder as the script and backup files.

To create a hard link for the file backup-copy.xml:
Code:

$ ln backup-copy.xml ~/Library/LaunchAgents/

tlarkin 08-24-2009 03:22 PM

Quote:

Originally Posted by bjarte (Post 548480)
Ok, sorry, then I misunderstood.

I thought the -w option meant I didn't have to place them in one of the 3 folders. If I place them in ~/Library/LaunchAgents I don't have to use the -w option to get them to reload after reboot. On my Mac, my user is logged in automatically on reboot, if that matters.

I ended up using hard links, so I could still keep the xml files in the same folder as the script and backup files.

To create a hard link for the file backup-copy.xml:
Code:

$ ln backup-copy.xml ~/Library/LaunchAgents/

Yeah, all the -w switch does is tell launchd to keep that agent enabled wherever it is loaded from. For example, you would place all your login hooks in /Library/LaunchAgents or ~/Library/LaunchAgents, since launchd runs them at login.

bjarte 08-24-2009 03:26 PM

Thanks. Now it works. Relief :-)

tlarkin 08-24-2009 03:35 PM

Quote:

Originally Posted by bjarte (Post 548483)
Thanks. Now it works. Relief :-)

Well, if you want the exciting official Apple white paper on launchd, here you go. I have read over it a few times, and it never gets that interesting, but it does make you understand how launch agents work:

http://developer.apple.com/MacOsX/launchd.html

You can accomplish some really powerful solutions with launchd. I have created several post-config, run-once scripts in my images: the second the machine boots up, it runs the script, and the script then deletes itself along with the launch item, so it only runs once at first boot and never again. It's a great way to automate post-image configuration on the Mac platform.
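
A minimal sketch of that run-once pattern (the label and paths here are hypothetical):

Code:

#!/bin/bash
# firstboot.sh - one-time post-image configuration

# ... do the post-configuration work here ...

# then remove the launch daemon and this script so neither ever runs again
rm /Library/LaunchDaemons/com.example.firstboot.plist
rm "$0"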

