The macosxhints Forums

The macosxhints Forums (http://hintsforums.macworld.com/index.php)
-   UNIX - General (http://hintsforums.macworld.com/forumdisplay.php?f=16)
-   -   tar: backing up a 240GB takes too long (http://hintsforums.macworld.com/showthread.php?t=81298)

cocotu 11-21-2007 02:50 PM

let me illustrate better:

We have a big directory (240GB) called proj. Within /proj there are around 20 directories:

/proj
    /dir1
    /dir2
    /dir3
    and so on.....

When I use tar I get a single proj.tar.gz (180GB) file on the destination external HD.

When I use iBackup I get:

/dir1.zip
/dir2.zip
/dir3.zip
and so on......

This makes it easier to retrieve the information.
Thanks fracai!!

baf 11-21-2007 03:07 PM

One thing I don't think you have mentioned: is this a dual-core/processor machine?
If so, it could (no guarantee) be faster to run it as several jobs. If so, how many cores?
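If it does turn out to be multi-core, running two compression jobs at a time might look something like the sketch below. The proj_par/backup_par names are just a throwaway demo layout so the sketch runs anywhere; on the real server you'd glob the actual proj directory instead.

```shell
#!/bin/sh
# Sketch: compress directories two at a time, for a dual-core machine.
# A tiny throwaway tree is created first so the sketch runs anywhere;
# substitute the real proj/ and destination paths on the server.
mkdir -p proj_par/dir1 proj_par/dir2 proj_par/dir3 backup_par
echo one > proj_par/dir1/f.txt
echo two > proj_par/dir2/f.txt
echo three > proj_par/dir3/f.txt

set -- proj_par/*/
while [ $# -gt 0 ]; do
    NAME1=$(basename "$1")
    tar -czf "backup_par/${NAME1}.tar.gz" "$1" &
    if [ $# -ge 2 ]; then
        NAME2=$(basename "$2")
        tar -czf "backup_par/${NAME2}.tar.gz" "$2" &
        wait        # block until both background jobs finish
        shift 2
    else
        wait
        shift
    fi
done
ls backup_par
```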

fracai 11-21-2007 03:16 PM

Code:

#!/bin/sh

# compress each directory found in proj/ into its own zip at backup/proj/
mkdir -p backup/proj
for DIR in proj/*/
do
    echo "${DIR}"
    BASE=$(basename "${DIR}")
    zip -r9 backup/proj/"${BASE}".zip "${DIR}"
done

That will compress each directory found in "proj/" and put the compressed files at "backup/proj/"

The way the script is written now you'd have to execute the script from within the directory above "proj". You should probably modify the script to use absolute paths for both the source and destination values.

cocotu 11-26-2007 03:33 PM

I wasn't able to find how many cores. This is the info I have:

Mac OS X Server 10.4.6
2 GHz PowerPC Xserve G5
2 GB DDR SDRAM
Machine Model: RackMac3,1
Machine Name: Xserve G5
CPU Type: PowerPC G5 (3.0)
Number of CPUs: 1
CPU Speed: 2 GHz
L2 Cache (per CPU): 512 KB
Memory: 2 GB
Bus Speed: 1 GHz
Boot ROM Version: 5.17f2

I have another concern. I know I'm able to view the contents of the tar.gz file, but it takes forever. Is there an application like WinRAR that would enable me to view the contents of this tar.gz file? I just want to verify the folders inside proj/.
Thanks..

tlarkin 11-26-2007 04:03 PM

Quote:

Originally Posted by fracai (Post 425790)
"-a" has nothing to do with compression. The archive option is a substitute for "-rlptgoD". In other words it preserves links, permissions, time, owner and group IDs, devices, and is recursive. Note that "-r" is therefore redundant. The "-u" simply tells rsync to update files instead of overwriting newer files at the destination.

Hmm, okay, thanks for clearing that up. I was under the impression that -E did all that by preserving all the extended attributes... and you are right.

There is also a -z option, which does in fact say it compresses data. I am not sure how it compresses the data, though. You may want to look into it.

baf 11-26-2007 04:08 PM

To find the core info, try (in Terminal):

Code:

system_profiler SPHardwareDataType
And you'll get back something like:

Hardware:

Hardware Overview:

Model Name: MacBook
Model Identifier: MacBook2,1
Processor Name: Intel Core 2 Duo
Processor Speed: 2.16 GHz
Number Of Processors: 1
Total Number Of Cores: 2
L2 Cache (per processor): 4 MB
Memory: 2 GB
Bus Speed: 667 MHz
Boot ROM Version: MB21.00A5.B07
SMC Version: 1.17f0
Serial Number: W87222FBYA8
Sudden Motion Sensor:
State: Enabled


And if it doesn't say anything about cores, then you have a single-core system.

acme.mail.order 11-26-2007 07:44 PM

Quote:

Originally Posted by tlarkin (Post 428763)
There is also an option for -z which does in fact say compress data. I am not sure how it compresses data though.

"tar -czf tarball.tgz files" is basically the same as "tar -cf - files | gzip > tarball.tar.gz"

Quote:

Originally Posted by cocotu (Post 428755)

I have another concern. I know I'm able to view the contents of the tar.gz file, but this takes forever. Is there an application like Winrar that would enable me to view the contents of this tar.gz file? I just want to verify the folders inside the proj/.
Thanks..

This is the monster 200GB tarfile? tar stores its index throughout the file, so it's going to take a while to scan through that much data, uncompressing internally as it goes. Be happy you're not dumping directly to tape, the original use for tar (Tape ARchiver).
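You can at least list the table of contents from Terminal with tar's -t flag, something like the demo below. A tiny archive is built first so the commands run anywhere; on the server you'd point tar at the real proj.tar.gz, and expect the scan to take a long time.

```shell
#!/bin/sh
# Demo: list a gzipped tar without extracting it.  A tiny
# proj_list.tar.gz is built first so the commands run anywhere.
mkdir -p proj_list/dir1 proj_list/dir2
echo sample > proj_list/dir1/file.txt
tar -czf proj_list.tar.gz proj_list

# -t lists the table of contents, -z handles the gzip layer
tar -tzf proj_list.tar.gz

# just the top-level folders, to verify what's inside
tar -tzf proj_list.tar.gz | cut -d/ -f1-2 | sort -u
```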

cocotu 11-28-2007 11:50 AM

I ran the command:
system_profiler SPHardwareDataType
and got:

Hardware Overview:

Machine Name: Xserve G5
Machine Model: RackMac3,1
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 1
CPU Speed: 2 GHz
L2 Cache (per CPU): 512 KB
Memory: 2 GB
Bus Speed: 1 GHz
Boot ROM Version: 5.1.7f2
Serial Number: QP53901KSLX

So it must be a single-core system. acme.mail.order, I think tlarkin is referring to -z with rsync, NOT tar. Thanks for all the help!

fracai 11-28-2007 02:43 PM

Quote:

Originally Posted by tlarkin (Post 428763)
Hmm, okay thanks for clearing that up. I was under the impression that -E did all that by preserving all the extended attributes... and you are right.

There is also an option for -z which does in fact say compress data. I am not sure how it compresses data though. You may want to look into it

Quote:

Originally Posted by man rsync
-z, --compress compress file data during the transfer

Note the phrase "during the transfer": rsync compresses the data while transferring it and uncompresses it at the destination. This is useful if your computer and the destination computer are fast, your data is easily compressible, and the network is slow: you get an overall speed boost by compressing the data before it goes over the wire, transferring the smaller amount of data, and decompressing on the other side.

rsync will not create a compressed archive for you, however nice that feature would be.

cocotu 11-29-2007 03:34 PM

can we pipe it to tar?

rsync <whateverfile> | tar <whateverfile>

I don't think it would work, because the destination HD is smaller than the source. It would be great if rsync could do such a thing! Can cpio do this? iBackup uses cpio and it zips all the directories as I mentioned before. Thanks!

fracai 11-30-2007 12:40 PM

rsync doesn't pipe its output so this wouldn't work.

cpio, zip, or even tar can pipe their output, but the main benefit of rsync for making backups smaller is its ability to reference a backup that is already in place and only back up changed data (i.e. the backup is incremental).
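For instance, piping an archiver's output into a compressor looks like this (a throwaway demo dir is created so it runs anywhere; cpio is shown since iBackup uses it, and tar does the same thing with -cf -):

```shell
#!/bin/sh
# Demo: cpio writes its archive to stdout, so it can be piped straight
# into gzip -- the same pattern as "tar -cf - files | gzip".
mkdir -p src_cp
echo hello > src_cp/file.txt
find src_cp -depth | cpio -o 2>/dev/null | gzip > demo.cpio.gz
gzip -t demo.cpio.gz && echo "archive OK"
```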

the bottom line is:
compressing large media files is slow and typically not very beneficial.
the only way to fit a large backup in a small space is compression.
incremental backups are faster than full backups.
large drives are cheap.

The comment about the number of cores is correct: running parallel jobs could be faster, but with the media files you're compressing I'm not sure how much benefit you'd see. You might just be limited by disk speed at that point. You can check your processor load by opening "Activity Monitor" while running your backup script. If you have more than one core or CPU, you'll see more than one graph charting your CPU usage. If your usage is already maxed out, parallel jobs won't provide a benefit. If each chart is around 50%, you could look into running parallel jobs in your script.

