Go Back   The macosxhints Forums > OS X Help Requests > Applications



Old 12-16-2004, 05:32 PM   #1
shomann
Prospect
 
Join Date: Dec 2004
Posts: 6
10.1 Backup Query: Or How I Learned to Stop Worrying and Love Open-Source...

First, let me take a moment to say hello. This is my first post on MacOSXHints despite several years of lurking and gleaning information from the site.

Currently I am faced with a revamp of our office-wide backup and archiving system. We are a small video and post-production facility, but because we work with uncompressed video (in both SD and HD) and high-res graphics and animation projects, our backup needs are extreme to say the least (uncompressed video runs about 1.2 GB/min).

Since the mid-Nineties, we have used various "flavors" of Retrospect in server/client configurations to provide a level of security in our daily operations. It started with a PowerMac 8550 utilizing DDS2 DAT tape. This was sufficient for project info, but not for the actual media. About two years ago, we upgraded that system to a near-line backup concept in the form of a 1.4TB RAID-5 Wintel server, again running Retrospect. The reason for the bump-up was to include the media files that were being left behind and to simplify the growing confusion of off-line DAT storage.

This system, while well planned and conceived (if I do say so myself), only worked about 60% of the time. Its biggest fault lay in Retrospect itself, which was wildly inconsistent in speed from night to night and system to system. While I made several attempts to correct the backup, either through scripts or by eliminating certain less-critical elements from the backup completely, nothing seemed to work properly.

A few weeks ago I started looking at reversing the backup flow. To be less cryptic, I decided to see what would happen if I turned the server into nothing more than a huge NAS and let each Mac back itself up to that. This will mean a bit more shuffling scheduling-wise, but I have high hopes that it will be more efficient in terms of speed and time spent.

Along those lines, I started looking at software that would:

(A) Be able to back up to a network server
(B) Support incremental backup with versioning (a tough one)
(C) Be reasonably easy to implement
(D) Have a long enough lifespan that I can be assured it will be supported for at least a few years

On the face of it, rsyncX looked like a solid contender but doesn't support versioning (unless I am missing something). Synchronize Pro does all of the above, but its trial version is useless for any meaningful speed tests (transfers are limited to 10 MB). So far, however, it looks like the only software that does it all.

The other side of this problem is the server. It is an Intel-based server, meaning no Mac OS. Windows is still an option, but I am currently running Fedora Core 2 Linux on it, which is wonderful except that it doesn't support HFS and will therefore be unable to support long file names (though resource forks appeared to remain intact on an ext3 partition; can anyone confirm?).

So to summarize, I am looking for any corrections to my thinking, specific or global. I am also looking for anyone who has used a Linux box for storage on their Mac network. Thanks in advance!
Old 12-16-2004, 09:26 PM   #2
trevor
Moderator
 
Join Date: Jun 2003
Location: Boulder, CO USA
Posts: 19,854
I don't have any good feedback about your backup/archiving question, but...

Quote:
The other side of this problem is the server. It is an Intel-based server, meaning no Mac OS. Windows is still an option, but I am currently running Fedora Core 2 Linux on it, which is wonderful except that it doesn't support HFS and will therefore be unable to support long file names (though resource forks appeared to remain intact on an ext3 partition; can anyone confirm?).

Resource forks will appear on ext3 drives as separate files with a ._ prefix. Browsing from a Mac will show only a single file, but browsing from Linux will show both a file and a ._file.

You may want to investigate Gentoo Linux (http://www.gentoo.org/) rather than Fedora. Gentoo allows you to compile HFS support into the kernel if you wish. (Several other Mac-friendly features are available as well; Gentoo is very customizable, which adds to install time but makes for a nice environment.) It is a rather difficult version of Linux to install initially (budget at least a day, if not two, for your first Gentoo install, and expect it to be compiling overnight), but once installed it is an absolute dream to manage and update: my favorite general-purpose flavor of Linux.

Trevor
Old 12-16-2004, 10:15 PM   #3
acme.mail.order
League Commissioner
 
Join Date: Sep 2003
Location: Tokyo
Posts: 6,334
What, exactly, are you backing up for? (no, it's not a duhhhh..... obvious question)

For a video shop, it seems to me that your data is relatively transient (no long-term storage needs), but uptime is critical.

Do the video files NEED to be backed up daily? My experience has been that the original files are not modified in the editing process until the project is finally rendered, so we don't need to back them up at all. Make the master, copy to the work machine if the network is too slow for live work, and that's it. Just back up the work files.

If we assume that a building fire would effectively kill the current projects, then we don't need to be too concerned with off-site transport (which would probably be tape, yes?).

If physical drive failure is the biggest concern, I'd put the main video files on a fast network RAID stripe set with parity and not worry about it on a daily basis. If a drive dies, probably no one but you will notice.

If you keep the work/edit files (I assume much smaller than the video files) in their own folder on each Mac, then versioning can be done quickly and easily with the `hdiutil` command. Make a disk image of the files, saved somewhere else.

Your needs sound unique enough that a packaged solution will not be quite right. Fortunately unix has enough tools that making your own backup solution is not all that difficult.

Got the budget for an XServe Raid?
Old 12-16-2004, 11:05 PM   #4
voldenuit
League Commissioner
 
Join Date: Sep 2003
Location: Old Europe
Posts: 5,146
You are working on an interesting problem...

First of all, for network speed, get a working Gigabit, jumbo-frames-enabled network going; that will run at speeds slower hard disks have trouble sustaining. If you have a lot of machines, you could even put a second Ethernet card into the server and do channel bonding.
That's not trivial; get back to us if you are stuck at some point.
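On a Fedora-era box, the two network tweaks above might look roughly like this. The interface names, addresses, and the 9000-byte MTU are assumptions, the switch must support both features, and these commands need root, so treat this as a sketch rather than a recipe.

```shell
# Jumbo frames on a single NIC (switch must allow large frames too):
ifconfig eth0 mtu 9000

# Channel bonding of two NICs (needs the bonding module and ifenslave):
modprobe bonding mode=0          # mode 0 = round-robin load balancing
ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
ifenslave bond0 eth0 eth1
```

Distributions differ on where to make this permanent (modules.conf and the ifcfg scripts on Red Hat-style systems), so check your distro's docs.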

Next, Linux, whatever distro you prefer, is great, and netatalk 2 is a lot better than version 1.x if you know what you are doing. No need for an XServe, and forget about that other OS that shall remain unnamed.

I would recommend an approach where the server polls the clients; that will give you better control of the bandwidth usage during the night shift, and it will be easier to maintain.

You could very well have a cronned script on the server that kicks off rsyncX-processes on the remote machines via ssh.
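That cron-plus-ssh pull might look something like the sketch below. The host addresses, backup user, and paths are all invented, and it assumes passwordless ssh keys are set up; the script is only written out and syntax-checked here, since actually running it needs real clients.

```shell
#!/bin/sh
# Write out a server-side pull script and syntax-check it. Hosts, user
# name, and paths are hypothetical placeholders.
PULL=$(mktemp)
cat > "$PULL" <<'EOF'
#!/bin/sh
DEST=/srv/backups
DATE=`date "+%m-%d-%y"`
for HOST in 192.168.1.101 192.168.1.102; do
    # Pull one client at a time so each transfer gets the full pipe.
    rsync -a -e ssh "backup@$HOST:Movies/" "$DEST/$HOST/$DATE/"
done
EOF
sh -n "$PULL" && echo "syntax OK"
```

Run from cron on the server, this keeps all scheduling in one place instead of on a dozen Macs.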

As for versioning, did you consider putting it in the workflow rather than having the backup program take care of it?

That would give the users control, which can be both good and bad, depending on how smart they are.
Old 12-20-2004, 10:48 AM   #5
shomann
Prospect
 
Join Date: Dec 2004
Posts: 6
Excellent dialog we have going here

Our management system for projects is fairly complex as it is. Archiving of the projects is done at project end, as the needs of each project are wildly divergent from the next.

The backup system is more of a "save your a$$" measure than anything. As acme suggested, a building fire would destroy thousands of source tapes, so offsite backup would only be appropriate for the small amount of financial records we keep, as everything else would be wiped out (damn, that makes me sick just thinking about it).

As far as the video goes, we DO NOT back up our source footage that is timecoded, only the work files (i.e. After Effects, FCP, etc.) and their rendered output. This saves a lot of work for the backup system, not to mention space. However, the rendered animation and graphic files can amount to a 20GB+ change per day. Those need to be rolled into the backup to make it a simple process to restore a corrupted (or, more often, accidentally deleted) file.

The system is already on a GigE network. The server and the main "heavy-lifting" workstations are all running GigE. The server is still configured as a RAID-5 across (8) 200GB hard drives (yeah, PATA, but so far nothing but smooth sailing). I have already been able to make the server see and be seen on the AppleTalk network, with read/write speeds comparable to a normal Mac. The only missing piece is the HFS support, which is what I am still working on (the resource forks aren't the problem; even on my test transfer I hit a long-file-name error). The Gentoo suggestion is something I will look into, and I seem to remember an option under Fedora to recompile with HFS support. I know this can work, as YellowDog is basically a Red Hat/Fedora-based distro and it supports HFS.

I am not afraid to jump to the onboard Unix tools if I have to, but I do feel more comfortable in a GUI environment (you should have seen me last year when I set up a MythTV box at home).
Old 12-21-2004, 06:48 PM   #6
acme.mail.order
League Commissioner
 
Join Date: Sep 2003
Location: Tokyo
Posts: 6,334
So, if I understand this correctly, you need to back up and version 20GB a day, but network and storage are not an issue.

BUT you need to handle HFS data on a non-HFS system.

I would recommend Disk Images.
Plus:
  • HFS storage.
  • Fully 8-bit clean, with no exposed forks (travels well); will live happily on anything.
  • Mac-user compatible.
  • Everything in one easy-to-name, unmodifiable (optional) package.
  • Can be easily created in the background from a source folder.
  • Everything you need is included in OS X 10.2 and later.
Minus:
  • Can't be opened on anything but a Mac without a LOT of work.
  • You might need to rearrange where your work files are stored locally so a single folder contains everything.
  • You have to dive into the shell for the automation to work, BUT there is a GUI tool available (Disk Utility) for manual work.
I don't have a fraction of your backup needs, but I have our main databases backing themselves up at each morning boot. The added time is not noticeable.

Scripts to do the local backups will be about 3-5 lines long - not complicated. And it's easy to make an installer package so no shell work is required for installation.

If this is what you want I'll be happy to post my backup scripts.
Old 12-23-2004, 09:49 PM   #7
shomann
Prospect
 
Join Date: Dec 2004
Posts: 6
I was thinking disk images, actually; I just didn't know how to get there. So you are telling me that I can have incremental and versioned backups via disk images and a bit of script automation? An example I would like to follow:

The Edit 1 computer gets its work files backed up every day at 10pm.
Day 1 sees all of the work files being added to the server (via disk images).
Day 2 sees only the files that have changed since Day 1 being added to the server. Files on the server do not get overwritten in the process.
And so on...

If I use disk images, I could simplify the server end of this by keeping ext3 in place. There is no need for a full HFS implementation if all I am doing is dumping files on there. The resource forks, file names and invisible files all stay intact in the .dmg files, right?

There is no option to use a Mac as the backup server, BTW. It will have to be the Intel box I already have built.

Thanks for any help. You ought to submit a tip to the main site; I am sure others will want to do this.

Last edited by shomann; 12-23-2004 at 09:54 PM.
Old 12-24-2004, 06:23 AM   #8
acme.mail.order
League Commissioner
 
Join Date: Sep 2003
Location: Tokyo
Posts: 6,334
Quote:
Originally Posted by shomann
If I use disk images, I could simplify the server end of this by keeping ext3 in place. There is no need for full HFS implementation if all I am doing is dumping files on there. The resource forks, file names and invisible files all stay intact in the .dmg files right?

Exactly. It's like a PDF file: a wrapper for whatever you want to put inside it.

Complete backups are easier than incrementals. IMHO incremental backups in the days of terabyte storage and gigabit networks are a little dated. However, you may have a valid reason to do them.

The scripts:

Assumptions:
Your files are all stored in /Users/shortname/Movies/
The local machine is not hurting for storage space on the user partition.

Code:
    USER=`id -un`
    DATE=`date "+%m-%d-%y"`
    hdiutil create -srcfolder ~/Movies/ -fs HFS+ -volname $USER$DATE -ov -anyowners ~/Public/daily.dmg &
Notice I set the volume name to the user name plus a datestamp, but the file itself has a fixed name. There's a good reason for this: it removes the need to do version management on the local machine. Yesterday's file gets clobbered nightly.
Also note that it's in the Public folder - this way the server doesn't need passwords.

Collecting the files is best done by the server as voldenuit suggested above.
If you are (like me) unfamiliar with command-line access to remote volumes and aren't enthusiastic enough to learn, then replace 'Public' with 'Sites' in the code above and turn on Personal Web Sharing in System Preferences > Sharing.

On the server, make a script containing:
Code:
   DATE=`date "+%m-%d-%y"`
   cd /path/to/backup/directory
   curl -o username1-$DATE.dmg http://192.168.1.101/~username1/daily.dmg
   curl -o username2-$DATE.dmg http://192.168.1.102/~username2/daily.dmg
and so on.
There's a dozen ways to do this, and probably someone else will come up with a one-liner to log on, copy, rename and log out. Hopefully it will be readable. Yes, there's scp, but I hate fussing with passwords on a closed network.

Making scripts run from cron is well-flogged elsewhere.
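For completeness, hypothetical crontab entries (added via `crontab -e`) might look like this; the times and script paths are assumptions and the script names are invented.

```shell
# On each Mac: build the nightly image at 10pm
0 22 * * * /Users/shortname/bin/daily-dmg.sh

# On the server: collect the images an hour later
0 23 * * * /root/bin/collect-dmgs.sh
```

Staggering the two jobs by an hour gives the image creation time to finish before the server starts pulling.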

If you really want/need to do incrementals, then use:
Code:
    mkdir ~/Today
    find ~/Movies/ -newerct '1 days ago' -exec cp {} ~/Today/ \;
    USER=`id -un`
    DATE=`date "+%m-%d-%y"`
    hdiutil create -srcfolder ~/Today/ -fs HFS+ -volname $USER$DATE -ov -anyowners ~/Public/daily.dmg
    rm -r ~/Today
Old 12-27-2004, 08:44 AM   #9
shomann
Prospect
 
Join Date: Dec 2004
Posts: 6
Nice. That would work for most of our machines here, but unfortunately most of them do not have enough space to make the .dmg in the first place.
This script would be GREAT for a system just starting out though, so hopefully others will find some use in it.
You know... there might be a way to use a portable FireWire drive to bootstrap this on the machines that are 80%+ full...
Old 12-27-2004, 09:15 AM   #10
acme.mail.order
League Commissioner
 
Join Date: Sep 2003
Location: Tokyo
Posts: 6,334
If the machines have the main server mounted as a network drive, it should appear in the /Volumes directory. Just have the .dmg created there, directly on the server. You will have to schedule the timed jobs carefully so your network doesn't clog up too much.
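Adapting the earlier per-Mac script to that idea might look like the sketch below. The share name "Backups" is an assumption, and since `hdiutil` exists only on the Macs, the script is just written out and syntax-checked here rather than executed.

```shell
#!/bin/sh
# Write out a client script that builds its image directly on the mounted
# server share (no local scratch space needed), then syntax-check it.
SCRIPT=$(mktemp)
cat > "$SCRIPT" <<'EOF'
#!/bin/sh
USER=`id -un`
DATE=`date "+%m-%d-%y"`
# Target the mounted share under /Volumes instead of ~/Public:
hdiutil create -srcfolder ~/Movies/ -fs HFS+ -volname "$USER$DATE" \
    -ov -anyowners "/Volumes/Backups/$USER-$DATE.dmg"
EOF
sh -n "$SCRIPT" && echo "syntax OK"
```

Writing the image over the wire trades local disk for network time, so this is where the careful scheduling matters most.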
Old 12-27-2004, 03:10 PM   #11
CAlvarez
Hall of Famer
 
Join Date: Sep 2004
Location: Phoenix, AZ
Posts: 4,975
Take a look at the NAS appliances from Snap Appliance. They include backup software for workstations as well as servers, and support attached tape libraries (up to five tapes with the included license, extra cost for more tapes).

I don't personally use them for workstation backup, but love them as storage servers.
__________________
--
Carlos Alvarez, Phoenix, AZ

"MacBook Nano" (Lenovo S10) Atom 1.6/2GB/160GB Mac OS X 10.5.6
Gigabyte Quad Core 2.83GHz Hackintosh 4GB/500GB Mac OS X 10.6
MacBook Air 1.8/2GB/64GB SSD

http://www.televolve.com
Old 12-27-2004, 05:30 PM   #12
Caius
All Star
 
Join Date: Oct 2004
Location: Leeds, UK
Posts: 757
Topic Title

You haven't seen the film Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb by any chance, have you?
__________________
http://caius.name/
Old 12-28-2004, 10:04 AM   #13
shomann
Prospect
 
Join Date: Dec 2004
Posts: 6
iNemo, you are the only one who has brought that up... good catch.

acme, thanks for the guidance. I still have the same basic problem: I think I will still need an HFS volume to copy the files to before they can be organized into .dmgs.

I am looking into a Linux package called hfsplus. This should give me the option of either a packaged solution or a script solution based on what you have laid out here.
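If the kernel route works out (Linux 2.6 shipped an hfsplus driver, which Fedora Core 2's kernel should have available as a module), mounting a Mac-formatted disk might look like this. The device name and mount point are assumptions, and these commands need root.

```shell
# Hypothetical sketch: mount an HFS+ partition on Linux.
modprobe hfsplus                     # load the HFS+ driver if modular
mkdir -p /mnt/hfs
mount -t hfsplus /dev/sdb1 /mnt/hfs  # device name is a placeholder
ls /mnt/hfs                          # long Mac file names should survive
umount /mnt/hfs
```

That would sidestep the long-file-name problem entirely, at the cost of maintaining an HFS+ volume on the Linux box.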

Again thanks, expect a documented post soon on my (hopefully) success.
Old 12-28-2004, 10:21 AM   #14
Caius
All Star
 
Join Date: Oct 2004
Location: Leeds, UK
Posts: 757
Cool

Quote:
Originally Posted by shomann
iNemo you are the only one that has brought that up... good catch.

I like my films.
__________________
http://caius.name/

