Old 01-22-2005, 04:47 AM   #1
nmerriam
Triple-A Player
 
Join Date: Sep 2004
Posts: 60
Way to grab the common part of a file name?

I've been going script-crazy the past few days, automating just about everything I do on a regular basis (and making the scripts accessible through OnMyCommand).

I've got one organizing script right now that's working well, but I can think of a way to make it better.

Say I have a directory of files with common root names, but different endings:

California Vacation 2-20-2005 0001.jpg
California Vacation 2-20-2005 0002.jpg
California Vacation 2-20-2005 0003.jpg

or

James Taylor - Make it Rain - In the mood.mp3
James Taylor - Make it Rain - Kill me softly.mp3
James Taylor - Make it Rain - Just us chickens.mp3

Right now my script pops up a dialogue and inserts the name of the first file, so I just delete the unique part on the end, hit enter, and then the script will make a directory with the common part and move each file that has that common beginning to that directory.

California Vacation 2-20-2005
California Vacation 2-20-2005 0001.jpg
California Vacation 2-20-2005 0002.jpg
California Vacation 2-20-2005 0003.jpg

Is there a command-line technique that could automate finding the common root? In the abstract, I'm thinking of making an array of the filenames and comparing them character by character until they differ.

Would there be a way to have it recognize the trailing common "padding", as above where I wouldn't want a directory named "California Vacation 2-20-2005 000" or "James Taylor - Make it Rain - " -- some way to tell it to trim the end of spaces, hyphens, even redundant zeros? I don't have the slightest idea where to start -- I'm certain perl could do it (it can do anything!) but I also wouldn't be surprised to find out that 3 simple command line utilities piped together could do it.

Hoping someone here can at least point me in the best direction to start googling!

ack, the things that keep me up at night
Nathaniel
Old 01-22-2005, 12:03 PM   #2
Pope Stewart
Triple-A Player
 
Join Date: May 2004
Posts: 84
If you want to do that over and over, pick a common marker that separates the root from the unique part.

Say, use -- when you name the files:
California Vacation 2-20-2005 0001.jpg -> California Vacation 2-20-2005--0001.jpg

Then you could use awk
Code:
ls -1 | awk 'BEGIN { FS="--"; } NR==1{ printf( "%s\n", $1 ); }'
That lists your directory and, using the delimiter you have chosen, prints the root part of the first file name it sees.
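
The same split can be sketched in bash alone, without parsing the output of ls. This assumes the "--" marker convention described above; root_of is a hypothetical helper name, not something from the original post.

```shell
#!/bin/bash
# root_of is a hypothetical helper: print the part of a name that comes
# before the first "--" marker. A name without the marker is printed whole.
root_of() {
    local name=$1
    # %% removes the longest suffix matching the pattern "--*",
    # i.e. everything from the first "--" to the end.
    printf '%s\n' "${name%%--*}"
}

root_of "California Vacation 2-20-2005--0001.jpg"
# prints: California Vacation 2-20-2005
```

Like the awk line, this only helps if the files are named with the marker in the first place.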

Outside of that, if you want to make something more universal, you might have to find the things that are common to all of your files and then create a giant if statement to check them all and perform the same kind of tokenizing based on a delimiter.

That's how I would see it at least.
Old 01-23-2005, 06:44 AM   #3
mark hunte
MVP
 
Join Date: Apr 2004
Location: Hello London Calling
Posts: 1,787
I am still new at this.
Using Pope Stewart's awk line, I can get this to run in TextWrangler, but not in Terminal.
Code:
#!/bin/bash
cd "/Users/mhunte/Desktop/testfolder"
path=`pwd`


for patht in *
do
if [ -f "$patht" ]; then

check=`ls -1 | awk 'BEGIN { FS="-*[2-]"; } NR==1{ printf( "%s\n", $1 ); }'`



theitems=`osascript <<END
set thepath to   POSIX file "$path"
property thisfolder : alias thepath
set newfold to alias "Hard Disk:Users:shortusername:Desktop:newfolder:"
tell application "Finder" to set theitems to get (every document file of thisfolder whose name contains "$check")
tell application "Finder"
	set newfo to make new folder at newfold with properties {name:"$check"}
	tell application "Finder" to move theitems to newfo
end tell
END`

else
    echo "$path- is now empty"
fi


done
Just to note: the main thing I have got closer to grasping here is passing variables between the shell and AppleScript.
It's not perfect, and I know there are probably better ways to handle nmerriam's problem, which I want to learn.
The 'cd' part for the path is so I know exactly where this script is looking, i.e. not my / or ~ directories.

So anyway, why am I getting



Code:
247:248: syntax error: Expected expression but found end of line. (-2741)
247:248: syntax error: Expected expression but found end of line. (-2741)
247:248: syntax error: Expected expression but found end of line. (-2741)
247:248: syntax error: Expected expression but found end of line. (-2741)
247:248: syntax error: Expected expression but found end of line. (-2741)
in Terminal and not in TextWrangler?

Last edited by mark hunte; 01-23-2005 at 10:17 AM.
Old 01-23-2005, 09:52 AM   #4
Pope Stewart
Triple-A Player
 
Join Date: May 2004
Posts: 84
Well...I took the script and made some modifications so that it would work on my directory structure and it worked from terminal. The only things I changed were the path variable and newfolder applescript variable.

Maybe it has something to do with your line endings. Check those. I am not totally sure, but maybe it does. In Terminal, type `cat scriptname`. If there is one long string of commands where there should be separate lines, then the file might have classic Mac (CR) line endings rather than Unix (LF) ones....maybe
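
A rough sketch of that line-ending check, for anyone who would rather test than eyeball the cat output. The helper names has_cr and cr_to_lf are made up for illustration, and the conversion assumes a file with CR-only endings (a CRLF file would end up with doubled newlines).

```shell
#!/bin/bash
# has_cr (hypothetical helper): succeeds if the file contains any
# carriage-return character, as with classic Mac (CR) or DOS (CRLF) endings.
has_cr() {
    LC_ALL=C grep -q $'\r' "$1"
}

# cr_to_lf (hypothetical helper): write the file to stdout with every CR
# turned into LF. Only appropriate for CR-only files.
cr_to_lf() {
    tr '\r' '\n' < "$1"
}
```

Usage would be something like `has_cr scriptname && cr_to_lf scriptname > scriptname.unix`.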
Old 01-23-2005, 10:07 AM   #5
mark hunte
MVP
 
Join Date: Apr 2004
Location: Hello London Calling
Posts: 1,787
Bloody copy and paste in Terminal was the problem.
I forgot that if you have a small Terminal window, it wraps long lines.

What a pain.

It works now.

Thanks for the reply.

Do you have any ideas how to make this better, like only report once when done?

Last edited by mark hunte; 01-23-2005 at 10:14 AM.
Old 01-23-2005, 11:16 AM   #6
hayne
Site Admin
 
Join Date: Jan 2002
Location: Montreal
Posts: 32,459
Here (below) is a Perl script that will find the common start of the strings passed as command-line arguments. For example (where I assume you have saved the script in a file called "commonStart" and made it executable):
commonStart foo123 foo456 fooling foobar
would give:
foo

You could use it to find the commonality between the filenames in a folder by using it in a pipe like the following:

ls | xargs commonStart

But the above won't work if there are any spaces in the filenames since the space character is interpreted as a separator between command-line arguments. So we need to put quotes around the filenames to make sure that the spaces don't cause problems.
That can be done by use of the following Bash function (put it in your Bash startup file to make it available):
enquote () { /usr/bin/sed 's/^/"/;s/$/"/' ; }

ls | enquote | xargs commonStart

Or just use the 'sed' command explicitly:

ls | /usr/bin/sed 's/^/"/;s/$/"/' | xargs commonStart
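
The sed quoting still trips over a filename that itself contains a double quote. One possible alternative, sketched here under the assumption that your xargs supports the -0 option (the BSD xargs on OS X and GNU xargs both do, though it is not in POSIX), is to separate the names with NUL bytes. The function name each_file_as_arg is hypothetical.

```shell
#!/bin/bash
# each_file_as_arg (hypothetical helper): run the given command with every
# file in the current directory as a separate argument.
# printf repeats its format once per name, ending each name with a NUL byte;
# xargs -0 splits its input only on NUL, so spaces and quotes are harmless.
# Caveat: in an empty directory the unmatched * is passed through literally.
each_file_as_arg() {
    printf '%s\0' * | xargs -0 "$@"
}
```

With that defined, `each_file_as_arg commonStart` should do the same job as the pipelines above.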

Code:
#!/usr/bin/perl -w

# This script determines the string that is the common start of the
# strings passed as command-line arguments
# Cameron Hayne (macdev@hayne.net) January 2005

my @strings = @ARGV;
my $numStrings = scalar(@ARGV);
if ($numStrings < 2)
{
    print "Must supply at least 2 strings as command-line arguments\n";
    exit;
}

my $sep = chr(0); # null character - presumed not to occur in strings
my $common = $strings[0];
for (my $i = 1; $i < $numStrings; $i++)
{
    if ("$common$sep$strings[$i]" =~ /^(.*).*$sep\1.*$/)
    {
        $common = $1;
    }
}

print "$common\n";
Of course the Perl script could be specialized to get the directory listing and operate on the filenames instead of expecting command-line arguments, but I thought it better to keep the Perl script general, so it could be used in other circumstances.
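
For comparison, here is a minimal bash sketch of the character-by-character idea from the original question, plus a guess at the "padding" trim that was asked about. It is not a port of the Perl script; common_prefix and trim_padding are hypothetical names, and the trim rule (drop trailing zeros, then trailing spaces, underscores and hyphens) is only a heuristic that will also eat a zero that really belongs to the name.

```shell
#!/bin/bash
# common_prefix (hypothetical helper): print the longest string that every
# argument starts with.
common_prefix() {
    local prefix=$1 s
    shift
    for s in "$@"; do
        # Drop one character at a time from the end of the candidate
        # until the current string starts with it (or nothing is left).
        while [ -n "$prefix" ]; do
            case $s in
                "$prefix"*) break ;;
            esac
            prefix=${prefix%?}
        done
    done
    printf '%s\n' "$prefix"
}

# trim_padding (hypothetical helper, heuristic): strip a trailing run of
# zeros, then any trailing spaces, underscores or hyphens.
trim_padding() {
    printf '%s\n' "$1" | sed -e 's/0*$//' -e 's/[ _-]*$//'
}

common_prefix foo123 foo456 fooling foobar
# prints: foo

trim_padding "James Taylor - Make it Rain - "
# prints: James Taylor - Make it Rain
```

Stripping the zeros in a separate step before the separators keeps a name like "Trip 2010 000" from being cut back to "Trip 201".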

Last edited by hayne; 01-23-2005 at 11:38 AM.
Old 01-23-2005, 12:15 PM   #7
mark hunte
MVP
 
Join Date: Apr 2004
Location: Hello London Calling
Posts: 1,787
UPDATE **-Way to grab the common part of a file name?

Thanks hayne.

I will spend some time digesting that script.

for some reason I get
Code:
/commonStart: line 9: my: command not found
/commonStart: line 10: syntax error near unexpected token `('
/commonStart: line 10: `my $numStrings = scalar(@ARGV);'
Also,
on the sed part, I guess I could incorporate it into the script.

But I also wonder if I could use 'IFS', the bash Internal Field Separator, and change the separator from 'space' to 'tab'.
I did have a go, but I get the same errors.
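
On the IFS idea: IFS only affects how the shell itself splits words; xargs does its own splitting and does not look at it. A sketch of how IFS could be used instead, by having bash read the names one line at a time and pass them on directly, with no xargs involved. The function name with_names_from_ls is hypothetical, and like any approach that reads ls line by line it assumes no filename contains a newline.

```shell
#!/bin/bash
# with_names_from_ls (hypothetical helper): run the given command with each
# line of ls output as one argument.
with_names_from_ls() {
    local names=() name
    # IFS= (empty, for this read only) keeps leading and trailing blanks;
    # -r keeps backslashes literal. Each line lands in $name unsplit.
    while IFS= read -r name; do
        names+=("$name")
    done < <(ls)
    "$@" "${names[@]}"
}
```

For example, `with_names_from_ls commonStart` would hand each filename to the Perl script as a single argument.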



UPDATE**

I stopped getting the errors on your pure script

when I took the /usr/bin/ part out of the ls | /usr/bin/sed 's/^/"/;s/$/"/'

when I insert
Code:
#!/bin/bash
newifs=$IFS;
IFS="
";
IFS=$newif:

#!/usr/bin/perl -w

# This
at the top of your script. (I don't know if declaring an interpreter twice like this should even work, but thought what the heck.)

I get the exact same errors, which in a warped way I find encouraging, as it may be telling me that it could work...?

The IFS in this is set to a <TAB>.

Last edited by mark hunte; 01-23-2005 at 12:35 PM. Reason: forgot to add something
Old 01-23-2005, 11:31 PM   #8
hayne
Site Admin
 
Join Date: Jan 2002
Location: Montreal
Posts: 32,459
Quote:
Originally Posted by mark hunte
for some reason I get
Code:
/commonStart: line 9: my: command not found
/commonStart: line 10: syntax error near unexpected token `('
/commonStart: line 10: `my $numStrings = scalar(@ARGV);'

Those errors are because you don't have the
#!/usr/bin/perl -w
as the first line of the script. E.g. you probably have a blank line or something else above it.
The script I wrote is a Perl script and thus the first line must announce that fact by being as above.
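
When the "#!" is not at the very start of the file, bash ends up running the file as a shell script, which is why the complaints are about "my" not being a command. A "#!" line anywhere after the first line is just a comment. A quick way to check a file, sketched with a hypothetical helper name and assuming a head that supports -c (the BSD and GNU versions do):

```shell
#!/bin/bash
# first_line_is_shebang (hypothetical helper): succeeds only if the very
# first two bytes of the file are "#!". A blank line or anything else ahead
# of the interpreter line makes it fail.
first_line_is_shebang() {
    [ "$(head -c 2 "$1")" = '#!' ]
}
```

For example: `first_line_is_shebang commonStart || echo "interpreter line is not first"`.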