The macosxhints Forums


detlef 05-21-2006 04:52 AM

grep question
 
I want to replace variable text strings in several lines. Below is an example:

Musik CDs/Reggae/AB.mp3
Musik CDs/Classic/Bach/E.mp3
Musik CDs/Rock/Zappa/Baby Snakes/IKLM.mp3
Musik CDs/Blues/various/US/B.B King/NOP.mp3

I want to keep "Musik CDs/" and "/name_of_mp3.mp3" and replace everything in between with "Shortcuts".

The result should look like this:

Musik CDs/Shortcuts/AB.mp3
Musik CDs/Shortcuts/E.mp3
Musik CDs/Shortcuts/IKLM.mp3
Musik CDs/Shortcuts/NOP.mp3

I am trying to do this in TextWrangler. How would I do that?

acme.mail.order 05-21-2006 05:19 AM

`grep` is a unix program that does pattern matching. "regex" is the more common abbreviation of Regular Expressions.

You simply want to use the 'or' operator '|' :

s/Reggae|Classic|Rock|Blues/Shortcuts/

Or if there are many, many categories or subcategories, match the characters between the first and last slashes:

s/\/.*\//\/Shortcuts\//

This regex will match the LONGEST character string between (and including) the slashes. It gets a bit more complex if you only want to match one set.

And I don't use TextWrangler, but what I've typed will work in unix-like substitutions with a syntax of "s/find/replace/"
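
As a quick check of the greedy pattern, here is a minimal sketch that feeds the four sample lines from the first post through sed on standard input. The variable names `sample` and `result` are only illustrative, and nothing is written to disk.

```shell
# Sample lines from the first post, fed to sed on standard input.
sample='Musik CDs/Reggae/AB.mp3
Musik CDs/Classic/Bach/E.mp3
Musik CDs/Rock/Zappa/Baby Snakes/IKLM.mp3
Musik CDs/Blues/various/US/B.B King/NOP.mp3'

# Greedy match: from the first slash through the last slash on each line.
result=$(printf '%s\n' "$sample" | sed -e 's/\/.*\//\/Shortcuts\//')

printf '%s\n' "$result"
```

On these plain path lines every directory level between "Musik CDs" and the file name collapses into "Shortcuts", because the first and last slashes on each line bracket exactly that part.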

detlef 05-21-2006 06:24 AM

If I use unix then I need to make the changes within a file.

What would the command look like if I want to make the changes in a file named Musik.html?

hayne 05-21-2006 06:48 AM

Quote:

Originally Posted by detlef
If I use unix then I need to make the changes within a file.

You should be able to do the above mentioned grep-like regular expression substitutions with TextWrangler's Search & Replace. Just check the 'grep' option in the Search & Replace dialog.

detlef 05-21-2006 06:54 AM

yes, that is what I have been trying to do. I have used the find & replace function of TextWrangler a lot but I have not been able to figure out how to do the replacement I was talking about above.

acme.mail.order 05-21-2006 06:55 AM

Quote:

Originally Posted by detlef
If I use unix then I need to make the changes within a file.

Says who? Make changes between files, in a file, or streamed:

sed -e 's/\/.*\//\/Shortcuts\//' Musik.html > Musik2.html

Or if you're feeling brave:

perl -pi -e 's/\/.*\//\/Shortcuts\//' Musik.html

Note, though, that an html file makes the match expression a bit more complex as the "/" delimiter I used above also exists in normal html tags and you're not guaranteed to have it all on one line by itself. You should mention these things at the beginning :rolleyes:
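
The "between files" variant can be tried safely in a scratch directory. This is a minimal sketch under assumptions: the one-line Musik.html below is made-up test content, and `workdir`, `original`, and `converted` are illustrative names.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf '%s\n' 'Musik CDs/Rock/Zappa/Baby Snakes/IKLM.mp3' > "$workdir/Musik.html"

# Stream the original through sed into a second file; Musik.html stays unchanged.
sed -e 's/\/.*\//\/Shortcuts\//' "$workdir/Musik.html" > "$workdir/Musik2.html"

original=$(cat "$workdir/Musik.html")
converted=$(cat "$workdir/Musik2.html")
printf 'before: %s\nafter:  %s\n' "$original" "$converted"

rm -r "$workdir"
```

Writing to a second file leaves the original available for comparison, which is the main advantage over editing in place.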

NadeemF 05-21-2006 06:59 AM

You should probably use sed, but I'd use
perl -pi.~1~ -e 's|Musik CDs/[^/]+/|Musik CDs/Shortcuts/|g' Musik.html
This asks perl to
1) make a backup of Musik.html (called Musik.html.~1~)
2) change 'Musik CDs/'anything'/' to 'Musik CDs/Shortcuts/' globally (ie more than once)

You could have a more sophisticated version,
eg to only change those ending in mp3
perl -pi.~1~ -e 's|Musik CDs/[^/]+/(.*\.mp3)|Musik CDs/Shortcuts/$1|g' Musik.html
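
One detail worth checking is how far `[^/]+` reaches. The sketch below uses `sed -E` rather than perl (an assumption: a sed that accepts the `-E` extended-regex switch) and compares the single-level pattern with one that allows slashes inside the match. The names `one_level` and `all_levels` are illustrative.

```shell
sample='Musik CDs/Reggae/AB.mp3
Musik CDs/Classic/Bach/E.mp3'

# [^/]+ cannot cross a slash, so only the first directory level is replaced.
one_level=$(printf '%s\n' "$sample" | sed -E 's|Musik CDs/[^/]+/|Musik CDs/Shortcuts/|g')

# Allowing slashes inside the match (.* up to the last slash) collapses every level.
all_levels=$(printf '%s\n' "$sample" | sed -E 's|Musik CDs/.*/|Musik CDs/Shortcuts/|')

printf 'one level:\n%s\n\nall levels:\n%s\n' "$one_level" "$all_levels"
```

For paths with a single genre directory the two behave the same. For nested paths the first form leaves the deeper directories (here "Bach/") in place.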

hayne 05-21-2006 07:26 AM

3 Attachment(s)
Quote:

Originally Posted by detlef
yes, that is what I have been trying to do. I have used the find & replace function of TextWrangler a lot but I have not been able to figure out how to do the replacement I was talking about above.

Have a look at this (using the pattern suggested by NadeemF):

detlef 05-21-2006 07:45 AM

yes, that is going in the right direction but what I want is this:

Musik CDs/Shortcuts/AB.mp3
Musik CDs/Shortcuts/E.mp3
Musik CDs/Shortcuts/IKLM.mp3
Musik CDs/Shortcuts/NOP.mp3

hayne 05-21-2006 07:54 AM

Quote:

Originally Posted by detlef
yes, that is going in the right direction but what I want is this:

Musik CDs/Shortcuts/AB.mp3
Musik CDs/Shortcuts/E.mp3
Musik CDs/Shortcuts/IKLM.mp3
Musik CDs/Shortcuts/NOP.mp3

Then try the regex suggested by acme.mail.order
The nice thing about TextWrangler is that you can try some regex, then Undo if it doesn't do what you want. So experiment a bit.
If nothing seems to work, then ask again or read one of the many regex tutorials.

detlef 05-21-2006 08:04 AM

many thanks for your help. Yes, it is nice that in TextWrangler you can quickly undo what you just did. I have actually tried quite a bit already; I am just not there yet. Unfortunately the unix commands cannot be applied directly in TextWrangler. I would already have used one of the suggested unix commands if I only had one or two files to deal with, but I need to replace the strings in 55 files. In TextWrangler I can process all files at once. That's why I am trying to stick with TextWrangler. I am sure there is a way to do it. I have sent an email to the TextWrangler-Talk. If I get an answer there I will post it here. But hayne, if you find the solution please let me know.

guardian34 05-21-2006 08:08 AM

Find: Musik CDs/.+/

Replace: Musik CDs/Shortcuts/


This is a case where greedy matching is desired.

detlef 05-21-2006 08:43 AM

almost success. I actually tried this one on my real problem. There it didn't work. But it sure works on the little example. Now I know that this command does work, it just needs to be a little bit modified. Below is a real example:

<tr class="nonalt"><td>18.</td><td><a href="http://10.0.1.201/Musik/Klassik/Antonin Dvorak/Antonin Dvorak - Rusalka - CD 2/(Antonin Dvorak) - White blossoms all along the road.mp3">White blossoms all along ...</a></td><td>Antonin Dvorak</td><td>Rusalka - CD 2</td><td>128</td><td>MPEG</td></tr>

If you use /.+/ then of course it replaces everything all the way to tr>. Maybe the search can be limited. If the search only goes as far as the mp3 on each line, then /.+/ will work!?
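
The overshoot can be reproduced on a small scale. This is a minimal sketch assuming a shortened, hypothetical table row in the same shape as the real one, and a sed that accepts `-E`. The names `row` and `overshoot` are illustrative.

```shell
# A shortened, hypothetical table row in the same shape as the real one above.
row='<tr><td><a href="http://10.0.1.201/Musik/Klassik/Dvorak/Song.mp3">Song</a></td></tr>'

# .+ is greedy, so the match runs to the LAST slash on the line (the one in </tr>).
overshoot=$(printf '%s\n' "$row" | sed -E 's|Musik/.+/|Musik/Shortcuts/|')

printf '%s\n' "$overshoot"
```

Everything between "Musik/" and the slash of the closing tr tag is swallowed, so the pattern needs something that stops it at the end of the link, such as the closing quote or the ".mp3" extension.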

acme.mail.order 05-21-2006 09:21 AM

Quote:

Originally Posted by detlef
In TextWrangler I can process all files at once.

In unix, simply use a wildcard instead of a filename (in any non-stream process) and you will process all the files at once. For example, one of the above perl commands with a *.html will do everything in a directory, or you can use `find . -exec ....`
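
A minimal sketch of the wildcard idea, using a plain shell loop with sed instead of perl's edit-in-place (a swapped-in technique, chosen here only because it needs nothing beyond sed). The two HTML files and the names `workdir` and `combined` are made up for the demonstration.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Musik CDs/Reggae/AB.mp3' > "$workdir/one.html"
printf '%s\n' 'Musik CDs/Classic/Bach/E.mp3' > "$workdir/two.html"

# The shell expands *.html to every matching file; each one is rewritten via a
# temporary copy, and the original is only replaced if sed succeeded.
for f in "$workdir"/*.html; do
    sed -e 's/\/.*\//\/Shortcuts\//' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

combined=$(cat "$workdir"/*.html)
printf '%s\n' "$combined"

rm -r "$workdir"
```

The same loop scales to any number of files in the directory, since the shell does the expansion before the loop starts.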

btw, what part of Tokyo are you in?

detlef 05-21-2006 09:27 AM

mmh, maybe I should have a closer look at your solution.

I live in Nakano. And you?

acme.mail.order 05-21-2006 10:05 AM

Setagaya

Whenever I'm working on any complex text replacement, I usually test the regex by dumping the stream to the terminal as an instant preview, piping to head or less if needed. Once I'm happy it's working, then I'll use perl's edit-in-place feature. No undo needed as nothing's really been done yet, and when I do run for real, it's much, much faster than the gui versions.

Feeding your example into :

sed -e 's/Musik[^"]*\//Musik\/Shortcut\//'

(match "Musik" followed by any number of non-quote characters followed by a slash)
I get :

<tr class="nonalt"><td>18.</td><td><a href="http://10.0.1.201/Musik/Shortcut/(Antonin Dvorak) - White blossoms all along the road.mp3">White blossoms all along ...</a></td><td>Antonin Dvorak</td><td>Rusalka - CD 2</td><td>128</td><td>MPEG</td></tr>

I think you will probably want to remove the parentheses as well.
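
The preview-first workflow can be sketched as follows, using the real table row quoted earlier in the thread as input. The variable names `row` and `preview` are illustrative, and nothing is written to disk.

```shell
row='<tr class="nonalt"><td>18.</td><td><a href="http://10.0.1.201/Musik/Klassik/Antonin Dvorak/Antonin Dvorak - Rusalka - CD 2/(Antonin Dvorak) - White blossoms all along the road.mp3">White blossoms all along ...</a></td><td>Antonin Dvorak</td><td>Rusalka - CD 2</td><td>128</td><td>MPEG</td></tr>'

# Preview only: the result goes to the terminal; head keeps long output manageable.
# [^"]* cannot cross the closing quote of the href, so the match stays inside the link.
preview=$(printf '%s\n' "$row" | sed -e 's/Musik[^"]*\//Musik\/Shortcut\//' | head -n 5)

printf '%s\n' "$preview"
```

Because the character class excludes the double quote, the greedy match ends at the last slash inside the href instead of running on to the closing tags.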

detlef 05-21-2006 10:51 AM

yes, this is working. I tried it on my html files. Thanks for your help. With *.html I should then be able to process all the files at once. Great.

acme.mail.order 05-21-2006 11:04 AM

Run

tar -zcf backup.tgz ./*html

first. You'll either process them all at once or totally trash them all at once. Yes, that's the voice of experience :eek:
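
A minimal sketch of the backup step, with a listing afterwards to confirm what the archive holds. The scratch directory and the two placeholder HTML files are assumptions made for the demonstration.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'first page' > "$workdir/a.html"
printf '%s\n' 'second page' > "$workdir/b.html"

# Archive every HTML file, then list the archive to confirm its contents.
(cd "$workdir" && tar -zcf backup.tgz ./*html)
listing=$(cd "$workdir" && tar -tzf backup.tgz | sort)
printf '%s\n' "$listing"

# Restoring later would be: tar -zxf backup.tgz (run in the same directory).
rm -r "$workdir"
```

Listing the archive before running the bulk edit is a cheap way to make sure the backup actually contains the files.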

guardian34 05-21-2006 12:03 PM

Quote:

Originally Posted by detlef
I actually tried this one on my real problem. There it didn't work. But it sure works on the little example.

In the future, I'd start with a real example…


Find: Musik/.+/(.+\.mp3)

Replace: Musik/Shortcut/\1
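
The same find/replace pair can be tried outside TextWrangler. This is a minimal sketch assuming a sed that accepts `-E` and a shortened, hypothetical table row; `row` and `fixed` are illustrative names.

```shell
# A shortened, hypothetical row; the file name contains a space on purpose.
row='<td><a href="http://10.0.1.201/Musik/Klassik/Dvorak/Song One.mp3">Song One</a></td></tr>'

# The group (.+\.mp3) anchors the end of the match at the file name, and \1
# puts that file name back after the new directory.
fixed=$(printf '%s\n' "$row" | sed -E 's|Musik/.+/(.+\.mp3)|Musik/Shortcut/\1|')

printf '%s\n' "$fixed"
```

Requiring ".mp3" at the end of the match is what keeps the greedy `.+` from running into the closing tags. This relies on there being one link per line, as in the rows shown in the thread.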

detlef 05-23-2006 08:00 PM

thanks guardian34.

xesrever 05-28-2006 09:50 AM

Find: (Musik CDs)/.+/(\w+\.mp3)
Replace: \1\/Shortcuts\/\2

Of course be sure to check "Use Grep"!

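By the way, if you try to use this regexp with sed (with the -E switch) or perl and keep "/" as the s/// delimiter, you'd have to escape the forward slashes (i.e. \/). Choosing another delimiter, such as the "|" NadeemF used above, avoids that.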
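
The effect of the delimiter can be seen side by side. This is a minimal sketch assuming a sed that accepts `-E`. The POSIX class `[[:alnum:]_]` stands in for `\w`, since `\w` is not available in every sed, and `escaped` and `plain` are illustrative names.

```shell
line='Musik CDs/Blues/various/US/B.B King/NOP.mp3'

# With / as the delimiter, every literal slash in the pattern needs a backslash.
escaped=$(printf '%s\n' "$line" | sed -E 's/(Musik CDs)\/.+\/([[:alnum:]_]+\.mp3)/\1\/Shortcuts\/\2/')

# With | as the delimiter, the slashes can be written as they are.
plain=$(printf '%s\n' "$line" | sed -E 's|(Musik CDs)/.+/([[:alnum:]_]+\.mp3)|\1/Shortcuts/\2|')

printf '%s\n%s\n' "$escaped" "$plain"
```

Both commands produce the same line. The second is simply easier to read when the pattern is full of path separators.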

guardian34 05-28-2006 10:30 AM

(Emphasis added)
Quote:

Originally Posted by xesrever
Find: ^(Musik CDs)/.+/(\w+\.mp3)$

The patterns in Detlef's actual files don't start at the beginning of the line (nor do they contain the pattern " CDs"… there is also more content before the end of the line).



Future posters, please post actual examples that you are trying to work with.

xesrever 05-28-2006 10:55 AM

Oh my bad. Thanks for pointing that out. I'll edit that in my original post. Now it should work.


All times are GMT -5.