grep question
I want to replace variable text strings in several lines. Below is an example:
Musik CDs/Reggae/AB.mp3
Musik CDs/Classic/Bach/E.mp3
Musik CDs/Rock/Zappa/Baby Snakes/IKLM.mp3
Musik CDs/Blues/various/US/B.B King/NOP.mp3
I want to keep "Musik CDs/" and "/name_of_mp3.mp3" and replace everything in between with "Shortcuts". The result should look like this:
Musik CDs/Shortcuts/AB.mp3
Musik CDs/Shortcuts/E.mp3
Musik CDs/Shortcuts/IKLM.mp3
Musik CDs/Shortcuts/NOP.mp3
I am trying to do this in TextWrangler. How would I do that? |
`grep` is a unix program that does pattern matching. "regex" is the more common abbreviation of Regular Expressions.
You simply want to use the 'or' operator '|':
s/Reggae|Classic|Rock|Blues/Shortcuts/
Or if there are many, many categories or subcategories, match the characters between the first and last slashes:
s/\/.*\//\/Shortcuts\//
This regex will match the LONGEST character string between (and including) the slashes. It gets a bit more complex if you only want to match one set. And I don't use TextWrangler - what I've typed will work in unix-like substitutions with a syntax of "s/find/replace/" |
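To see how the two substitutions differ, the four sample lines from the question can be fed through sed. This is only a sketch: it assumes a POSIX shell and a sed that accepts `-E` for extended regular expressions (needed for the `|` alternation), and the variable names are made up for illustration.

```shell
# Sample lines from the original question.
sample='Musik CDs/Reggae/AB.mp3
Musik CDs/Classic/Bach/E.mp3
Musik CDs/Rock/Zappa/Baby Snakes/IKLM.mp3
Musik CDs/Blues/various/US/B.B King/NOP.mp3'

# Alternation: replaces only the category name, so any subfolders survive.
by_name=$(printf '%s\n' "$sample" | sed -E 's/Reggae|Classic|Rock|Blues/Shortcuts/')

# Greedy match from the first slash to the last slash on each line.
by_slashes=$(printf '%s\n' "$sample" | sed -e 's/\/.*\//\/Shortcuts\//')

printf '%s\n\n%s\n' "$by_name" "$by_slashes"
```

The alternation leaves lines such as "Musik CDs/Shortcuts/Zappa/Baby Snakes/IKLM.mp3", while the slash-to-slash version gives the result asked for on all four lines.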
If I use unix then I need to make the changes within a file.
What would the command look like if I want to make the changes in a file named Musik.html? |
Quote:
|
yes, that is what I have been trying to do. I have used the find & replace function of TextWrangler a lot but I have not been able to figure out how to do the replacement I was talking about above.
|
Quote:
sed -e 's/\/.*\//\/Shortcuts\//' Musik.html > Musik2.html
Or if you're feeling brave:
perl -pi -e 's/\/.*\//\/Shortcuts\//' Musik.html
Note, though, that an HTML file makes the match expression a bit more complex, as the "/" delimiter I used above also exists in normal HTML tags and you're not guaranteed to have it all on one line by itself. You should mention these things at the beginning :rolleyes: |
You should probably use sed, but I'd use
perl -pi.~1~ -e 's|Musik CDs/[^/]+/|Musik CDs/Shortcuts/|g' Musik.html
This asks perl to
1) make a backup of Musik.html (called Musik.html.~1~)
2) change 'Musik CDs/'anything'/' to 'Musik CDs/Shortcuts/' globally (i.e. more than once)
You could have a more sophisticated version, e.g. to only change those ending in mp3:
perl -pi.~1~ -e 's|Musik CDs/[^/]+/(.*\.mp3)|Musik CDs/Shortcuts/$1|g' Musik.html |
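Note that `[^/]+` matches a single folder name, so with nested folders only the first level after "Musik CDs/" is replaced. A small sketch of the difference, using sed -E instead of perl so that it runs in a plain POSIX shell (the variable names and the repeated-group variant are illustrations, not something from the posts above):

```shell
line='Musik CDs/Rock/Zappa/Baby Snakes/IKLM.mp3'

# [^/]+ stops at the next slash, so only "Rock" is replaced.
one_level=$(printf '%s\n' "$line" | sed -E 's|Musik CDs/[^/]+/|Musik CDs/Shortcuts/|g')

# Repeating the "folder name plus slash" group consumes every folder level.
all_levels=$(printf '%s\n' "$line" | sed -E 's|Musik CDs/([^/]+/)+|Musik CDs/Shortcuts/|g')

printf '%s\n%s\n' "$one_level" "$all_levels"
```

The first form prints "Musik CDs/Shortcuts/Zappa/Baby Snakes/IKLM.mp3", the second "Musik CDs/Shortcuts/IKLM.mp3".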
Quote:
|
yes, that is going in the right direction but what I want is this:
Musik CDs/Shortcuts/AB.mp3
Musik CDs/Shortcuts/E.mp3
Musik CDs/Shortcuts/IKLM.mp3
Musik CDs/Shortcuts/NOP.mp3 |
Quote:
The nice thing about TextWrangler is that you can try some regex, then Undo if it doesn't do what you want. So experiment a bit. If nothing seems to work, then ask again or read one of the many regex tutorials. |
many thanks for your help. Yes, it is nice that in TextWrangler you can quickly undo what you just did. I have actually tried quite a bit already. I am just not there yet. Unfortunately the unix commands cannot be directly applied in TextWrangler. I would have already used one of the suggested unix commands if I only had one or two files to deal with. I need to replace the strings in 55 files. In TextWrangler I can process all files at once. That's why I am trying to stick with TextWrangler. I am sure there is a way to do it. I have sent an email to TextWrangler-Talk. If I get an answer there I will post it here. But hayne, if you find the solution please let me know.
|
Find: Musik CDs/.+/
Replace: Musik CDs/Shortcuts/
This is a case where greedy matching is desired. |
almost success. I actually tried this one on my real problem. There it didn't work. But it sure works on the little example. Now I know that this command does work; it just needs to be modified a little. Below is a real example:
<tr class="nonalt"><td>18.</td><td><a href="http://10.0.1.201/Musik/Klassik/Antonin Dvorak/Antonin Dvorak - Rusalka - CD 2/(Antonin Dvorak) - White blossoms all along the road.mp3">White blossoms all along ...</a></td><td>Antonin Dvorak</td><td>Rusalka - CD 2</td><td>128</td><td>MPEG</td></tr>
If you use /.+/ then of course it replaces everything all the way to tr>. Maybe the search can be limited. If the search only goes as far as the mp3 on every single line, then the /.+/ will work!? |
Quote:
btw, what part of Tokyo are you in? |
mmh, maybe I should have a closer look at your solution.
I live in Nakano. And you? |
Setagaya
Whenever I'm working on any complex text replacement, I usually test the regex by dumping the stream to the terminal as an instant preview, piping to head or less if needed. Once I'm happy it's working, then I'll use perl's edit-in-place feature. No undo needed as nothing's really been done yet, and when I do run for real, it's much, much faster than the GUI versions. Feeding your example into:
sed -e 's/Musik[^"]*\//Musik\/Shortcut\//'
(match "Musik" followed by any number of non-quote characters followed by a slash) I get:
<tr class="nonalt"><td>18.</td><td><a href="http://10.0.1.201/Musik/Shortcut/(Antonin Dvorak) - White blossoms all along the road.mp3">White blossoms all along ...</a></td><td>Antonin Dvorak</td><td>Rusalka - CD 2</td><td>128</td><td>MPEG</td></tr>
I think you will probably want to remove the parentheses as well. |
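For anyone following along, here is a rough side-by-side of why the `[^"]` matters on the real HTML row. It is a sketch assuming a POSIX shell; the unbounded `.*` variant is included only for comparison, and the variable names are invented for the example.

```shell
# The real table row posted earlier in the thread.
row='<tr class="nonalt"><td>18.</td><td><a href="http://10.0.1.201/Musik/Klassik/Antonin Dvorak/Antonin Dvorak - Rusalka - CD 2/(Antonin Dvorak) - White blossoms all along the road.mp3">White blossoms all along ...</a></td><td>Antonin Dvorak</td><td>Rusalka - CD 2</td><td>128</td><td>MPEG</td></tr>'

# Unbounded: .* is greedy and runs to the LAST slash on the line (inside </tr>).
unbounded=$(printf '%s\n' "$row" | sed -e 's/Musik.*\//Musik\/Shortcut\//')

# Bounded: [^"]* cannot cross the closing quote of the href, so the match
# ends at the last slash inside the URL.
bounded=$(printf '%s\n' "$row" | sed -e 's/Musik[^"]*\//Musik\/Shortcut\//')

# Preview only: nothing is written to any file at this point.
printf '%s\n\n%s\n' "$unbounded" "$bounded"
```

The unbounded version swallows the rest of the row and ends in "Musik/Shortcut/tr>", which is the failure described a few posts up; the bounded one leaves everything after the URL untouched.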
yes, this is working. I tried it on my html files. Thanks for your help. With *.html I should then be able to process all the files at once. Great.
|
Run
tar -zcf backup.tgz ./*html
first. You'll either process them all at once or totally trash them all at once. Yes, that's the voice of experience :eek: |
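Putting the backup and the batch run together might look roughly like this. It is a sketch under assumptions: the scratch directory and the two tiny files a.html and b.html are made up so the example is safe to run, and each file is rewritten via a temporary copy rather than edited in place.

```shell
# Work in a scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical stand-ins for the real 55 files.
printf '%s\n' '<a href="http://10.0.1.201/Musik/Klassik/Bach/E.mp3">E</a>' > a.html
printf '%s\n' '<a href="http://10.0.1.201/Musik/Rock/Zappa/IKLM.mp3">IKLM</a>' > b.html

# 1) Back up first, as advised above.
tar -zcf backup.tgz ./*html

# 2) Rewrite each file through a temporary copy; the original is replaced
#    only if sed exited successfully.
for f in ./*.html; do
    sed -e 's/Musik[^"]*\//Musik\/Shortcut\//' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat a.html b.html
```

If something goes wrong, `tar -zxf backup.tgz` in the same directory restores the originals.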
Quote:
Find: Musik/.+/(.+\.mp3)
Replace: Musik/Shortcut/\1 |
thanks guardian34.
|
Find: (Musik CDs)/.+/(\w+\.mp3)
Replace: \1\/Shortcuts\/\2
Of course be sure to check "Use Grep"! By the way, if you try to use this regexp with sed (with the -E switch) or perl, you'd have to escape the forward slashes (i.e. \/). |
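On the sed side, the escaping can also be avoided by picking a different delimiter for the `s` command. A sketch, assuming a sed with `-E`; `[[:alnum:]_]` is used in place of `\w` because `\w` is not available in every sed, and the file name "Lucille Live.mp3" is invented to show a limitation:

```shell
# Using # as the delimiter means the slashes need no backslashes.
fixed=$(printf '%s\n' 'Musik CDs/Rock/Zappa/Baby Snakes/IKLM.mp3' |
    sed -E 's#(Musik CDs)/.+/([[:alnum:]_]+\.mp3)#\1/Shortcuts/\2#')

# A word-character class does not match a file name containing a space,
# so this (hypothetical) line passes through unchanged.
missed=$(printf '%s\n' 'Musik CDs/Blues/B.B King/Lucille Live.mp3' |
    sed -E 's#(Musik CDs)/.+/([[:alnum:]_]+\.mp3)#\1/Shortcuts/\2#')

# Matching "anything but a slash" for the file name handles spaces too.
loose=$(printf '%s\n' 'Musik CDs/Blues/B.B King/Lucille Live.mp3' |
    sed -E 's#(Musik CDs)/.+/([^/]+\.mp3)#\1/Shortcuts/\2#')

printf '%s\n%s\n%s\n' "$fixed" "$missed" "$loose"
```

The same caveat about spaces applies to `\w+` in the TextWrangler pattern, which matters for file names like the one in the real example earlier in the thread.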
(Emphasis added)
Quote:
Future posters, please post actual examples that you are trying to work with. |
Oh my bad. Thanks for pointing that out. I'll edit that in my original post. Now it should work.
|