Quote:
i can never understand user reticence to actually post the actual error text at this point, which is the single most important factor in diagnosing the issue at this juncture. jbc, can you humor us and illuminate us, psychologically? |
Sorry, merv...long day. And it's just an "I'm a perl dummy" error pretty much. Since it's a one-liner, there's not much context.
Code:
Line 1: String found where operator expected near \
Maybe I should clarify that ultimately the script will be called from an MTA with the line "transport_filter \path\to\perlscript". An email message is sent to the script as stdin by the MTA, and it then replaces the original message with stdout from the script. So it has to be a standalone script of some sort, not a terminal command. |
Dug out the llama; I'd forgotten about being able to specify option switches on the shebang line. Works fine now except for a "Can't emulate -e on #! line" error. Tracking that one down...
[edit: Uh, duh. It's getting late...deleted -e option....pl file is fine] |
Just kill the "-e" bit: you're right, that's just a command line flag that says "the stuff between the quotes is the *e*ntire script". Definitely not necessary for a script stored in a file.
In other news: you're right, more extensive (but not extensive enough!) testing shows that my purported solution needs some work. In particular I'd stupidly assumed that each MIME section had its own unique ID, rather than each *message* having such an ID. Oops. I'll have another go tonight and try to wrestle this thing to the ground. Cheers, Paul |
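For anyone following along, here is a rough sketch of what the -0 and -p switches used in this thread arrange when they sit on the shebang line, written out by hand. The names edit_record and run_filter and the "X-Example:" header are made up for illustration; the substitution is a placeholder, not the regex under discussion.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real edit: it just drops "X-Example:" header lines.
sub edit_record {
    my ($text) = @_;
    $text =~ s/^X-Example:[^\n]*\n//mg;
    return $text;
}

# Roughly what "-0 -p" sets up: read NUL-terminated records, edit each, print it.
sub run_filter {
    my ($in, $out) = @_;
    local $/ = "\0";                        # -0: a message without NUL bytes is one record
    while (my $record = <$in>) {            # -p: loop over the input...
        print {$out} edit_record($record);  # ...printing each record after the edit
    }
    return;
}

# Demo on in-memory handles; a real filter would call run_filter(\*STDIN, \*STDOUT).
my $sample = "Subject: hi\nX-Example: drop me\n\nbody\n";
my $result = '';
open my $in,  '<', \$sample or die "cannot open input: $!";
open my $out, '>', \$result or die "cannot open output: $!";
run_filter($in, $out);
close $out;
print $result;
```

Because the whole message arrives as a single record, a multi-line substitution can see the entire message at once, which is what the MIME-stripping regex needs.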
Paul-
Think I found a perl finesse that was causing the problem...I'm not sure I understand it yet, but you probably will. The original line you posted deleted everything from the first match to the end of the input. Looked ahead in the llama book and found "non-greedy" quantifiers, but ".*?" caused the script to delete only two lines for each match. Then while reading about "memory parentheses", I noticed mention of "back references" ("\1") vs "memory variables" ("$1"). Tried changing the first occurrence of the memory variable to a back reference, and it works perfectly!
Code:
#!/opt/local/bin/perl -0 -p
Need a few tweaks to be sure the script is working with whole lines, and it will be ideal for my needs. Thanks very much for getting me pointed in the right direction!
As far as MIME parts, it seems to be the case that each part begins and ends with the same boundary identifier, although a multipart message may have different boundary identifiers for different parts. Basically the "part" consists of the starting boundary, the type/encoding/etc headers, the content, and the ending boundary, as near as I can tell. Sounds as if you had it right. |
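A small illustration of the back reference point, as a sketch only: inside the pattern, "\1" means "whatever group 1 captured during this match attempt", whereas "$1" inside a pattern is interpolated before matching starts, so it holds whatever an earlier match left behind. The sub name between_repeats and the sample text are made up for the example.

```perl
use strict;
use warnings;

# Returns the text between a delimiter line and the next line that starts
# with the same delimiter, or undef if the delimiter never repeats.
sub between_repeats {
    my ($text) = @_;
    # (--\w+) captures the delimiter; \1 demands the very same text again.
    return $text =~ /^(--\w+)\n(.*?)^\1/ms ? $2 : undef;
}

my $sample = "--abc\nfirst\n--abc\nsecond\n--xyz\n";
print between_repeats($sample);   # prints "first" followed by a newline
```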
Good timing: I'd been fooling around for about fifteen minutes trying to nail down the details, and was just about to post what I imagine (will check in a moment) is pretty much exactly what you've got: that is...
perl -p -0 -e 's/^(--[^\n]*)\nContent\-Type:\s*text\/html(.*?)(?=\1)//msig' filename
(((Whir, whir, whir...))) OK, very similar to yours: mine uses a little bit of fancy regexpness -- the "positive lookahead operator", which is the (?=...) piece of the above puzzle. That is, the substitution "peeks forward" and only matches when there's a copy of the MIME boundary hanging around on the end, but doesn't "consume" the boundary. Same end result as what you've done in substituting it back in, but perhaps a little more elegant. And almost certainly less efficient, but it's still pretty much instantaneous, so who cares? Nice work, by the way, in hunting down the problems! Cheers, Paul |
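To see the lookahead at work, here is a minimal sketch that wraps the substitution above in a sub and runs it on a tiny two-part message. The sub name strip_html_parts, the boundary "b1" and the sample message are invented for the demo, and the sample assumes plain LF line endings; real mail may differ.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution quoted above.
sub strip_html_parts {
    my ($message) = @_;
    # (?=\1) checks that the boundary follows but leaves it in place.
    $message =~ s/^(--[^\n]*)\nContent-Type:\s*text\/html(.*?)(?=\1)//msig;
    return $message;
}

# Made-up message: one text/plain part, one text/html part, closing boundary.
my $message = join "\n",
    '--b1',
    'Content-Type: text/plain',
    '',
    'Hello',
    '--b1',
    'Content-Type: text/html',
    '',
    '<p>Hello</p>',
    '--b1--',
    '';

print strip_html_parts($message);
```

The text/html part disappears along with its opening boundary, while the closing "--b1--" line stays put because the lookahead never consumed it.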
Ah, the "non-greedy" quantifiers need to be in parentheses to work here! Must've missed that somehow. Will definitely use them, since "non-greediness" is critical to not mangling the mail with this.
One more good pointer...thanks, Paul. Brad |
Paul, one final note...your last version was correct. My script failed in some cases where the boundary was shared between two parts that were to be removed, presumably because the boundary I put back in was not getting matched as the start of a new section.
The lookahead operator seems to solve this problem. It's 2 AM...I'm off to bed. |
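The shared-boundary case can be reproduced with a sketch like the one below. The "consuming" version is only a guess at the consume-and-put-back approach described above (the actual script isn't shown in the thread), and both sub names and the sample message are invented; LF line endings are assumed.

```perl
use strict;
use warnings;

# Assumed reconstruction: match the trailing boundary, then put it back.
sub strip_consuming {
    my ($message) = @_;
    $message =~ s/^(--[^\n]*)\nContent-Type:\s*text\/html(.*?)\1/$1/msig;
    return $message;
}

# Lookahead version from earlier in the thread: the boundary is never consumed.
sub strip_lookahead {
    my ($message) = @_;
    $message =~ s/^(--[^\n]*)\nContent-Type:\s*text\/html(.*?)(?=\1)//msig;
    return $message;
}

# Two adjacent text/html parts sharing the boundary line between them.
my $two_html = "--b1\nContent-Type: text/html\n\n<p>one</p>\n"
             . "--b1\nContent-Type: text/html\n\n<p>two</p>\n"
             . "--b1--\n";

print "consuming:\n", strip_consuming($two_html);
print "lookahead:\n", strip_lookahead($two_html);
```

In the consuming version the first match swallows the shared boundary, so the /g scan resumes after it and the second part no longer starts at a boundary it can match. With the lookahead the scan resumes at the boundary itself, so both parts are removed.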