The macosxhints Forums

Simple grep question (http://hintsforums.macworld.com/showthread.php?t=4820)

osxpez 08-22-2002 11:22 AM

Quote:

Originally posted by vickishome
I'm able to do egrep foo|bar|baz textfile without the backslashes. In fact, when I add the backslashes, it doesn't work anymore. I first thought maybe it was because I chose to not use the extended option with grep, but egrep is supposed to already have the extended options.

Any idea why I don't need/can't use the backslashes?
No idea at all. That seems very strange! Maybe tcsh sees that "egrep foo" can't be a command and then decides that | doesn't have a special meaning any more and then hands egrep those backslashes as is... No, that would be too weird even for tcsh.

does:

egrep 'foo|bar|baz' textfile

work?

vickishome 08-22-2002 11:43 AM

Please don't tell me it's strange, let alone very strange! "Strange" seems to be my NMOO (normal mode of operation) with Unix. :p

This is what I get when I try egrep with the | and \|:
Code:

% cat test.txt
tech word
word support
technical word
word supportive
techsupport word
word techsupport
tech word support
technical word support
techhhhhh support
support tech
ez
x
technical word supportive

% egrep 'ez|x|techhhh' test.txt
techhhhhh support
ez
x

% egrep 'ez\|x\|techhhh' test.txt
%

Notice that it works without the backslashes, but when I add the backslashes, I just get another % prompt with no error messages or results.

pmccann 08-22-2002 12:06 PM

But that's *fine*! I think I see what's going on here: Doug backslashed his "|" symbols because he didn't quote the whole regular expression. That backslashing prevented the shell from thinking they were pipes (and thus chucking a fit about not being able to find the command on the right hand side of the first pipe, or maybe offering a strange substitution). Quoting your expression has the same effect: *egrep* gets to see the "|" symbols instead of the greedy old shell grabbing them in transit.

When you wrote above that your egrep worked *without quotes and without backslashes* it certainly was "interesting". But if you were quoting all the way, just invisibly in the post that's about 3 above this one, then the world is at peace.

When you write

% egrep 'ez\|x\|techhhh' test.txt

you're asking for occurrences of the literal text 'ez|x|techhhh'. In other words, you've "double-negativized" the "|" symbol, so that it's interpreted literally.

I hope that's close, anyway. Seems to pass my tests!

Cheers,
Paul

osxpez 08-22-2002 12:07 PM

Ah, but that's not strange at all! (Strange is far from NMOO with Unix BTW).

Notice that stetner said:
Code:

egrep foo\|bar\|baz textfile
Whereas you do:
Code:

egrep 'foo\|bar\|baz' textfile
stetner's example gives egrep "foo|bar|baz" as the regexp. But since you single-quote the regexp, you feed egrep "foo\|bar\|baz", which tells egrep to disregard the vertical bar's function as the alternate pattern delimiter. So you search literally for "foo|bar|baz", which doesn't match any line in your file. For it to work you need to either write it like stetner suggested or like I suggested:
Code:

egrep 'foo|bar|baz' textfile
Which gives you less backslashtithis. :)
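
To make the difference concrete, here is a rough sketch of what the shell hands over as the pattern argument for each of the three spellings. It uses printf in place of egrep purely to echo the argument back, and it assumes a POSIX-style sh rather than tcsh. The variable names are made up for illustration.

```shell
# printf '%s\n' prints its argument exactly as the shell delivered it,
# so it stands in for egrep here.

# Unquoted, backslashed: the shell strips the backslashes.
unquoted_backslashed=$(printf '%s\n' foo\|bar\|baz)

# Single-quoted, no backslashes: passed through untouched.
quoted_plain=$(printf '%s\n' 'foo|bar|baz')

# Single-quoted AND backslashed: the backslashes survive and reach the command.
quoted_backslashed=$(printf '%s\n' 'foo\|bar\|baz')

printf '%s\n' "$unquoted_backslashed" "$quoted_plain" "$quoted_backslashed"
```

The first two spellings deliver the same argument; only the third one lets the backslashes through to the command.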

vickishome 08-22-2002 01:20 PM

Ahhhh! Okay, the light just switched on. :) I didn't even notice the absence of the quotes in stetner's message. I just did a copy/paste from his message when writing mine so the absent quotes copied/pasted right along with it. Using quotes is just second nature to me (I've always used quotes when allowed, even if not required). I will be more watchful for the variations of quotes/absent quotes in the future.

Okay, mystery solved. Thanks! :)

osxpez 08-22-2002 03:58 PM

Vicki: I just have to confuse you a little bit more. If you had done your double quoting (single quotes plus backslashes) using grep instead of egrep, like:
Code:

grep 'foo\|bar\|baz' textfile
... things would have worked. This is because grep's regular expressions regard \| as the alternate pattern delimiter. It works with \?, \+, \{, \|, \(, and \) as well.
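
A small sketch of that, counting matching lines on a few lines borrowed from Vicki's test.txt. It assumes GNU grep (the \| form in plain grep is an extension there) and uses grep -E as the equivalent of egrep. The temp file and variable names are hypothetical.

```shell
# Scratch copy of a few lines from the test.txt shown earlier in the thread.
tmp=$(mktemp)
printf '%s\n' 'tech word' 'techhhhhh support' 'ez' 'x' 'support tech' > "$tmp"

# grep (basic syntax): \| acts as alternation in GNU grep.
bre_escaped=$(grep -c 'ez\|x\|techhhh' "$tmp" || true)

# grep -E (extended syntax, as egrep uses): a bare | is alternation.
ere_plain=$(grep -Ec 'ez|x|techhhh' "$tmp" || true)

# grep -E with backslashes: \| is a literal bar, so no line matches.
# grep exits non-zero on zero matches, hence the "|| true".
ere_escaped=$(grep -Ec 'ez\|x\|techhhh' "$tmp" || true)

printf 'grep \\|: %s, grep -E |: %s, grep -E \\|: %s\n' \
  "$bre_escaped" "$ere_plain" "$ere_escaped"
rm -f "$tmp"
```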

mervTormel 08-22-2002 05:38 PM

greep is creepy. i prefer grap or grop, grup even.

Craig R. Arko 08-22-2002 10:23 PM

Very mysterious and ooky, if you ask me. ;)

My only possible contribution to this thread:

grep stands for General Regular Expression Parser, IIRC.

Back to read and learn something mode. :cool:

vickishome 08-22-2002 10:38 PM

Quote:

Originally posted by osxpez
Vicki: I just have to confuse you a little bit more. If you had done your double quoting using grep instead of egrep, like:

grep 'foo\|bar\|baz' textfile

... things would have worked.
Ah, why not... confuse me all you want. :D

Actually, I understood what you said. But I had no idea you could do that with grep (multiple search)! I thought you had to use egrep for a multiple search! I learn so many things from you guys! :)

osxpez 08-23-2002 02:59 AM

Vicki: With GNU grep there's actually no difference between grep's and egrep's regular expression engines. It's just the parsing of those "extended" features that is different. With grep you have to put a backslash in front of those operating characters to get them to act special, with egrep you don't. With egrep you have to put a backslash in front of those characters to switch their special meaning off, with grep you don't.
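
The mirror-image rule can be sketched with the + operator. This again assumes GNU grep, with grep -E standing in for egrep. The two sample lines and the variable names are invented for the example.

```shell
# Two sample lines: one that needs "+" as an operator, one with a literal "+".
sample=$(printf '%s\n' 'aab' 'a+b')

# grep: the backslash switches the operator ON; a bare + is a literal plus.
bre_operator=$(printf '%s\n' "$sample" | grep 'a\+b')
bre_literal=$(printf '%s\n' "$sample" | grep 'a+b')

# grep -E: a bare + is the operator; the backslash switches it OFF.
ere_operator=$(printf '%s\n' "$sample" | grep -E 'a+b')
ere_literal=$(printf '%s\n' "$sample" | grep -E 'a\+b')

printf '%s\n' "$bre_operator" "$bre_literal" "$ere_operator" "$ere_literal"
```

In each pair the same two lines are found, just with the backslash on opposite sides.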

Craig: According to "Mastering Regular Expressions" by Jeffrey Friedl, grep got its name from a common operation in the ed editor:

:g/Regular Expression/p

Which can be read as Global Regular Expression Print, and it was so popular that a standalone utility, grep, was created for it. I don't know what sources Friedl has for this particular info, but I find it more plausible than "General Regular Expression Parser". Because grep does so much more than just parse the regexp and even so, what's "general" about grep?

And my suggestions for names to Paul's "egrep -i" alias are: griffin or grip. The last one is the Swedish word for griffin (which is a mythical creature combining the bodies of three animals). In Swedish grip is pronounced greep. :)

the_shrubber 08-23-2002 03:14 AM

kudos
 
not adding any tech info, but just wanted to extend a kudos to vicki for the investigative attitude that will turn any newbie into a guru in no time (well, actually, lots of time but eventually people start thinking you can do magic [not that i would know personally])

if only we could replace all the "MS Word in 21 days" books/classes with a "How to approach computers" or "How to approach computer software" books/classes, geeks might actually start shutting up about "lusers" of the world.

vickishome 08-23-2002 06:55 AM

You just made my day! Thank you! :)

stetner 08-23-2002 07:23 AM

Quote:

Originally posted by pmccann
Another one of those duplicate binaries that occur in osx:

% ls -l `which grep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:21 /usr/bin/grep
% ls -l `which egrep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:22 /usr/bin/egrep


And so on for fgrep. That is, it's the same binary, but responds differently depending on the name by which it's called.
I was going to post the same thing last night Paul. But I checked first :D
Code:

% pwd
/usr/bin
% ls -l [ef]grep grep
-rwxr-xr-x    1 root    wheel      105548 Aug  3 10:17 egrep
-rwxr-xr-x    1 root    wheel      105548 Aug  3 10:17 fgrep
-rwxr-xr-x    1 root    wheel      105548 Aug  3 10:17 grep
% cmp grep egrep     
grep egrep differ: char 508, line 1
% cmp egrep fgrep
egrep fgrep differ: char 99104, line 341
% cmp fgrep grep
fgrep grep differ: char 508, line 1
%

Usually they are the same binary and in fact are 'hard' linked to the same binary. But not in this case. *shrug*
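
For anyone wanting to tell the cases apart, here is a minimal sketch on scratch files rather than the real binaries. The file names are hypothetical, and it assumes a shell whose test builtin supports -ef (bash, dash and ksh do).

```shell
# Scratch files standing in for grep/egrep/fgrep.
dir=$(mktemp -d)
printf 'same contents\n' > "$dir/orig"
ln "$dir/orig" "$dir/hardlink"   # second name for the same file (same inode)
cp "$dir/orig" "$dir/copy"       # separate file with identical bytes

# cmp -s is silent and only sets the exit status: 0 means identical bytes.
if cmp -s "$dir/orig" "$dir/copy"; then same_bytes=yes; else same_bytes=no; fi

# -ef is true when both names refer to the same underlying file.
if [ "$dir/orig" -ef "$dir/hardlink" ]; then linked=yes; else linked=no; fi
if [ "$dir/orig" -ef "$dir/copy" ]; then copy_linked=yes; else copy_linked=no; fi

printf 'copy has same bytes: %s, hardlink is same file: %s, copy is same file: %s\n' \
  "$same_bytes" "$linked" "$copy_linked"
rm -rf "$dir"
```

A duplicated binary would look like the copy here (same bytes, different file), while the listing above is a third case: same size, different bytes.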

stetner 08-23-2002 07:33 AM

Quote:

Originally posted by Craig R. Arko
grep stands for General Regular Expression Parser, IIRC.

Back to read and learn something mode. :cool:
Hmmm, I am sure it was Global Regular Expression Print. You see back in the old days of ex and decwriters you would do a

:g/reg exp/p

to print out all lines with the reg exp in it.

I will just go back to cleaning my dentures now..... :)

vickishome 08-23-2002 07:50 AM

My book says it's global...
Quote:

After laborious research and countless hours debating with Unix developers, I am reasonably certain that the derivation of the name grep is as follows:
Before this command existed, Unix users would use a crude line-based editor called ed to find matching text. As you know, search patterns in Unix are called regular expressions. To search throughout a file, the user prefixed the command with global. After a match was made, the user wanted to have it listed to the screen with print. To put it all together, the operation was global/regular expression/print. That phrase was pretty long, however, so users shortened it to g/re/p. Thereafter, when a command was written, grep seemed to be a natural, if an odd and confusing, name.
Ducking and running now... :p

Craig R. Arko 08-23-2002 08:01 AM

I stand corrected. :)

It seems to be an error of my generation. ;)

sao 08-23-2002 08:05 AM

Code:

grep

 great reliable enormous potato  :D


Cheers...

pmccann 08-24-2002 01:46 AM

Quote:

Originally posted by stetner
I was going to post the same thing last night Paul. But I checked first :D
Yeah, go on, yuck it up!

Quote:

Usually they are the same binary and in fact are 'hard' linked to the same binary. But not in this case. *shrug*
Damn, damn, damn: I just checked most of the others quoted in my post above, and all but the relevant binaries (surprise!) seem to be as I claimed! That is, the same but duplicated. *Not linked either.* Very weird indeed.

Cheers,
Paul

osxpez 08-24-2002 04:44 AM

Quote:

Originally posted by pmccann
That is, the same but duplicated. *Not linked either.* Very weird indeed.
Maybe they are links in the original repository and become copies somewhere in the packaging or unpackaging?


All times are GMT -5.
