The macosxhints Forums

vickishome 08-21-2002 02:42 PM

Simple grep question
 
What is the difference between these two commands?

grep 'zz*' filename

grep z filename

sbur 08-21-2002 02:49 PM

operationally, there is no difference. The * means that the preceding character can be matched zero or more times. So, any occurrence of "z" will be flagged by both queries. If you want a "zz" specifically, leave off the *.
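
A minimal sketch to check this claim, assuming a scratch file with made-up contents (the file and its four lines are hypothetical, not from the book): both commands should select exactly the same lines.

```shell
# Scratch file with hypothetical contents: three lines contain z, one does not.
tmp=$(mktemp)
printf '%s\n' 'pizza' 'zebra' 'apple' 'jazz' > "$tmp"

# Lines selected by each pattern.
plain=$(grep z "$tmp")
starred=$(grep 'zz*' "$tmp")

printf 'grep z:\n%s\n' "$plain"
printf "grep 'zz*':\n%s\n" "$starred"

# Compare the two results line for line.
if [ "$plain" = "$starred" ]; then same=yes; else same=no; fi
echo "same lines: $same"

rm -f "$tmp"
```

The patterns differ only in how much of each line they match, which does not matter when grep is simply printing whole lines.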

vickishome 08-21-2002 02:56 PM

No difference - that's what I thought.

I was confused because the book I'm reading said you had to use grep 'zz*' filename to find lines that have "one or more z's" which is what I thought grep z filename would do. I couldn't figure out what the difference was as they appeared to do the same thing in my mind.

Thanks for confirming it for me. :)

vickishome 08-21-2002 03:00 PM

Actually, I have another question. There had to be a point the book was trying to make.

Can someone give me an example where it is preferred (or required) to use the syntax grep 'zz*' filename instead of grep z filename? Or is it just a case of semantics?

vickishome 08-21-2002 03:08 PM

And while I'm at it... what's the point of searching for "zero or more" of anything? Zero would mean the search string is not present. More than zero would mean it is present. Wouldn't that always be the case? Either it's there or it's not there!

I'm completely missing the entire purpose of 'zz*'. I can't think of a single reason for having it there so it has me confused and baffled.

bakaDeshi 08-21-2002 03:58 PM

It would probably help if you thought of an example with substitution. I know you're just starting out. Example:

I have a word in a file called "zoolander".
If I do a search for 'z' and substitute it with 'A', the result would be "Aoolander".

If I do a search for 'zz*' and substitute it with 'A', the result would be "A".

So when you search for 'z', it locates zoolander.

And when you search for 'zz*', it locates zoolander.

Clearer???

HTH
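
This one is worth checking by hand. A small sketch using sed for the substitution (sed is an assumption here, the post does not say which tool was meant): with a single z, as in "zoolander", both patterns replace the same text, and the two only diverge on a run of z's. The word "zzzoolander" is made up for the test.

```shell
# With a single z, both patterns replace the same one character.
one_plain=$(echo 'zoolander' | sed 's/z/A/')
one_star=$(echo 'zoolander' | sed 's/zz*/A/')

# With a run of z's, z replaces only the first one,
# while zz* swallows the whole run.
run_plain=$(echo 'zzzoolander' | sed 's/z/A/')
run_star=$(echo 'zzzoolander' | sed 's/zz*/A/')

echo "single z:  $one_plain  $one_star"
echo "run of z:  $run_plain  $run_star"
```

So 'zz*' reads as "a z, followed by as many more z's as there are", not "a z followed by anything".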

vickishome 08-21-2002 05:02 PM

Hmm... I think I might be following you here. Let me see if I understood you.

This line, grep 'zz*' filename is saying:

Find words that begin with the letter z and have anything after it.

Which is the same as a ls z* (except we're doing a search for characters/words within the file).

Is this right?

Kemul 08-21-2002 05:25 PM

Hello,

This interests me. So I tried something on the terminal:

Code:

% cat test.txt
abcdefghi
aabcdefghi
aaabcdefghi
aaaaaaaaaaaaaaaaaaa

% grep a test.txt
abcdefghi
aabcdefghi
aaabcdefghi
aaaaaaaaaaaaaaaaaaa

% grep 'aa' test.txt
aabcdefghi
aaabcdefghi
aaaaaaaaaaaaaaaaaaa

% grep 'aa*' test.txt
abcdefghi
aabcdefghi
aaabcdefghi
aaaaaaaaaaaaaaaaaaa

So, I think if you run grep with 'something', then the things inside the ' ' (single quotes) are treated as regular expressions. Without the ' ' means find 1 or more occurrences of that string. * means find 0 or more occurrences of the string. In my example, aa* means find 'a' followed by 0 or more 'a's.
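
A quick check on the quoting part, as a sketch with a hypothetical scratch file: the single quotes are consumed by the shell before grep ever starts, so grep treats the pattern as a regular expression either way. The quotes matter only because they stop the shell from touching characters such as * itself.

```shell
tmp=$(mktemp)
printf '%s\n' 'abcdefghi' 'aabcdefghi' 'xyz' > "$tmp"

# Same pattern with and without quotes: grep receives "aa" both times.
unquoted=$(grep aa "$tmp")
quoted=$(grep 'aa' "$tmp")

# printf shows what a program actually receives: the quotes are gone,
# and the * arrived intact because the quotes protected it from the shell.
received=$(printf '%s' 'aa*')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "received: $received"

rm -f "$tmp"
```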

This makes me feel like taking a unix class test 2 years ago hehe... :p

vickishome 08-21-2002 06:06 PM

Nope, this is not working as I understood bakaDeshi to say. The results of 'v' and 'vz*' are exactly the same. Look at this:
Code:

% cat test.txt
violets test beginning of line
test violets middle of line
test xviolets in the middle of a word
this line does not contain the character in question

% grep 'v' test.txt
violets test beginning of line
test violets middle of line
test xviolets in the middle of a word

% grep 'vz*' test.txt
violets test beginning of line
test violets middle of line
test xviolets in the middle of a word

I thought bakaDeshi was trying to say it was a way of finding words that began with a specific character, but that's not what my little test shows to be happening. The last grep command still brings up "xviolets" even though "v" is not at the beginning of the word.

We can use '^z' to find lines that begin with a specific character, but not words within the line. Grepping (is that a verb?) for ' v' will find words beginning with v if we assume that the word is prefixed with a space, but that's not always going to be the case (what if the word has a parenthesis before it?). Further, ' v' would not find words starting with 'v' that are in the beginning of the line (but we could combine the search with '^v' if needed).

I see absolutely no difference between using 'v' and 'vz*'. Which means I continue to see absolutely no purpose for the syntax 'z*' (other than to confuse me. :p).
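
For the "words that begin with v" problem, one possible sketch uses an extended regular expression that accepts either the start of the line or any non-word character before the v. The extra "(violets)" line is a made-up addition to the test file. GNU grep also has a \< word-start anchor for this, but that is an extension, so the portable spelling is used here.

```shell
tmp=$(mktemp)
printf '%s\n' \
    'violets test beginning of line' \
    'test violets middle of line' \
    'test xviolets in the middle of a word' \
    '(violets) in parentheses' > "$tmp"

# v preceded by start of line, or by a character that is not a letter,
# digit, or underscore.
word_start=$(grep -E '(^|[^[:alnum:]_])v' "$tmp")
printf '%s\n' "$word_start"

rm -f "$tmp"
```

The "xviolets" line drops out because the character before its v is a letter.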

vickishome 08-21-2002 06:24 PM

Also look at what happens when the line has just one single character (notice the last line just has a v).
Code:

% cat test.txt
violets test beginning of line
test violets middle of line
test xviolets in the middle of a word
this line does not contain the character in question
v

% grep 'v' test.txt
violets test beginning of line
test violets middle of line
test xviolets in the middle of a word
v

% grep 'v.' test.txt
violets test beginning of line
test violets middle of line
test xviolets in the middle of a word

% grep 'vz*' test.txt
violets test beginning of line
test violets middle of line
test xviolets in the middle of a word
v

When I grep for the character 'v' followed by any character ('.'), the last line is not found. That should indicate that there are no characters after the v in that line. But grepping for 'vz*' still finds that line exactly the same way in which grepping for 'v' does (because there are 0 occurrences of the character "z" after the "v" in that line).

Am I just being dense here? Why would Unix have a search command that looks for zero or more of anything?

vickishome 08-21-2002 06:48 PM

So I look for answers on the web, and think I found it.

http://advisor.uchicago.edu/docs/unix/reg-exp.html
Quote:

The asterisk. Any one-character expression followed by an asterisk will match that character zero or more times. The regular expression "his*" matches the strings "his," "hiss," or "hissssssssssss" -- but also "hi" (an `h,' followed by an `i,' followed by zero `s' characters).

grep `tech*support' mbox

would locate references to "techsupport" as well as "technical advisors and support staff," so long as "tech" preceded "support" on the same line.
That sounds good. So I test it out. It's not working that way when I test it!

Code:

% cat test.txt
tech word
word support
technical word
word supportive
techsupport word
word techsupport
tech word support
technical word support
technical word supportive

% grep 'tech*support' test.txt
techsupport word
word techsupport

% grep 'techz*support' test.txt
techsupport word
word techsupport

So why didn't it find the other lines as the info I found claimed it would?

And I thought this was going to be a very simple question to answer when I posted this thread. :(

Kemul 08-21-2002 07:19 PM

Welcome to the world of regex!

I'm in confusion too now... :D

Try putting [] inside the ' '. So, using your example, I tried
Code:

grep '[tech*support]' test.txt
and I get every line.

mervTormel 08-21-2002 08:02 PM

well, some documentation suffers from inaccuracies. always be skeptical of docs, and look for validation and corroboration in the exercise.

in regular expressions, * is used to match zero or more occurrences of the preceding regexp, which is typically a single char

so, grep "tech*support" will match "tec" and zero or more of the letter 'h' followed by "support"

what you need is the dot metachar in there to wildcard 'any single char'

Code:

$ grep "tech.*support" foo
techsupport word
word techsupport
tech word support
technical word support
technical word supportive

read "match 'tech' and 'any char zero or more times' and 'support'"

mervTormel 08-21-2002 08:30 PM

Quote:

Why would Unix have a search command that looks for zero or more of anything?
remember, regular expressions describe patterns. oft times we don't/can't know the data intimately and we're not concerned with the literals, but the results of the search for patterns.

the notion of finding 'something-anything-somethingElse' could not be accurately accomplished without the notion of 'anything' containing zero or more occurrences of a regexp. right? because 'anything' could be nothing (zero occurrences) and that makes our pattern search work.
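
A small sketch of that idea, with made-up one-word lines: the starred character is an optional middle between two fixed ends, and "zero occurrences" is what lets the shortest line match.

```shell
tmp=$(mktemp)
printf '%s\n' 'ac' 'abc' 'abbbc' 'adc' > "$tmp"

# a, then zero or more b, then c: the bare "ac" counts.
zero_or_more=$(grep '^ab*c$' "$tmp")

# a, then one b, then zero or more b, then c: the same trick as zz*,
# which is how "one or more" is spelled in basic regular expressions.
one_or_more=$(grep '^abb*c$' "$tmp")

echo "ab*c:"
printf '%s\n' "$zero_or_more"
echo "abb*c:"
printf '%s\n' "$one_or_more"

rm -f "$tmp"
```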

DFrakes 08-21-2002 09:12 PM

Quote:

grep `tech*support' mbox

would locate references to "techsupport" as well as "technical advisors and support staff," so long as "tech" preceded "support" on the same line.
The above is actually inaccurate

grep "tech*support"

would look for any string "tec_support" where _ is zero or more instances of the character h (the asterisk follows the "h" and there is no space).

Thus, while "techsupport" and "techhhhhhsupport" would be found, "tech support" and "technical advisors and support" would not.

As merv pointed out, you want to search for the entire phrase "tech", followed by any characters any number of times, followed by the phrase "support" -- thus you would want to use

grep "tech.*support"

where . is a wildcard (so .* would look for any character zero or more times)
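
To see both patterns side by side, here is a sketch against a hypothetical file that includes the "tecsupport" and "techhhhsupport" cases described above:

```shell
tmp=$(mktemp)
printf '%s\n' \
    'tecsupport' \
    'techsupport' \
    'techhhhsupport' \
    'tech support' \
    'technical advisors and support staff' > "$tmp"

# "tec", zero or more h, then "support" with nothing in between.
star_only=$(grep 'tech*support' "$tmp")

# "tech", any characters at all (possibly none), then "support".
dot_star=$(grep 'tech.*support' "$tmp")

echo "tech*support:"
printf '%s\n' "$star_only"
echo "tech.*support:"
printf '%s\n' "$dot_star"

rm -f "$tmp"
```

Note that "tecsupport" is found only by the first pattern, since the second insists on a literal "tech".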

vickishome 08-21-2002 09:41 PM

Quote:

what you need is the dot metachar in there to wildcard 'any single char'
Finally, something that works as advertised and makes logical sense to me! :D
Code:

% grep 'tech.*support' test.txt
techsupport word
word techsupport
tech word support
technical word support
technical word supportive

That's actually quite a nifty little trick to have!

The rest of what you guys wrote also made sense to me. I was able to reproduce everything the way it was presented here. It helps to see it work in action. Thanks to everyone for helping! You guys are great! :)

vickishome 08-21-2002 09:52 PM

Quote:

I'm in confusion too now... :D

Try putting [] inside the ' '. So, using your example, I tried
Code:

grep '[tech*support]' test.txt
and I get every line.
Yes, because as I understand it (which isn't saying a lot :D), using [] turns the search string into a search for the individual characters. It's saying "find any line that has a t or e or c or zero or more occurrences of h or s yadda, yadda." So not only will that grep find all the lines in the test.txt file I posted earlier, it would also find a line that just has the single character e, for example. It will also find a line that has "tech support" backwards as support tech. That's actually a whole different type of search.

vickishome 08-21-2002 10:16 PM

BTW, Merv, remember our previous discussion regarding pipes? By golly, I think I'm finally getting that one down. I figured this one out on my own.
Code:

% cat test.txt
tech word
word support
technical word
word supportive
techsupport word
word techsupport
tech word support
technical word support
techhhhhh support
support tech
ez
x
technical word supportive

% grep 'tech' test.txt | grep 'support'
techsupport word
word techsupport
tech word support
technical word support
techhhhhh support
support tech
technical word supportive

It worked! There may be hope for me after all. :D
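
One difference worth noticing, sketched on a three-line subset of that file: the pipe only asks that both words appear somewhere on the line, while 'tech.*support' also requires "tech" to come first.

```shell
tmp=$(mktemp)
printf '%s\n' 'tech word support' 'support tech' 'tech word' > "$tmp"

# Two filters in a row: order of the words on the line does not matter.
piped=$(grep 'tech' "$tmp" | grep 'support')

# One pattern: "tech" must appear before "support".
ordered=$(grep 'tech.*support' "$tmp")

echo "piped:"
printf '%s\n' "$piped"
echo "ordered:"
printf '%s\n' "$ordered"

rm -f "$tmp"
```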

pmccann 08-21-2002 11:53 PM

Hi Vicki,

I don't really want to rain on your parade, but "metacharacters" (the beasties such as *, +, . that have special meanings within a regular expression) are generally stripped of that special meaning when used inside a "character class", which is what the square brackets produce.

Suppose we have:

% cat starry
one star *
no star
my star
your *
something plus a +

Then observe the following:

% grep '[h*]' starry
one star *
your *

% grep '[k+]' starry
something plus a +

(That is; if the h* was interpreted as "zero or more h's" you'd get everything from the first grep, and if k+ was "one or more k's" you'd get nothing out of the second grep. As it is, you get the lines that match the * and the + literally.)
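
The same point applied to the earlier '[tech*support]' pattern, as a sketch with made-up lines: the brackets match any one character from the set, and the * inside is just another member of that set.

```shell
tmp=$(mktemp)
printf '%s\n' 'e' '*' 'x' 'support tech' > "$tmp"

# Any line containing at least one of: t e c h * s u p o r
in_class=$(grep '[tech*support]' "$tmp")
printf '%s\n' "$in_class"

rm -f "$tmp"
```

The lone "x" is the only line left out, because x is not one of the listed characters.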

Hope that makes sense.

Cheers,
Paul

vickishome 08-22-2002 12:01 AM

Quote:

I don't really want to rain on your parade...
Rain on me all you want! I always appreciate the help. :)

I thought you had to use the \ to turn the metacharacters into literal characters. I didn't consider how the [] would change that. Thanks for pointing that out.
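
Both approaches work for a literal character. A minimal sketch with a hypothetical three-line file, showing that a backslash and a one-character bracket expression find the same lines:

```shell
tmp=$(mktemp)
printf '%s\n' 'one star *' 'no star' '2*3' > "$tmp"

# Backslash: \* means a literal asterisk.
with_backslash=$(grep '\*' "$tmp")

# Bracket expression: [*] is a set containing only the asterisk.
with_brackets=$(grep '[*]' "$tmp")

echo "backslash:"
printf '%s\n' "$with_backslash"
echo "brackets:"
printf '%s\n' "$with_brackets"

rm -f "$tmp"
```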

pmccann 08-22-2002 12:36 AM

Note to self (and maybe to others?). The basic "grep" command, which invokes "basic regular expressions" is **really** basic. No "+" metacharacter, no "?" metacharacter (for "0 or 1"), and so on. As mT mentioned in a parallel thread recently, setting your "GREP_OPTIONS" environment variable is one way to get around this, so that you have "extended regular expressions" available by default.

For tcsh users (if you don't know what I'm talking about here: **that's you!!**)

setenv GREP_OPTIONS "--extended-regexp"

Chuck this into your .login file[*] in your home directory (and make one if you haven't got one). Then simply "source .login" and you're armed with extended regular expressions in grep. Hey, this seems to work OK for bash as well. Strike one up for using .login instead of a shell specific file.

Merv also mentioned that he was using a few other options for grep. See the thread concerned if you're interested (and "man grep"):

http://forums.macosxhints.com/showth...ht=GREPOPTIONS

Cheers,
Paul
[*] Aaargh, I fear another "why not use /usr/share/init/tcsh/..." thread coming on. Rest assured, I'll shut up. Really. Well, maybe. Not even "maybe"? Must be the weather... Now don't be a baby. Apologies for any horrid muzak that's leapt into readers' minds. 'twas a *nasty* thing to do to you.

osxpez 08-22-2002 01:47 AM

I didn't know about $GREP_OPTIONS. Couldn't that confuse scripts that depend on grep not using extended regexps? The old school way would be to use egrep for that.

scaryfish 08-22-2002 02:01 AM

Yeah that really had me confused for a while. I was trying to search for any file with the suffix .img.<number>
eg
.img.3
or
.img.8
but not
.img.12
so I had grep '\.img\.[0-9]{1}'
ie. search for any .img with a single digit number afterwards. It never produced any results - because basic grep keeps the { as literals - had to change them to \{ and \} to get it to work.
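
A sketch of the brace forms, using hypothetical file names as the input lines. It also suggests that the pattern needs a trailing $ to really exclude .img.12, since "one digit" is otherwise satisfied by the 1 in 12.

```shell
tmp=$(mktemp)
printf '%s\n' 'disk.img.3' 'disk.img.8' 'disk.img.12' > "$tmp"

# Basic grep with plain braces: { and } are literal characters, so
# nothing matches ("|| true" keeps the no-match status from aborting).
literal_braces=$(grep '\.img\.[0-9]{1}' "$tmp" || true)

# Basic grep with escaped braces: one digit, but 12 still matches on its 1.
unanchored=$(grep '\.img\.[0-9]\{1\}' "$tmp")

# Anchored at end of line: exactly one digit after .img.
basic=$(grep '\.img\.[0-9]\{1\}$' "$tmp")

# Extended syntax: the braces work without backslashes.
extended=$(grep -E '\.img\.[0-9]{1}$' "$tmp")

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$basic"

rm -f "$tmp"
```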

BTW, I was just trying to find a way of using the command line to concatenate split files in order. Turns out I can just go
cat *.img.[0-9] *.img.[0-9][0-9] > output.img
and I don't even need to use grep or anything.

pmccann 08-22-2002 03:34 AM

Quote:

Originally posted by osxpez
I didn't know about $GREP_OPTIONS. Couldn't that confuse scripts that depend on grep not using extended regexps? The old school way would be to use egrep for that.
Yeah, sure: if you run scripts with your environment in place the options could do nasty things. Swings and roundabouts, but there are certainly places where falling off could hurt. sudo being one glaring example. (For those not following the rambling thoughts here: the problem is that when you run processes with elevated privileges via sudo you "bring along" your environment for the superhero's journey. Including any GREP_OPTIONS settings that you might have set.) Maybe an alias would be a safer way to implement some of this: say...

alias greep 'grep --extended-regexp --ignore-case'

so that (extended regexps + case independence) is the default. The problem that raises is remembering what the thing's called. 'greep' has a nice mnemonic character... Thanks (again) for the wake-up call.
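
The alias line above is tcsh syntax. For bash and other Bourne-style shells, a rough equivalent is a small function; this is only a sketch, using the short -E and -i spellings of the same two options. As far as I know, newer GNU grep releases ignore GREP_OPTIONS altogether, which is one more argument for a wrapper like this over the environment variable.

```shell
# Hypothetical wrapper: extended regexps plus case-insensitive matching.
greep() {
    grep -E -i "$@"
}

tmp=$(mktemp)
printf '%s\n' 'Tech Support' 'techsupport' 'nothing here' > "$tmp"

# "?" (zero or one of the preceding character) is available because of -E,
# and the capitalised line matches because of -i.
hits=$(greep 'tech ?support' "$tmp")
printf '%s\n' "$hits"

rm -f "$tmp"
```

Because the function only exists in the interactive shell that defines it, scripts calling plain grep are unaffected.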

Cheers,
Paul

osxpez 08-22-2002 06:30 AM

But, what's wrong with using "egrep"? Or "egrep -i" if you want to disregard case. (You could always alias egrepi for "egrep -i").

vickishome 08-22-2002 07:08 AM

I just added setenv GREP_OPTIONS "--extended-regexp" to my ~/.login file. Now that I have some basics using grep, I'd like to see if I have any scripts calling grep that may be affected by this change.

Can someone tell me which dirs have scripts that run automatically? Or is there a good way to narrow down the search so I don't search my entire HD?

vickishome 08-22-2002 07:29 AM

I checked the following 4 dirs for 'grep':

/etc/
/usr/share/init/tcsh/
~/
~/Library/init/tcsh/


And I found these lines:

[share/init/tcsh]
aliases:alias word 'grep \!* /usr/share/dict/web2' # Grep thru dictionary
completions:alias list_all_hostnames 'grep -v "^#" /etc/hosts'
completions: 'n@-framework@`ls -1 ${framework_path} | grep .framework\$ | sed 's/\\.framework//' | uniq`@' \

[~/Library/init/tcsh]
aliases.mine:alias findit "ps ax | grep \!:1 | grep -v grep"


Some of the code is above my head so I'm not sure what they all do. Can someone tell me if the setenv will negatively affect anything?

mervTormel 08-22-2002 07:31 AM

the only scripts of yours that run automatically would be ones in your crontab, but i don't think they'll be run with your interactive login environment.

what you have to worry about are your ~/bin/ scripts (and /usr/local/) that you run that do greps in your interactive shell and make sure the regexps are extended regexp savvy or that your environment is clean of the GREP variables. i have a bash function 'zung' to toggle GREP variables in and out of existence when i think i might be rogue. grok?

btw, the grep --ignore-case switch is slightly more valuable than the others as it ignores case in both the source and the target.

vickishome 08-22-2002 07:35 AM

While looking through man grep, I found this option:
Code:

-E, --extended-regexp
Interpret PATTERN as an extended regular expression

Might it be better to not make the setenv change, but use the -E option when desired?

mervTormel 08-22-2002 07:49 AM

well, that's why the options are there. it's entirely up to you how to conduct your shell world. there are tradeoffs, upsides and downsides to every issue.

--
no doubt about it, there's two sides to every story.

osxpez 08-22-2002 07:49 AM

I feel totally invisible here! :)

Yes, I think "grep -E" is much better than tampering with the environment. But still "egrep" is there for these kinds of things. In the old days egrep used to be a different program than grep, one that handled extended regular expressions. But reading the grep man page now seems to indicate that egrep is actually a link to grep (or it may be vice versa) and that grep checks what name it was called by and then switches extended regexps on. Maybe a small history lesson could shed some light on this:

"grep" is short for "global regular expression print". That's what grep (without options) does; it globally applies the regular expression and then prints rows that match. My guess is that "egrep" stands for "extended global regular expression print". What the hell "fgrep" stands for is beyond me! :)

osxpez 08-22-2002 07:55 AM

I totally fail to see the downside with using egrep.

vickishome 08-22-2002 07:55 AM

Quote:

Originally posted by osxpez
I feel totally invisible here! :)
I see you! But that doesn't mean I understand what you (or anyone else) is saying half of the time. :)
Quote:

My guess is that "egrep" stands for "extended global regular expression print". What the hell "fgrep" stands for is beyond me! :)
Now I understand your comments about egrep. I think I'm going to pass on setting the setenv for now and use either -E or egrep when I want. When I understand the entire environment better, I may change my mind (by then, hopefully, I'll better understand the full impact on the setenv change). I'd like to stick to the K.I.S.S. principle while just starting out.

fgrep - I have no idea, but I took it as "file" grep. :)

osxpez 08-22-2002 08:04 AM

The f in fgrep stands for "fixed strings". The irony with fgrep is that it doesn't involve regular expressions. It's "fixed strings global regular expression print". If you, like me, enjoy geek humour then this should make you at least smile. :)

osxpez 08-22-2002 08:11 AM

One more thing. The grep man page on RH Linux says that egrep "is similar (but not identical) to grep -E". egrep is the "old school Unix" compatible one. The OS X man page didn't state this I think. It would be interesting to know how "egrep" and "grep -E" differ. Someone please enlighten me.

mervTormel 08-22-2002 08:23 AM

fgrep is fast grep, which it isn't, or fixed grep, because it doesn't accept metachars. egrep is your fastest grep today, me thinks.

from o'reilly's unix power tools: the old saw

unix beginners use grep because it's all they know about

intermediate users use fgrep because the manual says it's faster

advanced users use egrep because they've tried it

---

there are some timing tests here, and egrep beats even perl in both clock time and cpu usage.

fgrep has its uses; searching for literals, like *, it can save you some quoting.
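
A minimal sketch of the fixed-string behaviour, using grep -F (the option form of fgrep) and a made-up three-line file. Recent GNU grep versions print an obsolescence warning for the egrep and fgrep names, so the -E and -F spellings are probably the safer habit today.

```shell
tmp=$(mktemp)
printf '%s\n' 'a.b' 'axb' 'price * 2' > "$tmp"

# As a regular expression, the dot matches any character.
as_regex=$(grep 'a.b' "$tmp")

# As a fixed string, the dot is just a dot.
as_fixed=$(grep -F 'a.b' "$tmp")

# Searching for a literal asterisk needs no backslash with -F.
star=$(grep -F '*' "$tmp")

printf 'regex:\n%s\n' "$as_regex"
printf 'fixed: %s\n' "$as_fixed"
printf 'star:  %s\n' "$star"

rm -f "$tmp"
```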

i would doubt very much that egrep and grep -E have any difference in the OSX or GNU incarnations.

osxpez 08-22-2002 08:49 AM

Of course egrep is faster than Perl on regexp matching. Or most regexps anyway. deterministic regexp matching is most often faster than non-deterministic. That's why awk often is so much faster than Perl on raw regexp matching. But Perl needs its non-deterministic engine, because otherwise you couldn't use backreferencing. Gawk has special sub() functions that do non-deterministic. Clumsy, but at least the gawk programmer can trade speed for functionality.

stetner 08-22-2002 09:34 AM

The thing I like best about egrep is:

egrep foo\|bar\|baz textfile

which will find all occurrences of foo or bar or baz in the file (the back slashes are so the shell doesn't think you are trying to pipe).

vickishome 08-22-2002 11:01 AM

Quote:

Originally posted by stetner
egrep foo\|bar\|baz textfile

<snip> (the back slashes are so the shell doesn't think you are trying to pipe).
Newbie raising hand from the back of the class again.

I'm able to do egrep foo|bar|baz textfile without the backslashes. In fact, when I add the backslashes, it doesn't work anymore. I first thought maybe it was because I chose to not use the extended option with grep, but egrep is supposed to already have the extended options.

Any idea why I don't need/can't use the backslashes?

pmccann 08-22-2002 11:13 AM

Wow, it has got busy 'round here! Quite the most polite hornet's nest that I've ever had the pleasure of stirring!

On osx egrep and 'grep -E' are the same thing: first page of "man grep"

Quote:

egrep is the same as grep -E. fgrep is the
same as grep -F.
Another one of those duplicate binaries that occur in osx:

% ls -l `which grep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:21 /usr/bin/grep
% ls -l `which egrep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:22 /usr/bin/egrep


And so on for fgrep. That is, it's the same binary, but responds differently depending on the name by which it's called. Why they don't just link to the same thing (ie have egrep and fgrep as links to grep) I have no idea. Anyone? this same structure occurs in various other places as well. (compress/uncompress, batch/at/atrm/atq, merge/rcsdiff/rcsmerge, gunzip/gzip/zcat/gzcat, csh/tcsh, zsh/sh, tar/pax/cpio). Byte for byte the same (each of slashed alternatives), yet separate copies.

Having made that list I've forgotten what vital and entertaining information I'd intended to convey. Harumph. Just use egrep? Yeah, but that's no fun! I'm still on a campaign to convince the world that greep rocks.

"The change would be very subtle....It might take ten years or so....
Gradually his grep would change its shape....A more hooked nose...
Wider, thinner lips....Beady eyes....A larger forehead."

(A longer name, case insensitivity, better regexp flavour...)

Cheers,
Paul

osxpez 08-22-2002 11:22 AM

Quote:

Originally posted by vickishome
I'm able to do egrep foo|bar|baz textfile without the backslashes. In fact, when I add the backslashes, it doesn't work anymore. I first thought maybe it was because I chose to not use the extended option with grep, but egrep is supposed to already have the extended options.

Any idea why I don't need/can't use the backslashes?
No idea at all. That seems very strange! Maybe tcsh sees that "egrep foo" can't be a command and then decides that | doesn't have a special meaning any more and then hands egrep those backslashes as is... No, that would be too weird even for tcsh.

does:

egrep 'foo|bar|baz' textfile

work?

vickishome 08-22-2002 11:43 AM

Please don't tell me it's strange, let alone very strange! "Strange" seems to be my NMOO (normal mode of operation) with Unix. :p

This is what I get when I try egrep with the | and \|:
Code:

% cat test.txt
tech word
word support
technical word
word supportive
techsupport word
word techsupport
tech word support
technical word support
techhhhhh support
support tech
ez
x
technical word supportive

% egrep 'ez|x|techhhh' test.txt
techhhhhh support
ez
x

% egrep 'ez\|x\|techhhh' test.txt
%

Notice that it works without the backslashes, but when I add the backslashes, I just get another % prompt with no error messages or results.

pmccann 08-22-2002 12:06 PM

But that's *fine*! I think I see what's going on here: Doug backslashed his "|" symbols because he didn't quote the whole regular expression. That backslashing prevented the shell from thinking they were pipes (and thus chucking a fit about not being able to find the command on the right hand side of the first pipe, or maybe offering a strange substitution). Quoting your expression has the same effect: *egrep* gets to see the "|" symbols instead of the greedy old shell grabbing them in transit.

When you wrote above that your egrep worked *without quotes and without backslashes* it certainly was "interesting". But if you were quoting all the way, just invisibly in the post that's about 3 above this one, then the world is at peace.

When you write

% egrep 'ez\|x\|techhhh' test.txt

you're asking for occurrences of the literal text 'ez|x|techhhh'. In other words, you've "double-negativized" the "|" symbol, so that it's interpreted literally.

I hope that's close, anyway. Seems to pass my tests!

Cheers,
Paul

osxpez 08-22-2002 12:07 PM

Ah, but that's not strange at all! (Strange is far from NMOO with Unix BTW).

Notice that stetner said:
Code:

egrep foo\|bar\|baz textfile
Whereas you do:
Code:

egrep 'foo\|bar\|baz' textfile
stetner's example gives egrep "foo|bar|baz" as the regexp. But since you singlequote the regexp you feed egrep "foo\|bar\|baz" which tells egrep to disregard the vertical bar's function as the alternate pattern delimiter. So you search literally for "foo|bar|baz" which doesn't match any line in your file. For it to work you need to either write it like stetner suggested or like I suggested:
Code:

egrep 'foo|bar|baz' textfile
Which gives you less backslashtithis. :)

vickishome 08-22-2002 01:20 PM

Ahhhh! Okay, the light just switched on. :) I didn't even notice the absence of the quotes in stetner's message. I just did a copy/paste from his message when writing mine so the absent quotes copied/pasted right along with it. Using quotes is just second nature to me (I've always used quotes when allowed, even if not required). I will be more watchful for the variations of quotes/absent quotes in the future.

Okay, mystery solved. Thanks! :)

osxpez 08-22-2002 03:58 PM

Vicki: I just have to confuse you a little bit more. If you had done your double quoting using grep instead of egrep, like:
Code:

grep 'foo\|bar\|baz' textfile
... things would have worked. This is because grep's regular expressions regard \| as being the alternate pattern delimiter. It works with \?, \+, \{, \|, \(, and \) as well.
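
The quoting variants from the last few posts, collected into one sketch. The test lines are borrowed from the file above, plus one made-up line containing literal bar characters. Treat the last case with care: \| as alternation in plain grep is a GNU extension rather than something POSIX guarantees, so other greps may behave differently there.

```shell
tmp=$(mktemp)
printf '%s\n' 'techhhhhh support' 'ez' 'x' 'word support' 'ez|x|techhhh' > "$tmp"

# 1. Extended, quoted: grep -E sees the bars and treats them as "or".
quoted=$(grep -E 'ez|x|techhhh' "$tmp")

# 2. Extended, unquoted with backslashes: the shell removes the
#    backslashes, so grep -E receives the same pattern as in 1.
escaped=$(grep -E ez\|x\|techhhh "$tmp")

# 3. Extended, quoted AND backslashed: the backslashes reach grep -E,
#    which then looks for literal bar characters.
literal=$(grep -E 'ez\|x\|techhhh' "$tmp")

# 4. Basic grep with \| : alternation in GNU grep only, hence "|| true"
#    in case another grep finds nothing.
basic=$(grep 'ez\|x\|techhhh' "$tmp" || true)

printf '1 quoted:\n%s\n' "$quoted"
printf '2 escaped:\n%s\n' "$escaped"
printf '3 literal:\n%s\n' "$literal"
printf '4 basic:\n%s\n' "$basic"

rm -f "$tmp"
```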

mervTormel 08-22-2002 05:38 PM

greep is creepy. i prefer grap or grop, grup even.

Craig R. Arko 08-22-2002 10:23 PM

Very mysterious and ooky, if you ask me. ;)

My only possible contribution to this thread:

grep stands for General Regular Expression Parser, IIRC.

Back to read and learn something mode. :cool:

vickishome 08-22-2002 10:38 PM

Quote:

Originally posted by osxpez
Vicki: I just have to confuse you a little bit more. If you had done your double quoting using grep instead of egrep, like:

grep 'foo\|bar\|baz' textfile

... things would have worked.
Ah, why not... confuse me all you want. :D

Actually, I understood what you said. But I had no idea you could do that with grep (multiple search)! I thought you had to use egrep for a multiple search! I learn so many things from you guys! :)

osxpez 08-23-2002 02:59 AM

Vicki: With GNU grep there's actually no difference between grep's and egrep's regular expression engines. It's just the parsing of those "extended" features that is different. With grep you have to put a backslash in front of those operating characters to get them to act special, with egrep you don't. With egrep you have to put a backslash in front of those characters to switch their special meaning off, with grep you don't.

Craig: According to "Mastering Regular Expressions" by Jeffrey Friedl grep got its name from a common operation in the ed editor:

:g/Regular Expression/p

Which can be read as Global Regular Expression Print and it was so popular that a standalone utility, grep, was created for it. I don't know what sources Friedl has for this particular info, but I find it more plausible than "General Regular Expression Parser". Because grep does so much more than just parse the regexp and even so, what's "general" about grep?

And my suggestions for names to Paul's "egrep -i" alias are: griffin or grip. The last one is the Swedish word for griffin (which is a mythical creature combining the bodies of three animals). In Swedish grip is pronounced greep. :)

the_shrubber 08-23-2002 03:14 AM

kudos
 
not adding any tech info, but just wanted to extend a kudos to vicki for the investigative attitude that will turn any newbie into a guru in no time (well, actually, lots of time but eventually people start thinking you can do magic [not that i would know personally])

if only we could replace all the "MS Word in 21 days" books/classes with a "How to approach computers" or "How to approach computer software" books/classes, geeks might actually start shutting up about "lusers" of the world.

vickishome 08-23-2002 06:55 AM

You just made my day! Thank you! :)

stetner 08-23-2002 07:23 AM

Quote:

Originally posted by pmccann
Another one of those duplicate binaries that occur in osx:

% ls -l `which grep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:21 /usr/bin/grep
% ls -l `which egrep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:22 /usr/bin/egrep


And so on for fgrep. That is, it's the same binary, but responds differently depending on the name by which it's called.
I was going to post the same thing last night Paul. But I checked first:D
Code:

% pwd
/usr/bin
% ls -l [ef]grep grep
-rwxr-xr-x    1 root    wheel      105548 Aug  3 10:17 egrep
-rwxr-xr-x    1 root    wheel      105548 Aug  3 10:17 fgrep
-rwxr-xr-x    1 root    wheel      105548 Aug  3 10:17 grep
% cmp grep egrep     
grep egrep differ: char 508, line 1
% cmp egrep fgrep
egrep fgrep differ: char 99104, line 341
% cmp fgrep grep
fgrep grep differ: char 508, line 1
%

Usually they are the same binary and in fact are 'hard' linked to the same binary. But not in this case. *shrug*
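
For anyone wanting to tell "same bytes" from "same file", here is a sketch in a scratch directory. The three files are hypothetical stand-ins named after the real binaries, not the binaries themselves. The -ef test is supported by bash and most current shells, though it is not in older POSIX specifications.

```shell
dir=$(mktemp -d)
printf 'same bytes\n' > "$dir/grep"
ln "$dir/grep" "$dir/egrep"    # hard link: a second name for the same file
cp "$dir/grep" "$dir/fgrep"    # copy: identical bytes, separate file

# cmp -s prints nothing and only sets the exit status.
if cmp -s "$dir/grep" "$dir/fgrep"; then copy_same_bytes=yes; else copy_same_bytes=no; fi

# -ef is true when both names refer to the same file (same inode).
if [ "$dir/grep" -ef "$dir/egrep" ]; then link_same_file=yes; else link_same_file=no; fi
if [ "$dir/grep" -ef "$dir/fgrep" ]; then copy_same_file=yes; else copy_same_file=no; fi

echo "copy has same bytes: $copy_same_bytes"
echo "link is same file:   $link_same_file"
echo "copy is same file:   $copy_same_file"

# ls -li shows the inode number in the first column and the link count.
ls -li "$dir"

rm -rf "$dir"
```

Identical sizes in ls -l say nothing about either question, which is why cmp was the right check.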

stetner 08-23-2002 07:33 AM

Quote:

Originally posted by Craig R. Arko
grep stands for General Regular Expression Parser, IIRC.

Back to read and learn something mode. :cool:
Hmmm, I am sure it was Global Regular Expression Print. You see back in the old days of ex and decwriters you would do a

:g/reg exp/p

to print out all lines with the reg exp in it.

I will just go back to cleaning my dentures now..... :)

vickishome 08-23-2002 07:50 AM

My book says it's global...
Quote:

After laborious research and countless hours debating with Unix developers, I am reasonably certain that the derivation of the name grep is as follows:
Before this command existed, Unix users would use a crude line-based editor called ed to find matching text. As you know, search patterns in Unix are called regular expressions. To search throughout a file, the user prefixed the command with global. After a match was made, the user wanted to have it listed to the screen with print. To put it all together, the operation was global/regular expression/print. That phrase was pretty long, however, so users shortened it to g/re/p. Thereafter, when a command was written, grep seemed to be a natural, if an odd and confusing, name.
Ducking and running now... :p

Craig R. Arko 08-23-2002 08:01 AM

I stand corrected. :)

It seems to be an error of my generation. ;)

sao 08-23-2002 08:05 AM

Code:

grep

 great reliable enormous potato  :D


Cheers...

pmccann 08-24-2002 01:46 AM

Quote:

Originally posted by stetner
I was going to post the same thing last night Paul. But I checked first:D
Yeah, go on, yuck it up!

Quote:

Usually they are the same binary and in fact are 'hard' linked to the same binary. But not in this case. *shrug*
Damn, damn, damn: I just checked most of the others quoted in my post above, and all but the relevant binaries (surprise!) seem to be as I claimed! That is, the same but duplicated. *Not linked either.* Very weird indeed.

Cheers,
Paul

osxpez 08-24-2002 04:44 AM

Quote:

Originally posted by pmccann
That is, the same but duplicated. *Not linked either.* Very weird indeed.
Maybe they are links in the original repository and become copies somewhere in the packaging or unpackaging?


All times are GMT -5.