The macosxhints Forums

Simple grep question (http://hintsforums.macworld.com/showthread.php?t=4820)

pmccann 08-22-2002 12:36 AM

Note to self (and maybe to others?). The basic "grep" command, which invokes "basic regular expressions", is **really** basic. No "+" metacharacter, no "?" metacharacter (for "0 or 1"), and so on. As mT mentioned in a parallel thread recently, setting your "GREP_OPTIONS" environment variable is one way to get around this, so that you have "extended regular expressions" available by default.

For tcsh users (if you don't know what I'm talking about here: **that's you!!**)

setenv GREP_OPTIONS "--extended-regexp"

Chuck this into your .login file[*] in your home directory (and make one if you haven't got one). Then simply "source .login" and you're armed with extended regular expressions in grep. Hey, this seems to work OK for bash as well. Strike one up for using .login instead of a shell specific file.

Merv also mentioned that he was using a few other options for grep. See the thread concerned if you're interested (and "man grep"):

http://forums.macosxhints.com/showth...ht=GREPOPTIONS

Cheers,
Paul
[*] Aaargh, I fear another "why not use /usr/share/init/tcsh/..." thread coming on. Rest assured, I'll shut up. Really. Well, maybe. Not even "maybe"? Must be the weather... Now don't be a baby. Apologies for any horrid muzak that's leapt into readers' minds. 'twas a *nasty* thing to do to you.

osxpez 08-22-2002 01:47 AM

I didn't know about $GREP_OPTIONS. Couldn't that confuse scripts that depend on grep not using extended regexps? The old school way would be to use egrep for that.

scaryfish 08-22-2002 02:01 AM

Yeah that really had me confused for a while. I was trying to search for any file with the suffix .img.<number>
eg
.img.3
or
.img.8
but not
.img.12
so I had grep '\.img\.[0-9]{1}'
i.e. search for any .img with a single-digit number afterwards. It never produced any results, because basic grep treats { and } as literals; I had to change them to \{ and \} to get it to work.
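The behaviour described here can be reproduced with a small sketch. The file names are made up for illustration, and the trailing "$" anchor is an addition that is not in the original command: without it, .img.12 would still match, because the single-digit pattern is satisfied by the "1".

```shell
# Sample file names, one per line (made up for illustration).
names='disk.img.3
disk.img.8
disk.img.12'

# Basic regex, plain braces: { and } are literal characters, so nothing matches.
# "|| true" keeps a no-match exit status from aborting a strict script.
bre_plain=$(printf '%s\n' "$names" | grep '\.img\.[0-9]{1}$' || true)

# Basic regex, escaped braces: \{1\} is the "exactly one" interval.
bre_escaped=$(printf '%s\n' "$names" | grep '\.img\.[0-9]\{1\}$')

# Extended regex (-E): unescaped braces are the interval operator.
ere=$(printf '%s\n' "$names" | grep -E '\.img\.[0-9]{1}$')

printf 'basic, plain braces:   [%s]\n' "$bre_plain"
printf 'basic, escaped braces: [%s]\n' "$bre_escaped"
printf 'extended (-E):         [%s]\n' "$ere"
```

The last two commands should select the same lines; only the escaping differs between the two regex flavours.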

BTW, I was just trying to find a way of using the command line to concatenate split files in order. Turns out I can just go
cat *.img.[0-9] *.img.[0-9][0-9] > output.img
and I don't even need to use grep or anything.
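Why two globs rather than one: the shell sorts glob matches as text, so with a single pattern part 10 lands before part 2. A minimal sketch, assuming three made-up part files in a scratch directory created with mktemp:

```shell
# Scratch directory with three hypothetical part files.
workdir=$(mktemp -d)
printf 'one\n' > "$workdir/part.img.1"
printf 'two\n' > "$workdir/part.img.2"
printf 'ten\n' > "$workdir/part.img.10"

# One glob: names are sorted as text, not as numbers.
naive=$(cat "$workdir"/part.img.*)

# Two globs: every single-digit part first, then every two-digit part.
ordered=$(cat "$workdir"/*.img.[0-9] "$workdir"/*.img.[0-9][0-9])

printf 'single glob:\n%s\n\ntwo globs:\n%s\n' "$naive" "$ordered"

rm -rf "$workdir"
```

With more than 99 parts a third glob with three bracket groups would be needed, following the same idea.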

pmccann 08-22-2002 03:34 AM

Quote:

Originally posted by osxpez
I didn't know about $GREP_OPTIONS. Couldn't that confuse scripts that depend on grep not using extended regexps? The old school way would be to use egrep for that.
Yeah, sure: if you run scripts with your environment in place the options could do nasty things. Swings and roundabouts, but there are certainly places where falling off could hurt. sudo being one glaring example. (For those not following the rambling thoughts here: the problem is that when you run processes with elevated privileges via sudo you "bring along" your environment for the superhero's journey. Including any GREP_OPTIONS settings that you might have set.) Maybe an alias would be a safer way to implement some of this: say...

alias greep 'grep --extended-regexp --ignore-case'

so that (extended regexps + case independence) is the default. The problem that raises is remembering what the thing's called. 'greep' has a nice mnemonic character... Thanks (again) for the wake-up call.

Cheers,
Paul
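The alias above is written in tcsh syntax. For sh-family shells such as bash, a small function is one way to get a similar effect without touching the environment. This is only a sketch; the sample input is made up for illustration.

```shell
# sh/bash counterpart of the tcsh alias: extended regexps plus case-insensitivity.
# "$@" passes every argument through to grep unchanged.
greep() {
    grep --extended-regexp --ignore-case "$@"
}

# "+" and "|" work as metacharacters, and case is ignored.
matches=$(printf 'Foo\nbar\nbaz\n' | greep 'fo+|BAZ')
printf '%s\n' "$matches"
```

Because the function only exists in the shell where it is defined, scripts that call plain grep keep the default behaviour.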

osxpez 08-22-2002 06:30 AM

But, what's wrong with using "egrep"? Or "egrep -i" if you want to disregard case. (You could always alias egrepi for "egrep -i").

vickishome 08-22-2002 07:08 AM

I just added setenv GREP_OPTIONS "--extended-regexp" to my ~/.login file. Now that I have some basics using grep, I'd like to see if I have any scripts calling grep that may be affected by this change.

Can someone tell me which dirs have scripts that run automatically? Or is there a good way to narrow down the search so I don't search my entire HD?

vickishome 08-22-2002 07:29 AM

I checked the following 4 dirs for 'grep':

/etc/
/usr/share/init/tcsh/
~/
~/Library/init/tcsh/


And I found these lines:

[share/init/tcsh]
aliases:alias word 'grep \!* /usr/share/dict/web2' # Grep thru dictionary
completions:alias list_all_hostnames 'grep -v "^#" /etc/hosts'
completions: 'n@-framework@`ls -1 ${framework_path} | grep .framework\$ | sed 's/\\.framework//' | uniq`@' \

[~/Library/init/tcsh]
aliases.mine:alias findit "ps ax | grep \!:1 | grep -v grep"


Some of the code is above my head so I'm not sure what they all do. Can someone tell me if the setenv will negatively affect anything?

mervTormel 08-22-2002 07:31 AM

the only scripts of yours that run automatically would be ones in your crontab, but i don't think they'll be run with your interactive login environment.

what you have to worry about are your ~/bin/ scripts (and /usr/local/) that you run that do greps in your interactive shell and make sure the regexps are extended regexp savvy or that your environment is clean of the GREP variables. i have a bash function 'zung' to toggle GREP variables in and out of existence when i think i might be rogue. grok?

btw, the grep --ignore-case switch is slightly more valuable than the others as it ignores case in both the source and the target.
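The 'zung' function itself is not shown in the thread. A hypothetical sketch of the same idea for sh-family shells might look like this; the function name, the helper variable, and the default value are all assumptions made for illustration.

```shell
# Hypothetical default; the thread does not show the actual settings used.
GREP_OPTIONS_DEFAULT='--extended-regexp'

# Toggle GREP_OPTIONS in and out of the environment of the current shell.
toggle_grep_options() {
    if [ -n "${GREP_OPTIONS+set}" ]; then
        # Variable exists (even if empty): remove it entirely.
        unset GREP_OPTIONS
        echo 'GREP_OPTIONS removed'
    else
        GREP_OPTIONS=$GREP_OPTIONS_DEFAULT
        export GREP_OPTIONS
        echo "GREP_OPTIONS=$GREP_OPTIONS"
    fi
}
```

Call it directly (not inside a command substitution) so the change applies to the current shell. Note that recent GNU grep releases no longer honour GREP_OPTIONS, so this only matters for grep versions that still read the variable.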

vickishome 08-22-2002 07:35 AM

While looking through man grep, I found this option:
Code:

-E, --extended-regexp
Interpret PATTERN as an extended regular expression

Might it be better to not make the setenv change, but use the -E option when desired?

mervTormel 08-22-2002 07:49 AM

well, that's why the options are there. it's entirely up to you how to conduct your shell world. there are tradeoffs, upsides and downsides to every issue.

--
no doubt about it, there's two sides to every story.

osxpez 08-22-2002 07:49 AM

I feel totally invisible here! :)

Yes, I think "grep -E" is much better than tampering with the environment. But still "egrep" is there for these kinds of things. In the old days egrep was a separate program from grep, one that handled extended regular expressions. But reading the grep man page now seems to indicate that egrep is actually a link to grep (or vice versa) and that grep checks what name it was called by and then switches extended regexps on. Maybe a small history lesson could shed some light on this:

"grep" is short for "global regular expression print". That's what grep (without options) does: it globally applies the regular expression and then prints the rows that match. My guess is that "egrep" stands for "extended global regular expression print". What the hell "fgrep" stands for is beyond me! :)

osxpez 08-22-2002 07:55 AM

I totally fail to see the downside with using egrep.

vickishome 08-22-2002 07:55 AM

Quote:

Originally posted by osxpez
I feel totally invisible here! :)
I see you! But that doesn't mean I understand what you (or anyone else) is saying half of the time. :)
Quote:

My guess is that "egrep" stands for "extended global regular expression print". What the hell "fgrep" stands for is beyond me! :)
Now I understand your comments about egrep. I think I'm going to pass on setting the setenv for now and use either -E or egrep when I want. When I understand the entire environment better, I may change my mind (by then, hopefully, I'll better understand the full impact on the setenv change). I'd like to stick to the K.I.S.S. principle while just starting out.

fgrep - I have no idea, but I took it as "file" grep. :)

osxpez 08-22-2002 08:04 AM

The f in fgrep stands for "fixed strings". The irony with fgrep is that it doesn't involve regular expressions. It's "fixed strings global regular expression print". If you, like me, enjoy geek humour then this should make you at least smile. :)
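A short sketch of what "fixed strings" means in practice, using "grep -F" (which the man page describes as the same as fgrep). The three sample lines are made up for illustration.

```shell
# Three sample lines (made up for illustration).
lines='a.b
axb
a*b'

# As a regex, "." matches any single character, so every line matches.
regex_hits=$(printf '%s\n' "$lines" | grep 'a.b')

# As a fixed string, "a.b" only matches a literal dot.
fixed_hits=$(printf '%s\n' "$lines" | grep -F 'a.b')

# A literal "*" needs no backslash in fixed-string mode.
star_hits=$(printf '%s\n' "$lines" | grep -F '*')

printf 'regex a.b:\n%s\n\nfixed a.b:\n%s\n\nfixed *:\n%s\n' \
    "$regex_hits" "$fixed_hits" "$star_hits"
```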

osxpez 08-22-2002 08:11 AM

One more thing. The grep man page on RH Linux says that egrep "is similar (but not identical) to grep -E". egrep is the "old school Unix" compatible one. The OS X man page didn't state this, I think. It would be interesting to know how "egrep" and "grep -E" differ. Someone please enlighten me.

mervTormel 08-22-2002 08:23 AM

fgrep is fast grep, which it isn't, or fixed grep, because it doesn't accept metachars. egrep is your fastest grep today, methinks.

from o'reilly's unix power tools: the old saw

unix beginners use grep because it's all they know about

intermediate users use fgrep because the manual says it's faster

advanced users use egrep because they've tried it

---

there are some timing tests here, and egrep beats even perl in both clock time and cpu usage.

fgrep has its uses; searching for literals, like *, it can save you some quoting.

i would doubt very much that egrep and grep -E have any difference in the OSX or GNU incarnations.

osxpez 08-22-2002 08:49 AM

Of course egrep is faster than Perl on regexp matching. Or most regexps anyway. Deterministic regexp matching is most often faster than non-deterministic. That's why awk often is so much faster than Perl on raw regexp matching. But Perl needs its non-deterministic engine, because otherwise you couldn't use backreferencing. Gawk has special sub() functions that do non-deterministic matching. Clumsy, but at least the gawk programmer can trade speed for functionality.

stetner 08-22-2002 09:34 AM

The thing I like best about egrep is:

egrep foo\|bar\|baz textfile

which will find all occurrences of foo or bar or baz in the file (the back slashes are so the shell doesn't think you are trying to pipe).
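A minimal sketch of the two usual ways to get the "|" past the shell: backslashes, as above, or single quotes around the whole pattern. It uses "grep -E" in place of egrep, and the temporary file and its contents are made up for illustration.

```shell
# Temporary text file with made-up contents.
textfile=$(mktemp)
printf 'foo one\nqux two\nbar three\nbaz four\n' > "$textfile"

# Backslashes: the shell strips them and hands grep the pattern foo|bar|baz.
escaped=$(grep -E foo\|bar\|baz "$textfile")

# Single quotes: same pattern reaches grep, usually easier to read.
quoted=$(grep -E 'foo|bar|baz' "$textfile")

printf 'escaped:\n%s\n\nquoted:\n%s\n' "$escaped" "$quoted"

rm -f "$textfile"
```

Left completely unquoted, each "|" is taken by the shell as a pipe, so the command line is split into three separate commands instead of one grep with a three-way alternation.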

vickishome 08-22-2002 11:01 AM

Quote:

Originally posted by stetner
egrep foo\|bar\|baz textfile

<snip> (the back slashes are so the shell doesn't think you are trying to pipe).
Newbie raising hand from the back of the class again.

I'm able to do egrep foo|bar|baz textfile without the backslashes. In fact, when I add the backslashes, it doesn't work anymore. I first thought maybe it was because I chose to not use the extended option with grep, but egrep is supposed to already have the extended options.

Any idea why I don't need/can't use the backslashes?

pmccann 08-22-2002 11:13 AM

Wow, it has got busy 'round here! Quite the most polite hornet's nest that I've ever had the pleasure of stirring!

On osx egrep and 'grep -E' are the same thing: first page of "man grep"

Quote:

egrep is the same as grep -E. fgrep is the same as grep -F.
Another one of those duplicate binaries that occur in osx:

% ls -l `which grep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:21 /usr/bin/grep
% ls -l `which egrep`
-rwxr-xr-x 1 root wheel 105548 Aug 4 23:22 /usr/bin/egrep


And so on for fgrep. That is, it's the same binary, but responds differently depending on the name by which it's called. Why they don't just link to the same thing (i.e. have egrep and fgrep as links to grep) I have no idea. Anyone? This same structure occurs in various other places as well (compress/uncompress, batch/at/atrm/atq, merge/rcsdiff/rcsmerge, gunzip/gzip/zcat/gzcat, csh/tcsh, zsh/sh, tar/pax/cpio). Byte for byte the same (each of the slashed alternatives), yet separate copies.
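The difference between "separate copies with identical bytes" and "links to the same file" can be checked with cmp and inode numbers. The sketch below does not touch the real binaries: it uses stand-in files named grep, egrep and fgrep in a scratch directory, all of which are assumptions for illustration.

```shell
# Stand-in files in a scratch directory, not the real /usr/bin binaries.
workdir=$(mktemp -d)
printf 'pretend this is a binary\n' > "$workdir/grep"
cp "$workdir/grep" "$workdir/egrep"   # separate copy, same bytes
ln "$workdir/grep" "$workdir/fgrep"   # hard link, same file on disk

# Print the inode number of a file (first field of "ls -i").
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

# cmp -s prints nothing and exits 0 when the files are byte-for-byte identical.
if cmp -s "$workdir/grep" "$workdir/egrep"; then
    same_bytes=yes
else
    same_bytes=no
fi

grep_inode=$(inode_of "$workdir/grep")
egrep_inode=$(inode_of "$workdir/egrep")
fgrep_inode=$(inode_of "$workdir/fgrep")

printf 'copy identical to original: %s\n' "$same_bytes"
printf 'inodes: grep=%s egrep(copy)=%s fgrep(link)=%s\n' \
    "$grep_inode" "$egrep_inode" "$fgrep_inode"

rm -rf "$workdir"
```

A copy has identical contents but its own inode; a hard link shares the inode of the original, which is what having egrep and fgrep as links to grep would look like.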

Having made that list I've forgotten what vital and entertaining information I'd intended to convey. Harumph. Just use egrep? Yeah, but that's no fun! I'm still on a campaign to convince the world that greep rocks.

"The change would be very subtle....It might take ten years or so....
Gradually his grep would change its shape....A more hooked nose...
Wider, thinner lips....Beady eyes....A larger forehead."

(A longer name, case insensitivity, better regexp flavour...)

Cheers,
Paul


All times are GMT -5.