Help with grep script

BAPaul · March 31, 2013, 6:26pm

Hello, I need help with a Linux homework assignment.

The problem statement, all variables and given/known data:
I must print all the lines that have an odd number of characters ([number of characters on line] % 2 = 1).
Relevant commands, code, scripts, algorithms:
GREP
The attempts at a solution (include all code and scripts):

I have read most of the manual and am still not able to come up with a way to count characters using grep. Very important: using grep is a must.

Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
Universitatea Babes Bolyai Facultatea de Matematica si Informatica, Cluj-Napoca, Romania, Dr. Boian Florian (the course is in Romanian, so I figure it's useless to post a link here)

Note: Without school/professor/course information, you will be banned if you post here! You must complete the entire template (not just parts of it).

jim_mcnamara · March 31, 2013, 7:43pm

I would guess your prof is more interested in how to construct a regular expression that eliminates lines with an odd or even number of characters. Have you done regular expressions (regex) in class? Or character classes, e.g. '[0-9]', which matches any single digit?
For example:

'^[a-z]{10}$'

What does this regex do?

Yoda · March 31, 2013, 8:20pm

I am not sure if there is any method to use grep alone for counting the number of characters.

But you can use grep along with wc to count number of characters.

Here is an example:

$ printf "BAPaul" | grep -o "." | wc -l
6

-o      Print only the matched parts, one per line
"."     Match any single character
wc -l   Count the newlines (here, one per character)
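
To apply the same idea to every line of an input rather than to one string, a rough sketch might look like the following. It assumes a small made-up sample input, and it leans on a shell loop and wc in addition to grep, so it would not satisfy a strict "grep only" requirement.

```shell
# Hypothetical sample input: three lines of length 1, 2 and 3.
input=$(printf '1\n12\n123\n')

# For each line, split it into one character per output line with grep -o,
# count those lines with wc -l, and keep the line if the count is odd.
odd_lines=$(printf '%s\n' "$input" | while IFS= read -r line; do
    n=$(printf '%s' "$line" | grep -o . | wc -l)
    if [ $((n % 2)) -eq 1 ]; then
        printf '%s\n' "$line"
    fi
done)

printf '%s\n' "$odd_lines"
```

With this sample, only the lines of length 1 and 3 are kept.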

hanson44 · March 31, 2013, 9:07pm

I disagree with the previous responses. There's definitely a way to do it. Took me just a few seconds to "get it", at least one way. And it's not a "trick question". This is a very good question your professor asked.

My understanding is I'm not supposed to just give the answer. I'm sure you can figure it out if you understand regular expressions. Do you understand regular expressions?

I'll give a clue. The regular expression is relatively short.

Before you try to solve the problem, your first step should be to build a test file to confirm if your solution works or not. Could you provide the test file?

elixir_sinari · March 31, 2013, 9:20pm

I agree with hanson44; this can be done with grep alone. A good question.
Another clue: Read up on quantifiers in regular expressions.

RudiC · April 1, 2013, 5:32am

LOL and agreed - it is very short!

Scrutinizer · April 1, 2013, 5:53am

Yet another clue would be that you might want to revisit the assumption that you need to count characters...

MadeInGermany · April 3, 2013, 2:27pm

grep defaults to RE (basic regular expressions), and I can't find a solution with RE.
But there are some solutions with ERE (extended regular expressions).
Also, the example in a previous post makes me think that you all mean ERE when saying "regular expression":

^[a-z]{10}$

where {10} is ERE (while \{10\} is RE).
So I suggest studying man egrep
and taking only the ERE option (-E) from man grep.
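
The difference between the two interval spellings can be checked with a short sketch like this one. The two sample lines are made up for illustration; the only thing being compared is the escaped form under plain grep against the unescaped form under grep -E.

```shell
# Hypothetical sample: one line of exactly ten lowercase letters, one shorter.
sample=$(printf 'abcdefghij\nabc\n')

# BRE (plain grep): the interval braces must be escaped.
bre=$(printf '%s\n' "$sample" | grep '^[a-z]\{10\}$')

# ERE (grep -E): the braces are written unescaped.
ere=$(printf '%s\n' "$sample" | grep -E '^[a-z]{10}$')

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
```

Both commands should select the same ten-letter line, which suggests the choice between BRE and ERE here is one of notation rather than capability.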

Scrutinizer · April 3, 2013, 2:44pm

There is a solution with either BRE (basic regular expression) or ERE; it does not matter which.

hanson44 · April 3, 2013, 3:37pm

Like the previous poster, I can confirm there is a solution to this delightful little problem. And it's not a trick question. The solution is a great demonstration of how useful regular expressions can be.

Hey, original poster, where are you? We'd like to hear what you tried so far, or if the assignment is over.

bakunin · April 4, 2013, 6:01pm

I will give a hint: what is the mathematical definition of "odd" and "even"? The answer should help with the regexp.

I hope this helps.

bakunin

alvincorrea · April 24, 2013, 3:39am

After some thinking I believe I have the answer: grouping the characters in pairs and repeating that group matches the lines with an even number of characters (whitespace included).

To find the odd ones, exclude the lines that match that pattern.

hanson44 · April 24, 2013, 3:46am

I'm afraid that's pretty vague. But maybe it's better you didn't post anything concrete. I don't know what the rules are about posting a solution to this homework assignment. Is the class over? Of course, no way to know. Any guidance from the administrators?

bakunin · April 24, 2013, 5:48am

There are no real rules regarding this (btw., this might be a good idea for an amendment), but informally we always have tried to give the students pointers but still leave the work to do it themselves.

Posts with outright, full-blown solutions were typically moderated (set to invisible for normal users) until the thread-o/p posted a solution of his own. Users posting such solutions here usually get a PM explaining the special nature of this board and what we are trying to do here and asking them to contribute accordingly - pointers, but no solutions.

In this case, though, it looks like the thread-o/p has abandoned the thread (he was last active three weeks ago), and if alvincorrea wants to attempt to solve the puzzle he could do so in the normal forums:

I have found an interesting problem <<here <link to this thread> >>. The task is to <<problem description>> and my proposed solution is
```text
.....
```
Could anybody please tell me if this is correct and if not, what is the problem.

I hope this helps.

bakunin

alvincorrea · April 24, 2013, 6:42am

Thanks @bakunin
I think this

 grep -v  '^(..)+$'  filename

will work, although it also counts spaces, as I don't know whether those were to be ignored.
Please let me know if I have missed something or if there is a simpler approach, as I would be very pleased to learn about it too.

bakunin · April 24, 2013, 7:51am

alvincorrea:

I think this
 grep -v  '^(..)+$'  filename
will work, although it also counts spaces, as I don't know whether those were to be ignored.
Please let me know if I have missed something or if there is a simpler approach, as I would be very pleased to learn about it too.

I think spaces are characters like any other. The important point in such problems is not to do it this way or that way (like including the spaces or not), but to be aware of the limitations of one's solution. As you are aware that your solution only works under a narrowly defined set of assumptions, your take is impeccable.

One little problem remains, though: you seem to have developed this with egrep or grep -E. The unescaped parentheses and the "+" quantifier, which means "1 or more", are ERE (Extended Regular Expression) syntax, while plain grep defaults to BREs (Basic Regular Expressions). GNU grep, like other regexp-based GNU programs (e.g. sed), accepts "\+" in BREs as an extension, but in POSIX BREs this quantifier doesn't exist, only "*", which means "0 or more". If you want to be portable across different Unix flavours you should write the regexp this way:

/^..\(..\)*$/

which does the same as yours.

Second, consider empty lines: 0 is an even number by the usual mathematical definition, so if you want lines of length 0 to count as even you should replace "+" with "*".
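
The effect of "+" versus "*" on an empty line can be seen in a small sketch like the one below. The input is a made-up sample, and the "+" form is run with grep -E on the assumption that the ERE spelling is what was intended.

```shell
# Hypothetical test input: lines of length 1, 0, 2 and 3.
input=$(printf '1\n\n12\n123\n')

# ERE with "+": an empty line is not matched by (..)+, so -v lets it through.
with_plus=$(printf '%s\n' "$input" | grep -Ev '^(..)+$')

# BRE with "*": zero pairs also match, so the empty line is excluded.
with_star=$(printf '%s\n' "$input" | grep -v '^\(..\)*$')

printf 'with +:\n%s\n\nwith *:\n%s\n' "$with_plus" "$with_star"
```

The "+" variant reports the empty line alongside the odd-length ones, while the "*" variant reports only the lines of length 1 and 3.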

I hope this helps.

bakunin

elixir_sinari · April 24, 2013, 1:12pm

grep '^\(..\)*.$' file

will be portable.

hanson44 · April 24, 2013, 6:25pm

Yes, I would say grep '^\(..\)*.$' file is the best solution.

$ cat file
1

12
123
12345
123456

$ grep '^\(..\)*.$' file
1
123
12345
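
As a sanity check, the grep result can be compared against an independent count. The sketch below rebuilds the test file in a temporary location (the path comes from mktemp and is only illustrative) and uses awk's length() for the comparison, which is outside the grep-only constraint of the assignment and is meant purely as a cross-check.

```shell
# Recreate the test file from the post above in a temporary location.
tmp=$(mktemp)
printf '1\n\n12\n123\n12345\n123456\n' > "$tmp"

# The portable BRE solution: any number of pairs followed by one more character.
grep_out=$(grep '^\(..\)*.$' "$tmp")

# Independent check: keep the lines whose length is odd.
awk_out=$(awk 'length($0) % 2 == 1' "$tmp")

rm -f "$tmp"

if [ "$grep_out" = "$awk_out" ]; then
    echo "results agree"
else
    echo "results differ"
fi
```

For input with multi-byte characters the two tools may count differently depending on the locale, so the comparison is only meaningful for plain ASCII test data like this.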