Format text output

Hello,

I have a script, based on queries, that writes this output to a file:

cat cron_S-PostGres-ncc_license_Simpel_ActSub_per_Product
 201502-b31s31i10            | MVNO          | 114751 | 2017-07-19
 201502-b31s31i10R60-60 | MVNO          |  62115   | 2017-07-19
 201511-b31s31i15            | MVNO          | 311970 | 2017-07-19
 Simpel PostPay                 | MVNO          |     59     | 2017-07-19
 Simpel PrePay                   | MVNO         |  10254   | 2017-07-19

I'm finding it a bit hard to write another script that opens this file and presents the output in the following format:

201502-b31s31i10 - 114751 (2017-07-19), 201502-b31s31i10R60 - 62115 (2017-07-19), ...

I tried to play a bit with grep and awk, but I don't want to grep for any specific string from the first column, since the number of strings may grow over time. That would make the grep static, i.e. the script would have to be altered every time a new string needs to be added to the grep.

Can you kindly shed some light?

It looks like you have to perform the same set of steps on each line.

Take, for example, this line:

 201502-b31s31i10            | MVNO          | 114751 | 2017-07-19

1) Split it on the "|" character, to get the following tokens:

Token 1 = "201502-b31s31i10"
Token 2 = "MVNO"
Token 3 = "114751"
Token 4 = "2017-07-19"

2) Now use tokens 1, 3 and 4 to form this string:

201502-b31s31i10 - 114751 (2017-07-19)

3) And append it to another "output" string (i.e. a string that you will print once you are done processing your entire file.)

Keep doing steps 1), 2) and 3) for all lines you go through.
In the end, print the "output" string.
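
Since awk was mentioned in the question, the same three steps can also be sketched in awk. This is only a minimal sketch, not a drop-in replacement for your script: the function name join_records is made up for illustration, and it assumes every record has exactly the four "|"-separated fields shown above (lines with fewer fields are skipped).

```shell
# join_records is a hypothetical helper name; it reads the file given as $1.
join_records() {
    awk -F'|' '
        NF < 4 { next }          # skip blank or malformed lines
        {
            # Step 1: the line is already split on "|"; trim the
            # surrounding whitespace from every field.
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            # Step 2: build "name - count (date)" from fields 1, 3 and 4.
            # Step 3: append it to the output string, comma-separated.
            out = out sep $1 " - " $3 " (" $4 ")"
            sep = ", "
        }
        END { print out }        # print the output string once at the end
    ' "$1"
}
```

It would be called as: join_records cron_S-PostGres-ncc_license_Simpel_ActSub_per_Product
Because nothing from the first column is hard-coded, new product names in the file should not require any change to the script.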

Here's a Perl implementation, in case Perl is an option for you.
The -F modifier specifies the character "|" to split the line on.

$
$ cat data.txt
 201502-b31s31i10            | MVNO          | 114751 | 2017-07-19
 201502-b31s31i10R60-60 | MVNO          |  62115   | 2017-07-19
 201511-b31s31i15            | MVNO          | 311970 | 2017-07-19
 Simpel PostPay                 | MVNO          |     59     | 2017-07-19
 Simpel PrePay                   | MVNO         |  10254   | 2017-07-19
$
$
$ perl -F/\\\|/ -lane '($x,$y,$z) = map {s/^\s*//; s/\s*$//; $_ } @F[0,2,3];  # trim whitespaces from tokens 0, 2, 3
                       $s .= "$x - $y ($z), ";                                # append trimmed tokens to output string
                       END {
                           $s =~ s/, $//;                                     # trim unnecessary characters from output
                           print $s
                       }
                      ' data.txt
201502-b31s31i10 - 114751 (2017-07-19), 201502-b31s31i10R60-60 - 62115 (2017-07-19), 201511-b31s31i15 - 311970 (2017-07-19), Simpel PostPay - 59 (2017-07-19), Simpel PrePay - 10254 (2017-07-19)
$
$

Hi durden_tyler,

Your Perl one-liner is excellent. I modified it only slightly, since the output now needs to be a little different from what I presented previously, and it worked.

Thank you very much.

Rgds,

Very helpful, thanks Tyler.

Either of you two might want to click the "Thanks" button on durden_tyler's post to express your gratitude...

A word of caution: UNIX text files (and the utilities working on them) are line-oriented. That means these utilities don't process a file or input stream one character at a time; they read it in chunks. These chunks are series of characters terminated by newline ("\n") characters, i.e. "lines".

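If the line you construct gets too long you might eventually get a "line too long" error. How long "too long" is depends on your system and on the utility: the length POSIX guarantees for text utilities is LINE_MAX, which is defined in <limits.h> and reported by "getconf LINE_MAX".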
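
One way to get a rough idea of whether a generated file is anywhere near that limit is sketched below. Treat it as an assumption-laden check, not a guarantee: longest_line is a made-up helper name, the fallback value 2048 is simply the POSIX minimum used when getconf is not available, and many utilities accept lines far longer than LINE_MAX.

```shell
# Hypothetical helper: print the length of the longest line in file $1.
# awk's length() counts characters, which equals bytes for plain ASCII data.
longest_line() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$1"
}

# Ask the system for LINE_MAX; fall back to the POSIX minimum (2048)
# if getconf is not available.
line_max=$(getconf LINE_MAX 2>/dev/null || echo 2048)
```

For example: [ "$(longest_line output.txt)" -gt "$line_max" ] && echo "line may be too long for some utilities"
(output.txt stands for whatever file holds the joined line.)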

Note that it is possible to read from a file byte-based (that is: a certain number of bytes at a time, not caring about line separators), but this cannot be done in (shell) scripts using only standard methods.

I hope this helps.

bakunin