Formatting text file

Hi All,

How to format text?

Thanks

It depends on what language you want to use to process this file. Here is a solution which would work for Zsh:

Assuming that one line of the file is stored in the shell variable line, and that you have set the shell options

setopt re_match_pcre
setopt bash_rematch

the statement

[[ $line =~ pname\ (.+?)[.].+?=([0-9]*).+?=([0-9]*).+?:.([0-9.]*) ]] && echo "${BASH_REMATCH[2]} ,   ${BASH_REMATCH[3]} ,  ${BASH_REMATCH[4]} ,  ${BASH_REMATCH[5]}"

would output the fields in the format you are asking for.

If you prefer to use bash or Bourne Shell (I don't know about ksh), this approach would not work, so you would have to use a different language for this purpose anyway (for instance, Ruby, Perl, awk or Python). I therefore recommend that you write the whole script in zsh.

Hi,
Thanks for your code.
We are using ksh.
I tried the one below and got the error ksh: syntax error: `=~' unexpected

[[ $line =~ pname\ (.+?)[.].+?=([0-9]*).+?:.([0-9.]*) ]] && echo "${BASH_REMATCH[2]} , ${BASH_REMATCH[3]} , ${BASH_REMATCH[4]} ,  ${BASH_REMATCH[5]}" f1.txt

Thanks

[akshay@localhost tmp]$ cat file
pname tmgnsr.tpfile_01 analy, lists=8, lines=5900, duration cap: 27.682 seconds
pname tmgnsr.tpfile_01 analy, lists=5, lines=1234, duration cap: 5.12 seconds
pname tmgnsr.sdcfile_01 analy, lists=4, lines=878172, duration cap: 27.145 seconds
pname tmgnsr.sdcfile_02 analy, lists=4, lines=200, duration cap: 34.624 seconds
pname tmgnsr.sdcfile_03 analy, lists=4, lines=200, duration cap: 8.076 seconds
pname tmgnsr.mndfile_01 analy, lists=3, lines=900, duration cap: 37.393 seconds
pname tmgnsr.mndfile_03 analy, lists=1, lines=900, duration cap: 43.077 seconds
pname tmgnsr.mndfile_04 analy, lists=8, lines=5900, duration cap: 16.371 seconds
pname tmgnsr.mndfile_05 analy, lists=1, lines=30, duration cap: 10.967 seconds
pname tmgnsr.mndfile_06 analy, lists=8, lines=900, duration cap: 6.688 seconds
pname tmgnsr.mndfile_abc_01 analy, lists=8, lines=900, duration cap: 22.231 seconds
[akshay@localhost tmp]$ gawk --version
GNU Awk 3.1.7
Copyright (C) 1989, 1991-2009 Free Software Foundation.
..
..
[akshay@localhost tmp]$ gawk  '{match($0,/pname (.+?)[.].*=(.*),.*=(.*),.*:(.*) /,m);print m[1],m[2],m[3],m[4]}' file
tmgnsr 8 5900  27.682
tmgnsr 5 1234  5.12
tmgnsr 4 878172  27.145
tmgnsr 4 200  34.624
tmgnsr 4 200  8.076
tmgnsr 3 900  37.393
tmgnsr 1 900  43.077
tmgnsr 8 5900  16.371
tmgnsr 1 30  10.967
tmgnsr 8 900  6.688
tmgnsr 8 900  22.231

Hi,

gawk is not working.

ksh: gawk:  not found

Thanks

Hello ROCK_PLSQL,

Could you please try the following and let me know if it helps you? I have written this according to the Input_file (data) you showed. If you have other conditions, or your data does not match the Input_file shown, then please post sample data with complete details.

awk -F"[ |=|: ]" '{sub(/\..*/,X,$2);print $2 OFS "," OFS $5 OFS $7 OFS $11}'   Input_file

Output will be as follows.

tmgnsr , 8, 5900, 27.682
tmgnsr , 5, 1234, 5.12
tmgnsr , 4, 878172, 27.145
tmgnsr , 4, 200, 34.624
tmgnsr , 4, 200, 8.076
tmgnsr , 3, 900, 37.393
tmgnsr , 1, 900, 43.077
tmgnsr , 8, 5900, 16.371
tmgnsr , 1, 30, 10.967
tmgnsr , 8, 900, 6.688
tmgnsr , 8, 900, 22.231
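
To see why fields 5, 7 and 11 are the ones printed, it can help to dump every field that this separator produces for one sample line. The following is only an illustrative sketch; the helper name show_fields is made up here and is not part of the command above.

```shell
# show_fields (hypothetical helper): print each field with its index,
# using the same field separator as the awk command above.
show_fields() {
    awk -F'[ |=|: ]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# One line taken from the sample data in this thread.
printf '%s\n' 'pname tmgnsr.tpfile_01 analy, lists=8, lines=5900, duration cap: 27.682 seconds' | show_fields
```

For this line the listing shows 2=[tmgnsr.tpfile_01], 5=[8,], 7=[5900,], an empty field 10 and 11=[27.682]. So the commas after 8 and 5900 in the output come from the input fields themselves, and field 10 is empty because the ":" and the space after it each count as a separator. The "|" inside the brackets is matched literally, so it has no effect on this data.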

Thanks,
R. Singh

Hi,

Thanks for your code.

It's working fine.

Thanks

In Perl, something similar to the awk solution:

$ perl -lne '$,=", "; @p=split(m/pname (.+?)[.].*=(.*),.*=(.*),.*: ?(.*) .*/,$_); print @p[1..4]' file
tmgnsr, 8, 5900, 27.682
tmgnsr, 5, 1234, 5.12
tmgnsr, 4, 878172, 27.145
tmgnsr, 4, 200, 34.624
tmgnsr, 4, 200, 8.076
tmgnsr, 3, 900, 37.393
tmgnsr, 1, 900, 43.077
tmgnsr, 8, 5900, 16.371
tmgnsr, 1, 30, 10.967
tmgnsr, 8, 900, 6.688
tmgnsr, 8, 900, 22.231

Well, as I said, my code is for zsh, so it is no surprise that it doesn't work with ksh.

I don't know the larger context of your script. Maybe it makes sense to convert the script to zsh, maybe it is better to use a different language.
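
If the script has to stay in ksh, one possible route that needs neither =~ nor an external tool is plain parameter expansion (the ${var#pattern} and ${var%%pattern} forms, which are standard POSIX shell). This is a minimal sketch, assuming every line has exactly the layout shown in the sample data; the function name parse_line and its variable names are invented for illustration.

```shell
# parse_line (hypothetical helper): extract the name prefix, lists, lines
# and duration from one input line using only POSIX parameter expansion.
parse_line() {
    line=$1
    rest=${line#pname }        # drop the leading "pname "
    prefix=${rest%%.*}         # keep everything before the first "."
    lists=${line#*lists=}      # text after "lists="
    lists=${lists%%,*}         # ... up to the next comma
    lines=${line#*lines=}      # text after "lines="
    lines=${lines%%,*}         # ... up to the next comma
    dur=${line#*cap: }         # text after "cap: "
    dur=${dur%% *}             # ... up to the next space
    printf '%s , %s, %s, %s\n' "$prefix" "$lists" "$lines" "$dur"
}

parse_line 'pname tmgnsr.tpfile_01 analy, lists=8, lines=5900, duration cap: 27.682 seconds'
# prints: tmgnsr , 8, 5900, 27.682

# To process a whole file, something like this should work:
#   while IFS= read -r line; do parse_line "$line"; done < Input_file
```

Because it runs one shell loop iteration per line, this is likely to be slower than awk on large files, so it is mainly of interest when no suitable awk is available.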

Moderator comments were removed during original forum migration.
Hi,

In the second field, I want the data after the "." up to the space.

I want the output as below

tmgnsr ,tpfile_01 , 8, 5900, 27.682
tmgnsr ,tpfile_01 , 5, 1234, 5.12
tmgnsr ,sdcfile_01 , 4, 878172, 27.145
tmgnsr ,sdcfile_02 , 4, 200, 34.624
tmgnsr ,sdcfile_03 , 4, 200, 8.076
tmgnsr ,mndfile_01 , 3, 900, 37.393
tmgnsr ,mndfile_02 , 1, 900, 43.077
tmgnsr ,mndfile_03 , 8, 5900, 16.371
tmgnsr ,mndfile_04 , 1, 30, 10.967
tmgnsr ,mndfile_05 , 8, 900, 6.688
tmgnsr ,mndfile_abc_01 , 8, 900, 22.231

I have tried the command below, but the first two parts still come out joined together; they have to be separated by ",".

tmgnsr.tpfile_01 , 8, 5900, 27.682
tmgnsr.tpfile_01 , 5, 1234, 5.12
tmgnsr.sdcfile_01, , 4, 878172, 27.145
awk -F"[ |=|: ]" '{sub(/\..*/,X,$1);print $2 OFS "," OFS $5 OFS $7 OFS $11}'   Input_file

Thanks

@ROCK_PLSQL

Your post is not useful to anyone who follows this forum. You removed your question from post #1, so what should readers think? That someone answered without a question, or that the person who answered here is mad?

Hi,

Sorry. I removed it by mistake.

Going forward I will take care of it.

Please help me.

Thanks

---------- Post updated at 05:03 PM ---------- Previous update was at 04:52 PM ----------

Hi RavinderSingh,

Can you please help me?

Thanks

Hello ROCK_PLSQL,

I am not really sure what your Input_file looks like, as your previous post is not clear. Let's say the following is your Input_file.

cat Input_file
tmgnsr ,tpfile_01 , 8, 5900, 27.682
tmgnsr ,tpfile_01 , 5, 1234, 5.12
tmgnsr ,sdcfile_01 , 4, 878172, 27.145
tmgnsr ,sdcfile_02 , 4, 200, 34.624
tmgnsr ,sdcfile_03 , 4, 200, 8.076
tmgnsr ,mndfile_01 , 3, 900, 37.393
tmgnsr ,mndfile_02 , 1, 900, 43.077
tmgnsr ,mndfile_03 , 8, 5900, 16.371
tmgnsr ,mndfile_04 , 1, 30, 10.967
tmgnsr ,mndfile_05 , 8, 900, 6.688
tmgnsr ,mndfile_abc_01 , 8, 900, 22.231

Then the following is the code for it.

awk -F"," '{sub(/[[:space:]]+\,/,".",$0);print $0}'   Input_file

Output will be as follows.

tmgnsr.tpfile_01 , 8, 5900, 27.682
tmgnsr.tpfile_01 , 5, 1234, 5.12
tmgnsr.sdcfile_01 , 4, 878172, 27.145
tmgnsr.sdcfile_02 , 4, 200, 34.624
tmgnsr.sdcfile_03 , 4, 200, 8.076
tmgnsr.mndfile_01 , 3, 900, 37.393
tmgnsr.mndfile_02 , 1, 900, 43.077
tmgnsr.mndfile_03 , 8, 5900, 16.371
tmgnsr.mndfile_04 , 1, 30, 10.967
tmgnsr.mndfile_05 , 8, 900, 6.688
tmgnsr.mndfile_abc_01 , 8, 900, 22.231

If the above doesn't satisfy your requirements, then please post a sample Input_file with the expected sample output and all your conditions.

Thanks,
R. Singh

Hi Ravindra,

This is my input file.

pname tmgnsr.tpfile_01 analy, lists=8, lines=5900, duration cap: 27.682 seconds
pname tmgnsr.tpfile_01 analy, lists=5, lines=1234, duration cap: 5.12 seconds
pname tmgnsr.sdcfile_01 analy, lists=4, lines=878172, duration cap: 27.145 seconds
pname tmgnsr.sdcfile_02 analy, lists=4, lines=200, duration cap: 34.624 seconds
pname tmgnsr.sdcfile_03 analy, lists=4, lines=200, duration cap: 8.076 seconds
pname tmgnsr.mndfile_01 analy, lists=3, lines=900, duration cap: 37.393 seconds
pname tmgnsr.mndfile_03 analy, lists=1, lines=900, duration cap: 43.077 seconds
pname tmgnsr.mndfile_04 analy, lists=8, lines=5900, duration cap: 16.371 seconds
pname tmgnsr.mndfile_05 analy, lists=1, lines=30, duration cap: 10.967 seconds
pname tmgnsr.mndfile_06 analy, lists=8, lines=900, duration cap: 6.688 seconds
pname tmgnsr.mndfile_abc_01 analy, lists=8, lines=900, duration cap: 22.231 seconds

I want the output as below.

tmgnsr ,tpfile_01 , 8, 5900, 27.682
tmgnsr ,tpfile_01 , 5, 1234, 5.12
tmgnsr ,sdcfile_01 , 4, 878172, 27.145
tmgnsr ,sdcfile_02 , 4, 200, 34.624
tmgnsr ,sdcfile_03 , 4, 200, 8.076
tmgnsr ,mndfile_01 , 3, 900, 37.393
tmgnsr ,mndfile_02 , 1, 900, 43.077
tmgnsr ,mndfile_03 , 8, 5900, 16.371
tmgnsr ,mndfile_04 , 1, 30, 10.967
tmgnsr ,mndfile_05 , 8, 900, 6.688
tmgnsr ,mndfile_abc_01 , 8, 900, 22.231

Thanks.

Hello ROCK_PLSQL,

Could you please try the following and let me know if this helps?

awk -F"[,|=|: ]" '{sub(/\./," ,",$2);sub(/$/," ",$2);print $2 OFS $6 OFS $9 OFS $14}' OFS=", "   Input_file

Output will be as follows.

tmgnsr ,tpfile_01 , 8, 5900, 27.682
tmgnsr ,tpfile_01 , 5, 1234, 5.12
tmgnsr ,sdcfile_01 , 4, 878172, 27.145
tmgnsr ,sdcfile_02 , 4, 200, 34.624
tmgnsr ,sdcfile_03 , 4, 200, 8.076
tmgnsr ,mndfile_01 , 3, 900, 37.393
tmgnsr ,mndfile_03 , 1, 900, 43.077
tmgnsr ,mndfile_04 , 8, 5900, 16.371
tmgnsr ,mndfile_05 , 1, 30, 10.967
tmgnsr ,mndfile_06 , 8, 900, 6.688
tmgnsr ,mndfile_abc_01 , 8, 900, 22.231
 

Thanks,
R. Singh

Hi,

Thanks for your answer.
It's working fine, except that extra spaces appear around the commas after the first and second fields.

tmgnsr ,tpfile_01 , 

it should be

tmgnsr,tpfile_01,

Thanks

You seem to be saying that RavinderSingh13 gave you code in post #16 that produced exactly the output you said you wanted in post #15, and now you want different output without some of the spaces you said you wanted.

Instead of asking Ravinder to change his code to correct your incorrect specification, why don't you try to change the simple awk script Ravinder suggested to match your new requirements? Then, if you can't figure out which one, two, or three spaces need to be removed from his code to meet your NEW requirements, show us what you have done, explain what still isn't working correctly, and ask for help explaining where you are stuck.

Stop thinking of this forum as your unpaid programming staff and start thinking of it as a tutor to help you learn how to write your own code.

If I remember right, you had spaces in your original posting, but anyway: Once you have the output fields, it is up to you how many spaces you insert.
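
As a rough illustration of that point, the separator can be passed in as a parameter so that the spacing is decided in one place. This is only a sketch under the assumption that the input looks like the sample lines in this thread; the helper name format_fields and the awk variable sep are hypothetical, and the field numbers were worked out for the separator set used here (comma, "=", ":" and space).

```shell
# format_fields SEP (hypothetical helper): read lines on stdin and print
# the name prefix, name suffix, lists, lines and duration joined by SEP.
format_fields() {
    awk -v sep="$1" -F'[,=: ]' '{
        split($2, name, "[.]")   # tmgnsr.tpfile_01 -> name[1], name[2]
        print name[1] sep name[2] sep $6 sep $9 sep $14
    }'
}

printf '%s\n' 'pname tmgnsr.tpfile_01 analy, lists=8, lines=5900, duration cap: 27.682 seconds' | format_fields ','
# prints: tmgnsr,tpfile_01,8,5900,27.682
```

Calling it as format_fields ', ' instead would put a space after every comma, without touching the extraction logic.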

Plus, as the moderator noted, it is nearly impossible to discuss a topic where the previous entries are modified. Things are so messed up that I suggest you start a completely new thread, post the code you are using so far, and ask whatever needs to be asked.

Moderator comments were removed during original forum migration.