Newline between unequal record fields

tree · March 14, 2013, 1:27pm

Assume the following 5 records (field separator is a space):

0903 0903 0910 0910 0910 0910 0910 0910 0917 0917 0917 0917 0924
1001 1001 1001 1001 1008 1008 1008 1008 1015 1015 1015 1015 1022
1029 1029 1029 1029 1105 1105 1105 1105 1112 1112 1112 1112 1119
1126 1126 1126 1126 1203 1203 1203 1203 1210 1210 1210 1210 1217
1224 1224 1224 1224 1224 1224 1224 1224 1231 1231 1231 1231

The output result needed:

0903 0903
0910 0910 0910 0910 0910 0910
0917 0917 0917 0917
0924
1001 1001 1001 1001
1008 1008 1008 1008
1015 1015 1015 1015
1022
1029 1029 1029 1029
1105 1105 1105 1105
1112 1112 1112 1112
1119
1126 1126 1126 1126
1203 1203 1203 1203
1210 1210 1210 1210
1217
1224 1224 1224 1224 1224 1224 1224 1224
1231 1231 1231 1231

Assume additional records will have different values. Short of doing this by hand, I've been unable to solve it. I tried combinations of sed, awk, and grep scripts with no success. Any help would be appreciated.

Scrutinizer · March 14, 2013, 1:38pm

Try:

awk '{for(i=1; i<NF; i++) $i=$i ($(i+1)==$i?FS:RS)}1' OFS=
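
For readers new to awk, the one-liner above can be unpacked roughly as follows. This is only an annotated sketch of the same idea, not a different method; the wrapper name split_runs and the sample input are made up for illustration.

```shell
#!/bin/sh
# split_runs is a hypothetical wrapper name, used here only for illustration.
split_runs() {
    awk '
    {
        # Walk every field except the last and glue a separator onto it:
        # a space (FS) if the next field is equal, a newline (RS) otherwise.
        for (i = 1; i < NF; i++)
            $i = $i ($(i + 1) == $i ? FS : RS)
        # Assigning to a field makes awk rebuild $0 with OFS between fields.
        # OFS is empty here, so only the separators glued on above remain.
        print
    }' OFS= "$@"
}

printf '%s\n' '0903 0903 0910 0910 0910 0924' | split_runs
# prints:
# 0903 0903
# 0910 0910 0910
# 0924
```

Each input line is handled on its own, so a run of equal values that continues onto the next input line is not joined.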

tree · March 14, 2013, 3:00pm

Totally answered my problem. I've been working on this for a week. I finished reading the O'Reilly book on bash scripting but couldn't find the answer. It's good but doesn't go into much detail on sed or awk. Thanks.

Corona688 · March 14, 2013, 3:06pm

awk is its own programming language; it's hard to cover it in detail without that turning into a book of its own. Not a difficult language, mind you, but quite different.

rdrtx1 · March 14, 2013, 3:26pm

Try also (in case the same values wrap onto the next line):

awk 'NR==1{w=$1} {for(i=1; i<=NF; i++) {printf "%s", ($i==w ? $i" " : "\n"$i" "); w=$i}} END {print ""}' infile
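
The carry-over idea above (remember the last field seen, even across input lines) can also be written so that no trailing blanks are printed. The following is a minimal sketch under that assumption; merge_runs, seen, and prev are illustrative names, not part of the original post.

```shell
#!/bin/sh
# merge_runs is a hypothetical name; it joins equal neighbours even when a
# run continues on the next input line.
merge_runs() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            # seen is 0 only for the very first field of the whole input,
            # so no separator is printed in front of it.
            if (seen++)
                printf "%s", ($i == prev ? " " : "\n")
            printf "%s", $i
            prev = $i    # prev survives from one record to the next
        }
    }
    END { if (seen) print "" }' "$@"
}

printf '%s\n' '0910 0910' '0910 0917' | merge_runs
# prints:
# 0910 0910 0910
# 0917
```

The separator is printed before each field instead of after it, which is why the lines end cleanly.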

Yoda · March 14, 2013, 3:32pm

If you are interested, here is a solution using bash:

#!/bin/bash

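while read -r line
do
        for c_num in $line
        do
                # Start a new output line only when the value changes, never before the first value
                if [[ -z "$p_num" || "$c_num" == "$p_num" ]]
                then printf "%s " "$c_num"
                else printf "\n%s " "$c_num"
                fi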
                p_num="$c_num"
        done
done < file
printf "\n"

DGPickett · March 14, 2013, 4:37pm

One approach is to put every field on its own line, which makes the data homogeneous, and then let my standard sed looper merge equal neighbouring lines:

tr ' ' '\12' < in_file | sed '
  :loop
  $q
  N
  s/\(....\)\n\1/\1 \1/
  t loop
  P
  s/.*\n//
  t loop
 ' > out_file

But this might mess up when one input line ends and the next begins with the same number, since those get merged too. In some apps that might be great; you can put a "| sort" after the "tr" to merge numbers that are far apart, or a "| sort | uniq -c" to reduce them to a count.
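
As a rough illustration of the "| sort | uniq -c" variant mentioned above, here is a small sketch. The function name count_values and the sample data are invented for the example.

```shell
#!/bin/sh
# count_values is a hypothetical name: one field per line, sorted,
# then collapsed to "count value" pairs.
count_values() {
    tr ' ' '\n' | sort | uniq -c
}

printf '%s\n' '0903 0903 0910' '0910 0924' | count_values
```

For this sample the counts come out as 2 for 0903, 2 for 0910, and 1 for 0924. How uniq -c pads the count column varies between systems, so treat the exact spacing as unspecified.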

Maybe pure sed is actually better yet:

sed '
  s/ /\
/g
  s/\(....\)\n\1/\1 \1/g
  s/\(....\)\n\1/\1 \1/g
 ' in_file > out_file

Cheap trick: make all the spaces line feeds, then make them back into spaces wherever the neighbours are equal. There's a lesson about negative cases there. Mostly, line feed was a substitute character that was certain not to be in use already. Once I swapped line feed and form feed so I could sed pages into insert statements (one page per row in one column) and then reversed the form feeds back to line feeds. Note that you have to substitute twice, once for the odd and once for the even spaces. Also, if you know what a string is, you do not have to source the original bytes; any duplicated four-character value looks the same!
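
To see why the substitution has to run twice, compare one pass with two on a run of three equal values. This is just a sketch of the pure sed approach above, reading standard input instead of in_file; one_pass and two_pass are made-up names.

```shell
#!/bin/sh
# Hypothetical helpers: the same script as above, with one or two
# merge passes. Like the original, they assume 4-character fields.
one_pass() {
    sed '
      s/ /\
/g
      s/\(....\)\n\1/\1 \1/g
    '
}

two_pass() {
    sed '
      s/ /\
/g
      s/\(....\)\n\1/\1 \1/g
      s/\(....\)\n\1/\1 \1/g
    '
}

printf '%s\n' '0910 0910 0910 0917' | one_pass
# prints:
# 0910 0910
# 0910
# 0917

printf '%s\n' '0910 0910 0910 0917' | two_pass
# prints:
# 0910 0910 0910
# 0917
```

With the g flag the matches cannot overlap, so the first pass joins values 1 and 2 but has already consumed value 2 when it reaches the boundary between 2 and 3. The second pass picks up those leftover boundaries.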

anbu23 · March 15, 2013, 1:26am

$ sed "s/$/ /;s/\([^ ]* \)\(\1\)*/&\\
/g" f | sed "/^$/d"
0903 0903
0910 0910 0910 0910 0910 0910
0917 0917 0917 0917
0924
1001 1001 1001 1001
1008 1008 1008 1008
1015 1015 1015 1015
1022
1029 1029 1029 1029
1105 1105 1105 1105
1112 1112 1112 1112
1119
1126 1126 1126 1126
1203 1203 1203 1203
1210 1210 1210 1210
1217
1224 1224 1224 1224 1224 1224 1224 1224
1231 1231 1231 1231

devtakh · March 15, 2013, 1:53am

 tr '\n' ' ' < file2 | awk 'BEGIN{FS=OFS=" "}{
    for (i=1;i<=NF;i++){
        if ( i == 1 )printf("%s ",$1);
        else { if ( $i != $(i-1))printf("\n%s ",$i);
              else printf("%s ", $i)}
    }
}
END{printf("\n")}'

0903 0903
0910 0910 0910 0910 0910 0910
0917 0917 0917 0917
0924
1001 1001 1001 1001
1008 1008 1008 1008
1015 1015 1015 1015
1022
1029 1029 1029 1029
1105 1105 1105 1105
1112 1112 1112 1112
1119
1126 1126 1126 1126
1203 1203 1203 1203
1210 1210 1210 1210
1217
1224 1224 1224 1224 1224 1224 1224 1224
1231 1231 1231 1231

cheers,
Devaraj Takhellambam