Printing from col x to end of line, except last col

LMHmedchem · January 7, 2013, 1:06pm

Hello,

I have some tab-delimited data and I need to move the last column. I could hard-code it,

awk '{ print $1,$NF,$2,$3,$4,etc }' infile > outfile

but it would be nice to know the syntax to print a range of cols.

I know in cut you can do,

cut -f 1,4-8,11-

to print fields 1, 4,5,6,7,8, and from 11 to the end of the line, but I don't know if cut has a variable for the last col like $NF.

Since I am moving $NF from the end, it shouldn't be printed again, but that complicates things a bit more. If it were printed again, it would be easy enough to get rid of it in a second step, but it seems as if there should be a straightforward way to do this in one step.

Suggestions are appreciated,

LMHmedchem

Corona688 · January 7, 2013, 2:49pm

As far as I know awk doesn't have a syntax to print a range of cols.

awk can cheat by letting you assign to special variables, though.

awk '{
        $2=$NF" "$2; # Squeeze an extra column into column 2
        NF=5;        # Limit it to 5 columns
} 1                  # Print everything'
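
Since awk has no built-in range syntax, the usual workaround is a loop over the field numbers. The following is a minimal sketch of that idea, assuming tab-delimited input; print_fields is a hypothetical helper name, not a standard tool.

```shell
# print_fields FROM TO: print tab-separated fields FROM..TO of each input line.
# (print_fields is a hypothetical helper, introduced only for illustration.)
print_fields() {
    awk -F '\t' -v from="$1" -v to="$2" '{
        out = ""
        # Stop at TO or at the last field, whichever comes first
        for (i = from; i <= to && i <= NF; i++)
            out = out (i > from ? "\t" : "") $i
        print out
    }'
}

# Fields 2 through 4 of a five-column line
printf 'a\tb\tc\td\te\n' | print_fields 2 4
```

Passing a very large TO value (or NF-1 computed inside awk) would cover the "from col x to the end, except the last col" case.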

rdrtx1 · January 7, 2013, 3:52pm

Also, building on Corona688's example, to move the last column and keep the tab delimiters, try:

awk -F"\t" '{ $2=$NF OFS $2; NF--; } 1 ' OFS="\t" input
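
The one-liner above relies on awk rebuilding the record when NF is decremented, which some older awk implementations may not do. As a hedged alternative, here is a sketch that builds the output line explicitly instead of changing NF; move_last_to_second is a hypothetical name used only for this example.

```shell
# move_last_to_second: print field 1, then the last field, then fields 2..NF-1.
# (Hypothetical helper; assumes tab-delimited input.)
move_last_to_second() {
    awk 'BEGIN { FS = OFS = "\t" }
    NF < 2 { print; next }          # nothing to move on lines with 0 or 1 fields
    {
        line = $1 OFS $NF
        for (i = 2; i < NF; i++)    # i < NF leaves out the last field
            line = line OFS $i
        print line
    }'
}

printf 'id\tx\ty\tclass\n' | move_last_to_second
```

Because the loop stops before NF, the last field is printed only once, in its new position.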

LMHmedchem · January 7, 2013, 4:41pm

Thanks, this appears to be working. I am also trying to replace the first instance of "C0014" with "class". It seems like either of these should work,

sed 's/C0014/class/1'
sed 's/C0014/class/'

but both are replacing all instances of C0014 with class, as if I was using /g.

What am I missing here?

LMHmedchem

rdrtx1 · January 7, 2013, 4:47pm

Assuming the awk script is also being used to do the replace, try:

awk -F"\t" '{ if (!r) r=sub("C0014","class"); $2=$NF OFS $2; NF--; } 1 ' OFS="\t" input

LMHmedchem · January 7, 2013, 4:52pm

Thanks, that worked fine and it's nice to have it in one step. Do you have any idea why the sed command doesn't work?

I was a bit unclear: I need to pass in a value for the string that will be replaced by "class", meaning it will not be C0014 every time.

SETS="C0014"
awk -F"\t" '{ if (!r) r=sub("$SETS","class"); $2=$NF OFS $2; NF--; } 1 ' OFS="\t"  'pass_'$SETS'_filter_.txt'  >  PASS1.txt

This does not work, so I guess I need to assign the value of $SETS to an awk variable?

LMHmedchem

rdrtx1 · January 7, 2013, 5:28pm

In the awk script, try:

SETS="C0014"
awk -F"\t" -v sets="$SETS" '{ if (!r) r=sub(sets,"class"); $2=$NF OFS $2; NF--; } 1 ' OFS="\t" "pass_${SETS}_filter_.txt"
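
To illustrate why the earlier attempt failed: the awk program sits inside single quotes, so the shell never expands $SETS and awk receives the literal text "$SETS" as its pattern. A small sketch, using made-up sample input, comparing that with the -v form:

```shell
SETS="C0014"

# Inside single quotes the shell leaves $SETS alone; awk gets the string "$SETS",
# which does not match the data, so the line comes out unchanged.
literal=$(printf 'C0014\tx\n' | awk '{ sub("$SETS", "class") } 1')

# -v copies the shell value into the awk variable "sets" before the program runs.
passed=$(printf 'C0014\tx\n' | awk -v sets="$SETS" '{ sub(sets, "class") } 1')

printf '%s\n%s\n' "$literal" "$passed"
```

Note that sub() treats its first argument as a regular expression, so a value containing characters such as "." or "*" would need escaping.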

The sed script above:

sed 's/C0014/class/' file

replaces only the first occurrence on each line, not just the first occurrence in the file.
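
The difference between "first on each line" and "first in the whole input" can be seen with a two-line sample. This is only a sketch on made-up data; the awk variant uses a flag variable (here called replaced, an arbitrary name) in the same way as the if (!r) test earlier in the thread.

```shell
sample='C0014 C0014
C0014'

# sed without the g flag: the first match on EACH line is replaced
per_line=$(printf '%s\n' "$sample" | sed 's/C0014/class/')

# awk with a flag: only the first match in the whole input is replaced
first_only=$(printf '%s\n' "$sample" |
    awk '!replaced && sub(/C0014/, "class") { replaced = 1 } 1')

printf '%s\n---\n%s\n' "$per_line" "$first_only"
```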

LMHmedchem · January 7, 2013, 8:05pm

Well that explains why sed was replacing all instances, since there was only one per line.

I am having trouble with some nested conditionals.

# if compounds were found for the class
if [ $MOLFILECOUNT -gt 0 ]
  then

#   sort list of class structures
    data_sort_rows_headerName.sh  $SETS'_sort_'*'.txt' \
                                  PASS1.txt  \
                                  SORT1.txt \
                                  '_makesdf_'$SETS'_'$MOLFILECOUNT'_'$DATASOURCE'_'$DATE_CODE'.txt'

# if the output file does not exist, use SORT1 to start the file
   if[ ! -f "$OUTPUFILE" ]
   then
      cp  SORT1.txt  TEMPOUTPUT1
   else
      cat TEMPOUTPUT1  SORT1.txt  > TEMPOUTPUT2
   fi

fi

This code is throwing a syntax error,

line 111: syntax error near unexpected token `then'
line 111: ` then'

The then that the error refers to is the one after the inner if. I know I have used syntax like this before, so I'm not sure what the issue could be.

LMHmedchem

rdrtx1 · January 7, 2013, 8:12pm

try putting a space after "if":

if [ ! -f "$OUTPUFILE" ]
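
Some background on why the space matters: "[" is an ordinary command name (a synonym for test), so the shell needs whitespace to tell it apart from the keyword "if". Written as "if[", the two run together into a single word that is not a keyword, and the following "then" is reported as unexpected. A minimal sketch of the corrected form, with check_file as a hypothetical function name:

```shell
# check_file PATH: report whether PATH is an existing regular file.
# (check_file is a hypothetical helper for illustration.)
check_file() {
    if [ ! -f "$1" ]      # spaces after "if", after "[" and before "]"
    then
        echo "missing"
    else
        echo "exists"
    fi
}

check_file /nonexistent/path/for/demo
```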