Cut output to same byte position

Hi folks

I have a file with thousands of lines with fixed length fields:

sample (assume x is a blank space)

111333xx444TTTLKOPxxxxxxxxx

I need to make a copy of this file that contains only some of the field positions. For example, I'd like to print bytes 4-5 and 15-16 of the sample and have them stay in the same character positions in the new file:

xxx33xxxxxxxxxLKxxxxxxxxxxx

I started looking at cut -b4-5,15-16, but its output lands in positions 1-4 instead of staying in positions 4-5 and 15-16 with blank spaces everywhere else.

Any help would be appreciated.

Try awk. Use the substr($0,Starting_Position,Length) function to cut specific bytes from each line of the file.

e.g.
111333xx444TTTLKOPxxxxxxxxx

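awk '
{
# fillers carry the original bytes through unchanged; fild1 and fild2 are the fields of interest
filler1=substr($0,1,3)    # awk string positions start at 1
fild1=substr($0,4,2)
filler2=substr($0,6,9)    # positions 6-14
fild2=substr($0,15,2)
filler3=substr($0,17,11)
printf("%s%s%s%s%s\n",filler1,fild1,filler2,fild2,filler3)
}' file1 > out_file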

Note: Adjust the field positions to match your actual file layout.

--Manish Jha
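For comparison, here is a minimal Python sketch of the same substr idea in which the fillers are runs of blanks rather than copies of the input, so the kept fields stay in their original columns. The helper names `substr` and `blank` are made up for this illustration, and the sample line assumes the x's in the question stand for real spaces.

```python
def substr(s, start, length):
    """1-based substring, mirroring awk's substr(s, start, length) for start >= 1."""
    return s[start - 1:start - 1 + length]


def blank(n):
    """A run of n spaces, used where input bytes are dropped."""
    return " " * n


# Sample line from the question, with real spaces in place of the x's (27 bytes).
line = "111333  444TTTLKOP         "

out = (
    blank(3)                 # positions 1-3 dropped
    + substr(line, 4, 2)     # positions 4-5 kept
    + blank(9)               # positions 6-14 dropped
    + substr(line, 15, 2)    # positions 15-16 kept
    + blank(len(line) - 16)  # everything after position 16 dropped
)
print(repr(out))
```

The output has the same length as the input, which is the property the plain cut -b approach loses.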

Hi Manish

This does print the correct positions, but it does not replace the bytes in between with the same number of blanks, so the output does not line up with the original. For example, take the following 2 lines.

ABC123DEF
GEH456JKL
Say I want to print position 1-2 and 6-8 and want everything in between turned to blank spaces so the byte positions from my input are in the same positions as my output.

Say x is a blank space the output should look like:
ABxxx3DEx
GExxx6JKx

The suggested awk with substr gives me the right substrings but in the wrong position in the new file:
AB3DE
GE6JK

Thanks for any suggestions

echo "ABC123DEF" | sed "s/\(.\{2\}\)\(.\{3\}\)\(.\{3\}\).*/\1   \3 /"

sed "s/\(.\{2\}\)\(.\{3\}\)\(.\{3\}\).*/\1   \3 /"

\(.\{2\}\) matches the first two characters and saves them as \1

\(.\{3\}\) matches the next three characters (saved as \2, which is not used in the replacement)

\(.\{3\}\) matches the three characters after those and saves them as \3

.* matches the rest of the line

I added three blanks between \1 and \3 and one blank after \3, so the characters dropped from the input are replaced with blanks.
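The same capture-group substitution can be sketched in Python with the standard re module. This is only an illustration of how the groups line up, not a replacement for the sed command; the function name `blank_middle` is invented here.

```python
import re

# Group 1: first two characters, group 2: next three (dropped),
# group 3: the three after that, then the rest of the line.
PATTERN = re.compile(r"^(.{2})(.{3})(.{3}).*$")


def blank_middle(line):
    """Keep groups 1 and 3, put three blanks where group 2 was and one blank for the tail."""
    return PATTERN.sub(r"\1   \3 ", line, count=1)


for sample in ("ABC123DEF", "GEH456JKL"):
    print(repr(blank_middle(sample)))
```

As with the sed version, the single trailing blank only matches the input width when exactly one character follows the third group, and lines shorter than eight characters pass through unchanged.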

Is this ok ?

awk ' { printf "%sxxx%sx\n", substr($0, 1, 2),  substr($0, 6, 3) }'  filename

ABxxx3DEx
GExxx6JKx

Thanks everyone.

My samples are simplified; what I actually have are lines 300 bytes long.

I need to print bytes 1-3, 203-240, and 260-289, with blank spaces in between so the output remains in the same positions. This is a problem for sed, because typing 200 spaces into the replacement doesn't make sense. The suggestions above are fine for my samples, which are 10-12 bytes long, but not for 300-byte lines where I only need to print a few bytes across the line.
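One way around typing the blanks by hand is to compute each gap from the field positions. The Python sketch below does that; the function name `project` is hypothetical, ranges are 1-based and inclusive as in the post, and it assumes the ranges do not overlap and every line is at least as long as the last range.

```python
def project(line, ranges):
    """Copy only the given 1-based inclusive ranges, padding the gaps with blanks."""
    pieces = []
    pos = 0  # number of output columns produced so far
    for start, end in sorted(ranges):
        pieces.append(" " * (start - 1 - pos))  # blanks for the gap before this field
        pieces.append(line[start - 1:end])      # the field itself
        pos = end
    pieces.append(" " * (len(line) - pos))      # blanks for the tail
    return "".join(pieces)


# Ranges from the post; the 300-byte test line is made up for the example.
RANGES = [(1, 3), (203, 240), (260, 289)]
demo = "".join(chr(ord("A") + i % 26) for i in range(300))
result = project(demo, RANGES)
print(len(result))
```

The gap widths (199, 19 and 11 blanks here) fall out of the arithmetic, so changing a range does not require recounting spaces.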

It's an interesting issue; I never thought it would be so tricky to figure out when I started :cool:

Thanks for all suggestions

If you have Python, an alternative
Sample input:
-------------------
111333  444TTTLKOP
122333  444DDDLKOP
422333  4445DDLTlR

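#!/usr/bin/python
start1, end1 = 3, 6    # 0-based slice covering positions 4-6
start2, end2 = 14, 17  # 0-based slice covering positions 15-17
for line in open("test.txt"):
        chars = list(line.rstrip("\n"))  # drop only the newline; trailing blanks are part of the layout
        chars[start1:end1] = " " * (end1 - start1)  # substitute spaces at positions 4-6
        chars[start2:end2] = " " * (end2 - start2)  # substitute spaces at positions 15-17
        print(''.join(chars))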

output:

111     444TTT   P
122     444DDD   P
422     4445DD   R

try this,

though not efficient, some pointers to proceed with :slight_smile:

#! /bin/zsh

gen()
{
 appVar="x"
 cnt1=$1
 cnt2=$2
 var1=""
 var2=""

 while [ $cnt1 -gt 0 ]
 do
   cnt1=$(($cnt1 - 1))
   var1=$var1$appVar
 done

 while [ $cnt2 -gt 0 ]
 do
   cnt2=$(($cnt2 - 1))
   var2=$var2$appVar
 done
}

gen 3 1

awk '{ printf "%s %s\n", substr($0, 1, 2), substr($0, 6, 3) }' filename | while read line1 line2
do
echo $line1$var1$line2$var2
done

exit 0

Try...

awk '{for(i=1;i<=length;i++)
        printf ((i>=1&&i<=3||i>=203&&i<=240||i>=260&&i<=289)?substr($0,i,1):"x")
      printf ORS}' file1 > file2
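For readers more comfortable with Python, the per-character test above can be sketched as follows. The names `keep` and `mask_line` are invented for this example, and positions are 1-based to match the awk loop. Because the characters are joined rather than passed to printf as a format string, a literal % in the data should not need special handling here.

```python
RANGES = [(1, 3), (203, 240), (260, 289)]  # 1-based, inclusive, as in the awk script


def keep(i, ranges=RANGES):
    """True if 1-based position i falls inside any of the ranges."""
    return any(lo <= i <= hi for lo, hi in ranges)


def mask_line(line, ranges=RANGES, fill="x"):
    """Copy characters at kept positions and emit fill everywhere else."""
    return "".join(
        ch if keep(i, ranges) else fill
        for i, ch in enumerate(line, start=1)
    )


print(mask_line("ABC123DEF", ranges=[(1, 2), (6, 8)]))
```

Passing fill=" " gives blanks instead of the visible "x" placeholder.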

Thanks everyone, this last post from Ygor looks like the easiest option. I've been playing with it this morning but getting a syntax error I can't get past.

awk: syntax error near line 2
awk: illegal statement near line 2

Any ideas?

Thanks all! :slight_smile:

On Solaris, use nawk.

Nawk works better but still not 100%, the code is now:

nawk '{for(i=1;i<=length;i++)
printf((i>=1&&i<=3||i>=203&&i<=240||i>=260&&i<=289), substr($0,i,1))}' file_in > file_out

but my file in looks like
ABC SOLARIS 2200 MAIN STREET

My out file is getting 1's in place of the characters I want to print and 0's in place of the blank spaces I want:
1110000000000011111111100000111111111

I'm sure it's something simple but I'm not seeing it :cool:

Thanks for any input

You have altered the awk script. Why not try the code as posted?
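A likely reading of the 1's and 0's: in the altered script the range test itself is passed as the first argument to printf, so its truth value (1 or 0) is used as the format string and the character is never printed. The Python sketch below imitates that effect; the function name and the single range in the demo are made up for illustration and are not the ranges that produced the output quoted above.

```python
def altered_script_output(line, ranges):
    """Emit '1' or '0' per position, the way printf(<condition>, <char>) does in awk."""
    return "".join(
        str(int(any(lo <= i <= hi for lo, hi in ranges)))  # truth value, not the character
        for i in range(1, len(line) + 1)
    )


print(altered_script_output("ABC SOLARIS", [(1, 3)]))
```

Restoring the ?: expression from the posted script makes the condition choose between the character and the filler instead of being printed itself.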

You are correct, thanks Ygor! I was getting a syntax error with awk before, so I altered the script, then switched to nawk without reverting the changes I had made. The original does exactly what I need with a space in place of the "x".

Thanks very much!! :smiley: