SED on AIX Limitation

nemesis.spa · February 10, 2011, 7:15am

Hello,

I have a problem with a script written in ksh for Linux (tested on Debian 5.0, Ubuntu Server 10.04 and RHEL 5.1), where it works properly.
I am trying to port it to AIX 5.3. :wall:
The problem is the 256-character limit on commands in the system and in sed.

I need to cut the contents of a file of N characters into chunks of 1850 characters separated by '\n'. The command used on Linux is:

cat $TMP_MAP_FILE | sed 's/^\(.\{'1850'\}\)/&\n/g' > $map_file

The error that appears is the following:

sed: 0602-404 Function s/^\(.\{1850\}\)/&\n/g cannot be parsed.

Let me know if you have any ideas or another way to do it. I am running tests with dd
and its block size.

And sorry for my English.
Thank you very much!

Franklin52 · February 10, 2011, 7:27am

Have you tried the fold command?

fold -w 1850 file
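
A minimal sketch of the fold approach, for anyone hitting the same 0602-404 error. POSIX only requires regular expressions to accept interval counts up to RE_DUP_MAX (at least 255), so an expression such as \{1850\} is not portable; that is probably what AIX sed rejects here, although the thread does not confirm it. The function names chunk_fixed and chunk_fixed_awk are hypothetical, and the awk variant is only an alternative in case fold is not available.

```shell
# chunk_fixed (hypothetical name): wrap each input line every $1 columns.
# fold counts display columns with -w; add -b to count bytes instead.
chunk_fixed() {
  fold -w "$1"
}

# chunk_fixed_awk (hypothetical name): the same split done with substr(),
# which avoids regular-expression interval limits entirely.
chunk_fixed_awk() {
  awk -v n="$1" '{
    for (pos = 1; pos <= length($0); pos += n)
      print substr($0, pos, n)
  }'
}
```

Usage would look like `chunk_fixed 1850 < "$TMP_MAP_FILE" > "$map_file"`. Neither version relies on `\n` in a sed replacement, which is also not portable across sed implementations.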

nemesis.spa · February 10, 2011, 7:46am

I had not tried it, and it works. (I did not know the command.) :o

But I have a second problem.

The second part is to format the resulting strings by splitting them into fields of different sizes separated by ';', like a .csv file.

The code is as follows:

cat $TMP_MAP_FILE | sed 's/^\(.\{'$j'\}\)/&;/g' > $MAP_FILE

Thank you very much

Franklin52 · February 10, 2011, 7:57am

Can you post an example of the input file and the desired output?

nemesis.spa · February 10, 2011, 8:01am

Of course...

INPUT:
aaaaaabbbbbbcccccccccddddddddeeeffffffgggggggggggg

OUTPUT:
aaaaaa;bbbbbb;ccccccccc;dddddddd;eee;ffffff;gggggggggggg;

One of the fields has 1600 characters

PS: I have tried:

perl -pi -e 's/^\(.\{'1845'\}\)/&;/' zz

Thanks!

Franklin52 · February 10, 2011, 8:26am

How do you determine the width of the individual fields?

nemesis.spa · February 10, 2011, 10:18am

I am copying part of the script. This code is inside a loop that receives the files.
Sorry for the comments; they were originally written in Spanish.

        # Variables holding the header, record and fragmentation lengths
        CHAR_CABECERA=22        # !!!!!!! 1 must be added to the header length
        CHAR_CABECERA=`expr $CHAR_CABECERA + 1`
        CHAR_REGISTRO=1850
        # If the last field is removed there will be no separator at the end
        # These values come from the width of each field: the first one is used unmodified, and each following one is added to the previous value + 1 (the separator character)
        set -A SEPARADORES 14 21 31 41 49 59 62 75 86 97 104 1305 1406 1707 1709 1714 1723 1727 1731 1736 1739 1742 1758 1774 1778 1782 1784 1789 1805 1808 1812 1815 1818 1823 1827 1829 1831 1833 1835 1837 1848 1889 1892
        # Loop over all the files found with a pattern
        for i in `cat $LST_TMP_PUS`
        do
                # Temporary files where the mappings will be done
                CABECERA=$TMP_DIR/$i.Ncab
                TMP_MAP_FILE=$TMP_DIR/$i.Tmap
        ## CHANGE THE DIRECTORY THE DATA IS DUMPED TO
                MAP_FILE=$RES_DIR/$i.map
                # Remove the header, cut the file by record length, leave one record per line and remove the last blank line
                # cut -b$CHAR_CABECERA- $i | sed 's/\(.\{'$CHAR_REGISTRO'\}\)/&\n/g' > $CABECERA
                cut -b$CHAR_CABECERA- $i | sed 's_\(.\{'$CHAR_REGISTRO'\}\)_&\n_g' > $CABECERA
                # Dump the file contents, with the records not yet split into fields, into the file that will hold the final data
                cat $CABECERA > $TMP_MAP_FILE
                # Loop in which we format the file
                        echo "Comienzo fichero $i separadores"
                for j in ${SEPARADORES[*]}
                do
                        echo "Problema en los separadores"
                        # Format the file record by record; note that each position must be added to the previous one + the separator character
                        # cat $TMP_MAP_FILE | sed 's/^\(.\{'$j'\}\)/&;/g' > $MAP_FILE
                        cat $TMP_MAP_FILE | sed 's_^(.{$j})_&;_g' > $MAP_FILE
                        cat $MAP_FILE > $TMP_MAP_FILE
                        cat $MAP_FILE >> $RES_PUS
                # End - loop in which we format the file
                done
                # Remove the temporary files
                rm $CABECERA $TMP_MAP_FILE $MAP_FILE
        # End - loop over all the files found with a pattern
        done
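
The SEPARADORES values in the script follow a running-total rule: the first field width is used as-is, and every later entry is the previous entry plus the next field width plus 1 for the separator already inserted. A minimal sketch of that arithmetic, assuming the field widths are known; the function name cumulative_positions is hypothetical, and the widths 14 6 9 9 are only inferred from the differences between the first array entries.

```shell
# cumulative_positions (hypothetical name): turn a space-separated list of
# field widths into the insertion positions used by the sed loop.
cumulative_positions() {
  awk -v widths="$1" 'BEGIN {
    n = split(widths, w, " ")
    pos = 0
    for (i = 1; i <= n; i++) {
      # First field: its own width. Later fields: previous position
      # + width + 1 for the separator inserted in the previous pass.
      pos = (i == 1) ? w[i] : pos + w[i] + 1
      printf "%s%s", pos, (i < n ? " " : "\n")
    }
  }'
}
```

With the inferred widths, `cumulative_positions "14 6 9 9"` prints `14 21 31 41`, which matches the start of the array.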

Franklin52 · February 10, 2011, 12:47pm

If the values of the array SEPARADORES are the positions of the semicolons, you can create a file with the values of the array separated by spaces and run the awk command below.

The file SEPARADORES_file should look like:

14 21 31 41 ... 1848 1889 1892

awk 'NR==FNR{c=split($0,a); next} {
  s=substr($0,1,a[1])
  for(i=2;i<=c;i++) {
    s=s ";" substr($0,a[i-1]+1,a[i]-a[i-1])
  }
  s=s ";" substr($0, a[c]+1)
  print s
}' SEPARADORES_file main_file
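
A variant sketch that works from field widths instead of positions and appends a ';' after every field, as in the desired output posted earlier. It assumes the widths are passed as a space-separated list; the function name split_fields is hypothetical. Note that the SEPARADORES values in the script already include the inserted separators, so they are not plain widths and would need converting first.

```shell
# split_fields (hypothetical name): $1 is a space-separated list of field
# widths; each input line is cut into those fields, each followed by ';'.
split_fields() {
  awk -v widths="$1" '
    BEGIN { n = split(widths, w, " ") }
    {
      out = ""
      pos = 1
      for (i = 1; i <= n; i++) {
        out = out substr($0, pos, w[i]) ";"
        pos += w[i]
      }
      print out
    }'
}
```

For example, `printf 'aabbbc\n' | split_fields "2 3 1"` prints `aa;bbb;c;`. Because the split uses substr() rather than a regular expression, the field width (1600 characters for one field in this case) does not run into interval-count limits.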

nemesis.spa · February 11, 2011, 6:20am

Hi, I am now busy with another problem and have not been able to test what you suggested.

I will report back as soon as I do, but thank you very much for helping me.