Help with appending random sequence to huge CDR file

Hi,

I am in a terrible emergency. I have multiple CDR files, each with a line count > 6000.
I need to append |0| | | | | | | |random| to the end of each line. The random number should never repeat.

Please help with a shell script to process all CDRs in a directory with the above requirement.

What have you tried?

What should be the range of the random numbers?

Any range would be fine; all that is needed is that it be unique across 3 million CDRs.


For the random number, I have figured that date +%N can be used. I am not able to come up with the final working script.

---------- Post updated at 04:14 PM ---------- Previous update was at 04:13 PM ----------

This is what I am trying now:

cntLoop=$date +%N
INP_CSV_FILE="/data101/rating/cs5_upload/med_dir/postpaid/dupgprs/data/testcdr"
OUT_CSV_FILE="/data101/rating/cs5_upload/med_dir/postpaid/dupgprs/data/outfile.csv"
pattern='|0| | | | | | | |'
rm -f $OUT_CSV_FILE
for line in `cat $INP_CSV_FILE`
do
        cntloop=$date +%N
        echo $line|0| | | | | | | |$cntloop| >> $OUT_CSV_FILE
done

---------- Post updated at 04:15 PM ---------- Previous update was at 04:14 PM ----------

I am getting the errors below:

./test.sh: line 1: +%N: command not found
./test.sh: line 9: syntax error near unexpected token `|'
./test.sh: line 9: `        echo "$line|0| | | | | | | |$cntloop| " >> $OUT_CSV_FILE'
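Both errors come from shell syntax: `cntLoop=$date +%N` expands the (empty) variable `$date` and then tries to run `+%N` as a command, and the unquoted `|` characters in the echo are parsed as pipeline operators. A minimal sketch of the two fixes (the sample line is made up, and note %N is a GNU date extension that Solaris date may not support):

```shell
#!/bin/sh
# $(...) is command substitution: it runs date and captures its output.
# (%N is a GNU date extension -- on Solaris it may print a literal %N.)
cntLoop=$(date +%N)

# The pipes must be inside quotes, otherwise the shell reads them
# as pipeline operators -- that is the "syntax error near `|'".
line="0|1|0|20140406020532"
echo "${line}|0| | | | | | | |${cntLoop}|"
```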

How about a number like 0000(number of file)0000(line number)? That's going to be unique. A truly random one runs the risk of not being.

That would be perfect

How does this work on one file:

awk 'FNR==1 {
        FNUM++
        if(LF) close(LF);
        LF=FILENAME".out"
}
{    printf("%s|0| | | | | | | |%08d%08d|\n", FNUM, FNR) > LF; }' input.cdr

awk: cmd. line:5: { printf("%s|0| | | | | | | |%08d%08d|\n", FNUM, FNR);> LF }
awk: cmd. line:5: ^ syntax error

---------- Post updated at 04:34 PM ---------- Previous update was at 04:31 PM ----------

Sample CDR format: I need to append the extra columns mentioned below to the end of each line, plus a random number followed by |, so as to make each CDR undoubtedly unique.

0|1|0|20140406020532| |205| |5|0|620| |502| | |999933992| |3| | | |0|V:11:620:74043720:74043100|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0|0:0:0:0:0| |550|internet|502|0|0| |3333| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |

There were some mistakes in my original try; how about this:

awk 'FNR==1 {
        FNUM++
        if(LF) close(LF);
        LF=FILENAME".out"
}
{    printf("%s|0| | | | | | | |%08d%08d|\n", $0, FNUM, FNR) > LF; }' input.cdr
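To sanity-check the format string on a single record (the file number is hard-coded to 1 here, purely for illustration):

```shell
# One made-up record; the real CDRs are pipe-delimited like this.
printf 'a|b|c| |\n' > input.cdr

# Same printf as the script, with FNUM fixed at 1 for the demo.
awk '{ printf("%s|0| | | | | | | |%08d%08d|\n", $0, 1, FNR) }' input.cdr
# -> a|b|c| ||0| | | | | | | |0000000100000001|

rm -f input.cdr
```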

Looks perfect, friend. Just please tell me how to pick up all the files in a directory and do the above appending. I am asking because I have to deal with exactly 45-50 files, with 60k CDRs in each file.

Do these files have anything in common with each other? What do they look like?

After %s, a space was needed to provide column width between the last | and the first | we added. I made that change to the script you provided.

---------- Post updated at 04:45 PM ---------- Previous update was at 04:44 PM ----------

In post #10, I provided a sample CDR; these files will have n number of lines in the same CDR format.

Their names, I mean. How can I tell the files you want?

For getting all the files in a folder, can't we use

cd /dupgprs/data/
for file in `ls -ltr |head -6000 |awk '{print $9}'`

It's not my own idea; it's some existing code for merging files. Instead of 6000, if we use 1, will it work? I tried, but the entire script doesn't give the needed result.

---------- Post updated at 04:51 PM ---------- Previous update was at 04:50 PM ----------

For the filenames, I can use something like consolidated00[n], where n is 1, 2, 3, etc. I have flexibility on file names.

Can we? That's the question. Do the folders contain only files you want changed? Or do they have anything you want left alone?

And why stop at 6000?

The folder will only contain what I need to change, nothing else.

I wish you had showed me, rather than described, the changes you made... Now I have to guess where you put the space.

find /dupgprs/data/ -type f |
        grep -v "\.out" | # Ignore previously parsed files
        xargs awk 'BEGIN { getline FNUM < "/tmp/FNUM"; close("/tmp/FNUM"); }
END   { printf("%d\n", FNUM) > "/tmp/FNUM";     }
FNR==1{ FNUM++; }
{ printf("%s |0| | | | | | | |%08d%08d|\n", $0, FNUM, FNR) > FILENAME".out"; }'

The BEGIN { } and END { } code is there to load/store the file number in /tmp/FNUM, so it gets saved and restored between different calls of awk (of which there will likely be several, to accommodate several thousand files).
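The handoff is easy to see with two separate awk invocations sharing a counter file (a mktemp path stands in for /tmp/FNUM here):

```shell
state=$(mktemp)

# First call: the state file is empty, so FNUM starts unset (0).
echo x | awk -v SF="$state" 'BEGIN { getline FNUM < SF; close(SF) }
FNR==1 { FNUM++ }
END    { printf("%d\n", FNUM) > SF }'

# Second call picks up the saved count and increments it again.
echo y | awk -v SF="$state" 'BEGIN { getline FNUM < SF; close(SF) }
FNR==1 { FNUM++ }
END    { printf("%d\n", FNUM) > SF }'

cat "$state"    # -> 2
rm -f "$state"
```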

Use nawk on solaris.

Let me try.