Shell Programming and Scripting

Hi,

I have a file as follows:

ABCDEFGH|0987654321234567
ABCDEFGH|0987654321234523
ABCDEFGH|0987654321234556
ABCDEFGH|0987654321234545
POIUYTRE|1234567890890678
POIUYTRE|1209867757352567
POIUYTRE|5463879088797131
POIUYTRE|5468980091344456

Please provide me the split command.

I want to split the above file according to characters 1-8 of each line.

The output files should be:

file1

ABCDEFGH|0987654321234567
ABCDEFGH|0987654321234523
ABCDEFGH|0987654321234556
ABCDEFGH|0987654321234545
file2
POIUYTRE|1234567890890678
POIUYTRE|1209867757352567
POIUYTRE|5463879088797131
POIUYTRE|5468980091344456

Do you want your file split into two parts from the middle?

Your requirement "1-8 character" is not clear.
Please explain.

As I said above, after the splitting process I need 2 separate files, file1 and file2.

for i in $(cut -d'|' -f 1 test | sort | uniq); do grep "^$i|" test > "${i}.txt"; done

If your file is the same as you said,
then use this:
split -l 4 file.txt

Now there will be two files named xaa and xab with the contents you want.

Anchal.

For testing purposes I have given only 4 records.

Actually there may be lakhs (hundreds of thousands) of them. What is the method in that case?

split -l is the best option; do a man split and read about the -l option.

The following options are supported:

 -linecount | -l linecount
       Number of lines in each piece. Defaults to 1000 lines.

1) Run wc -l filename. Once you know the total number of lines, you can decide whether to split the file into 2, 3, or n smaller files.

For example, if your file has 100000 lines:

split -l 50000 filename will give you two files, xaa and xab, with 50000 lines each.

If you do

split -l 20000 filename it will give you the files xaa, xab, xac, xad, and xae with exactly 20000 lines in each.
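
The arithmetic above can be sketched as a few lines of shell. This is only an illustration: the file name bigfile.txt, the prefix part_ and the variable names are made up for the example, and it is best run in a scratch directory. Note that split -l cuts purely by line position, so it groups records by key only when the input is already ordered that way.

```shell
# Hypothetical sample: 10 numbered lines standing in for a large file.
awk 'BEGIN { for (i = 1; i <= 10; i++) print i }' > bigfile.txt

# Remove pieces left over from an earlier run.
rm -f part_*

pieces=3                                   # how many output files we want
total=$(wc -l < bigfile.txt)               # total number of lines
# Round up so the remainder does not spill into an extra piece.
per_piece=$(( (total + pieces - 1) / pieces ))

# 10 lines / 3 pieces -> 4 lines per piece; the last piece gets the remaining 2.
split -l "$per_piece" bigfile.txt part_
ls part_*
```

The third argument to split replaces the default x prefix, so the pieces are named part_aa, part_ab, and part_ac instead of xaa, xab, and xac.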

I hope this helps.

check this out

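#!/bin/sh

# Append each line to a file named after its first '|'-delimited field.
while IFS= read -r myline
do
    var=$(printf '%s\n' "${myline}" | cut -d'|' -f 1)
    printf '%s\n' "${myline}" >> "${var}.txt"
done < ./input.txt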
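
The loop above starts a cut process for every input line, which can get slow on a file with hundreds of thousands of records. A possible variant, shown here only as a sketch, extracts the key with the shell's own parameter expansion instead. The sample data is a shortened copy of the records in the question, and the variable name key is an assumption of this example; run it in a scratch directory.

```shell
# Shortened sample in the same shape as the question's file.
cat > input.txt <<'EOF'
ABCDEFGH|0987654321234567
POIUYTRE|1234567890890678
ABCDEFGH|0987654321234523
POIUYTRE|1209867757352567
POIUYTRE|5463879088797131
EOF

# The loop appends, so remove output from earlier runs first.
rm -f ABCDEFGH.txt POIUYTRE.txt

while IFS= read -r myline
do
    key=${myline%%|*}                      # everything before the first '|'
    printf '%s\n' "$myline" >> "${key}.txt"
done < input.txt
```

Because the output files are opened in append mode, the records for one key end up together even when the keys are interleaved in the input.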

cheers
maverix

My input file will be:

ABCDEFGH|0987654321234567
ABCDEFGH|0987654321234523
ABCDEFGH|0987654321234556
ABCDEFGH|0987654321234545
POIUYTRE|1234567890890678
POIUYTRE|1209867757352567
POIUYTRE|5463879088797131
POIUYTRE|5468980091344456
POIUYTRE|1234567890890678
ABCDEFGH|0987654321234556
ABCDEFGH|0987654321234545
POIUYTRE|5463879088797131
POIUYTRE|5468980091344456

I need to split it into:

file1
ABCDEFGH|0987654321234567
ABCDEFGH|0987654321234523
ABCDEFGH|0987654321234556
ABCDEFGH|0987654321234545
ABCDEFGH|0987654321234556
ABCDEFGH|0987654321234545

file2
POIUYTRE|1234567890890678
POIUYTRE|1209867757352567
POIUYTRE|5463879088797131
POIUYTRE|5468980091344456
POIUYTRE|1234567890890678
POIUYTRE|5463879088797131
POIUYTRE|5468980091344456

Running grep over the whole file, like
grep 'ABCDEFGH' filename | wc -l, will give the count.

But for POIUYTRE the count may differ, so how can I split the big file as above?

Can anybody help, please?

maverix already gave you the solution.

Did you check my earlier post??

The script I gave works fine for the type of file mentioned by you.
However, there's a catch in that script. If there's an empty line in the input file, a file named ".txt" would be created. You can modify the script to address this, or you can just delete that file :)
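
One way the script could be modified for the empty-line case is to skip any line whose key comes out empty. This is a minimal sketch, not a tested replacement: the three-line sample input is invented for the example, and it should be run in a scratch directory because it removes old output files first.

```shell
# Hypothetical sample with an empty line in the middle.
printf '%s\n' 'ABCDEFGH|0987654321234567' '' 'POIUYTRE|1234567890890678' > input.txt

# The loop appends, so start from a clean state.
rm -f ABCDEFGH.txt POIUYTRE.txt .txt

while IFS= read -r myline
do
    var=$(printf '%s\n' "$myline" | cut -d'|' -f 1)
    # An empty key would create ".txt", so skip such lines.
    [ -n "$var" ] || continue
    printf '%s\n' "$myline" >> "${var}.txt"
done < input.txt
```

The same guard also skips lines that begin with '|', since their first field is empty as well.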

cheers
maverix

sed -e "s/^\(.*\)|.*/\1/g" bigfile | sort -u > entries.txt
for file in $(entries.txt)
do 
  grep "${file}" bigfile > "${file}.txt"
done

entries.txt is created with 2 records as below:

ABCDEFGH
POIUYTRE

When running the for loop in a separate script, entries.txt.txt is created and it is empty. Splitting is not happening.

My bad.
It should look like

sed -e "s/^\(.*\)|.*/\1/g" bigfile | sort -u > entries.txt
for file in $(<entries.txt)
do 
  grep "${file}" bigfile > "${file}.txt"
done
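
Note that $(<entries.txt) is a bash/ksh shorthand for $(cat entries.txt) and is not guaranteed to work in a plain POSIX sh. A sketch of the same idea with a while read loop follows. It also anchors the grep pattern to the start of the line so a key cannot match inside the numeric field. It assumes the keys contain no regular-expression metacharacters; the three-line bigfile is sample data for the example, which is best run in a scratch directory.

```shell
# Small sample in the shape of the question's file.
cat > bigfile <<'EOF'
ABCDEFGH|0987654321234567
POIUYTRE|1234567890890678
ABCDEFGH|0987654321234523
EOF

# Collect the distinct keys (the text before the first '|').
cut -d'|' -f 1 bigfile | sort -u > entries.txt

# One grep per key; "^key|" matches only at the start of a line.
while IFS= read -r key
do
    grep "^${key}|" bigfile > "${key}.txt"
done < entries.txt
```

This still reads bigfile once per distinct key, so the run time grows with the number of keys.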

Thanks for your valuable time, vino and maverix.

Both are working great. I have to see how much time each one takes and then pick one.
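
On the timing question: both approaches above either rescan the big file once per key or start extra processes for every line. A different technique, which nobody in this thread posted, is a single pass with awk. The sketch below assumes the number of distinct keys is small enough for awk to keep all output files open at once; the sample bigfile is a shortened copy of the question's data, and it is best run in a scratch directory.

```shell
# Shortened sample with interleaved keys, as in the question.
cat > bigfile <<'EOF'
ABCDEFGH|0987654321234567
POIUYTRE|1234567890890678
ABCDEFGH|0987654321234523
POIUYTRE|1209867757352567
POIUYTRE|5463879088797131
EOF

# Single pass: each line goes to a file named after its first
# '|'-separated field. Lines with an empty key (such as blank lines)
# are skipped.
awk -F'|' '$1 != "" { print > ($1 ".txt") }' bigfile
```

Prefixing each candidate command with time on a copy of the real data is a simple way to compare them.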