Extract patterns and copy them in different files

Hi All,

I have a file which looks like this:

Name1;A01
Name2;A01.047
Name3;A01.047.025
Newname1;B01
NewName2;B01.056.32
NewName3;B04.09.43
NewNewName1;C01.03
NewNewName2;C01.034.44

As you can see, each line in the file holds a name followed by an identifier. Identifiers that belong to the same type start with the same letter (A, B, C, up to Z).

My task is to extract all the lines whose identifiers start with the same letter and store them in separate files, one file per letter.

For example,
A.dat will look like this:

Name1;A01
Name2;A01.047
Name3;A01.047.025

B.dat will look like this:

Newname1;B01
NewName2;B01.056.32
NewName3;B04.09.43

and C.dat

NewNewName1;C01.03
NewNewName2;C01.034.44

I am using Linux with BASH shell.

awk -F';' '{ print $0 >> substr($2, 1, 1) }' INPUTFILE

And this is how I would add the extensions to files:

for i in PATH/*
do
        mv -- "$i" "$i.dat"
done

Thanks for your help, this completes everything. :)

Well, sorry, I didn't notice:

awk -F';' '{ print $0 >> (substr($2, 1, 1) ".dat") }' INPUTFILE
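
For anyone who wants to try this on the sample data, here is a minimal sketch. The scratch directory and the heredoc are only there to make the example self-contained; INPUTFILE is the placeholder name used above. The parentheses around the file name are a precaution, since an unparenthesized concatenation after >> is not parsed the same way by every awk.

```shell
# Work in a scratch directory so the appends start from empty files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input copied from the question.
cat > INPUTFILE <<'EOF'
Name1;A01
Name2;A01.047
Name3;A01.047.025
Newname1;B01
NewName2;B01.056.32
NewName3;B04.09.43
NewNewName1;C01.03
NewNewName2;C01.034.44
EOF

# Append each line to a file named after the first character of field 2.
awk -F';' '{ print $0 >> (substr($2, 1, 1) ".dat") }' INPUTFILE

# Show what ended up in each file.
for f in A.dat B.dat C.dat
do
        echo "== $f"
        cat "$f"
done
```

Because the one-liner appends with >>, running it twice in the same directory duplicates the lines, so remove old .dat files before a re-run.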

Even better :)

---------- Post updated at 11:11 AM ---------- Previous update was at 10:17 AM ----------

I am now trying to split the lines by the number of dots they contain: lines with zero dots should go to A.dat, lines with one dot to B.dat, lines with two dots to C.dat, and so on. A line contains at most 11 dots.

This is an illustration:

main_file.txt looks like this:

Name1;A01
Name2;A01.047
Name3;A01.047.025
Newname1;B01
NewName2;B01.056.32
NewName3;B04.09.43
NewNewName1;C01.03
NewNewName2;C01.034.44

So, this is what I expect:
A.dat

Name1;A01
Newname1;B01

B.dat

Name2;A01.047
NewNewName1;C01.03

C.dat

Name3;A01.047.025
NewName2;B01.056.32
NewName3;B04.09.43
NewNewName2;C01.034.44

This is what I tried, without success:

perl -lne '$c=1 while /./g; END { print $c."dat"; }'

Even the part that gives file names is not perfect.

nawk '{ count = 0; for (i = 1; i <= length($0); i++) if (substr($0, i, 1) == ".") count++; print $0 >> count }' main_file.txt

This will create files named 0, 1, 2, etc., depending on the number of dots (.) in each line.
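
If the files should be named A.dat, B.dat, C.dat and so on, as asked, rather than 0, 1, 2, the dot count can be mapped to a letter. This is only a sketch under a few assumptions: the dots are counted in the identifier (the second field) only, the count never exceeds 11 as stated in the question, and the variable names id, dots and letter are made up for illustration.

```shell
# Work in a scratch directory so the appends start from empty files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input copied from the question.
cat > main_file.txt <<'EOF'
Name1;A01
Name2;A01.047
Name3;A01.047.025
Newname1;B01
NewName2;B01.056.32
NewName3;B04.09.43
NewNewName1;C01.03
NewNewName2;C01.034.44
EOF

awk -F';' '{
        id = $2                           # work on a copy so $0 is left untouched
        dots = gsub(/\./, "", id)         # gsub returns the number of dots it removed
        # 0 dots -> A, 1 dot -> B, ..., 11 dots -> L (assumed maximum)
        letter = substr("ABCDEFGHIJKL", dots + 1, 1)
        print $0 >> (letter ".dat")
}' main_file.txt
```

With the sample data this puts Name1;A01 and Newname1;B01 in A.dat, the two one-dot lines (Name2;A01.047 and NewNewName1;C01.03) in B.dat, and the four two-dot lines in C.dat. A line with more than 11 dots would get an empty letter and land in a file called .dat, so extend the letter string if that can happen.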