How to extract subset file from dataset?

Hello
I have a data set that looks like this:

progeny  sire  dam  gender
12       1     3    M
13       2     4    F
14       2     5    F
15       6     5    M

I need to split this data by gender (M and F) into two files.
I want something like this:
file 1 output:

progeny  sire  dam  gender
13       2     4    F
14       2     5    F

file 2 output:

progeny  sire  dam  gender
12       1     3    M
15       6     5    M

Thanks

awk 'NR==1 { print > "M" ; print > "F"; next }
{ print > $4 }' inputfile

@Corona
Thanks for your suggestion. However, this command does not solve my problem.

In what way did it not solve your problem? Be specific or I won't know what problem to fix.

@Corona
To make my problem clear, I have a data set:

progeny  sire  dam  gender
12       1     3    M
13       2     4    F
14       2     5    F
15       6     5    M

I want a subset of the data selected by gender, which looks like this:

progeny  sire  dam  gender
13       2     4    F
14       2     5    F

That is what my suggestion does, yes.

In what way does it not work for you? Be specific. What exactly did you do, and what precisely happened?

@Corona:
When I run the program, it gives me an empty file.

awk 'NR==1 { print > "M" ; print > "F"; next }{ print > $4 }' aa > bb

The redirect to an output file was not in my instructions because that file would be empty; the command never writes to it.

Check for the files 'M' and 'F' in the same directory; they will not be empty.
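
To illustrate the point about the empty output file, here is a small self-contained sketch. The file sample.txt is made up for the demonstration and simply holds the example rows from the first post; bb is the redirect target used above.

```shell
# Hypothetical sample file, copied from the example data in the first post.
cat > sample.txt <<'EOF'
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
EOF

# awk opens the files M and F itself and prints nothing to standard output,
# so the shell redirect "> bb" only creates an empty file.
awk 'NR==1 { print > "M"; print > "F"; next }
{ print > $4 }' sample.txt > bb

wc -c bb    # bb holds 0 bytes
cat F       # header plus the two F rows
cat M       # header plus the two M rows
```

The rows end up in the files named by the fourth column, which is why looking in bb shows nothing.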

When I ran the program I got the M and F files, but they contain just one line.
My data set has more lines than the example: 2600 lines, each with a gender of M or F. What I want is to split the data set into two files, one for gender M and one for gender F.

That is what my example does, yes. It writes to different file names depending on what the value of the fourth column is.

If the fourth column isn't what you showed it to be in your example data, it won't do what I expect. Check the contents of your folder with 'ls'; it may have created files with weird names.
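
Another way to check that assumption is to count the distinct values in the column being used as a file name. This is only a sketch: sample.txt is a made-up file holding the example rows from the first post, counts.txt is an arbitrary scratch file, and the column number 4 should be changed to whichever column is being tested.

```shell
# Hypothetical sample file, copied from the example data in the first post.
cat > sample.txt <<'EOF'
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
EOF

# Count how often each distinct value appears in column 4.
# Every value listed here is a file name that "print > $4" would write to,
# so anything other than M and F (and the header word) points to the problem.
awk '{ count[$4]++ } END { for (v in count) print count[v], v }' sample.txt > counts.txt
cat counts.txt
```

If the list contains numbers or other unexpected values, the gender is in a different column than the command assumes.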

Could you show a more complete example of your input data please?

You can find my data set, which I want to split by gender (M and F) into 2 separate files.

The data you posted clearly shows M/F in the fifth column, not the fourth.

Also, the data you posted has no header row, which your original data did. I can simplify my code a lot knowing it's not there.

awk '{ print > $5 }' inputfile
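
If the position of the gender column is in doubt, a slightly more defensive variant is to use the last field ($NF) and write only the lines whose last field is exactly M or F. This is a sketch under assumptions: sample5.txt and its fourth column (a birth year) are invented for illustration, since the real data file is not shown in this thread.

```shell
# Hypothetical five-column sample; the fourth column is invented.
cat > sample5.txt <<'EOF'
12 1 3 2010 M
13 2 4 2011 F
14 2 5 2011 F
15 6 5 2012 M
EOF

# $NF is the last whitespace-separated field on the line.
# Lines whose last field is not exactly M or F are skipped, so no
# unexpected file names are created.
awk '$NF == "M" || $NF == "F" { print > $NF }' sample5.txt

cat F
cat M
```

This only helps when the gender really is the last field on every line; otherwise the explicit column number is the safer choice.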

This is really crude, but it seems to work.
It assumes that M or F appears only once on each line
and is separated by white space.

while IFS= read -r line
do
    # Append each line containing M to the file M.
    if [[ $line == *M* ]]; then
        echo "$line" >> M
    fi
    # Append each line containing F to the file F.
    if [[ $line == *F* ]]; then
        echo "$line" >> F
    fi
done < file

The solution works


grep M aa.txt > M
grep F aa.txt > F

This will get you what you need