sajmar
September 4, 2013, 10:50am
1
Hello
I have a data set which looks like this :
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
I need to split the data by gender (M and F) into two files.
I want something like this:
file 1 output:
progeny sire dam gender
13 2 4 F
14 2 5 F
file 2 output:
progeny sire dam gender
12 1 3 M
15 6 5 M
Thanks
awk 'NR==1 { print > "M" ; print > "F"; next }
{ print > $4 }' inputfile
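For anyone who wants to try this before running it on real data, here is a small sketch that recreates the sample from the first post and runs the same command on it. The file name inputfile is just the placeholder used above, and the parentheses around $4 are my addition: they are optional in most awks but avoid an ambiguity in some implementations. Run it in an empty scratch directory, since it creates files named M and F.

```shell
# Sample data copied from the question.
cat > inputfile <<'EOF'
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
EOF

# Line 1 (the header) is written to both files; every other line is written
# to the file named after its fourth field. Nothing goes to standard output.
awk 'NR==1 { print > "M"; print > "F"; next }
     { print > ($4) }' inputfile

# Show the results.
cat F
cat M
```

With this sample, F should hold the header plus the two F rows and M the header plus the two M rows.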
sajmar
September 4, 2013, 12:28pm
3
@ Corona
Thanks for your suggestion. However, this command does not solve my problem.
In what way did it not solve your problem? Be specific or I won't know what problem to fix.
sajmar
September 4, 2013, 12:37pm
5
@ Corona
To clarify my problem, I have a data set:
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
I want a subset selected by gender, which looks like this:
progeny sire dam gender
13 2 4 F
14 2 5 F
That is what my suggestion does, yes.
In what way does it not work for you? Be specific. What exactly did you do, and what precisely happened?
sajmar
September 4, 2013, 1:19pm
7
@ Corona:
When I run the program, it gives me an empty file.
awk 'NR==1 { print > "M" ; print > "F"; next }{ print > $4 }' aa > bb
The redirection to an output file was not in my instructions, because that file would be empty: the script never writes to standard output.
Check for the files 'M' and 'F' in the same directory, they will not be empty.
sajmar
September 4, 2013, 1:38pm
9
When I ran the program I got the M and F files, but they contain just one line.
My data set has more lines than the example: 2600 lines, each marked with gender M or F. What I want is to split the data set into 2 files, one for gender M and one for gender F.
That is what my example does, yes. It writes to different file names depending on what the value of the fourth column is.
If the fourth column isn't what you showed it to be in your example data, it won't do what I expect. Check the contents of your folder with 'ls'; it may have created files with unexpected names.
Could you show a more complete example of your input data please?
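One way to check what a given column really contains is to count its distinct values and the number of fields per line. The sketch below does that on a made-up file: sample.txt and its extra year column are hypothetical, chosen only to show what a mismatch would look like.

```shell
# Hypothetical sample in which the gender is NOT in the fourth column.
cat > sample.txt <<'EOF'
12 1 3 2010 M
13 2 4 2010 F
14 2 5 2011 F
15 6 5 2011 M
EOF

# How often does each value occur in column 4?
# If these are not M and F, the split is keyed on the wrong column.
awk '{ print $4 }' sample.txt | sort | uniq -c

# How many fields do the lines have? More than one number here means
# the file is not laid out consistently.
awk '{ print NF }' sample.txt | sort -u
```

If the first command lists anything other than M and F, those values are the file names the awk split will have created.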
sajmar
September 4, 2013, 2:04pm
11
You can find my data set here; I want to split it by gender (M and F) into 2 separate files.
The data you posted clearly shows M/F in the fifth column, not the fourth.
Also, the data you posted has no header row, which your original data did. I can simplify my code a lot knowing it's not there.
awk '{ print > $5 }' inputfile
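A slightly more defensive variant, in case some lines have something other than M or F in that column: it only uses the field as a file name when it is exactly M or F and sends everything else to a catch-all file. The file names inputfile5 and other, and the sample rows, are my own placeholders and not taken from the posted data.

```shell
# Hypothetical headerless sample with the gender in the fifth column,
# plus one malformed line to show the fallback.
cat > inputfile5 <<'EOF'
12 1 3 2010 M
13 2 4 2010 F
14 2 5 2011 F
15 6 5 2011 M
16 7 8 2011 ?
EOF

# Lines whose fifth field is exactly M or F go to the file of that name;
# anything else is collected in "other" instead of creating an odd file name.
awk '$5 == "M" || $5 == "F" { print > ($5); next }
     { print > "other" }' inputfile5
```

After running it, a non-empty file named other is a sign that the column is not as clean as expected.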
This is really crude, but it seems to work.
It assumes that M or F appears only once on each line
and that fields are separated by white space.
while IFS= read -r line
do
    if [[ $line == *M* ]]; then
        echo "$line" >> M    ## append the line to file M
    fi
    if [[ $line == *F* ]]; then
        echo "$line" >> F    ## append the line to file F
    fi
done < file
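If a plain shell loop is preferred, a variant that compares the gender field exactly, rather than looking for the letter anywhere on the line, could look like the sketch below. It assumes the four-column layout with a header from the first post; the variable names are just the column names from that header. With more than four columns, the last variable would receive the rest of the line, so the read list would need adjusting.

```shell
# Sample data copied from the first post.
cat > file <<'EOF'
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
EOF

: > M   # start from empty output files
: > F

{
    # Copy the header line to both outputs.
    IFS= read -r header
    printf '%s\n' "$header" >> M
    printf '%s\n' "$header" >> F

    # read splits each remaining line on white space into four variables.
    while read -r progeny sire dam gender
    do
        case $gender in
            M) printf '%s %s %s %s\n' "$progeny" "$sire" "$dam" "$gender" >> M ;;
            F) printf '%s %s %s %s\n' "$progeny" "$sire" "$dam" "$gender" >> F ;;
        esac
    done
} < file
```

For a few thousand lines this is fine, though awk will be noticeably faster on large files.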
w020637
September 10, 2013, 11:57am
14
The solution works
grep M aa.txt > M
grep F aa.txt > F
This will get you what you need
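Two things to keep in mind with plain grep: it matches the letter anywhere on the line, and it drops a header line that contains neither letter. If either matters, a sketch along these lines compares one column exactly and keeps line 1. It uses the sample from the first post; col=4 is an assumption about where the gender sits, so change it to 5 for the five-column data, and remove the NR == 1 part if the file has no header.

```shell
# Sample data copied from the first post.
cat > aa.txt <<'EOF'
progeny sire dam gender
12 1 3 M
13 2 4 F
14 2 5 F
15 6 5 M
EOF

# Keep line 1 (the header) plus every line whose chosen column is exactly M or F.
awk -v col=4 'NR == 1 || $col == "M"' aa.txt > M
awk -v col=4 'NR == 1 || $col == "F"' aa.txt > F
```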