Some lines....
Question no: 1
The question?
A. Answer 1
B. Answer 2
C. Answer 3
D. Answer 4
Answer:B
Some lines....
Question no: 2
The question? (choose 2)
A. Answer 1
B. Answer 2
C. Answer 3
D. Answer 4
Answer:C,D
Some lines....
Question no: 3
The question?
A. Answer 1
B. Answer 2
C. Answer 3
D. Answer 4
E. Answer 5
Answer:A
Some lines....
and so on..
There can be as many as 3 correct answers, and as many as 5 answers to choose from. An answer can also span multiple lines. What I want to do is extract all the correct answers, like this (based on the example above):
Question no: 1
B. Answer 2
Question no: 2
C. Answer 3
D. Answer 4
Question no: 3
A. Answer 1
awk -F'[.,:]' '     # Use dot, comma and colon as field separators
/^[A-E]\./ {        # Store the choices in array A, indexed by the first field (letter A-E)
  i=$1
  $1=x
  A[i]=$0
}
/^Answer/ {         # If line starts with "Answer" then print the valid choices
  for(i=2; i<=NF; i++)
    print $i"." A[$i]
  print x           # Print an empty line
}
/^Question/         # If line starts with "Question" then print the question
' file
It's just that in the actual file, the format is really like this:
The question?
A. aaaaaa
B. bbbbb
C. ccccc
D. ddddddd
There is no word "Answer" after the letter, so when I run your script on the actual file, the first answer shows the letter only. How can I fix this?
Some lines......
QUESTION NO: 89
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx?
A. 0
B. 1
C. 4
D. 8
Answer: B
Explanation:
QUESTION NO: 90
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx?
A. aaaaaaaaaaaa
B. bbbbbbbbbbbb
C. ccccccccccccc
D. dddddddddd
E. eeeeeeeeeeeeeeeee
Answer: A,B,E
Explanation:
QUESTION NO: 97
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq?
A. aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
A. aaaaaaaaaaaa
B. bbbbbbbbbbbb
C. ccccccccccccc
D. dddddddddd
E. eeeeeeeeeeeeeeeee
Answer: B
Explanation:
The difference is that, apart from QUESTION being in capital letters, there is a space after the colon, so the field separator has to be a dot, comma, or colon followed by zero or more spaces...
So, try:
awk -F'[.,:] *' '   # Use dot, comma or colon, followed by optional spaces, as field separator
/^[A-E]\./ {        # Store the choices in array A, indexed by the first field (letter A-E)
  i=$1
  $1=x
  A[i]=$0
}
/^Answer/ {         # If line starts with "Answer" then print the valid choices
  for(i=2; i<=NF; i++)
    print $i"." A[$i]
  print x           # Print an empty line
}
/^QUESTION/         # If line starts with "QUESTION" then print the question
' file
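To see what the `[.,:] *` separator does to an answer line, here is a quick check on one line taken from the sample above. The brackets are only there to make any leftover spaces visible:

```shell
# Split a sample "Answer:" line with the separator used above and
# print every field in brackets, so stray spaces would show up.
out=$(printf 'Answer: A,B,E\n' |
  awk -F'[.,:] *' '{ for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }')
echo "$out"
```

This should print `[Answer][A][B][E]`, so fields 2 to NF are exactly the letters used as array indexes.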
Alright, but the input specification keeps changing. Are you sure this is the last exception?
Try:
awk -F'[.,:] *' '   # Use dot, comma or colon, followed by optional spaces, as field separator
/^[A-E]\./ {        # Store the choices in array A, indexed by the first field (letter A-E)
  i=$1
  $1=x
  A[i]=$0
  choice=1          # Lines that follow may continue this choice
  next
}
/^Answer/ {         # If line starts with "Answer" then print the valid choices
  choice=0
  for(i=2; i<=NF; i++)
    print $i"." A[$i]
  print x           # Print an empty line
  next
}
choice {            # Continuation line: append it to the current choice
  A[i]=A[i] RS $0
}
/^QUESTION/         # If line starts with "QUESTION" then print the question
' file
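One caveat with the scripts above: assigning to `$1` makes awk rebuild the line with the output field separator, so any dot, comma, or colon inside the answer text is replaced by a space. If the real file has such answers, a variant along these lines might help. It is only a sketch: it uses the default field separator, keeps each choice line verbatim, and assumes the answer line always looks like `Answer: X,Y`. The sample input and the names `key`, `want`, and `line` are made up for the demonstration.

```shell
# Hypothetical sample whose answer text contains dots, a comma and a colon.
input='QUESTION NO: 1
Which one?
A. first, with comma
B. ratio 1:2, e.g. half
continued line
Answer: B
Explanation:'

out=$(printf '%s\n' "$input" | awk '
/^[A-E]\./ {                      # choice line: index by its letter, keep text verbatim
  key = substr($0, 1, 1)
  A[key] = $0
  choice = 1
  next
}
/^Answer/ {                       # answer line: print each listed choice
  choice = 0
  line = $0
  sub(/^Answer: */, "", line)     # drop the "Answer:" prefix
  n = split(line, want, /, */)    # letters separated by commas
  for (j = 1; j <= n; j++)
    print A[want[j]]
  print ""                        # empty line after each question
  next
}
choice { A[key] = A[key] RS $0 }  # continuation of the current choice
/^QUESTION/                       # print the question header
')
echo "$out"
```

With this input it should print the `QUESTION NO: 1` line followed by the two lines of choice B, with `1:2, e.g.` left intact.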