Select answers from multiple questions using shell script

I have a text file in this format

Some lines....
Question no: 1

The question?

A. Answer 1
B. Answer 2
C. Answer 3
D. Answer 4

Answer:B
Some lines....

Question no: 2

The question? (choose 2)

A. Answer 1
B. Answer 2
C. Answer 3
D. Answer 4

Answer:C,D
Some lines....

Question no: 3

The question?

A. Answer 1
B. Answer 2
C. Answer 3
D. Answer 4
E. Answer 5

Answer:A
Some lines....

and so on..

There can be up to 3 correct answers and up to 5 answers to choose from. An answer can also span multiple lines. What I want to do is extract all the correct answers like this (based on the example above):

Question no: 1
B. Answer 2

Question no: 2
C. Answer 3
D. Answer 4

Question no: 3
A. Answer 1

How can I do this with a shell script, maybe awk?

Thanks

Try:

awk -F'[.,:]' '           # Use dot, comma and colon as field separators
  /^[A-E]\./ {            # Store the choices in array A with index the first field (letter A-E)
    i=$1
    $1=x
    A[i]=$0
  }
  /^Answer/ {             # If line starts with "Answer" then print the valid choices
    for(i=2; i<=NF; i++)
      print $i"." A[$i]
    print x               # Print an empty line
  }
  /^Question/             # If line starts with "Question" then print the question
' file
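
To see why the loop over the answer letters starts at field 2, it can help to print how the separator splits an answer line. This is only a small sketch; the helper name show_fields and the input line are made up for illustration:

```shell
show_fields() {
  # Print every field of the line given in $1, split on dot, comma or colon.
  printf '%s\n' "$1" |
    awk -F'[.,:]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

show_fields 'Answer:C,D'
```

Field 1 is the word "Answer" and every later field is one letter, which is what the script uses as the index into array A.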

Thank you Scrutinizer. It works.

It's just that in the actual file, the format is really like this:

The question?

A. aaaaaa
B. bbbbb
C. ccccc
D. ddddddd

There is no word "answer" after the letter, so when I run your script on the actual file, the first answer shows the letter only. How can I fix this?

Thanks

My suggestion is not relying on the word answer after the letter, so I don't understand what you mean. Can you provide a better sample?

To be clearer, the output is this:

QUESTION NO: 89
 B.

QUESTION NO: 90
 A.
B.  bbbbbbbbbbbb
E.  eeeeeeeeeeeeeeeee

QUESTION NO: 97
 B.

And what is the input that should lead to that output?

Here you go:

Some lines......
QUESTION NO: 89

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx?

A. 0
B. 1
C. 4
D. 8

Answer: B
Explanation:

QUESTION NO: 90

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx?


A. aaaaaaaaaaaa
B. bbbbbbbbbbbb
C. ccccccccccccc
D. dddddddddd
E. eeeeeeeeeeeeeeeee

Answer: A,B,E
Explanation:

QUESTION NO: 97

qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq?

A. aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
A. aaaaaaaaaaaa
B. bbbbbbbbbbbb
C. ccccccccccccc
D. dddddddddd
E. eeeeeeeeeeeeeeeee

Answer: B
Explanation:

The difference is that, apart from QUESTION being in capital letters, there is a space after the colon on the Answer line. The field separator therefore needs to allow zero or more spaces after the dot, comma or colon.

So, try:

awk -F'[.,:] *' '         # Use dot, comma and colon, plus any spaces after them, as field separators
  /^[A-E]\./ {            # Store the choices in array A with index the first field (letter A-E)
    i=$1
    $1=x
    A[i]=$0
  }
  /^Answer/ {             # If line starts with "Answer" then print the valid choices
    for(i=2; i<=NF; i++)
      print $i"." A[$i]
    print x               # Print an empty line
  }
  /^QUESTION/             # If line starts with "QUESTION" then print the question
' file
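
The letter-only lines in the earlier output come from the array lookup key. With a separator of just dot, comma or colon, the first letter after "Answer: " keeps its leading space, so the lookup finds nothing. A minimal sketch to compare both separators (the helper name second_field is hypothetical):

```shell
second_field() {
  # $1 is the field separator regex, $2 is the input line (both illustrative).
  printf '%s\n' "$2" | awk -F"$1" '{ printf "[%s]\n", $2 }'
}

second_field '[.,:]'   'Answer: B'   # old separator: field 2 keeps the leading space
second_field '[.,:] *' 'Answer: B'   # new separator: trailing spaces are part of the separator
```

The first call shows the key as " B", which never matches the stored index "B". The second call shows the plain letter.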

Thank you very much again sir. It is working fine now. However, when one answer has multiple lines, for example:

A. [edit schedulers] 
user@host# show 
scheduler no-weekends { 
daily all-day; 
sunday exclude; 
saturday exclude; 
} 

the script output is only the first line:

A. [edit schedulers]

How can I show the entire lines for that answer?

Alright, but the input specification keeps changing. Are you sure this is the last exception?

Try:

awk -F'[.,:] *' '         # Use dot, comma and colon, plus any spaces after them, as field separators
  /^[A-E]\./ {            # Store the choices in array A with index the first field (letter A-E)
    i=$1
    $1=x
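    A[i]=$0
    choice=1
    next
  }
  choice && NF && !/^Answer/ {   # Continuation line of the current choice (skip blank lines and the Answer line)
    A[i]=A[i] RS $0
  }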
  /^Answer/{             # If line starts with "Answer" then print the valid choices
    choice=0
    for(i=2; i<=NF; i++)
      print $i"." A[$i]
    print x               # Print an empty line
  }
  /^QUESTION/             # If line starts with "QUESTION" then print the question
'  file
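
For anyone who wants to try the multi-line handling without the real file, here is a small self-contained sketch of the same approach. The function name extract_answers and the sample input are made up for illustration, and it assumes continuation lines never start with a capital letter A-E followed by a dot:

```shell
extract_answers() {
  # $1: path to the question file (hypothetical helper, not part of the thread).
  awk -F'[.,:] *' '
    /^[A-E]\./ { i = $1; $1 = ""; A[i] = $0; choice = 1; next }   # start of a choice
    choice && NF && !/^Answer/ { A[i] = A[i] RS $0 }              # continuation line
    /^Answer/ {
      choice = 0
      for (k = 2; k <= NF; k++) print $k "." A[$k]                # print each correct choice
      print ""                                                    # blank line between questions
    }
    /^QUESTION/                                                   # print the question header
  ' "$1"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
QUESTION NO: 1

Which one?

A. [edit schedulers]
user@host# show
B. other

Answer: A
Explanation:
EOF
extract_answers "$sample"
rm -f "$sample"
```

The guard on the continuation rule keeps blank lines and the Answer line itself out of the last stored choice. Note that array A is not cleared between questions, so the sketch relies on every answer letter having been defined in its own question.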

Thank you sir. I think that is all. I really appreciate your help.