uxnoob
June 16, 2009, 9:11am
1
Hi,
I'm trying to pick out a data field from the sample below. I need the REQUIRED FIELD column, but it is sometimes filled with weird chars like \-(. or whatever. How can I accurately extract the 3rd field in shell?
ID IDNO - REQUIRED FIELD
ID 1447 - MAT620BR.
ID 1452 - FGI-LOM3100R \ LOM FGI REPORT (.
ID 1453 - FGI-LOM3101R \ LOM FGI REPORT (.
ID 1512 - SAM05TRR.
ID 1514 - SAM6220R.
ID 1515 - SAM07R.
ID 1516 - SAM07R00.
ID 1517 - SAM10R.
ID 1518 - SAM10R00.
ID 1521 - SAM13R.
ID 1536 - MONJ001R.
ID 1537 - MONJ004R.
ID 1541 - FROLPS.
ID 1542 - FROAPD.
ID 1548 - MOS5610R.
ID 1550 - C009LP \ DAILY INVOICE.
ID 1554 - SAM49R.
ID 1559 - MAT310AR.
You have various ways to extract lines from a text.
You could use head/tail
head -n 3 filename| tail -n 1
You can use sed
sed -n '3p' filename
You can use awk
awk 'NR == 3 {print}' filename
Etc...
To remove those chars, you can use tr
redoubtable@Tsunami ~ $ awk 'NR == 3 {print}' filename | tr -d '.(\\'
ID 1453 - FGI-LOM3101R LOM FGI REPORT
redoubtable@Tsunami ~ $
awk -F"-" '{print $NF}' file
Can you show us what you have tried so far and where you are stuck?
Regards
uxnoob
June 16, 2009, 9:38am
5
Hi Ghost,
I checked the recommendation you gave, but there was a problem. Its output was:
LOM3100R \ LOM FGI REPORT (.
The correct output should be:
FGI-LOM3100R \ LOM FGI REPORT (
without the full stop, but including the FGI- prefix:
ID 1452 - FGI-LOM3100R \ LOM FGI REPORT (.
ID 1453 - FGI-LOM3101R \ LOM FGI REPORT (.
-----Post Update-----
Hi Franklin,
I've tried basic awk and cut to print the 3rd field, but they do not work. I used spaces and - as delimiters, but they do not give the full output I require.
awk -F" - " '{print $NF}' file
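To see why the spaces in the separator matter, here is a small sketch run against one of the sample lines from the first post. The `line` variable is only a placeholder for illustration; with real data you would pass the file name to awk instead.

```shell
# One sample record from the first post; single quotes keep the backslash literal.
line='ID 1452 - FGI-LOM3100R \ LOM FGI REPORT (.'

# A bare "-" also splits inside FGI-LOM3100R, so the last field loses "FGI-".
printf '%s\n' "$line" | awk -F"-" '{print $NF}'

# " - " (space, dash, space) only matches the separator after the ID number.
printf '%s\n' "$line" | awk -F" - " '{print $NF}'
```

The second command keeps the FGI- prefix, but it still assumes the field itself never contains " - ".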
uxnoob
June 16, 2009, 12:40pm
7
Are there better ways of making the extraction more accurate? It works with " - " as the separator now, but there could still be mistakes if a field contains something like the line below.
FGI-LOM3100R - LOM FGI REPORT (.
Try this:
awk -F" - " '{gsub("[(\.]","")}{print $2}'
You can place "weird" characters within the brackets [], special characters must be escaped with a backslash.
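A rough sketch of the same idea with a regex literal, limited to the third field. The helper name `strip_chars` and the choice of characters are only assumptions for illustration. Inside a bracket expression `(` and `.` are already literal; in awk only the backslash itself has to be doubled.

```shell
# strip_chars: hypothetical helper; deletes "(", "." and "\" from the third field.
strip_chars() {
  awk -F" - " '{ gsub(/[(.\\]/, "", $2); print $2 }' "$@"
}

# Sample record from the first post, fed on stdin.
printf '%s\n' 'ID 1452 - FGI-LOM3100R \ LOM FGI REPORT (.' | strip_chars
```

Note that this only deletes the listed characters; the spaces around them stay in the output.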
uxnoob
June 16, 2009, 10:25pm
9
franklin52:
Try this:
awk -F" - " '{gsub("[(\.]","")}{print $2}'
You can place "weird" characters within the brackets [], special characters must be escaped with a backslash.
Hi Franklin,
The expected output should be the whole field exactly as it is,
eg. FGI-LOM3100R \ LOM FGI REPORT (.
With your code, the result is something like
FGI-LOM3100R LOM FGI REPORT
which is inaccurate.
Again, how can we extract the whole 3rd field without characters inside the field being treated as delimiters?
I've tried awk with " - " as the separator, but it gives me inaccurate answers if the field has a " - " within it.
e.g.
data: ID 123 - testing
output using awk command: testing
desired output: testing
data: ID 456 - abc-abc.(
output using awk command: abc-abc.(
desired output: abc
data: ID 7111 - abc - def
output using awk command: def
desired output: abc - def
I misread the question. Try this:
awk -F" - " '{print $2}' file
Regards
panyam
June 17, 2009, 3:38am
11
awk '{$1="";$2="";$3=""; print }' input_file.txt | sed 's/^[ ]*//'
uxnoob
June 17, 2009, 3:50am
12
Hi Guys,
Thanks for the replies so far. We haven't quite hit the nail on the head yet, though.
data: ID 7111 - abc - def
output using the awk -F" - " '{print $2}' file command: abc
desired output: abc - def
It's simple...
cat inputfile | cut -d"-" -f2-
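Cutting on a bare "-" leaves the leading space in front of the field. Another option for the case where the field itself contains " - " is to cut each line at the first " - " only, instead of splitting on every occurrence. A minimal sketch, assuming every record starts with "ID NUMBER - " as in the samples; `third_field` is a made-up helper name, not something from the posts above.

```shell
# third_field: hypothetical helper; prints everything after the FIRST " - ".
# Assumes records look like "ID NUMBER - FIELD"; lines without " - " are skipped.
third_field() {
  awk '{
    i = index($0, " - ")                  # position of the first " - ", 0 if absent
    if (i > 0) print substr($0, i + 3)    # skip the 3-character separator
  }' "$@"
}

# The three test cases from the thread, fed on stdin.
printf '%s\n' 'ID 7111 - abc - def' \
              'ID 456 - abc-abc.(' \
              'ID 1452 - FGI-LOM3100R \ LOM FGI REPORT (.' | third_field
```

With a file you would call it as `third_field inputfile`. Because index() does a plain string search, characters such as \ ( . inside the field are not treated specially.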