Copy a string to another file

OS version: RHEL 6.7
Shell : Bash

I have a file like below. It has 500K lines.
I want to extract TAG_IDs shown in single quote at the end to copied to another file. As if I had copied the TAG_IDs using block select (Column Select) in modern text editor

$ cat file.txt

UPDATE TAGREF  SET TAG_VAL = NULL  WHERE ATTR_NAME IN   ('Span', 'SpanUnitOfMeasure')   AND TAG_ID = '1-19KYTMQY';
UPDATE TAGREF  SET TAG_VAL = NULL  WHERE ATTR_NAME IN   ('Span', 'SpanUnitOfMeasure')   AND TAG_ID = '1-19KZQV5B';
UPDATE TAGREF  SET TAG_VAL = NULL  WHERE ATTR_NAME IN   ('Span', 'SpanUnitOfMeasure')   AND TAG_ID = '1-19N01VZQ';
.
.
.
.
.

expected output in the output file (with or without single quotes is fine)

1-19KYTMQY
1-19KZQV5B
1-19N01VZQ
.
.
.
.
awk '{print $NF}' file
'1-19KYTMQY';
'1-19KZQV5B';
'1-19N01VZQ';
1 Like

Hello John K,

Could you please try following too(without quotes etc).

awk -F"'|;" '{print $6}'   Input_file

Thanks,
R. Singh

1 Like

Hi Ravinder,
Thank You. What is $6 ?

Hi Rudic,
Thank You. What does NF do ?

man awk :

In awk , Fields are numbered 1 .. NF, so $1 references the first field, and $NF the last one.

1 Like

Hello John K,

In awk language $ means value of any field(we could setup field's separator as per our need like in my solution I have made them to '; and in Rudi's case it is default which is space), so $16 means first field of the line and NF denotes the number of fields in a line, which will print the value of last field

Thanks,
R. Singh

1 Like

Thanks Ravinder. So, in your solution, you are asking awk utility to start reading from right to left . Right ? ( $6 being the first field)

Hello John K,

Following is the explanation for solution in more details.
I- Field first will always start from left only.
II- Though you could read fields either from left to right OR right to left.
III- Let's take a look after setting field separator to -F"'|;" and go through all the fields in the line by a loop to see what are the values we have:

awk -F"'|;" '{for(i=1;i<=NF;i++){print "Field number=",i,"Field value=",$i}}'   Input_file
 
Field number= 1 Field value= UPDATE TAGREF  SET TAG_VAL = NULL  WHERE ATTR_NAME IN   (
Field number= 2 Field value= Span
Field number= 3 Field value= ,
Field number= 4 Field value= SpanUnitOfMeasure
Field number= 5 Field value= )   AND TAG_ID =
Field number= 6 Field value= 1-19KYTMQY
Field number= 7 Field value=
Field number= 8 Field value=
Field number= 1 Field value= UPDATE TAGREF  SET TAG_VAL = NULL  WHERE ATTR_NAME IN   (
Field number= 2 Field value= Span
Field number= 3 Field value= ,
Field number= 4 Field value= SpanUnitOfMeasure
Field number= 5 Field value= )   AND TAG_ID =
Field number= 6 Field value= 1-19KZQV5B
Field number= 7 Field value=
Field number= 8 Field value=
Field number= 1 Field value= UPDATE TAGREF  SET TAG_VAL = NULL  WHERE ATTR_NAME IN   (
Field number= 2 Field value= Span
Field number= 3 Field value= ,
Field number= 4 Field value= SpanUnitOfMeasure
Field number= 5 Field value= )   AND TAG_ID =
Field number= 6 Field value= 1-19N01VZQ
Field number= 7 Field value=
Field number= 8 Field value=
 

IV- So here you could see above 6th field's value is the one which you need to print.
Let me know if you have any more queries on same.

Thanks,
R. Singh

1 Like

Thank You for the brilliant demo , Ravinder.

I didn't know that awk can have multiple field delimiters . So, -F"'|;" means , the field delimiter can be either a single quote or a semicolon .
Why was semicolon included as the delimiter ? Because, the following command without semicolon as the delimiter works fine too.

$ awk -F"'" '{print $6}' somePattern.txt
1-19KYTMQY
1-19KZQV5B
1-19N01VZQ
1-18N01VZQ

Hello John K,

Yes, in this case you can remove ; . But in any other cases you could use multiple delimiters as per your requirements too.

Thanks,
R. Singh

1 Like