Hi Team,
I have a file a1.txt with data as follows.
dfjakjf...asdfkasj</EnableQuotedIDs><SQL><SelectStatement modified='1' type='string'><![CDATA[ SELECT
The delimiter string:
<SelectStatement modified='1' type='string'><![CDATA[
dlm="<SelectStatement modified='1' type='string'><![CDATA["
head -1 a1.txt | awk -F"$dlm" '{print $2}'
The above command does not work when the delimiter is a multi-character string that contains special characters.
Expected output is as follows.
SELECT
Can anyone please help me fix this issue?
Thanks
Krishna
RudiC
WHAT "is not working"? Any error messages?
If it's about awk not being happy with the field separator, try escaping the square brackets:
dlm="<SelectStatement modified='1' type='string'><\!\\[CDATA\\["
awk -F"$dlm" '{print $2}' file
SELECT
Yes, sir, it is still not working. Here are the messages from the run.
-sh-4.2$ dlm="<SelectStatement modified='1' type='string'><\!\\[CDATA\\["
-sh-4.2$ head -1 T24CustAuthSignerRlshpToXfmLoad.sql.txt
<?xml version='1.0' encoding='UTF-16'?><Properties version='1.1'><Common><Context type='int'>1</Context><![CDATA[0]]></EnableQuotedIDs><SQL><SelectStatement modified='1' type='string'><![CDATA[SELECT
-sh-4.2$ head -1 T24CustAuthSignerRlshpToXfmLoad.sql.txt | awk -F"${dlm}" '{print $2}'
awk: warning: escape sequence `\!' treated as plain `!'
awk: warning: escape sequence `\[' treated as plain `['
awk: fatal: Unmatched [ or [^: /<SelectStatement modified='1' type='string'><![CDATA[/
-sh-4.2$
-sh-4.2$ uname -a
Linux 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
-sh-4.2$
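The warnings above suggest the backslashes are consumed twice: the shell's double quotes reduce `\\[` to `\[`, and awk's escape processing of the -F argument then reduces `\[` to a plain `[`, which leaves an unterminated bracket expression. One way around this, shown here as a minimal sketch rather than the fix the thread settled on, is to write each literal `[` as the bracket expression `[[]`, which needs no backslashes at all. The names `re` and `line` and the sample input are made up for illustration.

```shell
# Regex form of the delimiter: every literal "[" is written as "[[]",
# so nothing depends on backslashes surviving the shell or awk.
# The "!" is kept in single quotes to sidestep interactive history expansion.
re="<SelectStatement modified='1' type='string'><"'![[]CDATA[[]'

# Hypothetical one-line sample shaped like the first line of the input file.
line="<SQL><SelectStatement modified='1' type='string'><"'![CDATA[SELECT'

# With a multi-character -F, awk treats the separator as a regular expression.
out=$(printf '%s\n' "$line" | awk -F"$re" '{print $2}')
printf '%s\n' "$out"
```

With this sample the second field is everything after the delimiter, i.e. SELECT.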
rdrtx1
dlm="<SelectStatement modified='1' type='string'><![CDATA["
awk 'NR==1 && index($0, dlm) {print substr($0, index($0, dlm) + length(dlm))}' dlm="$dlm" a1.txt
Thank you for your reply, rdrtx1. It still did not work.
Here's the error.
-sh-4.2$ dlm="<SelectStatement modified='1' type='string'><![CDATA["
-sh: ![CDATA[": event not found
-sh-4.2$
In bash - when used interactively - you need to turn off history expansion/substitution:
set +H
to keep the shell from interpreting the ! character within double quotes.
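If changing the shell option is not desirable, another possibility is to keep the ! out of double quotes altogether, since history expansion is not performed inside single quotes. A minimal sketch: the string is split into a double-quoted part (which can hold the literal single quotes) and a single-quoted part (which holds the !), and the shell joins the two adjacent pieces into one word.

```shell
# Double-quoted piece carries the embedded single quotes;
# single-quoted piece carries the "!" so it is never history-expanded.
dlm="<SelectStatement modified='1' type='string'><"'![CDATA['

# Show the assembled delimiter to confirm it is the intended literal string.
printf '%s\n' "$dlm"
```

This form behaves the same in scripts and at an interactive prompt, so it does not depend on the current state of set +H.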
Thank you Scrutinizer. It worked.
Similarly I have another scenario. The last line of the record is as follows.
FROM XYZ.[dbo].[STG_PRQ_UVW] ]]><ReadStatementFromFile type='bool'><![CDATA[0]]></ReadStatementFromFile><Tables collapsed='1'></Tables><Parameters collapsed='1'></Parameters><Columns collapsed='1'></Columns></SelectStatement><EnablePartitioning collapsed='1' type='bool'><![CDATA[0]]></EnablePartitioning></SQL><Transaction><RecordCount modified='1' type='int'><![CDATA[20000]]></RecordCount><EndOfWave collapsed='1' type='int'><![CDATA[0]]></EndOfWave></Transaction><Session><IsolationLevel type='int'><![CDATA[1]]></IsolationLevel><AutocommitMode type='int'><![CDATA[0]]></AutocommitMode><ArraySize modified='1' type='int'><![CDATA[20000]]></ArraySize><SchemaReconciliation><FailOnSizeMismatch type='bool'><![CDATA[1]]></FailOnSizeMismatch><FailOnTypeMismatch type='bool'><![CDATA[1]]></FailOnTypeMismatch><FailOnCodePageMismatch type='bool'><![CDATA[0]]></FailOnCodePageMismatch></SchemaReconciliation><PassLobLocator collapsed='1' type='bool'><![CDATA[0]]></PassLobLocator><CodePage collapsed='1' type='int'><![CDATA[0]]></CodePage></Session><BeforeAfter collapsed='1' type='bool'><![CDATA[0]]></BeforeAfter><LimitRows collapsed='1' type='bool'><![CDATA[0]]></LimitRows></Usage></Properties >
The output should be as follows.
FROM XYZ.[dbo].[STG_PRQ_UVW]
We need to look for the code snippet "]]><ReadStatementFromFile type" and strip out the snippet and everything after it, keeping only the text before it.
I have tried tweaking the suggested awk command, but it did not work. Can you please help me out?
rdrtx1
dlm="<SelectStatement modified='1' type='string'><![CDATA["
dlm2="]]><ReadStatementFromFile type"
awk '
NR==1 && index($0, dlm) {print substr($0, index($0, dlm) + length(dlm))}
index($0, dlm2) {print substr($0, 1, index($0, dlm2)-1)}
' dlm="$dlm" dlm2="$dlm2" a1.txt
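To try the command above without the real file, it can be run against a small stand-in. The sketch below assumes it is saved and run as a script, where history expansion is off by default; the two-line sample file and the `tmp` and `out` names are made up for illustration, and the sample lines are shortened versions of the ones quoted earlier.

```shell
# Hypothetical two-line stand-in for a1.txt, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
junk</EnableQuotedIDs><SQL><SelectStatement modified='1' type='string'><![CDATA[SELECT
FROM XYZ.[dbo].[STG_PRQ_UVW] ]]><ReadStatementFromFile type='bool'><![CDATA[0]]></ReadStatementFromFile>
EOF

dlm="<SelectStatement modified='1' type='string'><![CDATA["
dlm2="]]><ReadStatementFromFile type"

# index() and substr() treat the delimiters as plain strings, not regexes,
# so the brackets and other special characters need no escaping.
out=$(awk '
NR==1 && index($0, dlm) {print substr($0, index($0, dlm) + length(dlm))}
index($0, dlm2) {print substr($0, 1, index($0, dlm2)-1)}
' dlm="$dlm" dlm2="$dlm2" "$tmp")

rm -f "$tmp"
printf '%s\n' "$out"
```

For this sample the output is SELECT followed by FROM XYZ.[dbo].[STG_PRQ_UVW], where the second line keeps the single space that precedes ]]> in the input.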