Print between multiple patterns

Hello Gurus,

I have a file like this:

Dir Path 1
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
 
Dir Path2
Password="pwd2"; User id="uid2"; Connection pool="somename2"; "datasource name"="DS name2";some other fields.
 

Under each dir path I can have multiple lines.

My task is to print the dir path and, below it, only the data source name and user id. The field names vary in case (DataSource or datasource), and the fields are not in a fixed order: the DataSource field can be the first field or any other.

I want output like:

Dir path1
"DataSource Name"="DS name1" "User Id"="uid1"
Dir path2
"datasource name"="DS name2" "User Id"="uid2"
 

Any help appreciated.

Reg
Venkat

I guess this is a typo:

Dir Path2
Password="pwd2"; User id="uid2"; Connection pool="somename2"; "datasource dame"="DS name2";some other fields.

This awk code should help:

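awk '
        # Header line: print it unchanged
        /Dir Path/ {
                print $0
        }
        # Any other non-empty line: extract both fields, whatever their case or position
        !/Dir Path/ && NF {
                # match() sets RSTART and RLENGTH; [^;]* extends the match up to the next semicolon
                match ($0, /[Dd][Aa][Tt][Aa][Ss][Oo][Uu][Rr][Cc][Ee][ ][Nn][Aa][Mm][Ee][^;]*/ )
                nme = substr ( $0, RSTART, RLENGTH )
                match ($0, /[Uu][Ss][Ee][Rr][ ][Ii][Dd][^;]*/ )
                usr = substr ( $0, RSTART, RLENGTH )
                print nme, usr
        }
' file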

Dear Yoda..

I would like to understand your code. If you have time, please explain it.

The code matches the patterns datasource name and user id (in any case), each followed by zero or more characters other than a semicolon.

If a pattern is found, the match function sets the built-in variables RSTART and RLENGTH, which substr uses to fill the variables nme and usr; these are then printed.
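
To make that concrete, here is a small sketch of what match() leaves in RSTART and RLENGTH for one sample line. The function name show_match and the shortened input line are made up for this illustration; only the user id pattern is shown.

```shell
# show_match is a hypothetical helper: for each input line it prints the start
# position, the length and the text of the "user id" field, if there is one.
show_match() {
    awk '{
        # match() returns the start position (also stored in RSTART), or 0 if nothing matched
        if (match($0, /[Uu][Ss][Ee][Rr] [Ii][Dd][^;]*/))
            print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        else
            print "no match"
    }'
}

printf '%s\n' 'Password="pwd"; User Id="uid";some other fields' | show_match
```

When nothing matches, RSTART is 0 and RLENGTH is -1, so substr returns an empty string; that is why the original code prints an empty field instead of failing.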

Hi,
Sed version:

sed 's/^.*[^"]\("*user id[^;]*;\)/ \1&/i;s/^.*[^"]\("*datasource name[^;]*\)/\1&/i;s/\([^;]*\);.*/\1/' file

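- the first substitution replaces the line with: a space + the user id field + a semicolon + the original line
- the second replaces the line with: the datasource field + the line
- the third keeps only the newly built part, i.e. everything before the first semicolon.

The i flag in s/.../.../i makes the match case-insensitive (a GNU sed extension).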
Regards.

grep -Eio "dir path.*|\"data[^;]*|user id[^;]*" file
Dir Path 1
"DataSource Name"="DS name"
User Id="uid"
Dir Path2
User id="uid2"
"datasource dame"="DS name2"

It's only printing the user id, not the data source name.

Also, the dir names have no common pattern, for example:

c:\temp\abc
d:\exit\xyz
c:\abc\test\one

It's working, thanks all.

Reg
Venkat

Why then don't you post a representative sample file in the first place?

I am getting a "can't be parsed" error.

It works fine here with the example input (GNU sed 4.2.1):

$ cat dir_path
Dir Path2
Password="pwd2"; User id="uid2"; Connection pool="somename2"; "datasource dame"="DS name2";some other fields.
Dir Path 1
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
$ sed 's/^.*[^"]\("*user id[^;]*;\)/ \1&/i;s/^.*[^"]\("*datasource name[^;]*\)/\1&/i;s/\([^;]*\);.*/\1/' dir_path
Dir Path2
 User id="uid2"
Dir Path 1
"DataSource Name"="DS name" User Id="uid"

Regards.

Thanks, yes, you are right, but if the line is like this:

---------- D:\ABC5\XYZ\ONE.CONFIG
<sessionState mode="InProc" stateConnectionString="tcpip=127.0.0.1:42424" sqlConnectionString="data source=127.0.0.1;Trusted_Connection=yes" cookieless="false" timeout="20"/>

the output is:
---------- D:\ABC5\XYZ\ONE.CONFIG
"data source=127.0.0.1 <sessionState mode="InProc" stateConnectionString="tcpip=127.0.0.1:42424" sqlConnectionString="data source=127.0.0.1

Reg
Venkat

I don't get the same error:

File:

$ cat dir_path
Dir Path2
Password="pwd2"; User id="uid2"; Connection pool="somename2"; "datasource dame"="DS name2";some other fields.
Dir Path 1
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
---------- D:\ABC5\XYZ\ONE.CONFIG
<sessionState mode="InProc" stateConnectionString="tcpip=127.0.0.1:42424" sqlConnectionString="data source=127.0.0.1;Trusted_Connection=yes" cookieless="false" timeout="20"/>

sed with the problem (the last line is truncated):

$ sed 's/^.*[^"]\("*user id[^;]*;\)/ \1&/i;s/^.*[^"]\("*datasource name[^;]*\)/\1&/i;s/\([^;]*\);.*/\1/' dir_path
Dir Path2
 User id="uid2"
Dir Path 1
"DataSource Name"="DS name" User Id="uid"
---------- D:\ABC5\XYZ\ONE.CONFIG
<sessionState mode="InProc" stateConnectionString="tcpip=127.0.0.1:42424" sqlConnectionString="data source=127.0.0.1

sed with the issue fixed (the last line is left intact):

$ sed 's/^.*[^"]\("*user id[^;]*;\)/ \1&/i;s/^.*[^"]\("*datasource name[^;]*\)/\1&/i;s/\("*\(datasource\|user id\)[^;]*\);.*/\1/i' dir_path
Dir Path2
 User id="uid2"
Dir Path 1
"DataSource Name"="DS name" User Id="uid"
---------- D:\ABC5\XYZ\ONE.CONFIG
<sessionState mode="InProc" stateConnectionString="tcpip=127.0.0.1:42424" sqlConnectionString="data source=127.0.0.1;Trusted_Connection=yes" cookieless="false" timeout="20"/>

Regards.

Thanks. Is there a way to delete duplicate lines (case-sensitively) from the output?

Reg
Venkat

You can use an associative array to remove duplicates, but note that this code does not preserve the input order:

awk '
        /Dir Path/ {
                hdr = $0
        }
        !/Dir Path/ && NF {
                match ($0, /[Dd][Aa][Tt][Aa][Ss][Oo][Uu][Rr][Cc][Ee][ ][Nn][Aa][Mm][Ee][^;]*/ )
                nme = sprintf ( "%s", substr ( $0, RSTART, RLENGTH ) )
                match ($0, /[Uu][Ss][Ee][Rr][ ][Ii][Dd][^;]*/ )
                usr = sprintf ( "%s", substr ( $0, RSTART, RLENGTH ) )
                A[hdr RS nme OFS usr]
        }
        END {
                for ( k in A )
                        print k
        }
' file
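
If the order matters, one possible variation (a sketch, not tested against your real files) is to print each header/fields pair the first time it is seen instead of collecting everything for the END block. The names dedup_in_order and seen are made up for this example, and the optional leading quote ("?) in the patterns is an addition so that the quote in front of "DataSource Name" is kept.

```shell
# dedup_in_order is a hypothetical helper; it reads the files given as
# arguments, or standard input when there are none.
dedup_in_order() {
    awk '
        # Header line: remember it for the lines that follow
        /Dir Path/ { hdr = $0; next }
        NF {
            nme = usr = ""
            if (match($0, /"?[Dd][Aa][Tt][Aa][Ss][Oo][Uu][Rr][Cc][Ee] [Nn][Aa][Mm][Ee][^;]*/))
                nme = substr($0, RSTART, RLENGTH)
            if (match($0, /"?[Uu][Ss][Ee][Rr] [Ii][Dd][^;]*/))
                usr = substr($0, RSTART, RLENGTH)
            key = hdr RS nme OFS usr
            # Print a header/fields pair only the first time it shows up
            if (!seen[key]++)
                print key
        }
    ' "$@"
}

dedup_in_order <<'EOF'
Dir Path 2
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Dir Path 3
Connection pool="somename"; "DataSource Name"="DS name3"; Password="pwd3"; User Id="uid3";some other fields
EOF
```

Because the header is part of the key, the same fields under two different headers count as two different entries.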

By duplicates I meant duplicates within each Dir Path: for example, under Dir Path 2 there are 3 identical entries, and I need to print them only once.
Your code is removing duplicates overall.

 
cat dir_path
Dir Path1
Password="pwd2"; User id="uid2"; Connection pool="somename2"; "datasource same"="DS name2";some other fields.
Dir Path 2
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Dir Path 3
Connection pool="somename"; "DataSource Name"="DS name3"; Password="pwd3"; User Id="uid3";some other fields
Connection pool="somename"; "DataSource Name"="DS name3"; Password="pwd3"; User Id="uid3";some other fields

Also, is there a way to print the dir path in one column and the other entries in another column?

Thanks
Venkat
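
One possible way to get both, as a rough sketch under a few assumptions: header lines contain "Dir Path" as in the sample (real paths such as c:\temp\abc would need a different header test), the two columns are separated by a tab, and the optional leading quote ("?) in the patterns is an addition. The names two_columns and seen are made up for this example.

```shell
# two_columns is a hypothetical helper; it reads the files given as
# arguments, or standard input when there are none.
two_columns() {
    awk -v OFS='\t' '
        # Header line: remember it instead of printing it
        /Dir Path/ { hdr = $0; next }
        NF {
            nme = usr = ""
            if (match($0, /"?[Dd][Aa][Tt][Aa][Ss][Oo][Uu][Rr][Cc][Ee] [Nn][Aa][Mm][Ee][^;]*/))
                nme = substr($0, RSTART, RLENGTH)
            if (match($0, /"?[Uu][Ss][Ee][Rr] [Ii][Dd][^;]*/))
                usr = substr($0, RSTART, RLENGTH)
            entry = nme " " usr
            # Print each entry once per header, in input order
            if (!seen[hdr, entry]++)
                print hdr, entry
        }
    ' "$@"
}

two_columns <<'EOF'
Dir Path 2
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Connection pool="somename"; "DataSource Name"="DS name"; Password="pwd"; User Id="uid";some other fields
Dir Path 3
Connection pool="somename"; "DataSource Name"="DS name3"; Password="pwd3"; User Id="uid3";some other fields
Connection pool="somename"; "DataSource Name"="DS name3"; Password="pwd3"; User Id="uid3";some other fields
EOF
```

The array index seen[hdr, entry] keeps duplicates separate per dir path, so an entry repeated under one header is printed once, while the same entry under another header is still printed.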