Cannot extract libraries using sed

sand1234 · August 12, 2019, 11:32pm

Hi,

I am attempting to extract the /lib/ paths using sed but it does not appear to work.

./copy_chroot_lib.sh ls echo | sed s#*\(/lib\).*#\1#g
        linux-vdso.so.1 (0x00007fff77df1000)
        libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f28190ac000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2818ec2000)
        libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f2818e4e000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f2818e48000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2819303000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f2818e27000)
        linux-vdso.so.1 (0x00007ffca8342000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f485dc28000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f485de28000)

 ./copy_chroot_lib.sh ls echo | sed -n s#\(/lib\).*#\1#pg

Desired output

/lib/x86_64-linux-gnu/libselinux.so.1
...
/lib64/ld-linux-x86-64.so.2

Can you please tell me what I am doing wrong?

Thanks.

Neo · August 12, 2019, 11:41pm

Please post the output of this command only:

./copy_chroot_lib.sh ls echo

sand1234 · August 12, 2019, 11:43pm

UPDATE: I have managed to get the desired output using tr/awk, but I will leave the thread open for alternative solutions. Thanks!

./copy_chroot_lib.sh ls echo | tr '=>' '\n' | awk '/\/lib/{print $1}' | sort | uniq
/lib64/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libdl.so.2
/lib/x86_64-linux-gnu/libpcre.so.3
/lib/x86_64-linux-gnu/libpthread.so.0
/lib/x86_64-linux-gnu/libselinux.so.1

--- Post updated at 04:43 AM ---

Hi Neo,

Sure.

 ./copy_chroot_lib.sh ls echo
        linux-vdso.so.1 (0x00007ffeba118000)
        libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f9b0d737000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9b0d54d000)
        libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f9b0d4d9000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9b0d4d3000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9b0d98e000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9b0d4b2000)
        linux-vdso.so.1 (0x00007ffd9c8cf000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2f31a69000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2f31c69000)

Thanks.

MadeInGermany · August 13, 2019, 2:04am

The sed expression should be in quotes, so that the shell does not interpret its special characters (backslashes, asterisks, parentheses) before sed sees them.

sed -n 's#\(/lib\).*#\1#p'

Or

sed -n 's#.* => ##p'
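To see what quoting changes, here is a small sketch that shows the argument sed would actually receive with and without quotes. The variable names are only for illustration, and globbing is switched off with set -f so the result does not depend on the files in the current directory.

```shell
# Capture the word exactly as the shell would pass it to sed.
set -f                          # disable pathname expansion for a repeatable demo
set -- s#\(/lib\).*#\1#p        # unquoted: the shell removes the backslashes
unquoted=$1
set -- 's#\(/lib\).*#\1#p'      # quoted: passed through unchanged
quoted=$1
set +f

printf 'unquoted: %s\n' "$unquoted"   # unquoted: s#(/lib).*#1#p
printf 'quoted:   %s\n' "$quoted"     # quoted:   s#\(/lib\).*#\1#p
```

Without the backslashes, sed sees literal parentheses and a literal 1 in the replacement instead of a capture group and a back-reference.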

sand1234 · August 13, 2019, 8:14am

Hi MadeinGermany,

Thanks for the tips.

The first solution does not work:

 ./copy_chroot_lib.sh ls echo | sed -n 's#\(/lib\).*#\1#p'
        libselinux.so.1 => /lib
        libc.so.6 => /lib
        libpcre.so.3 => /lib
        libdl.so.2 => /lib
        /lib
        libpthread.so.0 => /lib
        libc.so.6 => /lib
        /lib

The second solution captures more than the file path: the output also includes the (0x00...) address.

 ./copy_chroot_lib.sh ls echo | sed -n 's#.* => ##p'
/lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f73bfead000)
/lib/x86_64-linux-gnu/libc.so.6 (0x00007f73bfcc3000)
/lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f73bfc4f000)
/lib/x86_64-linux-gnu/libdl.so.2 (0x00007f73bfc49000)
/lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f73bfc28000)
/lib/x86_64-linux-gnu/libc.so.6 (0x00007f2cfa01c000)

I suppose we can exclude the hex value via the following, then capture lines starting with /lib.

 ./copy_chroot_lib.sh ls echo | sed -E 's#.* => ##;s/[(]0.*//'
        linux-vdso.so.1   <----- should not be included
/lib/x86_64-linux-gnu/libselinux.so.1
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libpcre.so.3
/lib/x86_64-linux-gnu/libdl.so.2
        /lib64/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/libpthread.so.0
        linux-vdso.so.1
/lib/x86_64-linux-gnu/libc.so.6
        /lib64/ld-linux-x86-64.so.2

Do let me know if you have any better ideas.

Thanks!

MadeInGermany · August 13, 2019, 9:16am

Sorry, in my first sample the .* (match any characters) was wrongly placed at the end; it belongs at the beginning.
You can add a second substitution; the trick is the order.

sed -n 's# *[(].*##; s#.* => ##p'

With awk

awk '$2 == "=>" {print $3}'
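As a quick check of why the order matters, the sketch below runs both commands on three lines taken from the sample output earlier in the thread. The variable names are only for illustration.

```shell
# Three representative lines from the sample output in this thread.
sample='        linux-vdso.so.1 (0x00007ffeba118000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9b0d54d000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9b0d98e000)'

# First strip the trailing " (0x...)", then strip everything up to " => ".
# The p flag is on the second substitution, so only lines with "=>" are printed.
sed_out=$(printf '%s\n' "$sample" | sed -n 's# *[(].*##; s#.* => ##p')

# Same selection with awk: print field 3 when field 2 is "=>".
awk_out=$(printf '%s\n' "$sample" | awk '$2 == "=>" {print $3}')

printf '%s\n' "$sed_out"   # /lib/x86_64-linux-gnu/libc.so.6
printf '%s\n' "$awk_out"   # /lib/x86_64-linux-gnu/libc.so.6
```

Both variants skip the /lib64/ld-linux-x86-64.so.2 line, because that line has no => in it.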

Don_Cragun · August 13, 2019, 8:16pm

There are a few reasons why your original sed script can't work.

Your first asterisk in your RE needs a period before it.
There is nothing in your script to avoid printing lines that do not contain /lib .
There is nothing in your RE that will stop looking when the first occurrence of /lib is found (and if you're looking for /lib instead of something like => you're going to have more problems, because /lib occurs twice in most of the fields you want to print).

Since your sed command contains unquoted shell pathname matching meta characters there is also a slight danger that the RE in your sed substitute command could be destroyed by actually matching the pathname of an existing file.
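The points about the missing period and about where the match stops can be checked on a single sample line. This is only a sketch; the variable names are made up for the demonstration.

```shell
line='        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9b0d54d000)'

# A leading * in a BRE is a literal asterisk; the line contains none,
# so nothing is substituted and the line comes back unchanged.
no_dot=$(printf '%s\n' "$line" | sed 's#*\(/lib\).*#\1#')

# With .* in front the whole line matches, but the group holds only "/lib".
with_dot=$(printf '%s\n' "$line" | sed 's#.*\(/lib\).*#\1#')

# Anchoring on the preceding space and extending the group to the next
# space captures the full path.
full_path=$(printf '%s\n' "$line" | sed 's#.* \(/lib[^ ]*\).*#\1#')

printf '[%s]\n' "$no_dot" "$with_dot" "$full_path"
```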

Note that in post #3 where you said you had the output you wanted, there was a line in the output:

/lib64/ld-linux-x86-64.so.2

that does not appear in your subsequent posts nor in the output produced by MadeInGermany's suggestions. Note that there is no => in the input lines:

        /lib64/ld-linux-x86-64.so.2 (0x00007f2819303000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f485de28000)

which are the two input lines that end up producing the missing line of output.

If you still want that line of output try the command:

./copy_chroot_lib.sh ls echo |
	sed -n '/\/lib/s#.* \(/lib[^ ]*\) .*#\1#p' |
	sort -u

and see if it works for you. It works OK for me when using ksh or bash on macOS Mojave (version 10.14.6) with the sample data you provided. If the leading spaces shown in your sample data are really a tab character instead of eight spaces, you will need to make a minor adjustment to the BRE. Note that invoking tr and uniq is not needed. The work that they do can all be done by sed and sort .
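One possible form of that adjustment, as a sketch only: replace the literal spaces in the BRE with the POSIX [[:space:]] class, so that tabs and spaces are both accepted. The choice of [[:space:]] and the LC_ALL=C on sort (to make the order repeatable) are assumptions of this example, not part of the command above.

```shell
# Sample lines that start with a tab instead of spaces.
tabbed=$(printf '\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9b0d54d000)\n\t/lib64/ld-linux-x86-64.so.2 (0x00007f9b0d98e000)\n\tlinux-vdso.so.1 (0x00007ffd9c8cf000)\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2f31a69000)\n')

# [[:space:]] matches a tab or a space; [^[:space:]]* runs to the end of the path.
paths=$(printf '%s\n' "$tabbed" |
	sed -n 's#.*[[:space:]]\(/lib[^[:space:]]*\)[[:space:]].*#\1#p' |
	LC_ALL=C sort -u)

printf '%s\n' "$paths"
```

In the C locale this prints /lib/x86_64-linux-gnu/libc.so.6 first and /lib64/ld-linux-x86-64.so.2 second; other locales may order the two lines differently.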

If you don't need the output sorted, but just want to get rid of duplicates, you can also try the following which just needs awk (with no sort required):

./copy_chroot_lib.sh ls echo |
	awk '	$(NF-1) ~ "^/lib" {	o[$(NF- 1)]		}
		END {			for(f in o) print f	}'

sand1234 · August 15, 2019, 9:42am

Hi MadeinGermany,

Thanks for the responses, nice idea(s). The line without => is not captured by your regex; however, DonCragun has provided some details around this.

 /lib64/ld-linux-x86-64.so.2 (0x00007fc21e165000)

Hi DonCragun,

Thanks for the response.

The initial solution with awk/tr/sort works. It seems to succeed because tr leaves the lines that do not contain => untouched anyway.

./copy_chroot_lib.sh ls echo | tr '=>' '\n' | awk '/\/lib/{print $1}' | sort -u
/lib64/ld-linux-x86-64.so.2   <------------ line without => is included
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libdl.so.2
/lib/x86_64-linux-gnu/libpcre.so.3
/lib/x86_64-linux-gnu/libpthread.so.0
/lib/x86_64-linux-gnu/libselinux.so.1
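For anyone reading along, a short sketch of what the tr step does. tr maps single characters, not the string "=>", so each = and each > becomes a newline on its own. The example spells the second set out as two newlines ('\n\n'), which is an adjustment of mine: how tr handles a second set shorter than the first differs between implementations. LC_ALL=C is also my addition, to keep the sort order repeatable.

```shell
sample='        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9b0d54d000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9b0d98e000)
        linux-vdso.so.1 (0x00007ffd9c8cf000)'

# "=" and ">" each turn into a newline, so the path after "=>" starts a new
# line; awk then prints the first field of every line that contains "/lib".
tr_out=$(printf '%s\n' "$sample" |
	tr '=>' '\n\n' |
	awk '/\/lib/{print $1}' |
	LC_ALL=C sort -u)

printf '%s\n' "$tr_out"
```

The /lib64 line has neither = nor >, so tr passes it through and awk picks up its first field directly.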

The awk solution is nice and works, as I only need to remove duplicates. Sed is also good, but a little complex to understand.

Can you please explain the following awk code?

 ./copy_chroot_lib.sh ls echo | awk '$(NF-1) ~ "^/lib" {o[$(NF- 1)]} END {for(f in o) print f}'

From what I understand, if the second-to-last field starts with /lib, the field is put into an array named o. Then the array is traversed and the value of each array element is printed out. How does this remove duplicates?

Thanks.

MadeInGermany · August 15, 2019, 11:01am

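The duplicates are removed because o[ ] is an associative (string-indexed) array: the path is used as the index, not stored as a value.
When the same string $(NF-1) occurs a second time, it refers to the array element that already exists instead of creating a new one, and the END loop prints each index once.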
You can make it visible by assigning a counter value to the array element, and printing it at the end:

awk '$(NF-1) ~ "^/lib" {o[$(NF- 1)]++} END {for(f in o) print f, "==>", o[f]}'
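Here is a sketch of that counter on a few sample lines from the thread. The order of for (f in o) is unspecified in awk, so the output is piped through sort (with LC_ALL=C, my addition) to make it repeatable. The second command is a variant not posted above: the common !seen[key]++ idiom, which prints each path the first time it appears and therefore keeps the input order. The names seen, counts and first_seen are only for illustration.

```shell
sample='        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9b0d54d000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9b0d98e000)
        linux-vdso.so.1 (0x00007ffd9c8cf000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2f31a69000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2f31c69000)'

# Count how often each path is used as an array index.
counts=$(printf '%s\n' "$sample" |
	awk '$(NF-1) ~ "^/lib" {o[$(NF-1)]++} END {for (f in o) print f, "==>", o[f]}' |
	LC_ALL=C sort)

# Variant: print a path only when its counter is still zero (first occurrence).
first_seen=$(printf '%s\n' "$sample" |
	awk '$(NF-1) ~ "^/lib" && !seen[$(NF-1)]++ {print $(NF-1)}')

printf '%s\n' "$counts"
printf '%s\n' "$first_seen"
```

Each path shows a count of 2 in this sample, yet appears only once in the output, because both occurrences land on the same array element.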