I suppose we can exclude the hex value via the following, then capture lines starting with /lib.
./copy_chroot_lib.sh ls echo | sed -E 's#.* => ##;s/[(]0.*//'
linux-vdso.so.1 <----- should not be included
/lib/x86_64-linux-gnu/libselinux.so.1
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libpcre.so.3
/lib/x86_64-linux-gnu/libdl.so.2
/lib64/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/libpthread.so.0
linux-vdso.so.1
/lib/x86_64-linux-gnu/libc.so.6
/lib64/ld-linux-x86-64.so.2
Sorry, in my first sample the .* (match any characters) was wrongly placed at the end; it belongs at the beginning.
You can add a second substitution; the trick is the order.
There are a few reasons why your original sed script can't work.
Your first asterisk in your RE needs a period before it.
There is nothing in your script to avoid printing lines that do not contain /lib .
There is nothing in your RE that will stop looking when the first occurrence of /lib is found (and if you're looking for /lib instead of something like => you're going to have more problems, because /lib occurs twice in most of the fields you want to print).
Since your sed command contains unquoted shell pathname-matching metacharacters, there is also a slight danger that the RE in your sed substitute command could be destroyed by actually matching the pathname of an existing file.
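The quoting problem can be reproduced in a scratch directory. The following is only a sketch: the sed script s/a*c/X/ and the directory tree s/abc/X are made up for illustration and are not taken from your script, and echo stands in for sed so that the argument the command would receive is visible.

```shell
# Scratch directory so the demonstration does not touch real files.
demo_dir=$(mktemp -d)

# Hypothetical directory tree whose path happens to match the unquoted script.
mkdir -p "$demo_dir/s/abc/X"

# Unquoted: the shell performs pathname expansion before the command runs.
unquoted=$(cd "$demo_dir" && echo s/a*c/X/)

# Quoted: the script reaches the command unchanged.
quoted=$(cd "$demo_dir" && echo 's/a*c/X/')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$demo_dir"
```

In the unquoted case the command receives s/abc/X/ instead of the intended substitute command, which is why the sed script should always be enclosed in single quotes.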
Note that in post #3 where you said you had the output you wanted, there was a line in the output:
/lib64/ld-linux-x86-64.so.2
that does not appear in your subsequent posts nor in the output produced by MadeInGermany's suggestions. Note that there is no => in the input lines:
which are the two input lines that end up producing the missing line of output.
If you still want that line of output try the command:
./copy_chroot_lib.sh ls echo |
sed -n '/\/lib/s#.* \(/lib[^ ]*\) .*#\1#p' |
sort -u
and see if it works for you. It works OK for me when using ksh or bash on macOS Mojave (version 10.14.6) with the sample data you provided. If the leading whitespace shown in your sample data is really a tab character rather than eight spaces, you will need to make a minor adjustment to the BRE. Note that invoking tr and uniq is not needed. The work that they do can all be done by sed and sort.
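One possible form of that adjustment, as a sketch only: replace each literal space in the BRE with the POSIX character class [[:space:]], which matches a space or a tab. The function name extract_libs, the sample input lines and their addresses are made up for illustration, and LC_ALL=C is added only so that the sort order is the same on every system.

```shell
# Tab-tolerant variant of the sed command: [[:space:]] matches a space or a tab.
# Lines without a whitespace-preceded /lib path fail the substitution and are not printed.
extract_libs() {
    sed -n 's#.*[[:space:]]\(/lib[^[:space:]]*\)[[:space:]].*#\1#p' |
    LC_ALL=C sort -u
}

# Made-up sample input in the style of ldd output, indented with tabs (\t).
printf '\tlinux-vdso.so.1 (0x00007ffd0a1f2000)\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc21dd74000)\n\t/lib64/ld-linux-x86-64.so.2 (0x00007fc21e165000)\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3a1c000000)\n' |
    extract_libs
```

With this input the vdso line is dropped, the duplicate libc.so.6 path is printed once, and the /lib64 line without => is kept.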
If you don't need the output sorted, but just want to get rid of duplicates, you can also try the following which just needs awk (with no sort required):
./copy_chroot_lib.sh ls echo |
awk ' $(NF-1) ~ "^/lib" { o[$(NF- 1)] }
END { for(f in o) print f }'
Thanks for the responses, nice idea(s). The line without => is not captured via your regex, however DonCragun has provided some details around this.
/lib64/ld-linux-x86-64.so.2 (0x00007fc21e165000)
Hi DonCragun,
Thanks for the response.
The initial solution with awk/tr/sort works. It appears to succeed because the lines that do not contain => as a delimiter are left untouched by tr anyway.
./copy_chroot_lib.sh ls echo | tr '=>' '\n' | awk '/\/lib/{print $1}' | sort -u
/lib64/ld-linux-x86-64.so.2 <------------ line without => is included
/lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libdl.so.2
/lib/x86_64-linux-gnu/libpcre.so.3
/lib/x86_64-linux-gnu/libpthread.so.0
/lib/x86_64-linux-gnu/libselinux.so.1
The awk solution is nice and works, as I only need to remove duplicates. Sed is also good, but a little complex to understand.
Can you please explain the following awk code?
./copy_chroot_lib.sh ls echo | awk '$(NF-1) ~ "^/lib" {o[$(NF- 1)]} END {for(f in o) print f}'
From what I understand, if the second last field starts with /lib, the field is put into an array named o. Then the array is traversed and the value of all array elements printed out. How does this remove duplicates?
The duplicates are removed because o[ ] is an associative (string-indexed) array, and each distinct string can be an index only once.
When the same $(NF-1) string occurs a second time, it refers to the array element that already exists instead of creating a new one.
You can make it visible by assigning a counter value to the array element, and printing it at the end:
awk '$(NF-1) ~ "^/lib" {o[$(NF- 1)]++} END {for(f in o) print f, "==>", o[f]}'
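Because for (f in o) visits the indexes in an unspecified order, a related idiom may be useful when the original order matters: test the counter while incrementing it and print each path the first time it is seen. This is a different technique from the END loop above, shown only as a sketch; the function name dedupe_libs, the array name seen and the sample lines are made up for illustration, and the NF > 1 guard is an added assumption to skip empty or single-field lines.

```shell
# seen[key]++ yields 0 (false) the first time a key appears, so
# !seen[key]++ is true only for the first occurrence of each path.
dedupe_libs() {
    awk 'NF > 1 && $(NF-1) ~ "^/lib" && !seen[$(NF-1)]++ { print $(NF-1) }'
}

# Made-up sample input in the style of ldd output.
printf '%s\n' \
    'linux-vdso.so.1 (0x00007ffd0a1f2000)' \
    'libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007fc21df1c000)' \
    'libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc21dd74000)' \
    '/lib64/ld-linux-x86-64.so.2 (0x00007fc21e165000)' \
    'libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3a1c000000)' |
    dedupe_libs
```

The paths are printed in order of first appearance (libselinux, libc, ld-linux), and no sort or END block is needed.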