Is there a simple way to find the longest common prefix of a space-separated list of strings, optionally by field?
For example, given input:
"aaa_b_cc aaa_b_cc_ddd aaa_b_cc aaa_b_cd"
with no field separator, output:
aaa_b_c
with _ as the field separator, output:
aaa_b
I have an awk solution which appears to work (although I haven't done much testing):
function get_common_prefix() {
    list="$1"
    sep="$2"
    # RS=' ' makes each space-separated word one record; FS splits it into fields.
    printf '%s' "$list" | awk '(NR==1) {pcount=split($0,prefix)}
    (NR>1) {for (i=pcount;i>0;i--) {if ($i!=prefix[i]) {pcount=i-1}}}
    END {NF=pcount;print}' RS=' ' FS="$sep" OFS="$sep"
}
myprefix=$(get_common_prefix "$1" "$2")
printf "[%s]\n" "$myprefix"
Searching didn't come up with anything more elegant that could handle both by character and by field. So, just wondering if the forum had any better solutions.
EDIT: The above (written on cygwin) doesn't work very well with the non-gawk awk on AIX 6.1. It seems you can't fiddle with NF in END the way I'm doing above (although just printing the first pcount fields of prefix works), and a blank field separator seems to be equivalent to whitespace (i.e. it won't split by character).
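For reference, here is a rough sketch of a variant that avoids both of those gawk-specific behaviours: it never assigns NF in END, and it splits by character with substr() instead of relying on an empty FS. The name get_common_prefix_portable and the helper chop are made up for illustration. It assumes a single-character separator with no backslashes, and I have not tried it on AIX itself.

```shell
# Sketch only: get_common_prefix_portable and chop are hypothetical names.
# Assumes the separator is empty or a single literal character.
get_common_prefix_portable() {
    list=$1
    sep=$2
    # One word per line, so the default RS works in any awk.
    printf '%s\n' "$list" | tr ' ' '\n' | awk -v sep="$sep" '
    # Split s into array a: by character if sep is empty, otherwise on sep.
    function chop(s, a,    n, i) {
        if (sep == "") {
            n = length(s)
            for (i = 1; i <= n; i++) a[i] = substr(s, i, 1)
            return n
        }
        return split(s, a, sep)
    }
    $0 == "" { next }                       # skip blanks from repeated spaces
    !seen { seen = 1; pcount = chop($0, prefix); next }
    {
        n = chop($0, cur)
        if (n < pcount) pcount = n          # a shorter word caps the prefix
        for (i = 1; i <= pcount; i++)
            if (cur[i] != prefix[i]) { pcount = i - 1; break }
    }
    END {
        # Rebuild the prefix from the saved pieces instead of touching NF.
        out = ""
        for (i = 1; i <= pcount; i++)
            out = out (i > 1 ? sep : "") prefix[i]
        print out
    }'
}
```

With the example input above, get_common_prefix_portable "aaa_b_cc aaa_b_cc_ddd aaa_b_cc aaa_b_cd" "" prints aaa_b_c, and the same call with "_" as the second argument prints aaa_b.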