Help converting an awk script to Perl

perl_beginner · April 12, 2012, 7:30am

Input (a list of file names with the .txt extension):

campus.com_icmp_ping_alive.txt
data_local_cd_httpd.txt
data_local_cd.txt
new_local_cd_mysql.txt
new_local_cd_nagios_content.txt

Desired output file:

data   local_cd_httpd
data   local_cd
new   local_cd_mysql
new   local_cd_nagios_content

The awk command I tried:

[home@user]ls *.txt | awk -F"_" '{print $1}' > tmp1
[home@user]cat tmp1
data   
data   
new   
new

I am able to generate column 1 of my desired output with awk.
Column 2 of the desired output is the rest of the file name: everything after the first "_", with the ".txt" extension removed.

It would be better if the script were written in Perl.
Thanks for any advice!

zaxxon · April 12, 2012, 7:53am

Not perl but...

# ls -1 *txt| awk '!/^c/ {sub(/_/,"\t"); print}'
data    local_cd_httpd.txt
data    local_cd.txt
new     local_cd_mysql.txt
new     local_cd_nagios_content.txt

Or with sed:

# ls -1 *txt| sed -n '/^[^c]/ {s/_/\t/p}'
data    local_cd_httpd.txt
data    local_cd.txt
new     local_cd_mysql.txt
new     local_cd_nagios_content.txt

perl_beginner · April 12, 2012, 8:12am

Thanks, zaxxon
Looking forward to a Perl solution to this question too ^^

Ygor · April 12, 2012, 10:18am

You can convert awk to perl using a2p...

$ echo '!/^c/ {sub(/_/,"\t"); print}'|a2p
#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if $running_under_some_shell;
                        # this emulates #! processing on NIH machines.
                        # (remove #! line above if indigestible)

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z_0-9]+=)(.*)/ && shift;
                        # process any FOO=bar switches

$, = ' ';               # set output field separator
$\ = "\n";              # set output record separator

while (<>) {
    chomp;      # strip record separator
    if (!/^c/) {
        s/_/\t/;
        print $_;
    }
}

Ygor · April 12, 2012, 10:28am

So...

$ ls *.txt|perl -n -e 'if (!/^c/) { s/_/\t/; s/\.txt$//; print $_ }'
data    local_cd
data    local_cd_httpd
new     local_cd_mysql
new     local_cd_nagios_content

perl_beginner · April 12, 2012, 9:50pm

Hi zaxxon, would you mind explaining what "^c" means in your command?
Many thanks

---------- Post updated at 08:50 PM ---------- Previous update was at 08:49 PM ----------

Hi ygor,

Would you mind explaining the meaning of "^c" in your code?
Thanks

zaxxon · April 12, 2012, 10:22pm

^[^c]
The first ^ anchors the match to the start of the line.
The second ^, inside the square brackets (a character class), negates the class, which here contains just the c. So the pattern means "a line whose first character is anything but c". The awk version, !/^c/, gets the same effect by negating a match on "line starts with c".
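A minimal sketch of the two forms side by side, assuming the five file names from the first post are kept in a shell variable instead of coming from ls. The names negated_match and negated_class are made up for this illustration.

```shell
# Sample input: the five file names from the first post.
names='campus.com_icmp_ping_alive.txt
data_local_cd_httpd.txt
data_local_cd.txt
new_local_cd_mysql.txt
new_local_cd_nagios_content.txt'

# awk form: print the lines that do NOT match "starts with c".
negated_match() { printf '%s\n' "$names" | awk '!/^c/'; }

# Bracket form: print the lines whose first character is anything but c.
negated_class() { printf '%s\n' "$names" | grep '^[^c]'; }

negated_match
negated_class
```

Both functions print the same four names here. The two forms differ only on an empty line: !/^c/ lets it through, while ^[^c] needs at least one character and so skips it.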

perl_beginner · April 12, 2012, 11:44pm

Hi zaxxon,

In other words, as long as a line does not start with "c", the first "_" in it is replaced with "\t". Am I right?

Thanks for the verification and for sharing the info.

itkamaraj · April 13, 2012, 12:49am

$ perl -F_ -lane 'printf("%s\t",$F[0]); shift @F; for(@F){s/\.txt$//;}print join("_",@F)' input.txt
campus.com      icmp_ping_alive
data    local_cd_httpd
data    local_cd
new     local_cd_mysql
new     local_cd_nagios_content

zaxxon · April 13, 2012, 4:25am

Correct.
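
A small sketch of that rule, assuming the names arrive on standard input rather than from ls. The function name first_underscore_to_tab and the two sample names are only for illustration.

```shell
# Hypothetical helper: runs the awk program from earlier in the thread on stdin.
first_underscore_to_tab() {
    # !/^c/ skips lines starting with "c"; sub() replaces only the first "_".
    awk '!/^c/ { sub(/_/, "\t"); print }'
}

# The first name starts with "c" and is dropped; the second keeps its later "_".
printf '%s\n' 'campus.com_icmp_ping_alive.txt' 'new_local_cd_mysql.txt' |
    first_underscore_to_tab
```

Swapping sub() for gsub() would turn every "_" on the line into a tab, not just the first one.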