print only comments

I want to write a shell script that takes a Java or C++ source file (.java or .cpp) as its argument.
It should check that the file is a Java or C++ file; otherwise it should exit with an error message.
If everything is OK, it should call awk to print only the comments that the file contains, grouped together and placed between tags.

For example, I have this code:

/*
The HelloWorld class implements an application that
simply prints "Hello World!" to standard output.
//You can add comments
*/
class HelloWorld {
    public static void main(String[] args) {
//Let's print Hello World
        System.out.println("Hello World!");
    } /* some comments */
} //end of class

I want to have this output:

<comments>
/*
The HelloWorld class implements an application that
simply prints "Hello World!" to standard output.
//You can add comments
*/
/* some comments */
//end of class
</comments>

I guess this one is also a comment ^^

Check this:

#!/usr/local/bin/perl

use strict;
use warnings;

my $mlcom=0; # multiline comment
my $line;
my @outarr;

if (@ARGV == 0) { print "Missing argument!\n"; exit 1; }

if (@ARGV >= 2) { print "Too many arguments!\n"; exit 1; }

my $infile="$ARGV[0]";
my $outfile='comments.dat';

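# three-argument open with a lexical filehandle: the file name is never parsed as a mode
open(my $in, '<', $infile) or die "Error opening input file: $!\n";
my @data=<$in>;
close($in);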

push @outarr, "<comments>\n";

foreach my $line (@data) {
    chomp($line);

    if ($mlcom == 1 && $line !~ /\*\//) {
        push @outarr, "$line\n";
        next;
    }
    elsif ($mlcom == 1 && $line =~ /\*\//) {
        push @outarr, "$line\n";
        $mlcom=0;
        next;
    }

    if ($line =~ /\/\*/ && $line =~ /\*\//) {
        $line =~ s!^[^/]+!! ;
        push @outarr, "$line\n";
        next;
    }
    elsif ($line =~ /\/\//) {
        $line =~ s!^[^/]+!! ;
        push @outarr, "$line\n";
        next;
    }
    elsif ($line =~ /^\/\*/ && $line !~ /\*\//) {
        $mlcom=1;
        push @outarr, "$line\n";
        next;
    }
    else {
        next;
    }
}

push @outarr, "</comments>\n";

open(my $out, '>', $outfile) or die "Error opening output file: $!\n";
print $out @outarr;
close($out);

Script demonstration / Test run:

$ cat myfile.cpp
/*
The HelloWorld class implements an application that
simply prints "Hello World!" to standard output.
//You can add comments
*/
class HelloWorld {
    public static void main(String[] args) {
//Let's print Hello World
        System.out.println("Hello World!");
    } /* some comments */
} //end of class
$ 
$ ./pickcomm.pl
Missing argument!
$ 
$ ./pickcomm.pl myfile.cpp another.cpp
Too many arguments!
$ 
$ ./pickcomm.pl myfile.cpp 
$
$ cat comments.dat
<comments>
/*
The HelloWorld class implements an application that
simply prints "Hello World!" to standard output.
//You can add comments
*/
//Let's print Hello World
/* some comments */
//end of class
</comments>
$ 


Let me know if this Perl script is an option for you; if so, I could add file type checking of the argument (the file name).
Aiiyeh, I see you want them grouped too. That should not be a problem either: simply create 2 (or 3) arrays for the 2 (or 3) types of comments.

Thank you for the answer, but I want something different (no Perl), like this:

#!/bin/sh

if [ -f  $1 ]; then
echo " file $1 exists"
else
echo "file $1 does not exists"
fi
exit 1

#  but I want to check if file is .java or .c++ not a simple file (-f), I don't know how to do it..
# if all are ok I call awk

BEGIN awk'

regular expression for comments /*...*/, //.

END{print comments}'

#!/bin/sh

if [ -f "$1" ]; then
    echo "file $1 exists"
else
    echo "file $1 does not exist"
    exit 1 # needs to be before the "fi" line, else the script will never run the remaining parts of the script.
fi

# quote "$1" so file names containing spaces survive; sed strips everything up to the last dot
ext=$(echo "$1" | sed 's/.*\.//')

if [ "$ext" = "cpp" ] || [ "$ext" = "c++" ] || [ "$ext" = "java" ]; then
    echo "file $1 is a .$ext file"
else
    echo "file $1 is neither a .cpp/.c++ nor a .java file!"
    exit 1
fi

When I run it, it prints this, not the comments:

./code.txt HelloWorld.java

file HelloWorld.java exists
file HelloWorld.java is a .java file

Yeah, because I (apart from the Perl script) only helped you to solve this problem:
# but I want to check if file is .java or .c++ not a simple file (-f), I don't know how to do it..

The part with awk and regex (the hardest part IMHO) is still open.
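
For the part that is still open, here is a minimal sketch of how the awk step could look, assuming a POSIX sh and a POSIX awk. The function name print_comments and the awk variable names (in_block, stop) are made up for illustration, and accepting .cpp next to .c++ is my assumption based on the first post. It prints every // and /* ... */ comment in file order, so "//Let's print Hello World" is included as well. It does not understand string literals, so a "//" or "/*" inside a quoted string would be reported as a comment too.

```shell
#!/bin/sh
# Sketch: print the comments of a .java / .cpp / .c++ file between <comments> tags.

print_comments() {
    if [ "$#" -ne 1 ]; then
        echo "usage: print_comments FILE" >&2
        return 1
    fi
    if [ ! -f "$1" ]; then
        echo "file $1 does not exist" >&2
        return 1
    fi
    case "$1" in
        *.java|*.cpp|*.c++) ;;          # accepted extensions (assumption: both C++ spellings)
        *) echo "file $1 is neither a .java nor a .cpp/.c++ file" >&2
           return 1 ;;
    esac

    awk '
    BEGIN { print "<comments>" }
    {
        line = $0
        if (in_block) {                 # still inside a block comment opened on an earlier line
            stop = index(line, "*/")
            if (stop == 0) { print line; next }
            print substr(line, 1, stop + 1)
            line = substr(line, stop + 2)
            in_block = 0
        }
        # scan what is left of the line for further comment starts
        while (match(line, /\/\/|\/\*/)) {
            if (substr(line, RSTART, 2) == "//") {
                print substr(line, RSTART)      # line comment runs to end of line
                break
            }
            line = substr(line, RSTART)         # line now begins with the block opener
            stop = index(substr(line, 3), "*/") # look for the closer after the opener
            if (stop == 0) { print line; in_block = 1; break }
            print substr(line, 1, stop + 3)
            line = substr(line, stop + 4)
        }
    }
    END { print "</comments>" }
    ' "$1"
}
```

To use it as a standalone script, add print_comments "$@" as the last line and run it as ./code.txt HelloWorld.java; grouping the comments by type would need separate arrays that are printed in the END block.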
