Add comment on last line if found match

kttan · December 5, 2017, 2:07am

Hi All,

I'm totally new to this; normally I use it for just one line.
I'm looking for help.

I have 2 files.

file 1 :

-------------------------------------------------- 
c12
c1
c3
--------------------------------------------------

file 2:

other content 
--------------------------------------------------
 test "analog/c12" 
 test "analog/c13"
 test "analog/c1"
other content.
--------------------------------------------------

Expected output:
file 3 (if possible, change file 2 directly):

--------------------------------------------------
other content 
 test "analog/c12"   ! comment 
 test "analog/c13"
 test "analog/c1"     ! comment
other content.
-------------------------------------------------

My other question is:
when using awk, how do I read data from file 1 and store it in a variable for use with file 2?

Don_Cragun · December 5, 2017, 2:59am

Having two input files and hoping to get a third output file with no explanation of the logic to be used to make those changes is not at all likely to get you what you want.

Please tell us what operating system you're using.

Please tell us what shell you're using.

Please clearly explain exactly what text should be added to file 2 when certain conditions are met.

Please clearly explain exactly what conditions need to occur to cause a change to be made to file 2 .

Note that putting <space>, <tab>, or (especially) <newline> characters in a filename makes it less likely that people writing scripts to use your file will actually handle your file correctly.

kttan · December 5, 2017, 3:24am

Thank you for the reply.

OS:
Windows 7 and Windows XP, using the Korn shell.

The text to add to file 2 when the condition is met is <space><space>!<space>comment<space>comment at the end of the line.

Condition:
If an entry in file 1 has a matching line in file 2, add the comment to that line.
Example:
file 1 has c12, so search file 2; if test "analog/c12" is found, modify that line in file 2, and so on.

RudiC · December 5, 2017, 5:36am

Making quite a few assumptions, as the capabilities of the Windows awk version are not known (to me), nor is the pattern to search for (is "analog/" compulsory?), nor WHAT to add (the info in post #1 and post #2 doesn't match), nor does the thread title match its contents: would this come close to what you want?

awk 'NR==FNR {if (!/^--/) T[$1]; next} $3 in T {$0 = $0 "  ! comment"} 1' file1 FS='[/"]' file2
other content 
--------------------------------------------------
 test "analog/c12"   ! comment
 test "analog/c13"
 test "analog/c1"  ! comment
other content.
--------------------------------------------------

As you can see, a correct and detailed specification helps avoid ambiguities and keeps people from guessing. Be way more precise and specific in the future!

Don_Cragun · December 5, 2017, 6:48am

I agree with RudiC that your specification doesn't match the thread title and doesn't match the output you specified in post #1 in this thread. And you haven't said, since you're working on Windows, whether the input files you're using are UNIX text files or DOS text files. The following should work no matter which type of text file you have and produce the same type of output in file 2 that it found in the original file 2 . But, of course, it is making lots of assumptions based on your conflicting requirements. (Note that I still think it is a horrible idea to use filenames containing <space> characters, but this script uses the filenames you specified in post #1.)

#!/bin/ksh
awk '
{	# Get rid of <carriage-return> at end of line if there
	# is one.  Set cr to <carriage-return> if there was one; otherwise
	# set it to an empty string.
	cr = sub(/\r$/, "") ? "\r" : ""
}
FNR == NR {
	# For all lines in the first input file...
	# Set a search string as an index in the add_com[] array corresponding
	# to this input line.
	add_com["test \"analog/" $0 "\""]
	next
}
{	# For all lines in the second input file, look for a match in add_com[].
	for(i in add_com)
		if(index($0, i)) {
			# Match found.
			# Set this line in the output buffer to include a
			# comment and put back the <carriage-return> if there
			# was one.
			o[FNR] = $0 "  ! comment" cr
			# Note that a modification was made.
			mod = 1
			next
		}
	# No match found.
	# Copy this line to output buffer unchanged (restoring the
	# <carriage-return> if there was one).
	o[FNR] = $0 cr
}
END {	# If any changes were made, copy the new contents of the second file
	# back into that file.
	if(mod)
		for(i = 1; i <= FNR; i++)
			print o[i] > FILENAME
}' "file 1" "file 2"

Note that when you store the above script in a file in Windows, you absolutely must make the text in this file be in UNIX text file format with no <carriage-return> characters in the file. If the location of the Korn shell on Windows is not /bin/ksh , you'll need to change the end of the first line of the script to an absolute path to its location on your systems.

kttan · December 5, 2017, 8:51pm

Thank you for the reply, it works.

Sorry for not giving a clear explanation.
I'm a test engineer and not very familiar with UNIX scripting; I'm looking for help because my system runs on Windows with a Korn shell base.

I tried a few things and searched websites before posting.

I'd like to understand how it works on 2 files.
1.
May I get a clearer explanation of this:
}' "file 1" "file 2"
2.
Does awk read a file from top to bottom?
If there are 2 awk commands like these, does the 1st one finish before the second one starts?
awk '
{
content
}'

awk '
{
content
}'

Most of the examples on the internet use just 1 file.

Don_Cragun · December 6, 2017, 2:35am

You can find lots of examples in the UNIX & Linux Forums of awk scripts that work on two or more input files and that produce two or more output files (although we don't need to do the latter in this case).

In an awk program, each group of statements is of the general form:

condition { action }

Before any lines are read from any of the input files named as operands, the commands specified in the actions of all groups with the condition BEGIN (if there are any) are executed in the order in which they appear in the awk program. There aren't any BEGIN sections in my code for this thread.

After all files of the input files named as operands have been processed, all commands specified in the actions of all groups with the condition END (if there are any) are executed in the order in which they appear in the awk program.

All other groups are processed, in the order in which they appear in the awk program, for every record (with default options, each input line in each file is a record). If the condition for a group evaluates to a non-zero numeric value or to a non-empty string value (i.e., evaluates to TRUE), the statements in the action for that group are executed in order; otherwise the statements in that group are skipped for that input record. If there is no condition at the start of a group, the commands in the action in that group are always executed. If the condition evaluates to TRUE and the action and braces ( { and } ) are omitted, a default action of print (which prints the current state of the current input record) is performed.
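
As a minimal sketch of the ordering rules described above (the three-line sample input and the messages are made up for illustration), this small program has a BEGIN group, a group with a condition and an action, a condition with no action, and an END group:

```shell
#!/bin/sh
# Hypothetical sample input: three records, one per line.
demo_out=$(printf 'alpha\nbeta\ngamma\n' | awk '
BEGIN   { print "start" }               # runs before any input is read
/^b/    { print "starts with b: " $0 }  # runs only for records matching the condition
NR == 2                                 # no action given: the default action prints the record
END     { print "records: " NR }        # runs after all input has been read
')
printf '%s\n' "$demo_out"
```

For the second record both the /^b/ group and the NR == 2 group fire, in that order, because every group is tried against every record.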

I will assume that you can read the manual page for awk (by giving the command man awk at a ksh primary prompt in your shell window) to see what the standard awk variables, functions, and statements do. I would hope that the comments I supplied in each group explain what that group is trying to do.

The first group:

{	# Get rid of <carriage-return> at end of line if there
	# is one.  Set cr to <carriage-return> if there was one; otherwise
	# set it to an empty string.
	cr = sub(/\r$/, "") ? "\r" : ""
}

(with no condition) is executed for every record read from both input files and does exactly what the comments say it does.
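
The carriage-return idiom in that group can be tried on its own. This is only a sketch with made-up input (one DOS-style line and one UNIX-style line); had_cr is an illustrative variable name, not part of the script above:

```shell
#!/bin/sh
# First line ends in CR LF (DOS style), second line ends in LF only (UNIX style).
crlf_out=$(printf 'c12\r\nc1\n' | awk '
{
    # sub() returns the number of substitutions made: 1 if a trailing
    # carriage return was removed from this record, otherwise 0.
    had_cr = sub(/\r$/, "")
    print length($0) ":" had_cr
}')
printf '%s\n' "$crlf_out"
```

The printed lengths (3 and 2) show that the carriage return is no longer part of the record once sub() has run.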

The second group:

FNR == NR {
	# For all lines in the first input file...
	# Set a search string as an index in the add_com[] array corresponding
	# to this input line.
	add_com["test \"analog/" $0 "\""]
	next
}

is executed when the condition FNR == NR evaluates to TRUE. It evaluates to TRUE when the Number of Records read from the current File ( FNR ) is equal to the Number of Records read from all files ( NR ), which happens when any line from the 1st input file is being processed. The next statement in this action causes all remaining statements in the current action (if there are any) and in any following groups to be skipped for this input record, causes the next available input record to be read, and starts processing groups in order for that new input record. The combination of the condition and the next statement guarantees that the following group will not be performed for records read from the 1st input file.
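
To see how FNR and NR behave across two input files, here is a small sketch; first.txt and second.txt are hypothetical files created in a temporary directory just for this demonstration:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf 'c12\nc1\n' > "$tmpdir/first.txt"     # two records
printf 'x\ny\nz\n' > "$tmpdir/second.txt"    # three records
fnr_out=$(cd "$tmpdir" && awk '
FNR == NR { print "1st file: NR=" NR " FNR=" FNR; next }  # only true in the first file
          { print "2nd file: NR=" NR " FNR=" FNR }        # reached only after the first file
' first.txt second.txt)
printf '%s\n' "$fnr_out"
rm -rf "$tmpdir"
```

NR keeps counting across files while FNR restarts at 1 for the second file, so the two are equal only while the first file is being read.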

The third group:

{	# For all lines in the second input file, look for a match in add_com[].
	for(i in add_com)
		if(index($0, i)) {
			# Match found.
			# Set this line in the output buffer to include a
			# comment and put back the <carriage-return> if there
			# was one.
			o[FNR] = $0 "  ! comment" cr
			# Note that a modification was made.
			mod = 1
			next
		}
	# No match found.
	# Copy this line to output buffer unchanged (restoring the
	# <carriage-return> if there was one).
	o[FNR] = $0 cr
}

even though there is no condition, is only executed for records read from input files after the 1st input file (and in this code there are only two input files). This group copies the input records as they are read into an output buffer array ( o[] ), with the index in the array being the current input file record number, after searching for and updating any lines that contain the key strings created from lines found in the 1st input file.
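
The matching step of that group can be sketched in isolation. In this simplified version the keys are hard-coded in a BEGIN group (standing in for what the second group builds from lines c12 and c1), the array is called keys instead of add_com, and lines are printed directly rather than buffered:

```shell
#!/bin/sh
match_out=$(printf ' test "analog/c12"\n test "analog/c13"\n test "analog/c1"\n' | awk '
BEGIN {
    # Search strings, including the closing double quote, so that the
    # key for c1 cannot match the lines for c12 or c13.
    keys["test \"analog/c12\""]
    keys["test \"analog/c1\""]
}
{
    for (k in keys)
        if (index($0, k)) {          # index() returns 0 when k is not a substring of $0
            print $0 "  ! comment"
            next
        }
    print                            # no key matched: print the line unchanged
}')
printf '%s\n' "$match_out"
```

Because index() does a plain substring search rather than a regular expression match, characters such as / and " in the keys need no special treatment beyond the string quoting.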

The fourth group:

END {	# If any changes were made, copy the new contents of the second file
	# back into that file.
	if(mod)
		for(i = 1; i <= FNR; i++)
			print o[i] > FILENAME
}

with the condition END (as described before) is not executed for any line read from the two input files and is only processed after end-of-file is reached on both input files. As noted in the comments, this group copies the accumulated output buffer back into the last input file. The number of lines found in the last input file ( FNR ) and the pathname of the last input file ( FILENAME ) remain valid during any END actions.
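
The write-back technique used in the END group can be tried on a throwaway file. This is a sketch only: data.txt is a hypothetical file in a temporary directory, buf stands in for o[], and upper-casing stands in for the real modification. It relies on the whole file having been read before END starts writing to it:

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf 'one\ntwo\n' > "$tmpdir/data.txt"
( cd "$tmpdir" && awk '
{ buf[FNR] = toupper($0) }            # buffer each line, modified, by record number
END {
    # FNR and FILENAME still describe the last input file here.
    for (i = 1; i <= FNR; i++)
        print buf[i] > FILENAME       # the first print truncates the file, later ones append
}' data.txt )
end_out=$(cat "$tmpdir/data.txt")
printf '%s\n' "$end_out"
rm -rf "$tmpdir"
```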

kttan · December 13, 2017, 1:12am

Thank you for the very detailed explanation.

Sorry, I still have 1 newbie question.

Is it possible to skip one awk and move on to another awk if a condition is met?
Because I'm still new and very confused by the syntax, I'm currently using multiple awk commands to make it work.

I want something like:

  if(A == 1) then 
     awk '{content}' 
  else 
     awk '{content}'
  end if

Currently my method is to write a file with some information, let the next awk code use it as a variable, and only then remove the file.

Don_Cragun · December 13, 2017, 3:42am

I don't know what your pseudo-code means. I have no idea why you would want to run two seemingly identical awk programs instead of using A == 1 as a condition for the single action (named content in your pseudo-code) in both of your awk scripts.

You said that you're writing a file and let the next awk use something unspecified as a variable, but your code doesn't show that any variables or files are being passed to either of the awk scripts in your pseudo-code.

If there is something in the code I suggested that you don't understand, I'll be happy to try to answer specific questions. If you're just saying that my code is too complex and you don't want to try to understand what it is doing, I'm sorry that I have wasted your time and mine trying to help you.
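
As a rough sketch of using A == 1 as a condition inside one awk program instead of choosing between two awk commands: here A is assumed to be a shell variable handed to awk with -v, and the two print statements are placeholders for the real actions.

```shell
#!/bin/sh
A=1    # hypothetical value; in a real script this would be computed earlier
cond_out=$(printf 'r1\nr2\n' | awk -v A="$A" '
A == 1 { print "first action: " $0; next }   # taken for every record when A is 1
       { print "second action: " $0 }        # taken for every record otherwise
')
printf '%s\n' "$cond_out"
```

The next statement in the first group plays the role of else: when A is 1 the second group is never reached.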

kttan · December 14, 2017, 2:12am

Sorry for my poor English. I didn't mean that your code is too complex or that I don't want to understand it; your explanation of the code was great.

The question I'm asking now is different from my previous question, but I didn't want to open another topic for fear of it being treated as spam.

The question is about passing information to another awk program. Since there are so many conditions and I'm still learning, I can't handle it all in one program yet, so currently I split the work into multiple awk commands, create a file, and read the file's content as a variable.

So I'm asking whether there is any way for 2 awk commands to share the same variable, instead of creating a new file to act as a variable.

Below is part of my script.
I have 2 types of condition:

If temp.rpt is empty, the script empties the content of board_temp (by using print "" > "board_temp"), which makes the next awk "look like it is skipped", and goes on to the 3rd awk.
If temp.rpt has content, it proceeds with the next awk.

#
awk 'END {

if (NR < 2) {
print "" > "board_temp"
}
}' temp.rpt

awk '
{
    x = 0
    Flag = 0

    # Read temp.rpt line by line; the parentheses make the getline test unambiguous.
    while ((getline < "temp.rpt") > 0) {
        x++
        model[x] = $2
        board_number[x] = $1

        # print model[x]
        while ((getline < "board_temp") > 0) {
            if ($0 == "BOARD " model[x]) {
                # print model[x]
                FlagFound = 1
            }
            if ($0 == "END BOARD") {
                FlagFound = 0
            }
            if (FlagFound == 1) {
                print $0 >> ("temp" x)
                if ($2 == "1p" && $3 == "100") {
                    print board_number[x] "%" tolower($1) >> "cap_noload.log"
                }
            }
        }
        close("board_temp")
    }
}' board_temp


awk 'END {

if (NR < 2) {
   system("cp board board_temp")
   print "" > "cap_noload.log"
}
}' temp.rpt

awk '
{
    # Find BOARDS in the board file and set a flag.
    if ($1 == "BOARDS") {
        flag = 1
    }

    if ($2 == "1p" && $3 == "100" && flag != 1) {
        print tolower($1) >> "cap_noload.log"
    }
}' board_temp

#

RudiC · December 14, 2017, 4:38am

Now this is extremely tricky to answer as it has many aspects to cover.

You definitely don't need to be perfect or fluent in English, as this is an international site and everyone in here is aware of possible language barriers and thus deploys maximum tolerance. Still, you should aim for the best depiction of your problem in terms of precision and details.

Don't be afraid. Moderators in here do their best to tell SPAM from meaningful contributions. You also can contact moderators in a dedicated forum if you feel there could be a misjudgement.

You can't share variables across awk invocations. You'll have to take the detour via files (or named pipes) or shell variables.
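
A minimal sketch of the shell-variable detour, with made-up input: the first awk prints a value, the shell captures it with command substitution, and the second awk receives it through -v. The names count and n are illustrative only.

```shell
#!/bin/sh
# First awk invocation: compute a value and capture its output in a shell variable.
count=$(printf 'a\nb\nc\n' | awk 'END { print NR }')
# Second awk invocation: receive that value as the awk variable n.
share_out=$(printf 'x\n' | awk -v n="$count" '{ print $0 " saw " n " records" }')
printf '%s\n' "$share_out"
```

This avoids the temporary file entirely as long as the value to be passed is a short string or number.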

Methinks you're missing the initial part, where temp.rpt is being written (or not). Without digging deeper into your code, why not use one single awk script, mayhap supplying a shell variable holding a boolean value that indicates whether that file is empty or not, like e.g.

[ -s temp.rpt ] && FS=1 || FS=0
awk -vSIZE=$FS ' . . . '

or, check within awk with a system function call...
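
The first variant (testing the file in the shell and handing the result to awk) might look like this sketch. The temporary directory, the empty stand-in for temp.rpt, the has_data variable, and the printed messages are all assumptions made for illustration; SIZE follows the snippet above.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
: > "$tmpdir/temp.rpt"                     # empty stand-in for temp.rpt
# [ -s file ] is true when the file exists and has a size greater than zero.
if [ -s "$tmpdir/temp.rpt" ]; then has_data=1; else has_data=0; fi
size_out=$(printf 'line\n' | awk -v SIZE="$has_data" '
SIZE == 0 { print "temp.rpt is empty: " $0; next }   # branch for an empty report
          { print "temp.rpt has content: " $0 }      # branch for a non-empty report
')
printf '%s\n' "$size_out"
rm -rf "$tmpdir"
```

With both branches in one program, no intermediate file is needed to tell a later awk step what an earlier one found.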