Extract string between two delimiters

I want to extract the string between two delimiters, such as ( ), from a file.

Input file:

2007_08_07_IA-0100-014_(January).PDF 
2007_08_07_IA-0100-031_(January February March
April June July).PDF 
2008-02-28_KR-1022-003_(January
febuary
march 
april
may).CSV

Output file:

January
January February March April june july
January February March April May

Can somebody help me with this?

Are these file names on separate lines, or all on a single line?

--ahamed

Try

awk '{while (!($0~/\.... *$/)) {getline x;$0=$0" "x}}{gsub (/^[^(]*\(|\).*$/,"")}1' file
January
January February March April June July
January febuary march  april may

Another approach:

awk '{v=$0;gsub(/.*\(|\).*/,x,v);s=s?s OFS v:v}/\./{print s;s=""}' file

Yes, these file names are on different lines.

Hello Yoda,

Could you please explain this code? I would be grateful.

Thanks,
R. Singh

Thank you. I need help with one more thing: I have another file where the delimiters are " " instead of ( ).
What do I have to change in this code?

@Pratik Majithia

This will make it more complicated to know where to start and stop. Please post a complete example of the real file.

@R. Singh

awk '
	{
	v=$0			# store the current line ($0) in variable v
	gsub(/.*\(|\).*/,x,v)	# in v, remove everything up to "(" and everything from ")" onwards (x is unset, i.e. empty)
	s=s?s OFS v:v		# if s is empty, s=v; otherwise append v to s with OFS as separator (this avoids a leading blank)
	}
	/\./ {			# a line containing a "." is the last line of a file name
		print s		# print the assembled s
		s=""		# clear s
		}
	' file
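To try the explanation above on the sample from the first post, here is a minimal self-contained sketch. The file name sample.txt and the function name extract_parens are made up for this illustration. It also differs from the original in one point: it treats a line containing ")" as the end of a record, instead of a line containing ".", on the assumption that the closing delimiter is a safer marker than the dot of the extension.

```shell
# Recreate the sample input from the first post (sample.txt is a hypothetical name)
cat > sample.txt <<'EOF'
2007_08_07_IA-0100-014_(January).PDF
2007_08_07_IA-0100-031_(January February March
April June July).PDF
2008-02-28_KR-1022-003_(January
febuary
march
april
may).CSV
EOF

# extract_parens is a hypothetical helper name, not part of the thread
extract_parens() {
    awk '
    {
        v = $0
        gsub(/.*\(|\).*/, "", v)        # keep only the text inside the parentheses
        s = (s == "") ? v : s OFS v     # join the pieces with OFS (a space by default)
    }
    /\)/ { print s; s = "" }            # closing parenthesis seen: record is complete
    ' "$1"
}

extract_parens sample.txt
```

With the sample above this should print the three joined strings, one per input file name.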

Could you please explain how this command works?

Thanks in advance

Adapting RudiC's code, we can handle the case where ( ) is replaced by " ":

cat file
2007_08_07_IA-0100-014_"January".PDF
2007_08_07_IA-0100-031_"January February March
April June July".PDF
2008-02-28_KR-1022-003_"January
febuary
march april
may".CSV
awk -F\" '{while ($0!~/\./) {getline x;$0=$0" "x}} {print $2}' file
January
January February March April June July
January febuary march april may

A modified version of RudiC's solution from post #3:

awk '{while (!($0~/\.... *$/)){getline x;$0=$0" "x}gsub(/.*_"|".*/,y)}1' infile

--ahamed

Thank you, the code works fine with the file above, but I have another file for which it does not work. I want to extract data from this file:
INPUT FILE

db2 +c "alter table $SCHEMA.$tabname activate not logged initially" >>$LOG 2>&1
db2 +c "alter table $SCHEMA.$tabname activate not logged initially & 
$SCHEMA.$tabname activate not logged initially JOIN
$SCHEMA.$tabname activate not logged initially INNER
$SCHEMA.$tabname activate not logged initially" >>$LOG 2>&1
db2 +c "alter table $SCHEMA.$tabname activate not logged initially" >>$LOG 2>&1
db2 +c "alter table $SCHEMA.$tabname 
activate not logged initially" >>$LOG 2>&1

I want the data between the two " " delimiters, on one line only.

Each command/solution we give is coded for a different pattern. If you come with new patterns every time and say that it is not working, I don't think we can do much!
Try to understand the solutions which are given and extend/modify for your needs.

--ahamed

Please check: both files have the same structure, only the content is different.
But the code still does not work.
It's not a different pattern.

You are fixing the symptoms. Fix the program that produces the wrong output.

awk '
{
        # all file names end in an extension: a dot plus three characters,
        # optionally followed by spaces, at end of line
        while (!($0 ~ /\.... *$/)) {
                getline x               # no extension yet: read the next line ...
                $0 = $0 " " x           # ... and append it to $0
        }
}
{
        # remove everything from the beginning of the line up to and including "(",
        # OR everything from ")" to the end of the line;
        # what is left is the text between the parentheses
        gsub(/^[^(]*\(|\).*$/, "")
}
1                                       # print the (modified) $0
' file                                  # work on file
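One caveat with the loop explained above: if the last record in the file never reaches a line ending in an extension, getline keeps failing and the while loop does not terminate. A hedged sketch of a guarded variant follows. The function name join_and_extract is invented for this example, and the (getline x) > 0 test is an addition that is not in the original one-liner.

```shell
# join_and_extract is a hypothetical helper name; it reads the given files, or stdin if none
join_and_extract() {
    awk '
    {
        # Append following lines until the record ends in ".xxx" plus optional spaces.
        # Added guard: stop as soon as getline reports end of input or an error.
        while ($0 !~ /\.... *$/ && (getline x) > 0)
            $0 = $0 " " x
        gsub(/^[^(]*\(|\).*$/, "")      # strip everything outside the parentheses
        print
    }' "$@"
}

# Small made-up input: the second record is deliberately left unterminated
printf '%s\n' 'a_(one' 'two).PDF' 'b_(three' | join_and_extract
```

On this input the first record is joined to "one two", and the truncated last record is printed as far as it goes instead of hanging.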

EDIT: I have to second ahamed101 in post#13 that moving targets are really annoying! Still, try this:

awk '{while (gsub(/\"/,"&")%2) {getline x; $0=$0" "x}}1' file 
db2 +c "alter table $SCHEMA.$tabname activate not logged initially" >>$LOG 2>&1
db2 +c "alter table $SCHEMA.$tabname activate not logged initially &  $SCHEMA.$tabname activate not logged initially JOIN $SCHEMA.$tabname activate not logged initially INNER $SCHEMA.$tabname activate not logged initially" >>$LOG 2>&1
db2 +c "alter table $SCHEMA.$tabname activate not logged initially" >>$LOG 2>&1
db2 +c "alter table $SCHEMA.$tabname  activate not logged initially" >>$LOG 2>&1

You now have the full lines in $0 and can operate on it as need be.

@RudiC you missed this requirement: "I want the data between the two " " delimiters, on one line only."

This should give correct result:

awk -v RS='"' '/LOG/ {f=1}  NR%2==0 {gsub(/\n/,x,$0);s=s $0} f{print s;s=x;f=0}' file
alter table $SCHEMA.$tabname activate not logged initially
alter table $SCHEMA.$tabname activate not logged initially & $SCHEMA.$tabname activate not logged initially JOIN$SCHEMA.$tabname activate not logged initially INNER$SCHEMA.$tabname activate not logged initially
alter table $SCHEMA.$tabname activate not logged initially
alter table $SCHEMA.$tabname activate not logged initially

But it's much better to fix the output of the program. Does the output vary all the time? How many versions are needed?

@Jotne: No, I did not miss it. I left it to the requestor's discretion what to do with the full lines. Just adding {FS="\""; $0=$0; print $2} would yield the results you presented...
But I agree, all these are band aids not addressing the root problem.
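For reference, the two steps discussed here (join lines until the quotes are balanced, then take the second "-separated field) could be combined roughly as follows. This is only a sketch: the function name quoted_part and the shortened sample lines are invented, FS is set with -F instead of inside the action, and the (getline x) > 0 guard is an addition.

```shell
# quoted_part is a hypothetical helper name; it reads the given files, or stdin if none
quoted_part() {
    awk -F'"' '
    {
        # gsub(/"/, "&") replaces every quote by itself and returns how many it found;
        # an odd count means the quoted string is still open, so append the next line.
        while (gsub(/"/, "&") % 2 && (getline x) > 0)
            $0 = $0 " " x
        print $2                        # with FS set to a quote, $2 is the quoted text
    }' "$@"
}

# Shortened, made-up sample in the same shape as the db2 lines
printf '%s\n' \
    'db2 +c "alter table T' \
    'activate not logged" >>$LOG 2>&1' \
    'db2 +c "alter table U" >>$LOG 2>&1' | quoted_part
```

This assumes exactly one quoted string per command; a line with several quoted strings would need more than $2.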

Okay

Just for the fun of it, a much shorter version :)

awk 'NR%2==0 {gsub(/\n/,x);print}' RS='"' file
alter table $SCHEMA.$tabname activate not logged initially
alter table $SCHEMA.$tabname activate not logged initially & $SCHEMA.$tabname activate not logged initially JOIN$SCHEMA.$tabname activate not logged initially INNER$SCHEMA.$tabname activate not logged initially
alter table $SCHEMA.$tabname activate not logged initially
alter table $SCHEMA.$tabname activate not logged initially

Nice!
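One detail in the RS='"' outputs: because the newline is replaced by nothing, words at a line break are glued together ("JOIN$SCHEMA", "INNER$SCHEMA"). If that is not wanted, a possible variation is to replace the newline, together with any blanks around it, by a single space. The function name quoted_records and the sample lines below are made up for illustration.

```shell
# quoted_records is a hypothetical helper name; it reads the given files, or stdin if none
quoted_records() {
    # With RS set to a double quote, every even-numbered record is the text between quotes.
    awk -v RS='"' 'NR % 2 == 0 { gsub(/ *\n */, " "); print }' "$@"
}

# Made-up sample: one command spread over three lines, one on a single line
printf '%s\n' \
    'db2 +c "a JOIN' \
    'b INNER ' \
    'c" >>$LOG 2>&1' \
    'db2 +c "d" >>$LOG 2>&1' | quoted_records
```

Like the short version, this relies on the quotes being balanced across the whole file; a single stray quote shifts the odd/even numbering of every record after it.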