deleting the part of the file(overwrite) using start and end point

here is the contents of bigfile.sql

CREATE TABLE `Table11` (
`id` int(11) NOT NULL ,
`entityName` enum('Lines','EndUsers') COLLATE utf8_unicode_ci NOT NULL,
`parentAllianceMigrationProjectId`  varchar(255) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `Table22` (
`empNo` int(13) NOT NULL AUTO_INCREMENT,
`channels` enum('one','two','three') COLLATE utf8_unicode_ci NOT NULL,
`reserved6` varchar(255),
`accountStatus` enum('Active','Locked','Expired','Deactivated') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'Active',
PRIMARY KEY (`empno`)
) ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `HOOOP` (
  `allianceSiteId` int(11) DEFAULT NULL,
  `trunkGroupsId` int(11) DEFAULT NULL,
  `lastModified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `allianceSiteId` (`allianceSiteId`,`trunkGroupsId`)
) ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `Table33` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`allianceId` int(11) NOT NULL,
`migratedAt` datetime DEFAULT NULL,
`migrationDetail` text COLLATE utf8_unicode_ci,
`unigyId` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

here is the content of tmpfile.sql

CREATE TABLE `HOOOP` (
  `allianceSiteId` int(11) DEFAULT NULL,
  `trunkGroupsId` int(11) DEFAULT NULL,
  `lastModified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `allianceSiteId` (`allianceSiteId`,`trunkGroupsId`)
)

this tmpfile.sql is extracted contents from bigfile.sql... now what i want to do is delete these extracted content from main file that is bigfile.sql... this i want to do dynamically.. but for explaining purpose lets say we have

firstword : CREATE TABLE `AllianceSiteTrunkGroupTrunkGroupsMap` (
lastword : ) ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

where firstword and lastword are the first and last line of the tmpfile.sql.. ofcourse the last line of tmpfile.sql has only ")" but i am appending "ENGINE=InnoDB AUTO........etc etc" to identify since main file (bigfile) has this.
i tried

sed '/"firstword"/,/""lastword/d' bigfile.sql

but this is not working.... can anyone help me... :wall::wall::wall:

---------- Post updated at 02:53 PM ---------- Previous update was at 02:50 PM ----------

spell mistale its this which i tried..

 sed '/"$firstword"/,/"$lastword"/d' bigfile.sql

It's either

sed "/$firstword/,/$lastword/d" bigfile.sql

or

sed '/'$firstword'/,/'$lastword'/d' bigfile.sql

You could also do:

table=HOOOP 
sed '/CREATE TABLE `'$table'`/,/^(/ d' bigfile.sql

All that is providing your table block doesnt contain a slash.

1 Like

thanks mirni... your code is working good :slight_smile: :slight_smile: thanks a lot...
i have one more doubt... its almost similiar....

---------- Post updated at 06:38 PM ---------- Previous update was at 06:33 PM ----------

consider bigfile.sql...
now i have two variables named
startpoint and endpoint..

 
tablename="HOOOP"
startpoint="CREATE TABLE `$tablename` ("
endpoint=") ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;"

how to extract lines from startpoint to endpoint from bigfile.sql and paste it in a temporary file like tmp.sql...

after executing the output in tmp.sql should be

cat tmp.sql

CREATE TABLE `HOOOP` (
  `allianceSiteId` int(11) DEFAULT NULL,
  `trunkGroupsId` int(11) DEFAULT NULL,
  `lastModified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `allianceSiteId` (`allianceSiteId`,`trunkGroupsId`)
)

---------- Post updated at 06:39 PM ---------- Previous update was at 06:38 PM ----------

i have done the same function using while loop its working fine the problem is its efficiency is poor since it takes lot of cycles of while loop to find it... is there any one liner to do this function...

---------- Post updated at 06:58 PM ---------- Previous update was at 06:39 PM ----------

here is a code to extract lines from the dump on the basis of "CREATE TABLE" and "ENGINE"

perl -lne '(/^\s*CREATE\s+TABLE/i .. /^\s*\)\s+ENGINE=InnoDB/i) && print;' bigfile.sql >tmp.sql

but what i want to include is cut from CREATE TABLE $table_name to engine.. (that is include table name in the above command). can you guys help me out...?

Careful, you don't have it right:

tablename="HOOOP" 
startpoint="CREATE TABLE `$tablename` (" 
endpoint=") ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;"

Try to echo the $startpoint. You will not have the backticks there. It's because the backticks mean "output of this command" when in weak (double) quotes. So the shell is trying to execute command "HOOOP" and inserting the output (empty) in the var $startpoint.
What you want, is to escape the backticks:

tablename="HOOOP"
startpoint="CREATE TABLE \`$tablename\` ("
endpoint=") ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;"

echo start $startpoint

sed "/$startpoint/,/$endpoint/ w tmp.sql" < mysql.big
sed "/$startpoint/,/$endpoint/ d" mysql.big > mysql.new

The 'w' command of sed will write the block to tmp.sql. Watch the spaces, as any extra space will be considered as part of the filename:

sed "/$startpoint/,/$endpoint/ w tmp.sql " < mysql.big

Will create a file named 'tmp.sql '.

1 Like

thanks the code is working :-)... but the thing is i need to validate whether the table exists in the file or not.. if the particular startpoint is not there in the file this sed command should not be executed and flag must be set to 1..
like

 
if $startpoint exists in the file
then
     sed "/$startpoint/,/$endpoint/ w tmp.sql " < mysql.big
else
      flag=1
fi

thats is i am checking whether the table exists in the file or not...

also i have one more doubt... how to check how many tables present in the file or not.. currently i am using while loop to find it..

while read line1
do
        if [[ `expr match "$line1" ".*CREATE TABLE.*"` != "0" ]]
        then
                counter1=`expr $counter1 + 1`
        fi
done < onlytables1.sql

but this is consuming time.. is there any one line code to perform this...?

---------- Post updated at 11:52 AM ---------- Previous update was at 11:38 AM ----------

i found out how to find the number of tables (by trial and error ofcourse) :slight_smile:

cat 1.sql | grep -i CREATE | wc | awk -F' ' '{print $1}'

---------- Post updated at 01:45 PM ---------- Previous update was at 11:52 AM ----------

i found out how to find whether table name is there or not.. thanks anyway mirni :slight_smile:

cat testdump1.sql | grep -i "CREATE TABLE \`$tablename\` (" | wc | awk -F' ' '{print $1}'

the output will be 1 if present else 0.. thanks.. i might have few more doubts wil ask you gradually.. :slight_smile:

Glad you are making progress.
Let's simplify things a bit:

Your test, whether the pattern was found can be done after the sed command. If pattern was not found, the file will be empty. So:

sed "/$startpoint/,/$endpoint/ w tmp.sql " < mysql.big
if [ -s tmp.sql ] ; then  #is file non-empty?
   echo pattern found
else
   echo not found  #empty or doesnt exist
fi

Counting how many table entries, this can be done by grep alone, just use the -c option:

grep -c "CREATE TABLE \`$tablename\` (" testdump1.sql

Just beware:
sed's pattern matching is greedy. So, the range /start/,/end/ will match all lines between the first 'start' and last 'end'.

1 Like

as you can see in the attachment.. i have uploaded log as out.txt.. the output displayed at the promt.. the sed command is working fine for intial content of the files... but as you said its greedy, for the the last but one table its consider both the table of interest and next table... why is that.... how could it work fine initially and fail at the end...?.. oberserve tableextract1.sql and tableextract2.sql in the output...

and also when i run sed command.. its performs some function but the same time displays all the contents at the promt.. how to subdue this feature.. i mean it should perfrom the function but not display all the contents of the file ....

sorry mybad.. :slight_smile: i had made a silly mistake in the code.. :slight_smile:

---------- Post updated at 11:43 AM ---------- Previous update was at 11:42 AM ----------

but still i need help with my second doubt.. that is how to make sed not display all the contents of the file when its run...?

I think this option might help u...

-n, --quiet, --silent
suppress automatic printing of pattern space

1 Like

Sed will by default print all the lines in pattern space. If you want it to change this, use the -n switch. Try this:

startpoint='`HOOOP`' 
sed -n '/CREATE TABLE '$startpoint'/,${p; /^) ENGINE/q}' mysql.big 

This will start printing every lines after "CREATE TABLE $startpoint", including, and will quit (q) when the line begins with ') ENGINE'.

1 Like

thanks siva shankar.. it worked :slight_smile:

---------- Post updated at 12:10 PM ---------- Previous update was at 12:09 PM ----------

oh this idea is good, i will try it out now mirni.. :slight_smile: thanks

---------- Post updated at 12:35 PM ---------- Previous update was at 12:10 PM ----------

@mirni.. i tried the below thing

echo "`sed -n '/CREATE TABLE \`$table_name1\`/,${p; /^) ENGINE/q}' testdump2.sql > tablextract2.sql`"

but the file is empty... and also after changing most of the while loops the code is reduced from 14 secs to 13.5 secs execution time for a file with 4 tables... i think its consuming time while comparing two tables extracted into a file.. the comparion algorithm which i am using is bubblesort.. that is first line is compared with all the lines in other file.. etc etc.. i think i have to try `diff `command and put the differences to a new file and try out to extract line from this.. will try this method today.. hope it works out :-\

Try this snippet...

sed -n '/CREATE TABLE \`$table_name1\`/,/^ENGINE/w tablextract2.sql}' testdump2.sql

As mentioned before, sed's pattern matching is greedy. It will write all lines between the first 'CREATE TABLE' and last '^ENGINE'. That's why this will not work the way OP wants.

1 Like

consider the contents of the file is

cat tablextract.sql
CREATE TABLE `Table33` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`allianceId` int(11) NOT NULL,
`migratedAt` datetime DEFAULT NULL,
`migrationDetail` text COLLATE utf8_unicode_ci,
`unigyId` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
)

and

col="id"

so if i perform

 cat tablextract2.sql | grep $col

or

 line=`cat tablextract2.sql | grep $col`

it displays two line since id is present in two lines.... so my question is how to make it display one line after another say in "for" loop or something like that.. i want to work on them individually so...

grep $col tableextract2.sql | while read line ; do 
   echo $line
   #massage $line here
done
1 Like

thanks your code works partially for my logic... what i am trying to do is, search the column name in the file and if its present do something else do something... since i used your code with while loop..

say,

#line333=`cat tablextract2.sql | grep $colname1`
    
                                                        grep $colname1 tablextract1.sql | while read line22 ; do
                                                        echo "from while loop: $line22"
                                                        echo "counter: $counter"
                                                        counter=`expr $counter + 1`

                                                        if [[ "$line22" != "" ]]
                                                        then
                                                                len2=${#line22}
#                                                                charlast=${line22:$len2-1:$len2}
#                                                               if [ "$charlast" = "," ]
#                                                                then
#                                                                     line22=${line22:0:$len2-1}
#                                                               fi
                                                        else
                                                                 insert_column "$table_name" "$line11"
                                                        fi

here if that column name is not present it wont go inside the while loop only... how to rectify this...?

How about:

grep $col tableextract2.sql > tmpfile
if [[ -s tmpfile ]] ; then 
  while read line ; do
      echo $line
     #massage $line here
  done < tmpfile
elif
  echo nothing grepped
fi
1 Like

okay thanks.. this might work :slight_smile: will try this

here is one of my last problem which i am facing..( :slight_smile: i am assuming its the last issue)

cat > tablextract2.sql
CREATE TABLE `Table11` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`entityName` enum('Lines','EndUsers') COLLATE utf8_unicode_ci NOT NULL,
`parentAllianceMigrationProjectId` int(11) NOT NULL,
`lastModified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
colname1="id"
flag=0
 grep -w -i $colname1 tablextract2.sql | while read line ; do
        echo "$line"
        flag=1
        echo "flag: $flag"
done
echo "flag: $flag"

output

`id` int(11) NOT NULL AUTO_INCREMENT,
flag: 1
PRIMARY KEY (`id`)
flag: 1
flag: 0

when the pointer comes out of the while loop why is my flag set to 0 again.. how to overcome this.. is there any global/local variable thing in shell scripts??

This is a known issue with bash.
What is it that you are trying to do? grep for id and set the flag to 1?

--ahamed

1 Like