To check the file and remove header before copying

Hi guys,

I have a directory containing some files, something like this:

country_dir

aus_01.txt
nz_01.txt
aus_02.txt
bd.txt

property.txt
aus
nz
bd

I need to remove the header from each file whose name ends with _01.txt while copying it from the country directory to another directory.

The code I am using:

for i in `cat property.txt`
do
file=$(ls -t1 $filepath|  grep -i $file)

if [ "$i" == "aus" ]
then
if [ -f $file ]
then

      	cp $file /output/aus/
else 
	echo "No aus files"

fi

elif [ "$i" == "nz" ]
if [ -f $file ]
then

      	cp $file /output/nz/
else 
	echo "No nz files"

fi

fi
....
....
done

In the above code I need to remove the header from the files aus_01.txt and nz_01.txt before copying them to the output directory. Can this be done within the same code?

Not sure I fully understand. You want to copy all files whose names contain a string from the property.txt file, and remove the header line(s) from those that end in _01.txt?

Your sample code is not of great help in deducing exactly what you're after. I'm afraid it won't copy anything, as the value of $file doesn't seem to be predictable: a) $filepath is undefined and thus empty, and b) the ls result to be assigned to file is grepped for $file itself, which may well be shooting yourself in the foot. $i, the variable looping through property.txt, isn't used for copying, just in the tests. And no provision is made to handle _01.txt files differently.

How many lines are the headers to be suppressed?

Hi Rudic,

To answer your question:

"You want to copy all files whose names contain a string from the property.txt file, and remove the header line(s) from those that end in _01.txt?" Yes, I need this, and I want to apply the same in my code:

filepath=/country/

for i in `cat property.txt`
do
file=$(ls -t1 $filepath|  grep -i $i)

if [ "$i" == "aus" ]
then
if [ -f $file ]
then

      	cp $file /output/aus/
else 
	echo "No aus files"

fi

elif [ "$i" == "nz" ]
if [ -f $file ]
then

      	cp $file /output/nz/
else 
	echo "No nz files"

fi

fi
....
....
done

If your shell (which you failed to mention) is a recent bash, try this (yes, admittedly deploying the unloved and sometimes dangerous eval):

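X=$(<property.txt)
# eval is needed so that the brace expansion happens after $X has been substituted
for PN in $(eval echo $filepath/*{${X//$'\n'/,}}*)
   do   FN=${PN##*/}                 # file name without the directory part
        DIR=${FN%%[_.]*}             # country code: everything before the first _ or .
        # skip the header line only for names containing _01.txt
        [ "$FN" = "${FN/_01.txt}" ] && HDCNT=0 || HDCNT=2
        # test and read the full path, so this works from any directory
        [ -f "$PN" ] && tail -n +$HDCNT "$PN" > "/output/$DIR/$FN" || echo "No $FN file"
   done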

and comment on the result

Hi Rudic,

Thanks for your input. Can this be built into my own code, which runs in bash?

Something like this:

sed '1d' $file > $file

Why that sed? What don't you like in the proposal, which should run unmodified in (a recent) bash?

The thing is, I wanted to modify my existing code itself, using sed '1d' or some similar approach to remove the header line. Moreover, I am not able to see a copy command in your code.

The tail command in RudiC's suggestion is what copies your source files to their destinations: it starts output at line "$HDCNT", so the header line of the _01.txt files is skipped during the copy.

Your code above will do two things:

  • The shell will see the redirection and truncate the file named by the expansion of $file to size 0, and then
  • The sed command will delete the 1st line (of the empty file) named by the expansion of $file and copy the remaining (non-existent) lines to the output.
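
The two steps above can be reproduced safely in a scratch directory. This is only a sketch: the function name truncation_demo and the sample file contents are made up for illustration, and bash is assumed.

```shell
# truncation_demo is a hypothetical helper, used only for illustration.
truncation_demo() {
    local dir file
    dir=$(mktemp -d) || return 1
    file="$dir/aus_01.txt"
    printf 'HEADER\nrow1\nrow2\n' > "$file"

    # The shell truncates "$file" while setting up the redirection,
    # so sed starts with an input file that is already empty.
    sed '1d' "$file" > "$file"

    if [ -s "$file" ]
    then echo "file still has data"
    else echo "file is empty"
    fi
    rm -rf "$dir"
}

truncation_demo
```

Running it reports that the file is empty: the header and both data rows are gone.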

Do you really want to change your input files before you copy their contents to their destinations, or do you just need to remove the header lines from the copied files? If you don't need to modify the source files:

sed 1d "$file" > /output/aus/"$file"

will do what you want. If you need to change the source files too:

sed 1d "$file" > "$file.$$" && cp "$file.$$" "$file" && rm "$file.$$"

is a safe way to do what you want without breaking links to your source file (if any existed).

Hi don,

Before I copy, I need to change the source files whose names end with _01.txt, i.e. remove the header.

Fine. And, since you copy the modified source file to the destination (not move it), when you run your script again a few seconds (or an hour, or day, or week, or month, or year) later, it will update that same file again (this time removing a line of data instead of the header). And then it will copy the updated file again to its destination discarding a line of data there as well. And, after you have run your script enough times, your source and destination files will both be empty.
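
The effect described above is easy to see on a sample file. The following is a small sketch, assuming bash and a one-line header; strip_header_in_place and rerun_demo are hypothetical names, and the file contents are invented for the demonstration.

```shell
# strip_header_in_place FILE (hypothetical helper): drop line 1 of FILE
# through a temporary file, like the "$file.$$" command quoted earlier.
strip_header_in_place() {
    sed 1d "$1" > "$1.$$" && cp "$1.$$" "$1" && rm "$1.$$"
}

# rerun_demo (hypothetical): run the helper twice on the same sample file.
rerun_demo() {
    local dir file
    dir=$(mktemp -d) || return 1
    file="$dir/aus_01.txt"
    printf 'HEADER\nrow1\nrow2\n' > "$file"

    strip_header_in_place "$file"   # first run removes HEADER
    strip_header_in_place "$file"   # second run removes row1, which is data

    cat "$file"                     # only row2 is left
    rm -rf "$dir"
}

rerun_demo
```

Nothing in the file records whether the header has already been removed, so each extra run silently costs one more line of data.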

This doesn't seem like a logical way to handle things to me, but you can do it that way if you want to.

Hi don,

Once I have copied the files I will remove them, or I will move the files from filepath to a temp path and remove the header there.

Hi,

I'm quite new to scripting too.

Just curious: is the code you are using already working, and you want to enhance it to omit the header in the individual file_0X.txt files?

May I also check whether the header string in each individual *.txt is unique, or different per file?

If it's unique, maybe this might work?

for i in `cat property.txt | grep -v <HEADER>`
do
file=$(ls -t1 $filepath|  grep -i $file)

if [ "$i" == "aus" ]
then
if [ -f $file ]
then

      	cp $file /output/aus/
else 
	echo "No aus files"

fi

elif [ "$i" == "nz" ]
if [ -f $file ]
then

      	cp $file /output/nz/
else 
	echo "No nz files"

fi

fi
....
....
done

just a thought

Thanks.
But yours won't work, because there is no unique header like that, and the property file contains no such header string. I need to remove the header from the source file before copying.

Huh? Either you remove the header before you copy the source file, then copy the source file (to a country directory), and then remove the source file; OR you remove the header before you copy the source file, then copy the source file (to a country directory), then move the source file to a temporary file, then remove the header from the temporary file again, and then remove the temporary file. If either of these multistep processes is interrupted and you restart the process, zero or more lines of data will be lost.

Either of those sounds like a lot of extra work when it sounds like you get the same results with MUCH less processing if you just copy the source file except for the header to a country directory and then remove the source file. If this processing is interrupted and restarted, no data will be lost.

Why is it that you want to make this so complicated? Why do you need to remove the header BEFORE copying the file? Why isn't removing the header WHILE you are copying the file sufficient?
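
A minimal sketch of the simpler sequence described above (copy everything except the header, then remove the source), assuming bash and a one-line header. The function name copy_without_header and its two arguments are hypothetical.

```shell
# copy_without_header SRC DEST_DIR (hypothetical helper).
# Writes SRC minus its first line into DEST_DIR, then deletes SRC.
copy_without_header() {
    local src="$1" dest_dir="$2"
    local dest="$dest_dir/${src##*/}"

    # The source is removed only if the stripped copy was written
    # successfully, so a rerun after an interruption starts again
    # from an unmodified source file.
    sed 1d "$src" > "$dest" && rm -- "$src"
}
```

It could be called as copy_without_header /country/aus_01.txt /output/aus. The source file itself is never edited, which is why restarting cannot eat a data line.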

Hi Don,

I agree with your point. So is there a better way to tune the code, i.e. remove the header only from the files that end with _01.txt while copying them to their respective folders? Also, I am using multiple if conditions because I need to copy the files to their respective destination folders based on the property file. Below is the code that has become redundant:

if [ "$i" == "aus" ]
then
if [ -f $file ]
then

      	cp $file /output/aus/
else 
	echo "No aus files"

fi

elif [ "$i" == "nz" ]
if [ -f $file ]
then

      	cp $file /output/nz/
else 
	echo "No nz files"

fi

If you keep showing us the same nested if ladder that you know doesn't work when you have had several posts explaining why that code doesn't work and you seem to ignore all of the suggestions provided, it makes the volunteers here who are trying to help you wonder if there is any reason to respond...

You could start by defining the variables i and file before you use them.

Then you could replace the if ladder used to determine what country code to use with a case statement.

Then you could replace the cp statements with sed or tail commands that have been suggested earlier in this thread.

And, you're probably going to want a while loop reading in the values from your country code file surrounding your case statement.
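
Put together, the suggestions above might look roughly like the sketch below. It is an illustration under assumptions, not a drop-in replacement: bash is assumed, the function name route_01_files and its three arguments are hypothetical, the country codes are the ones from the sample property.txt, a one-line header is assumed, and only the <code>_01.txt file is handled for each code.

```shell
# route_01_files SRC_DIR OUT_DIR LIST (hypothetical helper).
route_01_files() {
    local src_dir="$1" out_dir="$2" list="$3" code file
    # while loop reading the country codes, one per line
    while read -r code
    do
        # case statement instead of the if/elif ladder
        case "$code" in
            aus|nz|bd) file="${code}_01.txt" ;;
            *)         echo "Unknown country code: $code" >&2
                       continue ;;
        esac
        if [ -f "$src_dir/$file" ]
        then
            # sed instead of cp: copy everything except line 1
            sed 1d "$src_dir/$file" > "$out_dir/$code/$file"
        else
            echo "No $code files"
        fi
    done < "$list"
}
```

Because every code maps to a directory of the same name, the case statement ends up only validating the code; that is a hint that the ladder was never needed to pick the destination.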

If I understand what you're trying to do, and I am not at all convinced that I do, it would seem that a more direct approach might help:

cd /country || exit 1
exit_code=0
while read -r country_code
do	file="${country_code}_01.txt"
	if [ -f "$file" ]
	then	sed 1d "$file" > "/output/$country_code/$file" || exit_code=2
	else	printf 'No %s file (%s) found.\n' "$country_code" "$PWD/$file" >&2
		exit_code=2	# omit this line and remove the redirection in the
				# above printf if the above output is a warning rather
				# than an error.
	fi
done < property.txt
exit $exit_code
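
The loop above handles only the <code>_01.txt file for each country code. If the other files for a code (aus_02.txt and bd.txt in the sample listing) should be copied unchanged as well, as asked earlier in the thread, a variation might look like the following sketch. Everything new here is an assumption: bash, the function name copy_country_files and its arguments, file names of the form <code>.txt or <code>_<suffix>.txt, and a one-line header.

```shell
# copy_country_files SRC_DIR OUT_DIR LIST (hypothetical helper).
# Returns 0 on success, 2 if a code has no files or a copy fails.
copy_country_files() {
    local src_dir="$1" out_dir="$2" list="$3"
    local code src name found rc=0
    while read -r code
    do
        [ -n "$code" ] || continue          # skip blank lines
        found=0
        # Assumed naming scheme: <code>.txt or <code>_<suffix>.txt
        for src in "$src_dir/$code.txt" "$src_dir/$code"_*.txt
        do
            [ -f "$src" ] || continue       # an unmatched pattern stays literal
            found=1
            name=${src##*/}
            case "$name" in
                *_01.txt) sed 1d "$src" > "$out_dir/$code/$name" || rc=2 ;;
                *)        cp "$src" "$out_dir/$code/" || rc=2 ;;
            esac
        done
        if [ "$found" -eq 0 ]
        then
            printf 'No %s files found in %s\n' "$code" "$src_dir" >&2
            rc=2
        fi
    done < "$list"
    return "$rc"
}
```

A call such as copy_country_files /country /output /country/property.txt would mirror the paths used in the thread. The stricter file name patterns replace the original grep -i match, which would also pick up any name that merely contains a country code.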