Help with bash

Hi,

From a folder that contains about 250 *.txt files, I want to run egrep -i "can't|cannot|CANCELLED" on all the files and then save each matching file (the ones that couldn't run) as

$id.can.txt

Example of the contents inside one of the *.txt file:

This version and especially its site packages were build with gcc and gfortran.
***** Use at your own risk! *****
inp:/folder/1402_3853
out:/folder/res/
id:1402_3853
Can't open file. No such file or directory.

Can someone please help me do this with bash?
Thanks

Rossi

Do you want to move the txt files, or the files they reference?

I agree with CarloM. The specification of what is wanted here is missing crucial information.

I'm guessing that if the input file given in the 1st message in this thread is named anything.txt you want to rename that file to be 1402_3853.can.txt . Is that correct?

Or, do you want a file literally named $id.can.txt that has one line in it containing the name of each file that contains one of the three strings can't , cannot , or CANCELLED (using a case insensitive match)?

Or, do you want something completely different from either of the above?

Hi,

I want to pick the name of the new file from id. So here the id is

1402_3853

and therefore the file name should be

1402_3853.can.txt

..

So I want a file named $id.can.txt that has one line in it containing the name of each file that contains one of the three strings can't , cannot , or CANCELLED .

So you want to rename the files containing those strings AND you want a file containing the names of the renamed files. Do you want the original names or the new names of the renamed files in $id.can.txt ?

EDIT: Don reads better than I do :).

Maybe it's clearer now:

In a folder there is a1.txt, b2.txt, c3.txt....

Inside a1.txt :
id : 73254
cannot find

Inside b2.txt :
id :dfgds
Done

Inside c3.txt:
id : 3467
CANCELLED

So I want files named:

73254.can.txt
3467.can.txt

So, any solutions now?

Try this and remove echo when happy:

for FN in $(grep -Eli "can.t|cannot|cancelled" *.txt); do NFN=$(grep id $FN); echo mv $FN ${NFN#id*:*}.can.txt;  done
mv a1.txt 73254.can.txt
mv c3.txt 3467.can.txt

I thought I had something ready a lot sooner, until I noticed with your new data that the input file format had changed. (There were no spaces in the id line in your first sample file; but in message #7 in this thread there is a space before the colon and sometimes a space after the colon.) These possible solutions might not be optimal. They were originally designed to create a list of files as you requested in message #4 in this thread.

Your specification still isn't clear as to whether all of the files to be processed are in a single directory or if they are somewhere in the file hierarchy rooted in a given directory. Both of these assume that you only want to process files with names ending in .txt in the current working directory. (If you want to work on all files in the file hierarchy rooted in the current directory we can simplify the find proposal to do that.) Depending on what system you're using and the lengths of the names you're processing the suggestion using just grep may fail by exceeding ARG_MAX limitations. So, if the grep proposal fails, you can use the find proposal instead:

#!/bin/bash
printf "Proposal using just grep:\n"
grep -Eil "can't|cannot|cancelled" *.txt | while read -r file
do      new=$(sed -n 's/^id *: *//p' "$file")
        if [ -n "$new" ]
        then    echo mv "$file" "$new.can.txt"
        fi
done

printf "\nProposal using find and grep:\n"
find . \( ! -name . -type d -prune \) -o \
       \( -name '*.txt' -exec grep -Eil "can't|cannot|cancelled" {} + \) |
while read -r file
do      new="$(sed -n 's/^id *: *//p' "$file")"
        if [ -n "$new" ]
        then    echo mv "$file" "${file%/*}/$new.can.txt"
        fi
done

There are simpler ways to make some versions of the find utility only report on files in the current directory. The proposal here should work on any system that supports the find features specified by the POSIX Standards and the Single UNIX Specifications.
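
For example, with GNU or BSD find the -maxdepth primary does the same job as the -prune expression above. This is only a sketch: -maxdepth is an extension that is not in the POSIX standard, so check your find man page first, and the function name list_matches is made up for illustration.

```shell
#!/bin/bash
# List .txt files in the current directory only (no subdirectories) that
# contain one of the three strings, ignoring case.
# NOTE: -maxdepth is a GNU/BSD find extension, not a POSIX feature.
# list_matches is a hypothetical helper name used only in this example.
list_matches() {
    find . -maxdepth 1 -type f -name '*.txt' \
        -exec grep -Eil "can't|cannot|cancelled" {} +
}
```

Its output can be piped into the same while read -r file loop shown above.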

Note that RudiC's proposal is a simpler way to do this, but it will fail if the length of the list of filenames exceeds ARG_MAX or if a file contains one of the grepped-for strings but does not contain a line starting with id followed by zero or more spaces followed by a colon. All of our proposals may fail in unspecified ways if one of your input files contains more than one line starting with id followed by zero or more spaces and a colon (for either of my proposals) or more than one line containing id anywhere (for RudiC's proposal).
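
If those failure modes matter, here is a rough sketch that guards against them. It has not been run against your real data. It assumes you want the first id line when a file has several, that files with no id line should be skipped with a warning, and that an existing target should never be overwritten. The name rename_failed is made up for this example. Because the for loop expands *.txt inside the shell instead of passing the list to an external command, ARG_MAX does not come into play.

```shell
#!/bin/bash
# Sketch: print the mv commands that would rename each matching .txt file
# in the current directory to <id>.can.txt.
# rename_failed is a hypothetical name; the "first id line wins" and
# "skip when no id" rules are assumptions, not part of the original request.
rename_failed() {
    local file new
    for file in ./*.txt; do
        [ -f "$file" ] || continue                 # glob matched nothing
        case $file in *.can.txt) continue ;; esac  # renamed on an earlier run
        # Skip files that contain none of the three strings.
        grep -Eiq "can't|cannot|cancelled" "$file" || continue
        # Use only the first line that starts with "id", spaces, colon.
        new=$(sed -n 's/^id *: *//p' "$file" | head -n 1)
        if [ -z "$new" ]; then
            printf 'skipping %s: no id line found\n' "$file" >&2
            continue
        fi
        if [ -e "./$new.can.txt" ]; then
            printf 'skipping %s: %s.can.txt already exists\n' "$file" "$new" >&2
            continue
        fi
        echo mv "$file" "./$new.can.txt"
    done
}
```

Call rename_failed from the directory that holds the files. As with the other proposals, remove echo when you are happy with the commands it prints.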