Grep -v (inverse matching)

I am completely inexperienced in writing scripts of any kind.
I am working on Mac OS X and would like to run a shell script that finds files in a directory that do not conform to a specific naming convention and prints their names to a text file in the same directory.

For example, I have a folder called ScriptTest.
In this folder there are 6 jpg images that must all follow this naming convention:
6 digits(0-9), an underscore, 3 digits(0-9), an underscore, 1 digit(1-4) (ex. 1234567_123_1).
I need to output the names of the files that do not conform to this naming convention, to a text file in the same directory.

I've searched many different forums, but all of the samples I find either don't produce the results I need or fail altogether.
Most use a combination of find or ls in conjunction with grep and xargs. I have yet to get one to work...

I do like to research and try to figure things out on my own, but I'm at a dead end here. Any help would be greatly appreciated!

ls | grep -v "^[0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9]_[1-4]\..*$"

Thank you for the prompt reply!
However, still not there yet.

Here is the original ls of the folder contents (6 files):

1234567_123_1.jpg
1234567_456_1.jpg
1234567_56_2.jpg
123456_123_2.jpg
9876543-654_1.jpg
9876543_654-1.jpg

This is the output of the script (should be 4 files):

1234567_123_1.jpg
1234567_456_1.jpg
1234567_56_2.jpg
9876543-654_1.jpg
9876543_654-1.jpg

As you can see, the first 2 follow the naming convention perfectly and should not be included in the results. The third line is correctly reported, because its second set of numbers is not 3 digits long. The last 2 lines are also correctly reported, because each has a dash where an underscore belongs. Lastly, one file was not reported at all, and it should have been, since it is missing a digit from the first set of 7: "123456_123_2.jpg" in the original listing above.

To summarize, the 2 correctly named files should not have been output, but were, and the script failed to output the file that is missing a digit from the first part of the name.

Any thoughts??
Again, thank you for the prompt reply...

Your regex looks for 6 digits at the beginning of the file name, as you specified, not 7. So the output is entirely correct, since your file names start with seven digits.
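
To see the difference side by side, here is a minimal sketch that feeds the six names from the listing above through both patterns. It uses printf as a stand-in for ls, and grep -E with interval expressions ({6}, {7}) as shorthand for the repeated [0-9] classes; the variable names are just for illustration.

```shell
# Sample names from the thread, one per line (stand-in for the output of ls).
names='1234567_123_1.jpg
1234567_456_1.jpg
1234567_56_2.jpg
123456_123_2.jpg
9876543-654_1.jpg
9876543_654-1.jpg'

# Six leading digits: only 123456_123_2.jpg matches, so -v prints the other five.
six=$(printf '%s\n' "$names" | grep -Ev '^[0-9]{6}_[0-9]{3}_[1-4]\..*$')

# Seven leading digits: the two well-formed names match and are filtered out.
seven=$(printf '%s\n' "$names" | grep -Ev '^[0-9]{7}_[0-9]{3}_[1-4]\..*$')

printf 'six-digit pattern reports:\n%s\n\n' "$six"
printf 'seven-digit pattern reports:\n%s\n' "$seven"
```

The six-digit run reproduces the five-line output posted above; the seven-digit run reports the four misnamed files.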


Thanks Rudi! I totally missed my typo in the original post, it should have said 7 instead of 6.

I have added the 7th digit to the string and it produced the correct results!

Thanks for the extra set of eyes!


As pointed out by RudiC, I specified only 6 digits for the first set when it should have been 7. I have since added the 7th digit to the pattern, and it worked perfectly!
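
For anyone following along, a sketch of the corrected command with the result saved to a text file in the same folder might look like the one below. The temporary folder and the extra grep that keeps only .jpg names are my own additions, not part of the original command: the first makes the sketch safe to run anywhere, and the second keeps the report file itself from showing up in the list, since it usually exists by the time ls reads the folder. Bad_Filenames.txt is just an illustrative name.

```shell
# Build a throwaway copy of the ScriptTest folder so the sketch is safe to run.
dir=$(mktemp -d "${TMPDIR:-/tmp}/scripttest.XXXXXX")
cd "$dir" || exit 1
touch 1234567_123_1.jpg 1234567_456_1.jpg 1234567_56_2.jpg \
      123456_123_2.jpg 9876543-654_1.jpg 9876543_654-1.jpg

# First grep keeps only .jpg names (so the report file is never listed);
# second grep drops the names that follow the 7_3_1 convention.
ls | grep -E '\.jpg$' \
   | grep -Ev '^[0-9]{7}_[0-9]{3}_[1-4]\.jpg$' > Bad_Filenames.txt

cat Bad_Filenames.txt
```

In a real folder, only the ls line is needed, run from inside that folder.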

Thank you for your help with this!!!
Much, much appreciated!!!


OK, now they are going to hand off a folder with several subfolders, each containing JPEG images.

Can I search recursively and only pull the jpg names, not the directory names?

What can I add to the script to do so?

Thanks everyone!

Try

ls -R | grep -ov ...

Hi Rudi. Thank you again for the reply.
I added your suggestion; this is what the code looks like now:

ls -R | grep -ov "[0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9]_[1-4]\..*$" | xargs ls > Bad_Filenames.txt

This is what I received in the Terminal window as feedback:
ls: ./lrg:: No such file or directory
ls: ./med:: No such file or directory
ls: ./sm:: No such file or directory
ls: 1234567_56_2.jpg: No such file or directory
ls: 123456_123_2.jpg: No such file or directory
ls: 9876543-654_1.jpg: No such file or directory
ls: 9876543_654-1.jpg: No such file or directory

It did output the file, but listed all of the images, not just the ones that did not follow the naming convention.

I have attached my test folder structure so all can see what I'm trying to do.
It also contains the output file that resulted.

Ok, so I changed the xargs command from:

xargs ls > Bad_Filenames.txt

to

xargs > Bad_Filenames.txt

This gives me a file with the correct output, but can I have it listed 1 item per line, as opposed to all results on a single line?
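
A small sketch of what is going on, using made-up names (a.jpg, b.jpg, c.jpg) rather than the real listing: when xargs is given no command it runs echo, and echo prints all of its arguments on one line. Passing -n 1 makes xargs run echo once per name. Dropping xargs entirely should also work here, since grep already writes one name per line.

```shell
# With no command, xargs runs echo and joins its arguments on a single line.
joined=$(printf '%s\n' a.jpg b.jpg c.jpg | xargs)

# -n 1 hands echo one argument at a time, so each name lands on its own line.
split=$(printf '%s\n' a.jpg b.jpg c.jpg | xargs -n 1)

printf 'plain xargs:\n%s\n\n' "$joined"
printf 'xargs -n 1:\n%s\n' "$split"
```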

What would be the output of ls ... | grep -ov ... alone, without the pipe to xargs? I think the error messages above are self-explanatory: ls -R prints directory headers and bare file names, and the file names, without their path prepended, of course don't exist in the current directory.
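
One possible alternative, sketched under some assumptions: find can walk the subfolders and list only regular files ending in .jpg, so no directory headers reach grep at all. This swaps find in for ls -R, which is not what was suggested above. The subfolder names lrg, med and sm come from the error messages; how the six test files are spread across them is my guess, and Bad_Filenames.txt is again an illustrative name. It also assumes no file name contains a newline.

```shell
# Recreate a guessed version of the test structure in a throwaway folder.
dir=$(mktemp -d "${TMPDIR:-/tmp}/scripttest.XXXXXX")
cd "$dir" || exit 1
mkdir lrg med sm
touch lrg/1234567_123_1.jpg lrg/1234567_56_2.jpg \
      med/1234567_456_1.jpg med/123456_123_2.jpg \
      sm/9876543-654_1.jpg  sm/9876543_654-1.jpg

# find prints ./sub/name.jpg, one path per line, for regular .jpg files only.
# sed strips everything up to the last slash, leaving the bare file name.
# grep -v drops the names that follow the 7_3_1 convention.
find . -type f -name '*.jpg' \
  | sed 's|.*/||' \
  | grep -Ev '^[0-9]{7}_[0-9]{3}_[1-4]\.jpg$' \
  | sort > Bad_Filenames.txt

cat Bad_Filenames.txt
```

If the subfolder should appear in the report too, leaving out the sed stage and starting the pattern with / instead of ^ keeps the full path on each line.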