My requirement is to find null values in particular columns of a file and reject it if it contains null values. The challenge is that I want a common command that can be used across different files, as the positions of the columns to check vary from file to file.
For example, in the first file we need to check for null values in the 2nd and 3rd columns, whereas in the second file we need to check the 4th, 5th and 6th columns. All my files are pipe-delimited.
I have a few questions to pose in response first:
Is this homework/assignment? There are specific forums for these.
What have you tried so far?
What output/errors do you get?
What OS and version are you using?
What are your preferred tools? (C, shell, perl, awk, etc.)
What logical process have you considered? (to help steer us to follow what you are trying to achieve)
Most importantly, what have you tried so far?
There are probably many ways to achieve most tasks, so giving us an idea of your style and thoughts will help us guide you to an answer most suitable to you so you can adjust it to suit your needs in future.
We're all here to learn and getting the relevant information will help us all.
Sorry for not providing much detail on this. Please see my responses below.
Is this homework/assignment? There are specific forums for these.
--> This is a project requirement for me, which I need to implement for all the incoming files.
What have you tried so far?
--> I tried a few different things using the awk command:
awk -F ',' '$2 != "" || $3 != ""'
But the problem is that the position of the columns where I need to check for null values varies. Moreover, I am not much of an expert in UNIX.
What output/errors do you get?
--> I got my desired output, but the command can be used only for one specific file. I am not able to come up with a shared command that can be used across all the files, with the column positions passed as parameters.
What OS and version are you using?
--> I am using UNIX
What are your preferred tools? (C, shell, perl, awk, etc.)
--> awk
Hope this helps. Kindly let me know if you need more detail.
Please, could you provide a realistic input sample of your file?
Please, could you provide a sample of the kind of output you want based on that input file?
For the above file I need to reject the second record, as its second column contains a null value.
File2
1235,afg,fgdf,nmbs,posk
1345,,fg,,mnsb
For file 2 my requirement is different: I need to reject a record where the 2nd and 4th columns are null.
So I want to write a single command to which I can pass the column positions to check for nulls. For example, I want to pass column position 2 for the first file and column positions 2 and 4 for the second file.
From your awk -F ',' '$2 != "" || $3 != ""' I can see that you are separating fields on the comma, but then I get confused. I think you are testing that the 2nd field is not null OR the 3rd field is not null.
What is the end goal? I'm wondering if a grep with an extended regular expression might be better, something like this:-
grep -Ev "^[0-9]*,," file1
This excludes (-v) lines that match the pattern "any line starting with zero or more digits followed by two commas", i.e. lines whose second field is null.
If that's okay, then we could do something similar for file2, but I'd rather not leap into that if the above is not what you want. The rules can be adjusted to take account of your real data, of course.
Could you please try the following as well and let me know if it helps?
You can change the field to check for each file as needed.
Unfortunately, you did not provide a sample of the desired output and your words leave a lot to interpretation. The input provided is quite ambiguous, as well, due to the skimpiness of the sample.
In post #1 you said:
Which makes the examples unrealistic as well.
A record is normally a string of characters terminated by a newline (a full line to a human reader). Is that how you are interpreting it? Do you want the resulting output to exclude any lines that match your criteria?
In your comment "2nd and 4th column is null":
Is your criterion for File2 to exclude the lines that have BOTH fields 2 AND 4 empty, or to exclude lines that have EITHER field 2 OR field 4 empty?
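To make the difference between the two readings concrete, here is a minimal sketch rather than a definitive answer. The function name filter_nulls, its "any"/"all" mode argument and the third input line are hypothetical, made up for illustration; the comma delimiter is assumed from the File2 sample.

```shell
# filter_nulls MODE COLS [FILE...]   (hypothetical helper, not from this thread)
#   MODE "any": drop a line if ANY of the listed fields is empty
#   MODE "all": drop a line only if ALL of the listed fields are empty
#   COLS: comma-separated field numbers, e.g. 2,4
filter_nulls() {
    mode=$1 cols=$2
    shift 2
    awk -F',' -v mode="$mode" -v cols="$cols" '
        BEGIN { n = split(cols, c, ",") }     # c[1..n] hold the field numbers
        {
            empty = 0
            for (i = 1; i <= n; i++)
                if ($(c[i] + 0) == "") empty++
            reject = (mode == "all") ? (empty == n) : (empty > 0)
            if (!reject) print
        }' "$@"
}

# The first two lines are the File2 sample; the third is invented to show the difference.
printf '%s\n' '1235,afg,fgdf,nmbs,posk' '1345,,fg,,mnsb' '1456,,fg,xy,mnsb' |
    filter_nulls any 2,4
```

With "any", only the first line survives; with "all", the invented third line is kept too, because only one of its two checked fields is empty.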
The above code prints all the records where column 2 is not empty. Thanks, RavinderSingh.
I need one more modification to the above code, so that correct records are printed to one file and incorrect records to another. Kindly help in modifying the code to achieve this as well.
I also want to parameterise the column positions and the file name, so that I can create a script and pass the file name and column positions as values dynamically.
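One possible shape for such a script, as a sketch under assumptions: the function name split_nulls and the .good/.bad output suffixes are hypothetical, the delimiter is taken to be a comma as in the posted samples, and a record counts as incorrect if any of the listed columns is empty.

```shell
# split_nulls FILE COLS   (hypothetical helper, not from this thread)
#   FILE: input file name
#   COLS: comma-separated field numbers to check, e.g. 2,4
# Records with no empty listed field go to FILE.good, the rest to FILE.bad.
split_nulls() {
    file=$1 cols=$2
    : > "$file.good"; : > "$file.bad"   # create both outputs even if one stays empty
    awk -F',' -v cols="$cols" -v good="$file.good" -v bad="$file.bad" '
        BEGIN { n = split(cols, c, ",") }
        {
            out = good
            for (i = 1; i <= n; i++)
                if ($(c[i] + 0) == "") { out = bad; break }
            print > out                  # awk keeps each output file open across records
        }' "$file"
}

# Example call: check columns 2 and 4 of a file named file2
# split_nulls file2 2,4
```

Saved in a script, the function could be called with "$1" and "$2" so that the file name and column list come from the command line.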
Also, note that I put -F"|" in the above code, which sets the delimiter to |, but I couldn't see that delimiter in your sample Input_file. If your real Input_file uses | as the delimiter then that is fine; otherwise we need to see what the Input_file looks like.
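Since the samples use commas while the first post mentions pipes, the delimiter can be made a parameter as well. This is only a sketch; the function name print_clean and its argument order are hypothetical.

```shell
# print_clean DELIM COL [FILE...]   (hypothetical helper, not from this thread)
#   DELIM: a single-character field separator such as , or |
#   COL:   number of the field that must not be empty
print_clean() {
    delim=$1 col=$2
    shift 2
    # A single-character -F value other than a space is used literally by awk.
    awk -F"$delim" -v col="$col" '$col != ""' "$@"
}

# The same check on pipe-delimited and comma-delimited input
printf '%s\n' '1235|afg|fgdf' '1345||fg' | print_clean '|' 2
printf '%s\n' '1235,afg,fgdf' '1345,,fg' | print_clean ',' 2
```

In both calls only the line with a non-empty second field is printed.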