Cleanse Junk Data File - Using Shell or awk or sed

Hello Shell Gurus, I need help solving this puzzle. We have a junk data file that needs to be fed into the database, and the data file needs to be cleansed through a shell script. I am not an expert, so I need help with this.

Here is what I need to do on the input file:

-Step - 1 Replace all pipes '|' within the file with a space ' '

-Step - 2 Remove special characters and junk data within the file. The tricky part is that we do not have a defined set of special / junk characters. The solution would be to remove any character that cannot be typed on a keyboard.

Remove characters NOT IN [ A-Z, a-z, 0-9, `, ~, !, @, #, $, %, &, *, (, ), _, -, +, =, ., ", ', :, ;, {, }, [, ], <, >, ?, /, \, |, and the comma ]

NOTE: Basically, remove any special character that is not on the keyboard.

-Step - 3 Check the count of pipes on each line of the data to make sure we have the correct number (at this point they are still the zzz markers, see Step 4). I should receive 4 pipes on each line, which means that if there are fewer we need to keep appending the next line (concatenating the lines below). This field is basically a memo where the user would have typed a small paragraph that needs to be joined into a single line.

-Step - 4 Replace all zzz with a pipe '|'

Note: Below is a QA step to be embedded within the script after cleansing. It just writes an error log file that can be used to identify and fix records manually.

-Step - 5 Check whether the 2nd field is longer than 50 characters and whether the 3rd field is longer than 200. If either is, write the line number and the record info to the error log file.

-Step - 6 Check the number of fields or pipes within each line. If it is not equal to 4, write the line number and the record info to the same error log.
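One possible shape for the QA check in Steps 5 and 6, as a minimal awk sketch rather than a finished script. It assumes the file has already been cleansed and is pipe-delimited, and it reads "equal to 4" as "exactly 4 pipes per line" (which awk sees as 5 fields, the last one empty). The function name qa_check and the file names in the usage comment are made up for illustration.

```shell
#!/bin/sh
# qa_check CLEAN_FILE ERROR_LOG
# Writes one log line per problem found: line number, reason, full record.
qa_check() {
    awk -F'|' -v max2=50 -v max3=200 -v pipes=4 '
    # Step 5: length limits on the 2nd and 3rd fields
    length($2) > max2 { printf "line %d: field 2 longer than %d chars: %s\n", NR, max2, $0 }
    length($3) > max3 { printf "line %d: field 3 longer than %d chars: %s\n", NR, max3, $0 }
    # Step 6: number of pipes is the number of fields minus one
    NF - 1 != pipes   { printf "line %d: expected %d pipes, found %d: %s\n", NR, pipes, NF - 1, $0 }
    ' "$1" > "$2"
}

# Usage (hypothetical file names):
#   qa_check clean.txt error.log
```

A record can appear in the log more than once if it fails several checks, which seems acceptable for a log meant for manual fixing.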

Sample Broken Lines and data
-----------------------------

467zzzComputer|MonitorzzzPurchase Prise $150
Best Price $100
Cheapest Price $75
highest price $200zzzTzzz

Correct record would look like this
467|Computer Monitor|Purchase Prise $150 Best Price $100 Cheapest Price $75 highest price $200|T|

Note: the broken lines are fixed. The '|' was replaced with a space where it read Computer|Monitor. The memo field was converted into a single line. Also, every zzz was replaced with a pipe.

Thanks

tr '|' ' ' < oldfile > newfile
# [:print:] under LC_ALL=C keeps printable ASCII only: the listed characters plus ^
LC_ALL=C sed 's/[^[:print:]]//g' filename > newfile

Not sure about this step....
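For the line-joining step, here is a rough awk sketch, not a tested production solution. It assumes the file still uses zzz as the field marker at this point and that a complete record contains 4 of them, as in the sample. The function name join_broken_lines and the file names in the usage comment are invented for the example.

```shell
#!/bin/sh
# join_broken_lines [FILE...]
# Glues physical lines together until the record holds 4 zzz markers.
join_broken_lines() {
    awk -v need=4 -v delim='zzz' '
    {
        # Append this physical line to the record being assembled.
        rec = (rec == "" ? $0 : rec " " $0)
        # gsub on a copy returns the number of markers without changing rec.
        tmp = rec
        if (gsub(delim, "", tmp) >= need) {
            sub(/[ \t]+$/, "", rec)   # trim trailing blanks
            print rec
            rec = ""
        }
    }
    END {
        # Emit an incomplete trailing record so nothing is silently dropped.
        if (rec != "") print rec
    }' "$@"
}

# Usage (hypothetical file names):
#   join_broken_lines step2.out > step3.out
```

A line that already carries more than 4 markers is printed unchanged, so it would be up to the QA step to flag it.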

sed 's/zzz/|/g' oldfile > newfile
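The separate commands could also be chained into a single filter, so no intermediate files are needed. The following is only a sketch under a few assumptions: "keyboard characters" is taken to mean printable ASCII (so ^ is kept as well, and tabs are dropped), a full record has 4 zzz markers, and the name cleanse plus the file names in the usage comment are made up.

```shell
#!/bin/sh
# cleanse: reads the raw file on stdin, writes cleansed records to stdout.
cleanse() {
    tr '|' ' ' |                      # Step 1: embedded pipes become spaces
    LC_ALL=C tr -cd '\n\40-\176' |    # Step 2: keep newline and printable ASCII only
    awk -v need=4 '
        # Step 3: glue lines together until the record holds 4 zzz markers
        {
            rec = (rec == "" ? $0 : rec " " $0)
            tmp = rec
            if (gsub(/zzz/, "", tmp) >= need) {
                sub(/[ \t]+$/, "", rec)
                print rec
                rec = ""
            }
        }
        END { if (rec != "") print rec }
    ' |
    sed 's/zzz/|/g'                   # Step 4: markers become real pipes
}

# Usage (hypothetical file names):
#   cleanse < raw.txt > clean.txt
```

Fed the sample broken lines from the question, this should produce the single corrected record shown there.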