First find the special character, then take the next two bytes after it, convert them from hexadecimal to decimal, and replace them with the byte that follows, repeated that many times.
E.g.
Input: 302619�1A?
Output: 302619(3 spaces for �1A)??????????????????????????
The code below failed with the following error:
"Unrecognized switch: -E (-h will show valid options)."
Could you guide us on how to change the code so that it can be used on SunOS?
Welcome dineshnak,
I don't really see the question clearly, but I have a few questions to pose in response first:-
Is this homework/assignment? There are specific forums for these.
What have you tried so far?
What output/errors do you get?
What OS and version are you using?
What are your preferred tools? (C, shell, perl, awk, etc.)
What logical process have you considered? (to help steer us to follow what you are trying to achieve)
Most importantly, what have you tried so far?
There are probably many ways to achieve most tasks, so giving us an idea of your style and thoughts will help us guide you to an answer most suitable to you so you can adjust it to suit your needs in future.
We're all here to learn and getting the relevant information will help us all.
Hi Robin,
We receive a fixed-length file whose data has been compressed with a special character (for example "�07", "�1A?", etc.) in between the other characters. We need to read each special character along with the two-byte hexadecimal value that follows it. Once we have read the hexadecimal value, we need to convert it to decimal and write the byte that follows it (a symbol or a space) that many times. We tried using awk and sed but made no progress. We need a sample script, or some information, that we can run on SunOS and take further.
Is this packed-decimal data from a mainframe perhaps? It would probably be easier to generate the file as truly plain text at the source before transferring it. If you are using FTP, make sure the transfer is forced to be an ASCII transfer.
From your example in the first post, I think what you want is to read the 1A? to mean 'please insert 26 (decimal) question marks', and to do the same any other time we hit the special character, in the same line or any other line.
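To make that reading concrete, here is a minimal sketch of the arithmetic only, in Python. The function name `run_length` is made up for illustration, and it assumes the token is always exactly two hexadecimal digits followed by one fill character, which is my guess from the sample rather than a confirmed format.

```python
def run_length(token):
    """Expand a 3-character token such as '1A?' (hypothetical helper).

    Assumes: two hex digits, then the single character to repeat.
    """
    count = int(token[:2], 16)   # '1A' in hex is 26 in decimal
    fill = token[2]              # the character to repeat
    return fill * count


print(len(run_length("1A?")))    # 26
print(repr(run_length("07 ")))   # seven spaces
```

If the count can ever be more than two hex digits, or the fill can be more than one byte, this would need changing.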
It makes it all a bit complex, hence why I suggest you generate a fully expanded file at the source. If it won't fit, or the transfer takes too long, then there are commercial compression tools that are available for pretty much any platform combination. We changed one transfer from 23 hours to 4 by using one, but there will be others out there.
We encounter fixed-format fill characters such as "?" or spaces in the file, between the other characters, after the hexadecimal values. The file was transferred by FTP from a Windows machine. Please provide some information on how to handle this situation with a UNIX script.
Is this a Windows compressed file, a Winzip file or something else?
It may be possible to expand this with gunzip or similar utilities if these are available to you, but we'd need to know how it is generated in the first place.
The substitution regex above matches any character which is not a letter, number or space (as defined by the current locale), followed by 2 characters that could be interpreted as hexadecimal, followed by any character, and replaces them with 3 spaces followed by that character repeated n times.
The fact that Perl allows executable code in the substitution block means we can do things like this not available in sed or awk as a single substitution.
The e flag marks the substitution block for evaluation; the g flag allows the substitution to be applied globally rather than to just the first match.
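For anyone without a suitable Perl, the same idea can be sketched in Python, where `re.sub` accepts a function as the replacement (the rough equivalent of the e flag) and replaces every match by default (like g). This is an approximation of the description above, not the original one-liner: the pattern, the names `TOKEN` and `expand`, and the use of U+FFFD as a stand-in for the special character are all my assumptions.

```python
import re

# Assumed pattern: one character that is neither a word character nor
# whitespace, then two hex digits, then the character to repeat.
# Note: Python's \w also treats underscore as a word character.
TOKEN = re.compile(r"[^\w\s]([0-9A-Fa-f]{2})(.)")


def expand(line):
    """Replace marker + 2 hex digits with 3 spaces, then repeat the fill."""
    def repl(match):
        count = int(match.group(1), 16)      # hex count to decimal
        return " " * 3 + match.group(2) * count
    return TOKEN.sub(repl, line)


# U+FFFD stands in for the real special character, which is still unknown.
sample = "302619\ufffd1A?"
print(expand(sample))
```

Be aware that this pattern also fires on ordinary punctuation that happens to be followed by two hex digits, so once the real special character is known it would be safer to match that character explicitly.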
The file was transferred by FTP from a Windows system without any file-level compression; only the data inside it is compressed. We need to expand the data by finding each special character (i.e. "�07", "�1A?", etc.) in the file and padding with the byte that follows the hexadecimal value (a space or ?), repeated as many times as that value converted to decimal.
I missed the data in between through a typo. Your understanding of what we expect is correct. Could you please explain the logic you used to process the data and get that result? The file is generated by a VB script.
It was read by eye and converted by brain to hopefully confirm that we are working on the correct input to output relationship, hence why CODE tags are so important. I have done no coding so far.
Is the special character always the same? Can you confirm what character it is with od? If you can cut down a copy of the file to just contain the character, then use:-
od -x input_file
and paste the output here in CODE tags; that would help.
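If od is awkward to use, a byte-by-byte dump can also be produced with a few lines of Python. This is only a sketch: `hex_dump` is a made-up helper, and the byte 0x8e in the sample is a placeholder I invented, not the real special character. It prints one byte per column, so byte order is never an issue.

```python
def hex_dump(data, width=16):
    """Return od-style lines: octal offset, then one hex pair per byte."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{offset:07o} {hex_part}")
    return lines


# Placeholder data: 0x8e is NOT known to be the real special character.
for line in hex_dump(b"AB\x8e1A?\n"):
    print(line)          # 0000000 41 42 8e 31 41 3f 0a
```

For a real file, pass in the result of `open("input_file", "rb").read()`; whichever byte shows up just before the two hex digits is the one we are after.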
Is this a large file that needs the powerful processing of awk, sed or perl, or would a simpler, but slower, loop in a shell be acceptable? If the files are small, but there are many that you want to process in a loop, sometimes calling awk etc. each time can work out slower.
I was expecting to see just a single line with the character code for � and a terminating 0a.
I don't see a match for the rest of your sample either. I was looking for hex strings like this:-
0000000 4142 4344 3137 3220 3220 4231 3030 3031
The next character would be the � that we're interested in. I don't fancy working out byte-for-byte what there is and what's which.
The above is from your first line up to the �
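To save working it out by eye, a line of od -x output can be turned back into text with a short Python sketch like the one below. The name `decode_od_words` is made up, and the byte order is an assumption: od -x prints two-byte words in the host's byte order, so on a big-endian machine (such as SPARC) the pairs are in file order, while on a little-endian machine each pair needs swapping (`swap=True`).

```python
def decode_od_words(line, swap=False):
    """Convert one 'od -x' line back to text (hypothetical helper).

    The first field is taken to be the offset and is dropped.
    Set swap=True if the dump was made on a little-endian machine.
    """
    words = line.split()[1:]
    raw = bytearray()
    for word in words:
        pair = bytes.fromhex(word)
        raw += pair[::-1] if swap else pair
    return raw.decode("ascii", errors="replace")


line = "0000000 4142 4344 3137 3220 3220 4231 3030 3031"
print(repr(decode_od_words(line)))   # 'ABCD172 2 B10001'
```

Running it over the real dump and comparing with the sample line should show quickly whether the two match and which byte order was used.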
Can you clarify which input you have used for this? If you have to sanitise the input, please provide the matching output for the sanitised version.