I'm sure this will be an easy question for you experts out there, but I have been searching the forum and working on this for a couple hours now and can't get it right.
I have a very messy data file that I am trying to tidy up. One of the issues is that some records are split across multiple lines:
999999000 "Name" "this is text for line one
line two
line three"
And I've been trying all sorts of versions of sed to get it to look like this:
999999000 "Name" "this is text for line one line two line three"
And yes, I have tried things like sed 's/$/ /' file1 > file2. The problem is that not every line is broken, so I'm trying to figure out how to remove line feeds only from the problematic lines, not from all lines.
The problem lines always begin with alphabetic characters rather than digits, so I've been trying to do something with that, but to no avail.
> cat file31
999999000 "Name" "this is text for line one
line two
line three"
888888000 "Yep" "All on one line"
777777111 "Yes" "Another good text"
555555999 "Name" "this is other text for line one
line two
line three"
> cat calc_file31
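rm -f file32
while IFS= read -r line
do
    # A line ending in a double quote closes its record: mark it with "~".
    # (Assumes the data itself never contains a "~".)
    if printf '%s\n' "$line" | grep -q '"[[:space:]]*$'
    then
        printf '%s~\n' "$line" >>file32
    else
        printf '%s\n' "$line" >>file32
    fi
done <file31
# Join all lines with spaces, turn each "~" back into a newline,
# then strip the space left at the start of each record.
tr '\n' ' ' <file32 | tr '~' '\n' | sed 's/^ //'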
> calc_file31
999999000 "Name" "this is text for line one line two line three"
888888000 "Yep" "All on one line"
777777111 "Yes" "Another good text"
555555999 "Name" "this is other text for line one line two line three"
>
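Another option, going by the observation in the question that continuation lines never start with a digit, is to let awk glue every non-numeric line onto the record before it. This is only a sketch under that assumption: the function name join_records is made up for illustration, and it would misbehave if a continuation line ever did begin with a digit.

```shell
# Sample input copied from the thread (same file name, file31, as in the post).
cat > file31 <<'EOF'
999999000 "Name" "this is text for line one
line two
line three"
888888000 "Yep" "All on one line"
777777111 "Yes" "Another good text"
555555999 "Name" "this is other text for line one
line two
line three"
EOF

# join_records is a hypothetical helper name, not something from the thread.
# Assumption: a new record always starts with a digit; any other line is
# treated as a continuation of the previous record.
join_records() {
    awk '
        # New record: end the previous one (if any), then start this one.
        /^[0-9]/ { if (NR > 1) printf "\n"; printf "%s", $0; next }
        # Continuation: append to the current record, separated by a space.
                 { printf " %s", $0 }
        # Terminate the last record with a newline.
        END      { if (NR > 0) printf "\n" }
    ' "$1"
}

result=$(join_records file31)
printf '%s\n' "$result"
```

This needs no temporary file and no marker character, so a "~" in the data is harmless. Which rule is safer (ends with a quote, or starts with a digit) depends on what the real file looks like.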