Replacing \n,\n with ,

skerit · December 10, 2008, 8:59am

Hi everyone,

I've read lots of posts about tr and sed and such, but I haven't been able to figure this one out.

I have a file, like this:

1
,
Jelle

2
,
Sandra

The only thing it needs to do is delete the newlines between the number and the name.

So, it needs to replace \n,\n with ,

When I do: tr \n,\n , < test > test3
I get this:

1
,
Jelle

2
,
Sa,dra

When I do: tr "\n,\n" , < test > test3
I get this:

1
,,
,Jelle
,
,2
,,
,Sandra

And I've tried other variations of the tr command line (and sed, which I found out will never work since it only reads the file one line at a time...)
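
For reference, both tr attempts behave this way because tr translates single characters one-for-one and never matches a multi-character sequence such as newline-comma-newline. A minimal sketch, assuming GNU or BSD tr (which pad the shorter second set with its last character) and a made-up three-line sample:

```shell
# Unquoted, the shell strips the backslashes, so tr receives the set "n,n"
# and turns every letter n (and every comma) into a comma.
unquoted=$(printf '1\n,\nSandra\n' | tr \n,\n ,)
printf '%s\n' "$unquoted"

# Quoted, tr receives newline, comma, newline as three separate characters
# and maps each one to a comma, which flattens the whole input.
quoted=$(printf '1\n,\nSandra\n' | tr '\n,\n' ',')
printf '%s\n' "$quoted"
```

The first result still has its line breaks but reads "Sa,dra"; the second is a single line full of commas. Neither can express "this three-character sequence becomes one comma".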

methyl · December 10, 2008, 9:20am

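#!/bin/ksh
# Each record spans four lines: number, comma, name, blank separator.
while read -r LINE1
do
	read -r LINE2
	read -r LINE3
	read -r LINE4	# blank separator line, discarded
	echo "${LINE1}${LINE2}${LINE3}"
done < file3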


1,Jelle
2,Sandra

skerit · December 10, 2008, 9:49am

That's a nice workaround, but I'm afraid my source file isn't always as pretty as I thought it was.

Sometimes there's an extra blank line, and sometimes there's a name on its own with no other field after it, so:

1
,
Jelle

2
,
Sandra

Management

1
,
Peter

2
,
Michel

Advertising

1
,
Linda

And so on, and so on.

joeyg · December 10, 2008, 10:15am

> sed "s/[0-9]/~&/" <file104 | tr "\n" " " | tr "~" "\n" | awk '{print $1$2$3}'

1,Jelle
2,Sandra
1,Peter
2,Michel
1,Linda

skerit · December 10, 2008, 10:24am

All the 3 scripts in 1 line, how cool ...

Anyway, it almost worked, except for this line:

7
,
Jurgen,4442

Which gets transformed to

7,Jurgen
4442

joeyg · December 10, 2008, 10:28am

> sed "s/^[0-9]/~&/" <file104 | tr "\n" " " | tr "~" "\n" | awk '{print $1$2$3}'

1,Jelle
2,Sandra
1,Peter
2,Michel
1,Linda
7,Jurgen,4442

Note that I put a ^ inside that sed command; it anchors the match to the beginning of a line (a $ anchors it to the end of a line).
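
To see the difference the anchor makes, here is a small sketch that runs only the sed step on the problem record from above (the printf input is just that record typed in by hand):

```shell
# Unanchored: the first digit anywhere on a line gets the ~ marker,
# so "Jurgen,4442" is marked too and later split onto its own line.
unanchored=$(printf '7\n,\nJurgen,4442\n' | sed 's/[0-9]/~&/')

# Anchored with ^: only a digit at the very start of a line is marked.
anchored=$(printf '7\n,\nJurgen,4442\n' | sed 's/^[0-9]/~&/')

printf '%s\n---\n%s\n' "$unanchored" "$anchored"
```

With the anchor, the only ~ is the one in front of the 7, so the later tr "~" "\n" step breaks the line only where a record starts.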

radoulov · December 10, 2008, 10:36am

Yet another one:
(use nawk or /usr/xpg4/bin/awk on Solaris)

awk '$1=$1' RS= ORS='\n\n' infile

data:

$ cat file
1
,
Jelle

2
,
Sandra

Management

1
,
Peter

2
,
Michel

Advertising

1
,
Linda

output:

$ awk '$1=$1' RS= ORS='\n\n' file
1 , Jelle

2 , Sandra

Management

1 , Peter

2 , Michel

Advertising

1 , Linda

or:

$ awk '$1=$1' RS= file
1 , Jelle
2 , Sandra
Management
1 , Peter
2 , Michel
Advertising
1 , Linda

A similar approach with sed:

$ sed -e '/./{H;$!d;}' -e 'x;s/\n,\n/ , /g'  file

1 , Jelle

2 , Sandra

Management

1 , Peter

2 , Michel

Advertising

1 , Linda
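
A note on how the awk variants work, with a sketch of a possible tweak: an empty RS puts awk in paragraph mode, so each blank-line-separated block is one record, and assigning $1 to itself forces awk to rebuild the record with OFS between the fields. The version below is an assumption layered on top of that, not something from the posts above: it sets FS to a newline so spaces inside a name do not split fields, sets OFS to empty so the output has no padding around the comma, and uses -v instead of trailing assignments so it works on piped input.

```shell
# Paragraph mode (RS empty): one record per blank-line-separated block.
# FS is a newline, so "Peter Jan,100" stays a single field.
# OFS is empty, so rebuilding the record glues the fields together.
joined=$(printf '1\n,\nJelle\n\n3\n,\nPeter Jan,100\n' |
    awk -v RS= -v FS='\n' -v OFS= '{ $1 = $1; print }')
printf '%s\n' "$joined"
```

The explicit { $1 = $1; print } is used instead of the bare '$1=$1' pattern because the bare form skips any record whose first field ends up empty or 0.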

skerit · December 10, 2008, 10:56am

Ah, so that's what ^ and $ mean. I almost thought "^$" was *another* way of writing a newline character... Good to know!

sed "s/^[0-9]/~&/" < test | tr "\n" " " | tr "~" "\n" | awk '{print $1$2$3}'

It does a good job, but how do I make it handle spaces?
If there's a space in the name, it drops the rest of the line.

So

3
,
Peter Jan,100

becomes "3,Peter", dropping all the rest.

Awk does a pretty good job, but if a few lines are already in the right format, it concatenates those, too.
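
One possible way to cover both cases, as a rough sketch rather than a tested answer for the real file: hold each line back by one, and only glue lines together when a line consisting of nothing but a comma shows up. The function name join_comma_lines is made up for illustration, and the sketch assumes that blank lines can simply be dropped and that the separator line is exactly "," with no surrounding spaces.

```shell
# Hypothetical helper (not from the thread): joins "N", ",", "name" triples
# and leaves every other line alone. Assumes blank lines can be dropped.
join_comma_lines() {
    awk '
        NF == 0 { next }                 # skip blank or whitespace-only lines
        $0 == "," && held {              # lone comma: attach it to the held line
            buf = buf ","
            glue = 1
            next
        }
        glue {                           # line after the comma: attach it as well
            buf = buf $0
            glue = 0
            next
        }
        {                                # ordinary line: flush the held one, hold this one
            if (held) print buf
            buf = $0
            held = 1
        }
        END { if (held) print buf }
    ' "$@"
}

printf '1\n,\nJelle\n\n2,Sandra\nManagement\n\n3\n,\nPeter Jan,100\n' | join_comma_lines
```

Because whole lines are appended rather than fields, spaces and extra commas in the name survive, and lines that are already in the form 2,Sandra pass through untouched. On a file it would be called as join_comma_lines test > test3.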