toupper or tolower case of first letter of the line depending on another file

louisJ · December 29, 2011, 4:20pm

Hi

I would like to check whether the first letter of each line in a first file (gauche.txt) is uppercase or lowercase, and change the first letter of the corresponding line in a second file (droiteInit.txt) to match.
I have written the script below, but it does not work (I launch it with gawk -f ./awk_lowercasesUppercases gauche.txt):

# initialize strings 
BEGIN { upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; 
        lower = "abcdefghijklmnopqrstuvwxyz"; 
        fichierLinenumberLower="LinenumberLower.txt"; 
        fichierLinenumberUpper="LinenumberUpper.txt";
        fichierdroite="droite.txt";
        linenumber = 1;
        NR==FNR; 
} 
 
# for each input line 
{ 
    # get first character of first word 
    FIRSTCHAR = substr($1, 1, 1); 
 
    # if lower get position of FIRSTCHAR in lowercase array; if 0, ignore 
    if (CHAR = index(lower, FIRSTCHAR)){  
#print linenumber > fichierLinenumberLower;
            {next}
            {
            FIRSTCHARdroite = substr($1, 1, 1);
#print FIRSTCHARdroite
#print toupper(FIRSTCHARdroite)
            sub(/^./,toupper(FIRSTCHARdroite),$0);
            print $0;
            } 
        }
         
    # if upper get position of FIRSTCHAR in uppercase array; if 0, ignore 
    if (CHAR = index(upper, FIRSTCHAR)){
            {next}
            { 
#print linenumber > fichierLinenumberLower;
            FIRSTCHARdroite = substr($1, 1, 1);
#print FIRSTCHARdroite
#print tolower(FIRSTCHARdroite)
            sub(/^./,tolower(FIRSTCHARdroite),$0);
            print $0; 
            } 
        } 
 
    linenumber=linenumber+1; 
}

Can you help me please?
Thanks

birei · December 29, 2011, 4:46pm

Hi louisJ,

Can you paste part of the file 'gauche.txt', the incorrect output you get, and the correct output you expect? Or the error message, if the syntax of your 'awk' program is incorrect?

Regards,
Birei

jim_mcnamara · December 29, 2011, 7:42pm

FWIW - some versions of awk support tolower() and toupper() natively.
Check your documentation.

louisJ · December 30, 2011, 5:36am

Gawk supports toupper and tolower, I checked.

gauche.txt looks like this:

Sized to 
Durable, midweight,
ultraviolet protection
quick flo

DroiteInit.txt:

Dimensionné
nylon durable
Indice de Protection
séchant flo

The result should be like this:

Dimensionné
Nylon durable
indice de Protection
séchant flo

I want the resulting file to have the same capitalization as gauche.txt.

Klashxx · December 30, 2011, 6:03am

If you want to go on with awk:

awk 'NR==FNR{a=substr($0,1,1);b[FNR]=match(a,/[A-Z]/) ? 1 : 0;next}
{c=substr($0,1,1);c=b[FNR] ? toupper(c):tolower(c);print c""substr($0,2)}'  gauche.txt DroiteInit.txt
Dimensionné
Nylon durable
indice de Protection
séchant flo
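For readers who find the one-liner dense, here is a commented sketch of the same two-file idiom. It is an illustration rather than Klashxx's exact command: the temporary directory and the two-line sample files (shortened from the thread's data) are assumptions made so the snippet runs on its own.

```shell
#!/bin/sh
# Sketch only: two sample lines per file, shortened from the thread's examples.
tmp=$(mktemp -d)
printf '%s\n' 'Durable, midweight,' 'ultraviolet protection' > "$tmp/gauche.txt"
printf '%s\n' 'nylon durable' 'Indice de Protection' > "$tmp/DroiteInit.txt"

result=$(awk '
    # While reading the first file, NR (global line count) equals FNR (per-file count).
    NR == FNR {
        a = substr($0, 1, 1)
        # Same test as the one-liner; what [A-Z] matches can depend on
        # the locale and on the awk version in use.
        b[FNR] = match(a, /[A-Z]/) ? 1 : 0
        next    # do not fall through to the second rule
    }
    # Second file: give the first character the case recorded for the same line number.
    {
        c = substr($0, 1, 1)
        c = b[FNR] ? toupper(c) : tolower(c)
        print c substr($0, 2)
    }
' "$tmp/gauche.txt" "$tmp/DroiteInit.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The array b is indexed by line number, so the two files are assumed to have the same number of lines in the same order.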

louisJ · December 30, 2011, 8:35am

Beautiful, thank you Klashxx

louisJ · December 30, 2011, 12:10pm

klashxx:

If you want to go on with awk:

awk 'NR==FNR{a=substr($0,1,1);b[FNR]=match(a,/[A-Z]/) ? 1 : 0;next}
{c=substr($0,1,1);c=b[FNR] ? toupper(c):tolower(c);print c""substr($0,2)}'  gauche.txt DroiteInit.txt

This doesn't work: every first letter is output in uppercase... I spoke too soon!

Klashxx · December 30, 2011, 12:52pm

Using your samples :

MacBookPro:kk klashxx$ cat gauche.txt 
Sized to 
Durable, midweight,
ultraviolet protection
quick flo
MacBookPro:kk klashxx$ cat DroiteInit.txt 
Dimensionné
nylon durable
Indice de Protection
séchant flo
MacBookPro:kk klashxx$ awk 'NR==FNR{a=substr($0,1,1);b[FNR]=match(a,/[A-Z]/) ? 1 : 0;next}
{c=substr($0,1,1);c=b[FNR] ? toupper(c):tolower(c);print c""substr($0,2)}'  gauche.txt DroiteInit.txt
Dimensionné
Nylon durable
indice de Protection
séchant flo

durden_tyler · December 30, 2011, 6:25pm

And using Perl; the "bit-array" algorithm is similar to Klashxx's -

$
$ cat gauche.txt
Sized to
Durable, midweight,
ultraviolet protection
quick flo
$
$ cat DroiteInit.txt
Dimensionné
nylon durable
Indice de Protection
séchant flo
$
$
$ perl -lne 'if ($ARGV eq "gauche.txt") {/^(.)/ and push @x,(grep/$1/, (A..Z))?1:0}
             else {$x[$i++]==1?s/^(.)/\u$1/:s/^(.)/\l$1/; print}
            ' gauche.txt DroiteInit.txt
Dimensionné
Nylon durable
indice de Protection
séchant flo
$
$

tyler_durden

louisJ · January 1, 2012, 7:43am

Indeed, match() returns 1 whatever the case of the letter.

My Linux is acting strangely: I copied and pasted your command and got a different result:

----$ cat gauche.txt 
Sized to 
Durable, midweight,
ultraviolet protection
quick flo
----$ cat droiteInit.txt 
Dimensionné
nylon durable
Indice de Protection
séchant flo
----$ awk 'NR==FNR{a=substr($0,1,1);b[FNR]=match(a,/[A-Z]/) ? 1 : 0;next}
{c=substr($0,1,1);c=b[FNR] ? toupper(c):tolower(c);print c""substr($0,2)}'  gauche.txt droiteInit.txt
Dimensionné
Nylon durable
Indice de Protection
Séchant flo

Weird... any idea what is wrong?

ahamed101 · January 1, 2012, 9:15am

Try this... same code, a slight modification...

awk ' NR==FNR{ a=substr($0,1,1); b[FNR]=match(a,/[ABCDEFGHIJKLMNOPQRSTUVWXYZ]/) ? 1 : 0; next } 
{ c=substr($0,1,1); c=b[FNR] ? toupper(c):tolower(c); print c""substr($0,2) }' gauche.txt droiteInit.txt

--ahamed

louisJ · January 1, 2012, 9:20am

yes this works well now:

---$ awk 'NR==FNR{a=substr($0,1,1);b[FNR]=match(a,/[ABCDEFGHIJKLMNOPQRSTUVWXYZ]/) ? 1 : 0;next}
{c=substr($0,1,1);c=b[FNR] ? toupper(c):tolower(c);print c""substr($0,2)}'  gauchetest.txt droiteInittest.txt
Dimensionné
Nylon durable
indice de Protection
séchant flo

But why? It should be the same. Do you have an explanation?

ahamed101 · January 1, 2012, 9:23am

I am trying to figure that out. That code change was just a hunch! I tried with awk, gawk, nawk, etc., and they all give the same result.

--ahamed
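One plausible explanation, which nobody in the thread confirms: in locales other than C, gawk releases before 4.0 interpret a range such as [A-Z] by the locale's collating order, where upper- and lowercase letters are interleaved, so lowercase letters can match as well. Writing out all 26 letters avoids the range, which would explain why the change helps. The sketch below shows two other ways to sidestep the problem, assuming that diagnosis is right. The temporary directory and the shortened sample files are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: sample data shortened from the thread's files.
tmp=$(mktemp -d)
printf '%s\n' 'Durable, midweight,' 'ultraviolet protection' > "$tmp/gauche.txt"
printf '%s\n' 'nylon durable' 'Indice de Protection' > "$tmp/droiteInit.txt"

# Option 1: force the C locale for this command, so that A-Z means
# exactly the 26 ASCII capital letters.
result_c=$(LC_ALL=C awk '
    NR == FNR { b[FNR] = (substr($0, 1, 1) ~ /[A-Z]/); next }
    { c = substr($0, 1, 1); c = b[FNR] ? toupper(c) : tolower(c); print c substr($0, 2) }
' "$tmp/gauche.txt" "$tmp/droiteInit.txt")

# Option 2: use the POSIX character class instead of a range.
# Assumes an awk that supports bracket classes (gawk and most current awks do).
result_class=$(awk '
    NR == FNR { b[FNR] = (substr($0, 1, 1) ~ /[[:upper:]]/); next }
    { c = substr($0, 1, 1); c = b[FNR] ? toupper(c) : tolower(c); print c substr($0, 2) }
' "$tmp/gauche.txt" "$tmp/droiteInit.txt")

printf '%s\n' "$result_c"
printf '%s\n' "$result_class"
rm -rf "$tmp"
```

Option 1 changes how the whole command treats characters, while option 2 keeps the current locale and lets it decide which characters count as uppercase.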

louisJ · January 1, 2012, 9:36am

I don't get it...

How do I write this as an awk script (and not a one-liner)? The following should work, but it displays the lines of the first file (gauche.txt) instead of those of the second file (DroiteInit.txt) with the required changes:

$ gawk -f ./awk_script gauche.txt droiteInit.txt
"awk_script" being:

NR==FNR 
{ 
    a=substr($0,1,1); 
    b[FNR]=match(a,/[AZERTYUIOPMLKJHGFDSQWXCVBN]/) ? 1 : 0; 
    next 
} 
{ 
    c=substr($0,1,1); 
    c=b[FNR] ? toupper(c):c=tolower(c);
    print c""substr($0,2) 
}

ahamed101 · January 1, 2012, 11:22am

The opening curly brace must be on the same line as the pattern:

NR==FNR{

--ahamed
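The reason is that a newline ends an awk rule. A pattern alone on a line is a complete rule whose default action is to print the line, and a block that starts on the next line becomes a second rule that runs for every line. In the script above, that prints the lines of gauche.txt, and the block with next then swallows every line of both files. A small sketch of the difference, using made-up two-line input:

```shell
#!/bin/sh
# Sketch with hypothetical input; with a single input stream NR == FNR is always true.

# Pattern and brace on separate lines: two rules.
# Rule 1 (pattern only) prints the line; rule 2 (block only) runs for every line.
split_out=$(printf 'one\ntwo\n' | awk '
NR == FNR
{ print "block:", $0 }')

# Pattern and brace on the same line: one rule.
# The block runs only for lines where the pattern is true.
joined_out=$(printf 'one\ntwo\n' | awk '
NR == FNR { print "block:", $0 }')

printf 'split:\n%s\n' "$split_out"
printf 'joined:\n%s\n' "$joined_out"
```

In the split form each input line appears twice, once from the default print and once from the block.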

pandeesh · January 1, 2012, 12:23pm

try this in awk_script:

NR==FNR {
    a=substr($0,1,1);
    b[FNR]=match(a,/[AZERTYUIOPMLKJHGFDSQWXCVBN]/) ? 1 : 0;
    next
}
{
    c=substr($0,1,1);
    c=b[FNR] ? toupper(c) : tolower(c);
    print c""substr($0,2)
}

louisJ · January 1, 2012, 3:43pm

Thanks it works now!!