How to check string encoding?

I want to check whether a string is WINDOWS-1251 or UTF-8.

Can you help me find the string's encoding?

Or maybe get the charset from a URL's Content-Type header with wget?

This is my function in PHP:

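function check_utf8($str) {
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $c = ord($str[$i]);
        if ($c > 127) { // non-ASCII byte: must be the lead byte of a sequence
            if ($c > 247) return false;   // 0xF8-0xFF never occur in UTF-8
            elseif ($c > 239) $bytes = 4; // treated as a 4-byte lead
            elseif ($c > 223) $bytes = 3; // treated as a 3-byte lead
            elseif ($c > 191) $bytes = 2; // treated as a 2-byte lead
            else return false;            // 0x80-0xBF: stray continuation byte
            if (($i + $bytes) > $len) return false; // sequence is truncated
            while ($bytes > 1) {
                $i++;
                $b = ord($str[$i]);
                if ($b < 128 || $b > 191) return false; // not a continuation byte
                $bytes--;
            }
        }
    }
    return true;
}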

I want to do the same thing in shell.

Thank you!

What's your system? What's your shell? This will be a lot harder in some shells than in others, because most can't directly turn a character into an integer value like that.

If perl is available I'd suggest using that.
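
If a byte-by-byte port is not strictly required, one alternative is to let iconv do the validation. This is a minimal sketch, assuming iconv is installed (it normally is on OS X and Linux); the function name is_utf8 is made up for illustration. It swaps the hand-written loop for iconv's own decoder, so it is not a literal translation of the PHP function. Note that it can only say "valid UTF-8" or "not valid UTF-8"; treating every non-UTF-8 string as WINDOWS-1251 is an assumption, not a real detection.

```shell
# Hypothetical helper: reads bytes on stdin and succeeds (exit 0)
# only if iconv can decode all of them as UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Usage: pipe the string in and branch on the exit status.
str='hello'
if printf '%s' "$str" | is_utf8; then
    echo "valid UTF-8"
else
    # Assumption: anything that is not UTF-8 is taken to be WINDOWS-1251.
    echo "not UTF-8, assuming WINDOWS-1251"
fi
```

Reading from stdin rather than taking an argument keeps raw bytes out of shell variables, which some shells handle badly.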


I'm on OS X.

This is my code:

wget --server-response --spider -t 1 --timeout=1   "$fieldurl" 2>&1 | egrep -i "utf"

It's OK for now :)
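
The egrep filter only tells you whether "utf" appears somewhere in the response. To get the charset value itself, the header can be parsed with tr and sed. This is a rough sketch, not a full HTTP header parser; the function name extract_charset is invented here, and it assumes the server actually sends a charset parameter in its Content-Type header.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and prints
# the charset from the last Content-Type header, lowercased.
# Prints nothing if no charset parameter is present.
extract_charset() {
    tr '[:upper:]' '[:lower:]' |
        sed -n '/content-type:/s/.*charset="\{0,1\}\([^";[:space:]]*\).*/\1/p' |
        tail -n 1
}

# Usage with the wget call from this thread (needs network access):
# wget --server-response --spider -t 1 --timeout=1 "$fieldurl" 2>&1 | extract_charset
```

Taking the last match means that, when wget follows redirects, the charset of the final response is the one reported.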
