Code

stoudtLion · October 7, 2013, 1:18am

When you first start working with computers, you see KB (kilobytes) and MB (megabytes) all the time, but what do they stand for? Basically, they are units for measuring memory on a computer.
Here are some examples of what some numbers and letters look like to the computer:

0 = 00000000	a = 01100001	L = 01001100
1 = 00000001	v = 01110110	p = 01110000
2 = 00000010	$ = 00100100	z = 01111010

Link deleted.

jim_mcnamara · October 7, 2013, 3:48am

Computer systems use binary (base 2) numbers, so sizes are expressed as powers of two. That is: the number "two" multiplied by itself a given number of times.

2x2x2x2x2x2x2x2x2x2 = 1024 = 1 K (2^10)

2x2x2x2x2x2x2x2x2x2X2x2x2x2x2x2x2x2x2x2 = 1048576 = M (2^20)

So one KB (sometimes written Kb, although a lowercase b usually means bits) is 1024 bytes of data in a file.
One MB is 1048576 bytes of data in a file.

Since the computer uses only numbers to represent everything, the alphabet and other characters are assigned a number. Capital A=65, 0 (zero number character) = 49.

The definitions exist for EVERY character you can type, including the <enter> key.
The definition of these characters is a worldwide standard called ASCII. When you set your computer to use languages that have complicated sets of glyphs, the new rules (numbers for each glyph) are stored in locale settings. These are special and usually not ASCII.

The default locale is named "C", after the computer language C. That locale uses ASCII.

Computers are all about numbers and number crunching, so everything boils down to how numbers are stored in memory - they are stored as base2 numbers - binary - only ones and zeroes are allowed.

Don_Cragun · October 7, 2013, 9:12am

jim mcnamara:

... ... ...

Since the computer uses only numbers to represent everything, the alphabet and other characters are assigned a number. Capital A=65, 0 (zero number character) = 49.

The definitions exist for EVERY character you can type, including the <enter> key.
The definition of these characters is a worldwide standard called ASCII. When you set your computer to use languages that have complicated sets of glyphs, the new rules (numbers for each glyph) are stored in locale settings. These are special and usually not ASCII.

The default locale is named "C", after the computer language C. That locale uses ASCII.

Computers are all about numbers and number crunching, so everything boils down to how numbers are stored in memory - they are stored as base2 numbers - binary - only ones and zeroes are allowed.

The C and POSIX standards do not require that the C (and POSIX) Locales be based on the ASCII codeset. The POSIX standards do require that the collation order of the 128 characters in the ASCII codeset is the same as the binary byte order of the ASCII characters no matter what the actual underlying codeset is. This guarantees that if you sort text files that only contain characters from the portable character set (which omits some of the ASCII control characters), the text will sort into the same order on any machine no matter what codeset underlies the C and POSIX Locales. Note that to make this happen when sorting, you have to compare strings using the locale's collating order (as you get using strcoll()), not the numeric values of bytes (as you get using strcmp()).

There are UNIX branded implementations where the codeset underlying the C and POSIX Locales is a superset of ASCII (such as UTF-8 or one of the ISO 8859-* standards) where all of what Jim said above is true, but there are also implementations where the codeset underlying the C and POSIX Locales is one of the EBCDIC codesets where most of what Jim said above is not true.

Capital A is decimal 65 in ASCII, but it is decimal 193 in EBCDIC. Although in ASCII, the lowercase (uppercase) letters a-z (A-Z) have adjacent increasing values, that is not true in EBCDIC. The standards do, however, require that the 10 decimal digits (0-9) have adjacent increasing values no matter what the underlying codeset is. When d is an integral value from 0 through 9 inclusive, the following works to convert d to the corresponding character value in an ASCII based environment:

48 + d

(Note 48, not the 49 Jim listed above.) The same conversion in an EBCDIC based environment would be:

240 + d

But, you can portably write:

'0' + d

no matter what codeset underlies your C and POSIX Locales.

In ASCII, 'a' + 25 is 'z' and 'A' + 25 is 'Z' ; that relationship does not hold in EBCDIC.

Note also that your system's default locale is chosen by your system's administrator and is frequently not the C Locale. However, when a C program enters main, it will act as if it had made the call:

setlocale(LC_ALL, "C");

When a program capable of dealing with internationalized environments starts, it should explicitly make the call:

rc = setlocale(LC_ALL, "");

to set the locale in use to be the system's default locale (if the user hasn't overridden the default) or the locale specified by the user by setting the LANG and LC_* environment variables. And, obviously, the program should verify that the user didn't specify an invalid locale by checking the return value from that call to setlocale() (a null pointer means the request could not be honored).

Scrutinizer · October 7, 2013, 1:55pm

To add some confusion: in the disk storage world, KB and MB typically stand for 10^3 and 10^6 bytes. Also, it is becoming more and more common to use KiB, MiB, GiB (kibi-, mebi-, and gibibytes), etc. to explicitly refer to the 2^10, 2^20, 2^30, etc. variety and to use kB, MB, and GB for the 10^n variety (with a lowercase k). But KB, MB, etc. are still routinely used for the base 2 variety...