§ DATA REPRESENTATIONS IN COMPUTERS
For the discussion that follows, I restrict the definition of a data representation to sets of tokens for use in digital computers. In modern computers, all information is fundamentally represented as a series of binary digits. A binary digit (or bit) can take only two different values, usually called one and zero. [Binary representation is not a necessary property of computers. Computers have existed in which the fundamental representation had ten possible values, three possible values, etc. Analog computers represent data in a continuous range between two boundaries, similar to the speedometer of a car or the mercury in a thermometer. It is something of a historical accident that binary computers prevailed, but by now computers are almost universally binary.] Having only two possible values, a bit can distinguish only two different units of information. These units can be interpreted as one/zero, true/false, on/off, yes/no, hot/cold, up/down, etc. Correctly reading the meaning of a bit depends on knowing which of the many possible binary data representations was used when the bit's value was recorded.
To allow computers to represent more than two different things, bits are combined in series. For convenience, the value of a bit is usually written as '1' or '0', regardless of its actual meaning. A series of two bits has four possible combinations and can therefore represent four different units of information, as follows:
00
01
10
11
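Each additional bit doubles the number of combinations: a three-bit series can represent eight different units of information; four bits can represent sixteen units; and so on. To represent the set of twenty-six Latin letters, we need at least five bits in series, since four bits give only sixteen combinations while five bits give thirty-two.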
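The counting argument above can be checked with a short Python sketch. The function names bit_patterns and min_bits are hypothetical, chosen only for this illustration; the sketch assumes nothing beyond the doubling rule described in the text.

```python
from itertools import product


def bit_patterns(n):
    """Return every distinct series of n bits, written as strings of '0' and '1'."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def min_bits(count):
    """Return the smallest number of bits whose patterns can label `count` items.

    n bits give 2**n patterns, so we need the smallest n with 2**n >= count.
    (Illustrative helper, not part of any standard.)
    """
    return max(1, (count - 1).bit_length())


if __name__ == "__main__":
    print(bit_patterns(2))       # the four two-bit series listed above
    print(len(bit_patterns(3)))  # number of three-bit series
    print(min_bits(26))          # bits needed for the twenty-six Latin letters
```

Running the script lists the four two-bit series in the same order as the table above, then reports the number of three-bit series and the number of bits needed for twenty-six letters.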
As in the Morse code example above, we now have two levels of data representation:
- In the primary data representation, the letter 'M' represents the spoken sound "Mm."
- In the secondary data representation, a series of bits represents the letter 'M'.
The most common binary data representation for text is ASCII (American Standard Code for Information Interchange). ASCII uses a seven-bit series, with combinations for the upper- and lower-case alphabetical characters, numerical digits, punctuation symbols, and various control codes that are frequently needed for computer communications, for example via modem. [There are several 8-bit extended ASCII representations that add special characters.] The ASCII code for the letter 'M', for instance, is 1001101. In most modern computers, textual data is recorded in ASCII or in an encoding that extends it. The binary pattern 1001101, however, is also used in computers to represent the numerical quantity 77 decimal [viz., 1×64 + 1×8 + 1×4 + 1×1], and so this bit pattern has more than one meaning. The same series also serves to represent many other types of data (for instance, part of a bitmap for a graphical image). Therefore, an interpretation must be supplied to a program if it is to recover the intended meaning of a given seven-bit pattern.
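To make the point about interpretation concrete, here is a minimal Python sketch that reads the same seven-bit pattern in three ways. The function names and the labels "number", "text", and "pixels" are assumptions made for this illustration; the "pixels" reading is only one simple way to treat bits as a row of a bitmap, not a real image format.

```python
def encode_letter(letter):
    """Return the seven-bit ASCII pattern for a single character, as a string."""
    code = ord(letter)
    if code > 127:
        raise ValueError("not a seven-bit ASCII character")
    return format(code, "07b")


def interpret(pattern, kind):
    """Read a series of bits according to an explicitly supplied interpretation.

    kind is a hypothetical label: "number", "text", or "pixels".
    """
    if kind == "number":
        return int(pattern, 2)            # unsigned binary integer
    if kind == "text":
        return chr(int(pattern, 2))       # ASCII character with that code
    if kind == "pixels":
        return [bit == "1" for bit in pattern]  # one on/off value per bit
    raise ValueError("unknown interpretation: " + kind)


if __name__ == "__main__":
    pattern = encode_letter("M")
    print(pattern)                        # 1001101
    print(interpret(pattern, "number"))   # 77
    print(interpret(pattern, "text"))     # M
    print(interpret(pattern, "pixels"))
```

The bits themselves never change between the calls; only the kind argument does, which is the sense in which the program must be told the interpretation.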