Character set (encoding)

A character set is a group of characters that can be used on the internet: digits (0-9), English letters (A-Z), and some special characters like ! $ + - ( ) @ < > . ASCII was one of the earliest character encoding standards.

Character set encoding refers to a set of characters and the way these characters are stored in memory. A coded character set is a character set in which each character corresponds to a unique number. A code unit is the smallest fixed-size group of bits that an encoding uses to represent characters, and its size depends on the particular encoding:

  • A code unit in US-ASCII consists of 7 bits.

  • A code unit in UTF-8, EBCDIC and GB18030 consists of 8 bits.

  • A code unit in UTF-16 consists of 16 bits.

  • A code unit in UTF-32 consists of 32 bits.
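The code unit sizes listed above can be checked with a short sketch. This is only an illustration, not part of any product API: the helper name `code_unit_count` is hypothetical, and the little-endian codec names (`utf-16-le`, `utf-32-le`) are chosen here so that Python does not prepend a byte order mark to the output.

```python
def code_unit_count(text: str, encoding: str, unit_bits: int) -> int:
    """Return how many code units `text` occupies in `encoding`.

    `unit_bits` is the code unit size of the encoding in bits
    (8 for UTF-8, 16 for UTF-16, 32 for UTF-32).
    """
    encoded = text.encode(encoding)       # bytes produced by the encoding
    return len(encoded) * 8 // unit_bits  # convert byte count to unit count


# In a coded character set, each character maps to a unique number.
print(ord("A"))  # number assigned to "A"
print(chr(65))   # character assigned to the number 65

# Sample characters, written as escapes: A, e-acute, euro sign, an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        "utf-8:", code_unit_count(ch, "utf-8", 8),
        "utf-16:", code_unit_count(ch, "utf-16-le", 16),
        "utf-32:", code_unit_count(ch, "utf-32-le", 32),
    )
```

The code unit size is fixed for each encoding, but the number of code units per character is not always one: in UTF-8 and UTF-16 a single character may need several code units, while in UTF-32 every character fits in exactly one.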

Related terms

  • Unicode

