Summary -
In this topic, we covered the following sections -
A character set encoding tells the browser how to turn the bytes of an HTML page into characters, so the browser must identify the character set before it can display the page correctly. A character set is also called a character encoding. This topic covers the four character sets most relevant to HTML:
- ASCII character set
- ANSI character set (Windows-1252)
- ISO-8859-1 character set
- UTF-8 character set
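To see why the character set has to be identified first, here is a minimal sketch in Python (used only for illustration; the sample word is made up). It saves a word as UTF-8 bytes and then decodes those bytes twice, once with the right character set and once with the wrong one.

```python
# Bytes of a page saved as UTF-8 (sample text chosen for illustration).
page_bytes = "café".encode("utf-8")

# Decoded with the correct character set, the text round-trips.
right = page_bytes.decode("utf-8")

# Decoded as Windows-1252, the two bytes of "é" become two separate characters.
wrong = page_bytes.decode("cp1252")

print(right)  # café
print(wrong)  # cafÃ©
```

A browser that guesses the wrong character set shows the same kind of garbled text, which is why a page should declare its encoding.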
The ASCII Character Set
ASCII was one of the first character encoding standards. It is a 7-bit code: the values 0 to 31 and 127 are control characters, and the values 32 to 126 are letters, digits, and symbols.
ASCII does not define the values from 128 to 255. It supports the digits (0-9), the English letters (A-Z and a-z), and some special characters such as !, $, +, -, (, ), @, <, >.
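The ranges above can be checked with a short Python sketch. The helper name `is_ascii_text` is hypothetical and exists only for this example.

```python
# Split the 128 ASCII values into the two ranges described above.
control = [c for c in range(128) if c <= 31 or c == 127]
printable = [c for c in range(128) if 32 <= c <= 126]

print(len(control))    # 33
print(len(printable))  # 95
print(chr(65), chr(48), chr(64))  # A 0 @


def is_ascii_text(text):
    """Return True if every character fits in the ASCII range 0 to 127."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(is_ascii_text("Hello!"))  # True
print(is_ascii_text("café"))    # False
```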
For the full ASCII character set, refer to the ASCII character set reference.
The ANSI Character Set (Windows-1252)
ANSI (Windows-1252) was the original Windows character set. It uses 8 bits per character, which gives 256 possible character codes, and it is identical to ASCII for the values from 0 to 127.
ANSI (Windows-1252) has a proprietary set of characters for the values from 128 to 159, such as the euro sign and curly quotation marks. For the values from 160 to 255, it is identical to ISO-8859-1 and matches the Unicode code points with the same numbers.
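As a rough illustration of these ranges, the Python sketch below decodes single bytes with Python's built-in `cp1252` codec. The helper `decode_cp1252` is a hypothetical name for this example, and the list of unassigned values reflects how Python's codec treats them; other software may map those bytes differently.

```python
def decode_cp1252(byte_value):
    """Decode one byte as Windows-1252, or return None if the value is unassigned."""
    try:
        return bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None


# 128 to 159: Windows-specific characters, such as the euro sign and curly quotes.
print(decode_cp1252(0x80))  # €
print(decode_cp1252(0x93))  # “

# Python's cp1252 codec leaves a few values in this range unassigned.
unassigned = [b for b in range(128, 160) if decode_cp1252(b) is None]
print([hex(b) for b in unassigned])  # ['0x81', '0x8d', '0x8f', '0x90', '0x9d']

# 160 to 255: each byte maps to the Unicode code point with the same number.
same_as_unicode = all(decode_cp1252(b) == chr(b) for b in range(160, 256))
print(same_as_unicode)  # True
```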
For the full ANSI character set, refer to the ANSI character set reference.
The ISO-8859-1 Character Set
ISO-8859-1 was the default character set for HTML before HTML5. It also uses 8 bits per character, which gives 256 possible character codes, and it is identical to ASCII for the values from 0 to 127.
ISO-8859-1 does not assign printable characters to the values from 128 to 159. For the values from 160 to 255, it matches the Unicode code points with the same numbers.
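The sketch below checks this layout with Python's `latin-1` codec, which is Python's name for ISO-8859-1. One assumption to keep in mind: that codec follows the common convention of treating the values 128 to 159 as non-printing control characters, so they decode without error but produce nothing visible.

```python
import unicodedata

# Every ISO-8859-1 byte decodes to the Unicode code point with the same number.
one_to_one = all(bytes([b]).decode("latin-1") == chr(b) for b in range(256))
print(one_to_one)  # True

# 128 to 159 decode to control characters (Unicode category "Cc"), not printable ones.
c1_range = [bytes([b]).decode("latin-1") for b in range(128, 160)]
all_control = all(unicodedata.category(ch) == "Cc" for ch in c1_range)
print(all_control)  # True

# The same byte read as Windows-1252 gives a printable character instead.
print(bytes([0x93]).decode("cp1252"))  # “

# 160 to 255 hold printable characters such as accented letters.
print(bytes([0xE9]).decode("latin-1"))  # é
```

This difference in the 128 to 159 range is the main place where ISO-8859-1 and Windows-1252 disagree.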
For the full ISO-8859-1 character set, refer to the ISO-8859-1 character set reference.
The UTF-8 Character Set
HTML5 changed the default character encoding to UTF-8 because ANSI and ISO-8859-1 were limited. UTF-8 (Unicode) covers almost all of the characters and symbols in the world. UTF-8 is identical to ASCII for the values from 0 to 127, and stores each of them as the same single byte.
In Unicode, the values from 128 to 159 are control characters rather than printable characters. For the values from 160 to 255, the Unicode code points are identical to both ANSI and ISO-8859-1, although UTF-8 stores each of them as two bytes. Unicode continues from the value 256 with well over 100,000 additional characters.
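One point worth separating is the character's number (its Unicode code point) from the bytes UTF-8 uses to store it. The Python sketch below, with sample characters chosen only for illustration, shows that UTF-8 uses one byte for ASCII values and two to four bytes for everything else.

```python
# Sample characters: a letter, an accented letter, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Show the code point, the UTF-8 bytes in hex, and how many bytes were needed.
    print(f"U+{ord(ch):04X}", encoded.hex(), len(encoded))

byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(byte_lengths)  # [1, 2, 3, 4]

# Values 0 to 127: UTF-8 bytes are identical to ASCII bytes.
print("A".encode("utf-8") == "A".encode("ascii"))  # True

# Values 160 to 255: same code point as ISO-8859-1, but a different byte sequence.
print("\u00e9".encode("latin-1"))  # b'\xe9'
print("\u00e9".encode("utf-8"))    # b'\xc3\xa9'
```

Because the byte sequences differ above 127, a page saved as UTF-8 and read as ISO-8859-1 or Windows-1252 shows garbled accented characters even though the code points agree.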
For the full UTF-8 character set, refer to the UTF-8 character set reference.