Character Sets

HTML Character Sets

To display an HTML page correctly, the browser must know which character set (character encoding) the page uses.
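As a rough illustration of why the encoding matters, the sketch below decodes the same bytes twice, once with the right encoding and once with the wrong one. The sample word and the variable names are made up for this example; real browsers apply more elaborate detection rules.

```python
# Text as the page author wrote it ("café"), stored as UTF-8 bytes.
original = "caf\u00e9"
page_bytes = original.encode("utf-8")

# A reader that knows the page is UTF-8 recovers the original text.
decoded_correctly = page_bytes.decode("utf-8")

# A reader that wrongly assumes Windows-1252 treats the two bytes of "é"
# as two separate characters.
decoded_wrongly = page_bytes.decode("cp1252")

print(decoded_correctly)  # café
print(decoded_wrongly)    # cafÃ©
```

The bytes are identical in both cases; only the assumed encoding differs.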

UTF-8 is the default character encoding for HTML5, but this was not always the case. The earliest character set was ASCII, and ISO-8859-1 was the default character set from HTML 2.0 through HTML 4.01.

Those encodings still caused problems, and many of the issues were solved when HTML5 and XML adopted UTF-8.

Let's look at each character set in more detail.

ASCII

ASCII was the first character encoding standard in widespread use (a character encoding is also called a character set). The name stands for American Standard Code for Information Interchange.

ASCII assigns a unique binary number to each character it supports: the uppercase and lowercase letters (A-Z, a-z), the digits 0-9, and special characters. It is based on the English alphabet and encodes 128 characters as 7-bit integers, because computers store all information as binary ones and zeros (for example, 01000101 is the letter E padded to a full byte).

Below, you can see an ASCII chart.
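A chart of this kind can be generated in a few lines. The following is a minimal sketch, not an official table; the helper name `ascii_bits` and the choice to list only the printable range (codes 32 to 126) are assumptions made for this example.

```python
def ascii_bits(char: str) -> str:
    """Return the 7-bit binary ASCII code of a single character."""
    code = ord(char)
    if code > 127:
        raise ValueError(f"{char!r} is not an ASCII character")
    return format(code, "07b")

# One row per printable ASCII character: (character, decimal code, 7-bit binary).
chart = [(chr(code), code, ascii_bits(chr(code))) for code in range(32, 127)]

for char, code, bits in chart:
    print(f"{char!r:>5} {code:>3} {bits}")

print(ascii_bits("E"))  # 1000101
```

Codes 0 to 31 and 127 are control characters and are left out of the chart because they have no printable form.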

The biggest problem with ASCII is that it has no non-English letters. It is still in use today, and its 128 characters are also the first 128 characters of ANSI, ISO-8859-1, and Unicode.
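The limitation is easy to reproduce. The sketch below tries to encode a phrase containing accented letters as ASCII; the sample phrase is arbitrary.

```python
# "naïve café" written with escapes so the accented letters are unambiguous.
text = "na\u00efve caf\u00e9"

failed_char = None
try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    # err.start is the index of the first character ASCII cannot represent.
    failed_char = text[err.start]
    print(f"ASCII cannot encode {failed_char!r}")

# errors="replace" substitutes "?" for every character ASCII lacks,
# which loses information.
lossy = text.encode("ascii", errors="replace")
print(lossy)  # b'na?ve caf?'
```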

ANSI

ANSI, also called Windows-1252, was the default character set in Windows up to Windows 95. It is an extension of ASCII that adds international characters. It uses a full byte (8 bits) per character, which gives it room for 256 characters.

Because ANSI was the default character set in Windows, all browsers support it.

ISO-8859-1

ISO-8859-1 became the default character encoding in HTML 2.0, because most countries use characters that ASCII lacks. Like ANSI, it is an extension of ASCII that adds international characters. ISO-8859-1 also uses a full byte (8 bits) per character, so it can represent twice as many characters as ASCII.
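The "one byte per character" property can be checked directly. This is a small sketch using Python's built-in `iso-8859-1` codec; the variable names are chosen for this example only.

```python
# Each of the 256 possible byte values decodes to the character whose
# code point has the same number.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-1")
same_numbers = all(ord(ch) == i for i, ch in enumerate(decoded))
print(same_numbers)  # True

# Byte values 0-127 are plain ASCII, so ASCII text is encoded unchanged.
ascii_unchanged = "HTML".encode("iso-8859-1") == "HTML".encode("ascii")
print(ascii_unchanged)  # True

# Characters above 127, such as "é" (U+00E9), still fit in a single byte.
e_acute = "\u00e9".encode("iso-8859-1")
print(e_acute)  # b'\xe9'
```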

An HTML 4 page declares ISO-8859-1 in a <meta> tag:

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">

If an HTML 4 page uses a character encoding other than ISO-8859-1, the encoding must be declared in the <meta> tag:

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-8">

All HTML 4 processors also support UTF-8:

<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

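When a browser detects ISO-8859-1, it commonly decodes the page as ANSI (Windows-1252) instead. The two encodings are identical except for the 32 byte values from 0x80 to 0x9F: ISO-8859-1 reserves them for control characters, while ANSI uses them for additional printable characters such as the euro sign and curly quotation marks.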
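The difference between the two encodings can be seen by decoding the same bytes both ways. The sketch below uses Python's `cp1252` and `iso-8859-1` codecs; the sample bytes are invented, and `errors="replace"` is used because Python's `cp1252` codec leaves a few byte values in this range unassigned.

```python
# Curly quotes (0x93, 0x94) and a euro sign (0x80) as a Windows program might save them.
sample = b"\x93Hi\x94 \x80 5"

as_ansi = sample.decode("cp1252")        # printable punctuation
as_latin1 = sample.decode("iso-8859-1")  # invisible control characters

print(as_ansi)          # “Hi” € 5
print(repr(as_latin1))  # '\x93Hi\x94 \x80 5'

# Collect every byte value that the two codecs decode differently.
differing = [
    b for b in range(256)
    if bytes([b]).decode("cp1252", errors="replace")
    != bytes([b]).decode("iso-8859-1")
]
print(len(differing), hex(differing[0]), hex(differing[-1]))  # 32 0x80 0x9f
```

Decoding as ANSI is the more forgiving choice: text that really is ISO-8859-1 rarely contains those control characters, so nothing is lost, while mislabeled ANSI text is displayed as intended.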

Unicode UTF-8

UTF-8 is the default character encoding for HTML5.

Because the character sets described above are limited in size, the Unicode Consortium developed the Unicode Standard.

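The Unicode Standard covers almost all the characters, punctuation marks, and symbols used in the world. UTF-8 is an encoding of Unicode that uses one to four bytes per character and encodes ASCII characters as the same single bytes that ASCII uses.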
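To make the variable length of UTF-8 concrete, the sketch below encodes four characters from different parts of Unicode. The helper name `utf8_hex` and the sample characters are assumptions chosen for illustration.

```python
def utf8_hex(char: str) -> str:
    """Return the UTF-8 bytes of a character as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in char.encode("utf-8"))

samples = [
    "A",           # U+0041, a plain ASCII letter
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, an emoji
]

for char in samples:
    byte_count = len(char.encode("utf-8"))
    print(f"U+{ord(char):04X} -> {byte_count} byte(s): {utf8_hex(char)}")

# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): c3 a9
# U+20AC -> 3 byte(s): e2 82 ac
# U+1F600 -> 4 byte(s): f0 9f 98 80
```

The single byte for "A" is the same byte ASCII uses, which is why pure ASCII text is already valid UTF-8.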

In HTML5, the charset attribute of the <meta> tag declares the character encoding:

<meta charset="UTF-8">

All HTML5 processors support ANSI, ISO-8859, and UTF-8. XML processors are required to support UTF-8 and UTF-16.
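The <meta> declarations shown in this section can also be read by a program. The following is a simplified sketch built on Python's standard `html.parser` module; the class `CharsetFinder` and the function `declared_charset` are hypothetical names for this example, and real browsers follow a more detailed detection algorithm that also considers HTTP headers and byte order marks.

```python
import re
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Hypothetical helper: records the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        if attr.get("charset"):
            # HTML5 form: <meta charset="UTF-8">
            self.charset = attr["charset"].strip()
        elif (attr.get("http-equiv") or "").lower() == "content-type":
            # HTML 4 form: the charset is embedded in the content attribute.
            match = re.search(r"charset\s*=\s*([\w.:-]+)",
                              attr.get("content") or "", re.IGNORECASE)
            if match:
                self.charset = match.group(1)


def declared_charset(html: str):
    """Return the charset named in the markup, or None if none is declared."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


html4 = '<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">'
html5 = '<meta charset="UTF-8">'
print(declared_charset(html4))  # ISO-8859-1
print(declared_charset(html5))  # UTF-8
print(declared_charset("<p>no declaration</p>"))  # None
```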


