Character Set - SS3 ICT Lesson Note
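A character set is a defined collection of characters (letters, digits, symbols) in which each character is assigned a numeric code. A character encoding specifies how those codes are stored as binary values (bits or bytes). The two terms are often used interchangeably, and together they define how text data is represented in computers. Two commonly used character sets are: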
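ASCII (American Standard Code for Information Interchange): ASCII is a widely used character encoding scheme that assigns a unique 7-bit binary code to each character, giving 128 codes in total for letters, digits, punctuation, and control characters. For example, the letter A has the code 65, which is 1000001 in binary. Extended ASCII refers to 8-bit encodings that keep the original 128 codes and add 128 more (256 in total) for additional characters and symbols.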
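Unicode: Unicode is a more comprehensive character set that supports a vast range of characters from writing systems worldwide, accommodating languages like Chinese, Arabic, and many others. Each character is given a unique number called a code point, and the first 128 code points are the same as ASCII. Unicode text can be stored using different encodings: UTF-8 uses 1 to 4 bytes per character, UTF-16 uses 2 or 4 bytes, and UTF-32 always uses 4 bytes (32 bits).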
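The two character sets above can be explored with a short Python sketch. It is only an illustration: the function names ascii_bits and encoded_sizes are made up for this note, and it relies on Python's built-in ord() and str.encode(). The first function shows the 7-bit ASCII code of a character, and the second shows how many bytes one character takes in UTF-8, UTF-16 and UTF-32.

```python
def ascii_bits(ch: str) -> str:
    """Return the 7-bit binary code of a single ASCII character."""
    code = ord(ch)  # numeric code of the character
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "07b")  # pad with zeros to 7 binary digits


def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one character occupies in each encoding."""
    # The "-be" variants are used so that no byte order mark is counted.
    return {
        "UTF-8": len(ch.encode("utf-8")),
        "UTF-16": len(ch.encode("utf-16-be")),
        "UTF-32": len(ch.encode("utf-32-be")),
    }


if __name__ == "__main__":
    print("A in ASCII:", ord("A"), ascii_bits("A"))
    # A Latin letter, an accented letter, a Chinese character, and an emoji
    for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
        print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Running it should show that the letter A needs only 1 byte in UTF-8 but 4 bytes in UTF-32, while the emoji needs 4 bytes in all three encodings.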
In summary, data representation is the cornerstone of how computers store, process, and communicate information. It involves converting diverse types of data into a format that computers can understand and manipulate efficiently, and character sets like ASCII and Unicode play a crucial role in representing textual data.