Base64 Definition
In computer programming, Base64 is a group of binary-to-text encoding schemes that transform binary data into a sequence of printable characters drawn from a set of 64 unique characters. More specifically, the source binary data is taken 6 bits at a time, and each group of 6 bits is mapped to one of the 64 characters.
Characters Used
Base64 encoding uses a set of 64 characters: the uppercase letters A-Z, the lowercase letters a-z, the digits 0-9, plus two more characters which vary by system, commonly + and /. For URL-safe encoding, these last two characters are often replaced with - and _.
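As a small illustration, the two alphabets described above can be built from Python's standard string constants. The names STANDARD and URL_SAFE are chosen here for the sketch and are not part of any library.

```python
import string

# Standard alphabet: A-Z (values 0-25), a-z (26-51), 0-9 (52-61), then '+' and '/'.
STANDARD = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# URL-safe variant: only the last two characters differ.
URL_SAFE = STANDARD[:62] + "-_"

print(len(STANDARD))                  # 64
print(STANDARD[62], STANDARD[63])     # + /
print(URL_SAFE[62], URL_SAFE[63])     # - _
```

The position of a character in the string is its Base64 value, so looking up a 6-bit number is a plain index operation.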
Encoding Process
- Binary Conversion: The original data is treated as a sequence of bytes, each written out as 8 binary digits.
- Divide into Chunks: The binary data is divided into chunks of 6 bits because 6 bits (2^6) can represent 64 different values, which matches the Base64 character set.
- Mapping to Characters: Each 6-bit chunk is then mapped to a corresponding character in the Base64 alphabet.
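- Padding: Since Base64 encoding processes the input in groups of 3 bytes (24 bits, or four 6-bit chunks), padding is added if the input length is not a multiple of 3 bytes. The last incomplete chunk is filled with zero bits, and the = character is appended to the encoded data so that its length is a multiple of 4 characters.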
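The four steps above can be sketched directly in Python. This is a minimal teaching version that works on a string of bit characters for readability, not an optimized implementation; the function name encode_base64 and the constant ALPHABET are assumptions made for this sketch, and it assumes the standard + and / alphabet.

```python
import string

# Standard Base64 alphabet; a character's index is its 6-bit value.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_base64(data: bytes) -> str:
    # Binary conversion: write every byte as 8 binary digits.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Fill the last chunk with zero bits so the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Divide into 6-bit chunks and map each one to its character.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Padding: append '=' until the output length is a multiple of 4.
    chars.append("=" * (-len(chars) % 4))
    return "".join(chars)

print(encode_base64(b"Hi"))   # SGk=
print(encode_base64(b"Man"))  # TWFu
```

An input whose length is already a multiple of 3 bytes, such as "Man", needs no padding at all, which is why its output contains no = character.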
Example
Let's take the text "Hi" as an example:
- "Hi" in ASCII is 72 105 in decimal, 48 69 in hexadecimal, and 01001000 01101001 in binary.
- These binary bytes are split into 6-bit chunks: 010010 000110 1001.
- Since the last group is only 4 bits, it's padded with two zeros: 010010 000110 100100.
- These chunks are then converted to decimal: 18 6 36.
- Using the Base64 index table, these numbers correspond to the characters S, G, and k.
Value:      0    1    ...  25   26   ...  51   52   ...  61   62   63
Character:  A    B    ...  Z    a    ...  z    0    ...  9    +    /
- The final Base64-encoded string is SGk=, where the = indicates padding was added.
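The worked example can be checked against Python's standard base64 module, which implements the same standard alphabet. The variable names below are illustrative only.

```python
import base64

# Encode the two bytes of "Hi"; b64encode returns bytes, so decode to text.
encoded = base64.b64encode(b"Hi").decode("ascii")
print(encoded)  # SGk=

# Decoding reverses the process and drops the padding.
decoded = base64.b64decode(encoded)
print(decoded)  # b'Hi'
```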