
Base64 Encoding

Definition: Base64 encoding is a method for converting binary data into an ASCII string using a set of 64 different ASCII characters, facilitating safe transmission of the data through channels that only reliably support text.

Base64 encoding is prevalent within cyber security, particularly for encoding binary data before it is sent through channels that were not designed to carry arbitrary binary data, such as email bodies or HTTP headers. The scheme takes every 3 bytes (24 bits) of binary data and represents them as 4 characters of 6 bits each, drawn from the Base64 alphabet: uppercase and lowercase letters, numerals, plus (+), and slash (/). When the input length is not a multiple of 3, the final group is padded with “=”. The encoding helps ensure the data arrives intact, without being modified during transport.
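
To make the 3-bytes-to-4-characters mapping concrete, here is a minimal Python sketch. The helper name encode_three_bytes is hypothetical and only handles one full 3-byte group; real code would normally just call the standard library's base64.b64encode, which the sketch uses as a cross-check.

```python
import base64
import string

# The standard Base64 alphabet: A-Z, a-z, 0-9, "+", "/" (64 characters).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_three_bytes(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (illustrative helper)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles a full 3-byte group")
    # Join the 3 bytes into a single 24-bit integer.
    bits = int.from_bytes(chunk, "big")
    # Split the 24 bits into four 6-bit values, most significant first.
    indexes = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Each 6-bit value selects one character from the alphabet.
    return "".join(ALPHABET[i] for i in indexes)


manual = encode_three_bytes(b"Thi")
library = base64.b64encode(b"Thi").decode("ascii")
print(manual, library)  # prints: VGhp VGhp
```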


Base64 is widely used, for example in MIME email messages for attachments, for encoding credentials in HTTP Basic Authentication, and for representing binary files in XML or JSON data formats. It is essential to note that Base64 encoding is not encryption; it is a way of encoding data that anyone can reverse without a key, so it does not provide secrecy. Instead, its purpose is to let data traverse systems without compatibility or corruption issues.
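
The point that Base64 is not encryption can be illustrated with HTTP Basic Authentication. The following is a minimal Python sketch; the username and password are hypothetical values chosen for illustration.

```python
import base64

# Hypothetical credentials, for illustration only.
username = "alice"
password = "s3cret"

# Basic Authentication joins the two with a colon and Base64-encodes the result.
token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
header_value = f"Basic {token}"

# Anyone who sees the header can reverse the encoding; no key is involved.
recovered = base64.b64decode(token).decode("utf-8")
print(header_value)
print(recovered)
```

Because the decoding step needs nothing but the header itself, credentials sent this way are only as confidential as the transport (for example TLS) that carries them.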


While Base64 encoding helps in data transport, attackers sometimes misuse it to obfuscate malicious payloads or to disguise data that is being exfiltrated. Security systems therefore need to decode Base64-encoded data to inspect the underlying binary or textual content for potential threats. Decoding Base64 should not be equated with decryption, which requires a cryptographic algorithm and a key to render encrypted data back into its original form.
Because of these characteristics, any security strategy must treat Base64 encoded data with the same caution as regular data and employ necessary security controls such as encryption, secure transport protocols, and regular security auditing to maintain data integrity and confidentiality.
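
As a rough illustration of inspecting encoded content, the Python sketch below decodes a Base64 string and searches the result for byte patterns. The marker list and the function name contains_marker are hypothetical; real security tooling uses far richer detection logic than a substring search.

```python
import base64

# Hypothetical indicators, chosen only for this example.
SUSPICIOUS_MARKERS = [b"powershell", b"/bin/sh"]


def contains_marker(encoded: str) -> bool:
    """Decode a Base64 string and look for any of the example markers."""
    try:
        decoded = base64.b64decode(encoded, validate=True)
    except ValueError:
        # Not valid Base64, so there is nothing to inspect in this sketch.
        return False
    lowered = decoded.lower()
    return any(marker in lowered for marker in SUSPICIOUS_MARKERS)


sample = base64.b64encode(b"PowerShell -NoProfile").decode("ascii")
benign = base64.b64encode(b"hello world").decode("ascii")
print(contains_marker(sample), contains_marker(benign))  # prints: True False
```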

Why is Base64 encoding used?

Base64 encoding is used because binary data sometimes has to be transported over mediums that can only handle text reliably. This applies in many different areas, such as emails and URLs.

Base64 encoding is typically used to encode attachments in emails. This is because some email systems can’t handle binary data very well. To work around this, Base64 encoding converts binary attachments, such as any images or documents you attach, into a text format that email systems can handle and send safely.
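
The attachment case can be sketched with Python's standard email library. The bytes and the file name below are hypothetical stand-ins for a real image file, and the sketch relies on the library's default behaviour of choosing Base64 for binary attachments.

```python
from email.message import EmailMessage

# Hypothetical stand-in for an image file (only the 8-byte PNG signature).
attachment_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

message = EmailMessage()
message["Subject"] = "Example with attachment"
message.set_content("See the attached image.")
# Binary attachments are given a Base64 transfer encoding by default.
message.add_attachment(
    attachment_bytes, maintype="image", subtype="png", filename="example.png"
)

part = next(message.iter_attachments())
transfer_encoding = part["Content-Transfer-Encoding"]
print(transfer_encoding)          # the encoding chosen for the attachment
print(message.as_string())        # the full text form of the message
```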

Base64 encoding can also be used when embedding certain resources, like images, into web pages using data URLs. This is done by including binary data as a text string within the URL. Base64 encoding is also used for transmitting binary data over certain text-based protocols like HTTP and SMTP.

Cookies sometimes use Base64 encoding as well. Cookie values are restricted to a limited set of text characters and a small overall size, so Base64 encoding allows non-text data to be stored in them, at the cost of the larger encoded size.

Finally, images that are Base64 encoded can be embedded directly into HTML and CSS. In doing so, the number of server requests needed to render a web page can be reduced.
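
Embedding an image through a data URL, as described above, might look like the following Python sketch. The image bytes are a hypothetical stand-in (only the PNG file signature); in practice they would be read from an actual file.

```python
import base64

# Hypothetical stand-in for a small image; a real page would read the
# bytes from a file, for example with open(path, "rb").read().
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

encoded = base64.b64encode(image_bytes).decode("ascii")
# A data URL names the media type, declares Base64, then carries the text.
data_url = f"data:image/png;base64,{encoded}"
img_tag = f'<img src="{data_url}" alt="example">'
print(img_tag)
```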

How does Base64 image encoding work?

Base64 image encoding works by converting the binary data of the image you have selected into a text string. This text string will only consist of ASCII characters. In doing so, the image data will be easier to transmit when only text-based data can be used. 

Remember that it is not recommended to use Base64 encoding for particularly large images or frequent image transfers. When you encode an image in Base64, the encoded data is approximately 33% larger than the original binary data, because every 3 bytes of binary data become 4 characters of Base64 encoded text.
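
The size increase follows directly from the 3-to-4 ratio and can be checked with a short Python sketch. The helper name encoded_length is hypothetical, and random bytes stand in for image data.

```python
import base64
import os


def encoded_length(n: int) -> int:
    """Length of padded Base64 output for n input bytes."""
    # Every started group of 3 bytes produces 4 output characters.
    return 4 * ((n + 2) // 3)


raw = os.urandom(30_000)  # random bytes standing in for an image
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1

print(len(raw), len(encoded))  # prints: 30000 40000
print(f"{overhead:.1%}")       # prints: 33.3%
```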

How to identify Base64 encoding

When data is encoded into a Base64 format, it has certain characteristics that make it easy to identify. The first characteristic you should look for in your string is the length. In a standard, padded Base64 string, the length will always be a multiple of 4. Some variants omit the padding, in which case this check does not apply.

As stated earlier, only lowercase and uppercase letters, numerals, “+”, and “/” will be used in the encoding. If the string has a character that is not one of those (apart from “=” padding at the end), it is not a standard Base64 encoded string. Note that the URL-safe variant of Base64 replaces “+” and “/” with “-” and “_”.

The final way to identify Base64 encoding is by checking the end of the string. A Base64 encoded string can end with one or two “=” padding characters. This does not contradict the character set restriction discussed above, as “=” is only allowed at the end and not anywhere else in the string.

Many online tools can identify Base64 encoded strings; the quickest way is to try to decode the string using an online converter. It is typical for cyber security professionals to use more sophisticated tools when trying to identify an encoded string, such as the CyberChef tool released publicly by GCHQ.
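
The three checks described in this section (length, character set, and padding) can be combined into a rough heuristic. The Python sketch below is one possible approach, assuming standard padded Base64; the function name looks_like_base64 is hypothetical.

```python
import base64
import binascii
import re

# Alphabet characters, followed by at most two "=" at the very end.
_B64_PATTERN = re.compile(r"[A-Za-z0-9+/]*={0,2}")


def looks_like_base64(candidate: str) -> bool:
    """Heuristic check for standard, padded Base64 (illustrative only)."""
    # Check 1: the length must be a non-zero multiple of 4.
    if not candidate or len(candidate) % 4 != 0:
        return False
    # Checks 2 and 3: allowed characters only, "=" only at the end.
    if not _B64_PATTERN.fullmatch(candidate):
        return False
    # Final confirmation: try a strict decode.
    try:
        base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


print(looks_like_base64("VGhpcyBpcyBhIHN0cmluZyBlbmNvZGVkIGluIEJhc2U2NA=="))  # prints: True
print(looks_like_base64("not base64!"))  # prints: False
```

A check like this can only say that a string is consistent with Base64. Ordinary text can pass too: the four-letter word “word” satisfies every rule, so the decoded output still needs to be examined.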

Example of a Base64 encoded string

Below is the sentence “This is a string encoded in Base64” (without a trailing full stop) encoded in Base64:

VGhpcyBpcyBhIHN0cmluZyBlbmNvZGVkIGluIEJhc2U2NA==
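
The string can be decoded with Python's standard base64 module, as in this minimal sketch:

```python
import base64

encoded = "VGhpcyBpcyBhIHN0cmluZyBlbmNvZGVkIGluIEJhc2U2NA=="
decoded = base64.b64decode(encoded).decode("ascii")
print(decoded)  # prints: This is a string encoded in Base64

# The original text is 34 bytes long: 11 full groups of 3 bytes plus 1
# leftover byte, which is why the encoded form ends with two "=" characters.
print(len(decoded), len(encoded))  # prints: 34 48
```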

Key Characteristics:

  • Encodes binary data to ASCII text
  • Maps 3 bytes of binary to 4 Base64 characters
  • Commonly used for data transmission over protocols that handle text
  • Not an encryption technique but rather an encoding scheme

Examples:

  • Real-World Example: When sending an email with an image attachment, the message format (MIME) typically uses Base64 encoding for the binary image so it can travel over SMTP alongside the textual parts of the message without corruption.
  • Hypothetical Scenario: A developer needs to embed a small binary object inside a JSON configuration file. They use Base64 encoding to convert the binary data into a text representation, which is inserted without compatibility issues.
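
The hypothetical JSON scenario might look like the following Python sketch. The key names and the binary object are assumptions made up for illustration.

```python
import base64
import json

# Hypothetical binary object to embed.
blob = bytes(range(8))

# JSON has no binary type, so the bytes are stored as Base64 text.
config = {
    "name": "example",
    "payload_b64": base64.b64encode(blob).decode("ascii"),
}
text = json.dumps(config)

# Reading the configuration back reverses both steps.
restored = base64.b64decode(json.loads(text)["payload_b64"])
print(text)
print(restored == blob)  # prints: True
```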

Related Terms:

  • ASCII: The character encoding standard from which the 64 characters of the Base64 alphabet are drawn.
  • MIME (Multipurpose Internet Mail Extensions): A standard that extends the format of email to support text in character sets other than ASCII, and attachments, often using Base64 encoding.
  • HTTP Basic Authentication: An authentication method where the user’s credentials, in the form username:password, are encoded in Base64 and sent in the HTTP Authorization header.
