How To Convert String To Byte In Python

A bytes object is an immutable sequence of integers in the range 0 to 255 that represents binary data. Converting a string to bytes is a common task in Python, and it is necessary when dealing with binary files, network protocols, encryption, and compression.

To transform a string into bytes, Python applies an encoding such as ASCII or UTF-8, which maps each character to one or more bytes. The resulting bytes object can then be used in operations that require binary data.

Why is there a need to convert a string to bytes?

Here are a few reasons to convert a string to bytes in Python:

  • Files opened in binary mode are read and written as sequences of bytes.
  • Data sent over a network must be a sequence of bytes.
  • When working with network protocols or binary data, bytes are often the preferred data type.
  • Some cryptographic processes and hashing algorithms require data to be in a byte format.
  • Bytes are a low-level data type that permits more precise manipulation of binary data.

In this article, we will discuss some common ways to convert a string to bytes in Python.
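
As a small illustration of the hashing point above, the sketch below hashes a string with the standard hashlib module. The variable names message and digest are chosen only for this example.

```python
import hashlib

message = "hello"

# hashlib works on bytes, so the string is encoded first
digest = hashlib.sha256(message.encode("utf-8")).hexdigest()
print(digest)

# Passing the str directly is rejected with a TypeError
try:
    hashlib.sha256(message)
except TypeError as exc:
    print("TypeError:", exc)
```

The hash function only accepts bytes, so the encoding step cannot be skipped.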

How to change string to byte in Python

There are three different methods to convert a string to bytes in Python. Each one is covered below with a detailed explanation, its code, and its output.

The approaches are:

  1. Using encode() method
  2. Using bytes() constructor
  3. Using bytearray() constructor

Approach 1: Using encode() method

The encode() method converts a string into a bytes object using a specified encoding such as UTF-8, ASCII, or ISO-8859-1. It also handles non-ASCII characters, which cannot be represented in plain ASCII, as long as the chosen encoding supports them.

Example of using the encode() method with explanation:

Input:

# Converting a string to bytes using the encode() method
string = "Hii Rekha Roy!"
bytes_obj = string.encode("UTF-8")
print(bytes_obj)

# Converting a string with non-ASCII characters to bytes using the encode() method
string = "Hii Rékha Róy!"
bytes_obj = string.encode("UTF-8")
print(bytes_obj)

Output:

# Output 1
b'Hii Rekha Roy!'
# Output 2
b'Hii R\xc3\xa9kha R\xc3\xb3y!'

Explanation:

  • In example 1 we call the encode() method with "UTF-8" as the encoding. It converts the string to a bytes object, which is assigned to bytes_obj.
  • In example 2 the string contains non-ASCII characters. Calling encode() with "UTF-8" converts each of them to a two-byte sequence, shown in the output as \xc3\xa9 (é) and \xc3\xb3 (ó).
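
The sketch below shows what happens when the chosen encoding cannot represent a character, using the same sample string. It relies on the errors argument of encode(). The variable names are chosen only for this example.

```python
text = "Hii Rékha Róy!"

# UTF-8 can represent every Unicode character
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'Hii R\xc3\xa9kha R\xc3\xb3y!'

# ASCII cannot represent 'é' or 'ó', so the default strict mode raises an error
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("Cannot encode as ASCII:", exc.reason)

# The errors argument selects a fallback instead of raising
replaced = text.encode("ascii", errors="replace")
ignored = text.encode("ascii", errors="ignore")
print(replaced)  # b'Hii R?kha R?y!'
print(ignored)   # b'Hii Rkha Ry!'
```

Both fallbacks lose information, so UTF-8 is usually the safer choice when the text may contain non-ASCII characters.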

Approach 2: Using bytes() constructor

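To make a bytes object from a list or a tuple of integers, we use the bytes() constructor. The constructor also accepts a string together with an encoding, for example bytes("Hello", "utf-8").

Example of using the bytes() constructor with explanation: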

Input:

int_list = [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33]
bytes_obj = bytes(int_list)
print(bytes_obj)

Output:

# Output 
b'Hello World!'

Explanation:

  • The int_list holds a series of integers that are the ASCII codes for the characters in "Hello World!".
  • The bytes() constructor takes an iterable of integers, each in the range 0 to 255, and returns a bytes object containing those values.
  • The resulting bytes_obj contains the integers from int_list as bytes, in the order they appear.
  • The bytes object can be used to transmit or store the data in a binary format.
  • Data such as an image or audio file must be stored or transmitted in a binary format, and the bytes() constructor helps us with that.
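
Since this article is about strings, here is a minimal sketch of passing a string directly to bytes() with an encoding, next to the integer form used above. The names from_string and from_ints are chosen only for this example.

```python
text = "Hello World!"

# bytes() accepts a string only together with an encoding
from_string = bytes(text, "utf-8")

# The same bytes built from the integer ASCII codes
from_ints = bytes([72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33])

print(from_string == from_ints)  # True

# Each integer must fit in one byte (0 to 255)
try:
    bytes([256])
except ValueError as exc:
    print("ValueError:", exc)

# Omitting the encoding for a string raises a TypeError
try:
    bytes(text)
except TypeError as exc:
    print("TypeError:", exc)
```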

Approach 3: Using bytearray() constructor

The bytearray() constructor creates a mutable sequence of bytes in Python, which is useful when we need to modify the bytes after creation.

Example of using the bytearray() constructor with explanation:

Input:

# create a bytearray from a string
my_string = "Hello, world!"
my_bytearray = bytearray(my_string, "utf-8")

# modify the bytearray
my_bytearray[0] = 72  # ASCII code for 'H' (the first byte is already 'H', so the text does not change)
my_bytearray[7:12] = b"Python"  # replace 'world' with 'Python', keeping the trailing '!'

# convert the bytearray back to a string
my_new_string = my_bytearray.decode("utf-8")

# print the results
print("Original String:", my_string)
print("Bytearray:", my_bytearray)
print("Modified String:", my_new_string)

Output:

Original String: Hello, world!
Bytearray: bytearray(b'Hello, Python!')
Modified String: Hello, Python!

Explanation:

  • Create a bytearray my_bytearray from my_string using the bytearray() constructor and specifying the encoding format "utf-8".
  • Modify the bytearray by setting the first byte to 72, which is the ASCII code for the letter 'H', and replacing the bytes of 'world' (indexes 7 to 11) with the bytes b"Python". The trailing '!' is left in place.
  • Convert the modified bytearray back to a string using the decode() method and specifying the encoding format "utf-8".
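
To make the difference in mutability concrete, the sketch below tries the same item assignment on a bytes object and on a bytearray. The names data, buffer and frozen are chosen only for this example.

```python
data = "Hello, world!".encode("utf-8")  # immutable bytes
buffer = bytearray(data)                # mutable copy of the same bytes

# A bytes object does not support item assignment
try:
    data[0] = 74
except TypeError as exc:
    print("TypeError:", exc)

# A bytearray can be changed and extended in place
buffer[0] = 74         # ASCII code for 'J'
buffer.extend(b"!!")   # append two more bytes
print(buffer.decode("utf-8"))  # Jello, world!!!

# Convert back to immutable bytes once the edits are done
frozen = bytes(buffer)
```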

Best Approach to Convert String to Byte in Python

There are several reasons why using the encode() method is often considered the best approach:

Simplicity: The encode() method is the easiest way to convert a string to bytes. It takes a single line of code and is easy to read and understand.

Versatility: The encode() method works with any Unicode string, which is the default string type in Python 3, and it accepts the target encoding as an argument (UTF-8 is the default).

Efficiency: The encode() method is a built-in string method and performs well in most cases.

Widespread usage: The encode() method is the most widely used and recognized way of converting a string to bytes in Python.
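
As a quick check, the sketch below converts the same string with all three approaches and compares the results. The variable names are chosen only for this example.

```python
text = "Hii Rékha Róy!"

via_encode = text.encode()                       # encoding defaults to UTF-8
via_bytes = bytes(text, "utf-8")                 # constructor with explicit encoding
via_bytearray = bytes(bytearray(text, "utf-8"))  # mutable form, converted back to bytes

# All three approaches produce the same sequence of bytes
print(via_encode == via_bytes == via_bytearray)  # True

# Decoding with the same encoding recovers the original string
print(via_encode.decode("utf-8") == text)  # True
```

The results are identical, so the choice mostly comes down to readability and whether the bytes need to be modified afterwards.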

Sample problems for converting a string to bytes in Python with each of the approaches:

By using encode() method

Sample Problem 1:

Your project involves sending confidential data over a network. To keep the data safe in transit, you have to encrypt it before sending. You have decided to use the AES encryption algorithm, which requires the data to be in bytes format, but the sensitive data is in string format. Write a solution that converts the string to bytes using the encode() method.

Solution:

  • Install the pycryptodome module using pip install pycryptodome, which provides the Crypto package. The older pycrypto package is no longer maintained.
  • We are given a sensitive string that needs to be sent securely over the network.
  • The AES encryption algorithm requires the data to be in bytes format.
  • Call the encode() method on the string to create a bytes object that is the encoded version of the string.
  • Store the bytes object in a variable called byte_data.
  • Encrypt byte_data with AES using a key that is 16, 24, or 32 bytes long, and send the encrypted data together with its authentication tag.
  • On the other end, decrypt the received bytes with the same key and nonce, then convert them back to a string with decode().
  • Display the decrypted string using the print() function to verify the round trip.

Input:

from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes
import base64

# Placeholder for the real network call (does nothing in this example)
def send_data(data, tag):
    pass

# Prompt user to enter sensitive data
sensitive_data = input("Enter sensitive data: ")

# Convert sensitive data to bytes
byte_data = sensitive_data.encode('utf-8')

# Encrypt the byte data using AES algorithm
# AES keys must be 16, 24, or 32 bytes long
key = get_random_bytes(16)
cipher = AES.new(key, AES.MODE_EAX)
nonce = cipher.nonce  # the receiver needs the same nonce to decrypt
encrypted_data, tag = cipher.encrypt_and_digest(byte_data)

# Encode the encrypted data and the tag in base64 format
base64_encrypted_data = base64.b64encode(encrypted_data).decode('utf-8')
base64_tag = base64.b64encode(tag).decode('utf-8')

# Send the encoded data over the network
send_data(base64_encrypted_data, base64_tag)

# Decode the base64 data and tag
byte_encrypted_data = base64.b64decode(base64_encrypted_data)
byte_tag = base64.b64decode(base64_tag)

# Decrypt the byte data using AES algorithm with the same key and nonce
cipher = AES.new(key, AES.MODE_EAX, nonce=nonce)
decrypted_data = cipher.decrypt_and_verify(byte_encrypted_data, byte_tag)

# Convert decrypted byte data to string format
plaintext_decrypted_data = decrypted_data.decode('utf-8')

# Print the decrypted plaintext data for verification
print("Decrypted data:", plaintext_decrypted_data)

  Output:

Enter sensitive data: Satellite access code is EXC#$%^&*@
Decrypted data: Satellite access code is EXC#$%^&*@

By using bytes() constructor

Sample Problem 2:

A program needs to read a binary file containing encoded data which is represented as a sequence of bytes and process the data in Python. The data has to be a bytes object and not a string object for the program to work. How can the binary data be converted to a bytes object in Python?

Solution:

  • Open the binary file containing the encoded data using the open() function, with the 'rb' mode flag to indicate that the file should be opened in binary read mode.
  • Read the contents of the binary file using the read() method, which returns a bytes object.
  • To make a new bytes object, pass the data that you read from the file to the bytes() constructor. Because read() already returns bytes, this step leaves the data unchanged.
  • Store the bytes object in a variable for further processing.

Input:

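# Open the binary file for reading in binary mode
with open('encoded_data.bin', 'rb') as file:

    # Read the file as bytes
    file_data = file.read()

    # Create a bytes object from the binary data
    # (read() already returns bytes, so this call leaves the data unchanged)
    bytes_data = bytes(file_data)

# The with statement closes the file automatically, so no close() call is needed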

Note:- No output is printed on the console. The program reads the contents of the binary file and converts them into a bytes object.
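
The program above needs an existing encoded_data.bin file. The sketch below is a self-contained variant that first writes a small file into a temporary directory and then reads it back. The file contents and the names payload and path are assumptions made only for this example.

```python
import os
import tempfile

# Sample contents for the file: a UTF-8 encoded string
payload = "Hii Rékha Róy!".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "encoded_data.bin")

    # Write the bytes in binary mode
    with open(path, "wb") as f:
        f.write(payload)

    # Read them back in binary mode
    with open(path, "rb") as f:
        file_data = f.read()

print(type(file_data).__name__)   # bytes
print(file_data == payload)       # True
print(file_data.decode("utf-8"))  # Hii Rékha Róy!
```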

By using bytearray() constructor

Sample Problem 3:

We want to create an application that sends encrypted messages over the internet. To prepare a message for encryption, we need to convert the plaintext message into a byte array first. How can this be done in Python?

Solution:

  • Define a function to simulate sending a message over the internet.
  • Prompt the user to enter a plaintext message.
  • Convert the plaintext message to a bytearray using the bytearray() constructor.
  • Pack the bytearray into a binary representation using struct.pack().
  • Send the packed message over the internet using the previously defined function.
  • Upon receiving the message, unpack it with struct.unpack() and convert the bytes back to a plaintext string using the decode() method.
  • Print the unpacked plaintext message for verification.

Input:

import struct

# Dummy function to simulate sending a message over the internet
def dummy_send_message(message):
    print("Message sent over the internet.")

# Prompt user to enter plaintext message
plaintext_message = input("Enter plaintext message: ")

# Convert plaintext message to a bytearray
bytearray_message = bytearray(plaintext_message, 'utf-8')

# Convert Byte Array to a packed binary data representation
packed_message = struct.pack('!{}s'.format(len(bytearray_message)), bytearray_message)

# Send packed messages over the internet
dummy_send_message(packed_message)

# Unpack it into bytes
unpacked_message = struct.unpack('!{}s'.format(len(packed_message)), packed_message)

# Convert bytes back to plaintext string
plaintext_unpacked_message = unpacked_message[0].decode('utf-8')

# Print the unpacked plaintext message for verification
print("Unpacked message:", plaintext_unpacked_message)

  Output:

Enter plaintext message: There is an important meeting at 6 pm.
Message sent over the internet.
Unpacked message: There is an important meeting at 6 pm.

Conclusion

Converting a string to bytes in Python is a fundamental operation that is useful in many applications. In this blog, we have discussed three distinct techniques for transforming a string into bytes.

Each approach has its own use cases and advantages: bytes() is convenient when the data starts out as integers, and bytearray() when the bytes must be modified afterwards. For plain string conversion, however, encode() is the best approach because it is simple, versatile, and widely used. With these techniques we can work effectively with byte data in Python and build powerful applications on top of it.