How To Convert String To Bytes In Java

Converting a string to bytes is a common operation in Java, needed for network communication, file I/O, and cryptography. It means encoding the string’s characters as a sequence of bytes.

This is done with built-in Java classes such as String and Charset, which provide methods for encoding strings into bytes and decoding bytes back into strings using character sets such as US-ASCII or UTF-8. The resulting bytes can be stored in a byte array or written to an output stream.

Keep in mind that encoding can lose information: a character that the chosen character set cannot represent is typically replaced with a substitute such as “?”, and decoding bytes with a different character set from the one used to encode them produces garbled text. Choosing the right character set for the context is therefore critical.
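
The information loss described above can be seen in a small round trip. The sketch below is only an illustration: the class name LossyEncodingDemo and the helper roundTrip() are made up for this example. It encodes a string containing “é” with US-ASCII and with UTF-8, decodes it again, and checks whether the original text survived.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LossyEncodingDemo {
    // Hypothetical helper: encodes text with the given charset,
    // then decodes the bytes again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues

        // US-ASCII cannot represent 'é', so getBytes() substitutes '?'
        System.out.println(text.equals(roundTrip(text, StandardCharsets.US_ASCII))); // false

        // UTF-8 can represent every Unicode character, so nothing is lost
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8)));    // true
    }
}
```

The comparison is printed as a boolean rather than as text so that the result does not depend on the console’s own encoding.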

Why do we need to convert a string to bytes in Java

Here are some reasons why we might need to convert a string to bytes in Java:

  • Network communication: Data sent across a network travels as a sequence of bytes, so text must be encoded into bytes before transmission.
  • File I/O: Files store bytes. To write string data to a file, the string is first converted to bytes.
  • Cryptography: Cryptographic algorithms operate on bytes rather than characters, so a string must be converted to bytes before it is encrypted or hashed.
  • Serialization: Serializing an object turns it into a sequence of bytes, and any string data the object contains is converted to bytes as part of that process.
  • Encoding: Some systems require text in a specific character set. When converting a string to bytes, we can specify which character set to use.
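
As one illustration of the cryptography case, a hash function such as SHA-256 accepts only bytes, so the string has to be encoded first. The following is a minimal sketch, not a recommendation for any particular security design; the class name HashStringDemo and the helper sha256Hex() are assumptions made for this example, and UTF-8 is chosen explicitly so the hash is the same on every machine.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HashStringDemo {
    // Hypothetical helper: returns the SHA-256 hash of the string's
    // UTF-8 bytes as lowercase hexadecimal text.
    static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // The digest works on bytes, so the string is encoded first
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
    }
}
```

If the same string were encoded with a different character set, the bytes and therefore the hash could differ, which is why the charset is fixed here.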

Ways to convert a string to bytes in Java

Here are some approaches for converting a string to bytes in Java:

  • Using the getBytes() method of the String class.
  • Using the Charset class and its encode() method.
  • Using the ByteArrayOutputStream and DataOutputStream classes.

Approach 1: Using the getBytes() method of the String class.

In Java, getBytes() is a built-in method of the String class. Called with no arguments, it encodes the string using the platform’s default character encoding and returns the result as a byte array. Since JDK 18 the default is UTF-8 on every platform; on older versions it depends on the operating system and locale.

For simple encoding tasks, use the getBytes() method. Note that the no-argument form relies on the platform’s default character encoding, which may not be appropriate in all circumstances. Overloads that accept a charset, such as getBytes(StandardCharsets.UTF_8), avoid that dependency.

class StringToBytesExample {
    public static void main(String[] args) {
        String message = "Hello, world!";
        byte[] bytes = message.getBytes();

        System.out.println("Original message: " + message);
        System.out.print("Byte array: [");
        for (int i = 0; i < bytes.length; i++) {
            System.out.print(bytes[i]);
            if (i != bytes.length - 1) {
                System.out.print(", ");
            }
        }
        System.out.println("]");
    }
}

Output:

Original message: Hello, world!
Byte array: [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]

Explanation:

  • A string variable message is initialized with the value “Hello, world!”.
  • The getBytes() method is called on the message string to get a byte array that contains the encoded representation of the string.
  • The resulting byte array is then printed using a for loop and print() calls. The loop iterates over each byte in the array and prints it, separated by commas. The if statement ensures that a comma is not printed after the last byte.
  • The output shows the original message and the resulting byte array, where each byte is printed as a decimal number.
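
Because the no-argument getBytes() depends on the platform default, it is often safer to name the character set. The sketch below shows the getBytes(Charset) overload together with the matching String constructor for decoding; the class name ExplicitCharsetExample and the helper toUtf8() are illustrative names only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ExplicitCharsetExample {
    // Hypothetical helper: encodes the string as UTF-8 regardless of platform settings
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toUtf8("Hello, world!");

        // Arrays.toString() prints the array contents in one call
        System.out.println(Arrays.toString(bytes));
        // [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]

        // Decoding with the same charset restores the original text
        System.out.println(new String(bytes, StandardCharsets.UTF_8)); // Hello, world!
    }
}
```

Arrays.toString() is also a shorter alternative to the manual printing loop used in the example above.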

Approach 2: Using the Charset class and its encode() method.

In Java, the Charset class represents a character encoding used to convert between characters and bytes. Its encode() method converts a String into a ByteBuffer of bytes using that character set.

This approach gives the programmer more control over the encoding process, because the desired character set is specified directly. It is useful when working with non-ASCII characters or when encoding data for network transmission or file I/O.

import java.nio.ByteBuffer;
import java.nio.charset.Charset;

class StringToBytesExample {
    public static void main(String[] args) {
        String message = "Hello, world!";
        Charset utf8 = Charset.forName("UTF-8");
        ByteBuffer buffer = utf8.encode(message);

        System.out.println("Original message: " + message);
        System.out.print("Byte array: [");
        for (int i = 0; i < buffer.limit(); i++) {
            System.out.print(buffer.get(i));
            if (i != buffer.limit() - 1) {
                System.out.print(", ");
            }
        }
        System.out.println("]");
    }
}

Output:

Original message: Hello, world!
Byte array: [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]

Explanation:

  • A string variable message is initialized with the value “Hello, world!”.
  • Charset.forName("UTF-8") returns a Charset object representing the UTF-8 character set, which is stored in the variable utf8.
  • The encode() method of utf8 is called on message and returns a ByteBuffer containing the UTF-8 encoded bytes.
  • The contents of the ByteBuffer are then printed using a for loop and print() calls. The loop reads each byte below buffer.limit() with buffer.get(i) and prints it, separated by commas. The if statement ensures that a comma is not printed after the last byte.
  • The output shows the original message and the resulting bytes, where each byte is printed as a decimal number.
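
One form of extra control is rejecting characters that the target character set cannot represent instead of silently replacing them. The sketch below uses CharsetEncoder with CodingErrorAction.REPORT for that purpose; the class name StrictEncodingExample and the helper encodeStrict() are assumptions made for this example, and wrapping the failure in IllegalArgumentException is just one possible design.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictEncodingExample {
    // Hypothetical helper: encodes text and fails if any character
    // cannot be represented in the given charset.
    static byte[] encodeStrict(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buffer = encoder.encode(CharBuffer.wrap(text));
            // Copy only the encoded bytes; the backing array may be larger
            byte[] bytes = new byte[buffer.remaining()];
            buffer.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Cannot encode text as " + charset, e);
        }
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo \u20ac"; // "héllo €": 7 characters

        // In UTF-8, 'é' takes 2 bytes and '€' takes 3, so 7 characters become 10 bytes
        System.out.println(encodeStrict(text, StandardCharsets.UTF_8).length); // 10

        try {
            encodeStrict(text, StandardCharsets.US_ASCII);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The example also shows that the number of bytes can exceed the number of characters once the text contains non-ASCII characters.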

Approach 3: Using the ByteArrayOutputStream and DataOutputStream classes.

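Java’s ByteArrayOutputStream and DataOutputStream classes let you build a byte array that contains a string alongside other binary data. By writing to a DataOutputStream that is backed by a ByteArrayOutputStream, the programmer controls which values are written and in what order. The format of each value is fixed, however: multi-byte values are always written in big-endian order, and writeUTF() writes a two-byte length followed by the string in modified UTF-8, which differs slightly from standard UTF-8. This method suits data that will be read back with DataInputStream; for a network protocol or file format that expects plain UTF-8, one of the first two approaches is a better fit.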

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class StringToBytesExample {
    public static void main(String[] args) {
        String message = "Hello, world!";
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(baos);

        try {
            dos.writeUTF(message);
            dos.close();
        } catch (IOException e) {
            e.printStackTrace();
        }

        byte[] bytes = baos.toByteArray();

        System.out.println("Original message: " + message);
        System.out.print("Byte array: [");
        for (int i = 0; i < bytes.length; i++) {
            System.out.print(bytes[i]);
            if (i != bytes.length - 1) {
                System.out.print(", ");
            }
        }
        System.out.println("]");
    }
}

Output:

Original message: Hello, world!
Byte array: [0, 13, 72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]

Explanation:

  • A string variable message is initialized with the value “Hello, world!”.
  • A ByteArrayOutputStream baos is created to collect, in memory, the bytes written by the DataOutputStream.
  • A DataOutputStream dos is created and is backed by baos.
  • The writeUTF() method is called on dos to write the string. It writes a two-byte length followed by the string encoded in modified UTF-8.
  • The DataOutputStream is closed.
  • The contents of the ByteArrayOutputStream are converted to a byte array using the toByteArray() method.
  • The byte array is printed using a for loop and print() calls. The loop iterates over each byte in the array and prints it, separated by commas. The if statement ensures that a comma is not printed after the last byte.
  • The output shows the original message and the resulting byte array. The first two bytes are a big-endian header giving the number of encoded bytes that follow (13 here). The remaining bytes are the encoded characters, which for this ASCII-only string match their ASCII values.
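
To make the two-byte header concrete, the sketch below decodes it by hand and compares the rest of the array with plain UTF-8 bytes. The class name WriteUtfHeaderExample and the helpers writeUtfBytes() and payloadLength() are illustrative names, not part of the Java library.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class WriteUtfHeaderExample {
    // Hypothetical helper: returns exactly what writeUTF() produces for the text
    static byte[] writeUtfBytes(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Hypothetical helper: reads the unsigned big-endian 16-bit length header
    static int payloadLength(byte[] framed) {
        return ((framed[0] & 0xFF) << 8) | (framed[1] & 0xFF);
    }

    public static void main(String[] args) {
        String message = "Hello, world!";
        byte[] framed = writeUtfBytes(message);

        // The header holds the number of encoded bytes that follow
        System.out.println(payloadLength(framed)); // 13

        // For ASCII-only text, the payload equals the plain UTF-8 bytes
        byte[] payload = Arrays.copyOfRange(framed, 2, framed.length);
        System.out.println(Arrays.equals(payload, message.getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

The equality holds for this ASCII-only message. Modified UTF-8 encodes the character U+0000 and supplementary characters differently from standard UTF-8, so the payload should not be assumed to be plain UTF-8 in general.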

Best Approach

Here are some reasons why the getBytes() method of the String class might be a good choice:

  • It is a simple and straightforward way to convert a string to bytes.
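  • With no argument it uses the platform’s default character encoding, which is convenient but can differ between machines; passing a charset such as StandardCharsets.UTF_8 makes the result predictable.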
  • It doesn’t require any external libraries or additional setup.
  • It returns a byte array, which is a common and versatile data type in Java.
  • It can be used in conjunction with other Java classes that work with byte arrays, such as InputStream and OutputStream.
  • It is widely used and well-documented, so it is easy to find examples and tutorials on how to use it.

Sample Problems

Sample Problem 1:

Write a Java program that reads a string from the console and converts it to bytes using the getBytes() method of the String class. Then, write the bytes to a binary file using the FileOutputStream class.

Solution:

  • The program reads a line from the console using BufferedReader and InputStreamReader.
  • The String class’s getBytes() method converts the input string to a byte array. UTF-8 is passed explicitly so that the result does not depend on the platform default.
  • The program opens a FileOutputStream and writes the byte array to a binary file using the write() method.
  • The try-with-resources statements close the reader and the FileOutputStream automatically, freeing system resources even if an exception occurs.
  • The program prints a success message indicating the number of bytes written to the binary file.
  • The catch block handles any IOException that may occur during console or file I/O.

import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class StringToBytesToFileExample {
    public static void main(String[] args) {
        // try-with-resources closes the reader when the block exits
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(System.in))) {
            // read a string from the console
            System.out.print("Enter a string: ");
            String inputString = reader.readLine();
            if (inputString == null) {
                // readLine() returns null when the input ends without any data
                System.err.println("Error: no input received");
                return;
            }

            // convert the string to bytes
            byte[] bytes = inputString.getBytes(StandardCharsets.UTF_8);

            // write the bytes to a binary file
            try (FileOutputStream outputStream = new FileOutputStream("output.bin")) {
                outputStream.write(bytes);
            }

            System.out.println("Successfully wrote " + bytes.length + " bytes to output.bin");
        } catch (IOException e) {
            System.err.println("Error: " + e.getMessage());
        }
    }
}

Output:

Enter a string: Hello there!!! I am here to help you.
Successfully wrote 37 bytes to output.bin
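
As a possible alternative to FileOutputStream, the java.nio.file.Files class can write and read a byte array in a single call each. The sketch below is only one way to do this; the class name FilesWriteExample and the helper roundTripThroughFile() are made up for the example, and it uses a temporary file rather than output.bin so that it leaves nothing behind.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FilesWriteExample {
    // Hypothetical helper: writes the UTF-8 bytes of text to a temporary file,
    // reads them back, and returns the decoded string.
    static String roundTripThroughFile(String text) {
        try {
            Path file = Files.createTempFile("string-bytes", ".bin");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                byte[] read = Files.readAllBytes(file);
                return new String(read, StandardCharsets.UTF_8);
            } finally {
                // Remove the temporary file whether or not the round trip succeeded
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "Hello there!!! I am here to help you.";
        System.out.println(text.equals(roundTripThroughFile(text))); // true
    }
}
```

Using the same character set for writing and reading is what makes the round trip lossless.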

Sample Problem 2:

Create a Java class that has a method that takes a string as input and returns a byte array representing the string encoded in UTF-8. The method should use the Charset class and its encode() method to perform the conversion.

Solution:

  • The StandardCharsets.UTF_8 constant provides a Charset object for the UTF-8 character set.
  • The convertToUTF8() method takes a String parameter and returns a byte array representing the input string encoded in UTF-8.
  • The encode() method of the Charset class returns a ByteBuffer. The method copies exactly buffer.remaining() bytes from it into a new array. Calling array() on the buffer directly is unreliable, because the backing array may be larger than the encoded content and end with extra zero bytes.
  • The main() method provides an example usage of the convertToUTF8() method.
  • The byte array is printed with Arrays.toString(). Concatenating the array itself with a string would print only its type and hash code, such as [B@1fb3ebeb.

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class StringToBytesConverter {

    public static byte[] convertToUTF8(String str) {
        // get the UTF-8 charset object
        Charset utf8 = StandardCharsets.UTF_8;

        // encode the input string using the UTF-8 charset
        ByteBuffer buffer = utf8.encode(str);

        // copy only the encoded bytes; the backing array may be larger
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);

        return bytes;
    }

    public static void main(String[] args) {
        // example usage of the convertToUTF8 method
        String input = "Hello, world!";
        byte[] bytes = convertToUTF8(input);

        System.out.println("Input string: " + input);
        System.out.println("UTF-8 bytes: " + Arrays.toString(bytes));
    }
}

Output:

Input string: Hello, world!
UTF-8 bytes: [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]

Sample Problem 3:

Write a Java program that reads a string from a text file and serializes it into a byte array using the DataOutputStream class. Then, write the byte array to a binary file using the FileOutputStream class. Finally, read the byte array from the binary file and deserialize it back into a string using the DataInputStream class.

Solution:

  • The program reads a string from a text file using BufferedReader and FileReader.
  • The DataOutputStream class is used to serialize the input string into a byte array. First, a ByteArrayOutputStream object is created to write the serialized data into a byte array. Then, a DataOutputStream object is created that writes the serialized data into the ByteArrayOutputStream.
  • The byte array is written to a binary file using the FileOutputStream class’s write() method.
  • The FileInputStream class is used to read the byte array from the binary file.
  • A ByteArrayInputStream object is created from the byte array to be deserialized.
  • A DataInputStream object is created to read the deserialized data from the ByteArrayInputStream.
  • The DataInputStream object’s readUTF() method is used to deserialize the byte array into a string.
  • The program prints the deserialized string to the console.
  • Each stream is opened in a try-with-resources statement so that it is closed automatically, and the catch block handles any IOException that may occur during file I/O operations.

import java.io.*;

class SerializeDeserializeString {
    public static void main(String[] args) {
        try {
            // Read the first line of the input file
            String inputString;
            try (BufferedReader reader = new BufferedReader(new FileReader("input.txt"))) {
                inputString = reader.readLine();
            }
            if (inputString == null) {
                // readLine() returns null for an empty file; writeUTF() would fail on null
                System.err.println("input.txt is empty");
                return;
            }

            // Serialize input string into a byte array
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (DataOutputStream dos = new DataOutputStream(baos)) {
                dos.writeUTF(inputString);
            }
            byte[] bytes = baos.toByteArray();

            // Write byte array to binary file
            try (FileOutputStream fos = new FileOutputStream("output.bin")) {
                fos.write(bytes);
            }

            // Read byte array from binary file (readAllBytes() requires Java 9 or later)
            byte[] bytesFromFile;
            try (FileInputStream fis = new FileInputStream("output.bin")) {
                bytesFromFile = fis.readAllBytes();
            }

            // Deserialize the byte array into a string
            String outputString;
            try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytesFromFile))) {
                outputString = dis.readUTF();
            }

            // Print deserialized string to console
            System.out.println(outputString);

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Output:

A B C D E F

[Note: To run this code, first create a file named “input.txt” in the working directory. The program reads only its first line; the output above assumes that line is “A B C D E F”.]

Conclusion

Depending on the application’s needs, there are several ways to convert a string to bytes in Java. The getBytes() method of the String class, the encode() method of the Charset class, and the combination of the ByteArrayOutputStream and DataOutputStream classes are three commonly used ways.

The getBytes() method is the simplest way to convert a string to bytes. With no argument it uses the platform’s default character encoding. It needs no additional libraries or configuration and works with other Java classes that operate on byte arrays. It may not be appropriate, however, when the output must use a specific character encoding, unless that charset is passed as an argument.

When a specific character encoding is required, the Charset class’s encode() method can be used. The chosen character set determines exactly which bytes are produced, which matters when the data is shared with other systems.

The ByteArrayOutputStream and DataOutputStream classes let the developer write binary data to an in-memory stream. This technique is useful when the program has to combine strings with other binary values or write data that DataInputStream will read back.

Ultimately, the choice of approach depends on the application’s requirements; all three are viable, and each has advantages in the right scenario.