Saturday, December 27, 2025

Reading unsigned data in Java

In binary file formats, a single byte often represents a number from 0 to 255 (unsigned). In Java, however, a byte is signed and ranges from -128 to 127, because Java has no unsigned integer types (the one exception is the 2-byte char type). A raw byte 0xFF in a file represents 255, but buffer.get() on a ByteBuffer returns -1 for it. To fix this:

int unsignedByte = buffer.get() & 0xFF;

In Java, the bitwise operators (&, |, ^) never operate on a byte directly. Java automatically "promotes" the byte returned by buffer.get() to a 32-bit signed integer before doing the math. If your byte was 0xFF (which is -1 in signed 8-bit), Java preserves that "-1" value by filling the 24 new high bits with 1s. This is called sign extension.

Original Byte: 11111111 (-1, see two's complement)
Promoted Int : 11111111 11111111 11111111 11111111 (Still -1)

Now that you have a 32-bit integer, you apply the mask 0xFF. In binary, 0xFF as an integer is 00000000 00000000 00000000 11111111.

  11111111 11111111 11111111 11111111 (The promoted -1)
& 00000000 00000000 00000000 11111111 (The 0xFF mask)
  -------------------------------------
  00000000 00000000 00000000 11111111 (The result: 255)

The final 32-bit result is 255. You can safely store it in an int variable, where it stays positive. Casting it back to a byte would turn it into -1 again.
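The promotion and masking steps above can be checked with a small program. This is a minimal sketch, not a definitive implementation: the class name UnsignedByteDemo and the helper readUnsignedByte are hypothetical names chosen for illustration, and the comparison against Byte.toUnsignedInt (a standard library method available since Java 8) is an addition that does not appear in the original text.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; the name is an assumption for illustration.
class UnsignedByteDemo {

    // Reads the next byte and returns it as a value in the range 0..255.
    static int readUnsignedByte(ByteBuffer buffer) {
        // buffer.get() is promoted to int (sign-extended), then the mask
        // clears the upper 24 bits.
        return buffer.get() & 0xFF;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] { (byte) 0xFF, (byte) 0xFF });

        byte raw = buffer.get();               // signed view of 0xFF
        int promoted = raw;                    // sign-extended to 32 bits
        int masked = readUnsignedByte(buffer); // second 0xFF byte, masked

        System.out.println(raw);                               // -1
        System.out.println(Integer.toBinaryString(promoted));  // 32 ones
        System.out.println(masked);                            // 255
        System.out.println(masked == Byte.toUnsignedInt(raw)); // true
    }
}
```

Wrapping the mask in a small helper keeps the conversion in one place, so call sites cannot forget it.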

Reading an unsigned short (16-bit):

int unsignedShort = buffer.getShort() & 0xFFFF;

Reading an unsigned int (32-bit):

long unsignedInt = buffer.getInt() & 0xFFFFFFFFL;

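Note that an unsigned 32-bit integer can exceed the capacity of a Java int: its maximum is 4,294,967,295, while an int stops at 2,147,483,647. You must move up to a long and use a long literal mask (marked by the L suffix). Without the L, 0xFFFFFFFF is an int equal to -1, so the mask changes nothing and the negative value is simply sign-extended into the long.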
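The 16-bit and 32-bit reads can be sketched the same way, along with what happens when the L suffix is left off the mask. The class name UnsignedWideReads and its helper methods are hypothetical, and the sample bytes are made up for illustration. The sketch relies on ByteBuffer's default big-endian byte order, so a little-endian file format would need buffer.order(ByteOrder.LITTLE_ENDIAN) first.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; names and sample data are assumptions.
class UnsignedWideReads {

    // Reads 2 bytes and returns them as a value in the range 0..65535.
    static int readUnsignedShort(ByteBuffer buffer) {
        return buffer.getShort() & 0xFFFF;
    }

    // Reads 4 bytes and returns them as a value in the range 0..4294967295.
    static long readUnsignedInt(ByteBuffer buffer) {
        return buffer.getInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = {
            (byte) 0xFF, (byte) 0xFE,                          // 0xFFFE
            (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFE // 0xFFFFFFFE
        };
        ByteBuffer buffer = ByteBuffer.wrap(data); // big-endian by default

        System.out.println(readUnsignedShort(buffer)); // 65534
        System.out.println(readUnsignedInt(buffer));   // 4294967294

        int raw = 0xFFFFFFFE;           // -2 as a signed int
        long wrong = raw & 0xFFFFFFFF;  // int mask is -1: no effect, then sign-extended
        long right = raw & 0xFFFFFFFFL; // long mask clears the upper 32 bits
        System.out.println(wrong);      // -2
        System.out.println(right);      // 4294967294
    }
}
```

The wrong variable shows why the suffix matters: the expression is evaluated entirely as an int and only widened to a long afterwards, by which point the sign has already been kept.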