
Parsing a binary file in PHP

March 2015

Programs written in low-level languages such as C often store their data in binary files, whose contents are raw bytes rather than readable text.

A value stored this way is written as its in-memory byte representation, so it looks like unreadable characters when the file is opened in a text editor.

PHP's file functions return the contents of a file as a string of bytes, so a value stored in raw binary form cannot be used directly as a number. A specific function must be used to retrieve your values.

PHP provides the unpack() function for this purpose. Its first argument is a format string that declares the type of data to recover, and its second argument is the string from which the data is retrieved. Each data type is identified by a format character.

When using low-level languages like C or Pascal, it is common to store data in a binary file (a file whose contents are raw bytes rather than readable text).

Using the C language, if you want to save the value 500 in a file, the code will be as follows:

#include <stdio.h>

int main(void)
{
    int val = 500;
    FILE *fp = fopen("file", "wb");

    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    fwrite(&val, sizeof(int), 1, fp); /* store the raw bytes of val in "file" */
    fclose(fp);
    return 0;
}


If you open this file with a text editor, you will find it unreadable, because the value is not saved as text but in its raw binary form.

In PHP, it is sometimes necessary to retrieve values stored in binary form. However, PHP's file functions return the contents of a file as a string of bytes, not as numbers. A specific function must be used to retrieve your values.

The solution:

The function unpack() can be used to solve this kind of problem. Its first argument declares the type of data you want to recover, and its second argument is the string from which you want to retrieve the data.

The type of data to be recovered must be detailed using the corresponding symbol. For example, to retrieve a signed integer, use the i character.
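
As a minimal sketch of how the format character and the returned array fit together, the snippet below builds the binary string in memory with pack() rather than reading it from disk. The key name value is a hypothetical label chosen for this illustration.

```php
<?php
// Build the raw bytes of a signed integer in memory
// (a stand-in for data that would normally come from a binary file).
$bytes = pack("i", 500);

// "i" = signed integer, machine-dependent size and byte order.
$plain = unpack("i", $bytes);
echo $plain[1], "\n";          // unnamed results start at index 1

// A name written after the format character becomes the array key.
// "value" is a hypothetical key chosen for this example.
$named = unpack("ivalue", $bytes);
echo $named["value"], "\n";
```

Both echo statements print 500, since the same bytes are decoded with the same format character.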

So if we use the file we saved in the example above, here's the code to retrieve our value:

<?php
$fp = fopen("file", "rb");
$data = fread($fp, 4); // 4 is the size in bytes of an int on a 32-bit PC
fclose($fp);
$number = unpack("i", $data);
echo $number[1]; // displays 500
?>
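
The reading step can also be written more defensively. The following is only a sketch under stated assumptions: readInt32() is a hypothetical helper name, the file is assumed to hold one 32-bit signed integer written on a machine with the same byte order, and a temporary file created with pack() stands in for the output of the C program.

```php
<?php
// Hypothetical helper: read one signed 32-bit integer from the start of a file.
// Assumes the writer used the same byte order as the machine running PHP.
function readInt32(string $path): int
{
    $fp = fopen($path, "rb");
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $data = fread($fp, 4);     // 4 bytes = one 32-bit integer
    fclose($fp);

    if ($data === false || strlen($data) !== 4) {
        throw new RuntimeException("Expected 4 bytes in $path");
    }
    // "l" = signed long, always 32 bits, machine byte order.
    return unpack("l", $data)[1];
}

// Usage: create a stand-in for the C program's output, then read it back.
$path = tempnam(sys_get_temp_dir(), "bin");
file_put_contents($path, pack("l", 500));
echo readInt32($path), "\n";
unlink($path);
```

Checking the length before calling unpack() avoids decoding a truncated file, and the l character keeps the size fixed at 32 bits regardless of how PHP itself was compiled.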

Important notes:

  • The data size may change depending on the processor architecture (Sparc, ARM, PowerPC).
  • A C program may use integers of different sizes depending on whether it is compiled for a 32-bit or a 64-bit system.
  • The arrangement of data may not be the same. Some machines store data in Big Endian, others in Little Endian.
  • The data size can vary depending on the compiler.
  • The unpack() function returns an array a little more elaborate than the one used in the example here. In our case, with one requested value, our value is at index 1 of the array.

Data types for a 32-bit PC

The following list gives the unpack() format character for each type of data recorded by a C program compiled for a 32-bit PC:

  • char: c
  • unsigned char: C
  • short: s
  • unsigned short: S
  • int: i
  • unsigned int: I (or L, which is always 32 bits)
  • float: f
  • double: d
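
To illustrate the notes on byte order and the list of format characters, here is a short sketch. It relies on the format characters V and N (unsigned 32-bit, little endian and big endian) and v (unsigned 16-bit, little endian), which fix the size and byte order explicitly. The record layout and the key names flag, count and ratio are hypothetical, chosen only for this example.

```php
<?php
// The same value, 500 (0x000001F4), packed with an explicit byte order.
$little = pack("V", 500);    // "V" = unsigned 32-bit, little endian
$big    = pack("N", 500);    // "N" = unsigned 32-bit, big endian

echo bin2hex($little), "\n"; // f4010000
echo bin2hex($big), "\n";    // 000001f4

// Reading with the matching character recovers the value on any machine.
echo unpack("V", $little)[1], "\n"; // 500
echo unpack("N", $big)[1], "\n";    // 500

// Several fields can be read in one call by separating them with "/".
// Hypothetical record: a signed char, an unsigned 16-bit value, a double.
$record = pack("c", -3) . pack("v", 65000) . pack("d", 1.5);
$fields = unpack("cflag/vcount/dratio", $record);
print_r($fields);
```

When the file format is under your control, writing fixed-width values in a known byte order on the C side and reading them with V or N on the PHP side avoids depending on the architecture of either machine.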

Published by netty5 - Latest update by Paul Berentzen
This document entitled « Parsing a binary file in PHP » from Kioskea (en.kioskea.net) is made available under the Creative Commons license. You can copy and modify copies of this page under the conditions stipulated by the license, as long as this note appears clearly.