Binary files

28.04.2019 Programs

The examples considered so far demonstrated formatted input/output of information to files. Formatted file I/O of numbers is advisable only when the numbers are few and small, or when the files must be viewable without special software. Otherwise it is much more efficient to use binary I/O, in which numbers are stored exactly as they are in the computer's main memory (RAM) rather than as character strings. Recall that on typical platforms an int or float value occupies 4 bytes in memory, a double value 8 bytes, and a char value 1 byte. For example, the number 12345 takes 5 bytes in a text (formatted) file and 4 bytes in a binary file.
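The size difference can be checked with a short sketch. The helper names text_size and binary_size are illustrative and do not come from the programs in this article; an in-memory std::ostringstream stands in for a file so that nothing is written to disk.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Number of bytes the value occupies when written as formatted text.
std::size_t text_size(int value) {
    std::ostringstream out;
    out << value;                       // formatted output: one character per digit (and sign)
    return out.str().size();
}

// Number of bytes the value occupies when written in its in-memory form.
std::size_t binary_size(int value) {
    std::ostringstream out(std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    return out.str().size();
}
```

For 12345 the first function counts the five digit characters, while the second always yields sizeof(int), whatever the value.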

Binary files, that is, files in which information is stored in its internal representation, are intended for subsequent use by software; they cannot be viewed without it. The advantages of binary files are that, first, no time is wasted during reading and writing on converting data between the character representation and the internal one, and second, real numbers lose no precision. As with formatted I/O, to process a binary file correctly you must know which data types were written to it, how, and in what order, especially since viewing a binary file in a text editor reveals nothing useful.
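The point about precision can be illustrated with a small sketch. The function names through_text and through_bytes are hypothetical, and an in-memory std::stringstream is used in place of a file. The text version relies on the default stream precision of 6 significant digits; with a sufficiently high precision set explicitly, text output can also preserve the value.

```cpp
#include <sstream>

// Sends a double through formatted text with the default precision.
double through_text(double x) {
    std::stringstream s;
    s << x;                 // only 6 significant digits are written by default
    double y = 0.0;
    s >> y;
    return y;
}

// Sends a double through its raw in-memory bytes.
double through_bytes(double x) {
    std::stringstream s(std::ios::in | std::ios::out | std::ios::binary);
    s.write(reinterpret_cast<const char*>(&x), sizeof x);
    double y = 0.0;
    s.read(reinterpret_cast<char*>(&y), sizeof y);
    return y;
}
```

A value such as 3.14159265358979 comes back from through_text shortened to 3.14159, whereas through_bytes returns it bit for bit.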

Consider an example that demonstrates writing integer elements of a dynamic array to a binary file and reading them from this file.

#include <iostream>
#include <fstream>
#include <conio.h>   // getch(); non-standard header, available with Windows compilers
using namespace std;

int main()
{
    int N, i;
    cout << "Enter the number of integer array elements: "; cin >> N;
    int *mas = new int[N];
    for (i = 0; i < N; i++)
    {
        cout << " Enter element " << i << ": "; cin >> mas[i];
    }
    cout << "\nWriting data to the file..." << endl;
    ofstream fout("c:\\os\\bin.dat", ios::binary); // create an output binary stream
    if (!fout) { cout << "\n Error opening the file!"; getch(); return 1; }
    fout.write(reinterpret_cast<char*>(mas), N * sizeof(int)); // write array to file
    fout.close(); // close the stream
    cout << "Data written successfully!" << endl;
    for (i = 0; i < N; i++) mas[i] = 0; // clear the array so that the read below is verifiable
    ifstream fin("c:\\os\\bin.dat", ios::binary); // create a stream to read the file
    if (!fin) { cout << "\n Error opening the file!"; getch(); return 1; }
    cout << "The file contains:" << endl;
    fin.read(reinterpret_cast<char*>(mas), N * sizeof(int)); // read array from file
    for (i = 0; i < N; i++) cout << mas[i] << " ";
    fin.close();
    delete[] mas; // release the dynamic array
    getch(); return 0;
}

Particular attention in this program should be paid to the functions write() (a method of the ofstream class) and read() (a method of the ifstream class). These functions treat data as bytes and transfer a specified number of bytes from a data buffer to a file and back. Their parameters are the address of the buffer and its length in bytes.

The write() function writes to the file the number of bytes given in the second parameter, taking them from the data buffer whose address is given in the first parameter; the read() function reads data from the file into such a buffer. Note that these functions accept only a buffer address of type char*. For this reason the program uses the reinterpret_cast<char*> operator, which converts the pointer to our int buffer (mas) into a pointer to char.

Remember that a cast with reinterpret_cast is needed only when the first argument of write() or read() is not a character array (a char occupies exactly 1 byte). In addition, to write or read a single variable rather than an array, pass the address of the variable using the address-of operator &, for example:

ofstream fout(filename, ios::app | ios::binary);
fout.write(reinterpret_cast<char*>(&cb), sizeof(float)); // cb is assumed to be a float variable
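A fuller sketch of writing and reading a single variable might look as follows. The functions write_float, read_float and round_trip are illustrative names, not part of the original program. They take std::ostream and std::istream references, so the same code works with an ofstream/ifstream opened with ios::binary or, as in round_trip, with an in-memory stream.

```cpp
#include <istream>
#include <ostream>
#include <sstream>

// Writes one float in its in-memory form; &value is the address of the variable.
void write_float(std::ostream& out, float value) {
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

// Reads one float back; returns false if fewer than sizeof(float) bytes were available.
bool read_float(std::istream& in, float& value) {
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    return static_cast<bool>(in);
}

// Demonstration helper: sends a value through an in-memory binary stream and back.
float round_trip(float value) {
    std::stringstream s(std::ios::in | std::ios::out | std::ios::binary);
    write_float(s, value);
    float result = 0.0f;
    read_float(s, result);
    return result;
}
```

Using sizeof value instead of sizeof(float) keeps the byte count tied to the variable, so it stays correct if the type of the variable is changed later.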

Now consider the second parameter of these functions. In this program the second argument is the expression N * sizeof(int), which computes the number of bytes. For example, for 5 integer array elements the number of bytes is 20. The sizeof operator yields the number of bytes occupied by the data type given as its operand; for example, sizeof(int) is 4 on most current platforms.

So the program in this example writes data in binary form to the file bin.dat and reads it back. After reading, the bytes are again interpreted as int values arranged in an array, and any operations can be performed on them.

Now imagine that you need to write a program that reads data from the file bin.dat, and all we know is that the file contains the elements of an integer array in binary form. The number of stored elements (N) is unknown. In such a program we cannot use a fixed-size array, that is, fix the amount of memory when the program is written, because this leads to wrong results: too small a value of N means that not all elements are read, and too large a value leaves the extra cells filled with arbitrary values.

Consider a program that reads the elements of an integer array from a binary file into dynamically allocated memory and, to check that the data were read correctly, computes their sum.

#include <iostream>
#include <fstream>
#include <conio.h>   // getch(); non-standard header, available with Windows compilers
using namespace std;

int main()
{
    int N, i, sum = 0, dfb; // dfb - file length in bytes
    ifstream fin("c:\\os\\bin.dat", ios::binary);
    if (!fin) { cout << "Error opening the file!"; getch(); return 1; }
    fin.seekg(0, ios::end); // set the read position to the end of the file (0 bytes from the end)
    dfb = static_cast<int>(fin.tellg()); // get the end-of-file position (in bytes)
    N = dfb / 4; // knowing that an integer takes 4 bytes, calculate the number of numbers
    int *arr = new int[N]; // create a dynamic array
    fin.seekg(0, ios::beg); // before reading data, move the current position to the beginning of the file
    fin.read(reinterpret_cast<char*>(arr), dfb);
    cout << "Read " << N << " elements from the file:" << endl;
    for (i = 0; i < N; i++)
    {
        sum = sum + arr[i];
        cout << arr[i] << " ";
    }
    cout << "\n Their sum = " << sum;
    delete[] arr; // release the dynamic array
    getch(); return 0;
}

Let us examine this program in detail. It makes active use of the seekg() and tellg() functions, which are methods of the ifstream class. Every open file has an associated current read or write position. When a file is opened for reading, this position is set to the beginning of the file by default. Quite often, however, the position must be controlled manually so that reading or writing can start at an arbitrary place in the file. The seekg() and tellg() functions set and query the current read position, and seekp() and tellp() do the same for the write position.

The seekg(1_parameter, 2_parameter) method moves the current read position by the number of bytes given in 1_parameter, relative to the place given in 2_parameter. 2_parameter can take one of three values:

ios::beg - from the beginning of the file;

ios::cur - from the current position;

ios::end - from the end of the file.

Here beg, cur and end are constants defined in the ios class, and :: is the scope resolution operator used to access members of that class. For example, the statement fin.seekg(-10, ios::end); sets the current read position 10 bytes before the end of the file.
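As a rough illustration of positioning relative to the end, the sketch below returns the last few bytes of a stream. The function name read_tail is hypothetical; it accepts any std::istream, so it can be tried on an ifstream opened with ios::binary or on an in-memory std::istringstream.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Returns the last `count` bytes of a stream by seeking relative to its end.
std::string read_tail(std::istream& in, std::streamoff count) {
    in.seekg(-count, std::ios::end);                  // `count` bytes before the end
    std::string tail(static_cast<std::size_t>(count), '\0');
    in.read(&tail[0], count);                         // read from that position to the end
    tail.resize(static_cast<std::size_t>(in.gcount()));
    return tail;
}
```

If the stream is shorter than count bytes, the seek fails and the function returns an empty string, because nothing is read from a stream in a failed state.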

Now back to the program. Since we do not know how many numbers are stored in the file, we must first find that out. To do so, fin.seekg(0, ios::end); moves to the end of the file, and tellg(), which returns the current position in bytes, gives the length of the file, which is stored in the variable dfb. Knowing the size of one integer (4 bytes) and the length of the file in bytes, it is easy to compute how many numbers the file holds (N = dfb / 4;). We then create a dynamic array of that size and return to the beginning of the file so that read() can start reading the data. Once the specified number of bytes (dfb) has been transferred into the data buffer (arr), the data have the structure of an array and are ready for any operations and transformations.
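The same idea can be sketched in a slightly more defensive form. This is a variation, not the program above: the function name read_all_ints is an assumption, std::vector replaces new[], the element size is taken from sizeof(int) instead of the literal 4, and a trailing partial element (a file length that is not a multiple of the element size) is ignored rather than read into the buffer.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Reads every int stored in a binary stream; the count is derived from the stream length.
std::vector<int> read_all_ints(std::istream& in) {
    in.seekg(0, std::ios::end);
    const std::streamoff bytes = in.tellg();     // length in bytes, or -1 on failure
    in.seekg(0, std::ios::beg);
    if (bytes <= 0) return {};
    // Whole elements only: an incomplete element at the end is not counted.
    std::vector<int> values(static_cast<std::size_t>(bytes) / sizeof(int));
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(values.size() * sizeof(int)));
    return values;
}
```

Reading values.size() * sizeof(int) bytes rather than the full file length guarantees that read() never writes past the end of the buffer.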

In the above example, the "longest" option is "b": it requires 23 bytes (21 bytes for a string and 2 bytes for an integer). For options "n" and "m", 4 and 5 bytes are required, respectively (see table).


Binary files

Binary files store information in the form in which it is represented in the computer's memory, and are therefore inconvenient for humans. Looking into such a file, you cannot tell what is written in it, and it cannot be created or corrected by hand in a text editor. However, all these inconveniences are offset by the speed of working with the data.

In addition, text files are sequential-access structures, while binary files are direct-access structures. This means that at any moment you can access any element of a binary file, not just the current one.
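Direct access can be sketched in C++ as follows. The function name read_element is illustrative; the stream is assumed to hold int values written in binary form one after another, so element number index starts at byte offset index * sizeof(int).

```cpp
#include <istream>
#include <sstream>

// Reads element number `index` (counting from zero) without reading the elements before it.
// Returns false if the element does not exist.
bool read_element(std::istream& in, std::streamoff index, int& value) {
    in.seekg(index * static_cast<std::streamoff>(sizeof(int)), std::ios::beg);
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    return static_cast<bool>(in);
}
```

A text file offers no such shortcut, because its numbers occupy a varying number of characters and the position of the k-th one cannot be computed in advance.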

Typed files

Variables of structured data types (other than strings) cannot be read from a text file directly. For example, suppose you need to read data from a text file to fill the record toy with information about the toys on sale (the product name, its price, and the age range for which the toy is intended):

type toy = record name: string; price: real;
age: set of 0..18; {defined by boundaries in the file}
end;

then you have to write the following code:

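var f: text; {f is assumed to be a text file variable}
c: char; i, j, min, max: integer;
a: array[1..100] of toy; {bounds taken from the loop below}
begin assign(f, input); reset(f);
for i := 1 to 100 do if not eof(f)
then with a[i] do
begin readln(f, name, price, min, max); age := [];
for j := min to max do age := age + [j];
end;
close(f);
end.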

As you can see, such element-by-element reading is very inconvenient and laborious.

Typed files offer a way out of this situation: their elements can be of any basic or structured data type. The only restriction is that all elements must be of the same type. This apparent inconvenience is an indispensable condition for direct access to the elements of a binary file: as with arrays, if the length of each component of the structure is known exactly, the address of any component can be computed with a very simple formula:

<structure_start> + <component_number> * <component_length>

Description of typed files

In the var section, file variables intended to work with typed files are described as follows:

var <file_variable>: file of <file_element_type>;

No file variable can be constant.

Assigning a typed file

From here until the end of the section, the word "file" means "binary typed file" (unless stated otherwise, of course).

The command assign(f, '<file_name>'); establishes a link between the file variable f and the name of the file for which this variable will be responsible.

The string '<file_name>' may contain the full path to the file. If no path is given, the file is assumed to be in the same directory as the program's executable module.

Opening and Closing a Typed File

Depending on what actions your program is going to do with the file being opened, it can be opened in two ways:

reset (f); - opening a file to read information from it and at the same time to write to it (if such a file does not exist, an attempt to open it will cause an error). The same command is used to return a pointer to the beginning of the file;

rewrite (f); - opening a file to write information into it; if such a file does not exist, it will be created; if a file with the same name already exists, all the information previously contained in it will disappear.

Typed files are closed with the procedure close(f), which is common to all file types.

Reading from a typed file

Reading from a file opened for reading is done with the read() command. The file variable is given first in the parentheses, followed by the input list, for example read(f, a, b, c);.

Only variables whose type matches the file declaration can be read from a file, but that type may be structured. If we return to the example given at the beginning of the section "Typed files", it becomes obvious that using a typed file instead of a text file shortens the program considerably:

type toy = record name: string; price: real;
age: set of 0..18; {given by boundaries}
end;
var f: file of toy;
a: array[1..100] of toy; {bounds taken from the loop below}
i: integer;
begin
assign(f, input); reset(f);
for i := 1 to 100 do
if not eof(f) then read(f, a[i]);
close(f);
end.

Search in a typed file

The familiar function eof(f: file): boolean reports that the end of the file has been reached. All the other end-detecting functions common to text files (eoln(), seekeof() and seekeoln()) cannot be applied to typed files.

But there are special subroutines that allow you to work with typed files as with direct access structures:

1. The function filepos (f: file): longint will report the current position of the pointer in the file f. If it points to the very end of a file containing N elements, then this function will return the result N. This is easily explained: the elements of the file are numbered starting from zero, so the last element is numbered N-1. And number N belongs, thus, to a "non-existent" element - a sign of the end of the file.

2. The filesize (f: file): longint function will calculate the length of the file f.

3. The procedure seek(f: file; n: longint) moves the pointer in file f to the beginning of record number n. If n is greater than the actual length of the file, the pointer is moved beyond the real end of the file.

4. The procedure truncate (f: file) will truncate the "tail" of the file f: all elements from the current to the end of the file will be removed from it. In reality, only the "end of file" attribute will be rewritten to the place where the pointer pointed, and the physically "cut off" values will remain in their original places - they will simply become "ownerless".
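For comparison, the first three routines can be approximated in C++ with stream positions counted in bytes and divided by the element size. This is only a sketch: Record and the function names record_count, record_pos and seek_record are assumptions made for illustration. Standard streams have no counterpart of truncate; in C++17 a file on disk can be shortened with std::filesystem::resize_file.

```cpp
#include <istream>
#include <sstream>

struct Record { int id; double price; };   // hypothetical fixed-size element type

// Rough analogue of filesize(f): number of whole records in the stream.
std::streamoff record_count(std::istream& in) {
    const std::streampos saved = in.tellg();
    in.seekg(0, std::ios::end);
    const std::streamoff bytes = in.tellg();
    in.seekg(saved);                         // restore the previous read position
    return bytes / static_cast<std::streamoff>(sizeof(Record));
}

// Rough analogue of filepos(f): number of the record at the current read position.
std::streamoff record_pos(std::istream& in) {
    return static_cast<std::streamoff>(in.tellg()) /
           static_cast<std::streamoff>(sizeof(Record));
}

// Rough analogue of seek(f, n): move the read position to the start of record n.
void seek_record(std::istream& in, std::streamoff n) {
    in.seekg(n * static_cast<std::streamoff>(sizeof(Record)), std::ios::beg);
}
```

As in the Pascal description, positioning at the very end of a stream of N records makes record_pos report N, the number of the "non-existent" element.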

Writing to a typed file

You can save variables to a file open for writing using the write () command. As in the case of reading, the file variable is indicated first, followed by the output list:

write(f, a, b, c); writes the variables a, b, c to the file f (previously opened for writing by the command rewrite(f) or reset(f)).

Only variables whose type matches the file declaration may be output to a typed file. Unnamed and untyped constants cannot be output to a typed file.

Typed files are considered as structures of both direct and sequential access. This means that writing is possible not only to the very end of the file, but also to any other element of it. The value written will overwrite the previous value in this element (the old value will be "overwritten").

For example, to replace element number 5 of the file (elements are numbered from zero) with the value stored in the variable a, write the following program fragment:

seek(f, 5); {the pointer is positioned at the start of element 5}
write(f, a); {the pointer is now positioned at the start of element 6}
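A C++ counterpart of this fragment might look like the sketch below. The function name overwrite_element is hypothetical and the elements are assumed to be int values. When used on a real file, the stream should be an fstream opened with ios::in | ios::out | ios::binary, because opening an ofstream with ios::out alone discards the existing contents.

```cpp
#include <iostream>
#include <sstream>

// Overwrites element number `index` (counting from zero) with `value`,
// leaving all other elements untouched.
void overwrite_element(std::ostream& out, std::streamoff index, int value) {
    out.seekp(index * static_cast<std::streamoff>(sizeof(int)), std::ios::beg);
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    // The write position now points at the start of element index + 1.
}
```

Because every element has the same length, the new value replaces exactly one old value and nothing after it has to be shifted.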

A binary file is, strictly speaking, any file on your computer. All information on a computer and its storage media is recorded in bits (hence the name). For comparison, however, a text file can be read in viewers that match its extension (the simplest ones even in Notepad), whereas an executable file cannot. And although a txt file is in fact also a binary file, when people talk about the problem of opening the contents of binary files they mean executable files and compressed data.

You will need

- Hex Edit program.

Instructions

Download the Hex Edit program to your hard drive. It is a file editor that presents file contents in binary form. Start the program by double-clicking its executable file. This software lets you read binary files in real time, modify their contents, add your own entries, and much more. To work properly in this environment, you need to know a little about the general concepts of binary files.

The program window differs little from an ordinary editor: the familiar menu and toolbar, the body of the file being edited, bookmarks and the status bar. Open the binary file through the File menu or by clicking the corresponding toolbar icon. The binary file appears as lines of digits and letters. Do not confuse these characters with the printable data of text files. They can be edited and deleted too, but deleting them removes cells with data, that is, pieces of the file.

Make changes to the contents of the file. The application can show important parts of the file for easier searching, and also has flexible configuration of the graphical display of the binary code. Switch the content view to ASCII + IBM / OEM mode to see the program code of the file. If you enter the wrong lines in the file, it may not work correctly, causing serious consequences for the operating system of the personal computer.

Save your changes. If you have no experience in editing files this way, be prepared for the file to stop opening or working after the changes. You will most likely spoil a few copies before you get the job done. Try not to save changes over the original file, so that its contents remain unchanged.

You have probably come across the terms "text" and "binary" when reading articles about files, decided that all this was too difficult to ever figure out, and gave up without delving into it.

We will try to explain everything in the simplest language possible, because this information is useful for every user, even the most inexperienced: it is directly related to the basics of computer security.

A bit of theory

A text file contains ASCII characters (the abbreviation stands for American Standard Code for Information Interchange).

In fact, ASCII is a table in which each letter, digit, punctuation mark and special symbol (such as @ and *) is assigned a code that fits in one byte, that is, eight zeros and ones (bits). It also includes control characters such as line breaks.

A program that opens files with ASCII characters converts the bytes into letters, digits and other printable characters on the display. Of course, the software must also understand the part of the table that corresponds to the Russian language, but that is a question of encoding.
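The correspondence between characters and byte values is easy to observe in code. In the sketch below the function name byte_codes is illustrative; it simply lists the numeric value of every byte in a string.

```cpp
#include <string>
#include <vector>

// Returns the numeric code of every byte in the text.
std::vector<int> byte_codes(const std::string& text) {
    std::vector<int> codes;
    for (unsigned char c : text) {
        codes.push_back(c);          // the byte value, 0..255
    }
    return codes;
}
```

For the string "A@*" the result is 65, 64 and 42, the ASCII codes of those three characters.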

In a binary file, the zeros and ones are arranged in a sequence that is not meant for displaying text (although some binary formats do hold text, for example *.doc). What is it for, you may ask? The answer is simple: for everything else. Programs, films, music, images: each format has its own principles for organizing data.

The word “binary” itself means “two-component”, “double”. Indeed, only two components are clearly defined - zero and one, bits, "bricks" that make up the file. The meaning of everything else can manifest itself only during launch (opening, playback).

The underside of the digital world

You can look inside a binary file using a special program, a HEX editor (from the word hexadecimal, which denotes the base-16 number system). Such software shows bytes as their hexadecimal values, arranged as a table (matrix).

For example, the bytes of a JPEG image, a regular picture or a photograph, in the editor window will be shown as FF D8 FF 00 04 3A 29 and so on.

The specialist will understand that the sequence of bytes FF D8 at the very beginning indicates that we are dealing with JPEG. And for non-specialists, all this is not so interesting.
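The check a specialist performs by eye can be sketched in code. The function name starts_like_jpeg is hypothetical, and the test looks only at the two leading bytes FF D8 mentioned above, so it is a quick hint about the format rather than a validation of the whole file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Returns true if the stream begins with the two bytes FF D8.
bool starts_like_jpeg(std::istream& in) {
    unsigned char head[2] = {0, 0};
    in.read(reinterpret_cast<char*>(head), 2);
    return in.gcount() == 2 && head[0] == 0xFF && head[1] == 0xD8;
}
```

For a file on disk, the stream would be an ifstream opened with ios::binary, so that the bytes arrive unchanged.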

You can also open a text file in a HEX editor to see which bytes correspond to specific letters (ASCII characters). But that is useful only out of curiosity; it serves no practical purpose.

But binary files are sometimes viewed in hexadecimal form for quite meaningful and specific purposes. For example, specialists in antivirus laboratories thus search for malicious code added to the main one. By the way, let's move on to security issues.

What can harm

A text file cannot contain anything other than ASCII characters. However, programs can be not only binary but can also consist of these same characters. This refers to scripts, of course.

In other words, a *.txt file cannot, in principle, be infected and poses no threat. But if a text file contains a script, it can do a lot of mischief.

For example, a *.bat file contains the code of various commands and is launched by double-clicking, like an ordinary program. The commands are written in ASCII characters, but the operating system is able to interpret them, that is, to turn them into the zeros and ones typical of programs.

But, of course, you don't click on unknown bat-files, right? That's good.


Last edit: 2012-11-06 14:45:16


records), then the desire to somehow reduce the unused but occupied memory space is quite understandable.

Records with a variant part exist precisely for such cases.

Description of a record with a variant part

In the var section, a record with a variant part is described as follows:

var <record_name>: record
  <field1>: <type1>;
  [<field2>: <type2>;]
  [...]
  case <switch_field>: <type> of
    <variants1>: (<field3>: <type3>; <field4>: <type4>; ...);
    <variants2>: (<field5>: <type5>; <field6>: <type6>; ...);
    [...]
end;

The description of the non-variant part (before the keyword case) obeys the same rules as the description of an ordinary record. Generally speaking, the non-variant part may be absent altogether.

The variant part begins with the reserved word case, followed by the field of the record that will serve as the switch. As in an ordinary case statement, the switch must belong to one of the ordinal data types (see Lecture 3). A list of variants can be a constant, a range, or a combination of several constants or ranges. The set of fields to be included in the record when the corresponding variant is selected is enclosed in parentheses.

Example. To describe the contents of a library, the following information is needed:

The columns "Title" and "Publisher" are common to all three variants, and the remaining fields depend on the type of printed publication. To implement this structure, we use a record with a variant part:

type biblio = record
  name, publisher: string;
  case item: char of
    'b': (author: string; year: 0..2004);
    'n': (data: date);
    'm': (year: 1700..2004; month: 1..12; number: integer);
end;

Depending on the value of the item field, the record will contain either 4, 5, or 6 fields.

The mechanism for using a record with a variant part

The number of bytes the compiler allocates for a record with a variant part is determined by its "longest" variant. Shorter sets of fields from other variants occupy only part of the allocated memory.

name, publisher	item	Variant part
...	"b"	author			year
...	"n"	data
...	"m"	year	month	number
...	"b"	author			year
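The same memory-sharing rule can be observed with a C++ union, which is only a loose analogue of the Pascal construct. In the sketch below all type names are hypothetical, and each variant is modelled simply as a block of bytes with the sizes quoted in this article (23, 4 and 5 bytes); the real field layouts are not reproduced.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the three variants; only their sizes matter here.
struct VariantB { char bytes[23]; };   // 'b': 21-byte string + 2-byte integer
struct VariantN { char bytes[4];  };   // 'n'
struct VariantM { char bytes[5];  };   // 'm'

// All variants share the same storage, as in a record with a variant part.
union VariantPart {
    VariantB b;
    VariantN n;
    VariantM m;
};

// Size of the shared storage: that of the longest variant.
std::size_t variant_part_size() {
    return sizeof(VariantPart);
}
```

The union is as large as its longest member, so the shorter variants leave part of the storage unused, exactly as the table shows.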
