Data compression methods have a long history of development that began well before the advent of the first computer. This article gives a brief overview of the main theories, concepts, and their implementations, without claiming to be complete. More detailed information can be found, for example, in the works of R.E. Krichevsky, B.Ya. Ryabko, I.H. Witten, J. Rissanen, D.A. Huffman, R.G. Gallager, D.E. Knuth, J.S. Vitter, and others.

Information compression is a problem whose history is much longer than that of computer technology and has usually run parallel to the history of encoding and encrypting information. All compression algorithms operate on an input stream of information whose minimum unit is a bit and whose maximum unit is several bits, a byte, or several bytes. The purpose of compression, as a rule, is to obtain a more compact output stream of information units from an initially non-compact input stream by means of some transformation. The main technical characteristics of compression processes and their results are:

The degree of compression (compression ratio) - the ratio of the volumes of the original and resulting streams;

Compression speed - the time spent compressing a given amount of information from the input stream into an equivalent output stream;

Compression quality - a value that shows how tightly the output stream is packed, measured by applying re-compression to it with the same or another algorithm.

There are several different approaches to the problem of information compression. Some have a very complex theoretical and mathematical basis; others rely on properties of the information stream and are algorithmically quite simple. Any approach or algorithm that implements data compression is designed to reduce the volume of the output information stream in bits by means of a reversible or irreversible transformation. Therefore, all compression methods can first of all be divided, according to the nature or format of the data, into two categories: reversible and irreversible compression.

Irreversible compression is a transformation of the input data stream in which the output stream, interpreted according to a certain information format, represents an object that is quite similar to the input in its external characteristics but differs from it in volume. The degree of similarity between the input and output streams is determined by how closely certain properties of the object represented by the stream (that is, of the compressed and uncompressed information under some specific data format) correspond. Such approaches and algorithms are used, for example, to compress raster graphics files with a low degree of byte repetition in the stream. They exploit the structure of the graphic file format and the fact that an image of approximately the same display quality (as perceived by the human eye) can be represented in several (more precisely, n) ways. Therefore, in addition to the degree of compression, such algorithms introduce the concept of quality: since the original image changes during compression, quality can be understood as the degree of correspondence between the original and resulting images, assessed subjectively on the basis of the information format. For graphic files this correspondence is determined visually, although there are also intelligent algorithms and programs for the purpose. Irreversible compression cannot be used where the information structure of the input and output streams must match exactly. This approach is implemented in popular formats for photo and video information, known as the JPEG and JFIF algorithms and the JPG and JIF file formats.

Reversible compression reduces the volume of the output information stream without changing its information content, that is, without loss of information structure. Moreover, the input stream can be recovered from the output stream by a reconstruction algorithm; this recovery process is called decompression or unpacking, and only after it is the data suitable for processing in accordance with its internal format.

In reversible algorithms, encoding can be viewed from a statistical point of view, which is useful not only for constructing compression algorithms but also for assessing their effectiveness. Every reversible algorithm has a coding cost: the average length of a codeword in bits. Coding redundancy is the difference between the coding cost and the entropy of the source, and a good compression algorithm should minimize redundancy (recall that the entropy of information is a measure of its disorder). Shannon's fundamental theorem on information encoding states that “the cost of encoding is always no less than the entropy of the source, although it can be arbitrarily close to it.” Therefore, for any algorithm there is always a limit on the degree of compression, determined by the entropy of the input stream.
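The relationship between coding cost, entropy, and redundancy can be checked numerically. The sketch below is only an illustration: the sample data and the code lengths are hypothetical, and the entropy is estimated from the empirical byte frequencies of the sample rather than from a known source model.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Shannon entropy of the empirical byte distribution, in bits per symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def coding_cost(data: bytes, code_lengths: dict) -> float:
    """Average codeword length in bits, weighted by symbol frequency."""
    counts = Counter(data)
    total = len(data)
    return sum(counts[s] * code_lengths[s] for s in counts) / total


# Hypothetical sample: four symbols with frequencies 1/2, 1/4, 1/8, 1/8.
sample = b"aaaabbcd"
# Lengths of a prefix code matched to those frequencies: a=0, b=10, c=110, d=111.
lengths = {ord("a"): 1, ord("b"): 2, ord("c"): 3, ord("d"): 3}

h = entropy_bits_per_symbol(sample)
cost = coding_cost(sample, lengths)
redundancy = cost - h  # never negative for a uniquely decodable code
print(h, cost, redundancy)
```

For this sample the code lengths match the symbol frequencies exactly, so the cost equals the entropy and the redundancy is zero; a fixed 2-bit code for the same four symbols would cost 2 bits per symbol and leave 0.25 bits of redundancy.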

Let us now move directly to the algorithmic features of reversible algorithms and consider the most important theoretical approaches to data compression associated with the implementation of encoding systems and methods of information compression.

Compression by run-length encoding

The best-known simple approach to reversible compression is Run Length Encoding (RLE). The essence of the methods in this family is to replace chains, or runs, of repeating bytes or byte sequences with one coding byte and a counter of the number of repetitions. The only difficulty in all such methods is determining how the decompressing algorithm can distinguish an encoded run from other, unencoded byte sequences in the resulting byte stream. The problem is usually solved by placing marks at the beginning of the coded runs. Such marks can be, for example, characteristic bit values in the first byte of a coded run, the value of the first byte of a coded run, and so on. These methods are, as a rule, quite effective for compressing raster graphics (BMP, PCX, TIF, GIF), because such images contain many long runs of repeating byte sequences. The disadvantage of RLE is its rather low compression ratio, that is, its high coding cost, on files with few runs and, even worse, with few repeating bytes per run.
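A minimal sketch of one such marking scheme follows. It is an assumption chosen for illustration, not the scheme of any particular archiver: a byte whose two top bits are set marks a run, its low six bits hold the repeat count, and the next byte is the repeated value (this resembles the marker used by the PCX format). Literal bytes that happen to have both top bits set must be written as runs of length one.

```python
RUN_MARK = 0xC0   # two top bits set: "this byte starts a coded run"
MAX_RUN = 0x3F    # the low six bits hold the repeat count (1..63)


def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == value and run < MAX_RUN:
            run += 1
        if run > 1 or value >= RUN_MARK:
            # Coded run: marker byte with the count, then the repeated value.
            out.append(RUN_MARK | run)
            out.append(value)
        else:
            # Plain literal byte, copied unchanged.
            out.append(value)
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(packed):
        byte = packed[i]
        if byte >= RUN_MARK:
            count = byte & MAX_RUN
            out.extend(bytes([packed[i + 1]]) * count)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


packed = rle_encode(b"AAAAAAB")
print(packed, rle_decode(packed))
```

The weakness described above shows up directly: input with no runs gains nothing, and every literal byte of value 0xC0 or higher grows to two bytes, so such data can come out larger than it went in.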

Compression without using the RLE method

The process of data compression without using the RLE method can be divided into two stages: modeling and encoding proper. These processes and the algorithms that implement them are largely independent and diverse.

Coding process and its methods

Coding usually means processing a stream of characters (in our case, bytes or nibbles) over some alphabet, where the characters appear in the stream with different frequencies. The purpose of encoding is to convert this stream into a bit stream of minimum length, which is achieved by reducing the redundancy of the input stream, taking symbol frequencies into account. The length of the code representing a character of the stream alphabet should be proportional to the amount of information that character carries, and the length of stream characters in bits need not be a multiple of 8 and may even be variable. If the probability distribution of the symbols of the input alphabet is known, an optimal coding model can be constructed. However, because of the huge number of different file formats, the task becomes much more complicated: the frequency distribution of data symbols is not known in advance. In this case, two approaches are generally used.

The first is to scan the input stream and construct an encoding based on the collected statistics. This requires two passes through the file, one to collect statistical information and one to encode, which somewhat limits the scope of such algorithms: it rules out the single-pass, on-the-fly encoding used in telecommunication systems, where the volume of data is sometimes unknown and retransmitting or re-parsing it can take an unreasonably long time. In this case the statistical scheme of the encoding used is written to the output stream. This method is known as static Huffman coding.
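The two passes can be sketched as follows. This is a simplified illustration, not a production codec: the function names are hypothetical, codes are kept as strings of "0" and "1" rather than packed bits, and writing the code table to the output stream is left out.

```python
import heapq
from collections import Counter


def huffman_code(data: bytes) -> dict:
    """Pass 1: count symbol frequencies and build a prefix code from them."""
    counts = Counter(data)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, smallest symbol in the subtree, {symbol: code}).
    # The second field is unique per entry, so the dicts are never compared.
    heap = [(w, sym, {sym: ""}) for sym, w in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, t1, left = heapq.heappop(heap)
        w2, t2, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, min(t1, t2), merged))
    return heap[0][2]


def encode(data: bytes, code: dict) -> str:
    """Pass 2: replace every symbol with its codeword."""
    return "".join(code[b] for b in data)


def decode(bits: str, code: dict) -> bytes:
    inverse = {c: s for s, c in code.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix property: first match is the symbol
            out.append(inverse[current])
            current = ""
    return bytes(out)


data = b"abracadabra"
code = huffman_code(data)
bits = encode(data, code)
assert decode(bits, code) == data
print(len(bits), "bits instead of", 8 * len(data))
```

The most frequent symbol receives the shortest codeword. A real implementation would also have to store the code table (or the frequencies) ahead of the bit stream, which is the overhead the paragraph above refers to.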

Introduction

We use archivers all the time. Our website has a detailed (albeit long-standing) description of the most popular archiver programs (Archivers: A look from the outside), which we will not repeat here; instead we deal only with the compression algorithms used in these programs. What is the problem? Modern archivers let us choose from several compression algorithms. Here, for example, are the characteristics of some programs...

Formats supported by archivers

Archiver | Packing and unpacking | Unpacking only
WinZip | ZIP | TAR, GZIP, BH, ARJ, LZH, ARC
WinRar | RAR, ZIP | CAB, ARJ, LZH, TAR, GZ, ACE, UUE, BZ2, JAR, ISO
WinAce | ACE, ZIP, LHA, MSCAB | RAR, ARC, ARJ, GZIP, TAR, ZOO
7-Zip | 7Z, ZIP, GZIP, TAR, BZIP2 | RAR, CAB, ARJ, CPIO, RPM, DEB, SPLIT
Power Archiver | TAR, BH, CAB, LHA, ZIP | RAR, ACE, ARJ, GZIP, BZIP2, ARC, ZOO

Depending on the circumstances, we use an archiver as a compressor, to shrink information for faster transmission over communication channels (mail and the Internet). In other cases the archiving function itself matters more, that is, converting information into a compact form (one file) in order to keep scattered files together and, in addition, to reduce the disk space lost to file-table overhead. Accordingly, both the degree of compression and the speed of processing of the original information are of great interest. The purpose of our study is to determine the absolute and relative compression and performance figures of the algorithms (formats) offered by the archivers listed in the table...

The content of the study is planned as follows:

1. Creating a mixed set and separate per-file-type sets of information (folders) for testing.

2. Running preliminary tests on the mixed set and refining, on the basis of the results, the plan for the further per-type tests.

3. Processing and analyzing the results and substantiating recommendations for the practical use of the different archiving algorithms (formats).

As the indicator of the degree of compression we take the size of the compressed folder as a percentage of its original size, and as the indicator of performance we take the processing speed: the original size in kilobytes divided by the processing time in seconds. Only time is actually measured (with a stopwatch). A time measurement error can distort the performance figure when it is very large (more than 1000 KB/s); in other cases the error can be ignored.
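Both indicators are simple ratios, and a short sketch makes the stopwatch caveat concrete. The function names are hypothetical, and the one-second stopwatch error is an assumption made for illustration; the sizes and times are taken from the result tables of this article.

```python
def compression_percent(original_kb: float, packed_kb: float) -> float:
    """Packed size as a percentage of the original size (smaller is better)."""
    return 100.0 * packed_kb / original_kb


def speed_kb_per_s(original_kb: float, minutes: int, seconds: int) -> float:
    """Processing speed: original size divided by elapsed time in seconds."""
    return original_kb / (minutes * 60 + seconds)


def speed_spread(original_kb: float, minutes: int, seconds: int,
                 error_s: float = 1.0) -> float:
    """Relative spread of the speed figure for a stopwatch error of +/- error_s."""
    t = minutes * 60 + seconds
    fast = original_kb / (t - error_s)
    slow = original_kb / (t + error_s)
    return (fast - slow) / (original_kb / t)


ORIGINAL_KB = 208893  # size of the mixed test basket

# WinZip, normal mode: 146408 KB in 2 min 00 s.
print(compression_percent(ORIGINAL_KB, 146408))
print(speed_kb_per_s(ORIGINAL_KB, 2, 0))

# A fast run (0 min 58 s) is more sensitive to the stopwatch than a slow one (8 min 30 s).
print(speed_spread(ORIGINAL_KB, 0, 58), speed_spread(ORIGINAL_KB, 8, 30))
```

The shorter the measured time, the larger the share of it that a fixed stopwatch error represents, which is why only the very high speed figures are noticeably affected.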

Determining the general characteristics of the main archive formats

For testing, we used material simulating a “user basket” made up of files in the DOC, HTM, JPG, MP3, PDF, and TXT formats. In total, the basket contains 359 folders and 3337 files and has a total size of 208893 KB (about 204 MB). The composition of this set is shown in the following table:

Composition of a set of files for testing

Type | Number of folders | Number of files | Size, KB | On disk, KB
TXT | 0 | 2 | 34781 | 34783
HTM | 329 | 2869 | 30913 | 36962
DOC | 3 | 24 | 31443 | 31474
PDF | 0 | 1 | 33691 | 33694
JPG | 26 | 430 | 40493 | 41382
MP3 | 1 | 11 | 37571 | 37589

Total | 359 | 3337 | 208893 | 215884

Each test consisted of one archiving cycle, timed from the moment the Add button was pressed until the window showing the contents of the resulting archive file opened.

Tested programs:

WinZip 8.1 SR-1
WinRar 3.30
WinAce 2.5
7-Zip 3.13
Power Archiver 8.70 07b


System Configuration Information

Processor Intel Celeron 1700 MHz
Memory 256 MB (DDR SDRAM)
HDD ST360015A (60 GB, 7200 RPM)
Windows 2000 Pro, SP3

The test results are shown in the following tables:

Test results for ZIP format

Archiver / Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Without compression | 208893 | - | - | -

WinZip
Normal | 146408 | 2-00 | 70.0% | 1740
Maximum | 145884 | 2-45 | 69.8% | 1266
Fast | 147690 | 1-58 | 70.7% | 1770
Very fast | 149450 | 1-50 | 71.5% | 1899

WinRar
Normal | 146078 | 2-22 | 69.9% | 1471
Maximum | 145881 | 3-07 | 69.8% | 1117

WinAce
Normal | 146418 | 2-28 | 70.1% | 1411
Maximum | 145844 | 2-40 | 69.8% | 1305

7-Zip
Normal/Deflate | 145480 | 3-22 | 69.6% | 1034
Ultra/Deflate | 145341 | 5-55 | 69.6% | 588
Ultra/Deflate64 | 144924 | 6-10 | 69.4% | 565

Power Archiver
Normal | 146074 | 3-40 | 69.9% | 950
Maximum | 145948 | 3-42 | 69.9% | 941

In general, the compression obtained with the ZIP format is roughly the same everywhere and depends little on the archiver, with the exception of 7-Zip, where changing the compression method slightly improves the figure for the ZIP format. The dictionary size (in the WinRar and 7-Zip archivers) was deliberately left unchanged in this series of tests and was set automatically (by default).

RAR format testing results

Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Without compression | 208893 | - | - | -
Store | 209129 | 0-58 | 100.1% | 3601
Fastest | 144017 | 6-00 | 68.9% | 580
Fast | 143281 | 6-22 | 68.6% | 547
Normal | 142830 | 6-40 | 68.4% | 522
Good | 139826 | 6-58 | 66.9% | 499
Best | 140023 | 7-25 | 67.0% | 469
Best (64kb) | 140685 | 5-40 | 67.3% | 614

In the mode settings, the dictionary size can be changed within the range of 64 - 4096 kilobytes. By default the maximum size (4096 KB) is set, and the results in this table were obtained with it. Only in the Best (64kb) row was the minimum size of 64 kilobytes set. The resulting change in compression and speed can presumably serve as a guide for all other rows of this table.
The Good and Best rows were re-tested and their values were fully confirmed, so the illogical transition between them cannot be attributed to testing errors.

ACE format testing results

Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Without compression | 208893 | - | - | -
Normal | 132978 | 8-30 | 63.7% | 410
Maximum | 132918 | 8-42 | 63.6% | 400
Good | 132925 | 9-50 | 63.6% | 354
Fast | 133216 | 8-53 | 63.8% | 397
Super Fast | 133273 | 8-46 | 63.8% | 397
Store | 209136 | 1-48 | 100.1% | 1934

Changes in the operating mode of the WinAce archiver in our case have little effect on the compression performance - the spread is within tenths of a percent.

7z format testing results

Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Without compression | 208893 | - | - | -
Normal | 130964 | 9-24 | 64.2% | 362
Maximum | 130000 | 13-51 | 63.7% | 246
Fast | 141922 | 4-16 | 69.6% | 797
Ultra (1 MB) | 131392 | 8-47 | 64.4% | 387
Ultra (6 MB) | 130101 | 11-40 | 63.8% | 291
Ultra (12 MB) | 129871 | 12-47 | 63.7% | 266
Ultra (24 MB) | - | - | - | -
Ultra (Deflate) | 141171 | 3-15 | 69.2% | 1046
Ultra (PPMd) | 140171 | 8-45 | 68.7% | 389
Ultra (Bzip2) | 135342 | 7-32 | 66.4% | 451

Note:

For the 7z format, the archiver lets you set:

- Level (Fast, Normal, Maximum, Ultra),
- Method (LZMA, PPMd, Bzip2, Deflate),
- Dictionary size (32 KB - 192 MB),
- Word size (8 - 255).

As you can see, a very large number of combinations of archiver settings is possible, which can confuse the user. The following guidelines may help:

- The larger the dictionary size, the greater the compression and the packing time. Compression increases slowly, but packing time increases very sharply.

- The same applies to word size.

- The optimal settings are selected automatically (the defaults), and you do not have to change them unless necessary.


CAB Format Test Results

Archiver / Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Without compression | 208893 | - | - | -

PowerArchiver
Medium | 140444 | 9-55 | 67.2% | 351
Maximum | 137152 | 15-55 | 65.6% | 219

WinAce
Normal | 144374 | 3-24 | 69.1% | 1024
Maximum | 138538 | 12-54 | 66.3% | 270

The CAB (cabinet file) format is based on the MS-Zip and LZX algorithms and is supported and used by Microsoft. Unpackers for the format are available in Windows 98 and higher. The algorithm is open and can be used freely by all programmers.

Test results for BH and LHA formats

Format / Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Without compression | 208893 | - | - | -

PowerArchiver, LHA format
Normal | 147518 | 4-40 | 70.6% | 746
Maximum | 147518 | 4-47 | 70.6% | 728

PowerArchiver, BH format
Normal | 145912 | 2-16 | 69.8% | 1536
Maximum | 145718 | 2-34 | 69.8% | 1356

The LHA and BH archive formats perform at the level of the ZIP format and show no visible advantages.

In general, as you can see, the best compression is provided by the ACE and 7Z formats, while the best speed was shown by the ZIP and BH formats. Further tests are planned on the same principle, but with “baskets” of homogeneous composition for the file formats TXT, HTML, DOC, JPG, MP3, PDF.

Determining the compressibility of files of different formats

For this series of tests, sets consisting entirely of files of a single format were compiled, and duplicate files within a set were excluded. EXE and DLL files were taken from the Windows system folder without any selection. Note that some EXE files are already compressed, and compressing them further makes little sense. The characteristics of the sets are given in the following table:

File formats in test sets

Format | Number of folders | Number of files | Total size, KB
TXT | 0 | 27 | 35096
HTM | 7 | 1371 | 25076
DOC | 1 | 33 | 37211
PDF | 0 | 1 | 33691
JPG | 26 | 430 | 40493
MP3 | 2 | 11 | 37571
EXE | 0 | 316 | 32446
DLL | 0 | 184 | 40323
XLS | 6 | 15 | 17228
CHM | 0 | 69 | 33940
MPEG | 0 | 24 | 46606
WAV | 0 | 1 | 30804
BMP | 0 | 15 | 31713
AVI | 0 | 89 | 9261

During testing, only the normal (default) mode of each archiver was used. Each archive format was created by its own archiver (WinZip, WinRar, WinAce, 7-Zip); the CAB format, which has no dedicated archiver of its own, was packed with Power Archiver.

File compressibility depending on archive format

Format | ZIP | RAR | ACE | 7Z | CAB
TXT | 43.7% | 37.8% | 37.4% | 34.3% | 36.3%
HTM | 29.2% | 28.3% | 9.09% | 7.75% | 15.0%
DOC | 8.76% | 6.39% | 5.47% | 5.21% | 6.49%
PDF | 97.7% | 97.4% | 97.8% | 97.5% | 97.3%
JPG | 98.5% | 98.5% | 85.0% | 85.1% | 97.9%
MP3 | 98.1% | 97.9% | 98.1% | 97.9% | 97.7%
EXE | 46.9% | 42.1% | 37.8% | 32.7% | 39.3%
DLL | 45.6% | 39.6% | 37.6% | 34.3% | 39.6%
XLS | 11.8% | 8.27% | 7.44% | 5.97% | 8.49%
CHM | 98.6% | 98.8% | 99.0% | 99.6% | 98.6%
MPEG | 95.3% | 94.7% | 94.8% | 94.5% | 94.4%
AVI | 86.1% | 84.1% | 84.5% | 82.7% | 83.4%
WAV | 92.2% | 62.8% | 62.6% | 87.0% | 92.1%
BMP | 63.5% | 31.9% | 30.6% | 51.5% | 56.2%

Average | 65.5% | 59.2% | 56.2% | 58.3% | 61.6%

As a comment to the table, the following can be noted:

- The best compression for the main source file formats is provided by the 7z archive format.

- The best average figure is for the ACE archive format due to record compression of the WAV and BMP formats.

As for the compressibility of source files, the degree of compression depends on the source file format, which sometimes implies internal data compression. If a file has already been compressed by its own algorithms, its compressibility by an archiver is low. For example, a CHM file is a compressed version of HTML files, and accordingly their compressibility differs. We see the same for WAV and MP3, BMP and JPG, and so on.

Archiver operating speed, KB/s

Format | ZIP | RAR | ACE | 7Z | CAB
TXT | 2064 | 408 | 386 | 217 | 226
HTM | 2507 | 836 | 627 | 643 | 411
DOC | 7400 | 2862 | 1550 | 1378 | 886
PDF | 2246 | 293 | 370 | 387 | 370
JPG | 2670 | 587 | 337 | 368 | 287
MP3 | 2348 | 458 | 368 | 335 | 332
EXE | 2318 | 773 | 601 | 416 | 433
DLL | 2016 | 858 | 672 | 474 | 434
XLS | 4300 | 1436 | 1148 | 507 | 224
CHM | 1886 | 556 | 365 | 357 | 323
MPEG | 2453 | 583 | 416 | 370 | 338
AVI | 1852 | 617 | 463 | 370 | 356
WAV | 2370 | 1711 | 1184 | 354 | 288
BMP | 2883 | 1269 | 933 | 401 | 373

Average | 2838 | 856 | 609 | 485 | 385

This table demonstrates an obvious rule - better compression almost always comes at the cost of packing speed.

Compressibility of different file formats. Addition

Format | ZIP | RAR | ACE | 7Z
VXD | 55.1% | 52.5% | 43.3% | 40.8%
INF | 14.9% | 13.3% | 13.2% | 12.3%
VBP | 78.3% | 72.6% | 26.0% | 18.5%
GIF | 90.0% | 94.3% | 87.2% | 86.1%
SCR | 88.8% | 88.0% | 88.1% | 87.9%
DAT | 23.1% | 20.1% | 20.5% | 18.0%
INI | 35.6% | 33.2% | 32.5% | 30.2%

Average | 55.1% | 53.4% | 44.4% | 42.0%

This table provides additional data on the compressibility of file formats. Here testing was carried out without recording time, on small sets (100-200 KB). As you can see, for all these formats the best compression is provided by the 7z archive format.

Next, as an example, here are the results of packing a real distribution kit of the Norton Antivirus program. Packing was done in normal mode; in addition, self-extracting versions of the same archives were produced. The results are shown in the following table (the last column is the approximate time to download the packed distribution over the network on a regular modem connection at 2.7 KB per second):

Archive format | Size, KB | Time, min.-sec. | Compression | Download time, hours-min.
Without compression | 47410 | - | - | 4-53
ZIP | 29045 | 0-21 | 61.3% | 2-59
RAR | 26619 | 1-15 | 56.1% | 2-44
ACE | 23838 | 1-30 | 50.3% | 2-27
7Z | 22871 | 1-50 | 48.2% | 2-21
CAB | 26804 | 2-22 | 56.5% | 2-45
EXE (RAR) | 26671 | 1-15 | 56.3% | 2-45
EXE (ACE) | 23903 | 1-30 | 50.4% | 2-28
EXE (7Z) | 22941 | 1-52 | 48.4% | 2-22
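The download-time column can be reproduced from the archive size and the stated modem rate. A small sketch, assuming the time is rounded to the nearest whole minute (the function name is hypothetical):

```python
def download_time(size_kb: float, rate_kb_per_s: float = 2.7) -> tuple:
    """Approximate download time as (hours, minutes), rounded to the minute."""
    total_minutes = round(size_kb / rate_kb_per_s / 60)
    return divmod(total_minutes, 60)


# Sizes in KB, taken from the table above.
for name, size_kb in [("Without compression", 47410), ("ZIP", 29045), ("7Z", 22871)]:
    hours, minutes = download_time(size_kb)
    print(f"{name}: {hours}-{minutes:02d}")
```

Under that rounding assumption the computed values agree with the table, so the differences in the last column follow directly from the differences in archive size.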

The table clearly demonstrates that:

When transferring files over the network, packing is almost mandatory.

Packing with good compression can further reduce file transfer time, in our case by about half an hour.

The promising ACE and 7Z formats are already fully justified in the form of self-extracting archives. Distributors of software products on the Internet would do well to take this into account.

The 7-Zip archiver is a good program with a high compression ratio and the necessary minimum of user conveniences. In particular, you can delete and view individual files without unpacking the archive; files are opened by the applications associated with them in the system. You can also add individual files to an existing archive.

Conclusion

Archiver programs remain an indispensable tool for packing and compressing digital information. Packed information takes up significantly less storage space and less transmission time over network communication channels. The most popular and widely used packing formats today are ZIP and RAR. Other formats, such as ARJ, ICE, PAC, ARC and some others, have gradually been displaced and forgotten. But packing technology does not stand still. Archivers are in demand, so programmers are constantly searching for more efficient compression methods, as the results of our experiment show. There are in fact at least two archive formats (ACE and 7z) that are significantly superior in compression to the usual ZIP and RAR. Using these formats will significantly reduce the time it takes to transfer files over the Internet, which is in the interest of many users...

Update dated May 24, 2004

In this section we look at the impact of the Solid option on the performance of archivers. Recall that when an archive is packed with the Solid option, a file cannot be added to it and an individual file cannot be extracted from it; the archive is packed and unpacked only as a whole. This can cause some inconvenience when using such archives, but sometimes the inconvenience is of secondary importance compared to the advantages.
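The gain from solid packing comes from compressing all files as one stream, so that repetitions across files can be exploited. The sketch below illustrates the effect with Python's standard zlib module; Deflate stands in here for the RAR algorithm, which is not reproduced, and the test files are hypothetical.

```python
import zlib


def separate_size(files: list) -> int:
    """Total size when every file is compressed as its own stream."""
    return sum(len(zlib.compress(data, 9)) for data in files)


def solid_size(files: list) -> int:
    """Size when all files are concatenated and compressed as one stream."""
    return len(zlib.compress(b"".join(files), 9))


# Hypothetical test set: twenty small files that share most of their content.
files = [
    f"file {i}\n".encode() + b"common header shared by every file in the set\n" * 3
    for i in range(20)
]
print(separate_size(files), solid_size(files))
```

The single stream comes out smaller because content repeated in many files is encoded once and then referenced. The price is the one described above: a file in the middle of the stream cannot be reached without processing the data in front of it.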

The additional testing was done exactly as described in the main section, on the same sets of material. With the additional results, the table "RAR format testing results" from the main text now looks like this...

RAR format testing results

Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Without compression | 208893 | - | - | -
Store | 209129 | 0-58 | 100.1% | 3601
Fastest | 144017 | 6-00 | 68.9% | 580
Fast | 143281 | 6-22 | 68.6% | 547
Normal | 142830 | 6-40 | 68.4% | 522
Normal (Solid) | 131664 | 9-14 | 63.0% | 377
Good | 139826 | 6-58 | 66.9% | 499
Good (Solid) | 129314 | 8-24 | 61.9% | 414
Best | 140023 | 7-25 | 67.0% | 469
Best (Solid) | 129527 | 8-36 | 62.0% | 405
Best (64kb) | 140685 | 5-40 | 67.3% | 614

The WinRar archiver settings include:

1. Compression method (Normal, Store, Fastest, Fast, Good, Best).

2. Update mode:
- Add and replace files,
- Add and update files,
- Freshen existing files only,
- Synchronize archive contents.

3. Options:
- Delete files after archiving,
- Create SFX archive,
- Create solid archive,
- Put authenticity verification,
- Put recovery record,
- Test archived files,
- Lock archive.

It is easy to see that there are more than a hundred combinations of settings that determine the operating mode of the archiver. Accordingly, the range of results for this format and this archiver turned out to be quite wide - compression: 61.9 - 68.9%, speed: 377 - 614 KB/s.

The WinAce archiver also has a Solid option. In this archiver, however, the option (Make solid archive) is enabled by default and was therefore already included in the test results. Thus the original comparison was unfair only to the RAR format and the WinRar archiver.

Taking into account the new circumstances, the leaderboard for compression ratio looks like this:

1. RAR (Good, Solid) - 61.9%.

2. 7-Zip (Maximum) - 62.2%.

3. ACE (Good) - 63.6%.

The updated table of packaging results for a real Norton Antivirus distribution package ("Example of Norton Antivirus distribution package packaging") now looks like this...

Norton Antivirus distribution packaging example

Archive format | Size, KB | Time, min.-sec. | Compression | Download time, hours-min.
Without compression | 47410 | - | - | 4-53
ZIP | 29045 | 0-21 | 61.3% | 2-59
RAR | 26619 | 1-15 | 56.1% | 2-44
RAR (Normal, Solid) | 22745 | 1-21 | 48.0% | 2-20
RAR (Good, Solid) | 22680 | 1-28 | 47.8% | 2-20
ACE | 23838 | 1-30 | 50.3% | 2-27
7Z | 22871 | 1-50 | 48.2% | 2-21
CAB | 26804 | 2-22 | 56.5% | 2-45
EXE (RAR) | 26671 | 1-15 | 56.3% | 2-45
EXE (RAR, Normal, Solid) | 22797 | 1-29 | 48.1% | 2-21
EXE (ACE) | 23903 | 1-30 | 50.4% | 2-28
EXE (7Z) | 22941 | 1-52 | 48.4% | 2-22

The results in this table also confirm that the WinRar archiver can provide maximum compression and is a leader by this measure. Compared to the ZIP format, downloading the same distribution in RAR format takes 39 minutes less...

In the table with the results of testing the 7z format, our reader Alexander Rykhlov found an error in the calculation of the compression ratio. Many thanks to Alexander; the corrected table “7z format testing results” now looks like this...

Mode | Size, KB | Time, min.-sec. | Compression | Speed, KB/s
Ultra (6 MB) | 130101 | 11-40 | 62.3% | 291
Ultra (12 MB) | 129871 | 12-47 | 62.2% | 266
Ultra (24 MB) | - | - | - | -
Ultra (Deflate) | 141171 | 3-15 | 67.6% | 1046
Ultra (PPMd) | 140171 | 8-45 | 67.1% | 389
Ultra (Bzip2) | 135342 | 7-32 | 64.8% | 451

Note: in Ultra mode (LZMA) when setting the Dictionary size to 24 megabytes, the speed decreased so much that the test became impossible.

Conclusion


The brewing sensation that the WinRar archiver was not as good as many users believed did not materialize. Our testing confirmed that the technical characteristics of this archiver are indeed among the highest today. The 7-Zip archiver has very similar figures, but in maturity and usability it is still somewhat inferior to the leader. To obtain maximum compression in WinRar, you must enable the Solid option (it is disabled by default); the other settings (Normal, Good, etc.) matter less.

General information about archiving files

Concept of the file archiving process

One of the most widely used types of utility programs is archiving programs, intended for archiving and packing files by compressing the information stored in them.

Information compression is the process of converting information stored in a file to a form in which redundancy in its representation is reduced and, accordingly, less memory is required for storage.

Compression of information in files is carried out by eliminating redundancy in various ways, for example by simplifying codes, eliminating constant bits from them, or representing repeating symbols or repeating sequences of symbols as a repetition count and the corresponding symbols. Various algorithms are used for such compression. Either one or several files can be compressed and placed in compressed form in a so-called archive file, or archive.

An archive file is a specially organized file containing one or more files in compressed or uncompressed form, together with service information about file names, the dates and times of their creation or modification, sizes, and so on.

The purpose of packing files is usually to store information more compactly on disk and to reduce the time and, accordingly, the cost of transmitting information over communication channels in computer networks. In addition, packing a group of files into one archive file significantly simplifies their transfer from one computer to another, reduces the time needed to copy files to disks, makes it possible to protect information from unauthorized access, and helps protect against infection by computer viruses.

The degree of file compression is characterized by the coefficient Kc, defined as the ratio of the compressed file volume Vc to the source file volume Vo, expressed as a percentage:

Kc = (Vc / Vo) * 100%

The degree of compression depends on the program used, the compression method, and the type of the source file.

The files that compress best are graphic images, text files, and data files, for which the compression ratio can reach 5 - 40%; files of executable programs and load modules compress less well, to 60 - 90%. Archive files are hardly compressed at all. Archiving programs differ in the compression methods they use, which in turn affects the degree of compression.

Archiving (packing) is placing (loading) source files into an archive file in compressed or uncompressed form.

Unarchiving (unpacking) is the process of restoring files from an archive exactly as they were before they were loaded into the archive. When unpacking, files are extracted from the archive and placed on disk or in RAM.

Programs that pack and unpack files are called archiving programs.

Large archive files can be placed on several disks (volumes). Such archives are called multi-volume. A volume is a component part of a multi-volume archive. When an archive is created in several parts, the parts can be written to several floppy disks.

Main types of archiver programs

Currently, several dozen archiver programs are in use. They differ in their lists of functions and operating parameters, but the best of them have approximately the same characteristics. Among the most popular programs are ARJ, PKPAK, LHA, ICE, HYPER, ZIP, PAK, ZOO, and EXPAND, developed abroad, as well as AIN and RAR, developed in Russia. Usually, packing and unpacking are performed by the same program, but in some cases they are carried out by different programs; for example, PKZIP packs files and PKUNZIP unpacks them.

Archiving programs also make it possible to create archives whose contents can be extracted without any other programs, since the archive files themselves can contain an unpacking program. Such archive files are called self-extracting.

A self-extracting archive file is a bootable, executable module that can independently unpack the files it contains without the use of an archiver program. A self-extracting archive is called an SFX archive (SelF-eXtracting). Archives of this type in MS DOS are usually created in the form of an .EXE file.

Many archiver programs unpack files by writing them to disk, but there are also programs designed to create a packed executable module (program). As a result of such packing, a program file with the same name and extension is created, which unpacks itself when loaded into RAM and runs immediately. The program file can also be converted back to the unpacked format. Such archivers include the PKLITE, LZEXE, and UNP programs.

The EXPAND program, which is part of the utilities of the MS DOS operating system and the Windows shell, is used to unpack the files of software products supplied by Microsoft.

The RAR and AIN archiver programs, in addition to the usual compression mode, have a solid mode, in which archives are created with an increased degree of compression and a special organizational structure. In such archives, all files are compressed as one data stream, that is, the search area for repeating character sequences is the entire set of files loaded into the archive, and therefore unpacking any file other than the first involves processing the others. Archives of this type are preferable for archiving a large number of files of the same type.

Ways to manage the archiver program

The archiver program is controlled in one of two ways:
  • using the MS DOS command line, where a launch command is typed containing the name of the archiver program, the control command and its switches, and the names of the archive and the source files; this kind of control is typical of the archivers ARJ, AIN, ZIP, PAK, LHA, etc.;
  • using a built-in shell with dialog panels that appear after the program starts and allow control through menus and function keys, which gives the user a more comfortable working environment. The RAR archiver is controlled this way.
While carrying out its task, the archiver program usually displays a log of its work on the screen.

All modern archiver programs have help screens, which are displayed when only the program name, or the name followed by the /? switch, is entered on the command line. Help may be brief (one screen) or detailed (several screens). Many archivers include help screens with sample commands for various operations. Help text is usually in English or another international language.

Since most archiver programs are controlled in a similar way, we consider the main features of the ARJ program (version 2.42), which is known as one of the best in terms of the range of functions offered to the user, compression ratio and speed. The ARJ program is especially effective with database files and text files.

ARCHIVE PROGRAM ARJ

Purpose of the ARJ archiver

The ARJ program allows you to:
  • create archive files from individual or all files of the current directory and its subdirectories, loading up to 32,000 files into one archive;
  • add and replace files in the archive;
  • extract and delete files from the archive;
  • protect each archived file with a 32-bit cyclic code and test the archive, checking the integrity of the information in it;
  • receive help in 3 international languages;
  • add comments on files to the archive;
  • remember file paths in the archive;
  • save several generations (versions) of the same file in an archive;
  • reorder the archive file by file size, name, extension, date and time of modification, compression ratio, etc.;
  • search for strings in archived files;
  • restore files from damaged archives;
  • create self-extracting archives both on one volume and on several volumes;
  • view the contents of text files contained in the archive;
  • ensure the protection of information in the archive and access to files placed in the archive using a password.
COMMAND LINE STRUCTURE FOR WORKING WITH THE ARJ PROGRAM

To display brief help, enter just the program name on the command line:

ARJ

To display detailed help with sample commands, enter:

ARJ -?   or   ARJ /?

To start the program and perform a required function, use the following command line format, in which the program name and the parameters are separated by spaces:

ARJ <command> [-<sw1> [-<sw2> ...]] <archive_name> [<file_list>]

Two command line parameters are required: <command> and <archive_name>.

The <command> parameter is written as a single character following the program name and specifies the archiving function in accordance with Table 11.1.

Table 11.1 - Basic commands of the ARJ archiver program

Each entry gives the command and the archive function it performs; entries are grouped by command group.

Group 1. Archiving
  • a - add files to the archive
  • f - replace files in the archive with new versions
  • u - add only new files to the archive
  • m - move files to the archive

Group 2. Extracting from the archive
  • e - extract files from the archive to the current directory
  • x - extract files from the archive and place them in directories in accordance with the stored paths

Group 3. Deleting from the archive
  • d - delete files from the archive

Group 4. Service functions
  • t - full testing of the archive
  • l - output the contents of the archive without the paths to the files
  • v - output the contents of the archive with the paths to the files
  • y - copy the archive with new parameters
  • w - find a text string in the archive
The <archive_name> parameter specifies the name of the archive file. It is written according to the general MS DOS rules but without the extension, which is assigned automatically when a new file is created. The archive name may include the path to the file. By default the archiver processes archive files with the .ARJ extension. A self-extracting archive file is created with the .EXE extension; such a file contains an unpacking module, so the ARJ program is not needed to extract files from it.

The optional command line parameters are the switches <swN> and <file_list>. Optional parameters are conventionally written in square brackets.

Switches refine the action of the archiving command, and several of them may be given. Each switch begins with the "-" character and may appear anywhere on the command line after the command. The "/" character may be used instead of "-" to introduce a switch. Table 11.2 lists the most important switches.

Note. ARJ commands and switches may be entered on the command line in upper or lower case.

A list of file names is given when only some of the files in the archive or the current directory are to be processed. To add, extract or delete several files, write their full names on the command line. Up to 64 file names may be given in the list. To shorten the list you can use wildcard templates that follow MS DOS rules, for example: *.* - all files; *.bat - all files with the extension .BAT; A?.* - all files whose names start with A and are at most two characters long.

Table 11.2. The most important configuration keys for the ARJ archiver program

Each entry gives the switch and its purpose.

  • -r - add files from the current directory and all its subdirectories, storing the paths to the files
  • -v - create a multi-volume archive file
  • -g - protect the created archive with a password: -g<password> - the password is given on the command line; -g? - the password is typed invisibly when the command is executed
  • -x - add/replace files, with the exception of the files whose names follow the switch
  • -q - ask for confirmation of the operation for each file: enter "Y" to confirm, "N" to refuse
  • -je - create a self-extracting archive
  • -m - specify the archiving method: -m0 - no compression; -m1 - normal compression (default); -m2 - highest compression; -m3 - fast compression with a lower compression ratio; -m4 - fastest compression with the lowest compression ratio
  • -y - answer "Yes" to all questions from the archiver
  • -jp - pause each time the screen is full when viewing archive contents
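The command line format described above can be illustrated with a short Python sketch. The helper name build_arj_command and its parameters are hypothetical and exist only for this illustration; the function does not run ARJ, it merely assembles the text of a command from the command letter, the switches, the archive name and the file list, and it enforces the 64-name limit mentioned in the text.

```python
def build_arj_command(command, archive, files=(), switches=()):
    """Assemble an ARJ command line of the form
    ARJ <command> [-<sw> ...] <archive_name> [<file_list>].

    Hypothetical helper for illustration only; it builds a string
    and does not invoke the archiver.
    """
    if len(command) != 1 or not command.isalpha():
        raise ValueError("the ARJ command is a single letter")
    if len(files) > 64:
        raise ValueError("at most 64 file names may be listed")
    parts = ["ARJ", command.lower()]
    # A switch may be passed with or without its leading "-" or "/".
    parts += ["-" + sw.lstrip("-/") for sw in switches]
    parts.append(archive)
    parts += list(files)
    return " ".join(parts)


# All files except *.prg, with subdirectories, into b:\arhmat
print(build_arj_command("a", r"b:\arhmat", switches=["x*.prg", "r"]))
```

The sketch places the switches directly after the command; according to the text they could equally appear anywhere after it.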
Archiving files

One of the main operations on archive files is placing files in an archive, which is done with the commands a, u, m and f. These commands are most often used together with the switches -r, -g, -q and -je. Typical commands for creating and modifying archive files are given below.

Example 11.1. Add two files from the current directory, n1.txt and n2.txt, to the archive file arhtxt:

ARJ a arhtxt n1.txt n2.txt

Example 11.2. Create in the current directory the archive file arhobj.arj containing all the files of the directory OBJ:

ARJ a arhobj obj\*.*

Note. When files that are already in the archive are added again, they are replaced regardless of the date and time of their modification or creation.

Example 11.3. Create on drive B: the archive arhmat.arj containing all the files of the current directory except files with the extension prg. The files are added to the archive together with their paths:

ARJ a b:\arhmat -x*.prg -r

Example 11.4. Replace files in the archive arcmat.arj on drive B: with their new versions, and add to it the files from the current directory that are not yet in the archive:

ARJ u b:\arcmat

Note. If the source directory contains no new or missing files, the message "no change" is displayed on the screen.

Example 11.5. Move all files with the extension bas from the current directory to the archive file bas.arj:

ARJ m bas *.bas

Note. The m command is similar to the a command, except that after successful completion the moved files are deleted from the source directory. By default the command does not ask for permission to delete.

Example 11.6. Replace files with the extension bas in the archive bas.arj with their newer versions from the current directory, asking for confirmation for each file:

ARJ f bas *.bas -q

Example 11.7. Move all files of the current directory to the archive file arch.arj, protecting them with the password DINO:

ARJ m arch -gDINO

Example 11.8.
Add to the archive arch.arj all files with the extension fox from the current directory, protecting them with a password that will be requested during archiving:

ARJ a arch -g? *.fox

Example 11.9. Create the self-extracting archive file arxbank.exe containing all the files of the current directory:

ARJ a arxbank -je

Attention! Passwords are case-sensitive: for example, the passwords DINO and Dino are different. It is very important not to forget the password, because without it files cannot be extracted from the archive.

Extracting files from an archive

Files are extracted from an archive with the commands e and x. The e command extracts files and places them either in the current directory or in the directory given on the command line. The x command extracts files into the directories from which they were archived; if such a directory does not exist on the disk, it is created.

If the directory that is to receive an extracted file already contains a file with the same name, the program asks the user for permission to replace it. The user enters "Y" to allow the replacement or "N" to refuse it. To avoid this dialogue, add the -y switch to the command line, which answers "Y" to all requests to replace files.

Files archived with a password can be extracted only if the password is given correctly.

Example 11.10. Extract two files, n1.txt and n2.txt, from the archive file arhtxt.arj into the current directory:

ARJ e arhtxt n1.txt n2.txt

Example 11.11. Extract all files from the archive file arhobj.arj into the current directory:

ARJ e arhobj

Example 11.12.
Extract all files from the archive file arhobj.arj located in the directory d:\obj:

ARJ e d:\obj\arhobj

Example 11.13. Extract all files from the archive file arch.arj into the current directory, giving the password DINO and suppressing requests to replace existing files:

ARJ e arch -gDINO -y

Example 11.14. Extract all files from the archive file arhmat.arj on drive B: and write them to directories according to their paths:

ARJ x b:\arhmat

Removing files from the archive

The ARJ archiver can physically delete a single file, or a group of files given in a list, from an archive file. With the -q switch, the program asks for confirmation before deleting each file in the list. When all files are deleted from an archive, it remains on disk as an empty file, i.e. a file of zero size.

Example 11.15. Delete two files from the archive file arhmat.arj, with confirmation for each file:

ARJ d -q arhmat m_012.fox m_12.prg

Service functions

The service functions of the ARJ archiver are very diverse. The user can test an archive, view its contents on the screen or print them, rename files in the archive, copy the archive with new parameters, find a text string in text files contained in the archive, and much more.

Archive testing. Archive testing is based on checking the cyclic redundancy check (CRC) code of each file in the archive. The cyclic code is often loosely called the checksum of the file, although it is not a plain sum of the codes that make up the file: the contents of the file are treated as one long binary number, which is divided by a fixed generator polynomial using modulo-2 (carry-less) arithmetic, and the remainder, limited to 16 or 32 bits, is the CRC. During testing, the newly calculated cyclic code is compared with the code stored in the archive. When the integrity of any file is compromised, its CRC changes and a mismatch occurs.
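The testing principle described above can be sketched in Python. This is an illustration under stated assumptions, not ARJ's own code: the standard zlib.crc32 function stands in for the archiver's 32-bit cyclic code, and the names crc32_of and check_entries, as well as the sample file contents, are hypothetical.

```python
import zlib


def crc32_of(data: bytes) -> int:
    """Return the 32-bit CRC of data as an unsigned integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def check_entries(entries):
    """Compare a freshly computed CRC with the stored one for each entry.

    entries is an iterable of (name, data, stored_crc) tuples; the result
    is a list of (name, ok) pairs, in the spirit of the archiver's test log.
    """
    return [(name, crc32_of(data) == stored) for name, data, stored in entries]


original = b"SELECT * FROM clients;\r\n"
stored = crc32_of(original)               # code saved when the file was archived
damaged = b"SELECT * FROM clientz;\r\n"   # one corrupted byte

report = check_entries([
    ("n1.txt", original, stored),
    ("n2.txt", damaged, stored),
])
for name, ok in report:
    print(name, "OK" if ok else "CRC error")
```

The intact entry passes and the damaged one fails, because a change confined to a single byte always alters a 32-bit CRC.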
Either the entire archive or the part of it named in a file list can be checked. The check is fairly quick and is accompanied by a log in which "OK" is displayed for each intact file. Password-protected files cannot be checked unless the password is given.

Archive testing is a check of the integrity of the information in each file contained in the archive.

Example 11.16. Check the integrity of all files in the archive arcmat.arj on drive A:

ARJ t a:arcmat

Viewing archive contents. Two commands are used to view the contents of an archive: l and v. The contents can be displayed on the screen or sent to the standard output device. The l command prints the information about each file on one line; the v command uses two lines, one of which gives the path to the file. With the -jp switch, the output pauses each time the screen is full.

The contents of the archive are displayed as a table in which the files are listed in the order in which they were placed in the archive; the table is not sorted. It can cover all files or only those matching a given file list. Both ordinary archive files and self-extracting files with the EXE extension can be viewed. Output redirection can be used to print the file information on a printer.

Fig. 11.1 shows the contents of the archive file QPR4.ARJ. The following command was used to display them:

ARJ l qpr4

The columns in Fig. 11.1 contain the following information about the files: Filename - file name; Original - original file size; Compressed - compressed file size; Ratio - compression ratio; DateTime modified - date and time of file creation (modification); CRC-32 - 32-bit cyclic control code; Attr - file attributes; BTPMGVX - additional information about the file.

Fig. 11.1. Screen view displaying the contents of the archive file qpr4.arj

Processing archive: QPR4.ARJ
Archive created: 1996-02-23 18:41:34, modified: 1996-02-23 18:43:46
Filename       Original Compressed Ratio DateTime modified CRC-32   Attr BTPMGVX
ANALYZE.WQ1       13844       2898 0.209 92-10-13 17:34:26 311D59E9 A-W  B 1G
MASTER.WQ1        69500      20816 0.300 92-09-12 04:00:00 85B7D6F6 A-W  B 1G
OPTIMIZR.WQ1       6491       2556 0.394 92-10-13 17:54:56 F1B958DE A-W  B 1G
REGISTER.WQ1       5537       2001 0.361 92-09-12 04:00:00 3B9A3005 A-W  B 1G
SAMPLE.WQ1         5017       1912 0.381 92-12-02 20:51:28 31508CCA A-W  B 1G
ZVUKEFKT.WQ1       2205        968 0.439 94-11-01 00:39:54 118CBFC3 A-W  B 1G
GRAGRED.WQ1        3437       1306 0.380 94-11-02 22:50:28 55C06C4F A-W  B 1G
COUP.SPO          19862      15243 0.767 92-02-12 04:00:00 3D1734C3 A-W  B 1
ASCII>SOR          1637        975 0.596 92-09-12 04:00:00 010C0344 A-W  B 1
DUTB.SFO          33228      33176 0.998 92-02-12 04:00:00 1D76197A A-W  B 1
    10 files     160758      81851 0.509

The last column of the table shows the file characteristics: B - set for files with the extension .BAK; T - file type (B - binary, T - text, D - directory); P - the archive stores the path to the file, which can be viewed with the v command; M - compression method; G - the file is password-protected; V - the file continues on the next volume; X - the file begins on a previous volume.

Example 11.17. Display information about the files with the extension bas stored in the archive file bas.arj, pausing when the screen is full:

ARJ l bas *.bas -jp

Example 11.18. Display information about all files contained in the archive arhmat.arj on drive A:, including the paths to the files:

ARJ v a:\arhmat -jp

Example 11.19. Display information about the files contained in the self-extracting archive arxbank.exe:

ARJ l arxbank.exe

Example 11.20. Print information about all files of the archive arhmat.arj on the printer:

ARJ v a:\arhmat > prn

Copying an archive with new parameters
To change archive parameters, use the y command, which can, for example, convert an ordinary archive file into a self-extracting one.

Example 11.21. Create the self-extracting archive file arhmat.exe from the archive file arhmat.arj:

ARJ y -je arhmat

Working with multi-volume archives

One of the important advantages of the ARJ archiver is its ability to create multi-volume archives, i.e. archives that occupy several disks. Each disk holds one archive file that takes up all its free space. The disk need not be empty beforehand: other files may be present on it alongside the archive file.

When an archive is created, the file placed on the first disk is given the extension .ARJ by default, and the files on subsequent disks are given .A01, .A02, etc. The extension naming rule can be changed with switches, which practically removes the limit on the number of archive volumes.

The table of contents of each archive file in a multi-volume archive is viewed in the same way as for a single-volume archive. The ARJ program allows the contents of a multi-volume archive to be modified: files can be deleted, replaced and added. Files are not redistributed between volumes when this is done.

To work with a multi-volume archive, specify the -v switch. Its action is refined by modifiers. A modifier is a Latin letter, in upper or lower case, written after the switch. A command may contain several modifiers, and the order in which they are written does not matter. In addition, a number giving the size of an archive volume in bytes can be used as a modifier. The purpose of some modifiers is given in Table 11.3.

Table 11.3. Purpose of the modifiers of the ARJ command for working with a multi-volume archive

Each entry gives the modifier and its purpose.

  • a - indicates that the archive files of a multi-volume archive will occupy all the free space on the disks (volumes)
  • s - allows any number of DOS commands to be executed before a new volume is created, for example to view, clean or format the floppy disk on which the next archive file will be written; after executing the commands, enter the EXIT command to continue archiving
  • w - prohibits splitting archived files between volumes
  • v - gives a sound signal before the next volume is installed
  • r - reserves free space on the first volume; the number written after the symbol r gives the size of this space
  • 360, 720, 1200 - modifier values that specify the archive volume size
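The default volume naming rule (.ARJ for the first volume, then .A01, .A02, etc.) and the idea of a fixed volume size can be sketched in Python. This is a simplified illustration, not ARJ's actual behaviour: a real archiver writes headers into each volume and splits compressed data, whereas the sketch only slices a byte string. The names volume_name and split_into_volumes and the base name ARMAT are hypothetical.

```python
def volume_name(base: str, index: int) -> str:
    """Name a volume: index 0 gets .ARJ, later ones .A01, .A02, ...

    The sketch covers only the default two-digit scheme described
    in the text, so indices above 99 are rejected.
    """
    if not 0 <= index <= 99:
        raise ValueError("the default naming scheme covers volumes 0..99 only")
    return f"{base}.ARJ" if index == 0 else f"{base}.A{index:02d}"


def split_into_volumes(data: bytes, volume_size: int, base: str = "ARMAT"):
    """Cut data into consecutive pieces of at most volume_size bytes
    and pair each piece with its volume name."""
    if volume_size <= 0:
        raise ValueError("volume_size must be positive")
    chunks = [data[i:i + volume_size]
              for i in range(0, len(data), volume_size)] or [b""]
    return [(volume_name(base, n), chunk) for n, chunk in enumerate(chunks)]


for name, chunk in split_into_volumes(b"x" * 1000, 360):
    print(name, len(chunk))
```

Joining the pieces in volume order restores the original data, which is why extraction has to walk through the volumes one after another.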
Example 11.22. Create the multi-volume archive armat.arj in drive A:, using all the free space on the floppy disks:

ARJ a A:armat -va

Example 11.23. Create the multi-volume archive armat.arj in drive A:, using all the free space on the floppy disks, sounding a beep and allowing MS DOS commands to be entered before the next disk is installed:

ARJ a A:armat -vvas

Example 11.24. Create the multi-volume archive armat.arj in drive A:, using all the free space on the floppy disks and prohibiting the splitting of archived files between volumes:

ARJ a A:armat -vaw

Example 11.25. Create the multi-volume archive armat.arj in drive A:, each volume of which will occupy 360 KB:

ARJ a A:armat -v360

Files are extracted from a multi-volume archive in the same way as from a single-volume archive, but the -v switch must be given on the command line.

Example 11.26. Extract all files of the multi-volume archive armat.arj from floppy disks installed in drive A:

ARJ e A:armat -v

MULTI-FUNCTIONAL INTEGRATED RAR ARCHIVER

Main features of the program

The RAR archiver is a powerful tool for creating and maintaining archives. Its distinctive features are:
  • two operating modes: a full-screen interactive interface and an ordinary command line interface;
  • support for other archive types: in full-screen mode RAR can work with archives of other types (.ZIP, .ARJ, .LZH), viewing, modifying and converting their contents;
  • the highly efficient solid compression method, which gives a high compression ratio (10 - 50% higher than usual);
  • the ability to create self-extracting and multi-volume archives;
  • password protection of archives.
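The gain from solid compression listed above can be illustrated with a small Python experiment. It is a sketch under stated assumptions: the standard zlib module (DEFLATE) stands in for RAR's own compression method, which is not described here, and the file names and contents are invented for the demonstration. The same set of similar files is compressed once file by file and once as a single data stream.

```python
import zlib


def separate_size(files):
    """Total compressed size when every file is compressed on its own."""
    return sum(len(zlib.compress(data, 9)) for data in files.values())


def solid_size(files):
    """Compressed size when all files are joined into one data stream."""
    return len(zlib.compress(b"".join(files.values()), 9))


# Ten small files of the same type with largely repeated content
files = {
    f"report{n}.txt": (f"Report {n}\n" + "date;client;amount;currency\n" * 20).encode("ascii")
    for n in range(10)
}

print("separate:", separate_size(files), "bytes")
print("solid:   ", solid_size(files), "bytes")
```

The single stream comes out smaller because sequences repeated across files are found once for the whole set. The price is that a file in the middle of the stream cannot be unpacked without processing the data before it.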
RAR service functions are varied:
  • password encryption;
  • adding file and archive comments;
  • the possibility of partial or complete recovery of damaged archives;
  • protecting the archive from changes;
  • the ability to add information to the archive about the creator of the archive, the time and date of the last changes made to the archive.
The advantages of RAR are especially noticeable when archiving executable modules (.EXE), object files (.OBJ), large text files, etc. The RAR archiver can be controlled in two modes:
  • in command line mode;
  • in full screen interface mode.
Since the control technique and the list of commands and switches in command line mode are similar to those of the ARJ archiver discussed above, only the control of the RAR archiver in full-screen interface mode is described below.

Full screen mode

To work with the RAR archiver in full-screen interface mode, start the RAR program from the DOS command line, for example:

C:\>RAR

After the program loads, a window with two panels appears on the screen (Fig. 11.2).

Fig. 11.2. View of the RAR archiver window

The right panel has two sections, Memory and Settings, which contain information about memory usage, the current default compression method, the presence of a password, the archive backup mode, etc.

The left panel contains a list of the files and subdirectories of the current directory, over which the selector is moved with the cursor keys. Pressing <Enter> when the selector is on a directory name, on the link to the parent directory (".."), or on the name of an archive file enters the subdirectory, goes up to the parent directory, or enters the archive, respectively. The RAR program can work with archives of the following types: RAR, ARJ, ZIP and LZH. After an archive is entered, the list of its files is displayed in the same way as an ordinary directory. You can thus move through directories and archives and work with files both in archives and in directories.

While you are in a directory, the bottom line of the screen shows a hint on the function key assignments:

1-Help 2-Add 3-View 4-Fresh 5-Volume 6-Move 7-Update 8-Repair 9-Option 0-Quit

The function key line corresponds to the functions that RAR can perform at that moment. When the <Alt> key is pressed, the hint changes and lists the additional functions called by pressing <Alt> together with a function key:

1- 2-Solid 3-View.. 4- 5-SFXVol 6-SolVol 7-SolSVI 8- 9- 0-

The control keys and the corresponding archiver functions available when working with a directory are listed in Table 11.4.

Table 11.4. Assignment of RAR archiver control keys when working with a directory

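Each entry gives the key, the function name shown in the hint line, and the purpose.

  • <F2> Add - add a file to the archive; if the archive does not exist, it is created
  • <F3> View - view a file
  • <F4> Fresh - update files in the archive: only changed files whose old copies are already in the archive are added
  • <F5> Volume - create archive volumes
  • <F6> Move - move files to the archive
  • <F7> Update - add files that are not in the archive and update those whose old copies are already in the archive
  • <F8> Repair - recover a damaged archive
  • <F10> Quit - exit RAR
  • <Alt>+<F2> Solid - create a continuous (solid) archive
  • <Alt>+<F3> View.. - view a file
  • <Alt>+<F5> SFXVol - create an archive divided into SFX volumes
  • <Alt>+<F6> SolVol - create a solid archive divided into volumes
  • <Alt>+<F7> SolSVI - create a solid archive divided into SFX volumes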
Pressing the <Alt> key, besides changing the hint line, brings up a window listing additional archiver functions invoked by key combinations with letters: Alt-C - switch between color and black-and-white mode; Alt-D - select the current disk; Alt-J - temporary exit to DOS (DOS shell); Alt-M - select the compression method; Alt-P - set a password; Alt-S - save the current options; Alt-W - assign a working directory for temporary files.

If you press <Enter> when the cursor is on the line with the name of an archive, you enter the archive as if it were a directory. The same happens if the RAR program is started with a parameter, the name of the archive to enter. While you are in an archive, the function key line looks like this:

1-Help 2-Test 3-View 4-Extr 5-Comment 6-ExCurD 7-SFX 8-Delete 9-Option 0-Quit

The additional function line appears while the <Alt> key is held down:

1- 2- 3-View.. 4-ExtrTo 5-FilCmt 6- 7-Lock 8- 9- 0-

The control keys and the corresponding archiver functions available when working with an archive are listed in Table 11.5.

Table 11.5. Assignment of RAR archiver control keys when working with an archive

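Each entry gives the key, the function name shown in the hint line, and the purpose.

  • <F1> Help - display help information
  • <F2> Test - test the archive
  • <F3> View - view a file
  • <F4> Extr - extract files from the archive with full paths
  • <F5> Comment - add a comment to the archive
  • <F6> ExCurD - extract files to the current directory
  • <F7> SFX - convert to an SFX archive
  • <F8> Delete - delete files from the archive
  • <F9> Option - configuration / save configuration
  • <F10> Quit - exit from the archive
  • <Alt>+<F3> View.. - view a file with the built-in viewer when an external one is defined
  • <Alt>+<F4> ExtrTo - extract files to a specified directory
  • <Alt>+<F5> FilCmt - add comments to files
  • <Alt>+<F7> Lock - lock the archive against changes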
Individual files can be marked (selected) or unmarked from the keyboard. To select or deselect a group of files, use the <Gray +> and <Gray -> keys. When files are marked, a status line appears at the bottom of the screen showing the number of marked files and their total size; the sizes of files in subdirectories are not counted.

When you have entered an archive, its contents are shown in the left half of the screen. Files in this list can be viewed, marked, etc. in the same way as files in an ordinary directory. An asterisk (*) is placed next to the names of files encrypted with a password. The right half of the screen contains an information window showing the name of the archive and its status, the presence of a comment, the presence of password-encrypted files, and statistics: the number of files in the archive, their total size, the compression ratio, the minimum RAR version needed to unpack the archive and the name of the operating system in which the archive was created.

When working with multi-volume archives in full-screen mode, unpacking must start from the very first volume. For files split across volumes, the displayed length refers only to the part in the current volume. The symbol <= marks a file continued from the previous volume, and the symbol => marks a file that continues in the next volume.

Configuring archiver parameters

To change the archiver parameters, after starting it press <F9> to call up the settings menu. A window with the following menu appears on the screen:

Configuration...
Set password
Work directory
Default comment file
External viewer
Change disk
Registration
Save options

The first menu item, Configuration, opens a dialog for setting the basic RAR parameters (Fig. 11.3). The window contains five groups of parameters: Interface options - interface settings; Sort names - file sorting option; Include file mask - file inclusion mask; Compression - compression method; Other options - other parameters.

Fig. 11.3. View of the window for setting configuration parameters of the RAR archiver

A parameter marked with a cross is enabled. Move from one parameter to another with the arrow keys; the value of the parameter in the current field is changed from the keyboard. When all parameters are set, go to the "OK" field and press <Enter> to confirm the selected values. If you decide not to change the settings, go to the "Cancel" field and press <Enter> to discard them. New parameter values can be saved for use as defaults in subsequent runs.

Interface settings: Color - color or black-and-white full-screen mode; Sound - sound effects; Stdout mode - console mode when performing actions from the command line; Mouse - mouse support in full-screen mode.

Sorting by file names: Unsorted - sorting disabled; Name - sorting by name; Extension - sorting by extension; Size - sorting by size.

The file inclusion mask allows files to be added to the archive according to their attributes: Read only files - read-only files; System files - system files; Archive files - files for reading and writing; Hidden files - hidden files.

Default compression method: Store - add files to the archive without compression; Fastest - very fast compression (least effective); Fast - fast compression; Normal - normal compression (default); Good - good compression (more effective); Best - best compression (most effective).
Other parameters: Keep backup archive - save the old archive in a file with the extension .BAK before changing it; Add empty directories - add empty directories to the archive; Always make solid - create solid archives by default; (+) Put Authenticity - add the author's control information each time an archive is created; (+) Log errors to file - record critical situations that occur while working with RAR in the RAR.LOG file.

The menu item Set password assigns a password for packing files into an archive and unpacking them from an archive. A password can also be assigned by pressing the key combination <Alt> and <P>. The password is not saved for use in future runs.

The menu item Work directory specifies the directory in which the RAR archiver places temporary files. It can also be specified by pressing the key combination <Alt> and <W>.

The name of the file containing a general comment for archives is set with the menu item Default comment file.

The menu item External viewer defines an external program that RAR calls to view the contents of files from an archive. When a file is viewed in full-screen mode, RAR uses the built-in viewer if no external one is defined.

The menu item Change disk changes the current disk, the directory of which is displayed in the working window.

To save the archiver settings, use the menu item Save setup. After this item is selected you are asked to choose: Save - save the set parameter values for use by default; Cancel - do not write the parameters. RAR stores the default configuration (the set of settings) in the RAR.CFG file, which is located in the same directory as the RAR.EXE program itself. Settings can also be saved by pressing the key combination <Alt> and <S>.

Technology of working with the archiver

Let us look at the sequence of actions for the most frequently performed archiving procedures after the RAR program has been started in full-screen mode.
Creating a new archive from several files

1. Select the drive by pressing the key combination <Alt> and <D>.
2. To change the order of files in the list, press <F9> and, in the Sort names group of the configuration window, select the required sorting option.
3. Mark the files to be placed in the archive, one at a time or as a group with <Gray +>.
4. To protect the files placed in the archive with a password, press the key combination <Alt> and <P>, enter the password and confirm it by entering it again.
5. If the compression method needs to be changed, press <Alt> and <M> and select the required method in the dialog box that appears.
6. To create the archive, press: <F2> - for a regular archive; <Alt> and <F2> - for a solid archive; <Alt> and <F5> (SFXVol) - for a self-extracting archive (SFX), and enter the maximum archive size in kilobytes.
7. In the window that appears, enter the name of the archive file and the path to it. To move files to the archive, enable the Move checkbox in the window.
8. After entering the information in the window, press <Enter>. Two diagrams (horizontal bars) appear on the screen, showing the progress of archiving each file separately and of the archive as a whole. At the end of the process, brief information about the size of the files before and after archiving is displayed.

Extracting files from an archive

1. Select the drive by pressing the key combination <Alt> and <D>.
2. Set the directory containing the archive file in the left panel of the information window.
3. In the information window that appears on the screen, place the selector on the line with the name of the archive file and press <Enter>. A list of the archive files appears in the left panel.
4. Mark the files to be extracted, one at a time or as a group with <Gray +>.
5. Press: <F4> - to extract the marked files according to their paths, recreating the directory structure; <F6> - to extract the marked files to the current directory; <Alt> and <F4> - to extract the marked files to a specified directory.
If the files were archived with a password, a window for entering the password will appear. Creating a multi-volume archive on floppy disks containing all the files in a given hard drive directory 1.Set the directory with source files on the left panel of the information window.2.Using the key or<Серый +>, highlight the names of the files to be archived. 3. Press the key and in the window that appears on the screen, set or select from the proposed list the size of one archive volume. For automatic sizing, i.e. using all free disk space, select Autodetect. Press key .4.Insert a floppy disk into the drive to record the first archive volume and, at the “Enter archive name” prompt, enter the full name of the first archive volume: drive name, path, file name. The first file will automatically receive the .RAR extension. Complete input and press key start the archiving process.5.The archiving protocol is displayed on the screen (Fig. 11.4), in which for each archived file the volumes in the original and packed state and the degree of compression are indicated.6.After writing the first volume to a floppy disk, the program will offer to install floppy disks for recording the next archive volumes that will automatically receive extensions .ROO", .R01, .R02, etc. After installing each floppy disk, press the button in the query dialog box. Rice. 11.4. View of the information window displaying the process of creating a multi-volume archive Extracting files from a multi-volume archive on floppy disks to a specified directory1.Insert a floppy disk with the first archive file (volume) with the extension .RAR into the drive.2.Specify the name of the disk with the archive by pressing the key combination<А1t>And .3.In the information window that appears on the screen, set the selector to the line with the name of the first archive file and press the key . A list of archive files will appear in the left panel of the window. 4. 
To extract all archive files to the specified directory on the hard drive, press the key combination And and enter the path to the specified directory in the dialog box that appears. Press key .5.A dialog box for selecting the extraction option will appear on the screen: Proceed with all volumes from current - extract from all files; Proceed with selected files only - extract only the selected files. 6. Select the first option, corresponding to extracting all files from the archive, and press the button . A list of files extracted from the archive will be displayed on the screen. A successfully extracted file is marked Ok. 7.After extracting all the files of the first volume, the program will offer to install the next disk (volume) with the extension .ROO, and after extracting the files from it, respectively, volumes with extensions .R01, R02, etc., if any. Note. To work with multi-volume archives, it is necessary to specify a working directory for storing temporary archiver files. Such a directory is created on the hard drive and the path to it is indicated by pressing the key combination<А1t>And . The path to the working directory should be saved in the archiver configuration file by clicking<А1t>And .
Archive programs are designed to archive (pack) files by compressing the information stored in them in order to save disk space.

Information compression is the process of converting information stored in a file into a form that reduces redundancy in its presentation and, accordingly, requires less storage space.

Information in files is compressed by eliminating redundancy in various ways, for example, by simplifying the codes, excluding constant bits from characters, or replacing a repeating sequence of characters with a repetition factor. Various algorithms are used for such compression.

Either one or several files can be compressed and placed in a compressed form into an archive file or archive.

An archive file (or simply an archive) is a specially organized file containing one or more files in compressed or uncompressed form, together with service information about file names, the date and time of their creation or modification, size, etc.

The purpose of packing files is usually to ensure a more compact placement of information on disk and to reduce the time and, accordingly, the cost of transmitting information over communication channels in computer networks. In addition, packing a group of files into one archive file significantly simplifies their transfer from one computer to another, reduces the time of copying files to disks, allows you to protect information from unauthorized access, and helps protect against infection by computer viruses.

The degree of compression depends on the archiving program used, the compression method, and the type of source file. Text files and data files compress best: their size can be reduced by 80-90%. Files of executable programs and load modules compress less well, by 5-40%. Archive files hardly compress at all.

Archiving programs differ in the compression methods they use, which consequently affects the compression ratio.

Unzipping (unpacking) is the process of restoring files from the archive exactly in the form they had before being loaded into the archive. When unpacking, files are extracted from the archive and placed on disk or in RAM.

Large archive files can be placed in several volumes. Such archives are called multi-volume. A volume is an integral part of a multi-volume archive. When creating an archive from several parts, you can write the parts onto several floppy disks.


MAIN TYPES OF ARCHIVE PROGRAMS

One of the most widespread types of service programs is the archiver: a program designed for archiving (packing) files by compressing the information stored in them.

Information compression is the process of converting information stored in a file into a form that reduces redundancy in its presentation and, accordingly, requires less memory for storage.

Compression of information in files is accomplished by eliminating redundancy in various ways, such as by simplifying codes, or by representing a run of repeated characters (or a repeating sequence of characters) as a repetition factor followed by the characters themselves. Various algorithms are used for such compression.
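The repetition-factor idea can be sketched in a few lines of Python. This is only a minimal illustration of run-length encoding, not the method used by any particular archiver; the function names rle_encode and rle_decode are hypothetical.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[int, str]]:
    """Replace each run of identical characters with (repetition factor, character)."""
    return [(len(list(run)), char) for char, run in groupby(text)]


def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Expand (repetition factor, character) pairs back into the original text."""
    return "".join(char * count for count, char in pairs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [(4, 'A'), (3, 'B'), (2, 'C'), (1, 'D')]
print(rle_decode(encoded))  # AAAABBBCCD
```

The scheme only pays off when the data actually contains long runs; for text with few repeats the list of pairs is larger than the input.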

Either one or several files can be compressed, which in compressed form are placed in a so-called archive file or archive.

An archive file is a specially organized file containing one or more files in compressed or uncompressed form and service information about file names, date and time of their creation or modification, sizes, etc.
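As an illustration of the service information kept alongside the files, the sketch below builds a small ZIP archive in memory with Python's standard zipfile module and reads back each entry's name, original size, packed size and modification time. The helper names build_archive and list_service_info and the sample file names are assumptions made for this example.

```python
import io
import zipfile


def build_archive(files: dict[str, bytes]) -> bytes:
    """Pack the given files into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def list_service_info(archive_bytes: bytes) -> list[tuple[str, int, int, tuple]]:
    """Return (name, original size, packed size, modification time) per entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [
            (entry.filename, entry.file_size, entry.compress_size, entry.date_time)
            for entry in archive.infolist()
        ]


archive_bytes = build_archive({"notes.txt": b"hello " * 200, "empty.dat": b""})
for name, size, packed, modified in list_service_info(archive_bytes):
    print(name, size, packed, modified)
```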

The purpose of file packaging is usually to ensure a more compact placement of information on disk, reducing the time and, accordingly, the cost of transmitting information over communication channels in computer networks. In addition, packaging a group of files into one archive file significantly simplifies their transfer from one computer to another, reduces the time of copying files to disks, allows you to protect information from unauthorized access, and helps protect against infection by computer viruses.

The degree of file compression is characterized by the coefficient Kc, defined as the ratio of the volume of the compressed file Vc to the volume of the original file Vo, expressed as a percentage:

Kc = (Vc / Vo) * 100%
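The coefficient is easy to compute in practice. The sketch below uses Python's standard zlib module as a stand-in compressor (an assumption; any archiver could take its place), and the function name compression_coefficient is hypothetical.

```python
import zlib


def compression_coefficient(original: bytes, compressed: bytes) -> float:
    """Kc = (Vc / Vo) * 100%, where Vc is the compressed and Vo the original volume."""
    if not original:
        raise ValueError("the original volume Vo must be non-zero")
    return len(compressed) / len(original) * 100.0


original = b"abcabcabc" * 1000
compressed = zlib.compress(original)
kc = compression_coefficient(original, compressed)
print(f"Vo = {len(original)} bytes, Vc = {len(compressed)} bytes, Kc = {kc:.1f}%")
```

With this definition a smaller Kc means better compression: Kc = 25% says the packed file occupies a quarter of the original volume.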

The degree of compression depends on the program used, the compression method, and the type of source file. Graphic images, text files and data files compress best: for them Kc can be as low as 5-40%. Files of executable programs and load modules compress less well, with Kc of 60-90%. Archive files hardly compress at all. Archiving programs differ in the compression methods they use, which in turn affects the degree of compression.
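The dependence on the type of data can be observed with a small experiment. The sketch below is an illustration under stated assumptions: zlib stands in for an archiver, the "text" is generated from a small made-up vocabulary, and random bytes stand in for data without redundancy. Exact percentages will vary, so none are shown.

```python
import os
import random
import zlib


def kc(data: bytes) -> float:
    """Kc = (Vc / Vo) * 100% using zlib at its highest compression level."""
    return len(zlib.compress(data, 9)) / len(data) * 100.0


# Pseudo-text built from a small vocabulary (hypothetical sample data).
words = ["archive", "file", "volume", "disk", "compress", "unpack", "data", "program"]
rng = random.Random(42)
text = " ".join(rng.choices(words, k=5000)).encode("ascii")

random_like = os.urandom(len(text))    # data with no redundancy to remove
already_packed = zlib.compress(text, 9)  # plays the role of an archive file

print(f"text:           Kc = {kc(text):.1f}%")
print(f"random bytes:   Kc = {kc(random_like):.1f}%")
print(f"already packed: Kc = {kc(already_packed):.1f}%")
```

Redundant text shrinks considerably, while random bytes and already compressed data stay at roughly their original size, which matches the remark that archive files hardly compress at all.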

Archiving (packing) is placing (loading) source files into an archive file in compressed or uncompressed form.

Unzipping (unpacking) is the process of restoring files from an archive exactly as they were before they were loaded into the archive. When unpacking, files are extracted from the archive and placed on disk or in RAM.
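That unpacking restores files exactly can be checked directly. The following sketch packs two sample files into an in-memory ZIP archive with Python's standard zipfile module, unpacks them into memory, and compares the result byte for byte; the file names and contents are made up for the example.

```python
import io
import zipfile

original_files = {
    "report.txt": b"line of text\n" * 100,
    "data.bin": bytes(range(256)),
}

# Archiving: load the source files into an archive in compressed form.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, content in original_files.items():
        archive.writestr(name, content)

# Unpacking: extract every file from the archive into memory.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    first_bad_file = archive.testzip()  # None when every stored checksum matches
    restored = {name: archive.read(name) for name in archive.namelist()}

print(restored == original_files)  # True
```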

Programs that pack and unpack files are called archiver programs.

Large archive files can be placed on several disks (volumes). Such archives are called multi-volume. A volume is an integral part of a multi-volume archive. When creating an archive from several parts, you can record the parts onto several floppy disks.
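The idea of volumes can be sketched as cutting the archive's bytes into pieces of a fixed size. This is a simplified illustration only: real multi-volume formats add their own headers to each volume, and the functions split_into_volumes and join_volumes are hypothetical. The naming scheme (.RAR for the first volume, then .R00, .R01, ...) mimics the extensions described above for RAR.

```python
def split_into_volumes(
    archive: bytes, volume_size: int, base_name: str = "BACKUP"
) -> list[tuple[str, bytes]]:
    """Cut an archive into named volumes of at most volume_size bytes each."""
    if volume_size <= 0:
        raise ValueError("volume_size must be positive")
    volumes = []
    for index, start in enumerate(range(0, len(archive), volume_size)):
        extension = "RAR" if index == 0 else f"R{index - 1:02d}"
        volumes.append((f"{base_name}.{extension}", archive[start:start + volume_size]))
    return volumes


def join_volumes(volumes: list[tuple[str, bytes]]) -> bytes:
    """Reassemble the archive by concatenating the volumes in order."""
    return b"".join(chunk for _, chunk in volumes)


archive = bytes(range(256)) * 14  # 3584 bytes of sample data
volumes = split_into_volumes(archive, volume_size=1440)
for name, chunk in volumes:
    print(name, len(chunk))
```

With a volume size of 1440 bytes the 3584-byte sample yields three volumes, the last one shorter than the others.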

The most popular archive formats
ZIP has been one of the most popular and widespread archive formats since the days of DOS. It is based on compression algorithms proposed in the late 1970s by the Israeli mathematicians Lempel and Ziv. It offers an acceptable degree of compression and fairly high speed. Today it is a de facto standard on the Internet, and almost all archiving programs support it.
RAR was developed by the Russian programmer Evgeny Roshal. It produces compressed files that are considerably smaller than ZIP files, at the cost of longer archive processing. In general, the RAR format is much better suited than others to demanding tasks involving a large number of files and gigabytes of disk space.
ARJ is a somewhat outdated format, which is still perhaps distinguished by the widest customization options.
CAB is used in Microsoft products as a standard for packing files. Its algorithm, which the company has not published and keeps closely guarded, is fairly advanced and achieves a high compression ratio.
GZIP, TAR - most widespread on Unix-based systems, including Linux, the most popular of them.
ACE is a fairly new format with a high compression ratio that is gaining increasing popularity.
Many popular archivers are built around one format or another and carry a similar name. For example, the most popular archivers for Windows are WinRAR, WinZIP and WinACE. In addition, they all have tools for working with other archive formats. Despite this, compatibility problems between archive formats and programs may still arise. In many cases a good way around them is to create archives as self-extracting programs (EXE files), which include everything needed to extract the information from the archive and so remove the need to have a matching unpacking program on the computer.
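A program that works with several formats first has to recognize which one it has been given. The sketch below shows one way to do that with Python's standard library, which can read ZIP and TAR/GZIP archives; the function detect_archive_format and the sample file names are assumptions for this example, and formats such as RAR, ARJ, CAB or ACE are simply reported as unknown here.

```python
import io
import tarfile
import tempfile
import zipfile
from pathlib import Path


def detect_archive_format(path: Path) -> str:
    """Classify a file as 'zip', 'tar' (possibly compressed) or 'unknown'."""
    if zipfile.is_zipfile(str(path)):
        return "zip"
    if tarfile.is_tarfile(str(path)):
        return "tar"
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    payload = b"sample contents\n" * 10

    with zipfile.ZipFile(tmp_path / "sample.zip", "w") as zip_archive:
        zip_archive.writestr("sample.txt", payload)

    with tarfile.open(tmp_path / "sample.tar.gz", "w:gz") as tar_archive:
        member = tarfile.TarInfo("sample.txt")
        member.size = len(payload)
        tar_archive.addfile(member, io.BytesIO(payload))

    (tmp_path / "sample.txt").write_bytes(payload)

    results = {p.name: detect_archive_format(p) for p in sorted(tmp_path.iterdir())}

print(results)
```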
