(2010) with some additions and clarifications.

Journaling

Before we talk about file systems, let's take a quick look at the concept of journaling.

Journaling, in one form or another, is used in almost all modern file systems.

Journaling applies only to disk write operations and acts as a kind of buffer for them. This approach helps with the problems that arise when the computer shuts down in the middle of a write, for example because of a power outage. Without journaling, in such cases there is no way to find out which files were written completely, which were not written at all, and which were only partially written.

When journaling is used, the change is first written to the journal (or "log"). After that, the data is written to its place on the disk and then removed from the journal, at which point the write operation is considered complete. If the power is cut during a write, then after the system comes back up the file system can check the journal and find the unfinished operations.

The biggest problem with journaling is that it requires additional system resources. To reduce this overhead, journaled file systems usually do not write entire files to the journal, only certain metadata.
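To make the idea concrete, here is a minimal Python sketch of a write-ahead journal, assuming a single hypothetical journal file and whole-file writes (real file systems journal blocks and metadata, not whole files): the change is recorded in the journal first, then applied, and only then is the journal entry cleared.

```python
import json, os

JOURNAL = "journal.log"  # hypothetical journal file used by this sketch

def journaled_write(path, data):
    # 1. Record the intended change in the journal and force it to disk.
    entry = {"path": path, "data": data}
    with open(JOURNAL, "w") as j:
        json.dump(entry, j)
        j.flush()
        os.fsync(j.fileno())
    # 2. Apply the change to its real location.
    with open(path, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # 3. Only now is the operation complete: clear the journal entry.
    os.remove(JOURNAL)

def recover():
    # After a crash, a leftover journal entry means the write may be incomplete:
    # replay it so the target file ends up in a consistent state.
    if os.path.exists(JOURNAL):
        with open(JOURNAL) as j:
            entry = json.load(j)
        with open(entry["path"], "w") as f:
            f.write(entry["data"])
        os.remove(JOURNAL)
```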

Ext File Systems

Ext

Means " Extended"(extended) file system, and was the first to be designed specifically for Linux-systems There are currently 4 file systems in total. Ext. The very first of them is simply Ext— was a major update to the FS OS Minix.

Characteristics of Ext:

  • maximum file size: 2GB;
  • maximum partition size: 2GB;

The developer was Rémy Card, and the first version appeared in 1992.

We will not consider it further, since you will most likely never encounter it.

Ext2

Ext2 is a non-journaled file system released in 1993; its main goal was to support devices up to 2 terabytes in size. Because Ext2 has no journal, it performs far fewer writes to disk, which affects its performance and the situations where it makes sense.

Characteristics:

  • maximum file size: 16GB - 2TB;
  • maximum partition size: 2 - 32 TB;
  • The maximum name size is 255 characters.
  • due to the low number of data write-delete operations, it is ideal for various flash drives;
  • at the same time, modern SSDs have an improved life cycle (wear resistance of the storage cells) and other features that offset the disadvantages of Ext2 as a non-journaled FS.

Ext3

Ext3 appeared in 2001, along with the release of Linux kernel 2.4.15. It is essentially the same Ext2, but with journaling support. A main goal of Ext3 was backward compatibility with Ext2, so partitions could be converted without reformatting. Another advantage is that most of the testing, bug fixing, and tooling done for Ext2 applied to Ext3 as well, which made Ext3 a stable and fast file system.

Characteristics:

  • maximum file size: 16GB - 2TB (depending on block size);
  • maximum partition size: 2 - 32 TB (depending on block size);
  • a good choice if you already use Ext2 and want journaling;
  • due to its performance and stability, it is probably the most suitable FS for database servers;
  • maybe not the best choice for file servers, because it does not support file system snapshots, and recovering deleted files from it is difficult.

Ext4

Ext4, like Ext3, is backward compatible with the previous versions of the FS. You can mount Ext2 or Ext3 as Ext4 and, under certain conditions, get better performance. You can also mount Ext4 as Ext3 without side effects, as long as no Ext4-only features (such as extents) are in use.
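As an illustration only, with a hypothetical device and mount point, mounting an existing Ext2/Ext3 partition with the ext4 driver can be sketched like this from Python:

```python
import subprocess

device = "/dev/sdb1"      # hypothetical existing Ext2/Ext3 partition
mountpoint = "/mnt/data"  # hypothetical mount point; run as root

# The ext4 driver can mount the older on-disk formats as well.
subprocess.run(["mount", "-t", "ext4", device, mountpoint], check=True)
```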

The stable version of Ext4 was released in 2008. It is the first FS of the Ext family to use extents, which reduce file fragmentation and increase overall file system performance. In addition, Ext4 implements delayed allocation (allocate-on-flush), which further reduces fragmentation and lowers CPU load. On the other hand, although delayed allocation is used in many file systems, the complexity of its implementation increases the likelihood of data loss after a sudden power failure.

Characteristics:

  • maximum file size: 16 TB;
  • maximum file name size: 255 characters.
  • best choice for SSD;
  • best performance compared to the previous Ext file systems;
  • it is also an excellent file system for database servers, although it is younger than Ext3.

BtrFS

BtrFS was developed by Oracle in 2007. Its design is similar to ReiserFS; the main principle of its operation is copy-on-write. BtrFS allows you to dynamically allocate inodes, create snapshots of the FS while it is mounted, and perform transparent file compression and defragmentation during normal operation.

Although a stable version of BtrFS is not yet shipped by most Linux distributions (at the time of the original post, only SUSE and Oracle Linux), it may well replace Ext3/4 in the foreseeable future, and tools for converting Ext3/4 to BtrFS already exist. It is also worth mentioning that one of the Ext developers said that BtrFS "is a step into the future."
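The conversion mentioned above is done with btrfs-convert from the btrfs-progs package; a minimal sketch, assuming a hypothetical unmounted Ext4 partition:

```python
import subprocess

device = "/dev/sdb1"  # hypothetical unmounted Ext3/Ext4 partition; run as root

# Check the old file system first, then convert it in place to Btrfs.
subprocess.run(["fsck.ext4", "-f", device], check=True)
subprocess.run(["btrfs-convert", device], check=True)

# The conversion keeps the original image in the ext2_saved subvolume,
# so it can be rolled back with: btrfs-convert -r /dev/sdb1
```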

Characteristics:

  • maximum partition size: 16 EB;
  • maximum file name size: 255 characters.
  • thanks to its performance, snapshots, and other features, BtrFS is an excellent file system for servers;
  • Oracle is also developing a replacement for NFS and CIFS called CRFS, which is designed to improve performance for file storage on BtrFS;
  • performance tests have shown BtrFS lagging behind Ext4 on solid-state media such as SSDs and in operations on relatively small files.

ReiserFS

ReiserFS was introduced in 2001 and implemented many features that never made it into the Ext family. In 2004, Reiser4 was released to replace ReiserFS.

At the same time, development of Reiser4 progresses very slowly, and it still has only limited support in the Linux kernel. At present, only ReiserFS is included in the mainline kernel.

Characteristics:

  • maximum file size: 1 EB;
  • maximum partition size: 16 TB;
  • maximum file name size: 4032 bytes, but limited to 255 characters in practice;
  • excellent performance with small files such as logs, which makes it a good fit for database or mail servers;
  • ReiserFS volumes can easily be grown, but shrinking is not supported, and neither is encryption at the FS level;
  • the future of Reiser4 remains in question, and for now BtrFS looks like the preferred choice of the two.

ZFS

ZFS is worth mentioning here because it has features similar to BtrFS and ReiserFS; it was developed by Sun Microsystems (now part of Oracle). It also became quite famous after Apple announced its intention to use it as the default FS. The first release of ZFS took place in 2005.

Due to licensing restrictions, ZFS cannot be included in the Linux kernel; however, it can be supported through FUSE (Filesystem in Userspace).

Characteristics:

  • maximum file size: 16 EB;
  • maximum partition size: 256 ZiB (zebibytes);
  • maximum file name size: 255 bytes;
  • shows excellent performance when working with large disk arrays;
  • supports combining disks into arrays, creating file system snapshots, and layered handling of data;
  • there may be difficulties installing and using it on Linux systems because of the need to use FUSE.

Swap

Swap is not a file system at all. A swap file or partition is used by the kernel's virtual memory subsystem and has no file system structure. You cannot mount it and read data from it, because swap is used exclusively by the Linux kernel for writing out memory pages. Usually swap comes into play only when the OS runs out of free RAM and "dumps" part of the data from memory to swap in order to free it.
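A minimal sketch of creating and enabling a swap file on a typical ext4 root (the path is hypothetical, and fallocate-created swap files do not work on every file system):

```python
import subprocess

swapfile = "/swapfile"  # hypothetical path; run as root

subprocess.run(["fallocate", "-l", "1G", swapfile], check=True)  # reserve 1 GiB
subprocess.run(["chmod", "600", swapfile], check=True)           # restrict access
subprocess.run(["mkswap", swapfile], check=True)                 # write the swap signature
subprocess.run(["swapon", swapfile], check=True)                 # hand it to the kernel

# "swapon --show" or "free -h" will now list the new swap area.
```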

To be honest, many people wonder which file system is best for their computer. Windows and macOS users have little choice: only one standard file system is available to them, NTFS and HFS+ respectively. On Linux things are different: there are many file systems available, to suit every taste. Ext4 is widely used in Linux, but there are several reasons to try something new, such as Btrfs or XFS. But are they really better? Let's first look at the most popular file systems and how they work; in other words, let's make a small comparison.

If you are unfamiliar with the basics of file systems, I will say a few words about them so that you can better understand the difference between Btrfs, ext4, and XFS. File systems control how data is written to disk and how it is accessed, and they also store information and metadata about files. This is not easy to implement, but file systems are constantly improving: new functionality appears all the time, and they become more efficient.

Why are partitions needed?

Many users have only a vague idea of why disk partitions are needed. All operating systems support creating and deleting partitions, and Linux uses more than one partition on a disk even with the standard installation procedure. One of the main purposes of partitioning is to limit the damage caused by errors.

When a hard drive is split into partitions, data can be grouped and separated. If errors occur, only the data on the damaged partition is lost; data on all the other partitions will most likely remain safe and sound. This was especially important when journaled file systems did not yet exist in Linux and any unexpected power loss could lead to disaster.

The increased security and reliability of partitions means that if one part of the operating system is damaged, data on the other partitions is still available. Right now this is the most important reason to use partitions. For example, users may run scripts or programs that fill up disk space. If the disk contains only one large partition, the system stops working entirely when the free space runs out. But if data is stored on different partitions, the overflow affects only one of them, while the system and the other partitions keep functioning normally.

Remember that a journaled file system only protects against damage caused by power failures and unexpected removal of storage devices. But it will not protect you from broken blocks and logical errors in the file system. In such cases, you need to use a multi-disk array (RAID).

Why choose a different file system?

The EXT4 file system is an improved version of EXT3, which in turn is a reworked EXT2. EXT4 is a very stable file system and has been the default choice in most Linux distributions over the past few years, but its code base is already quite old. Also, Linux users want features that EXT4 lacks but other file systems, such as Btrfs and XFS, provide. There is software that implements some of these features, but support at the file system level is much faster. Next, we will briefly look at each of the candidates so that you can choose which file system, Btrfs or ext4, is best for you.

Ext4 file system

Ext4 has limits that even today look generous. The maximum file size is 16 terabytes (with the default block size), which is much more than the hard drive capacity available to the average buyer. At the same time, the largest partition that can be created with ext4 is 1 exabyte, roughly a million terabytes. Ext4 is faster than EXT3. Like all modern file systems it is journaled, meaning it keeps a log of where files are located on disk and records pending changes there. Despite all these features, it does not support transparent compression, data deduplication, or transparent encryption. Snapshots are technically possible, but only as an experimental feature.

Btrfs file system

Btrfs is a file system designed from scratch. It exists because its developers wanted to extend file system functionality to include snapshots, pooling, checksums, transparent compression, and much more. Btrfs is not derived from Ext4, but it borrows its best ideas and adds features that will be very useful to users, and especially to businesses. For businesses running serious workloads with very large databases, a single file system spanning multiple drives is very useful. Data deduplication reduces the actual disk space consumed, and data mirroring becomes much easier with Btrfs.

Users can, of course, still create multiple partitions rather than mirror data across different drives. Since Btrfs can span several hard drives, it supports 16 times more disk space than Ext4: the maximum partition size in Btrfs is 16 exabytes, and the maximum file size is the same. In this respect the EXT4 vs Btrfs comparison goes to the latter.

XFS file system

XFS stands for eXtended File System. It is a high-performance, 64-bit, journaling file system. XFS support was added to the Linux kernel in 2002, and in 2009 it became available in Red Hat Enterprise Linux 5.4. The maximum file size in this file system is eight exabytes. But XFS has some limitations: a partition with this file system cannot be shrunk, and performance suffers when working with a large number of files. RHEL 7.0 now uses XFS as the default file system.

Conclusions

Unfortunately, the release date of a final, stable Btrfs is unknown; officially, this next-generation file system is still classified as unstable. However, if you install the latest version of Ubuntu, the installer will offer Btrfs as an option for the main file system. It is unknown when Btrfs will become stable, but Ubuntu will not use it as the default file system until it is considered fully stable.

Currently, Btrfs is used as the default root file system in openSUSE. As you can see, the developers still have a huge amount of work to do: not all features have been implemented yet, and in the Ext4 vs Btrfs comparison it still lags behind in performance.

So which is better to use? So far Ext4 remains the winner, even though the performance is nearly identical. Why? The answer is convenience and popularity. Ext4 is still a great file system for workstations and desktop computers. It comes by default, so the user gets it simply by installing the OS. In addition, Ext4 supports partitions up to 1 exabyte and files up to 16 terabytes, which is still plenty.

Btrfs offers large volumes of up to 16 exabytes for both partitions and files, as well as increased fault tolerance. But it is still positioned as an add-on rather than being fully integrated into the operating system. For example, to format a partition as Btrfs, the Btrfs toolkit (btrfs-progs) must be installed.
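A minimal sketch of that last point (the device name is hypothetical, and the commands must be run as root):

```python
import subprocess

device = "/dev/sdc1"  # hypothetical empty partition

# mkfs.btrfs comes from the separate btrfs-progs package and is not always preinstalled.
subprocess.run(["mkfs.btrfs", "-L", "data", device], check=True)

# Ext4, by contrast, can be created with tools present on virtually any distribution:
# subprocess.run(["mkfs.ext4", device], check=True)
```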

Even if raw transfer speed is not very important, there is another characteristic: the speed of working with files. Btrfs has many useful functions: copy-on-write, checksums, snapshots, scrubbing, data self-healing, deduplication, and other interesting improvements that help keep data safe. What it lacks is an equivalent of ZFS's RAID-Z, so RAID on Btrfs is still at an experimental stage. For plain data storage Btrfs is better suited than Ext4, but time will tell how it works out in practice. Whether to use Btrfs or ext4 is largely a matter of taste.

For now, Ext4 is the best choice for general users, since it ships as the default file system and is also faster than Btrfs when transferring files. Btrfs is certainly worth a try, but completely replacing ext4 with it is premature; that may happen in a few years. The funny thing is that the same was said a few years ago; a lot has changed since then, but Btrfs is still not considered stable.

If you have a different opinion on this, please leave a comment!

By the way, if you use Windows and Linux on the same machine, you may be interested in my article.

Hello, readers of my website. I want to tell you about existing and new file systems, and also help you choose the right one. After all, your choice affects speed and comfort, and even your nerves: when the computer freezes and slows down, I don't think you enjoy it. :)

What is a file system and what is it for?

Simply put, a file system is what stores files and folders on a hard drive or other media: a flash drive, a phone, a camera, and so on. It also organizes files and folders: moving, copying, and renaming them. This system is responsible for all your files, which is why it is so important.

If you choose the wrong file system, your computer may work incorrectly, freeze, or crash, data may move slowly, and, even worse, files may become corrupted. It is fortunate if the corrupted data is not system data; otherwise more serious problems appear. And most importantly, if your computer slows down for this reason, no amount of junk cleaning will help!

Types of file systems

Many file systems are a thing of the past, and some are on their last legs, because modern technology advances every day; a completely new file system is already on the way, and it may be the future! Let's see where it all began.

FAT12

FAT stands for File Allocation Table. The first version of the file system was 12-bit and supported at most 4096 clusters. It was developed a long time ago, back in the DOS days, and was used for floppy disks and small drives with a capacity of up to 16 MB. It was later replaced by the more advanced FAT16.

FAT16

This file system already supported 65,525 clusters and disks of up to 4.2 GB in size; at the time this was a luxury, and for that reason it did its job well. But a file could not exceed 2 GB, and it was not the most economical option: the larger the volume, the larger each cluster, and the more space small files waste. For that reason it was already wasteful to use it on volumes larger than 512 MB.
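A rough sketch of why large FAT16 volumes waste space: with at most 65,525 clusters, the cluster size has to grow with the volume, so even a one-byte file occupies a whole cluster. The numbers below are illustrative arithmetic, not the exact tables used by format utilities.

```python
MAX_CLUSTERS = 65_525  # FAT16 can address at most this many clusters

def fat16_cluster_size(volume_bytes):
    """Smallest power-of-two cluster size that fits the volume into 65,525 clusters."""
    size = 512
    while volume_bytes / size > MAX_CLUSTERS:
        size *= 2
    return size

for mb in (100, 500, 1000, 2000):
    cluster = fat16_cluster_size(mb * 1024 * 1024)
    print(f"{mb:5} MB volume -> {cluster // 1024:2} KiB clusters; "
          f"a 1-byte file still occupies {cluster // 1024} KiB")
```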

Although the system worked well at that time, a number of shortcomings later appeared:

1. You cannot work with hard drives larger than 8 GB.

2. You cannot create files larger than 2 GB.

3. The root folder cannot contain more than 512 items.

4. Inability to work with disk partitions larger than 2 GB.

FAT32

Technology does not stand still, and over time FAT16 became insufficient; FAT32 arrived as its replacement. This system could already support disks up to 2 terabytes (2048 gigabytes) in size and used disk space more economically thanks to smaller clusters. Another advantage is that the limit on the number of entries in the root folder is gone, and it is more reliable than the previous versions. But the biggest disadvantage today is that files can still become corrupted (fortunately, this does not usually have worse consequences). The second major disadvantage is that files now often exceed 4 GB, and the system does not support a single file of a larger size. Hence the frequent question of why a 7 GB movie will not copy even though there are 100 GB free on the disk; that is the whole problem.

So there are plenty of cons here too:

1. The system does not support files larger than 4 GB.

2. The system is susceptible to file fragmentation, which causes the system to slow down.

3. Susceptible to file corruption.

4. There are already disks larger than 2 TB on the market.

NTFS

And then NTFS (New Technology File System) arrived as a replacement, eliminating a number of shortcomings, although it has plenty of drawbacks of its own. It is the most recent established system, not counting the new one I will talk about below. NTFS appeared back in the 90s, became standard in 2001 with the release of Windows XP, and is still in use today. It supports disks up to 18 TB in size, which is impressive, and when files become fragmented the loss of speed is not as noticeable. Security has reached a good level: in the event of a failure, data corruption is unlikely.

There are downsides here too:

1. RAM consumption: if you have less than 64 MB of RAM, installing it is not recommended.

2. When there is 10% of free space left on the hard drive, the system begins to noticeably slow down.

3. Working with a small storage capacity can be difficult.

New ReFS

The brand new ReFS (Resilient File System), which translates as a fault-tolerant file system, was developed for new versions of Windows, and it may well be the future! According to the developers, the system should be extremely reliable and, once polished, will become more widely supported.

The new system supports larger disk volumes and more characters in paths and file names. It also promises to be more secure, with a minimum of failures thanks to its new architecture and a different way of keeping its log. For now only the pros are visible, but how true this is remains to be seen; the cons will become clear only after wider adoption. For now it remains a mystery. Let's hope the new file system brings us only positive experiences.

Which file system should you choose?

On a reasonably capable computer it is better to use NTFS: it will be more productive and safer. It is not recommended on machines with a hard drive smaller than 32 GB and 64 MB of RAM. Good old FAT32 can still be used on small flash drives, where it may even perform better. One more thing: if you format a flash drive for a phone, digital camera, or other electronic device as NTFS, you may run into errors, because some devices do not support NTFS, or handle it slowly and crash. So before formatting, check which file system is best for your device.

There are other file systems, for example for Linux: XFS, ReiserFS (Reiser3), JFS (Journaled File System), ext (extended filesystem), ext2 (second extended file system), ext3 (third extended filesystem), Reiser4, ext4, Btrfs (B-tree FS or Butter FS), Tux2, Tux3, Xiafs, ZFS (Zettabyte File System), but that's a completely different story...

For a personal computer to work well, many services and programs must be installed; probably everyone knows this.

First, let's understand what a file system is.

A file system that provides access to information on disk even when several processes run at the same time is an indispensable element of the operating system. File systems provide convenient access to the data stored on disk while preventing all sorts of inconsistent situations.

In addition, the file system offers varied ways of working with information, from sorting and moving it to deleting it. A natural question follows: which file system is better? Is it easy to use? And what guarantees does it give?

What are the most popular file systems out there?
The most popular file system, the one that ranks first, is FAT. But it has a clear drawback: the limited number of characters allowed in a file name, which greatly reduces the convenience of managing data.

This flaw was corrected in the later NTFS system. Because all operating systems keep their information on disks, file systems are irreplaceable and must perform their functions efficiently, without failures. What can be said about NTFS? This file system has exactly the properties needed, such as self-recovery after all kinds of error conditions.

Now let's talk about each file system separately.

NTFS file system
This file system has a very important property: when data changes, the operation either completes successfully or is canceled entirely, which keeps the stored information free of confusion. It also has a rather useful option, file compression, which can be applied to individual files without affecting how you work with the data or its quality.

Most experts believe that the most secure file system is NTFS. And all thanks to the fact that it contains a large number of tools that are aimed at delineating the rights of objects.

FAT file system
The FAT system satisfied the needs of operating systems at an earlier stage of development. But when large amounts of storage became available, this file system lost ground to more modern systems because of its limitations. The FAT file system is well suited to slow drives and works well with small directories; unfortunately, it cannot handle large files.

Before choosing a file system, you must decide what tasks you will assign to it. If you plan to work with large disks that are heavily filled with data and need high speed, then the file system we talked about earlier, NTFS, is the better fit.

Why may a smartphone not launch programs from a memory card? How is ext4 fundamentally different from ext3? Why will a flash drive last longer if you format it in NTFS rather than FAT? What is the main problem with F2FS? The answers lie in the structural features of file systems. We'll talk about them.

Introduction

File systems define how data is stored. They determine what limitations the user will encounter, how fast read and write operations will be, and how long the drive will work without failures. This is especially true for budget SSDs and their younger brothers - flash drives. Knowing these features, you can get the most out of any system and optimize its use for specific tasks.

You have to choose the type and parameters of a file system every time you need to do something non-trivial. For example, say you want to speed up the most common file operations. This can be achieved at the file system level in different ways: indexing provides fast search, pre-reserving free blocks makes it easier to rewrite frequently changing files, and staging data in RAM reduces the number of required I/O operations.

Such properties of modern file systems as lazy writing, deduplication and other advanced algorithms help to increase the period of trouble-free operation. They are especially relevant for cheap SSDs with TLC memory chips, flash drives and memory cards.

There are separate optimizations for different levels of disk arrays: for example, the file system can support simplified volume mirroring, instant snapshots, or dynamic scaling without taking the volume offline.

Black box

Users mostly work with whatever file system the operating system offers by default. They rarely create new disk partitions and even more rarely think about their settings; they simply use the recommended parameters or even buy pre-formatted media.

For Windows fans, everything is simple: NTFS on all disk partitions and FAT32 (or the same NTFS) on flash drives. If there is a NAS that uses some other file system, for most people it stays beyond perception: they simply connect to it over the network and download files, as if from a black box.

On mobile gadgets running Android, ext4 is most often found in internal memory and FAT32 on microSD cards. Apple users do not care at all what file system they have: HFS+, HFSX, APFS, WTFS... to them there are only beautiful folder and file icons drawn by the best designers. Linux users have the richest choice, but support for non-native file systems can be added to both Windows and macOS - more on that later.

Common roots

Over a hundred different file systems have been created, but little more than a dozen can be considered current. Although they were all developed for their own specific applications, many ended up related on a conceptual level. They are similar because they use the same type of (meta)data structure: B-trees.

Like any hierarchical system, a B-tree begins with a root record and then branches down to leaf elements, the individual records of files and their attributes, or "leaves." The main point of this logical structure is to speed up the search for file system objects on large dynamic arrays, such as hard drives with a capacity of several terabytes or even more impressive RAID arrays.

B-trees require far fewer disk accesses than other types of balanced trees to perform the same operations. This is because the leaf objects in a B-tree all sit at the same depth, and the speed of every operation is directly proportional to the height of the tree.

Like other balanced trees, B-trees have equal path lengths from the root to any leaf. Instead of growing upward, they branch more strongly and grow wider: all branch points in a B-tree store many references to child objects, making them easy to find in fewer calls. A large number of pointers reduces the number of the most time-consuming disk operations - head positioning when reading arbitrary blocks.
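A toy sketch of why wide nodes mean few disk accesses: each node holds many keys, so a lookup descends only a handful of levels, and every level would cost roughly one disk read. This is a generic in-memory B-tree-style search, not the on-disk layout of any particular file system.

```python
class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys                # sorted separator keys stored in this node
        self.children = children or []  # child nodes (empty for a leaf)
        self.values = values or {}      # leaf payload: name -> file record

def search(node, key, depth=0):
    """Descend from the root; each level corresponds to one disk access."""
    if not node.children:               # a leaf: all leaves sit at the same depth
        return node.values.get(key), depth
    for i, sep in enumerate(node.keys): # pick the child covering the requested key
        if key < sep:
            return search(node.children[i], key, depth + 1)
    return search(node.children[-1], key, depth + 1)

# A tiny two-level tree: the root fans out to wide leaves.
leaf1 = Node([], values={"a.txt": "record 11", "b.txt": "record 12"})
leaf2 = Node([], values={"m.txt": "record 27", "z.txt": "record 42"})
root = Node(keys=["m.txt"], children=[leaf1, leaf2])

print(search(root, "z.txt"))  # ('record 42', 1): found after a single descent
```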

The concept of B-trees was formulated back in the seventies and has undergone various improvements since. In one form or another it is implemented in NTFS, BFS, XFS, JFS, ReiserFS, and many DBMSs. All of them are relatives in terms of the basic principles of data organization; the differences lie in the details, which are often quite important. These related file systems also share a common disadvantage: they were all created to work with disks, before the advent of SSDs.

Flash memory as the engine of progress

Solid-state drives are gradually replacing disk drives, but for now they are forced to use file systems inherited from disks. SSDs are built on flash memory arrays, whose operating principles differ from those of disk devices. In particular, flash memory must be erased before it is written, and NAND chips cannot do this at the level of individual cells; it is only possible for large blocks as a whole.

This limitation comes from the fact that in NAND memory all cells are combined into blocks, each of which has only one common connection to the control bus. We will not go into the details of page organization or describe the full hierarchy. What matters is the principle of group operations on cells and the fact that flash memory blocks are usually larger than the blocks addressed by any file system. Therefore, all addresses and commands for drives with NAND flash must be translated through an abstraction layer, the FTL (Flash Translation Layer).

Compatibility with the logic of disk devices and support for commands of their native interfaces is provided by flash memory controllers. Typically, FTL is implemented in their firmware, but can (partially) be implemented on the host - for example, Plextor writes drivers for its SSDs that accelerate writing.

It is impossible to do without FTL, since even writing one bit to a specific cell triggers a whole series of operations: the controller finds the block containing the desired cell; the block is read completely, written to the cache or to free space, then erased entirely, after which it is rewritten back with the necessary changes.
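A simplified simulation of what the controller has to do for a single small write, assuming a toy erase block of just four cells (real blocks hold many kilobytes): read the whole block, erase it, and rewrite it with the change.

```python
ERASED = 0xFF

class FlashBlock:
    """A toy erase block: cells can only be written after the whole block is erased."""
    def __init__(self, cells=4):
        self.cells = [ERASED] * cells

    def erase(self):
        self.cells = [ERASED] * len(self.cells)

def write_one_byte(block, offset, value):
    cache = list(block.cells)  # 1. read the entire block into the controller cache
    cache[offset] = value      # 2. apply the change to the cached copy
    block.erase()              # 3. erase the block as a whole
    block.cells = cache        # 4. rewrite it back with the modification

blk = FlashBlock()
write_one_byte(blk, 2, 0x41)
print(blk.cells)  # [255, 255, 65, 255]: one byte changed, the whole block rewritten
```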

This approach is reminiscent of army life: to give an order to one soldier, the sergeant assembles the whole unit, calls the poor fellow out of formation, and commands the rest to disperse. In the now rare NOR memory the organization was more like special forces: each cell was controlled independently (each transistor had an individual contact).

The tasks for controllers are increasing, since with each generation of flash memory the technical process of its production decreases in order to increase density and reduce the cost of data storage. Along with technological standards, the estimated service life of chips is also decreasing.

Modules with single-level SLC cells had a declared resource of 100 thousand rewrite cycles and even more. Many of them still work in old flash drives and CF cards. For enterprise-class MLC (eMLC), the resource was declared in the range of 10 to 20 thousand, while for regular consumer-grade MLC it is estimated at 3-5 thousand. Memory of this type is actively being squeezed by even cheaper TLC, whose resource barely reaches a thousand cycles. Keeping the lifespan of flash memory at an acceptable level requires software tricks, and new file systems are becoming one of them.

Initially, the manufacturers assumed that the file system was unimportant. The controller itself must service a short-lived array of memory cells of any type, distributing the load between them in an optimal way. For the file system driver, it simulates a regular disk, and itself performs low-level optimizations on any access. However, in practice, optimization varies from device to device, from magical to bogus.

In corporate SSDs the built-in controller is a small computer. It has a huge memory buffer (half a gigabyte or more) and supports many data handling techniques to avoid unnecessary rewrite cycles. The chip keeps track of all blocks in its cache, performs lazy writes and on-the-fly deduplication, reserves some blocks, and clears others in the background. All this magic happens completely unnoticed by the OS, programs, and the user. With an SSD like this, it really doesn't matter which file system is used: internal optimizations have a much greater impact on performance and lifespan than external ones.

Budget SSDs (and even more so flash drives) have much less intelligent controllers. Their cache is limited or absent, and advanced server technologies are not used at all. The controllers in memory cards are so primitive that it is often claimed they do not exist at all. Therefore, for cheap flash devices, external methods of load balancing remain relevant, above all specialized file systems.

From JFFS to F2FS

One of the first attempts to write a file system that would take the organization of flash memory into account was JFFS, the Journaling Flash File System. This development by the Swedish company Axis Communications was initially aimed at increasing the memory efficiency of the network devices Axis produced in the nineties. The first version of JFFS supported only NOR memory, but the second version learned to work with NAND as well.

Currently JFFS2 has limited use. Basically it is still used in Linux distributions for embedded systems. It can be found in routers, IP cameras, NAS and other regulars of the Internet of Things. In general, wherever a small amount of reliable memory is required.

A further attempt to develop JFFS2 was LogFS, which stored index descriptors in a separate file. The authors of the idea are Jorn Engel, an employee of the German division of IBM, and Robert Mertens, a lecturer at the University of Osnabrück. The LogFS source code is available on GitHub. Judging by the fact that the last change was made four years ago, LogFS never gained popularity.

But these attempts spurred the emergence of another specialized file system - F2FS. It was developed by Samsung Corporation, which accounts for a considerable part of the flash memory produced in the world. Samsung makes NAND Flash chips for its own devices and for other companies, and also develops SSDs with fundamentally new interfaces instead of legacy disk ones. The creation of a specialized file system optimized for flash memory was a long overdue need from Samsung's point of view.

Four years ago, in 2012, Samsung created F2FS (Flash Friendly File System). The idea was good, but the implementation turned out to be crude. The key task when creating F2FS was simple: to reduce the number of cell rewrite operations and distribute the load on them as evenly as possible. This requires performing operations on multiple cells within the same block at the same time rather than forcing them one at a time. That means not instantly rewriting existing blocks at the OS's first request, but caching commands and data, adding new blocks to free space, and deferring the erasure of cells.

Today, F2FS support is officially implemented in Linux (and therefore in Android), but in practice it does not yet provide any special advantages. The main feature of this file system (lazy rewriting) led to premature conclusions about its effectiveness. The old caching trick even fooled early versions of benchmarks, where F2FS showed an imaginary advantage not of a few percent (as expected), and not even of several times, but of orders of magnitude: the F2FS driver simply reported completion of an operation that the controller was only planning to perform. However, even if the real performance gain of F2FS is small, the wear on the cells will definitely be lower than with ext4: the optimizations a cheap controller cannot do are performed at the level of the file system itself.

Extents and bitmaps

For now, F2FS is perceived as an exotic option for geeks; even Samsung's own smartphones still use ext4. Many consider ext4 a further development of ext3, but that is not entirely true: it is more of a revolution than a matter of breaking the 2 TB per file barrier and simply raising other quantitative limits.

When computers were large and files were small, addressing was not a problem. Each file was allocated a certain number of blocks, whose addresses were entered into a mapping table. This is how the ext3 file system works, and it remains in service to this day. But in ext4 a fundamentally different addressing method appeared: extents.


An extent can be thought of as an extension of the inode: a contiguous set of blocks that is addressed as a whole. One extent can hold an entire medium-sized file, and for large files a dozen or two extents are enough. This is much more efficient than addressing hundreds of thousands of small four-kilobyte blocks.

The write mechanism itself has also changed in ext4. Blocks are now allocated in a single request, and not in advance, but immediately before the data is written to disk. Delayed multi-block allocation gets rid of the unnecessary operations that ext3 was guilty of: there, blocks for a new file were allocated immediately, even if the file fit entirely in the cache and was about to be deleted as a temporary one.
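The difference between block maps and extents can be sketched like this: ext3-style addressing stores one pointer per 4 KiB block, while an extent describes a whole contiguous run with three numbers. The structures below are illustrative, not the real on-disk format.

```python
BLOCK = 4096  # bytes per block

def block_map(file_size, first_block=1000):
    """ext3-style: one table entry for every 4 KiB block of the file."""
    nblocks = -(-file_size // BLOCK)      # ceiling division
    return [first_block + i for i in range(nblocks)]

def extent_map(file_size, first_block=1000):
    """ext4-style: (logical start, physical start, length) covers a whole run."""
    nblocks = -(-file_size // BLOCK)
    return [(0, first_block, nblocks)]    # a single extent if the file is contiguous

size = 100 * 1024 * 1024                  # a 100 MiB file
print(len(block_map(size)))               # 25600 separate block pointers
print(extent_map(size))                   # [(0, 1000, 25600)]: one extent
```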


FAT Restricted Diet

In addition to balanced trees and their modifications, there are other popular logical structures. There are file systems with a fundamentally different type of organization - for example, linear. You probably use at least one of them often.

A riddle

Here is a riddle: at twelve she began to gain weight, by sixteen she was a dim-witted fatty, by thirty-two she had grown fat, yet she remained a simpleton. Who is she?

That's right, this is the story of the FAT file system. Compatibility requirements gave it a bad heredity: on floppy disks it was 12-bit, on hard drives it started out 16-bit, and it has reached our days as 32-bit. In each subsequent version the number of addressable blocks increased, but in essence nothing changed.

The still popular FAT32 file system appeared twenty years ago. Today it is still primitive and does not support access control lists, disk quotas, background compression, or other modern data optimization technologies.

Why is FAT32 needed these days? Everything is still solely to ensure compatibility. Manufacturers rightly believe that a FAT32 partition can be read by any OS. That's why they create it on external hard disks, USB Flash and memory cards.

How to free up your smartphone's flash memory

microSD(HC) cards used in smartphones are formatted in FAT32 by default. This is the main obstacle to installing applications on them and transferring data from internal memory. To overcome it, you need to create a partition on the card with ext3 or ext4. All file attributes (including owner and access rights) can be transferred to it, so any application can work as if it were launched from internal memory.

Windows does not know how to create more than one partition on flash drives, but for this you can run Linux (at least in a virtual machine) or an advanced utility for working with logical partitioning - for example, MiniTool Partition Wizard Free. Having discovered an additional primary partition with ext3/ext4 on the card, the Link2SD application and similar ones will offer many more options than in the case of a single FAT32 partition.


Another argument often cited in favor of FAT32 is its lack of journaling, which supposedly means faster write operations and less wear on NAND flash cells. In practice, using FAT32 leads to the opposite and gives rise to many other problems.

Flash drives and memory cards die quickly because any change in FAT32 causes the same sectors to be overwritten: the ones holding the two chains of file allocation tables. Save a whole web page, and those sectors are overwritten a hundred times, once for each small GIF added to the flash drive. Launched some portable software? It creates temporary files and constantly changes them while running. Therefore, it is much better to use NTFS on flash drives, with its failure-resistant $MFT table. Small files can be stored directly in the main file table, while its extensions and copies are written to different areas of the flash memory. In addition, thanks to indexing, searches on NTFS are faster.

Another problem most users face is that it is impossible to write a file larger than 4 GB to a FAT32 partition. The reason is that in FAT32 the file size is described by 32 bits in the file allocation table, and 2^32 minus one byte is just short of four gigabytes. It turns out that neither a movie in decent quality nor a DVD image can be written to a freshly purchased flash drive.
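The arithmetic behind the limit, as a quick sanity check:

```python
FAT32_MAX_FILE = 2**32 - 1        # largest value a 32-bit size field can hold

print(FAT32_MAX_FILE)             # 4294967295 bytes
print(FAT32_MAX_FILE / 2**30)     # ~4.0 GiB, one byte short

dvd_image = int(4.4e9)            # a typical single-layer DVD image, ~4.4 GB
print(dvd_image > FAT32_MAX_FILE) # True: it simply does not fit on FAT32
```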

Copying large files is not even the worst case: when you try, the error is at least immediately visible. In other situations FAT32 acts like a time bomb. For example, you copy portable software onto a flash drive and at first use it without problems. After a while, the database of one of the programs (say, an accounting or email client) becomes bloated and... it simply stops updating. The file cannot be rewritten because it has hit the 4 GB limit.

A less obvious problem is that in FAT32 the creation date of a file or directory can be specified only to within two seconds. That is not enough for many cryptographic applications that rely on timestamps. The low precision of the date attribute is another reason FAT32 is not considered a valid file system from a security perspective. However, its weaknesses can also be turned to your advantage: for example, if you copy files from an NTFS partition to a FAT32 volume, they will be stripped of all metadata as well as of inherited and specially set permissions. FAT simply does not support them.

exFAT

Unlike FAT12/16/32, exFAT was developed specifically for USB Flash and large (≥ 32 GB) memory cards. Extended FAT eliminates the above-mentioned disadvantage of FAT32 - overwriting the same sectors with any change. As a 64-bit system, it has no practically significant limits on the size of a single file. Theoretically, it can be 2^64 bytes (16 EB) in length, and cards of this size will not appear soon.

Another fundamental difference of exFAT is its support for access control lists (ACLs). This is no longer the simpleton of the nineties, but the closed nature of the format hinders exFAT's adoption. exFAT support is fully and legally implemented only in Windows (starting with XP SP2) and OS X (starting with 10.6.5). On Linux and *BSD it is supported either with restrictions or not entirely legally: Microsoft requires licensing for the use of exFAT, and there is plenty of legal controversy in this area.

Btrfs

Another prominent representative of file systems based on B-trees is called Btrfs. This FS appeared in 2007 and was initially created in Oracle with an eye to working with SSDs and RAIDs. For example, it can be dynamically scaled: creating new inodes directly on the running system or dividing a volume into subvolumes without allocating free space to them.

The copy-on-write mechanism implemented in Btrfs and its tight integration with the Device Mapper kernel module make it possible to take almost instantaneous snapshots through virtual block devices. Transparent compression (zlib or lzo) and deduplication speed up basic operations while also extending the lifetime of flash memory. This is especially noticeable when working with databases (compression of 2-4x is achieved) and small files (which are written in orderly large blocks and can be stored directly in the "leaves").

Btrfs also supports full logging mode (data and metadata), volume checking without unmounting, and many other modern features. The Btrfs code is published under the GPL license. This file system has been supported as stable in Linux since kernel version 4.3.1.
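A hedged sketch of the features mentioned above, using the standard btrfs-progs commands (the mount point and subvolume names are made up):

```python
import subprocess

mnt = "/mnt/btrfs"  # hypothetical mounted Btrfs volume; run as root

# Create a subvolume and take an almost instantaneous copy-on-write snapshot of it.
subprocess.run(["btrfs", "subvolume", "create", f"{mnt}/data"], check=True)
subprocess.run(["btrfs", "subvolume", "snapshot", f"{mnt}/data",
                f"{mnt}/data-snap"], check=True)

# Check the volume without unmounting it: scrub re-reads data and verifies checksums.
subprocess.run(["btrfs", "scrub", "start", mnt], check=True)

# Transparent compression is enabled with a mount option, for example:
#   mount -o compress=lzo /dev/sdb1 /mnt/btrfs
```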

Flight logs

Almost all more or less modern file systems (ext3/ext4, NTFS, HFSX, Btrfs, and others) belong to the general group of journaled ones: they record pending changes in a separate log (journal) and check against it after a failure during disk operations. However, the journaling granularity and fault tolerance of these file systems differ.

Ext3 supports three journaling modes: writeback, ordered, and full journaling. In writeback mode, only metadata changes are logged, and they are written asynchronously with respect to changes in the data itself. In ordered mode, metadata is again the only thing journaled, but the data blocks are written to disk strictly before the corresponding metadata is committed. The third mode is full journaling of both metadata and the file contents themselves.

Only the last option ensures data integrity. The remaining two only speed up the detection of errors during scanning and guarantee restoration of the integrity of the file system itself, but not the contents of the files.
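The mode is chosen with the data= mount option; a minimal sketch, with a hypothetical device and mount point:

```python
import subprocess

device, mountpoint = "/dev/sdb1", "/mnt/data"  # hypothetical; run as root

# data=writeback : metadata only, fastest, weakest guarantees
# data=ordered   : metadata only, but data blocks hit the disk first (the default)
# data=journal   : both data and metadata go through the journal, safest and slowest
subprocess.run(["mount", "-t", "ext3", "-o", "data=journal",
                device, mountpoint], check=True)
```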

Journaling in NTFS is similar to the second logging mode in ext3. Only changes in metadata are recorded in the log, and the data itself may be lost in the event of a failure. This logging method in NTFS was not intended as a way to achieve maximum reliability, but only as a compromise between performance and fault tolerance. This is why people who are used to working with fully journaled systems consider NTFS pseudo-journaling.

The approach implemented in NTFS is in some ways even better than the default in ext3. NTFS additionally periodically creates checkpoints to ensure that all previously deferred disk operations are completed. Checkpoints have nothing in common with restore points in System Volume Information. These are just service log entries.

Practice shows that such partial NTFS journaling is in most cases sufficient for trouble-free operation. After all, even with a sudden power outage, disk devices do not lose power instantly. The power supply and numerous capacitors in the drives themselves provide just the minimum amount of energy that is enough to complete the current write operation. With modern SSDs, with their speed and efficiency, the same amount of energy is usually enough to perform pending operations. An attempt to switch to full logging would reduce the speed of most operations significantly.

Connecting third-party file systems in Windows

The use of file systems is limited by their support at the OS level. For example, Windows does not understand ext2/3/4 and HFS+, but sometimes it is necessary to use them. This can be done by adding the appropriate driver.

An open source driver for reading and writing ext2/3 partitions with partial support for ext4. The latest version supports extents and partitions up to 16 TB. LVM, access control lists, and extended attributes are not supported.


There is also a free plugin for Total Commander that supports reading ext2/3/4 partitions.


coLinux is an open and free port of the Linux kernel. Together with a 32-bit driver, it allows Linux to be run in a Windows environment from 2000 through 7 without virtualization technologies. Only 32-bit versions are supported; development of a 64-bit modification was canceled. Among other things, coLinux makes it possible to give Windows access to ext2/3/4 partitions. Support for the project was suspended in 2014.

Windows 10 may already have built-in support for Linux-specific file systems, it's just hidden. These thoughts are suggested by the kernel-level driver Lxcore.sys and the LxssManager service, which is loaded as a library by the Svchost.exe process. For more information about this, see Alex Ionescu’s report “The Linux Kernel Hidden Inside Windows 10,” which he gave at Black Hat 2016.


ExtFS for Windows is a paid driver produced by Paragon. It runs on Windows 7 to 10 and supports read/write access to ext2/3/4 volumes. Provides almost complete support for ext4 on Windows.

HFS+ for Windows 10 is another proprietary driver produced by Paragon Software. Despite the name, it works in all versions of Windows starting from XP. Provides full access to HFS+/HFSX file systems on disks with any layout (MBR/GPT).

WinBtrfs is an early Btrfs driver for Windows. Already in version 0.6 it supports both read and write access to Btrfs volumes. It can handle hard and symbolic links, and it supports alternate data streams, ACLs, two types of compression, and asynchronous read/write. For now, WinBtrfs cannot perform mkfs.btrfs, btrfs balance, and other operations needed to maintain this file system.