A file can be a "free-form", indexed, or structured collection of related bytes whose meaning is defined by its creator. In other words, a file is an entry in a directory. A file may have attributes such as name, creator, date, type, and permissions.
A file can have various kinds of structure, for example:
- Simple record structure, with lines of fixed or variable length.
- Complex structure, such as a formatted document or a relocatable load file.
- No definite structure, such as a plain sequence of words or bytes.
Attributes of File
The following are some of the attributes of a file:
- Name. The only information kept in human-readable form.
- Identifier. A unique tag (a number) that identifies the file within the file system.
- Type. Needed by systems that support different types of files.
- Location. A pointer to the file's location on the device.
- Size. The current size of the file.
- Protection. Controls who may read, write, or execute the file.
- Time, date, and user identification. This data is used for protection, security, and usage monitoring.
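As a rough illustration, most of these attributes can be read from Python through the standard `os.stat` call. The sketch below assumes a POSIX-like system; the helper name `describe_file` and the temporary file are made up for the example, and the "identifier" shown is the inode number that such systems use as the unique tag.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return a dict of common file attributes as reported by the OS."""
    info = os.stat(path)
    if stat.S_ISDIR(info.st_mode):
        kind = "directory"
    elif stat.S_ISREG(info.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return {
        "name": os.path.basename(path),             # human-readable name
        "identifier": info.st_ino,                  # unique tag within the file system
        "type": kind,
        "size": info.st_size,                       # current size in bytes
        "protection": stat.filemode(info.st_mode),  # e.g. '-rw-------'
        "owner_uid": info.st_uid,                   # user identification
        "modified": time.ctime(info.st_mtime),      # time and date
    }


# Create a small throwaway file and inspect it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    sample_path = tmp.name

attributes = describe_file(sample_path)
for key, value in attributes.items():
    print(f"{key:>11}: {value}")

os.remove(sample_path)
```

Note that the location attribute (where the data sits on the device) is not exposed here; it is normally managed by the file system itself.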
File Access Methods
Access methods determine how files are accessed and read into memory. Most systems support a single access method, although some operating systems support several.
Sequential Access
- Data is accessed one record after another, in order.
- A read command moves the pointer ahead by one record.
- A write command allocates space for the record and moves the pointer to the new end of file.
- This method is reasonable for tape.
Direct Access
- This method is useful for disks.
- The file is viewed as a numbered sequence of blocks or records.
- There are no restrictions on which blocks are read or written; access can happen in any order.
- The user now says "read n" rather than "read next".
- "n" is a number relative to the beginning of the file, not an absolute physical disk location.
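The difference between "read next" and "read n" can be sketched with an ordinary Python file object. This is only an illustration: the fixed record length `RECORD_SIZE` and the helper names `make_record`, `read_next`, and `read_n` are assumptions made for the example.

```python
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes


def make_record(i):
    """Build a fixed-length record such as b'record-003......'."""
    return f"record-{i:03d}".encode().ljust(RECORD_SIZE, b".")


def read_next(f):
    """Sequential access: read at the current pointer, which then moves ahead."""
    return f.read(RECORD_SIZE)


def read_n(f, n):
    """Direct access: jump to record n, counted from the beginning of the file."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)


with tempfile.TemporaryFile() as f:
    for i in range(5):
        f.write(make_record(i))  # each write moves the pointer to the new end of file

    f.seek(0)
    first = read_next(f)    # record 0
    second = read_next(f)   # record 1
    fourth = read_n(f, 3)   # record 3, without reading record 2
    after = read_next(f)    # the pointer now sits after record 3, so this is record 4

print(first, second, fourth, after)
```

The record number passed to `read_n` is relative to the start of the file; the operating system, not the program, maps it to a physical disk location.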
Indexed Sequential Access
- It is built on top of sequential access.
- It uses an index to position the pointer when accessing a file.
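One possible way to picture this is a sorted file of fixed-length records plus a small index that stores the first key of each block; a lookup uses the index to position the pointer and then reads sequentially within that block. The sketch below uses an in-memory `io.BytesIO` as a stand-in for a file, and the record layout, `RECORDS_PER_BLOCK`, and the function names are all assumptions for illustration.

```python
import bisect
import io

RECORD_SIZE = 16        # assumed fixed record length in bytes
RECORDS_PER_BLOCK = 4   # assumed number of records covered by one index entry


def pack(key, value):
    """Encode a record as 'KKKK:value', padded with spaces to RECORD_SIZE."""
    return f"{key:04d}:{value}".encode().ljust(RECORD_SIZE)


def build(keys):
    """Write records in key order and index the first key of every block."""
    data = io.BytesIO()
    index_keys, index_offsets = [], []
    for i, key in enumerate(sorted(keys)):
        if i % RECORDS_PER_BLOCK == 0:
            index_keys.append(key)
            index_offsets.append(data.tell())
        data.write(pack(key, f"v{key}"))
    return data, index_keys, index_offsets


def lookup(data, index_keys, index_offsets, key):
    """Use the index to set the pointer, then scan the block sequentially."""
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key is smaller than every key in the file
    data.seek(index_offsets[slot])
    for _ in range(RECORDS_PER_BLOCK):
        record = data.read(RECORD_SIZE)
        if not record:
            break  # reached end of file
        found_key, _sep, value = record.decode().rstrip().partition(":")
        if int(found_key) == key:
            return value
    return None


data, index_keys, index_offsets = build(range(10, 210, 10))
print(index_keys)                                     # first key of each block
print(lookup(data, index_keys, index_offsets, 130))
print(lookup(data, index_keys, index_offsets, 135))   # key not present
```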
Windows file systems
Microsoft Windows uses two major file systems: FAT, inherited from DOS and later extended as FAT32, and the widely used NTFS. The more recent ReFS was developed by Microsoft as a new-generation file system and introduced with Windows Server 2012 (the server counterpart of Windows 8).
FAT (File Allocation Table):
The FAT file system is one of the simplest types of file system. It consists of a file system descriptor sector (the boot sector or superblock), a block allocation table (the File Allocation Table), and plain storage space for files and folders. Files on FAT are stored in directories. Each directory is an array of 32-byte records, each of which describes a file or a file's extended attributes (for example, a long file name). A file record references the first block of the file. Every subsequent block is found through the block allocation table, which is used as a linked list.
The block allocation table contains an array of block descriptors. A zero value indicates that the block is not in use; a non-zero value is either a reference to the next block of the file or a special value marking the end of the file.
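The linked-list use of the allocation table can be sketched in a few lines of Python. This is a toy model rather than the on-disk format: the table is a plain list, `END_OF_FILE` is a made-up marker (real FAT variants reserve a range of high values for this, and also reserve the first table entries), and the function names are hypothetical.

```python
FREE = 0
END_OF_FILE = -1  # stand-in marker; real FAT uses reserved high values


def block_chain(fat, first_block):
    """Follow the table from a file's first block to its end-of-file marker."""
    chain = []
    seen = set()
    block = first_block
    while block != END_OF_FILE:
        if block in seen:
            raise ValueError("corrupt table: chain loops back on itself")
        if fat[block] == FREE:
            raise ValueError("corrupt table: chain runs into a free block")
        seen.add(block)
        chain.append(block)
        block = fat[block]  # the descriptor holds the number of the next block
    return chain


def free_blocks(fat):
    """Blocks whose descriptor is zero are not in use."""
    return [i for i, entry in enumerate(fat) if entry == FREE]


# Toy table: a file whose directory record points at block 2,
# continuing through blocks 5, 7 and 3.
fat = [FREE, FREE, 5, END_OF_FILE, FREE, 7, FREE, 3]

print(block_chain(fat, 2))
print(free_blocks(fat))
```

Because the blocks of a file need not be adjacent, reading a large fragmented file means following many such references, which is one reason the table is usually cached in memory.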
The number in FAT12, FAT16, and FAT32 stands for the number of bits used to enumerate file system blocks. This means that FAT12 can use up to 4096 different block references, FAT16 up to 65536, and FAT32 up to 4294967296. The actual maximum block count is lower: some values are reserved, FAT32 uses only 28 of the 32 bits for the block number, and the limit also depends on the implementation of the file system driver.
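The arithmetic behind these figures is simply a power of two, and multiplying by a block size gives an upper bound on volume size. In the sketch below the 32 KiB block size is an assumption chosen for illustration, and the function names are hypothetical; real limits are somewhat lower because some table values are reserved.

```python
def addressable_blocks(bits):
    """Number of distinct block references that fit in the given number of bits."""
    return 2 ** bits


def volume_size_upper_bound(bits, block_size_bytes):
    """Upper bound on volume size: every reference points at one block."""
    return addressable_blocks(bits) * block_size_bytes


ASSUMED_BLOCK_SIZE = 32 * 1024  # 32 KiB per block, an assumption for this example

for name, bits in (("FAT12", 12), ("FAT16", 16), ("FAT32", 32)):
    print(name, addressable_blocks(bits))

fat16_limit_gib = volume_size_upper_bound(16, ASSUMED_BLOCK_SIZE) / 2 ** 30
print(f"FAT16 with 32 KiB blocks: at most {fat16_limit_gib} GiB")
```

Under that assumed block size, the FAT16 bound works out to 2 GiB.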
FAT12 was used for old floppy disks. FAT16 (or simply FAT) and FAT32 are widely used for flash memory cards and USB flash sticks, and are supported by mobile phones, digital cameras, and other portable devices.
FAT or FAT32 is typically used on Windows-compatible external storage or disk partitions smaller than 2 GB (for FAT) or 32 GB (for FAT32). Windows cannot create a FAT32 file system larger than 32 GB, although Linux supports FAT32 up to 2 TB.
NTFS (New Technology File System):
NTFS was introduced in Windows NT and is now the major file system for Windows. It is the default file system for disk partitions and the usual choice for disk partitions over 32 GB. The file system is quite extensible and supports many file properties, including access control and encryption. Each file on NTFS is stored as a file descriptor in the Master File Table plus the file content. The Master File Table contains all information about the file: size, allocation, name, and so on. The first and last sectors of the file system contain the file system settings (the boot record or superblock). NTFS uses 48-bit and 64-bit values to reference files, and so supports very large disk storage.
ReFS (Resilient File System):
ReFS is Microsoft's latest file system, introduced with Windows Server 2012 (the server counterpart of Windows 8). Its architecture differs completely from other Windows file systems and is organized mainly as a B+-tree. ReFS achieves high tolerance to failures through new features, most notably Copy-on-Write (CoW): no metadata is modified without being copied, and no data is written over existing data; it is written to new disk space instead. On any file modification, a new copy of the metadata is created in free storage space, and the system then creates a link from the older metadata to the newer copy. As a result, the system keeps a significant number of older copies in different places, which allows easy file recovery as long as that storage space has not been overwritten.
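The copy-on-write idea can be illustrated with a toy in-memory store. This is not how ReFS is implemented: the class `CowStore`, its append-only block list, and the `versions` list used to track successive metadata copies are all assumptions made to show the principle that nothing existing is overwritten.

```python
class CowStore:
    """Toy copy-on-write store: blocks are only ever appended, never overwritten."""

    def __init__(self):
        self.blocks = []    # append-only "disk"
        self.versions = []  # block numbers of successive metadata copies, oldest first

    def _allocate(self, content):
        """Place content in fresh space and return its block number."""
        self.blocks.append(content)
        return len(self.blocks) - 1

    def write(self, name, data):
        """Write data to new space, then publish a new copy of the metadata."""
        current = self.blocks[self.versions[-1]] if self.versions else {}
        data_block = self._allocate(data)   # old data block is left untouched
        new_metadata = dict(current)        # copy the metadata, do not modify it
        new_metadata[name] = data_block
        self.versions.append(self._allocate(new_metadata))

    def read(self, name, metadata_block=None):
        """Read through the newest metadata, or through an older copy if given."""
        if metadata_block is None:
            metadata_block = self.versions[-1]
        return self.blocks[self.blocks[metadata_block][name]]


store = CowStore()
store.write("a.txt", b"first draft")
old_version = store.versions[-1]
store.write("a.txt", b"second draft")

print(store.read("a.txt"))               # newest content
print(store.read("a.txt", old_version))  # older copy is still recoverable
```

In this toy the old blocks live forever; a real file system eventually reuses that space, which is why recovery only works until the older copies are overwritten.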
Linux file systems
The open-source Linux OS has always aimed to implement, test, and use different file system concepts. Among the large number of file system types, the most popular Linux file systems today are:
- Ext2, Ext3, Ext4 - the 'native' Linux file system. It is under active development and improvement. Ext3 is an extension of Ext2 that adds transactional file write operations with a journal. Ext4 is a further development of Ext3, extended with support for optimized file allocation information (extents) and extended file attributes. This file system is frequently used as the 'root' file system in Linux installations.
- ReiserFS - an alternative Linux file system designed to store huge numbers of small files. It searches files efficiently and allocates them compactly by storing file tails or small files alongside the metadata, so that large file system blocks are not spent on them.
- XFS - a file system originating from SGI, which initially used it for its IRIX servers. The XFS specifications are now implemented in Linux. XFS performs well and is widely used to store files.
- JFS - a file system developed by IBM for its powerful computing systems. The name JFS usually refers to the first edition (JFS1); JFS2 is the second edition. The file system is now open source and is implemented in most modern Linux distributions.
The concept of 'hard links' used in this kind of OS makes most Linux file systems similar in one respect: the file name is not regarded as a file attribute but is defined as an alias for a file in a certain directory. A file object can be linked from many locations, even several times from the same directory under different names. This is one of the reasons why recovering file names after file deletion or file system damage can be difficult or even impossible.
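The "name as alias" behaviour can be observed from Python with the standard `os.link` call. The sketch below assumes a file system that supports hard links (as the Linux file systems above do); the file names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "report.txt")
    alias = os.path.join(folder, "same-report.txt")

    with open(original, "w") as f:
        f.write("hello")

    os.link(original, alias)  # second name for the same file object

    same_object = os.path.samefile(original, alias)  # both names reach one file
    link_count = os.stat(original).st_nlink          # number of names pointing at it

    os.remove(original)  # removes one name, not the file itself
    with open(alias) as f:
        surviving = f.read()
    remaining_links = os.stat(alias).st_nlink

print(same_object, link_count, surviving, remaining_links)
```

Since the file object itself does not record which names point to it, a recovery tool that finds the object after damage may have no way to tell what it was called.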