File system organization. The concept of "file", "directory". Requirements for file and directory names. Full file name. File types. Using symbols

directories, so "methody", "Methody" and "METHODY" would be three different names.

Several characters are allowed in file and directory names but must be used with caution. These are the so-called special characters "*", "\", "&", "<", ">", ";", "(", ")", "|", as well as the space and tab characters. These characters have a special meaning for any shell, so special care is needed to make the shell treat them as part of the file or directory name. The special meaning of the "-" character for Linux commands was already discussed in Lecture 2, together with the way to change its interpretation (footnote 1: the "-" character means that the next word is an option, and spaces and tabs separate parameters on the command line). Why the command shell needs special characters will be discussed in Lecture 8.

Encodings and Russian names

As you can see, so far in all the file and directory names encountered, only Latin characters and some punctuation marks have been used. This is not accidental and is caused by the desire to make the examples provided look the same on any system. In Linux, it is permissible to use any characters from any language in file and directory names, but such freedom requires sacrifices that Methodius, for example, could not make.

The fact is that for a long time each character (letter) of each language was traditionally represented by a single byte. This representation imposes a very strict limit on the number of letters in an alphabet: there can be no more than 256 of them, and even fewer once control characters, digits, punctuation marks and the like are subtracted. Extensive alphabets (for example, the hieroglyphic Japanese and Chinese scripts) had to be replaced by simplified representations. In addition, it is best to leave the first 128 of these 256 characters unchanged, matching the ASCII standard, which includes Latin letters, digits, punctuation and the most common characters found on a typewriter keyboard. The interpretation of the remaining 128 characters depends on which encoding is set on the system. For example, in the Russian KOI8-R encoding the 228th character of such a table corresponds to the letter "Д" (D), while in the Western European encoding ISO-8859-1 the same character corresponds to the letter "ä", an "a" with two dots above it (like the Russian letter "ё").
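The byte-228 example can be checked directly. The following is a minimal Python sketch, not part of the original lecture; it decodes the same single byte under both encodings, using Python's standard codec names koi8_r and latin_1 (the latter is Python's name for ISO-8859-1).

```python
# One byte, two interpretations: the value 228 (0xE4) decoded under
# two different single-byte encodings.
raw = bytes([228])

as_koi8r = raw.decode("koi8_r")    # Russian KOI8-R table
as_latin1 = raw.decode("latin_1")  # Western European ISO-8859-1 table

print(as_koi8r)   # Cyrillic capital letter De
print(as_latin1)  # Latin small letter a with diaeresis

# The first 128 codes are plain ASCII in both tables, so they agree.
assert bytes([65]).decode("koi8_r") == bytes([65]).decode("latin_1") == "A"
```

The byte on disk never changes; only the table used to display it does.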

File names written to disk in one encoding look ridiculous if a different encoding is set when the directory is viewed. Moreover, many encodings do not completely fill the range of character codes from 128 to 255, so the corresponding character may not exist at all! This means that such a distorted file name cannot be typed directly from the keyboard (for example, in order to rename the file): you will have to resort to the various tricks described in Lecture 8. Finally, many languages, including Russian, historically have several encodings (footnote 2: Methodius himself several times received emails beginning with the words "bNOPNYA" or "bMHLYUMKHE", the result of displaying text encoded in CP-1251 as if it were KOI8-R). Unfortunately, there is currently no standard way to specify the encoding directly in the file name, so within one file system it is worth adhering to a single encoding when naming files.
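The effect described in the footnote can be reproduced with a short Python sketch. The word "Вопрос" ("Question") is an assumption chosen for illustration; read through the wrong table, it yields six Cyrillic letters that transliterate to the first garbled word quoted above.

```python
# Text written in one Cyrillic encoding and read back in another.
original = "Вопрос"                  # Russian for "Question" (illustrative choice)
stored = original.encode("cp1251")   # the bytes as the sender wrote them
garbled = stored.decode("koi8_r")    # what a KOI8-R system displays

print(garbled)  # transliterates to "bNOPNYA"

# Decoding with the encoding that was actually used restores the text.
restored = stored.decode("cp1251")
print(restored)
```

The same mismatch applies to file names: the bytes stored in the directory are intact, but they are shown through the wrong table.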

There is a universal encoding that includes characters from all the scripts of the world: UNICODE. The UNICODE standard is becoming increasingly widespread and aims to be a common standard for all text stored electronically. However, it has not yet achieved the desired universality, especially in the area of file names. A single character in UNICODE can occupy more than one byte, and this is its main drawback, since many useful application programs that work well with single-byte encodings need to be thoroughly or even completely reworked in order to handle UNICODE. Perhaps another reason for its insufficient prevalence is that UNICODE is a very cumbersome standard, and it may prove inefficient in a file system, where processing speed and reliability are essential qualities.

This does not mean that you should not use languages other than English when naming files. As long as you know exactly which encoding a file name is in, there will be no problems. However, Methodius concluded that the only way to guarantee that a file named in Russian survives a transfer to another system is to pass the encoding settings along with it, in fact two of them: the one used on his system and the one used on the recipient's system (which is not known in advance!). Another, much easier way to transfer a file is to use only ASCII characters in its name.

Extensions

Many users are familiar with the concept of an extension: the part of the file name after the period, usually limited to a few characters and indicating the type of data contained in the file. The Linux file system has no rules regarding extensions: a file name can contain any number of dots (including none), and the last dot can be followed by any number of characters (footnote 3: in contrast to old file systems organized according to the "8+3" principle, such as DOS and ISO9660, where no more than one dot is allowed in the file name and the extension can be no longer than 3 characters; this limitation shaped many file extensions known today, for example "txt" for a text file). Although extensions are optional and not imposed by technology on Linux, they are widely used: an extension allows a person or a program to determine what type of data a file contains just by its name, without opening it. However, an extension is only a naming convention for different types of files. Strictly speaking, the data in the file may not correspond to the declared extension for one reason or another, so you cannot rely entirely on the extension.
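As an illustration of "the part after the last dot", here is a small Python sketch using the standard os.path.splitext function. The helper name split_extension and the sample names are assumptions made for this example. Note that splitext treats a leading dot (as in ".bashrc") as part of the name rather than as the start of an extension.

```python
import os.path

def split_extension(name):
    """Return (stem, extension); the extension starts at the last dot."""
    return os.path.splitext(name)

# Any number of dots is allowed, and the extension may be long or absent.
for name in ["report.txt", "archive.tar.gz", "README", ".bashrc", "notes.markdown"]:
    stem, ext = split_extension(name)
    print(f"{name!r}: stem={stem!r}, extension={ext!r}")
```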

You can also determine the type of file content based on the data itself. Many formats provide an indication at the beginning of the file how further information should be interpreted: as a program, input to a text editor, an HTML page, a sound file, an image, or something else. A Linux user always has at his disposal the file utility, which is designed specifically to determine the type of data contained in a file:

$ file -- -filename-with-
-filename-with-: ASCII English text
$ file /home/methody
/home/methody: directory

Example 3.1. Determining the data type in a file

Methodius, having forgotten what was in the file "-filename-with-", which he had created in an example in the previous lecture, wanted to look at its contents using the cat command. However, he was stopped by Gurevich, who advised him to find out first what kind of data the file contains. It might be the binary file of an executable program, and such a file may contain sequences that happen to coincide with terminal control sequences. The behavior of the terminal after that may become unpredictable, and an inexperienced user is unlikely to be able to cope with it. Methodius received a completely accurate answer from the file utility: his file contains English text in ASCII encoding. file can distinguish between many types of data and will almost certainly report the correct information. This utility never trusts the file extension (if present) and analyzes the data itself. file distinguishes not only different kinds of data but also different types of files; in particular, it will report if the file being examined is not a regular file but, for example, a directory.
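The idea of recognizing a format from the data itself can be sketched in a few lines of Python. This is only an illustration of the principle, not how the file utility is implemented: the function name guess_type and the short signature list are assumptions for this example, whereas the real utility consults a much larger database of signatures.

```python
def guess_type(data: bytes) -> str:
    """Make a very rough guess at the content type from the first bytes."""
    if not data:
        return "empty"
    # A few well-known leading byte sequences ("magic numbers").
    signatures = [
        (b"\x7fELF", "ELF executable"),
        (b"\x89PNG\r\n\x1a\n", "PNG image"),
        (b"%PDF-", "PDF document"),
        (b"#!", "script with interpreter line"),
    ]
    for magic, label in signatures:
        if data.startswith(magic):
            return label
    try:
        data.decode("ascii")
        return "ASCII text"
    except UnicodeDecodeError:
        return "data"

print(guess_type(b"Hello, Methodius\n"))
print(guess_type(b"\x7fELF\x02\x01\x01"))
```

To apply it to a real file, one would read the first few bytes in binary mode, for example `open(path, "rb").read(16)`, and pass them to the function.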

Directory tree

The concept of a directory makes it possible to systematize all objects located on a storage medium (for example, a disk). Most modern file systems use a hierarchical model of data organization: there is one directory that unites all the data in the file system. This is the "root" of the whole file system, the root directory. The root directory can contain any file system objects, in particular subdirectories (directories of the first nesting level). Those, in turn, can also contain any file system objects and subdirectories (the second nesting level), and so on. Thus everything written on the disk (files, directories and special files) necessarily "belongs" to the root directory: either directly (contained in it) or at some nesting level.

The hierarchy of directories nested within each other can be matched to the hierarchy of data in the system: thematically related files are combined into a directory, thematically related directories into one common directory, and so on. If you strictly follow the hierarchical principle, then the deeper the nesting level of a directory, the more specific the feature that unites the data in it. If you do not follow this principle, it will soon turn out to be much easier to put all the files in one directory and search among them for the one you need than to search through all the subdirectories of the system. In that case, however, there is no file organization to speak of.

The structure of the file system can be visualized as a tree (footnote 4: a tree in the strict mathematical sense, that is, a directed graph without cycles with a single root vertex, in which every vertex other than the root has exactly one incoming edge), whose "root" is the root directory, and all the rest are located at the vertices

The computer works with information, which can be in text, graphic, audio or video form. All information processed on a computer is stored in files. The concept of a file is one of the basic concepts of computer literacy.

A file is a named area of memory on a computer storage medium. In other words, a file is a set of data on a storage medium (hard drive, CD or DVD, flash drive, etc.) that has its own name (the file name).

What characters can be used in a file name? It is recommended to use Russian and Latin letters, digits, spaces and punctuation marks in file names. However, a file name should not begin with a period, nor should it contain square brackets [ ] or curly braces { }. The following service characters are invalid in file names: / \ | : * ? " < >

Is there a maximum filename length? The file name length must not exceed 255 characters. In fact, 20-25 characters are usually enough.

Windows does not differentiate between lowercase and uppercase letters for file names. This means that you cannot store files whose names differ only in case in the same directory. For example, two file names "Title.doc" and "TITLE.doc" for Windows will be the same name for the same file.
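To see which names in a list would clash on a system that ignores case, a short Python sketch may help. The function name case_collisions is hypothetical, and str.casefold() is used only as an approximation of the comparison Windows performs.

```python
from collections import defaultdict

def case_collisions(names):
    """Group names that differ only in letter case."""
    groups = defaultdict(list)
    for name in names:
        # casefold() approximates a case-insensitive comparison
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]

print(case_collisions(["Title.doc", "TITLE.doc", "notes.txt"]))
```

On a case-sensitive file system the two names in the reported group would be distinct files; on Windows they refer to the same file.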

Do you think there can be several files with the same name PRIMER in one directory? This is possible provided that the PRIMER name has different extensions.

The file name extension indicates the file type (sometimes also called the file format). Thus,

  • "file type",
  • "file format",
  • "file extension",
  • “file name extension” –

this is, by and large, the same thing.

For example,

PRIMER.doc(x) – the file type is a Word document (or a file in Word format),

PRIMER.bmp – the file type is a picture,

PRIMER.avi – the file type is a video file,

PRIMER.wav – The file type is an audio file.

All these files have different names (due to different file name extensions) and can be stored in the same place, i.e. in one directory. If we draw an analogy with people's names, then the file name is the same as the person's name, and the file name extension is the person's last name. Accordingly, PRIMER.doc and PRIMER.bmp by this analogy are the same as Ivan Petrov and Ivan Sidorov. Files with the names PRIMER.doc and VARIANT.doc are two brothers from the same document family (with the same .doc extension), just as, for example, Ivan Petrov and Fedor Petrov are brothers from the same Petrov family.

The file name extension is the part of the file name that begins with a period followed by several characters.

Extensions consisting of three letters are common - .doc, .txt, .bmp, .gif, etc. Case does not matter, so .doc and .DOC are the same document extension.

The extension is an optional attribute in the file name, i.e. it may not exist. In this case, there is usually no dot at the end of the file name. The extension, although not necessary, is still desirable, because it tells Windows what type of file it is. Simply put, the file type tells Windows which program to open the file with. For example, the .doc extension indicates that the file should be opened using the Word editor, and the .cdr extension indicates that the file should be opened with the Corel Draw graphics program.

There are reserved (service) names that cannot be used as file names because they are device names:

PRN – printer,

COM1-COM4 – devices connected to serial ports 1-4,

AUX – same as COM1,

LPT1-LPT4 – devices connected to parallel ports 1-4 (usually printers),

CON (console) – the keyboard for input, the screen for output,

NUL is an “empty” device.

Here are examples of file names that are invalid:

5<>8/7.txt – the symbols "<", ">" and "/" are prohibited,

What's the question? – the symbol "?" is prohibited,

PRN.bmp – here PRN is a reserved name.

Depending on the file type, different icons are displayed on the Windows screen.

Windows Explorer (Start-Programs-Accessories-Explorer) by default has a mode where file name extensions are not displayed on the screen, but file icons are displayed.

When saving a file, just write its name and select the file type from the available list. The selected extension will automatically be added to the file name. For example, in the figure below, the program itself will add the .jpg extension to the file name. As a result, Windows will remember this file with the name “drawing in paint.jpg”.

To avoid misunderstandings when saving files, always pay attention to the “file type” line, if there is one. After all, the file type is a hint for Windows, with the help of which the system determines which program this file can be opened.

If you downloaded a file from the Internet, for example, with the extension .rar, but you do not have an archiver program installed on your computer to work with such “compressed, archived” files, then do not be surprised that the file does not open. In other words, you need to be aware that if you open files, for example, in a video format, then the computer must have the appropriate program to work with this format.

An analogy can be drawn between a file (more precisely, between a file type) and a program that works with that type of file. A file is a lock, and the program that opens this file is a key. A lock cannot be opened without a key, and a key without a lock is not particularly valuable.

Computer literacy exercises:

1) Try creating two folders on your Desktop with the names PRIMER and primer. To do this, right-click on an empty space on the Desktop, click the "Create" option in the window that appears and then click the "Folder" option. Replace "New Folder" with "PRIMER". Then repeat all this to create a second folder called "primer". Did Windows allow you to create the second folder?

2) Open, for example, the Word editor and try saving a document with the name PRN. Does Windows allow this name for a new file?

3) How to solve the problem: “I download files from the Internet, but they are in xsd (PM)/RAR format and cannot be opened or read on the computer. What to do?"


Then most likely you are wrong. There are rules due to which you cannot name a file by any name like an ordinary physical object. First, let's clarify what a file name is and how it is used.

The concepts of “path” and “file name”

In computer literature the terms "path" and "file name" are often used with different meanings. Typically, the word "path" refers to the address or location of a file, i.e. the drive, folder and subfolders in which the file is located. However, Microsoft and others consider the path to a file to include not only its location but also the file name itself. Some people use the word "path" to mean only the names of the file and of the folders in which it is located, without the drive. Some users believe that "file name" does not include the extension. In this article, the extension is always part of the file name. In the example below, the first line is the path to the file and the second line is the file name.
X:\folder\subfolder\
file.extension

Reserved characters and names

Most commonly used characters are allowed in a file name. The file name must not contain "<" (less-than sign), ">" (greater-than sign), ":" (colon), the double quote character, "/" (slash), "\" (backslash), "|" (vertical bar), "?" (question mark) or "*" (asterisk), and cannot end with a period or a space. Files also cannot be given the reserved device names: CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8 and LPT9.
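The rules above can be collected into a small checker. The following Python sketch is an illustration based only on the rules stated in this article, not an official Windows API; the function name is_valid_windows_name is hypothetical. It treats a reserved device name followed by an extension (such as PRN.bmp) as invalid and applies the 255-character limit on a single name.

```python
RESERVED_CHARS = set('<>:"/\\|?*')
RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{n}" for prefix in ("COM", "LPT") for n in range(1, 10)
}

def is_valid_windows_name(name: str) -> bool:
    """Check one file or folder name (not a full path) against the rules above."""
    if not name or len(name) > 255:
        return False
    if any(ch in RESERVED_CHARS for ch in name):
        return False
    if name[-1] in ". ":          # must not end with a period or a space
        return False
    stem = name.split(".", 1)[0]  # the part before the first dot
    if stem.upper() in RESERVED_NAMES:
        return False
    return True

for candidate in ["report.txt", "5<>8/7.txt", "PRN.bmp", "What's the question?"]:
    print(candidate, "->", is_valid_windows_name(candidate))
```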

Limitations on filename and path lengths

There are restrictions on the length of the file name and the length of the path. The absolute limit on the length of a file name including its path is 260 characters. This limit is known as MAX_PATH. In practice the limits for names are even smaller because of a number of other restrictions. For example, each string must end with the so-called null character, which marks the end of the string. Although this terminator is not displayed, it counts as a separate character when the length is calculated, which leaves 259 characters available for the file name and path. The first three characters of the path identify the drive (for example, C:\). This reduces the limit for folder, subfolder and file names to 256 characters.

The name of an object (folder or file) is limited to 255 characters. This limit applies in full only if the object is not located inside a folder. When an object is inside a folder, the combined length of all the enclosing folder names, the separators and the object name is limited to 256 characters, so the limit on the object name itself is less than 255 characters.
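The arithmetic behind these limits can be written out explicitly. In the Python sketch below, the constant names other than MAX_PATH and the helper fits_max_path are assumptions introduced for illustration.

```python
MAX_PATH = 260                  # absolute limit, including the terminator
NUL_TERMINATOR = 1              # the invisible end-of-string character
DRIVE_PREFIX = len("C:\\")      # three characters: letter, colon, backslash

visible_limit = MAX_PATH - NUL_TERMINATOR          # 260 - 1 = 259
after_drive_limit = visible_limit - DRIVE_PREFIX   # 259 - 3 = 256

def fits_max_path(full_name: str) -> bool:
    """True if the full name (drive, folders and file name) fits the limit."""
    return len(full_name) <= visible_limit

print(visible_limit, after_drive_limit)
print(fits_max_path("C:\\folder\\subfolder\\file.extension"))
```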


File system. Folders and files. Name, type, path to the file.

File.

All programs and data are stored in the long-term (external) memory of the computer in the form of files.

A file is a certain amount of information (program or data) that has a name and is stored in long-term (external) memory.

The file name consists of two parts, separated by a dot: the actual file name and the extension that determines its type (program, data, etc.). The actual name of the file is given by the user, and the file type is usually set automatically by the program when it is created.

Different operating systems have different filename formats. In the MS-DOS operating system, the file name itself must contain no more than eight letters of the Latin alphabet and numbers, and the extension consists of three Latin letters, for example: proba.txt

In the Windows operating system, the file name can have up to 255 characters, and the Russian alphabet can be used, for example:
Units of information.doc

File system.

Each storage medium (floppy, hard or laser disk) can store a large number of files. The order in which files are stored on the disk is determined by the installed file system.

A file system is a system for storing files and organizing directories.

For disks with a small number of files (up to several dozen), it is convenient to use a single-level file system, when the directory (disk table of contents) is a linear sequence of file names.

If hundreds and thousands of files are stored on a disk, then for ease of searching, the files are organized into a multi-level hierarchical file system, which has a “tree” structure.

The initial, root, directory contains subdirectories of the 1st level, in turn, in each of them there are subdirectories of the 2nd level, etc. It should be noted that files can be stored in directories of all levels.

The path to the file.

In order to find a file in a hierarchical file structure, you must specify the path to the file. The path consists of the logical name of the disk followed by the sequence of names of nested directories, separated by the "\" character; the last directory in the sequence contains the desired file.

For example, the path to the files in the figure can be written like this:

C:\Music\Picnic\

Full file name.

The path to the file along with the file name is called the fully qualified file name.

Example of full file names:

C:\basic\prog123.bas

C:\Music\Picnic\Hieroglyph.mp3
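A full file name can be taken apart programmatically. As a minimal sketch, Python's standard pathlib module provides PureWindowsPath, which parses Windows-style names on any operating system without touching the disk; the variable name full is an arbitrary choice.

```python
from pathlib import PureWindowsPath

full = PureWindowsPath(r"C:\Music\Picnic\Hieroglyph.mp3")

print(full.drive)        # logical name of the disk
print(str(full.parent))  # path to the file
print(full.name)         # file name with extension
print(full.stem)         # file name without extension
print(full.suffix)       # extension, including the dot
```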

Operations on files.

While working on files on a computer, the following operations are most often performed: copying (a copy of the file is placed in another directory); moving (the file itself is moved to another directory); deletion (the file entry is deleted from the directory); renaming (file name changes).

Graphical representation of the file system.

The MS-DOS hierarchical file system containing directories and files is represented in the Windows operating system through a graphical interface in the form of a hierarchical system of folders and documents. A folder in Windows is analogous to an MS-DOS directory. However, the hierarchical structures of these systems are somewhat different. In the MS-DOS hierarchical file system, the top of the object hierarchy is the root directory of the disk, which can be compared to the trunk of a tree - branches (subdirectories) grow on it, and leaves (files) are located on the branches.

In Windows, at the top of the folder hierarchy is the Desktop folder. The next level is represented by the My Computer, Recycle Bin and Network Places folders (the last one if the computer is connected to a local network).

Having encountered an incomprehensible phrase, the reader, as a rule, strives to find out its meaning. This article is a brief excursion for the user into the world of the unknown.

General concept of a file

Long-term storage stores all data in the form of files. What is it? A file is a named sequence of bytes, which in turn consist of bits. It has its own name and location address. The first parameter is specified by a person, and the second is set and remembered for a long time by the operating system. The search is carried out by file name, so there is no need for the user to write down its address.

A file may contain no information at all. But even an empty file has its own name, which is an essential property of a data set recorded on the hard disk. If the name is absent, such a structure cannot be called a file.

File system

Each storage medium (floppy, hard or laser disk) can contain a huge number of files. The file system is designed to store data and organize directories. In a broad sense, it comprises all the information on the disk, the data structures that describe it and the system software that manages it. The root directory contains second-level subdirectories, which in turn contain third-level folders, and so on. A single-level linear system is used for disks with only a few files, and a multi-level hierarchical system for disks with a large number of them. The latter has a tree structure.

Purpose of the file system

Its purpose is to give a person a convenient interface for accessing information located on a disk, and to allow objects to be shared among many people and running processes. This kind of structure makes working with data as efficient as possible.

File types

Thanks to the extension, the computer can roughly "understand" what a data set contains and which program can open it. The extension is the few letters or digits that follow the period in a standard file name. It defines the data type and the corresponding program. For example, a file with the mp3 extension will open in a media player. The file's icon shows the picture of the associated program, and from this icon an experienced user immediately understands where that type of data can be used. A document will open only in a program designed for text. Video files are played in a player. Pictures open in a graphics editor. There are many different file types, and each of them has an icon indicating the corresponding program.

File: file names

Users give symbolic names to the data sets stored on disk, and files are identified by these names. The system's restrictions on both the permitted characters and the total length of the name must be observed. The same file name may be given to several data sets, provided that the sequence of directory identifiers, that is, the address where the information is located, is different. In some systems an object cannot have several names; in others there is no such restriction. In the latter case the data set is also given a unique name: a numeric identifier used by the operating system's programs.

File name composition

What does a file name consist of? To understand this, it helps to have a sample in front of you. The file name consists of two related parts: the name itself and the extension, which determines the data type. Together they identify any piece of information on the medium.

Full name

Here's an example:

C:\Music\Holiday\Melody.mp3

The full file name shown in the sample consists of the name of the file itself and the path to it. The path is the list of folder names that must be opened in sequence to get from the top level to the data set. The full file name starts from the root directory and lists all the nested folders at the other levels. Such a name is absolute: it locates the information relative to the root directory, regardless of the current folder. All elements of the name are separated by the backslash (\) character. This character must also be specified before the name of the root directory.

Short name

This term arose because of old restrictions. In the early days, a file name could contain only 8 characters. A little later, it became possible to put a period after the name and add 3 characters of extension.

It looked like this:

Melody.mp3

Developers began to use name extensions for technical needs. With their help, programs "learned" to recognize the file type. This naming scheme was called the 8.3 system (after the number of characters in the name and the extension, with a period between them). It had a number of disadvantages: spaces, punctuation marks and letters outside the English alphabet could not be used, so creating a meaningful name was very difficult. A short name does not contain a backslash ( \ ). It can be used to refer to data in the current directory.

Long name

Previously, when thousands of files were stored on disks, users knew quite well where particular data on the media had come from. Today it is impossible to keep track of the history of incoming information, so the strict restrictions on name length were removed. The name can now be written in Russian letters, with some punctuation marks and even spaces. The extension is no longer limited to three characters. If the name contains several periods, the file type is the part after the last one.

However, tradition is a powerful force, which is why long extensions are rarely found on computers: three characters are enough for the system to indicate the file type. The name itself may run to some 250 characters, although that certainly seems overkill.

Problem objects

A document with a long name may not be read correctly on another computer. Therefore, when sending data, you should use Latin letters. The recipient's computer may not support the Russian alphabet, and an incomprehensible set of characters will appear instead of the words. For files kept only on your own personal computer, any letters may be used.

Correct file name

It can consist of any uppercase or lowercase letters, digits, periods and underscores. Spaces are not prohibited, but you should not overuse them or put one at the beginning of the name. You can include other characters in the name, with the exception of the reserved characters (> < | ? * / \ : "). The extension is separated from the name by the last (rightmost) period. The length of the name is limited to 255 characters. In practice, 20 characters are enough for an ordinary user. The operating system does not distinguish between lowercase and uppercase letters in a file name. This means that two items whose names differ only in case cannot be saved in the same directory. For example, "Text.doc" and "TEXT.doc" count as the same name.

Incorrect file name

In addition to these restrictions, there is a prohibition on using reserved device names.

PRN is a printer. COM1-COM4 are devices connected to serial ports 1-4. AUX is the same as COM1. LPT1-LPT4 are devices connected to parallel ports 1-4 (usually printers). CON (console) is the keyboard for input and the screen for output. NUL is an "empty" device. When the user tries to use a reserved name, the system displays an error. A warning is also displayed when prohibited characters are used, indicating that the file name is invalid. An invalid name is not saved; the previous name is kept instead.

File name template

Operating system shells, as well as various programming languages, let the user select groups of file and directory names. Every name is checked against a given template: if it matches, it is included; if not, it is skipped.

Why is such a template needed? Often the same action must be performed on a whole group of files, and this takes less time than handling each document individually. A file name template selects the group that meets the specified requirements. It is also used when searching for data.

Special characters

The file name template is specified using special characters:

  • An asterisk (*) stands for any group of characters, of any length. For example, a lone asterisk is a template that matches every name in the directory. The pattern *.mp3 selects all files of that type. File names starting with my and ending with .txt are selected with the pattern my*.txt. The pattern *2014* matches every object whose name contains the character group 2014.
  • A question mark (?) stands for any single character. For example, the pattern music.??? matches files named music with any three-character extension. In the pattern na?e.txt, any single character may stand in place of the question mark.

Other commands

There are also other rules for composing templates. By including square brackets [ ] with a list of possible values, you can make the search more flexible. If you want to find all files starting with the letter t, regardless of case, the pattern should be written like this: [tT]*. When searching by letters of the alphabet, you can specify a range. Such a template looks like this: [k-ly-zK-LY-Z]?.jpg. The system will find files with the specified extension whose names consist of two characters, the first of which is the letter k, l, y or z in either case.
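The matching rules for *, ? and square brackets can be tried out with Python's standard fnmatch module, which implements shell-style patterns. This is a minimal sketch; the file names in the list and the helper select are invented for illustration, and fnmatchcase is used so that the comparison is case-sensitive on every platform.

```python
from fnmatch import fnmatchcase

# Invented directory contents, for illustration only.
names = ["my_notes.txt", "my.txt", "name.txt", "nave.txt", "Track01.mp3",
         "tmp", "kx.jpg", "Zq.jpg", "ab.jpg", "report2014final.doc"]

def select(pattern):
    """Return the names that match a shell-style pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]

print(select("my*.txt"))              # starts with my, ends with .txt
print(select("na?e.txt"))             # exactly one character in place of ?
print(select("*2014*"))               # contains 2014 anywhere
print(select("[tT]*"))                # starts with t or T
print(select("[k-ly-zK-LY-Z]?.jpg"))  # two-character name, first letter k, l, y or z
```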

The role of the shell

Several special characters can be used in one pattern. Templates can be combined with many commands: listing directories, copying files, searching, and so on. However, the actions are performed not on the template but on the data that matches it. The matching objects are selected by the command shell.

Pattern expansion is the process of replacing a pattern with the list of file names that match it.

Most commands never see a special character in their list of parameters. So what is responsible for selecting the data? The command shell expands the pattern, so that the command receives a list of all the file names that match it.
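What the shell does before running a command can be imitated in a few lines of Python. This is a simplified sketch under stated assumptions: the function expand and the sample listing are hypothetical, hidden files and quoting are ignored, and an unmatched pattern is left unchanged, as POSIX shells do by default.

```python
from fnmatch import fnmatchcase

def expand(words, directory_listing):
    """Replace each pattern word with the sorted names that match it."""
    result = []
    for word in words:
        if any(ch in word for ch in "*?["):
            matches = sorted(n for n in directory_listing if fnmatchcase(n, word))
            # A pattern that matches nothing is passed through unchanged.
            result.extend(matches if matches else [word])
        else:
            result.append(word)
    return result

listing = ["b.mp3", "a.mp3", "notes.txt"]
print(expand(["rm", "*.mp3"], listing))  # the command receives plain names
print(expand(["ls", "*.pdf"], listing))  # nothing matches, pattern is kept
```

The command itself (here rm or ls) only ever receives the resulting list of words and does not need to know about patterns.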

File name masks

Masks are used in group operations on data. A mask is a sequence of characters allowed in file names, which may also contain question marks and asterisks. With a mask you can, for example, delete all temporary files at once. A question mark stands for one arbitrary character, while an asterisk stands for a whole sequence. For example, the command rm *mp3 deletes all files whose names end with this fragment. If you need to erase all data in a directory, use the command rm *. The question mark is used in the same way when only one character varies. Name masks can also be used with directories.

Problematic copying

The transition to long names created compatibility problems with older programs that expect short names. So that such applications can still open data stored under the earlier scheme, the file system must be able to provide unique short aliases for files with long names. New operating systems support long names, but sometimes the user runs into unexpected problems: copying files with long names can be difficult.

In this case, even creating a shortcut will not help. Usually it is enough to rename the file and try again. Alternatively, you can archive the data, copy the archive and unpack it. But what if the required file sits deep in a chain of subdirectories and the name is too long because of the path itself?

Backup options

If the above methods do not work, you can connect the folder as a network drive: right-click the computer icon and select the connection item from the menu that appears. You then specify a letter for the drive and the path to the folder.

As a last resort, you can copy files with long names using the FAR 2.0 program, or even disable the Recycle Bin.