Tuesday, 5 April 2016

Linux files and inodes

The following post is a section of the book 'Just Enough Linux'.  The entire book can be downloaded in pdf format for free from Leanpub or you can read it online here.
Since this post is a snapshot in time. I recommend that you download a copy of the book which is updated frequently to improve and expand the content.
---------------------------------------

Having already established that everything is a file in a Linux operating system we find ourselves in the situation where files are suddenly slightly more interesting and mysterious, and it’s useful to get a better understanding of how files are represented in a Linux operating system so that the context of other functions can be better understood (thinking of links).

Linux files can be best understood by thinking of them as being described by three parts;
  • The file name
  • An inode
  • The data

The file name

Individual file names are typically restricted to 255 characters and can be made up of almost any combination of characters. The only ‘special’ or ‘reserved’ character when assigning a file name is the forward slash (/). This character is used solely to separate directories and file names as appropriate.
The other component of a file name is an inode number. This is a number assigned by the operating system which acts as a pointer to an inode (Index Node) which is an object that can be thought of as an entry in a database that stores information specifically about that file.
We can list the inode numbers associated with files by using the -i option when listing files. For example;
The output (reproduced in edited form below) shows;
pi@raspberrypi ~ $ ls -i python_games/
45014 4row_arrow.png           63554 inkspillresetbutton.png
49906 4row_black.png           63556 inkspillsettingsbutton.png
54215 4row_red.png             63559 match0.wav
63530 cat.png                  63569 princess.png
63533 drawing.py               63488 RedSelector.png
63544 gem6.png                 63576 star_title.png
63545 gem7.png                 63578 tetrisb.mid
63546 gemgem.py                63579 tetrisc.mid
63553 inkspilllogo.png         63522 Wood_Block_Tall.png
63552 inkspill.py              63582 wormy.py
The numbers to the left of the file names are the associated inode numbers.
The file name and associated inode is in the directory that the file is contained in. Each directory (which is itself just a file) contains a table of the files it contains with the associated inode numbers.

The inode

Each file has an associated inode (index node). The inode is analogous to a database entry that describes a range of attributes of the file. These attributes include;
  • The unique inode number
  • Access Control List (ACL)
  • Extended attributes such as append only or immutability
  • Disk block location
  • Number of blocks
  • File access, change and modification time
  • File deletion time
  • File generation number
  • File size
  • File type
  • Group
  • Number of links
  • Owner
  • Permissions
  • Status flags
You may well ask what the point of having numbered inodes is. Good question. They have a range of uses including being used when there are files to be deleted with complex file names that cannot be easily reproduced at the command line or enabling linking.
Interestingly, space for inodes must be assigned when an operating system is installed. Within any file-system, the maximum number of inodes, and hence the maximum number of files, is set when the file-system is created.
Therefore there are two ways in which a file-system can run out of space;
  1. It can consume all the space for adding new data (i.e. run out of hard drive space), or
  2. It can use up all the inodes.
Running out of inodes will bring a computer to a halt because exhaustion of the inodes prohibits the creation of additional files. Even if sufficient hard drive space exists. It is particularly easy to run out of inodes if a file-system contains a very large number of very small files.

The post above (and heaps of other stuff) is in the book 'Just Enough Linux' that can be downloaded for free (or donate if you really want to :-)).

2 comments:

  1. I don't think you have mentioned mounting points in these blog posts. That's something pretty cool. In Windows, partitions are assigned alphabetic names, C:\, D:\, etc. In Linux, there is one master partition /, and the rest can be placed anywhere in the file tree, i.e. /mnt/dvd, /windows, etc. Even remote filesystems can be mounted through systems like NFS and used transparently by applications, though I think such systems are rarely used these days because Linux desktop environments already support remote access transparently.

    ReplyDelete
    Replies
    1. Thanks. I explain a bit more in the book (https://leanpub.com/jelinux of which this is a sub section) in the `mount` section. All the way through writing the book I have been fighting the urge to extend too much into networking and mounting filesystems via NFS and samba. I'd like to take a bit more time to learn some more before I go making a fool of myself :-).
      Many thanks for all the comments and advice.

      Delete