A Plain-Language Look at Low-Level File I/O

Source: Internet · Editor: 程序博客网 · Time: 2024/05/04 07:53

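Copyright notice: the English quotations and the figures in this article come from the original English edition of APUE. Everything else, unless marked otherwise, is original.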

FILE I/O

1. open and openat

#include <fcntl.h>

int open(const char *path, int oflag, ... /* mode_t mode */ );
int openat(int fd, const char *path, int oflag, ... /* mode_t mode */ );

      Both return: file descriptor if OK, -1 on error

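Two points to grasp first:

  • The system always assigns the lowest-numbered file descriptor that is currently unused. This gives a common idiom: close standard input, standard output, or standard error, then open another file so that it takes over that descriptor number, which redirects the stream.
  • The biggest difference between open and openat is that openat can resolve a relative path against the directory referred to by its fd argument, rather than only against the current working directory.

These two properties, introduced up front, will show up again in the experiment code later on.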
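Both properties can be sketched in a few lines. This is only an illustration, not code from APUE: the helper names write_via_redirected_stdout and open_relative are made up here, and the first helper assumes descriptor 0 is still open, so that descriptor 1 is the lowest free number after close.

```c
#define _POSIX_C_SOURCE 200809L   /* expose openat and O_DIRECTORY */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: redirect stdout to `path` by relying on the
 * lowest-free-descriptor rule, write `msg` through descriptor 1, then
 * restore the original stdout. Returns 0 on success, -1 on error. */
int write_via_redirected_stdout(const char *path, const char *msg)
{
    int saved, fd, rc = -1;

    fflush(stdout);                       /* do not mix buffered output in */
    saved = dup(STDOUT_FILENO);           /* keep the real stdout */
    if (saved < 0)
        return -1;

    close(STDOUT_FILENO);                 /* descriptor 1 is now free */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == STDOUT_FILENO) {            /* open reused descriptor 1 */
        size_t len = strlen(msg);
        if (write(STDOUT_FILENO, msg, len) == (ssize_t)len)
            rc = 0;
    } else if (fd >= 0) {
        close(fd);                        /* a lower descriptor was free */
    }

    dup2(saved, STDOUT_FILENO);           /* restore stdout */
    close(saved);
    return rc;
}

/* Hypothetical helper: open `name` relative to the directory `dirpath`,
 * not relative to the current working directory. */
int open_relative(const char *dirpath, const char *name)
{
    int fd, dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);

    if (dirfd < 0)
        return -1;
    fd = openat(dirfd, name, O_RDONLY);
    close(dirfd);
    return fd;
}
```

Calling write_via_redirected_stdout("out.txt", "hello\n") from main and then open_relative(".", "out.txt") should hand back a descriptor from which "hello\n" can be read.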

2. creat and open

#include <fcntl.h>

int creat(const char *path, mode_t mode);

      Returns: file descriptor opened for write-only if OK, -1 on error

The following two calls are equivalent in effect:

creat(path, mode)
open(path, O_WRONLY | O_CREAT | O_TRUNC, mode)

It is worth looking at what each option means:

option      description
O_WRONLY    Open for writing only.
O_CREAT     Create the file if it doesn't exist.
O_TRUNC     If the file exists and if it is successfully opened for either write-only or read-write, truncate its length to 0.

open can do everything creat does, and more.
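The equivalence can be checked with a small sketch. The helper names creat_via_open and is_write_only are invented for this illustration; the second one uses fcntl(F_GETFL) to confirm that the descriptor really is write-only.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for creat(), built from open() and the three
 * options described above. */
int creat_via_open(const char *path, mode_t mode)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
}

/* Returns 1 if `fd` was opened write-only, 0 otherwise. */
int is_write_only(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    return flags != -1 && (flags & O_ACCMODE) == O_WRONLY;
}
```

Because the descriptor is write-only, reading the file back requires closing it and opening it again, which is one reason to call open directly with O_RDWR instead.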

3. read, write, lseek

#include <unistd.h>

ssize_t read(int fd, void *buf, size_t nbytes);

      Returns: number of bytes read, 0 if end of file, -1 on error

#include <unistd.h>

ssize_t write(int fd, const void *buf, size_t nbytes);

      Returns: number of bytes written if OK, -1 on error

#include <unistd.h>

off_t lseek(int fd, off_t offset, int whence);

      Returns: new file offset if OK, -1 on error

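The whence argument of lseek has three possible values: SEEK_SET (0), SEEK_CUR (1), and SEEK_END (2). Because of the way lseek works, we can test whether a file supports seeking without changing the file in any way. If the following expression is true, the descriptor cannot seek (for a pipe, FIFO, or socket, errno is set to ESPIPE):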

lseek(fd, 0, SEEK_CUR) == -1
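Wrapped into a function, the test might look like the sketch below. The names is_seekable and path_is_seekable are illustrative only, and the three-way return value is a choice made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if `fd` supports seeking, 0 if it does not (errno == ESPIPE),
 * and -1 on any other error, such as a bad descriptor. */
int is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}

/* Convenience wrapper: open `path` read-only and run the same test. */
int path_is_seekable(const char *path)
{
    int result, fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    result = is_seekable(fd);
    close(fd);
    return result;
}
```

A regular file should report 1, while either end of a pipe created with pipe() should report 0.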

A few important properties of lseek in practice:

  • lseek only records the current file offset within the kernel—it does not cause any I/O to take place. This offset is then used by the next read or write operation.
  • The file’s offset can be greater than the file’s current size, in which case the next write to the file will extend the file. This is referred to as creating a hole in a file and is allowed. Any bytes in a file that have not been written are read back as 0.
  • A hole in a file isn’t required to have storage backing it on disk. Depending on the file system implementation, when you write after seeking past the end of a file, new disk blocks might be allocated to store the data, but there is no need to allocate disk blocks for the data between the old end of file and the location where you start writing.
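The hole behaviour described in these points can be reproduced with a short sketch. The function name make_file_with_hole and the "head"/"tail" payloads are made up for illustration; whether the gap actually occupies disk blocks depends on the file system, so the code only checks the logical size.

```c
#define _POSIX_C_SOURCE 200809L   /* expose pread for callers of this sketch */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Writes 4 bytes, seeks `gap` bytes past the current offset without
 * writing, then writes 4 more bytes. Returns the resulting file size
 * as reported by fstat, or -1 on error. */
off_t make_file_with_hole(const char *path, off_t gap)
{
    struct stat sb;
    off_t size = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    if (write(fd, "head", 4) == 4 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&   /* no I/O happens here */
        write(fd, "tail", 4) == 4 &&               /* this extends the file */
        fstat(fd, &sb) == 0)
        size = sb.st_size;
    close(fd);
    return size;
}
```

With a gap of 4096 the file size comes out as 4 + 4096 + 4 bytes, and any byte read from inside the gap is 0.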

4. File Sharing

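  First, we should know how Linux stores files. Functionally, an inode is the index to a file's data blocks: to read a file, the system first locates its inode, checks whether the permissions recorded in the inode match the user, and only then starts reading the contents of the blocks [1].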

The kernel uses three data structures to represent an open file, and the relationships among them determine the effect one process has on another with regard to file sharing.

[Figure: kernel data structures for open files]

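  To go deeper we could walk through the Linux kernel source, but that would take a great deal of effort. What we can take away is:

  • Each process maintains its own table, and each entry in it holds one file descriptor.
  • Each descriptor points to one entry in the open file table maintained by the kernel.
  • Each file table entry points to an inode table entry, and the information in that inode entry in turn points to the blocks that actually store the file's data.

Now it is time to look at dup and dup2: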

#include <unistd.h>

int dup(int fd);
int dup2(int fd, int fd2);

   Both return: new file descriptor if OK, -1 on error
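A descriptor returned by dup points to the same file table entry as the original, so the two share one file offset, whereas a second open of the same path gets a file table entry of its own. The sketch below shows the difference; both function names are invented for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Writes 5 bytes through the original descriptor and returns the offset
 * as seen through a dup'ed descriptor (shared file table entry). */
off_t offset_seen_through_dup(const char *path)
{
    off_t off = -1;
    int fd2, fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    fd2 = dup(fd);
    if (fd2 >= 0) {
        if (write(fd, "hello", 5) == 5)
            off = lseek(fd2, 0, SEEK_CUR);
        close(fd2);
    }
    close(fd);
    return off;
}

/* Same experiment, but the second descriptor comes from a separate open,
 * so it has its own file table entry and its own offset. */
off_t offset_seen_through_second_open(const char *path)
{
    off_t off = -1;
    int fd2, fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    fd2 = open(path, O_RDONLY);
    if (fd2 >= 0) {
        if (write(fd, "hello", 5) == 5)
            off = lseek(fd2, 0, SEEK_CUR);
        close(fd2);
    }
    close(fd);
    return off;
}
```

The first function should return 5 and the second 0, which matches the three-table picture above.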

Now I need a program that covers all of the basic points above, plus a few extensions:

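[root@localhost dhuang]# vim fileio.c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

char buffer[20];

int main(int argc, char **argv)
{
        int fd[2];              /* fd[0]: data.txt, fd[1]: input.txt */
        pid_t pid;
        ssize_t charnum;

        if ((fd[0] = open("data.txt", O_RDWR | O_CREAT | O_TRUNC,
                          S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)) < 0) {
                fprintf(stderr, "error in creating a file\n");
                exit(1);
        }
        if ((pid = fork()) < 0) {
                fprintf(stderr, "fork error\n");
                exit(1);
        }
        if (pid > 0) {
                /* parent: stdout -> data.txt, stdin -> input.txt */
                close(1);
                dup(fd[0]);     /* lowest free descriptor is now 1 */
                if ((fd[1] = open("input.txt", O_RDONLY)) < 0) {
                        fprintf(stderr, "error in opening a file\n");
                        exit(1);
                }
                dup2(fd[1], STDIN_FILENO);
                for (;;) {
                        if ((charnum = read(STDIN_FILENO, buffer, 20)) == -1) {
                                fprintf(stderr, "parent:error in reading a file\n");
                                break;
                        } else if (charnum == 0) {
                                fprintf(stderr, "parent:end of the file opened.\n");
                                break;
                        } else {
                                if (write(STDOUT_FILENO, buffer, charnum) == -1)
                                        fprintf(stderr, "parent:error in writing a file\n");
                                /* As in the original run, stop after the first chunk:
                                 * one 20-byte read covers the whole input.txt used here. */
                                break;
                        }
                }
                exit(0);
        } else {
                /* child: stdin -> data.txt (same file table entry as the parent) */
                close(0);
                dup(fd[0]);     /* lowest free descriptor is now 0 */
                for (;;) {
                        if ((charnum = read(STDIN_FILENO, buffer, 20)) == -1) {
                                fprintf(stderr, "child :error in reading a file\n");
                                break;
                        } else if (charnum == 0) {
                                printf("child :end of the file opened.\n");
                                break;
                        } else {
                                /* buffer is not NUL-terminated, so print exactly charnum bytes */
                                printf("%.*s", (int)charnum, buffer);
                        }
                }
                exit(0);
        }
        return 0;
}
[root@localhost dhuang]# cat input.txt
abcdefg1234567hug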

Observed results:

[root@localhost dhuang]# cat input.txt
abcdefg1234567hug
[root@localhost dhuang]# ./fileio
child :end of the file opened.
[root@localhost dhuang]# ll
total 24
-rw-r--r--. 1 root root   20 Nov  3 10:01 data.txt
-rwxr-xr-x. 1 root root 9016 Nov  3 09:52 fileio
-rw-r--r--. 1 root root 1287 Nov  3 10:00 fileio.c
-rw-r--r--. 1 root root   20 Nov  3 09:27 input.txt
[root@localhost dhuang]# cat data.txt
abcdefg1234567hug

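To summarize the thinking behind the program:

  • It shows dup and dup2 in use.
  • It uses multiple processes, as preparation for the later discussion of interprocess communication.

The biggest problem this experiment exposes is ordering under concurrency: the child process never reads the contents of data.txt. Nothing forces the parent's write to happen before the child's read. In addition, because data.txt was opened before fork, parent and child share one file table entry and therefore one file offset, so a read issued after the parent's write starts at the end of the file. To make the program behave as intended we need material covered later: interprocess communication (see the later experiments).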
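Short of real interprocess communication, one simple way to impose an order is to have one process wait for the other to exit. The sketch below is not the later experiment the text refers to; it reverses the roles for simplicity (the child writes, the parent reads), and the function name child_writes_parent_reads is made up. It also rewinds the offset with lseek, since the offset is shared across fork.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes `msg` to `path`; the parent waits for the child to
 * exit, rewinds the shared offset, and reads the data back into `out`.
 * Returns the number of bytes read, or -1 on error. */
ssize_t child_writes_parent_reads(const char *path, const char *msg,
                                  char *out, size_t outsz)
{
    int status;
    pid_t pid;
    ssize_t n = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {                       /* child: write, then exit */
        size_t len = strlen(msg);
        _exit(write(fd, msg, len) == (ssize_t)len ? 0 : 1);
    }
    /* parent: waitpid guarantees the child's write has completed */
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        lseek(fd, 0, SEEK_SET) != (off_t)-1)   /* offset is shared */
        n = read(fd, out, outsz);
    close(fd);
    return n;
}
```

Without the lseek call the parent would read 0 bytes, because the child's write already moved the shared offset to the end of the file.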

Atomic Operations

#include <unistd.h>

ssize_t pread(int fd, void *buf, size_t nbytes, off_t offset);

     Returns: number of bytes read, 0 if end of file, -1 on error

ssize_t pwrite(int fd, const void *buf, size_t nbytes, off_t offset);

     Returns: number of bytes written if OK, -1 on error

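Judging by the parameters, these take one more argument than plain read and write: an offset. With plain read and write we would have to call lseek first. pread and pwrite effectively fuse the two calls into one: the seek and the I/O cannot be interrupted, and the current file offset is not changed.

The claim that the offset is unchanged deserves a quick experiment, and it is a very simple one: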

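#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        int fd;
        off_t offset;
        char buffer[20] = "bushipeien\n";

        if (argc != 2) {
                printf("error in argc,it should be two.\n");
                return -1;
        }
        if ((fd = open(argv[1], O_RDWR)) < 0) {
                printf("error in opening a file\n");
                return -1;
        }
        /* Move to the end of the file and remember that offset. */
        if ((offset = lseek(fd, 0, SEEK_END)) == -1) {
                printf("error in lseek a file\n");
        } else {
                /* Write only the text, not the NUL padding of the 20-byte array. */
                if (pwrite(fd, buffer, strlen(buffer), offset) == -1)
                        printf("error in pwrite a file\n");
                printf("%lld\n", (long long)offset);
        }
        /* Query the current offset: it should equal the value printed above. */
        if ((offset = lseek(fd, 0, SEEK_CUR)) == -1) {
                printf("error in lseek a file pwrited\n");
                close(fd);
                return -1;
        }
        printf("%lld\n", (long long)offset);
        close(fd);
        return 0;
}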

At this point input.txt contains:

[root@localhost dhuang]# cat input.txt
inputisgoodplace

Run the command:

[root@localhost dhuang]# ./pwrite input.txt
18
18
[root@localhost dhuang]# cat input.txt
inputisgoodplacebushipeien

We find that pwrite did not change the file offset!

fcntl, ioctl

#include <fcntl.h>

int fcntl(int fd, int cmd, ... /* int arg */ );

       Returns: depends on cmd if OK (see following), -1 on error

  1. Duplicate an existing descriptor (cmd = F_DUPFD or F_DUPFD_CLOEXEC)
  2. Get/set file descriptor flags (cmd = F_GETFD or F_SETFD)
  3. Get/set file status flags (cmd = F_GETFL or F_SETFL)
  4. Get/set asynchronous I/O ownership (cmd = F_GETOWN or F_SETOWN)

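  That covers the general uses of fcntl. If we focus on what matters most, it is how fcntl reads and modifies flags, so now is a good time to look at which flags are involved and the concepts behind them: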

  • Currently, only one such flag is defined: FD_CLOEXEC, the close-on-exec flag. If the FD_CLOEXEC bit is 0, the file descriptor will remain open across an execve(2), otherwise it will be closed.[2]

  • [Figure: flags used with fcntl]

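Writing a program to try out every single option would be missing the forest for the trees, so we can set that aside for now and not dig too far into implementation details. Note, however, that the file status flags are distinct from the flags argument passed to open(... flag ...), even though the two are very closely related [3].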
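Without trying every option, the two flag families can still be told apart with a minimal sketch. The helper names below are illustrative only: the first two work on the file descriptor flags (F_GETFD/F_SETFD), the last two on the file status flags (F_GETFL), which reflect what was passed to open.

```c
#include <fcntl.h>
#include <unistd.h>

/* File descriptor flags: returns 1 if FD_CLOEXEC is set, 0 if not,
 * -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Sets FD_CLOEXEC while preserving any other descriptor flags. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* File status flags: returns O_RDONLY, O_WRONLY, or O_RDWR as recorded
 * when the file was opened, or -1 on error. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    return flags == -1 ? -1 : (flags & O_ACCMODE);
}

/* Returns 1 if the O_APPEND status flag is set, 0 if not, -1 on error. */
int has_append(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return (flags & O_APPEND) ? 1 : 0;
}
```

The access mode has to be extracted with the O_ACCMODE mask because O_RDONLY, O_WRONLY, and O_RDWR are values of a small field rather than independent bits.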

#include <unistd.h>     /* System V */
#include <sys/ioctl.h>  /* BSD and Linux */

int ioctl(int fd, int request, ...);

        Returns: -1 on error, something else if OK

  1. UNIX System implementations use ioctl for many miscellaneous device operations. Some implementations have even extended it for use with regular files.
  2. Each device driver can define its own set of ioctl commands. The system, however, provides generic ioctl commands for different classes of devices.
  3. None of these operations is easily expressed in terms of the other functions in the chapter (read, write, lseek, and so on), so the easiest way to handle these devices has always been to access their operations using ioctl.

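ioctl is something you cannot avoid when learning driver development, and its target is a device. Take a Linux network card as an example; the following code is excerpted from linux kernel 4.1X: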

static int dm9000_ioctl(struct net_device *dev, struct ifreq *req, int cmd)
{
        struct board_info *dm = to_dm9000_board(dev);

        if (!netif_running(dev))
                return -EINVAL;

        return generic_mii_ioctl(&dm->mii, if_mii(req), cmd, NULL);
}

The cmd coming from user space eventually ends up in generic_mii_ioctl:

/**
 * generic_mii_ioctl - main MII ioctl interface
 * @mii_if: the MII interface
 * @mii_data: MII ioctl data structure
 * @cmd: MII ioctl command
 * @duplex_chg_out: pointer to @duplex_changed status if there was no
 *      ioctl error
 *
 * Returns 0 on success, negative on error.
 */

Going any deeper here would drag in far too much. All I want to establish is the role and standing of ioctl.
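From user space, an ioctl call is just a descriptor, a request code, and an argument. As a small sketch that needs no special hardware, the function below asks how many bytes are waiting to be read on a descriptor. It assumes the FIONREAD request, which is not part of POSIX but is provided by <sys/ioctl.h> on Linux and the BSDs; the function name bytes_pending is made up.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns the number of bytes that can be read from `fd` right now,
 * or -1 if the FIONREAD request fails for this kind of descriptor. */
int bytes_pending(int fd)
{
    int n = 0;

    /* Assumption: FIONREAD is available and stores its result in an int. */
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

On a pipe that has had 5 bytes written to it, bytes_pending on the read end should report 5. This is exactly the sort of query that read, write, and lseek cannot express.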

/dev/fd

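On first reading, this part of the book does not look important, and its purpose is not obvious either. After thinking it over, I decided to study it through two basic questions:

  • What symbolic links are for and how they are used
  • What happens when open() is called on a symbolic link

The following is quoted from reference [4]:

Question 1: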

each of the hard links to a file is a reference to the same i-node number, where an i-node number is an index into the i-node table, which contains metadata about all files on a file system

A symbolic link is a special type of file whose contents are a string that is the pathname of another file, the file to which the link refers. In other words, a symbolic link is a pointer to another name, and not to an underlying object.

Put briefly, the difference between a soft link and a hard link is whether the link points to another file's name, or to the v-node structure mentioned at the start (in Linux, the generic i-node).

Question 2:

system calls follow symbolic links. For example, if there were a symbolic link slink which pointed to a file named afile, the system call open(“slink” …) would return a file descriptor referring to the file afile.

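Only the key point here, because this topic is also very large: when open is given a symbolic link, the file descriptor it returns refers to the file the link points to. Some system calls, however, operate on the symbolic link itself. That is as far as we will go for now.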
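Both behaviours can be observed with a short sketch: open follows the link, while lstat is one of the calls that looks at the link itself. The function names open_follows_link and is_symlink are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* expose lstat, symlink, and S_ISLNK */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if opening `linkpath` yields a descriptor for the same file
 * as `target` (same device and i-node), 0 if not, -1 on error. */
int open_follows_link(const char *target, const char *linkpath)
{
    struct stat via_fd, direct;
    int result = -1;
    int fd = open(linkpath, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &via_fd) == 0 && stat(target, &direct) == 0)
        result = (via_fd.st_dev == direct.st_dev &&
                  via_fd.st_ino == direct.st_ino) ? 1 : 0;
    close(fd);
    return result;
}

/* lstat does not follow the link, so it can report on the link itself.
 * Returns 1 if `path` is a symbolic link, 0 if not, -1 on error. */
int is_symlink(const char *path)
{
    struct stat sb;

    if (lstat(path, &sb) == -1)
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}
```

After symlink("afile", "slink"), open_follows_link("afile", "slink") should return 1, is_symlink("slink") should return 1, and is_symlink("afile") should return 0.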

References:

[1] 鸟哥私房菜: http://linux.vbird.org/linux_basic/0230filesystem.php#harddisk-inode
[2] fcntl: https://linux.die.net/man/2/fcntl
[3] open: https://linux.die.net/man/2/open
[4] symlink: https://linux.die.net/man/7/symlink
