Advanced Programming in the UNIX Environment: Reading Notes (1) (ch0~ch2)

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 16:50

<Advanced Programming in the Unix Environment> Reading Notes

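(These notes use the 2005 English second edition.

   The English third edition was released in May 2013, revised mainly to follow newer standards. The main change is the removal of interfaces that have become obsolete (the STREAMS-related interfaces, which were dropped from the newer standard). The biggest Linux-related change is that the kernel platform moved from 2.4 in the second edition to 3.2, so the thread-related material was rewritten. Keep this in mind when reading the affected parts; the new edition's chapter on threads is also worth a look:
    http://ptgmedia.pearsoncmg.com/images/9780321637734/samplepages/0321637739.pdf)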

"One of the essential classics of UNIX programming."

—Eric S. Raymond, author of The Art of UNIX Programming


Foreword by Dennis Ritchie: ..."It's a most worthy second edition of a classic."...


0. Preface

   A brief introduction to Unix, the background and history of the book, and an outline of its contents.

   "This book describes the programming interface to the Unix system, the system call interface and many of the functions provided in the standard C library. It is intended for anyone writing programs that run under Unix."

    Organization of the book:

    1.  Overview & introduction:          ch1;

        Unix standardization:             ch2;

    2.  I/O, files, I/O library:          ch3~6;

    3.  Processes, signals:               ch7~10;

    4.  Terminal I/O, advanced I/O:       ch11~13;

    5.  IPC:                              ch14~15;

    6.  Examples:                         ch16~19.

    "This text is intended for programmers familiar with Unix and programmers familiar with some other operating system who wish to learn the details of the services provided by most Unix systems."

    All of the nearly 10,000 lines of code in the book are written in C.

                      Chapter 1. UNIX System Overview

1.1 Intro

     The focus of this text is to describe the services provided by various versions of the UNIX operating system.

1.2 Architecture

    

  In a strict sense, an operating system can be defined as the software that controls the hardware resources of the computer and provides an environment under which programs can run.

   The interface to the kernel is a layer of software called the system calls (the shaded portion in Figure 1.1). Libraries of common functions are built on top of the system call interface, but applications are free to use both. The shell is a special application that provides an interface for running other applications.

1.3 Logging in

   When we log in to a UNIX system, we enter our login name, followed by our password. The system then looks up our login name in its password file, usually the file /etc/passwd.

   Shells: A shell is a command-line interpreter that reads user input and executes commands. The user input to a shell is normally from the terminal (an interactive shell) or sometimes from a file (called a shell script).

1.4 Files and Directories

 File System:

   A directory is a file that contains directory entries. Logically, we can think of each directory entry as containing a filename along with a structure of information describing the attributes of the file.

 Filename & Pathname

  The names in a directory are called filenames. The only two characters that cannot appear in a filename are the slash character (/) and the null character.

    A sequence of one or more filenames, separated by slashes and optionally starting with a slash, forms a pathname. A pathname that begins with a slash is called an absolute pathname; otherwise, it's called a relative pathname.

  Working directory & Home directory: The working directory is the directory from which all relative pathnames are interpreted. ... When we log in, the working directory is set to our home directory.

1.5 Input and output

  File Descriptor: 

File descriptors are normally small non-negative integers that the kernel uses to identify the files being accessed by a particular process. Whenever it opens an existing file or creates a new file, the kernel returns a file descriptor that we use when we want to read or write the file.

   I/O: By convention, all shells open three descriptors whenever a new program is run: standard input, standard output, and standard error. If nothing special is done, all three are connected to the terminal.

   Unbuffered I/O: Unbuffered I/O is provided by the functions open, read, write, lseek, and close. These functions all work with file descriptors.

  The constants STDIN_FILENO and STDOUT_FILENO are defined in <unistd.h> and specify the file descriptors for standard input and standard output.

 Standard I/O:

   The standard I/O functions provide a buffered interface to the unbuffered I/O functions. Using standard I/O prevents us from having to worry about choosing optimal buffer sizes. These functions are declared in <stdio.h>.

  The standard I/O constants stdin and stdout are also defined in the <stdio.h> header and refer to the standard input and standard output.

1.6 Programs & Processes

   Program: A program is an executable file residing on disk in a directory. A program is read into memory and is executed by the kernel as a result of one of the six exec functions.

   Process: An executing instance of a program is called a process. The UNIX System guarantees that every process has a unique numeric identifier called the process ID.

   Process Control: A concrete example shows how to control processes with fork, the exec functions, and waitpid. "We call fork to create a new process, which is a copy of the caller. We say that the caller is the parent and that the newly created process is the child. Then fork returns the non-negative process ID of the new child process to the parent, and returns 0 to the child. Because fork creates a new process, we say that it is called once by the parent but returns twice in the parent and in the child." "The combination of a fork, followed by an exec, is what some operating systems call spawning a new process. In the UNIX System, the two parts are separated into individual functions."

 Threads & Thread IDs

   Threads are best understood together with operating-system fundamentals: a thread is a flow of control within a process, with its own stack and register state, running in the process's address space. "Usually, a process has only one thread of control: one set of machine instructions executing at a time."

  All threads of a process share that process's resources, such as its address space and open-file information. "All the threads within a process share the same address space, file descriptors, stacks, and process-related attributes. Because they can access the same memory, the threads need to synchronize access to shared data among themselves to avoid inconsistencies."

  "Thread IDs, however, are local to a process."

1.7 Error Handling 

    "The file <errno.h> defines the symbol errno and constants for each value that errno can assume."

    errno supplies further information about what went wrong when a call fails. "The <errno.h> header file defines the integer variable errno, which is set by system calls and some library functions in the event of an error to indicate what went wrong. Its value is significant only when the return value of the call indicated an error."

   NOTE: Two points are worth noting: "First, its value is never cleared by a routine if an error does not occur. Therefore, we should examine its value only when the return value from a function indicates that an error occurred. Second, the value of errno is never set to 0 by any of the functions, and none of the constants defined in <errno.h> has a value of 0."

  Two functions are defined by the C standard to help with printing error messages:

  

   (1) char *strerror(int errnum);

       Returns: pointer to message string

   (2) void perror(const char *msg);

       The perror function produces an error message on the standard error, based on the current value of errno, and returns.


  Error Recovery:

   "The errors defined in <errno.h> can be divided into two categories: fatal and nonfatal. A fatal error has no recovery action. The best we can do is print an error message on the user's screen or write an error message into a log file, and then exit. Nonfatal errors, on the other hand, can sometimes be dealt with more robustly. Most nonfatal errors are temporary in nature, such as with a resource shortage, and might not occur when there is less activity on the system."

    "The typical recovery action for a resource-related nonfatal error is to delay a little and try again later."

   Not every error is fatal. For errors that stem from a temporary resource shortage, such as a busy system, we may be able to (and preferably should) find a way to recover. "Ultimately, it is up to the application developer to determine which errors are recoverable. If a reasonable strategy can be used to recover from an error, we can improve the robustness of our application by avoiding an abnormal exit."

1.8 User Identification

    UID: The user ID from our entry in the password file is a numeric value that identifies us to the system. This user ID is assigned by the system administrator when our login name is assigned, and we cannot change it.

    We call the user whose user ID is 0 either root or the superuser. The superuser has free rein over the system.

  GID: Our entry in the password file also specifies our numeric group ID. This too is assigned by the system administrator when our login name is assigned.

   We call the functions getuid() and getgid() to return the user ID and the group ID.

1.9 Signals

   Signals are a technique used to notify a process that some condition has occurred. For example, if a process divides by zero, the signal whose name is SIGFPE (floating-point exception) is sent to the process.

   We have three choices for dealing with a signal: (1) Ignore it, which is not appropriate for most signals raised by hardware exceptions; (2) Let the default action occur; (3) Write our own handler function to deal with it; this is called "catching" the signal.

   Many conditions generate signals. Two terminal keys, called the interrupt key (often the DELETE key or Control-C) and the quit key (often Control-backslash), are used to interrupt the currently running process. Another way to generate a signal is by calling the kill function. We can call this function from a process to send a signal to another process.

     To catch this signal, the program needs to call the signal function, specifying the name of the function to call when the SIGINT signal (the signal generated by the interrupt key, i.e. Ctrl+C) is generated.


1.10 Time Values

  Historically, UNIX systems have maintained two different time values:

     (1) Calendar time. This value counts the number of seconds since the Epoch: 00:00:00 January 1, 1970, Coordinated Universal Time (UTC). These time values are used to record the time when a file was last modified, for example.

     (2) Process time. This is also called CPU time and measures the central processor resources used by a process.

    When we measure the execution time of a process, we'll see that the UNIX System maintains three values for a process:

  • Clock time,

  • User CPU time,

  • System CPU time.

    The clock time, sometimes called wall clock time, is the amount of time the process takes to run, and its value depends on the number of other processes being run on the system. Whenever we report the clock time, the measurements are made with no other activities on the system.

   User time is the time spent executing ordinary user-mode instructions; system time is the time spent in system calls and privileged instructions. The user CPU time is the CPU time attributed to user instructions. The system CPU time is the CPU time attributed to the kernel when it executes on behalf of the process.

1.11 System calls & Library functions

   All operating systems provide service points through which programs request services from the kernel. All implementations of the UNIX System provide a well-defined, limited number of entry points directly into the kernel called system calls.

   The system call interface has always been documented in Section 2 of the UNIX Programmer's Manual. Its definition is in the C language, regardless of the actual implementation technique used on any given system to invoke a system call.

   The technique used on UNIX systems is for each system call to have a function of the same name in the standard C library. The user process calls this function, using the standard C calling sequence. This function then invokes the appropriate kernel service, using whatever technique is required on the system.

    In effect, the standard library wraps the system calls, which makes them more convenient and safer for the user to call, and less error-prone. For our purposes, we can consider the system calls as being C functions.

    Section 3 of the UNIX Programmer's Manual defines the general-purpose functions available to programmers. These functions aren't entry points into the kernel, although they may invoke one or more of the kernel's system calls.

    From an implementor's point of view, the distinction between a system call and a library function is fundamental. But from a user's perspective, the difference is not as critical. From our perspective in this text, both system calls and library functions appear as normal C functions. Both exist to provide services for application programs. We should realize, however, that we can replace the library functions, if desired, whereas the system calls usually cannot be replaced.

    An application can call either a system call or a library routine. Also realize that many library routines invoke a system call.

 

    Another difference between system calls and library functions is that system calls usually provide a minimal interface, whereas library functions often provide more elaborate functionality.

   The process control system calls (fork, exec, and wait) are usually invoked by the user's application code directly.

1.12 Summary

   We've described some of the fundamental terms that we'll encounter over and over again.

                      Chapter 2. UNIX Standardization and Implementations

2.1 Introduction

   This chapter reviews some of the historical standardization efforts, which have continued as UNIX has grown. A quick read is enough.

   An important part of all the standardization efforts is the specification of various limits that each implementation must define, so we look at these limits and the various ways to determine their values.

2.2 UNIX Standardization

(1) ISO C

  The base standard for the C language, covering the language's syntax and semantics and its supporting library.

   ANSI and ISO's standard for the C programming language.

   This standard defines not only the syntax and semantics of the programming language but also a standard library.

   This library is important because all contemporary UNIX systems, such as the ones described in this book, provide the library routines that are specified in the C standard.

(2) IEEE POSIX

   An interface standard for UNIX systems (it does not prescribe how the interfaces are implemented) that makes development in the UNIX environment easier. It greatly improves portability and consistency across systems.

   Of specific interest to this book is the 1003.1 operating system interface standard, whose goal is to promote the portability of applications among various UNIX System environments. This standard defines the services that must be provided by an operating system if it is to be "POSIX compliant," and has been adopted by most computer vendors.

    Because the 1003.1 standard specifies an interface and not an implementation, no distinction is made between system calls and library functions. All the routines in the standard are called functions.

   Figure 2.2, Figure 2.3, and Figure 2.4 summarize the required and optional headers as specified by POSIX.1. Because POSIX.1 includes the ISO C standard library functions, it also requires the headers listed in Figure 2.1.

 ...

(3) The Single UNIX Specification 

    This standardizes UNIX implementations further. The Single UNIX Specification, a superset of the POSIX.1 standard, specifies additional interfaces that extend the functionality provided by the basic POSIX.1 specification. The complete set of system interfaces is called the X/Open System Interface (XSI).

 (The XSI also defines which optional portions of POSIX.1 must be supported for an implementation to be deemed XSI conforming. Only XSI-conforming implementations can be called UNIX systems.) Because The Open Group owns the UNIX trademark, it can exert stronger control over which implementations may be called UNIX.

(4) FIPS

   Another federal standard, set by the U.S. government; the book does not go into it.

2.3 UNIX System Implementation

   The previous section described ISO C, IEEE POSIX, and the Single UNIX Specification; three standards created by independent organizations. Standards, however, are interface specifications. How do these standards relate to the real world? These standards are taken by vendors and turned into actual implementations.

Three branches of the tree evolved.

  1. One at AT&T that led to System III and System V, the so-called commercial versions of the UNIX System.

  2. One at the University of California at Berkeley that led to the 4.xBSD implementations.

  3. The research version of the UNIX System, developed at the Computing Science Research Center of AT&T Bell Laboratories, that led to the UNIX Time-Sharing System 8th Edition, 9th Edition, and ended with the 10th Edition in 1990.

  UNIX System V Release 4 (SVR4) was a product of AT&T's UNIX System Laboratories (USL, formerly AT&T's UNIX Software Operation).

   All software produced by the FreeBSD project is freely available in both binary and source forms.

   Linux is an operating system that provides a rich UNIX programming environment, and is freely available under the GNU General Public License.

   Mac OS X is based on entirely different technology than prior versions. The core operating system is called "Darwin," and is based on a combination of the Mach kernel and the FreeBSD operating system.

   Solaris is the version of the UNIX System developed by Sun Microsystems.

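2.4 Relationship of Standards and Implementations

   Each particular implementation conforms to some of the standards while also having its own characteristics.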

2.5 Limits

  Both the standards and the implementations set values for various parameters, and these values generally differ from one to another.

  The implementations define many magic numbers and constants. Many of these have been hard coded into programs or were determined using ad hoc techniques.

  Compile-time limits can be defined in headers that any program can include at compile time. But runtime limits require the process to call a function to obtain the value of the limit.

  We've listed various minimum values that an implementation must support, but how do we find out the limits that a particular system actually supports?

2.6 Options

  Some features in the standards are optional rather than required. When a program depends on such options, we need to check whether the system actually implements them.

   If we are to write portable applications that depend on any of these optionally-supported features, we need a portable way to determine whether an implementation supports a given option.

   The Single UNIX Specification defines three ways to do this:

  (1) Compile-time options are defined in <unistd.h>.

  (2) Runtime options that are not associated with a file or a directory are identified with the sysconf function.

  (3) Runtime options that are associated with a file or a directory are discovered by calling either the pathconf or the fpathconf function.

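2.7 Feature Test Macros

   We can define certain macros ourselves so that a program is compiled against the definitions of a specified standard rather than the implementation-specific ones of a particular system. The headers define numerous POSIX.1 and XSI symbols, as we've described. But most implementations can add their own definitions to these headers, in addition to the POSIX.1 and XSI definitions.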

   The constants _POSIX_C_SOURCE and _XOPEN_SOURCE are called feature test macros. All feature test macros begin with an underscore. When used, they are typically defined in the cc command, as in

  cc -D_POSIX_C_SOURCE=200112 file.c

2.8  Primitive System Data Types

  Some of the system's data types are defined in terms of the underlying C types, so they may be implementation-dependent. Historically, certain C data types have been associated with certain UNIX system variables. For example, the major and minor device numbers have historically been stored in a 16-bit short integer, with 8 bits for the major device number and 8 bits for the minor device number. But many larger systems need more than 256 values for these device numbers, so a different technique is needed.

  The header <sys/types.h> defines some implementation-dependent data types, called the primitive system data types. More of these data types are defined in other headers also. These data types are defined in the headers with the C typedef facility.

2.9 Conflicts between Standards

   Overall, the standards mostly agree; they differ only in small details, because they work at different levels of abstraction.

2.10 Summary

   These standards try to define certain parameters that can change with each implementation, but we've seen that these limits are imperfect. We'll encounter many of these limits and magic constants as we proceed through the text.


    



