A Simple SIP Call Example

Source: Internet  Published by: c语言提成英文  Editor: 程序博客网  Time: 2024/06/05 01:19

This article is translated from page 18 of the second edition of Alan B. Johnston's "SIP: Understanding the Session Initiation Protocol".

A simple Session Establishment Example

Original book:

http://bks3.books.google.com/books?id=VMP6gCBazzIC&printsec=frontcover

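Translator's note: It has been a long time since I translated anything, and writing this was still somewhat difficult. Reading the original directly, I did not notice any obstacles, but putting it into Chinese was not nearly as smooth. Fortunately, after several hours of stumbling along, I managed to finish the translation. When I looked at the clock, wow, it was almost 2 a.m. I really lost track of time once I got absorbed.

My abilities are limited, so corrections are welcome. I also dedicate this article to friends who love VoIP technology.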

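Figure 2.1 shows the SIP message exchange between two SIP-enabled devices. The two devices could be SIP phones, handhelds, palmtops, or cell phones. It is assumed that both devices are connected to an IP network such as the Internet and already know each other's IP addresses.

The calling party, Tesla, starts the message exchange by sending a SIP INVITE to the called party, Marconi. The INVITE request carries the details of the type of session or call being requested. It could be a simple voice (audio) session, a multimedia session such as a video conference, or a gaming session.

The INVITE message contains the following: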

INVITE sip:marconi@radio.org SIP/2.0

Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b

Max-Forwards: 70

To: G. Marconi <sip:Marconi@radio.org>

From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341

Call-ID: 123456789@lab.high-voltage.org

CSeq: 1 INVITE

Subject: About That Power Outage...

Contact: <sip:n.tesla@lab.high-voltage.org>

Content-Type: application/sdp

Content-Length: 158

v=0

o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org

s=Phone Call

c=IN IP4 100.101.102.103

t=0 0

m=audio 49170 RTP/AVP 0

a=rtpmap:0 PCMU/8000

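Because SIP is a text-encoded protocol, this is exactly how the SIP message would appear "on the wire", for example as a UDP datagram carried over Ethernet.

The fields listed in the INVITE message are called header fields. They all have the form Header: Value CRLF. The first line is called the start line. It lists the method (INVITE), then the Request-URI, and finally the SIP version number (2.0), all separated by spaces. Every line of a SIP message is terminated by a CRLF. The Request-URI is a special form of SIP URI that indicates the resource to which the request is being sent, also called the request target. SIP URIs are discussed in more detail in later sections.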

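The first header field after the start line is Via. Each SIP device that originates or forwards a SIP message adds its own address in a Via header field, usually as a host name that can be resolved to an IP address through DNS. The Via header field contains the SIP version (2.0), a "/", then UDP to indicate UDP transport, a space, the host name or IP address, a colon, and finally the port number, which in this example is the well-known SIP port 5060. Transport of SIP over TCP, UDP, TLS, and SCTP, and the use of port numbers, are covered later in the chapter. The branch parameter is a transaction identifier. Responses to this request can be correlated with it because they carry the same transaction identifier.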

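The next header field is Max-Forwards. It is initialized to some large integer, and each SIP server that receives and forwards the request decrements it, which provides simple loop detection.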

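Next come the To and From header fields, which identify the destination and the originator of the SIP request. When a name label is used, as in this example, the SIP URI is enclosed in angle brackets and is used to route the request. The name label may be displayed during alerting, but the protocol itself does not use it.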

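The Call-ID header field is an identifier used to keep track of a particular SIP session. The originator of the request creates a locally unique string and then usually appends an "@" and its host name to make the identifier globally unique. In addition to the Call-ID, each party in the session contributes a random identifier that is different for every call. These identifiers are called tags and are included in the To and From header fields as the session is established. The initial INVITE contains a From tag but no To tag.

The user agent that generates the INVITE to establish the session also generates the unique Call-ID and the From tag. The user agent that answers the INVITE generates the To tag. Together, the local tag (in the From header field), the remote tag (in the To header field), and the Call-ID uniquely identify the established session, which is known as a "dialog". Both parties use this dialog identifier to recognize the call, because several calls may be set up between them at the same time. Subsequent requests within the established session also use this dialog identifier, as the following examples show.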

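The next header field is CSeq, or command sequence. It contains a number followed by a method name, INVITE in this case. The number is incremented for each new request sent. In this example it is initialized to 1, but it could start at another integer value.

Via, Max-Forwards, To, From, Call-ID, and CSeq make up the minimum set of header fields required in any SIP request. Other header fields can be added as optional information or as information required by a specific request type. A Contact header field is also required in this INVITE. It contains the SIP URI of Tesla's communication device, known as a user agent (UA), and this URI can be used to route messages directly to Tesla. The optional Subject header field also appears in this example. The protocol does not use it, but it can be displayed while the called party is being alerted, to help that party decide whether to accept the call. This is similar to the screening commonly done with the From and Subject fields of an e-mail. The additional header fields in this INVITE carry the media information needed to set up the call.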

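The Content-Type and Content-Length header fields indicate that the message body is SDP and contains 158 octets of data. Table 2.1 shows how the count of 158 octets is obtained. The CRLF at the end of each line is shown as <CR><LF>, and the octet count of each line is shown on the right. A blank line separates the message body from the header fields, which end with Content-Length. In this example, seven lines of SDP describe the media attributes that the caller, Tesla, wants for the call. This media information is required because SIP makes no assumptions about the type of media session to be established, so the caller must state exactly what type of session (audio, video, gaming) is wanted. The SDP field names are listed in Table 2.2 and discussed in Section 7.1, but a quick look at the lines shows the basic information needed to establish a session.

Table 2.1: Content-Length Calculation Example

LINE                                                               TOTAL
v=0<CR><LF>                                                            5
o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org<CR><LF>     59
s=Phone Call<CR><LF>                                                  14
c=IN IP4 100.101.102.103<CR><LF>                                      26
t=0 0<CR><LF>                                                          7
m=audio 49170 RTP/AVP 0<CR><LF>                                       25
a=rtpmap:0 PCMU/8000<CR><LF>                                          22
                                                                     158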

Table 2.2: SDP Data from Example

SDP Parameter                                              Parameter Name
v=0                                                        Version number
o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org  Origin containing name
s=Phone Call                                               Subject
c=IN IP4 100.101.102.103                                   Connection
t=0 0                                                      Time
m=audio 49170 RTP/AVP 0                                    Media
a=rtpmap:0 PCMU/8000                                       Attributes

Table 2.2 includes the:

Connection IP address: 100.101.102.103

Media format: audio

Port number: 49170

Media transport protocol: RTP

Media encoding: PCM μ-law

Sampling rate: 8,000 Hz

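INVITE is just one example of a SIP request message. RFC 3261 defines five other methods, or types of SIP request, and extension RFCs define still more. The next message in Figure 2.1 is the 180 Ringing message sent in response to the INVITE. It indicates that the called party has received the INVITE and that alerting is in progress. Alerting could be ringing a phone, displaying a message on a screen, or any other way of attracting the attention of the called party, Marconi.

180 Ringing is an example of a SIP response message. Responses are numerical and are classified by their first digit. A 180 response belongs to the informational class, identified by a first digit of 1. Informational responses convey noncritical information about the progress of the call. Many SIP response codes are based on HTTP version 1.1 response codes, with some extensions and additions. Anyone who has browsed the Web has probably received a "404 Not Found" response from a Web server when a requested page did not exist. 404 Not Found is also a valid SIP "client error class" response, returned when a request is addressed to an unknown user. The other classes of SIP responses are described in Chapter 5.

In SIP, the response code number alone determines how the server or the user interprets the response. The reason phrase, Ringing in this example, is the one suggested by the standard, but any text can be used to convey more information. For instance, 180 Hold your horses, I'm trying to wake him up! is a perfectly valid SIP response and means the same thing as 180 Ringing.

The 180 Ringing response has the following structure: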

SIP/2.0 180 Ringing

Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b

;received=100.101.102.103

To: G. Marconi <sip:marconi@radio.org>;tag=a53e42

From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341

Call-ID: 123456789@lab.high-voltage.org

CSeq: 1 INVITE

Contact: <sip:marconi@tower.radio.org>

Content-Length: 0

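The message is created largely by copying header fields from the INVITE, including Via, To, From, Call-ID, and CSeq, and then adding a response start line that contains the SIP version number, the response code, and the reason phrase. This approach simplifies the processing of responses.

The Via header field contains the original branch parameter and also an additional received parameter. This parameter holds the literal IP address from which the request was received (100.101.102.103), which is typically the same address that the host name in the Via (lab.high-voltage.org) resolves to through DNS.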

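Note that the To and From header fields are not reversed in the response, as one might expect them to be. Although this message is sent from Marconi to Tesla, the header fields read the opposite way. This is because the To and From header fields in SIP indicate the direction of the request, not the direction of the message. Because Tesla initiated this request, all responses to it read To: Marconi and From: Tesla.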

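The To header field now contains a tag generated by Marconi. All subsequent requests and responses in this session, or dialog, will contain both the tag generated by Tesla and the tag generated by Marconi.

The response also contains a Contact header field with an address at which Marconi can be reached directly once the session is established.

When the called party, Marconi, decides to accept the call (for example, by answering the phone), a 200 OK response is sent. This response also indicates that the type of media session proposed by the caller is acceptable. The 200 OK is an example of a success class response. The message body of the 200 OK contains Marconi's media information: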

SIP/2.0 200 OK

Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b

;received=100.101.102.103

To: G. Marconi <sip:marconi@radio.org>;tag=a53e42

From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341

Call-ID: 123456789@lab.high-voltage.org

CSeq: 1 INVITE

Contact: <sip:marconi@tower.radio.org>

Content-Type: application/sdp

Content-Length: 155

v=0

o=Marconi 2890844528 2890844528 IN IP4 tower.radio.org

s=Phone Call

c=IN IP4 200.201.202.203

t=0 0

m=audio 60000 RTP/AVP 0

a=rtpmap:0 PCMU/8000

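This response is constructed in the same way as the 180 Ringing response and contains the same To tag and Contact URI. The media capabilities, however, must be communicated in an SDP message body added to the response. Using the same SDP fields as Table 2.2, the SDP contains:

End-point IP address (200.201.202.203);

Media format (audio);

Port number (60000);

Media transport protocol (RTP);

Media encoding (PCM μ-law);

Sampling rate (8,000 Hz).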

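The final step is to confirm the media session with an "acknowledgment" request. The confirmation means that Tesla has successfully received Marconi's response. This exchange of media information allows the media session to be established using another protocol, RTP in this example.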

ACK sip:marconi@tower.radio.org SIP/2.0

Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bK321g

Max-Forwards: 70

To: G. Marconi <sip:marconi@radio.org>;tag=a53e42

From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341

Call-ID: 123456789@lab.high-voltage.org

CSeq: 1 ACK

Content-Length: 0

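The command sequence, CSeq, has the same number as the INVITE, but the method is set to ACK. At this point the media session begins, using the media information carried in the SIP messages. The media session takes place over another protocol, typically RTP. The branch parameter in the Via header field contains a transaction identifier different from that of the INVITE, because an ACK sent to acknowledge a 200 OK is considered a separate transaction.

This message exchange shows that SIP is an end-to-end signaling protocol. A SIP network or SIP server is not required in order to use the protocol. Two end points that run a SIP protocol stack and know each other's IP addresses can use SIP to set up a session between them. Although less obvious, the example also shows the client-server nature of SIP. When Tesla originates the INVITE request, he acts as a SIP client, and when Marconi responds to it, he acts as a SIP server. After the media session is established, Marconi originates the BYE request and so acts as a SIP client, while Tesla acts as a SIP server when he responds. This is why a SIP-enabled device must contain both SIP server and SIP client software: during a typical session both are indispensable. This differs from other client-server Internet protocols such as HTTP or FTP. A Web browser is always an HTTP client and a Web server is always an HTTP server, and the same holds for FTP. In SIP, an end point switches back and forth between client and server during a session.

In Figure 2.1, Marconi sends a BYE request to end the session: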

BYE sip:n.tesla@lab.high-voltage.org SIP/2.0

Via: SIP/2.0/UDP tower.radio.org:5060;branch=z9hG4bK392kf

Max-Forwards: 70

To: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341

From: G. Marconi <sip:marconi@radio.org>;tag=a53e42

Call-ID: 123456789@lab.high-voltage.org

CSeq: 1 BYE

Content-Length: 0

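The Via header field in this example contains Marconi's host address and a new transaction identifier, because the BYE is considered a separate transaction from the INVITE and ACK transactions shown earlier. The To and From header fields reflect that this request originates from Marconi, so they are reversed relative to the messages of the previous transaction. Tesla can nevertheless identify the dialog from the same local tag, remote tag, and Call-ID as in the INVITE, and then tear down the corresponding media session. Notice that all the branch IDs in the example begin with the string z9hG4bK. This special string indicates that the branch ID was computed using the strict rules defined in RFC 3261 and is therefore usable as a transaction identifier [1].

The response to the BYE is a 200 OK: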

SIP/2.0 200 OK

Via: SIP/2.0/UDP tower.radio.org:5060;branch=z9hG4bK392kf

;received=200.201.202.203

To: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341

From: G. Marconi <sip:marconi@radio.org>;tag=a53e42

Call-ID: 123456789@lab.high-voltage.org

CSeq: 1 BYE

Content-Length: 0

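The response echoes the CSeq of the original request: 1 BYE.

[1] This string is needed because user agents that predate RFC 3261 may have generated branch IDs that are not suitable as transaction identifiers. In that case, a client must construct its own transaction identifier from the To tag, From tag, Call-ID, and CSeq.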

Figure 2.1 shows the SIP message exchange between two SIP-enabled devices. The two devices could be SIP phones, hand-helds, palmtops, or cell phones. It is assumed that both devices are connected to an IP network such as the Internet and know each other's IP address.

Figure 2.1: A simple SIP session establishment example.

The calling party, Tesla, begins the message exchange by sending a SIP INVITE message to the called party, Marconi. The INVITE contains the details of the type of session or call that is requested. It could be a simple voice (audio) session, a multimedia session such as a video conference, or it could be a gaming session.

The INVITE message contains the following fields:

     INVITE sip:marconi@radio.org SIP/2.0
     Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b
     Max-Forwards: 70
     To: G. Marconi <sip:Marconi@radio.org>
     From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341
     Call-ID: 123456789@lab.high-voltage.org
     CSeq: 1 INVITE
     Subject: About That Power Outage...
     Contact: <sip:n.tesla@lab.high-voltage.org>
     Content-Type: application/sdp
     Content-Length: 158

     v=0
     o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org
     s=Phone Call
     c=IN IP4 100.101.102.103
     t=0 0
     m=audio 49170 RTP/AVP 0
     a=rtpmap:0 PCMU/8000

Since SIP is a text-encoded protocol, this is actually what the SIP message would look like "on the wire" as a UDP datagram being transported over, for example, Ethernet.

The fields listed in the INVITE message are called header fields. They have the form Header: Value CRLF. The first line of the request message, called the start line, lists the method, which is INVITE, the Request-URI, then the SIP version number (2.0), all separated by spaces. Each line of a SIP message is terminated by a CRLF. The Request-URI is a special form of SIP URI and indicates the resource to which the request is being sent, also known as the request target. SIP URIs are discussed in more detail in later sections.
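The start-line and header layout described above can be illustrated with a small parser. This is only a sketch: `parse_sip_request` is a hypothetical helper written for this example, not part of the book or of any SIP library, and it deliberately ignores folded header lines, compact header names, and repeated header fields.

```python
def parse_sip_request(raw: str):
    """Split a SIP request into (method, request_uri, version, headers, body)."""
    # A blank line (CRLF CRLF) separates the header fields from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Start line: method, Request-URI, and SIP version separated by spaces.
    method, request_uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Each header field has the form "Header: Value".
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, request_uri, version, headers, body


raw = (
    "INVITE sip:marconi@radio.org SIP/2.0\r\n"
    "Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b\r\n"
    "Max-Forwards: 70\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
method, uri, version, headers, body = parse_sip_request(raw)
print(method, uri, version)
print(headers["Via"])
```

Splitting each header on the first colon only matters here because values such as the Via contain further colons (the port separator).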

The first header field following the start line shown is a Via header field. Each SIP device that originates or forwards a SIP message stamps its own address in a Via header field, usually written as a host name that can be resolved into an IP address using a DNS query. The Via header field contains the SIP version number (2.0), a "/", then UDP for UDP transport, a space, then the hostname or address, a colon, then a port number, in this example the "well-known" SIP port number 5060. Transport of SIP using TCP, UDP, TLS, and SCTP and the use of port numbers are covered later in this chapter. The branch parameter is a transaction identifier. Responses relating to this request can be correlated because they will contain this same transaction identifier.
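As a rough illustration of the Via layout just described, the sketch below splits the example value into its parts. `parse_via` is a hypothetical helper for this example only. It assumes a single Via value with a host name or IPv4 address (no bracketed IPv6 literal), and falling back to port 5060 when no port is given is an assumption made here for simplicity.

```python
def parse_via(value: str) -> dict:
    """Parse a Via value such as 'SIP/2.0/UDP host:port;param=value'."""
    sent_protocol, _, rest = value.strip().partition(" ")
    name, version, transport = sent_protocol.split("/")
    sent_by, *raw_params = rest.strip().split(";")
    host, _, port = sent_by.partition(":")
    params = {}
    for item in raw_params:
        key, _, val = item.strip().partition("=")
        params[key] = val
    return {
        "version": f"{name}/{version}",
        "transport": transport,
        "host": host,
        # Assumption for this sketch: use the well-known port when none is given.
        "port": int(port) if port else 5060,
        "params": params,
    }


via = parse_via("SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b")
print(via["transport"], via["host"], via["port"], via["params"]["branch"])
```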

The next header field shown is the Max-Forwards header field, which is initialized to some large integer and decremented by each SIP server, which receives and forwards the request, providing simple loop detection.
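A minimal sketch of the hop counting described above follows. `forward_max_forwards` is a hypothetical function, and headers are modeled as a plain dict for illustration. The rejection case reflects RFC 3261, where an element that receives a request with Max-Forwards at 0 answers 483 Too Many Hops rather than forwarding it; the sketch models that as an exception.

```python
def forward_max_forwards(headers: dict) -> dict:
    """Return a copy of the headers with Max-Forwards decremented by one hop."""
    remaining = int(headers["Max-Forwards"])
    if remaining <= 0:
        # A real server would send a 483 response instead of forwarding.
        raise ValueError("483 Too Many Hops")
    forwarded = dict(headers)
    forwarded["Max-Forwards"] = str(remaining - 1)
    return forwarded


hop = {"Max-Forwards": "70"}
for _ in range(3):          # the request passes through three servers
    hop = forward_max_forwards(hop)
print(hop["Max-Forwards"])  # 67
```

A request caught in a forwarding loop eventually reaches 0 and is dropped, which is the simple loop detection the text refers to.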

The next header fields are the To and From header fields, which show the originator and destination of the SIP request. When a name label is used, as in this example, the SIP URI is enclosed in brackets and used for routing the request. The name label could be displayed during alerting, for example, but is not used by the protocol.

The Call-ID header field is an identifier used to keep track of a particular SIP session. The originator of the request creates a locally unique string, then usually adds an "@" and its host name to make it globally unique. In addition to the Call-ID, each party in the session also contributes a random identifier, unique for each call. These identifiers, called tags, are included in the To and From header fields as the session is established. The initial INVITE shown contains a From tag but no To tag.

The user agent that generates the initial INVITE to establish the session generates the unique Call-ID and From tag. In the response to the INVITE, the user agent answering the request will generate the To tag. The combination of the local tag (contained in the From header field), remote tag (contained in the To header field), and the Call-ID uniquely identifies the established session, known as a "dialog". This dialog identifier is used by both parties to identify this call because they could have multiple calls set up between them. Subsequent requests within the established session will use this dialog identifier, as will be shown in the following examples.
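The dialog identifier can be sketched as a three-part key. The names `DialogId`, `tag_of`, `dialog_id_for_sender`, and `dialog_id_for_receiver` are hypothetical and exist only for this illustration. The point is that the same user agent arrives at the same key whether it sent a request (its own tag is in From) or received one (its own tag is in To).

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def tag_of(header_value: str) -> str:
    """Return the tag parameter of a To/From value, or '' if there is none."""
    for param in header_value.split(";")[1:]:
        key, _, val = param.strip().partition("=")
        if key == "tag":
            return val
    return ""


def dialog_id_for_sender(headers: dict) -> DialogId:
    # The sender's own (local) tag is in From, the peer's tag is in To.
    return DialogId(headers["Call-ID"], tag_of(headers["From"]), tag_of(headers["To"]))


def dialog_id_for_receiver(headers: dict) -> DialogId:
    # The receiver's own (local) tag is in To, the peer's tag is in From.
    return DialogId(headers["Call-ID"], tag_of(headers["To"]), tag_of(headers["From"]))


# Header values taken from the ACK (sent by Tesla) and the BYE (sent by Marconi).
ack_headers = {
    "To": "G. Marconi <sip:marconi@radio.org>;tag=a53e42",
    "From": "Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341",
    "Call-ID": "123456789@lab.high-voltage.org",
}
bye_headers = {
    "To": "Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341",
    "From": "G. Marconi <sip:marconi@radio.org>;tag=a53e42",
    "Call-ID": "123456789@lab.high-voltage.org",
}
tesla_sent = dialog_id_for_sender(ack_headers)
tesla_received = dialog_id_for_receiver(bye_headers)
print(tesla_sent == tesla_received)
```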

The next header field shown is the CSeq, or command sequence. It contains a number, followed by the method name, INVITE in this case. This number is incremented for each new request sent. In this example, the command sequence number is initialized to 1, but it could start at another integer value.

The Via header fields plus the Max-Forwards, To, From, Call-ID, and CSeq header fields represent the minimum required header field set in any SIP request message. Other header fields can be included as optional additional information, or information needed for a specific request type. A Contact header field is also required in this INVITE message, which contains the SIP URI of Tesla's communication device, known as a user agent (UA); this URI can be used to route messages directly to Tesla. The optional Subject header field is present in this example. It is not used by the protocol, but could be displayed during alerting to aid the called party in deciding whether to accept the call. The same sort of useful prioritization and screening commonly performed using the Subject and From header fields in an e-mail message is also possible with a SIP INVITE request. Additional header fields are present in this INVITE message, which contain the media information necessary to set up the call.

The Content-Type and Content-Length header fields indicate that the message body is SDP [3] and contains 158 octets of data. The basis for the octet count of 158 is shown in Table 2.1, where the CR LF at the end of each line is shown as <CR><LF> and the octet count for each line is shown on the right-hand side. A blank line separates the message body from the header field list, which ends with the Content-Length header field. In this case, there are seven lines of SDP data describing the media attributes that the caller Tesla desires for the call. This media information is needed because SIP makes no assumptions about the type of media session to be established, so the caller must specify exactly what type of session (audio, video, gaming) he wishes to establish. The SDP field names are listed in Table 2.2, and will be discussed in detail in Section 7.1, but a quick review of the lines shows the basic information necessary to establish a session.

Table 2.1: Content-Length Calculation Example

LINE                                                               TOTAL
v=0<CR><LF>                                                            5
o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org<CR><LF>     59
s=Phone Call<CR><LF>                                                  14
c=IN IP4 100.101.102.103<CR><LF>                                      26
t=0 0<CR><LF>                                                          7
m=audio 49170 RTP/AVP 0<CR><LF>                                       25
a=rtpmap:0 PCMU/8000<CR><LF>                                          22
                                                                     158
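The arithmetic behind Table 2.1 can be checked with a few lines of code. This is a small sketch that assumes the body is exactly the seven SDP lines of the INVITE, each terminated by CRLF; the variable names are illustrative.

```python
sdp_lines = [
    "v=0",
    "o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org",
    "s=Phone Call",
    "c=IN IP4 100.101.102.103",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
]
CRLF = "\r\n"

# Content-Length counts octets, so measure the encoded bytes, CRLF included.
per_line = [len((line + CRLF).encode("ascii")) for line in sdp_lines]
content_length = sum(per_line)

for line, octets in zip(sdp_lines, per_line):
    print(f"{octets:>3}  {line}")
print(f"{content_length:>3}  total")
```

The same calculation on the SDP body of Marconi's 200 OK gives the 155 shown in that message, because only the origin line is three octets shorter.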

Table 2.2: SDP Data from Example

SDP Parameter                                              Parameter Name
v=0                                                        Version number
o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org  Origin containing name
s=Phone Call                                               Subject
c=IN IP4 100.101.102.103                                   Connection
t=0 0                                                      Time
m=audio 49170 RTP/AVP 0                                    Media
a=rtpmap:0 PCMU/8000                                       Attributes

Table 2.2 includes the:

Connection IP address (100.101.102.103);
Media format (audio);
Port number (49170);
Media transport protocol (RTP);
Media encoding (PCM μ Law);
Sampling rate (8,000 Hz).
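The items in the list above can be pulled out of the SDP body mechanically. The sketch below uses a hypothetical `summarize_sdp` helper and assumes a body shaped like the example: one c= line, one m= line, and a first a= line that is an rtpmap attribute. It is not a general SDP parser.

```python
def summarize_sdp(body: str) -> dict:
    """Extract the basic media parameters from a simple SDP body."""
    fields = {}
    for line in body.splitlines():
        key, _, value = line.partition("=")
        fields.setdefault(key, []).append(value)

    # c=<network type> <address type> <connection address>
    _net_type, _addr_type, address = fields["c"][0].split()
    # m=<media> <port> <transport> <format list>
    media, port, transport, *_formats = fields["m"][0].split()
    # a=rtpmap:<payload type> <encoding name>/<clock rate>
    rtpmap = fields["a"][0].split(":", 1)[1]
    _payload_type, encoding = rtpmap.split(" ", 1)
    codec, clock_rate = encoding.split("/")[:2]

    return {
        "address": address,
        "media": media,
        "port": int(port),
        "transport": transport,
        "codec": codec,
        "clock_rate": int(clock_rate),
    }


invite_sdp = (
    "v=0\r\n"
    "o=Tesla 2890844526 2890844526 IN IP4 lab.high-voltage.org\r\n"
    "s=Phone Call\r\n"
    "c=IN IP4 100.101.102.103\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
)
summary = summarize_sdp(invite_sdp)
print(summary)
```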

INVITE is an example of a SIP request message. There are five other methods or types of SIP requests currently defined in the SIP specification RFC 3261 and others in extension RFCs. The next message in Figure 2.1 is a 180 Ringing message sent in response to the INVITE. This message indicates that the called party Marconi has received the INVITE and that alerting is taking place. The alerting could be ringing a phone, flashing a message on a screen, or any other method of attracting the attention of the called party, Marconi.

The 180 Ringing is an example of a SIP response message. Responses are numerical and are classified by the first digit of the number. A 180 response is an "informational class" response, identified by the first digit being a 1. Informational responses are used to convey noncritical information about the progress of the call. Many SIP response codes were based on HTTP version 1.1 response codes with some extensions and additions. Anyone who has ever browsed the World Wide Web has likely received a "404 Not Found" response from a Web server when a requested page was not found. 404 Not Found is also a valid SIP "client error class" response in a request to an unknown user. The other classes of SIP responses are covered in Chapter 5.

The response code number in SIP alone determines the way the response is interpreted by the server or the user. The reason phrase, Ringing in this case, is suggested in the standard, but any text can be used to convey more information. For instance, 180 Hold your horses, I'm trying to wake him up! is a perfectly valid SIP response and has the same meaning as a 180 Ringing response.
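Classification by first digit can be sketched in a few lines. The excerpt names only the informational, success, and client error classes; the remaining class names in the table below come from RFC 3261, and `classify_status_line` is a hypothetical helper for illustration.

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def classify_status_line(status_line: str):
    """Return (code, class name, reason phrase) for a SIP status line."""
    _version, code, reason = status_line.split(" ", 2)
    # Only the numeric code decides the class; the reason phrase is free text.
    return int(code), RESPONSE_CLASSES[int(code) // 100], reason


for line in (
    "SIP/2.0 180 Ringing",
    "SIP/2.0 180 Hold your horses, I'm trying to wake him up!",
    "SIP/2.0 200 OK",
    "SIP/2.0 404 Not Found",
):
    print(classify_status_line(line))
```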

The 180 Ringing response has the following structure:

     SIP/2.0 180 Ringing
     Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b
      ;received=100.101.102.103
     To: G. Marconi <sip:marconi@radio.org>;tag=a53e42
     From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341
     Call-ID: 123456789@lab.high-voltage.org
     CSeq: 1 INVITE
     Contact: <sip:marconi@tower.radio.org>
     Content-Length: 0

The message was created by copying many of the header fields from the INVITE message, including the Via, To, From, Call-ID, and CSeq, then adding a response start line containing the SIP version number, the response code, and the reason phrase. This approach simplifies the message processing for responses.

The Via header field contains the original branch parameter but also has an additional received parameter. This parameter contains the literal IP address that the request was received from (100.101.102.103), which typically is the same address that the URI in the Via resolves using DNS (lab.high-voltage.org).
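The copy-and-extend approach can be sketched as follows. `build_response` is a hypothetical helper and the headers are modeled as a plain dict; it copies the five header fields, adds a To tag if the request had none, and leaves out the received parameter that the real response adds to the Via.

```python
def build_response(request_headers: dict, code: int, reason: str,
                   to_tag: str, contact: str) -> str:
    """Build a body-less SIP response by copying header fields from a request."""
    lines = [f"SIP/2.0 {code} {reason}"]        # response start line
    for name in ("Via", "To", "From", "Call-ID", "CSeq"):
        value = request_headers[name]
        if name == "To" and ";tag=" not in value:
            value += f";tag={to_tag}"           # the answering party adds its tag
        lines.append(f"{name}: {value}")
    lines.append(f"Contact: {contact}")
    lines.append("Content-Length: 0")
    return "\r\n".join(lines) + "\r\n\r\n"


invite_headers = {
    "Via": "SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b",
    "To": "G. Marconi <sip:Marconi@radio.org>",
    "From": "Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341",
    "Call-ID": "123456789@lab.high-voltage.org",
    "CSeq": "1 INVITE",
}
response = build_response(invite_headers, 180, "Ringing",
                          to_tag="a53e42",
                          contact="<sip:marconi@tower.radio.org>")
print(response)
```

Note that To and From are copied as they are, not swapped, so the response still reads To: Marconi and From: Tesla.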

Note that the To and From header fields are not reversed in the response message as one might expect them to be. Even though this message is sent to Tesla from Marconi, the header fields read the opposite. This is because the To and From header fields in SIP are defined to indicate the direction of the request, not the direction of the message. Since Tesla initiated this request, all responses will read To: Marconi From: Tesla.

The To header field now contains a tag that was generated by Marconi. All future requests and responses in this session or dialog will contain both the tag generated by Tesla and the tag generated by Marconi.

The response also contains a Contact header field, which contains an address at which Marconi can be contacted directly once the session is established.

When the called party Marconi decides to accept the call (i.e., the phone is answered), a 200 OK response is sent. This response also indicates that the type of media session proposed by the caller is acceptable. The 200 OK is an example of a "success class" response. The 200 OK message body contains Marconi's media information:

     SIP/2.0 200 OK
     Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bKfw19b
      ;received=100.101.102.103
     To: G. Marconi <sip:marconi@radio.org>;tag=a53e42
     From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341
     Call-ID: 123456789@lab.high-voltage.org
     CSeq: 1 INVITE
     Contact: <sip:marconi@tower.radio.org>
     Content-Type: application/sdp
     Content-Length: 155

     v=0
     o=Marconi 2890844528 2890844528 IN IP4 tower.radio.org
     s=Phone Call
     c=IN IP4 200.201.202.203
     t=0 0
     m=audio 60000 RTP/AVP 0
     a=rtpmap:0 PCMU/8000

This response is constructed the same way as the 180 Ringing response and contains the same To tag and Contact URI. The media capabilities, however, must be communicated in an SDP message body added to the response. From the same SDP fields as Table 2.2, the SDP contains:

End-point IP address (200.201.202.203);
Media format (audio);
Port number (60000);
Media transport protocol (RTP);
Media encoding (PCM μ-Law);
Sampling rate (8,000 Hz).

The final step is to confirm the media session with an "acknowledgment" request. The confirmation means that Tesla has successfully received Marconi's response. This exchange of media information allows the media session to be established using another protocol, RTP in this example.

     ACK sip:marconi@tower.radio.org SIP/2.0
     Via: SIP/2.0/UDP lab.high-voltage.org:5060;branch=z9hG4bK321g
     Max-Forwards: 70
     To: G. Marconi <sip:marconi@radio.org>;tag=a53e42
     From: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341
     Call-ID: 123456789@lab.high-voltage.org
     CSeq: 1 ACK
     Content-Length: 0

The command sequence, CSeq, has the same number as the INVITE, but the method is set to ACK. At this point, the media session begins using the media information carried in the SIP messages. The media session takes place using another protocol, typically RTP. The branch parameter in the Via header field contains a different transaction identifier from that of the INVITE, since an ACK sent to acknowledge a 200 OK is considered a separate transaction.

This message exchange shows that SIP is an end-to-end signaling protocol. A SIP network, or SIP server is not required for the protocol to be used. Two end points running a SIP protocol stack and knowing each other's IP addresses can use SIP to set up a media session between them. Although less obvious, this example also shows the client-server nature of the SIP protocol. When Tesla originates the INVITE request, he is acting as a SIP client. When Marconi responds to the request, he is acting as a SIP server. After the media session is established, Marconi originates the BYE request and acts as the SIP client, while Tesla acts as the SIP server when he responds. This is why a SIP-enabled device must contain both SIP server and SIP client software: during a typical session, both are needed. This is quite different from other client-server Internet protocols such as HTTP or FTP. The Web browser is always an HTTP client, and the Web server is always an HTTP server, and similarly for FTP. In SIP, an end point will switch back and forth during a session between being a client and a server.

In Figure 2.1, a BYE request is sent by Marconi to terminate the media session:

     BYE sip:n.tesla@lab.high-voltage.org SIP/2.0
     Via: SIP/2.0/UDP tower.radio.org:5060;branch=z9hG4bK392kf
     Max-Forwards: 70
     To: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341
     From: G. Marconi <sip:marconi@radio.org>;tag=a53e42
     Call-ID: 123456789@lab.high-voltage.org
     CSeq: 1 BYE
     Content-Length: 0

The Via header field in this example is populated with Marconi's host address and contains a new transaction identifier since the BYE is considered a separate transaction from the INVITE or ACK transactions shown previously. The To and From header fields reflect that this request is originated by Marconi, as they are reversed from the messages in the previous transaction. Tesla, however, is able to identify the dialog using the presence of the same local and remote tags and Call-ID as the INVITE, and tear down the correct media session.

Notice that all the branch IDs shown in the example so far begin with the string z9hG4bK. This is a special string that indicates that the branch ID has been calculated using strict rules defined in RFC 3261 and is as a result usable as a transaction identifier.^[1]
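The role of the z9hG4bK prefix can be sketched as a simple check. `transaction_key` is a hypothetical function. Its fallback follows the footnote's simplified description (To tag, From tag, Call-ID, and CSeq) and is not meant to reproduce the full matching rules of RFC 3261.

```python
MAGIC_COOKIE = "z9hG4bK"


def transaction_key(branch: str, to_tag: str, from_tag: str,
                    call_id: str, cseq: str) -> tuple:
    """Choose a key for matching messages to a transaction."""
    if branch.startswith(MAGIC_COOKIE):
        # The branch was generated under RFC 3261 rules and can be used directly.
        return ("branch", branch)
    # Older branch value: build a key from the other header fields instead.
    return ("legacy", to_tag, from_tag, call_id, cseq)


bye_key = transaction_key("z9hG4bK392kf", "76341", "a53e42",
                          "123456789@lab.high-voltage.org", "1 BYE")
old_key = transaction_key("1234", "76341", "a53e42",
                          "123456789@lab.high-voltage.org", "1 BYE")
print(bye_key)
print(old_key)
```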

The confirmation response to the BYE is a 200 OK:

     SIP/2.0 200 OK
     Via: SIP/2.0/UDP tower.radio.org:5060;branch=z9hG4bK392kf
      ;received=200.201.202.203
     To: Nikola Tesla <sip:n.tesla@high-voltage.org>;tag=76341
     From: G. Marconi <sip:marconi@radio.org>;tag=a53e42
     Call-ID: 123456789@lab.high-voltage.org
     CSeq: 1 BYE
     Content-Length: 0

The response echoes the CSeq of the original request: 1 BYE.

^[1] This string is needed because user agents that predate RFC 3261 may have generated branch IDs that are not suitable as transaction identifiers. In this case, a client must construct its own transaction identifier using the To tag, From tag, Call-ID, and CSeq.