SIP FAQs

Source: Internet. Published by: 苏州爱知. Editor: 程序博客网. Time: 2024/05/23 17:22

What is the difference between tag and branch-id?

Branch IDs allow proxies to match responses to forked requests. Without them, a proxy wouldn't be able to tell which branch a response corresponds to. Tags, in To headers, are of no help here since they are not known until responses arrive. Tags are used by the UAC to distinguish multiple final responses from different UASs.

A UAS has no reliable way of determining if the request has been forked or not. Thus, to be safe it needs to add a tag. Proxies only insert tags into the final responses they generate themselves; they never insert tags into requests or responses they forward.

Since a request can be forked several times on its way to UAS, a single "tag" (or whatever you like to call it) added to the request by one of the proxies is not sufficient for the next forking proxy along the chain to match responses on its own branches; every proxy that forked the request would need to add its own unique IDs to the branches it created. This is precisely what's being achieved by the branch parameter in the Via header.
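
The matching described above can be sketched as follows. This is a minimal illustration, not a real proxy: the ForkingProxy class, the proxy.example.org host name and the "b1", "b2" branch values are assumptions made up for the example. The only point is that each branch gets its own ID in the Via header and that the response is matched on that ID.

```python
import re


def top_via_branch(via_value):
    """Return the branch parameter of a single Via header value, or None."""
    match = re.search(r";\s*branch=([^;,\s]+)", via_value, re.IGNORECASE)
    return match.group(1) if match else None


class ForkingProxy:
    """Hypothetical forking proxy that remembers one branch ID per target."""

    def __init__(self):
        self._branches = {}  # branch ID -> downstream target
        self._counter = 0

    def fork(self, targets):
        """Create a unique branch ID per target; return the Via values used."""
        vias = []
        for target in targets:
            self._counter += 1
            branch = f"b{self._counter}"
            self._branches[branch] = target
            vias.append(f"SIP/2.0/UDP proxy.example.org;branch={branch}")
        return vias

    def match_response(self, top_via):
        """Map a response back to the branch it was sent on."""
        return self._branches.get(top_via_branch(top_via))


proxy = ForkingProxy()
vias = proxy.fork(["sip:bob@pc1.example.org", "sip:bob@pc2.example.org"])
# A response carries back the Via the proxy inserted on that branch.
print(proxy.match_response(vias[1]))
```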

What is the relationship between the From, Contact, Via and Record-Route/Route headers?

All these headers determine how requests and responses are routed in a network of SIP proxy servers. Roughly, the distinction is:

- From:
Used for subsequent requests from the callee to the caller if there is no Contact or Record-Route header. E.g., if Alice makes a call with

From: Alice <sip:alice@example.org>

to Bob, an INVITE request from Bob to Alice would use

sip:alice@example.org

as the To header and Request-URI.

- Contact:
Determines the destination placed in the Request-URI for subsequent requests and can be used to bypass proxies _not_ enumerated in a Record-Route header. Also used in responses by redirect servers and in REGISTER requests and responses.

- Record-Route/Route:
The Record-Route header is inserted into requests by proxies that want to be in the path of subsequent requests for the same call-id. It is then used by the user agent to route subsequent requests. The mechanism is similar to a source-route, as the Record-Route information is copied into a set of Route headers. The Request-URI is set to the first Route header.

- Via:
Via headers are inserted by servers into requests to detect loops and to allow responses to find their way back to the client. They have no influence on the routing of future requests (or responses).

Generally, in short, requests should be sent to Route if present, Contact if there is no Route, From if there is no Contact.
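
As a rough sketch, the precedence rule in the previous sentence could be written like this. The function name and argument names are assumptions for illustration only; a real user agent would work on parsed headers, not plain strings.

```python
def next_request_target(route=None, contact=None, from_uri=None):
    """Pick where a subsequent request goes: Route, else Contact, else From.

    route is a list of Route URIs (first hop first); contact and from_uri
    are single URIs. All of these are plain strings in this sketch.
    """
    if route:
        return route[0]
    if contact:
        return contact
    return from_uri


print(next_request_target(
    route=["sip:proxy.example.org"],
    contact="sip:alice@pc.example.org",
    from_uri="sip:alice@example.org",
))
print(next_request_target(from_uri="sip:alice@example.org"))
```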

How are BYE requests routed?

Since a Contact header MUST be present in INVITE and 200, the BYE will go directly to the user agent if there is no Record-Route header. If there is a Record-Route, it will traverse the list of proxies indicated there.

If the caller decides to send a BYE before receiving a 200 from the callee, the BYE is handled by the proxies just as the corresponding INVITE was handled, i.e., it may be forked.

What's the difference between a stateless and a stateful proxy server?

Stateless proxies forget about the SIP request once it has been forwarded. Stateful proxies remember the request after it has been forwarded, so they can associate the response with some internal state. In other words, stateful proxies maintain transaction state. Stateful implies transaction state, not call state.

Stateless proxies scale very well, and can be very fast. They are good for network cores. Stateful proxies can do more (they can fork, for example, see the next question) and can provide services stateless ones can't (call forward busy, for example). They don't scale as well as stateless ones. An administrator gets to decide which to use. These are also logical entities; a physical proxy is likely to act as a stateless proxy for some calls, stateful for others, and as a redirect server for still others.

Neither stateful nor stateless proxies need to maintain call state. They can, but if they do, they need to make sure that they are part of subsequent transactions by using the Record-Route header.

A proxy must be stateful if one of the following conditions holds:

1. It uses TCP,

2. It uses multicast,

3. It forks.

How does SIP get through a firewall?

There are several possible approaches to SIP-capable firewalls. One of the difficulties is that, unlike for, say, HTTP, connections are originated both by hosts inside and outside the firewall. A likely arrangement is that a SIP proxy sits "on" the firewall and relays SIP requests between the Internet and the intranet. This proxy would also open up the necessary ports in the firewall to let audio and video flow through, for example using SOCKS V5.

As an alternative, if a firewall or NAT allows outgoing TCP connections, the inside client can open up a TCP connection to an outside proxy. All outgoing and incoming calls would then be handled by that TCP connection. (The client would still have to use SOCKS or similar mechanism to convince the firewall to let RTP packets through.)

Take a look at the two drafts at http://www.cs.columbia.edu/~hgs/sip/drafts_firewall.html for a more detailed discussion of getting SIP through firewalls and NATs.

Does SIP do keep-alive?

SIP itself does not have a keep-alive mechanism during the call. It was felt that loss of connectivity would be detected rapidly by the absence of media packets, typically sent at a much higher rate than any signaling keep-alive messages could be sent. In addition, the signaling path is not needed during the conversation and may well be completely different (due to proxy and redirect servers) than the media path, so that keep-alives have a limited functionality. If it is desired to test the liveness of a signaling server, it is always possible to send either OPTIONS or (re)INVITE messages.

However, knowing the call state might be useful for certain applications (e.g., when billing is involved, when firewall permissions need to be set, etc.). A session timer extension has been defined to solve this. The draft can be found at http://www.cs.columbia.edu/~hgs/sip/drafts/draft-ietf-sip-session-timer-01.txt and it basically allows servers to indicate a desired refresh interval. The call is considered terminated if a re-INVITE is not received within that interval.
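
The refresh rule can be sketched as a small timer object. This only illustrates the idea of "no re-INVITE within the interval means the call is over"; it is not the mechanism defined in the draft. The SessionTimer class, its method names and the 1800-second interval are assumptions, and times are passed in explicitly as seconds to keep the example deterministic.

```python
class SessionTimer:
    """Hypothetical per-call timer: the call ends if no refresh arrives in time."""

    def __init__(self, refresh_interval, now):
        self.refresh_interval = refresh_interval  # seconds
        self.last_refresh = now

    def on_reinvite(self, now):
        """A re-INVITE refreshes the session."""
        self.last_refresh = now

    def is_terminated(self, now):
        """True once more than refresh_interval has passed without a refresh."""
        return now - self.last_refresh > self.refresh_interval


timer = SessionTimer(refresh_interval=1800, now=0)
timer.on_reinvite(now=1500)
print(timer.is_terminated(now=3000))  # 1500 s since the last re-INVITE
print(timer.is_terminated(now=3400))  # 1900 s since the last re-INVITE
```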

I want SIP to be more compact. What can I do?

First, one should realize that in general, SIP exchanges are only going to be a tiny fraction of the overall session bandwidth. A typical SIP call setup takes less than 1000 bytes, or the equivalent of one second of highly compressed (G.729) audio. Some additional space savings can be realized by using short headers. (A realistic example for an audio call setup takes a total of about 640 bytes, of which about 69 bytes are SIP headers.)

In general, more substantive savings are possible by using either payload compression (RFC 2393) or link-layer compression, e.g., at the PPP layer. For the example above, the total size is reduced to about 520 bytes with gzip compression.
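
To make the short-header saving concrete, here is a small sketch that rewrites long header names to their compact forms and counts the bytes saved. The sample message is made up for illustration, and the table lists only the compact forms I am sure of; header values and the body are left untouched.

```python
# Long header name (lower-case) -> compact form.
COMPACT_FORMS = {
    "via": "v",
    "from": "f",
    "to": "t",
    "call-id": "i",
    "contact": "m",
    "subject": "s",
    "content-type": "c",
    "content-length": "l",
    "content-encoding": "e",
}


def compact_headers(message):
    """Return the message with known header names replaced by compact forms."""
    head, separator, body = message.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    rewritten = [start_line]
    for line in header_lines:
        name, colon, value = line.partition(":")
        short = COMPACT_FORMS.get(name.strip().lower())
        # Leave folded continuation lines and unknown headers as they are.
        if colon and short and not line[:1].isspace():
            rewritten.append(f"{short}:{value}")
        else:
            rewritten.append(line)
    return "\r\n".join(rewritten) + separator + body


message = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP pc.example.org",
    "From: sip:alice@example.org",
    "To: sip:bob@example.com",
    "Call-ID: 1234@pc.example.org",
    "CSeq: 1 INVITE",
    "Content-Length: 0",
]) + "\r\n\r\n"

shorter = compact_headers(message)
print(len(message) - len(shorter), "bytes saved")
```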

What are the different addresses in SIP?

SIP INVITE requests involve three addresses:

1. The host address where the request came from. Responses are sent back to the same host address, regardless of what the From header indicates. Note that different requests for the same call can come from different hosts.

2. The From address contains the logical source of the request. It remains unmodified as a SIP request traverses proxies, for example. The From address may not be the same as the host address that generated the SIP request, although that's the typical case.

3. The session description (e.g., SDP) contains one or more addresses where the caller expects media data (audio, video) to be sent. For some services, this address may not be the same as the From address.

Can the request URI include a port number and/or transport parameter?

It can have a port number. But let me explain when this is needed and when it's not.
Let's say I send a request to joe@example.com, and the server for example.com is listening on 5061. The request URI might look like:
INVITE sip:joe@example.com:5061 SIP/2.0
This arrives at example.com. Since the request is for that server, it looks up joe in some database and translates the request URI (for example, to sip:joe@engineering.example.com). It looks up engineering.example.com in DNS, and finds an A record for that machine, forwarding the request to the given IP address. The outgoing request URI looks like:
INVITE sip:joe@engineering.example.com SIP/2.0
Note that in this case, the presence of port 5061 in the request URI sent to example.com didn't make a difference. That's because example.com just translated the request URI. Whether it had contained the port number or not would have had no effect on processing.
However, had the request been sent to a local outbound proxy instead of example.com, the port number would NEED to be there. That's because the local outbound proxy won't translate the request URI; it will examine it, determine it's not for itself, look up the domain in the request URI in DNS, and forward the request there. So, the request URI needs to contain this port so that the local outbound proxy knows to forward it to 5061 as opposed to 5060 at example.com.
So, the rule of thumb is this:
If you send a request to the server listed in the domain of the request URI, URI parameters like port, transport, ttl, etc. MAY be present but are not needed. If you send a request to a server which is NOT the one identified in the request URI, you MUST include these parameters if they are not the defaults. Always including them, when not default, means you don't need to determine which is the case, and is always the safest bet.
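
The "always include them when not default" rule can be sketched as a small URI builder. The function name is hypothetical, and the defaults it assumes (port 5060, UDP transport) are my assumption for the example; only the port and transport parameters are handled.

```python
def build_request_uri(user, host, port=5060, transport="udp"):
    """Build a SIP request URI, adding port/transport only when non-default."""
    uri = f"sip:{user}@{host}"
    if port != 5060:
        uri += f":{port}"
    if transport.lower() != "udp":
        uri += f";transport={transport.lower()}"
    return uri


# Safe to send via a local outbound proxy: the non-default port is explicit.
print(build_request_uri("joe", "example.com", port=5061))
print(build_request_uri("joe", "engineering.example.com"))
```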

How long can SIP host names be?

DNS (RFC 1035, Section 3.1) limits labels (each component of a host name) to 63 characters. The total length of a domain name (i.e., label octets and label length octets) is restricted to 255 octets or less. http://www.networksolutions.com/help/long-domains.html, however, claims that host names can be up to 80 characters long.

Note, however, that SIP implementations MUST be prepared to handle host names of any length, subject to any maximum message size restrictions that are part of local policy.
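
For reference, the two DNS limits quoted above can be checked as in the sketch below. It only answers "could this name exist in DNS?" and is not a length limit that a SIP parser should enforce. The function name is an assumption, ASCII host names are assumed, and the total counts one length octet per label plus the terminating root label.

```python
def fits_dns_limits(host):
    """Check the RFC 1035 size limits: labels <= 63, whole name <= 255 octets."""
    if host.endswith("."):
        host = host[:-1]  # ignore a single trailing root dot
    labels = host.split(".")
    if any(not 1 <= len(label) <= 63 for label in labels):
        return False
    # Wire format: one length octet per label, plus the zero-length root label.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    return wire_length <= 255


print(fits_dns_limits("engineering.example.com"))
print(fits_dns_limits("a" * 64 + ".example.com"))
```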

Can I remove an m= line from SDP in response or re-INVITE?

No. Once an "m=" line has made it into the SDP of a request or response, it cannot be removed until the call is terminated. The only way to decline a media session is by setting its port number to 0. The only way to offer a new media session is by adding it to the end of the list.

The reason for this is that we need to ensure that it is always possible to match media sessions (i.e., "m=" lines) in requests and responses. Consider an INVITE with the following SDP:

...

c=IN IP4 1.2.3.4

m=audio 54678 RTP/AVP 0 1 3

m=video 7346 RTP/AVP 28 31 (face)

m=video 7880 RTP/AVP 26 28 (presentation)

If the response contained something like

...

c=IN IP4 3.4.5.6

m=audio 6540 RTP/AVP 0 1

m=video 6578 RTP/AVP 28

the caller would not be able to tell which of the two offered video streams was accepted.
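
A response that keeps every "m=" line in place, and declines a stream by setting its port to 0, avoids the ambiguity. The sketch below builds such a response from the offer in the example. The function name and the shape of the accepted argument are assumptions for illustration, and a declined line simply echoes the offered formats.

```python
def answer_media_lines(offer_lines, accepted):
    """Build one answer "m=" line per offered line, in the same order.

    offer_lines: list of offered "m=" lines.
    accepted: dict mapping the index of an offered line to (port, formats).
    Lines whose index is not in accepted are declined with port 0.
    """
    answer = []
    for index, line in enumerate(offer_lines):
        media, _offered_port, proto, *formats = line[2:].split()
        if index in accepted:
            port, formats = accepted[index]
        else:
            port = 0  # declined, but the line stays so positions still match
        answer.append(f"m={media} {port} {proto} {' '.join(formats)}")
    return answer


offer = [
    "m=audio 54678 RTP/AVP 0 1 3",
    "m=video 7346 RTP/AVP 28 31",
    "m=video 7880 RTP/AVP 26 28",
]
# Accept the audio stream and the first video stream only.
for line in answer_media_lines(offer, {0: (6540, ["0", "1"]), 1: (6578, ["28"])}):
    print(line)
```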

I'm a UAC. I sent an INVITE, and then decide I want to hang up before getting a final response. Do I send BYE or CANCEL?

If the caller wants to hang up a call, but hasn't yet received a final response, it can send a CANCEL or a BYE. Sending a BYE would seem easiest, but there are issues. First off, you won't have gotten a tag yet from the UAS, nor will you have received Record-Route or Contact headers (obtained in the 200 OK response). This means the BYE will be routed "afresh" by proxies. It's possible that routing logic may have changed (perhaps there was some time-of-day routing or randomized routing), in which case the BYE may reach a different set of participants than reached by the original INVITE. So, if the original INVITE forked, and reached A and B, and the BYE reaches B and C, B will send a 200 OK, and C a 481. The forking proxy forwards the 200 OK upstream, and the caller gets the 200 OK. However, A is still ringing, and might later send a 200 OK. This yields inconsistent call state, which will persist until the UAS times out, as it will never get an ACK.
Sending CANCEL helps solve many of these problems. CANCEL will reach the same set of recipients as the original INVITE, and it doesn't need a Record-Route or tag in it. The drawback, however, is that the CANCEL and a 200 OK from one of the UASs might pass on the wire. Thus, the UAC may still need to ACK the 200 OK, and then send BYE. The other drawback is that you wouldn't send CANCEL if the call was already established; you'd send BYE. Folks complained they didn't want to have state-dependent mechanisms for hanging up. Given the unlikelihood of the problems with sending BYE, it seems reasonable to allow it.

(Reader's note: without a Record-Route or tag, might the BYE sent by the UAC fail to reach all of the recipients?)

I'm a proxy, and I forked a request, and forwarded multiple 200 OK upstream. Now, I get an ACK. What do I do with it?

Normally, the proxy routes it using the Route headers, which should be present in the ACK. In the bis draft, the final 200 OK response MUST contain a Contact header. This means that either (1) the proxy record-routes, in which case the ACKs will each contain (different) Route headers which tell the proxy where to send the request, or (2) the proxy doesn't record-route, in which case the ACK gets sent directly to the UAS, since there was a Contact.
That aside, should it arrive anyway, the ACK should be routed just as any other new request. Apply routing logic, which presumably causes it to be forked to both locations. The tags will help identify for which UAS the ACK is meant.

Does a UAS use the request-URI or To field to determine if a call is for it?

It uses the request URI. A UAS should be prepared to receive calls with the request URI set to values that it has registered (and placed in the Contact header of REGISTER). It should also be prepared to receive calls with the request URI set to the value it placed in the To field of the REGISTER. It's not likely to see such a request URI, unless it's receiving a direct client-to-client call.

Is it possible to use Hide with Record-Route?

No, only Via can be hidden. Hiding a Record-Route header in the same manner is impossible because it would need to be decrypted by the upstream proxy for subsequent requests from the callee to the caller; however, the secret key would only be known to the server that encrypted the header.

Can a proxy fork a non-INVITE request? If yes, what happens if it gets multiple responses?

Yes, a proxy can fork a non-INVITE request. However, it must forward only a single response upstream, 200 or otherwise. Thus, only a single 200 is ever forwarded upstream. This is in contrast to INVITE, where all 200's received are forwarded upstream. Why is that? The reliability mechanism of non-INVITE requests dictates that. Response retransmissions are triggered on request retransmissions. Thus, the client retransmits its request until it gets *a* response. So, upon receiving the first final response, the client would cease retransmitting the request, and then there would be no way to reliably send the other final responses.
As a result of this, forking of non-INVITE requests is only useful when the method has semantics that meet certain criteria. Specifically, (1) the client doesn't care which server gets the request, (2) the client doesn't care which server sent the response, or even if multiple servers sent a response, (3) the service provided by each server is identical. In essence, forking of non-INVITE requests is useful only for an anycast type of service.

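(Reader's note: if an INVITE sent by A is forked, all 200 responses to the INVITE are forwarded to A; for a non-INVITE request, only the first final response is forwarded to A.)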

When is a CANCEL used?

- A proxy has forked an INVITE request and receives a 200 or 600 response on one of the branches; the proxy then CANCELs the unanswered branches;
- The time given in the Expires header of the request has elapsed;
- No response, not even a provisional one, was ever received from downstream nodes;
- Internal logic determines it's time to end the transaction (a CPL or sip-cgi script, for example).

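(Reader's note: SIP allows different encodings to be used on the same media stream. This explains how secondary dialing works during a voice call: the stream switches from G.723 to RFC 2833.)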

What are spirals? Why does a proxy care?

Spirals are defined in bis-03. They are requests that loop back to the same proxy, but for which the request URI has changed.
Classic example:
joe@example.com calls bob@bigcompany.com. The call goes to the bigcompany.com proxy, which proxies it to bob@marketing.bigcompany.com. The marketing.bigcompany.com proxy invokes a CPL. The CPL has bob forwarding all his calls to jane@bigcompany.com. This request is then proxied to the bigcompany.com proxy. Now, this is a "loop", in the sense that the request has hit the same server, but it's a valid one. It's valid because the request URI the second time around (sip:jane@bigcompany.com) is not the same as the first time around (sip:bob@bigcompany.com). So, the proxy should accept this and process it. This case is called a "spiral". It's called that since you can think of a proxy network in two dimensions; one dimension is the set of elements, and the other is the R-URI. The request returns to the same point on the first axis, but a different point on the second. Much like a spiral in 3D space, which returns to the same point in the X,Y plane, but a different one on the Z axis.

Now, how does the proxy know this was a spiral, and not a loop? Using the branch-ID. The branch-ID is supposed to contain a hash of the R-URI. So, when the request arrives again at the proxy, it finds its previous Via entry (because of the host name), and it matches. Then, it computes the hash of the R-URI in the incoming request, and compares it to the hash in the branch ID. If they are not the same, it's a spiral. If they're the same, it's a loop.
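
The comparison can be sketched as follows. The branch layout used here (a unique part, a dot, then a truncated SHA-1 of the R-URI) is an assumption made up for the example, not a format taken from the draft; the point is only that the hash stored in the proxy's own Via entry is compared with the hash of the incoming R-URI.

```python
import hashlib


def ruri_hash(request_uri):
    """Short hash of the request URI (truncated SHA-1, chosen for this sketch)."""
    return hashlib.sha1(request_uri.encode("utf-8")).hexdigest()[:16]


def make_branch(unique_part, request_uri):
    """Build a branch ID of the hypothetical form '<unique>.<hash of R-URI>'."""
    return f"{unique_part}.{ruri_hash(request_uri)}"


def classify_revisit(own_via_branch, incoming_request_uri):
    """Called when the proxy finds its own Via entry in an incoming request."""
    _, _, stored_hash = own_via_branch.rpartition(".")
    if stored_hash == ruri_hash(incoming_request_uri):
        return "loop"    # same R-URI as the first time around
    return "spiral"      # same proxy, but the R-URI has changed


branch = make_branch("a1", "sip:bob@bigcompany.com")
print(classify_revisit(branch, "sip:jane@bigcompany.com"))
print(classify_revisit(branch, "sip:bob@bigcompany.com"))
```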

What's the difference between loose and strict source routing?

Both loose and strict routing use the Route header.
Strict routing (with more than zero intermediaries) attempts to carry information about the request target and the next hop to be reached in the Request-URI.
Loose routing leaves the request target in the Request-URI and the next hop in the Route header. Loose routing is identified by a ;lr parameter in the Route URI.
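
A rough sketch of how a user agent might fill in the Request-URI and Route headers for the two cases, following my reading of the request-generation rules in RFC 3261. The function names are hypothetical and URIs are handled as plain strings; with a strict first hop, the remote target is appended as the last Route entry so that it is not lost.

```python
def has_lr(route_uri):
    """True if the Route URI carries the lr parameter (loose routing)."""
    params = route_uri.strip("<>").split(";")[1:]
    return any(p.split("=")[0].strip().lower() == "lr" for p in params)


def build_routing(route_set, remote_target):
    """Return (request_uri, route_headers) for a request within a dialog."""
    if not route_set:
        return remote_target, []
    if has_lr(route_set[0]):
        # Loose routing: the target stays in the Request-URI.
        return remote_target, list(route_set)
    # Strict routing: the next hop goes into the Request-URI and the
    # target is carried as the last Route entry.
    return route_set[0], list(route_set[1:]) + [remote_target]


target = "sip:bob@pc.example.com"
print(build_routing(["sip:p1.example.com;lr", "sip:p2.example.com;lr"], target))
print(build_routing(["sip:p1.example.com", "sip:p2.example.com"], target))
```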

What is the relationship between MGCP and SIP?

The details of combining the two in a system are still being fleshed out. MGCP is a device control protocol, where a slave (gateway (MG)) is controlled by a master (media gateway controller (MGC), call agent). SIP may be used between controllers, in a peer-to-peer relationship. Note that to the SIP side, the MGC looks like a node with a large number of connections, but otherwise the same as a "native" SIP device. Similarly, the MG is completely unaware that the call between MGCs is established via SIP. Only the MGC needs to understand both protocols.

Additional details provided by Tom Taylor:

[Figure: sip_h323_mgcp.gif]

The basic architecture assumed by the Megaco Working Group postulates two functional entities: a Media Gateway Controller (MGC), which owns the call model and is responsible for call signalling, and a Media Gateway (MG), responsible for manipulating (directing, transforming) media flows under the control of the MGC.

MGCP and Megaco/H.GCP are both protocols used between the MGC and MG when they are realized in separate physical elements. MGCP (Media Gateway Control Protocol) was a major source of the ideas in the current Megaco/H.GCP protocol draft, and is being deployed in a number of products being announced over the next few months. It is best suited for IP telephony gateway applications. The Megaco protocol is also called H.GCP because it is being developed cooperatively between the Megaco WG and ITU-T Study Group 16.

H.323 is a complete system specification, including call signalling protocols which would run between an MGC and another MGC or other H.323 entities (Gatekeepers, endpoints). SIP can also be used as a call signalling protocol, and can therefore be viewed as a competitor to H.323. Both protocols are capable of supporting multipoint multimedia conferences. H.323 was first standardized in 1996 and has been improved since then; current standardization is focusing on networking aspects such as translations data exchange and interworking with legacy telephony signalling. SIP just reached Proposed Standard status, but has attracted wide interest which may speed its maturing stages. The Megaco/H.GCP protocol will complement both protocols by also providing support for multipoint, multimedia calls at the media level.

What is SIP+ and how does it relate to SIP?

SIP+ was a proposal by Level3 on how to extend SIP to interconnect two MGCs. This functionality is now being provided by various orthogonal SIP extensions, including the carriage of multipart MIME types, the INFO method and others. These are being documented in a BCP draft. The name SIP+ is obsolete and should not be used to avoid confusion.

Where do I find description of SDP?

The SDP (Session Description Protocol) specification can be found in RFC 2327. SIP uses SDP to describe the media capabilities of call participants and to negotiate the common media set for a call. Appendix B of RFC 2543 describes the usage of SDP in SIP messages.

What is sip-cgi and how does it relate to CPL?

Both are viewed as different approaches for creating VoIP services. Both are written offline, and both are executed when messages arrive in order to execute features.

CPL is an XML-based language, while sip-cgi is a mechanism for invoking scripts or programs written in any language. sip-cgi is very similar to web cgi scripts.

In its current version, CPL is only invoked when INVITE requests and responses arrive, while sip-cgi can intercept any request.

sip-cgi is designed to be used by SIP, while CPL can probably be used by a number of signaling protocols such as Q.931 or H.323.

CPL and sip-cgi differ in their applicability. CPL is designed for end user service creation. It is intentionally limited in capabilities and is not a general purpose programming language. Its execution on a server is generally very fast. CGI is more powerful: you can do nearly anything. It is programming language independent. It incurs a process-spawning overhead, so it's less efficient than CPL. (CPL is usually executed in the same process as the server.) As a service provider, I would not want to execute CGI scripts sent to me by end users. However, I would prefer to use CGI to develop my own services.

Note that CGI may be used as the execution environment for a CPL script. (Jonathan Rosenberg)

How does SIP carry DTMF (touch tones)?

First, in most cases it is not clear that SIP is the right mechanism for this, since DTMF detection is being done in devices that generate RTP, not SIP.
RTP can be used to carry DTMF, as described in RFC 2833. RFC 2833 uses "forward error correction", retransmitting DTMF digits periodically. Thus, unless there are extremely long bursts of packet errors, digits are transmitted reliably. Retransmission by SIP, either at the application layer or via TCP, is based on exponential back-off, with delays of a few seconds after several consecutive losses. If a human generates the touch tone commands, it is possible that such long retransmission delays will cause the user to press the button again, resulting in duplicate digits.
DTMF over RTP is also required to synchronize audio and touch tones at VoIP-to-PSTN gateways.
Gateways that are only interested in detecting tones do not need to buffer audio and can simply forward the audio packets while doing playout buffering and DTMF detection locally.
A number of proposals exist for carrying DTMF in SIP INFO messages, but the working group has not decided which of the approaches, if any, to pursue.

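Three very important concepts in SIP are the dialog, the session and the transaction.

Below are my notes on these three concepts from my own study, posted here for discussion.

A dialog starts to be established when a response from the UAS (carrying a To tag) is received. A dialog established on receipt of a 180 response is called an early dialog; the dialog proper is established only once a 2xx response is received.

A session is established only after the media exchange has taken place, that is, once the media descriptions in SDP have been exchanged by offer/answer. The session can be established by INVITE-200 or by 200-ACK, depending on when the media exchange happens. Concretely, the body of the INVITE describes in SDP the media types the caller can handle, and the 200 OK carries back the media types the UAS can handle. At that point the media exchange is complete, and the session is established.

A dialog is an end-point to end-point relationship, whereas a transaction is a hop-by-hop relationship. A dialog is identified by the From tag, the To tag (more precisely the local tag and the remote tag, which differ depending on whether you are the UAC or the UAS) and the Call-ID. A transaction is the request/response relationship between one SIP entity and the next SIP entity (stateless proxies excluded), and is identified by the branch parameter in the Via header.

Transaction: maintains hop-to-hop state. It consists of one request and all the responses it triggers, i.e., any number of provisional responses and one final response. Its lifetime runs from the creation of the request to the receipt of the final response.
Dialog: maintains peer-to-peer state. At present only INVITE and SUBSCRIBE requests create a dialog. Its lifetime spans the whole end-to-end session.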
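
As a small illustration of the local tag / remote tag point, the sketch below derives a dialog identifier from the Call-ID and the two tags. The DialogId type and the function name are assumptions for the example; the only thing it shows is that the UAC and the UAS see the same three values with the tags swapped.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id(call_id, from_tag, to_tag, role):
    """Build the dialog ID as seen by the UAC or the UAS of the request."""
    if role == "uac":
        # The UAC wrote the From tag, so that one is local.
        return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)
    if role == "uas":
        # The UAS added the To tag, so that one is local.
        return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)
    raise ValueError("role must be 'uac' or 'uas'")


caller_view = dialog_id("1234@pc.example.org", "tag-a", "tag-b", role="uac")
callee_view = dialog_id("1234@pc.example.org", "tag-a", "tag-b", role="uas")
print(caller_view)
print(callee_view)
```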