SIP vs Skype – the battle for VoIP


It’s not a question of whether IP telephony will overtake traditional calls, but when. VoIP telephony is now cheaper, its sound quality matches or even exceeds that of ordinary telephony, and a VoIP user ID is as easy to use as a standard phone number. SIP and Skype are the two technologies fighting the battle for VoIP supremacy. They achieve the same goal: to route multimedia conversations over IP-based networks. But their approaches are radically different.


Proprietary or Public

SIP is a signaling standard created by the Internet Engineering Task Force (IETF). It handles user location, mobility and authentication, call handling, forwarding and negotiation. All non-signaling services such as session description, media transport and network traversal are handled by other IETF protocols such as SDP, RTP and STUN. VoIP providers implement SIP and its associated protocols to create VoIP systems and client applications.

Skype handles all aspects of VoIP communication in one neat package. You install the client software and run it. Skype is a proprietary system; we don’t know exactly how it works. The evolution of SIP is in the public domain. Skype alone controls Skype’s evolution.



Logging in

Skype and SIP users both log in to a central server, called the login server and registration server respectively. Both servers store usernames, passwords and buddy lists. The SIP registration server also stores the IP address and port used to connect to each SIP client. The Skype login server validates that there is only one user with a given username logged in to the network at any given time. SIP registration servers allow multiple user agents to connect to the registration server using the same username.
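The difference between the two login models can be illustrated with a minimal sketch. The class and method names below are hypothetical and do not come from any SIP library or from Skype; the sketch only shows a registrar that keeps several contact bindings per username next to a login server that accepts a single session per username.

```python
from collections import defaultdict


class SipRegistrar:
    """Toy SIP-style registrar: one username may have several contact bindings."""

    def __init__(self):
        # username -> set of (ip, port) contacts currently registered
        self._bindings = defaultdict(set)

    def register(self, username, ip, port):
        self._bindings[username].add((ip, port))

    def lookup(self, username):
        # Return every contact registered for this username, in a stable order.
        return sorted(self._bindings.get(username, set()))


class SingleLoginServer:
    """Toy Skype-style login server: at most one active session per username."""

    def __init__(self):
        # username -> (ip, port) of the single active session
        self._active = {}

    def login(self, username, ip, port):
        if username in self._active:
            return False  # a session with this username already exists
        self._active[username] = (ip, port)
        return True

    def logout(self, username):
        self._active.pop(username, None)
```

With this sketch, registering "alice" from two addresses leaves two contacts in the registrar, whereas a second login for the same name on the single-login server is refused until the first session logs out.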

Many people, non-specialist users in particular, find it hard to configure the mainstream SIP phones on the market. Realizing this, RTX has developed a plug & play solution that, via an extremely user-friendly MMI, makes configuration straightforward for non-specialist and specialist users alike.

Skype has one login server that supports all registered Skype users worldwide, whereas each SIP VoIP provider runs a registration server for its own VoIP subscribers. Because SIP is an open standard, SIP users can communicate with other SIP users who subscribe to a different VoIP provider. An AT&T user can, for instance, connect to a Redband AB user seamlessly. Skype users can communicate only with other members of the Skype community.



SIP user agents connect to the registration server to retrieve the information necessary to contact other SIP users. Call control is then conducted between the user agents themselves. Skype uses a peer-to-peer overlay network.


SIP Networks

The SIP network is made up of user agents and the registration server. To call a buddy, the client must first contact the registration server to retrieve the IP address and port of the client to be contacted.

An INVITE message is then sent directly from client to client. In most cases, media is also sent directly from client to client. Echo cancellation, Voice Activity Detection (VAD) and Comfort Noise Generation (CNG) are negotiated and performed by each client. In some cases, a relay server performs VAD and CNG instead of the user agents, but this practice remains relatively rare. The downside of this system is that the larger the network, the more work the registration server must do to keep track of all active client nodes.
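As a rough illustration of the call-setup request described above (in SIP this is the INVITE method), the following sketch builds a minimal request as text. The function name, usernames and addresses are hypothetical, and the message carries no SDP body; a real user agent would normally attach a session description and handle responses and retransmissions.

```python
import uuid


def build_invite(caller, callee, callee_ip, callee_port, local_ip, local_port):
    """Return a minimal SIP INVITE request as a string (illustrative only)."""
    # Branch parameters in SIP/2.0 start with the fixed prefix "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    lines = [
        f"INVITE sip:{callee}@{callee_ip}:{callee_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}@{local_ip}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{callee}@{callee_ip}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_ip}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}@{local_ip}:{local_port}>",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"
```

The callee address and port passed in here stand for the contact details the caller retrieved from the registration server in the previous step.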

Moreover, some Internet Telephony Service Providers (ITSPs) do not support emergency calls. By developing the dualphone, RTX has solved this problem. With the dualphone, calls can always be directed via the PSTN line.


The Skype Network

The Skype network is made up of two types of objects, nodes and supernodes. A node is a standard Skype client. Any node that has enough bandwidth and CPU power can become a supernode; Skype users have no control over whether the client application they are running is a node or a supernode. For a node to log in to the Skype network, it must have already connected to at least one supernode. Once logged in, the Skype node builds a list of up to 200 supernodes it can connect to. This list is called the Host Cache and is refreshed regularly to keep track of network activity. The node/supernode system is the basis of the Skype P2P3G or Global Index technology. The Global Index ensures a successful search for any Skype user who has logged on to the system during the last 72 hours. The Host Cache is also used to route calls via the nodes that will give the best service quality.

Another advantage of this peer-to-peer overlay network is scalability; the number of supernodes grows as the number of nodes increases, and the network handles growth automatically.
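Skype's implementation is proprietary, so the following is only a sketch of how a bounded list such as the Host Cache could be kept up to date. The class name, the eviction policy (drop the entry that has gone longest without a refresh) and the timestamps are all assumptions made for illustration; only the limit of 200 entries comes from the description above.

```python
from collections import OrderedDict

MAX_ENTRIES = 200  # upper bound on supernodes taken from the text above


class HostCache:
    """Illustrative bounded cache of supernode addresses, not Skype's real design."""

    def __init__(self, max_entries=MAX_ENTRIES):
        self._max = max_entries
        # (ip, port) -> last time the supernode was seen; oldest refresh first
        self._entries = OrderedDict()

    def refresh(self, ip, port, now):
        key = (ip, port)
        self._entries[key] = now
        self._entries.move_to_end(key)  # mark as most recently refreshed
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict the stalest entry

    def candidates(self):
        # Most recently refreshed supernodes first.
        return list(reversed(self._entries))
```

A node using such a structure would call `refresh` whenever it hears from a supernode and would try the addresses returned by `candidates` in order when it next needs to connect.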



VoIP clients encode raw sound captured from the microphone into a digital format and send it to other clients via the IP network. The encoded sound is decoded and played on the device speaker.

With SIP, media transport is controlled using the RTP stack. The codec type to be used is determined at the start of the call; the codec can be changed as the call progresses. The standard SIP codec is G.711, although G.729 is most commonly used for enterprise networks.
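To make the codec choice concrete, here is a small sketch that picks the first codec both sides support and estimates the resulting IP bandwidth in one direction. The function names are hypothetical. The codec bit rates (64 kbit/s for G.711, 8 kbit/s for G.729) are the nominal rates of those codecs; the 20 ms packet interval and the 40 bytes of IPv4, UDP and RTP headers per packet are assumptions, and link-layer overhead is ignored.

```python
# Nominal codec bit rates in bits per second.
CODEC_BITRATE_BPS = {"G.711": 64000, "G.729": 8000}

# Assumed per-packet overhead: IPv4 (20) + UDP (8) + RTP (12) header bytes.
HEADER_BYTES = 20 + 8 + 12


def negotiate(offered, supported):
    """Return the first offered codec the other side supports, or None."""
    for codec in offered:
        if codec in supported:
            return codec
    return None


def ip_bandwidth_bps(codec, packet_ms=20):
    """Estimate one-way IP bandwidth for a codec at a given packet interval."""
    payload_bytes = CODEC_BITRATE_BPS[codec] * packet_ms // 8000
    packets_per_second = 1000 // packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second
```

Under these assumptions G.711 needs about 80 kbit/s on the wire and G.729 about 24 kbit/s, which shows how much of a low-bit-rate stream is header overhead.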

Skype uses the iLBC codec for low bandwidth calls and iSAC for high bandwidth. Where G.723 was designed for traditional telephone networks, iLBC and iSAC are the first codecs designed specifically for the Internet. Both SIP and Skype offer sound quality equal to that of a standard phone call; if wideband codecs are used, the sound quality may even be better than that of a call made over the traditional telephone network. A drawback for SIP is that, although it offers good sound quality, Quality of Service (QoS) guarantees will remain limited until Internet Protocol version 6 is in general use.


Traversing Network Barriers

VoIP packets must be delivered directly to VoIP client applications. To do this, the software must be able to cope with IP addresses altered by Network Address Translation (NAT) systems and pass through firewalls. SIP uses STUN to translate the address information inside IP packets for NAT and firewall traversal. STUN is a client-server protocol; user agents send packets to a STUN server located with the registration server. The STUN server examines these packets to determine which form of NAT the user agent is behind and, thus, the correct IP address to be used to connect to that user agent. STUN is not foolproof; there are problems, notably with symmetric NATs and with firewalls that block UDP sockets. Skype passes through most firewalls and NATs seamlessly, although there is some risk of interference with other systems running on your PC.
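The STUN exchange can be sketched at the byte level. The example below follows the message layout of the later STUN specification, RFC 5389, which may differ from the STUN version a given SIP client uses; the function names are hypothetical, and no packet is actually sent. It builds a Binding Request and decodes the XOR-MAPPED-ADDRESS attribute in which a server reports the public address it saw.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442     # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001      # STUN message type for a Binding Request
XOR_MAPPED_ADDRESS = 0x0020   # attribute type that carries the reflexive address


def build_binding_request(transaction_id=None):
    """Build a 20-byte STUN Binding Request header with no attributes."""
    tid = transaction_id if transaction_id is not None else os.urandom(12)
    if len(tid) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Type (2 bytes), body length (2 bytes, zero here), magic cookie (4 bytes).
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def decode_xor_mapped_ipv4(value):
    """Decode the value of an IPv4 XOR-MAPPED-ADDRESS attribute."""
    _reserved, family, xport, xaddr = struct.unpack("!BBHI", value[:8])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    # The port is XORed with the top 16 bits of the cookie, the address with all 32.
    port = xport ^ (MAGIC_COOKIE >> 16)
    addr = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
    return addr, port
```

A client would send the request over UDP to a STUN server and compare the decoded address with its local one; if they differ, it is behind a NAT and should advertise the decoded address to its peers.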



SIP is an open protocol. Messages are sent in clear text through the Internet. By contrast, every packet sent by Skype is encrypted. All communication via Skype, be it voice, text or protocol, is secure.


Where next?

SIP and Skype are two different approaches to VoIP communication. SIP has the strength of flexibility; Skype is easy to install and use and offers transparent network access; both offer excellent sound quality. The question is whether multiple operators implementing the SIP open standard can win clients from Skype, the current market leader.