Jitsi


 http://www.aosabook.org/en/jitsi.html

Jitsi is an application that allows people to make video and voice calls, share their desktops, and exchange files and messages. More importantly, it allows people to do this over a number of different protocols, ranging from the standardized XMPP (Extensible Messaging and Presence Protocol) and SIP (Session Initiation Protocol) to proprietary ones like Yahoo! and Windows Live Messenger (MSN). It runs on Microsoft Windows, Apple Mac OS X, Linux, and FreeBSD. It is written mostly in Java but it also contains parts written in native code. In this chapter, we'll look at Jitsi's OSGi-based architecture, see how it implements and manages protocols, and look back on what we've learned from building it.[1]

10.1. Designing Jitsi

The three most important constraints that we had to keep in mind when designing Jitsi (at the time called SIP Communicator) were multi-protocol support, cross-platform operation, and developer-friendliness.

From a developer's perspective, being multi-protocol comes down to having a common interface for all protocols. In other words, when a user sends a message, our graphical user interface needs to always call the same sendMessage method regardless of whether the currently selected protocol actually uses a method called sendXmppMessage or sendSipMsg.
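A minimal sketch of this idea follows. Only the method names sendMessage, sendXmppMessage and sendSipMsg come from the text; the interface, the classes and the in-memory "wire" lists are hypothetical and are not taken from the Jitsi source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common interface: the only thing the GUI knows about.
interface MessagingProtocol {
    void sendMessage(String to, String text);
}

class XmppMessaging implements MessagingProtocol {
    // Stands in for the network; records what would have been sent.
    final List<String> wire = new ArrayList<>();

    // Protocol-specific call that the GUI never sees.
    private void sendXmppMessage(String jid, String body) {
        wire.add("[xmpp] " + jid + ": " + body);
    }

    @Override
    public void sendMessage(String to, String text) {
        sendXmppMessage(to, text);
    }
}

class SipMessaging implements MessagingProtocol {
    final List<String> wire = new ArrayList<>();

    private void sendSipMsg(String uri, String body) {
        wire.add("[sip] " + uri + ": " + body);
    }

    @Override
    public void sendMessage(String to, String text) {
        sendSipMsg(to, text);
    }
}

class ChatWindow {
    // The GUI code is identical whichever protocol is selected.
    static void send(MessagingProtocol selected, String to, String text) {
        selected.sendMessage(to, text);
    }
}
```

The point of the design is that ChatWindow compiles against MessagingProtocol alone, so adding a new protocol never touches GUI code.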

The fact that most of our code is written in Java satisfies, to a large degree, our second constraint: cross-platform operation. Still, there are things that the Java Runtime Environment (JRE) does not support or does not do the way we'd like it to, such as capturing video from your webcam. Therefore, we need to use DirectShow on Windows, QTKit on Mac OS X, and Video for Linux 2 on Linux. Just as with protocols, the parts of the code that control video calls cannot be bothered with these details (they are complicated enough as it is).
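One way to picture this is a single lookup that hides the platform choice from the call-control code. The sketch below is an assumption about how such a selection could be written, not how Jitsi does it; the class and method names are hypothetical, and only the three backend names come from the text.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper that maps an "os.name" value to the capture backend
// named in the text. Unknown systems yield an empty Optional.
class CaptureBackends {
    static Optional<String> forOs(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return Optional.of("DirectShow");
        }
        if (os.startsWith("mac")) {
            return Optional.of("QTKit");
        }
        if (os.startsWith("linux")) {
            return Optional.of("Video for Linux 2");
        }
        return Optional.empty();
    }
}
```

A caller would typically pass System.getProperty("os.name") and never mention a specific backend itself.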

Finally, being developer-friendly means that it should be easy for people to add new features. There are millions of people using VoIP today in thousands of different ways; various service providers and server vendors come up with different use cases and ideas about new features. We have to make sure that it is easy for them to use Jitsi the way they want. Someone who needs to add something new should have to read and understand only those parts of the project they are modifying or extending. Similarly, one person's changes should have as little impact as possible on everyone else's work.

To sum up, we needed an environment where different parts of the code are relatively independent from each other. It had to be possible to easily replace some parts depending on the operating system; have others, like protocols, run in parallel and yet act the same; and it had to be possible to completely rewrite any one of those parts and have the rest of the code work without any changes. Finally, we wanted the ability to easily switch parts on and off, as well as the ability to download plugins over the Internet to our list.

We briefly considered writing our own framework, but soon dropped the idea. We were itching to start writing VoIP and IM code as soon as possible, and spending a couple of months on a plugin framework didn't seem that exciting. Someone suggested OSGi, and it seemed to be the perfect fit.

10.2. Jitsi and the OSGi Framework

People have written entire books about OSGi, so we're not going to go over everything the framework stands for. Instead we will only explain what it gives us and the way we use it in Jitsi.

Above everything else, OSGi is about modules. Features in OSGi applications are separated into bundles. An OSGi bundle is little more than a regular JAR file like the ones used to distribute Java libraries and applications. Jitsi is a collection of such bundles. There is one responsible for connecting to Windows Live Messenger, another one that does XMPP, yet another one that handles the GUI, and so on. All these bundles run together in an environment provided, in our case, by Apache Felix, an open source OSGi implementation.

All these modules need to work together. The GUI bundle needs to send messages via the protocol bundles, which in turn need to store them via the bundles handling message history. This is what OSGi services are for: they represent the part of a bundle that is visible to everyone else. An OSGi service is most often a group of Java interfaces that allow use of a specific functionality like logging, sending messages over the network, or retrieving the list of recent calls. The classes that actually implement the functionality are known as a service implementation. Most of them carry the name of the service interface they implement, with an "Impl" suffix at the end (e.g., ConfigurationServiceImpl). The OSGi framework allows developers to hide service implementations and make sure that they are never visible outside the bundle they are in. This way, other bundles can only use them through the service interfaces.

Most bundles also have activators. Activators are simple interfaces that define a start and a stop method. Every time Felix loads or removes a bundle in Jitsi, it calls these methods so that the bundle can prepare to run or shut down. When calling these methods Felix passes them a parameter called BundleContext. The BundleContext gives bundles a way to connect to the OSGi environment. This way they can discover whatever OSGi service they need to use, or register one themselves (Figure 10.1).

[OSGi Bundle Activation]

Figure 10.1: OSGi Bundle Activation

So let's see how this actually works. Imagine a service that persistently stores and retrieves properties. In Jitsi this is what we call the ConfigurationService and it looks like this:

package net.java.sip.communicator.service.configuration;

public interface ConfigurationService
{
  public void setProperty(String propertyName, Object property);
  public Object getProperty(String propertyName);
}

A very simple implementation of the ConfigurationService looks like this:

package net.java.sip.communicator.impl.configuration;

import java.util.*;
import net.java.sip.communicator.service.configuration.*;

public class ConfigurationServiceImpl implements ConfigurationService
{
  private final Properties properties = new Properties();

  public Object getProperty(String name)
  {
    return properties.get(name);
  }

  public void setProperty(String name, Object value)
  {
    properties.setProperty(name, value.toString());
  }
}

Notice how the service is defined in the net.java.sip.communicator.service package, while the implementation is in net.java.sip.communicator.impl. All services and implementations in Jitsi are separated under these two packages. OSGi allows bundles to only make some packages visible outside their own JAR, so the separation makes it easier for bundles to only export their service packages and keep their implementations hidden.

The last thing we need to do so that people can start using our implementation is to register it in the BundleContext and indicate that it provides an implementation of the ConfigurationService. Here's how this happens:

package net.java.sip.communicator.impl.configuration;

import org.osgi.framework.*;
import net.java.sip.communicator.service.configuration.*;

public class ConfigActivator implements BundleActivator
{
  public void start(BundleContext bc) throws Exception
  {
    bc.registerService(ConfigurationService.class.getName(), // service name
         new ConfigurationServiceImpl(), // service implementation
         null);
  }

  // BundleActivator also requires a stop method; there is nothing to
  // clean up in this simple example.
  public void stop(BundleContext bc) throws Exception
  {
  }
}

Once the ConfigurationServiceImpl class is registered in the BundleContext, other bundles can start using it. Here's an example showing how some random bundle can use our configuration service:

package net.java.sip.communicator.plugin.randombundle;

import org.osgi.framework.*;
import net.java.sip.communicator.service.configuration.*;

public class RandomBundleActivator implements BundleActivator
{
  public void start(BundleContext bc) throws Exception
  {
    ServiceReference cRef = bc.getServiceReference(
                              ConfigurationService.class.getName());
    ConfigurationService configService
        = (ConfigurationService) bc.getService(cRef);

    // And that's all! We have a reference to the service implementation
    // and we are ready to start saving properties:
    configService.setProperty("propertyName", "propertyValue");
  }

  // Required by BundleActivator; nothing to release in this example.
  public void stop(BundleContext bc) throws Exception
  {
  }
}

Once again, notice the package. In net.java.sip.communicator.plugin we keep bundles that use services defined by others but that neither export nor implement any themselves. Configuration forms are a good example of such plugins: they are additions to the Jitsi user interface that allow users to configure certain aspects of the application. When users change preferences, configuration forms interact with the ConfigurationService or directly with the bundles responsible for a feature. However, none of the other bundles ever need to interact with them in any way (Figure 10.2).

[Service Structure]

Figure 10.2: Service Structure
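The register-then-look-up pattern from this section can be tried without an OSGi container. The sketch below is a toy stand-in, not the real BundleContext API: ToyRegistry and the Toy-prefixed types are hypothetical, and the registry is just a map keyed by interface name.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the OSGi service registry (NOT org.osgi.framework).
class ToyRegistry {
    private final Map<String, Object> services = new HashMap<>();

    // Mirrors the shape of registerService(name, implementation, ...).
    void registerService(String interfaceName, Object implementation) {
        services.put(interfaceName, implementation);
    }

    // Looks a service up by its interface; returns null if none is registered.
    <T> T getService(Class<T> serviceInterface) {
        return serviceInterface.cast(services.get(serviceInterface.getName()));
    }
}

// The "service" half: the only type that consumers ever name.
interface ToyConfigurationService {
    void setProperty(String name, Object value);
    Object getProperty(String name);
}

// The "impl" half: consumers never reference this class directly.
class ToyConfigurationServiceImpl implements ToyConfigurationService {
    private final Map<String, Object> properties = new HashMap<>();

    @Override
    public void setProperty(String name, Object value) {
        properties.put(name, value);
    }

    @Override
    public Object getProperty(String name) {
        return properties.get(name);
    }
}
```

A consumer asks the registry for ToyConfigurationService and receives whatever implementation was registered, which is the same decoupling that the service and impl packages give Jitsi.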

10.3. Building and Running a Bundle

Now that we've seen how to write the code in a bundle, it's time to talk about packaging. When running, all bundles need to indicate three different things to the OSGi environment: the Java packages they make available to others (i.e. exported packages), the ones that they would like to use from others (i.e. imported packages), and the name of their BundleActivator class. Bundles do this through the manifest of the JAR file that they will be deployed in.

For the ConfigurationService that we defined above, the manifest file could look like this:

Bundle-Activator: net.java.sip.communicator.impl.configuration.ConfigActivator
Bundle-Name: Configuration Service Implementation
Bundle-Description: A bundle that offers configuration utilities
Bundle-Vendor: jitsi.org
Bundle-Version: 0.0.1
System-Bundle: yes
Import-Package: org.osgi.framework
Export-Package: net.java.sip.communicator.service.configuration

After creating the JAR manifest, we are ready to create the bundle itself. In Jitsi we use Apache Ant to handle all build-related tasks. In order to add a bundle to the Jitsi build process, you need to edit the build.xml file in the root directory of the project. Bundle JARs are created at the bottom of the build.xml file, with bundle-xxx targets. In order to build our configuration service we need the following:

<target name="bundle-configuration">
  <jar destfile="${bundles.dest}/configuration.jar" manifest=
    "${src}/net/java/sip/communicator/impl/configuration/conf.manifest.mf" >

    <zipfileset dir="${dest}/net/java/sip/communicator/service/configuration"
        prefix="net/java/sip/communicator/service/configuration"/>
    <zipfileset dir="${dest}/net/java/sip/communicator/impl/configuration"
        prefix="net/java/sip/communicator/impl/configuration" />
  </jar>
</target>

As you can see, the Ant target simply creates a JAR file using our configuration manifest, and adds to it the configuration packages from the service and impl hierarchies. Now the only thing that we need to do is to make Felix load it.

We already mentioned that Jitsi is merely a collection of OSGi bundles. When a user executes the application, they actually start Felix with a list of bundles that it needs to load. You can find that list in our lib directory, inside a file called felix.client.run.properties. Felix starts bundles in the order defined by start levels: all those within a particular level are guaranteed to complete before bundles in subsequent levels start loading. Although you can't see this in the example code above, our configuration service stores properties in files so it needs to use our FileAccessService, shipped within the fileaccess.jar file. We'll therefore make sure that the ConfigurationService starts after the FileAccessService:

⋮    ⋮    ⋮
felix.auto.start.30= \
  reference:file:sc-bundles/fileaccess.jar

felix.auto.start.40= \
  reference:file:sc-bundles/configuration.jar \
  reference:file:sc-bundles/jmdnslib.jar \
  reference:file:sc-bundles/provdisc.jar \
⋮    ⋮    ⋮
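To make the ordering concrete, here is a small sketch that reads properties in the format shown above and lists bundle locations by ascending start level. It only illustrates the ordering rule; it is not how Felix itself processes the file, and the StartLevels class is hypothetical. It relies on java.util.Properties treating a trailing backslash as a line continuation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
class StartLevels {
    static final String PREFIX = "felix.auto.start.";

    // Returns bundle locations ordered by ascending start level.
    static List<String> startOrder(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // TreeMap keeps the numeric levels sorted.
        Map<Integer, List<String>> byLevel = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(PREFIX)) {
                continue;
            }
            int level = Integer.parseInt(key.substring(PREFIX.length()));
            List<String> bundles = new ArrayList<>();
            // Locations within one level are separated by whitespace.
            for (String location : props.getProperty(key).trim().split("\\s+")) {
                if (!location.isEmpty()) {
                    bundles.add(location);
                }
            }
            byLevel.put(level, bundles);
        }

        List<String> order = new ArrayList<>();
        for (List<String> bundles : byLevel.values()) {
            order.addAll(bundles);
        }
        return order;
    }
}
```

Fed the two levels above, the sketch lists fileaccess.jar (level 30) before configuration.jar (level 40), regardless of the order in which the keys appear in the file.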

If you look at the felix.client.run.properties file, you'll see a list of packages at the beginning:

org.osgi.framework.system.packages.extra= \
  apple.awt; \
  com.apple.cocoa.application; \
  com.apple.cocoa.foundation; \
  com.apple.eawt; \
⋮    ⋮    ⋮

The list tells Felix what packages it needs to make available to bundles from the system classpath. This means that packages that are on this list can be imported by bundles (i.e. added to their Import-Package manifest header) without being exported by any other bundle. The list mostly contains packages that come from OS-specific JRE parts, and Jitsi developers rarely need to add new ones to it; in most cases packages are made available by bundles.

10.4. Protocol Provider Service

The ProtocolProviderService in Jitsi defines the way all protocol implementations behave. It is the interface that other bundles (like the user interface) use when they need to send and receive messages, make calls, and share files through the networks that Jitsi connects to.

The protocol service interfaces can all be found under the net.java.sip.communicator.service.protocol package. There are multiple implementations of the service, one per supported protocol, and all are stored in net.java.sip.communicator.impl.protocol.protocol_name.

Let's start with the service.protocol directory. The most prominent piece is the ProtocolProviderService interface. Whenever someone needs to perform a protocol-related task, they have to look up an implementation of that service in the BundleContext. The service and its implementations allow Jitsi to connect to any of the supported networks, to retrieve the connection status and details, and most importantly to obtain references to the classes that implement the actual communications tasks like chatting and making calls.

10.4.1. Operation Sets

As we mentioned earlier, the ProtocolProviderService needs to leverage the various communication protocols and their differences. While this is particularly simple for features that all protocols share, like sending a message, things get trickier for tasks that only some protocols support. Sometimes these differences come from the service itself: for example, most of the SIP services out there do not support server-stored contact lists, while this is a relatively well-supported feature with all other protocols. MSN and AIM are another good example: at one time neither of them offered the ability to send messages to offline users, while everyone else did. (This has since changed.)

The bottom line is our ProtocolProviderService needs to have a way of handling these differences so that other bundles, like the GUI, act accordingly; there's no point in adding a call button to an AIM contact if there's no way to actually make a call.

OperationSets to the rescue (Figure 10.3). Unsurprisingly, they are sets of operations, and provide the interface that Jitsi bundles use to control the protocol implementations. The methods that you find in an operation set interface are all related to a particular feature. OperationSetBasicInstantMessaging, for instance, contains methods for creating and sending instant messages, and registering listeners that allow Jitsi to retrieve messages it receives. Another example, OperationSetPresence, has methods for querying the status of the contacts on your list and setting a status for yourself. So when the GUI updates the status it shows for a contact, or sends a message to a contact, it is first able to ask the corresponding provider whether they support presence and messaging. The methods that ProtocolProviderService defines for that purpose are:

public Map<String, OperationSet> getSupportedOperationSets();
public <T extends OperationSet> T getOperationSet(Class<T> opsetClass);

OperationSets have to be designed so that it is unlikely that a new protocol we add has support for only some of the operations defined in an OperationSet. For example, some protocols do not support server-stored contact lists even though they allow users to query each other's status. Therefore, rather than combining the presence management and buddy list retrieval features in OperationSetPresence, we also defined an OperationSetPersistentPresence which is only used with protocols that can store contacts online. On the other hand, we have yet to come across a protocol that only allows sending messages without receiving any, which is why things like sending and receiving messages can be safely combined.

[Operation Sets]

Figure 10.3: Operation Sets
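The capability check described in this section can be sketched as follows. The two method signatures are the ones quoted above; everything else (ToyProtocolProvider, ToyMessaging, ToyTelephony, ContactPanel, and the assumption that an unsupported set is reported as null) is hypothetical and only illustrates the pattern.

```java
import java.util.HashMap;
import java.util.Map;

// Marker interface, as in the signatures quoted above.
interface OperationSet {}

// Hypothetical operation sets with made-up method signatures.
interface ToyMessaging extends OperationSet {
    void sendInstantMessage(String contact, String text);
}

interface ToyTelephony extends OperationSet {
    void call(String contact);
}

// Toy provider: keeps its supported operation sets keyed by class name.
class ToyProtocolProvider {
    private final Map<String, OperationSet> opsets = new HashMap<>();

    <T extends OperationSet> void addOperationSet(Class<T> opsetClass, T impl) {
        opsets.put(opsetClass.getName(), impl);
    }

    public Map<String, OperationSet> getSupportedOperationSets() {
        return opsets;
    }

    // Assumption of this sketch: null means "not supported".
    public <T extends OperationSet> T getOperationSet(Class<T> opsetClass) {
        return opsetClass.cast(opsets.get(opsetClass.getName()));
    }
}

class ContactPanel {
    // Show the call button only if the provider can actually place calls.
    static boolean showCallButton(ToyProtocolProvider provider) {
        return provider.getOperationSet(ToyTelephony.class) != null;
    }
}
```

Because the GUI asks for a capability rather than for a protocol, it needs no per-protocol special cases: a provider that gains telephony support later gets a call button without any GUI change.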

10.4.2. Accounts, Factories and Provider Instances

An important characteristic of the ProtocolProviderService is that one instance corresponds to one protocol account. Therefore, at any given time you have as many service implementations in the BundleContext as you have accounts registered by the user.

At this point you may be wondering who creates and registers the protocol providers. There are two different entities involved. First, there is ProtocolProviderFactory. This is the service that allows other bundles to instantiate providers and then registers them as services. There is one factory per protocol and every factory is responsible for creating providers for that particular protocol. Factory implementations are stored with the rest of the protocol internals. For SIP, for example, we have net.java.sip.communicator.impl.protocol.sip.ProtocolProviderFactorySipImpl.

The second entity involved in account creation is the protocol wizard. Unlike factories, wizards are separated from the rest of the protocol implementation because they involve the graphical user interface. The wizard that allows users to create SIP accounts, for example, can be found in net.java.sip.communicator.plugin.sipaccregwizz.
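The one-provider-per-account rule can be illustrated with a toy factory. The classes below are hypothetical stand-ins and do not reproduce the ProtocolProviderFactory API; the internal map plays the role of the services a real factory would register in the BundleContext.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy provider: one instance represents exactly one account.
class ToyAccountProvider {
    final String protocol;
    final String accountId;

    ToyAccountProvider(String protocol, String accountId) {
        this.protocol = protocol;
        this.accountId = accountId;
    }
}

// Toy factory: one factory per protocol, creating providers for that protocol only.
class ToyProviderFactory {
    private final String protocol;
    private final Map<String, ToyAccountProvider> registered = new LinkedHashMap<>();

    ToyProviderFactory(String protocol) {
        this.protocol = protocol;
    }

    // What an account wizard would call: creates the provider for a new
    // account, or returns the existing one if the account is already known.
    ToyAccountProvider installAccount(String accountId) {
        return registered.computeIfAbsent(
            accountId, id -> new ToyAccountProvider(protocol, id));
    }

    int registeredProviderCount() {
        return registered.size();
    }
}
```

With two SIP accounts installed, the factory holds two providers, matching the statement that the number of registered services equals the number of accounts.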

10.5. Media Service

When working with real-time communication over IP, there is one important thing to understand: protocols like SIP and XMPP, while recognized by many as the most common VoIP protocols, are not the ones that actually move voice and video over the Internet. This task is handled by the Real-time Transport Protocol (RTP). SIP and XMPP are only responsible for preparing everything that RTP needs, like determining the address where RTP packets need to be sent and negotiating the format that audio and video need to be encoded in (i.e., the codec). They also take care of things like locating users, maintaining their presence, making the phones ring, and many others. This is why protocols like SIP and XMPP are often referred to as signalling protocols.

What does this mean in the context of Jitsi? Well, first of all it means that you are not going to find any code manipulating audio or video flows in either the sip or jabber Jitsi packages. This kind of code lives in our MediaService. The MediaService and its implementation are located in net.java.sip.communicator.service.neomedia and net.java.sip.communicator.impl.neomedia.

Why "neomedia"?

The "neo" in the neomedia package name indicates that it replaces a similar package that we used originally and that we then had to completely rewrite. This is actually how we came up with one of our rules of thumb: it is hardly ever worth it to spend a lot of time designing an application to be 100% future-proof. There is simply no way of taking everything into account, so you are bound to have to make changes later anyway. Besides, it is quite likely that a painstaking design phase will introduce complexities that you will never need because the scenarios you prepared for never happen.

In addition to the MediaService itself, there are two other interfaces that are particularly important: MediaDevice and MediaStream.

10.5.1. Capture, Streaming, and Playback

MediaDevices represent the capture and playback devices that we use during a call (Figure 10.4). Your microphone and speakers, your headset and your webcam are all examples of such MediaDevices, but they are not the only ones. Desktop streaming and sharing calls in Jitsi capture video from your desktop, while a conference call uses an AudioMixer device in order to mix the audio we receive from the active participants. In all cases, MediaDevices represent only a single MediaType. That is, they can only be either audio or video but never both. This means that if, for example, you have a webcam with an integrated microphone, Jitsi sees it as two devices: one that can only capture video, and another one that can only capture sound.

Devices alone, however, are not enough to make a phone or a video call. In addition to playing and capturing media, one has to also be able to send it over the network. This is where MediaStreams come in. A MediaStream interface is what connects a MediaDevice to your interlocutor. It represents incoming and outgoing packets that you exchange with them within a call.

Just as with devices, one stream can be responsible for only one MediaType. This means that in the case of an audio/video call Jitsi has to create two separate media streams and then connect each to the corresponding audio or video MediaDevice.

[Media Streams For Different Devices]

Figure 10.4: Media Streams For Different Devices
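The one-type-per-device and one-type-per-stream rules can be modeled in a few lines. The Toy-prefixed types below are hypothetical and much simpler than the real MediaDevice and MediaStream interfaces; the sketch only shows why an audio/video call ends up with two streams.

```java
import java.util.ArrayList;
import java.util.List;

enum ToyMediaType { AUDIO, VIDEO }

// A device carries exactly one media type.
class ToyMediaDevice {
    final String name;
    final ToyMediaType type;

    ToyMediaDevice(String name, ToyMediaType type) {
        this.name = name;
        this.type = type;
    }
}

// A stream is bound to one device and therefore to one media type.
class ToyMediaStream {
    final ToyMediaDevice device;

    ToyMediaStream(ToyMediaDevice device) {
        this.device = device;
    }

    ToyMediaType type() {
        return device.type;
    }
}

class ToyCallSetup {
    // Creates one stream per requested media type, each connected to the
    // first available device of that type.
    static List<ToyMediaStream> createStreams(List<ToyMediaDevice> devices,
                                              ToyMediaType... wanted) {
        List<ToyMediaStream> streams = new ArrayList<>();
        for (ToyMediaType type : wanted) {
            ToyMediaDevice match = devices.stream()
                .filter(d -> d.type == type)
                .findFirst()
                .orElseThrow(() ->
                    new IllegalStateException("no " + type + " device"));
            streams.add(new ToyMediaStream(match));
        }
        return streams;
    }
}
```

A webcam with a built-in microphone is represented here as two ToyMediaDevice entries, so a call that wants both audio and video gets two separate streams.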

10.5.2. Codecs

Another important concept in media streaming is that of MediaFormats, also known as codecs. By default most operating systems let you capture audio in 48 kHz PCM or something similar. This is what we often refer to as "raw audio" and it's the kind of audio you get in WAV files: great quality and enormous size. It is quite impractical to try and transport audio over the Internet in the PCM format.

This is what codecs are for: they let you present and transport audio or video in a variety of different ways. Some audio codecs like iLBC, 8 kHz Speex, or G.729, have low bandwidth requirements but sound somewhat muffled. Others like wideband Speex and G.722 give you great audio quality but also require more bandwidth. There are codecs that try to deliver good quality while keeping bandwidth requirements at a reasonable level. H.264, the popular video codec, is a good example of that. The trade-off here is the amount of calculation required during conversion. If you use Jitsi for an H.264 video call you see a good quality image and your bandwidth requirements are quite reasonable, but your CPU runs at maximum.

All this is an oversimplification, but the idea is that codec choice is all about compromises. You either sacrifice bandwidth, quality, CPU intensity, or some combination of those. People working with VoIP rarely need to know more about codecs.
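A little arithmetic shows the scale of the difference. The sketch below assumes 16-bit mono samples for the raw 48 kHz PCM (the text does not specify sample size or channel count) and uses the nominal payload rates of G.729 (8 kbit/s) and G.722 (64 kbit/s); RTP, UDP and IP header overhead is ignored. The class name is hypothetical.

```java
// Back-of-the-envelope bitrate comparison; not a measurement.
class AudioBitrates {
    // Raw PCM bitrate in bits per second.
    static long pcmBitsPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return (long) sampleRateHz * bitsPerSample * channels;
    }

    // How many times smaller the codec payload is than the raw audio.
    static double compressionRatio(long rawBitsPerSecond, long codecBitsPerSecond) {
        return (double) rawBitsPerSecond / codecBitsPerSecond;
    }

    public static void main(String[] args) {
        long raw = pcmBitsPerSecond(48_000, 16, 1); // assumed: 16-bit mono
        System.out.println("raw PCM: " + raw / 1000 + " kbit/s");
        System.out.println("vs G.729: " + compressionRatio(raw, 8_000) + "x");
        System.out.println("vs G.722: " + compressionRatio(raw, 64_000) + "x");
    }
}
```

Under these assumptions raw audio needs 768 kbit/s, which is 96 times the G.729 payload rate and 12 times the G.722 rate. That gap is the bandwidth side of the compromise; quality and CPU cost are the other two.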

10.5.3. Connecting with the Protocol Providers

Protocols in Jitsi that currently have audio/video support all use our MediaService in exactly the same way. First they ask the MediaService about the devices that are available on the system:

public List<MediaDevice> getDevices(MediaType mediaType, MediaUseCase useCase);

The MediaType indicates whether we are interested in audio or video devices. The MediaUseCase parameter is currently only considered in the case of video devices. It tells the media service whether we'd like to get devices that could be used in a regular call (MediaUseCase.CALL), in which case it returns a list of available webcams, or a desktop sharing session (MediaUseCase.DESKTOP), in which case it returns references to the user desktops.

The next step is to obtain the list of formats that are available for a specific device. We do this through the MediaDevice.getSupportedFormats method:

public List<MediaFormat> getSupportedFormats();

Once it has this list, the protocol implementation sends it to the remote party, which responds with a subset of them to indicate which ones it supports. This exchange is also known as the Offer/Answer Model and it often uses the Session Description Protocol or some form of it.
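Reduced to its core, the exchange is a list intersection. The sketch below is a deliberate simplification: formats are plain strings, the answer keeps the offerer's ordering (a real answerer may apply its own preferences), and no SDP is involved. The OfferAnswer class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Simplified offer/answer: formats are just names here.
class OfferAnswer {
    // The answer is the subset of offered formats that the remote party
    // supports, kept in the order in which they were offered.
    static List<String> answer(List<String> offered, List<String> supportedByRemote) {
        List<String> result = new ArrayList<>(offered);
        result.retainAll(supportedByRemote);
        return result;
    }

    // The format actually used is the first one in the answer, if any.
    static Optional<String> chosenFormat(List<String> answer) {
        return answer.stream().findFirst();
    }
}
```

For example, offering G722, speex and PCMU to a party that supports PCMU and G722 yields the answer G722, PCMU, and the call proceeds with G722. An empty answer means the two sides share no format and no stream can be set up.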

After exchanging formats and some port numbers and IP addresses, VoIP protocols create, configure and start the MediaStreams. Roughly speaking, this initialization is along the following lines:

// First create a stream connector telling the media service what sockets
// to use when transporting media with RTP and flow control and statistics
// messages with RTCP
StreamConnector connector =
  new DefaultStreamConnector(rtpSocket, rtcpSocket);
MediaStream stream = mediaService.createMediaStream(connector, device, control);

// A MediaStreamTarget indicates the address and ports where our
// interlocutor is expecting media. Different VoIP protocols have their
// own ways of exchanging this information
stream.setTarget(target);

// The MediaDirection parameter tells the stream whether it is going to be
// incoming, outgoing or both
stream.setDirection(direction);

// Then we set the stream format. We use the one that came
// first in the list returned in the session negotiation answer.
stream.setFormat(format);

// Finally, we are ready to actually start grabbing media from our
// media device and streaming it over the Internet
stream.start();

Now you can wave at your webcam, grab the mic and say, "Hello world!"

10.6. UI Service

So far we have covered parts of Jitsi that deal with protocols, sending and receiving messages and making calls. Above all, however, Jitsi is an application used by actual people and as such, one of its most important aspects is its user interface. Most of the time the user interface uses the services that all the other bundles in Jitsi expose. There are some cases, however, where things happen the other way around.

Plugins are the first example that comes to mind. Plugins in Jitsi often need to be able to interact with the user. This means they have to open, close, move or add components to existing windows and panels in the user interface. This is where our UIService comes into play. It allows for basic control over the main window in Jitsi and this is how our icons in the Mac OS X dock and the Windows notification area let users control the application.

In addition to simply playing with the contact list, plugins can also extend it. The plugin that implements support for chat encryption (OTR) in Jitsi is a good example of this. Our OTR bundle needs to register several GUI components in various parts of the user interface. It adds a padlock button in the chat window and a sub-section in the right-click menu of all contacts.

The good news is that it can do all this with just a few method calls. The OSGi activator for the OTR bundle, OtrActivator, contains the following lines:

Hashtable<String, String> filter = new Hashtable<String, String>();

// Register the right-click menu item.
filter.put(Container.CONTAINER_ID,
    Container.CONTAINER_CONTACT_RIGHT_BUTTON_MENU.getID());

bundleContext.registerService(PluginComponent.class.getName(),
    new OtrMetaContactMenu(Container.CONTAINER_CONTACT_RIGHT_BUTTON_MENU),
    filter);

// Register the chat window menu bar item.
filter.put(Container.CONTAINER_ID,
           Container.CONTAINER_CHAT_MENU_BAR.getID());

bundleContext.registerService(PluginComponent.class.getName(),
           new OtrMetaContactMenu(Container.CONTAINER_CHAT_MENU_BAR),
           filter);

As you can see, adding components to our graphical user interface simply comes down to registering OSGi services. On the other side of the fence, our UIService implementation is looking for implementations of its PluginComponent interface. Whenever it detects that a new implementation has been registered, it obtains a reference to it and adds it to the container indicated in the OSGi service filter.

Here's how this happens in the case of the right-click menu item. Within the UI bundle, the class that represents the right-click menu, MetaContactRightButtonMenu, contains the following lines:

// Search for plugin components registered through the OSGI bundle context.
ServiceReference[] serRefs = null;

String osgiFilter = "("
    + Container.CONTAINER_ID
    + "="+Container.CONTAINER_CONTACT_RIGHT_BUTTON_MENU.getID()+")";

serRefs = GuiActivator.bundleContext.getServiceReferences(
        PluginComponent.class.getName(),
        osgiFilter);

// Go through all the plugins we found and add them to the menu.
for (int i = 0; i < serRefs.length; i ++)
{
    PluginComponent component = (PluginComponent) GuiActivator
        .bundleContext.getService(serRefs[i]);

    component.setCurrentContact(metaContact);

    if (component.getComponent() == null)
        continue;

    this.add((Component)component.getComponent());
}

And that's all there is to it. Most of the windows that you see within Jitsi do exactly the same thing: they look through the bundle context for services implementing the PluginComponent interface that have a filter indicating that they want to be added to the corresponding container. Plugins are like hitch-hikers holding up signs with the names of their destinations, making Jitsi windows the drivers who pick them up.
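The hitch-hiker pattern can be reproduced with a toy registry that stores a property map next to each component and lets a window ask for the components whose properties match. This is not the OSGi filter mechanism, only an exact-match illustration; the class name, the string components and the container identifiers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy registry: components are registered together with properties and
// later selected by an exact property match.
class ToyComponentRegistry {
    private static final class Registration {
        final String component;
        final Map<String, String> properties;

        Registration(String component, Map<String, String> properties) {
            this.component = component;
            this.properties = properties;
        }
    }

    private final List<Registration> registrations = new ArrayList<>();

    void register(String component, Map<String, String> properties) {
        // Copy the map so that later changes made by the caller do not
        // affect this registration (a choice made for this sketch).
        registrations.add(new Registration(component, new HashMap<>(properties)));
    }

    // What a window would call: "who wants to be added to my container?"
    List<String> find(String key, String value) {
        List<String> matches = new ArrayList<>();
        for (Registration r : registrations) {
            if (value.equals(r.properties.get(key))) {
                matches.add(r.component);
            }
        }
        return matches;
    }
}
```

Because each registration keeps its own copy of the properties, a plugin can reuse one map for several registrations, changing the container identifier in between, and each window still finds only the components addressed to it.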

10.7. Lessons Learned

When we started work on SIP Communicator, one of the most common criticisms or questions we heard was: "Why are you using Java? Don't you know it's slow? You'd never be able to get decent quality for audio/video calls!" The "Java is slow" myth has even been repeated by potential users as a reason they stick with Skype instead of trying Jitsi. But the first lesson we've learned from our work on the project is that efficiency is no more of a concern with Java than it would have been with C++ or other native alternatives.

We won't pretend that the decision to choose Java was the result of rigorous analysis of all possible options. We simply wanted an easy way to build something that ran on Windows and Linux, and Java and the Java Media Framework seemed to offer one relatively easy way of doing so.

Throughout the years we haven't had many reasons to regret this decision. Quite the contrary: even though it doesn't make it completely transparent, Java does help portability and 90% of the code in SIP Communicator doesn't change from one OS to the next. This includes all the protocol stack implementations (e.g., SIP, XMPP, RTP) that are complex enough as they are. Not having to worry about OS specifics in such parts of the code has proven immensely useful.

Furthermore, Java's popularity has turned out to be very important when building our community. Contributors are a scarce resource as it is. People need to like the nature of the application, they need to find time and motivation, and all of this is hard to muster. Not requiring them to learn a new language is, therefore, an advantage.

Contrary to most expectations, Java's presumed lack of speed has rarely been a reason to go native. Most of the time decisions to use native languages were driven by OS integration and how much access Java was giving us to OS-specific utilities. Below we discuss the three most important areas where Java fell short.

10.7.1. Java Sound vs. PortAudio

Java Sound is Java's default API for capturing and playing audio. It is part of the runtime environment and therefore runs on all the platforms the Java Virtual Machine is available for. During its first years as SIP Communicator, Jitsi used Java Sound exclusively and this presented us with quite a few inconveniences.

First of all, the API did not give us the option of choosing which audio device to use. This is a big problem. When using their computer for audio and video calls, users often use advanced USB headsets or other audio devices to get the best possible quality. When multiple devices are present on a computer, Java Sound routes all audio through whichever device the OS considers default, and this is not good enough in many cases. Many users like to keep all other applications running on their default sound card so that, for example, they could keep hearing music through their speakers. What's even more important is that in many cases it is best for SIP Communicator to send audio notifications to one device and the actual call audio to another, allowing a user to hear an incoming call alert on their speakers even if they are not in front of the computer and then, after picking up the call, to start using a headset.

None of this is possible with Java Sound. What's more, the Linux implementation uses OSS which is deprecated on most of today's Linux distributions.

We decided to use an alternative audio system. We didn't want to compromise our multi-platform nature and, if possible, we wanted to avoid having to handle it all by ourselves. This is where PortAudio[2] came in extremely handy.

When Java doesn't let you do something itself, cross-platform open source projects are the next best thing. Switching to PortAudio has allowed us to implement support for fine-grained configurable audio rendering and capture just as we described it above. It also runs on Windows, Linux, Mac OS X, FreeBSD and others that we haven't had the time to provide packages for.

10.7.2. Video Capture and Rendering

Video is just as important to us as audio. However, this didn't seem to be the case for the creators of Java, because there is no default API in the JRE that allows capturing or rendering video. For a while the Java Media Framework seemed to be destined to become such an API until Sun stopped maintaining it.

Naturally we started looking for a PortAudio-style video alternative, but this time we weren't so lucky. At first we decided to go with the LTI-CIVIL framework from Ken Larson[3]. This is a wonderful project and we used it for quite a while[4]. However, it turned out to be suboptimal when used in a real-time communications context.

So we came to the conclusion that the only way to provide impeccable video communication for Jitsi would be for us to implement native grabbers and renderers all by ourselves. This was not an easy decision since it implied adding a lot of complexity and a substantial maintenance load to the project but we simply had no choice: we really wanted to have quality video calls. And now we do!

Our native grabbers and renderers directly use Video4Linux 2, QTKit and DirectShow/Direct3D on Linux, Mac OS X, and Windows respectively.

10.7.3. Video Encoding and Decoding

SIP Communicator, and hence Jitsi, supported video calls from its first days. That's because the Java Media Framework allowed encoding video using the H.263 codec and a 176x144 (QCIF) format. Those of you who know what H.263 QCIF looks like are probably smiling right now; few of us would use a video chat application today if that's all it had to offer.

In order to offer decent quality we've had to use other libraries like FFmpeg. Video encoding is actually one of the few places where Java shows its limits performance-wise. So do other languages, as evidenced by the fact that FFmpeg developers actually use assembler in a number of places in order to handle video in the most efficient way possible.

10.7.4. Others

There are a number of other places where we've decided that we needed to go native for better results. Systray notifications with Growl on Mac OS X and libnotify on Linux are one such example. Others include querying contact databases from Microsoft Outlook and Apple Address Book, determining source IP address depending on a destination, using existing codec implementations for Speex and G.722, capturing desktop screenshots, and translating chars into key codes.

The important thing is that whenever we needed to choose a native solution, we could, and we did. This brings us to our point: ever since we started Jitsi we've fixed, added, or even entirely rewritten various parts of it because we wanted them to look, feel or perform better. However, we've never ever regretted any of the things we didn't get right the first time. When in doubt, we simply picked one of the available options and went with it. We could have waited until we knew better what we were doing, but if we had, there would be no Jitsi today.

10.8. Acknowledgments

Many thanks to Yana Stamcheva for creating all the diagrams in this chapter.

Footnotes

  1. To refer directly to the source as you read, download it from http://jitsi.org/source. If you are using Eclipse or NetBeans, you can go to http://jitsi.org/eclipse or http://jitsi.org/netbeans for instructions on how to configure them.
  2. http://portaudio.com/
  3. http://lti-civil.org/
  4. Actually we still have it as a non-default option.