At its simplest, video conferencing integrates video, audio and peripherals so that two or more people can communicate simultaneously over some type of telecommunications line. In other words, synchronized images and speech are transmitted between two or more locations so that the participants do not have to be in the same room. Explaining how video conferencing works is a little harder than answering the question, “What is video conferencing?”
Millions of people use video conferencing every day around the globe, but very few people know just how the technical aspects of the process work. The main ingredients of successful video conferencing are video cameras, microphones, appropriate computer software and computer equipment and peripherals that will integrate with the transmission lines to relay the information.
The analog information captured by the microphones and cameras is broken down into discrete units and translated into ones and zeros. A codec encodes the information as a digital signal that can then be transmitted to a codec at the other end, which translates the digital signal back into analog video images and audio.
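As a rough illustration of that analog-to-digital step, the Python sketch below samples a pure tone and maps each sample to an 8-bit code. The 1 kHz tone, the 8 kHz sample rate and the function names are assumptions chosen for the example, not details of any particular codec.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bits):
    """Sample a sine wave and quantize each sample to an unsigned integer code."""
    levels = 2 ** bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # "analog" value in [-1, 1]
        # Map the range [-1, 1] onto the integer range [0, levels - 1].
        codes.append(round((amplitude + 1) / 2 * (levels - 1)))
    return codes

def to_bits(codes, bits):
    """Concatenate the codes into a string of ones and zeros."""
    return "".join(format(code, f"0{bits}b") for code in codes)

# Hypothetical input: a 1 kHz tone sampled 8,000 times per second, 8 bits per sample.
codes = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, n_samples=8, bits=8)
bitstream = to_bits(codes, 8)
print(codes)
print(bitstream)
```

A real codec does far more than this (it also compresses the result), but the sketch shows how a continuous signal becomes a stream of discrete values that can be sent over a digital line.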
The theory’s the same, the transmission has changed
In the earlier days of video conferencing, T1, ATM and ISDN lines were used almost exclusively but were really only practical for room-based video conferencing systems. These dedicated lines were expensive and only large corporations tended to have the facilities and money to invest in this type of set-up.
As the Internet became more a part of the everyday lives of all businesses, however, it changed how video conferencing was conducted. The TCP/IP connections of the Internet are much less expensive and can carry large quantities of information, including video packets for conferencing, relatively easily. Because of this, video conferencing has become much more prevalent in small businesses and in desktop packages that can be set up with software for computer-to-computer networking.
Compression makes video transmission practical
The problem that arises when you convert analog to digital for transmission is the loss of clarity in an image. Analog signals are a continuous wave of amplitudes and frequencies showing shades and ranges of color as well as depth and brightness. When you convert to digital, which is strictly 0’s and 1’s, you then need to develop a grid to represent values, intensities and saturations of different color values so that the image can be interpreted and reformed at the receiving end.
This vast amount of digital information requires huge bandwidth, and transmitting uncompressed video would take too long to be practical for most applications. That’s where compression is crucial. One of the most important elements in how video conferencing works is the compression ratio.
The higher the compression ratio, the more quickly the information is capable of being transmitted. In many cases, however, this also means some loss in clarity or audio/video quality. For instance, a compression ratio of 4:1 would be terribly slow but have a fantastic picture quality. But by the time it was transmitted, everyone at the other end would probably have left the room for a cup of coffee. Lossy compression discards unneeded or irrelevant sections of a signal in order to transmit only the essentials, speeding up the transmission time significantly but sometimes resulting in loss of quality.
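To put rough numbers on this trade-off, the sketch below estimates the bit rate of uncompressed video and the rate left after compression at a few ratios. The frame size (640 x 480), color depth (24 bits per pixel) and frame rate (30 frames per second) are assumed values for illustration only. The ratios are the ones quoted in this section.

```python
def raw_bitrate_bps(width, height, bits_per_pixel, frames_per_second):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * bits_per_pixel * frames_per_second

def compressed_bitrate_bps(raw_bps, ratio):
    """Bit rate after compression at ratio:1."""
    return raw_bps / ratio

# Assumed video format: 640 x 480 pixels, 24-bit color, 30 frames per second.
raw = raw_bitrate_bps(640, 480, 24, 30)
print(f"uncompressed: {raw / 1_000_000:.2f} Mbit/s")

for ratio in (4, 15, 90):
    mbps = compressed_bitrate_bps(raw, ratio) / 1_000_000
    print(f"{ratio}:1 -> {mbps:.2f} Mbit/s")
```

Under these assumptions the uncompressed stream comes to roughly 221 Mbit/s, which is why even a 4:1 ratio leaves far too much data for most connections.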
Compression can be either intra-frame or inter-frame, and both target material that is repetitive or redundant, such as the wall behind a conference participant. Since the wall remains static and never changes, this image is redundant and can be eliminated from transmissions to an extent with proper compression. Intra-frame compression assumes the redundancy will be present in parts of a frame that are close to each other. Inter-frame compression assumes that there is redundancy over time (for example, that unchanging wall). Either of these can achieve a fairly high degree of accuracy and reduce the bandwidth needed to transmit the signal.
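The two ideas can be sketched in a few lines of Python. This is a simplified, lossless illustration and not how any production video codec is implemented: run-length encoding stands in for intra-frame compression, and a list of changed pixels stands in for inter-frame compression. The pixel values and function names are hypothetical.

```python
def run_length_encode(row):
    """Intra-frame idea: collapse runs of identical neighboring pixels."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1
        else:
            runs.append([pixel, 1])
    return [(value, count) for value, count in runs]

def frame_delta(previous, current):
    """Inter-frame idea: keep only the pixels that changed since the last frame."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current)) if old != new]

def apply_delta(previous, delta):
    """Rebuild the current frame at the receiving end from the previous frame."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame

# One row of pixels: mostly static wall (200) with a small moving subject.
frame_1 = [200] * 12 + [90, 95, 90] + [200] * 5
frame_2 = [200] * 12 + [90, 97, 90] + [200] * 5  # only one pixel changed

encoded_row = run_length_encode(frame_1)
delta = frame_delta(frame_1, frame_2)
print(encoded_row)
print(delta)
print(apply_delta(frame_1, delta) == frame_2)
```

In this toy case, 20 pixel values shrink to five runs within the frame, and the second frame is described by a single changed pixel.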
A newer approach to compression/decompression is SightSpeed technology, developed at Cornell University. SightSpeed transmits only the image information considered essential and eliminates what is considered ‘filler,’ relying on the viewer’s brain to fill in the gaps at the other end. Based on an artificial intelligence model, SightSpeed achieves compression of about 90:1, compared to the typical 15:1 for video conferencing.
Any video conferencing session you use will provide compression of the transmission signal. The key is determining the balance between speed and video picture quality that is right for your needs.
Point to point video conferencing
Point to point video conferencing is just what it sounds like – a link between two different points on the planet, or two different video conferencing terminals. It could be between an office in New York City and a conference room in Munich. Point to point video conferencing can easily be initiated by someone on one end contacting the other end as though making a standard telephone call. There are no special arrangements to be made other than knowing that the participants will be there.
Multipoint conferencing is more complex
Multipoint conferencing is more complicated because it has to coordinate several different locations simultaneously. Since you can’t be in direct contact with several places at once while they are all in contact with others, you need one source that will tie them all together. In video conferencing, this is called a multipoint bridge or multipoint conferencing unit (MCU).
An MCU enables multi-location video conferencing by providing a sort of “central processing center” for all of the locations through which all the information flows. The MCU receives all information from the various locations and then sends it out to each location. In some cases the MCU is located on a particular PC, and in other cases it is located on a remote server (the most common structure, particularly for more powerful MCU networks).
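The hub-and-spoke flow described above can be sketched as a small Python class. This is a toy model only: the class name, the in-memory inboxes and the site names are assumptions made for illustration, and a real MCU also mixes audio, switches or composes video and handles network transport.

```python
class MultipointBridge:
    """Toy stand-in for an MCU: each site sends to the bridge, which fans out."""

    def __init__(self):
        self.inboxes = {}  # site name -> list of (sender, payload) tuples

    def join(self, site):
        """Register a site so it can send and receive."""
        self.inboxes[site] = []

    def send(self, sender, payload):
        """Deliver one site's payload to every other site in the conference."""
        if sender not in self.inboxes:
            raise KeyError(f"{sender} has not joined the conference")
        for site, inbox in self.inboxes.items():
            if site != sender:
                inbox.append((sender, payload))

# Hypothetical three-site conference.
bridge = MultipointBridge()
for site in ("New York", "Munich", "Tokyo"):
    bridge.join(site)

bridge.send("New York", "frame-001")
print(bridge.inboxes["Munich"])
print(bridge.inboxes["New York"])
```

Each site needs only one connection, to the bridge, instead of a separate connection to every other participant.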
Audio is usually sent and received simultaneously in all locations with an MCU with no problem because of the relatively small bandwidth needed for transmittal. It is broadcast in what is called “full duplex” mode, meaning everyone can talk and hear at the same time with no cutting off when one person or another speaks.
Video transmission, however, can be broadcast in a number of ways with an MCU depending upon the quality of the software and the complexity of the system. Some common types of video transmission for video conferencing include:
Continuous Presence video conferencing, which allows up to four conference sites to be seen simultaneously on split screens. This is usually used if you have a small group or individuals in separate locations and will primarily be seeing close-up shots.
Universal Control video conferencing is controlled by the initiating conference site. The primary site determines who sees what at all other sites.
Voice Activated video conferencing is by far the most common type used today. The image with these systems shifts to the site that is currently activating the microphone so that you can always see whoever is speaking. However, if there is a good deal of background noise participants should mute their microphones when they aren’t talking in order to avoid the image jumping about needlessly.
Overcoming the language barrier
Obviously, communicating through video conferencing can’t be achieved unless both ends of the conference are “speaking the same language.” That is, whatever is being transmitted electronically will need to be reassembled properly and heard and seen clearly at the other end. The Codec system (Coder-Decoder) is useless if both ends aren’t using the same virtual language to interpret the signals.
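One way to picture this requirement is as a negotiation in which each end lists the codecs it supports and the call proceeds only if they have one in common. The sketch below is a simplified illustration of that idea, not the actual negotiation procedure of any standard. The function name and the preference lists are hypothetical.

```python
def pick_common_codec(caller_preferences, callee_supported):
    """Return the caller's most-preferred codec that the callee also supports."""
    supported = set(callee_supported)
    for codec in caller_preferences:
        if codec in supported:
            return codec
    return None  # no shared "language": the two ends cannot decode each other

# Hypothetical capability lists, in order of preference.
caller = ["H.264", "H.263", "H.261"]
callee = ["H.261", "H.263"]

print(pick_common_codec(caller, callee))
print(pick_common_codec(["H.264"], callee))
```

When the function returns None, neither end can interpret what the other sends, which is the situation shared standards are meant to prevent.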
In 1996 the International Telecommunication Union (ITU) published a set of standards, dubbed H.323, that lays out specific guidelines for video conferencing protocols so that compliance and support across networks would be easier to achieve and maintain. Since then, many manufacturers and developers of video conferencing tools have adopted the H.323 guidelines as their own.
Web conferencing solutions such as Click to Meet, Lotus’s Sametime, and WebEx also offer corporate solutions that are based on Internet video conferencing. These systems use shared protocols, and subscribers can download and use the software at any location through the Internet. They are becoming more popular with companies that like the convenience and user-friendliness. They will no doubt become more and more refined over time, vying with and perhaps surpassing the H.323 standards.
Overcoming firewall issues
There are, of course, obstacles to overcome when you take a look at how video conferencing works. After all, you’re sending vast amounts of translated data either directly or through an intermediary system (such as the MCU) that is switching and transferring information between a variety of computers. Just about any business these days has a firewall system to provide security and protect the system from potential viruses. Trouble is, many firewalls also block the transmission of data for video conferencing.
Recent innovations have largely circumvented these problems by designing firewall solutions that recognize video conferencing signaling requests and allow the information packets to bypass the firewall or router without disabling the firewall protection for other traffic. Even with this, however, there may be occasions when packets are dropped because of heavy traffic on the system, so investing in a firewall system that can handle substantial traffic is essential to quality video conferencing performance.
How video conferencing works will certainly evolve over time and improve in the coming years, but a basic understanding of what it is and how it works now will help you make the best choice for you when you’re ready to begin using video conferencing yourself.
This article, "How Video Conferencing Works," is reprinted with permission. Copyright © 2004 Evaluseek Publishing.