Voice over data networks. IP telephony. Principles of packet speech transmission. IP telephony architecture levels. Three main IP telephony scenarios. General approach to building an IP network for transmitting telephone traffic

As noted, voice over IP (VoIP) is an OSI Layer 3 solution rather than a Layer 2 solution. This feature allows VoIP to work autonomously over Frame Relay and ATM networks. But most importantly, VoIP works on regular local networks, right down to desktop PCs. In this sense, VoIP is more of an application than a service, and this has been taken into account during the evolution of VoIP protocols.

All VoIP protocols fall into two categories: centralized and distributed. Centralized models follow the client/server architecture, while distributed ones are based on the interaction of peer network nodes. All VoIP technologies use a common medium for voice transmission in the form of RTP packets over IP, and all support a variety of codecs for compression. The difference lies in how signaling is carried and where call logic and call state are maintained: at the endpoints or at a central server. Both architectures have advantages and disadvantages. Distributed models scale well and are more resilient, since they have no central node that can fail. Conversely, centralized call control models offer easier management and support for traditional value-added services (such as conferencing), but their scalability may be limited by the capacity of the central server. Hybrid and interworking models are currently being developed to combine the benefits of both approaches.

The oldest architecture, H.323, and the newest, Session Initiation Protocol (SIP), belong to distributed VoIP call control schemes. Centralized call control methods include the Media Gateway Control Protocol (MGCP) and proprietary protocols such as the Skinny Station Protocol developed by Cisco Systems. A brief description of each of these protocols is provided below.

Voice encoder/decoder (codec) technology has advanced significantly over the past few years, driven by advances in Digital Signal Processor (DSP) architecture and research into human speech recognition. New codecs do more than just perform analog-to-digital conversion. They use sophisticated predictive models to analyze the input voice signal and then transmit the voice using minimal bandwidth. This section will provide some examples of voice codecs and the bandwidth they use. In all cases, speech is transmitted in RTP packets over the IP protocol.

Simple pulse code modulation (PCM) of voice is described by the ITU-T G.711 standard. It allows two main types of PCM at 64 Kbps: mu-law and A-law. Both methods use logarithmic compression to achieve the quality of 12-13-bit linear PCM in 8 bits. They differ only in minor compression details (mu-law has a slight advantage in signal-to-noise ratio at low signal levels). Historically, the use of these methods has followed geographic boundaries: North America uses mu-law, and Europe uses A-law. On calls between the two regions, conversion from mu-law to A-law is the responsibility of the country that uses mu-law. When troubleshooting PCM systems, keep in mind that a mismatch between the two types produces unnatural-sounding but still intelligible speech.
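The effect of logarithmic compression can be illustrated with a minimal sketch. It uses the continuous mu-law formula with mu = 255 and a simple uniform 8-bit quantizer; this is an assumption made for illustration and is not the bit-exact segmented encoding defined in G.711. All function names are hypothetical.

```python
import math

MU = 255  # companding parameter commonly quoted for mu-law


def mulaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def to_8bit(y: float) -> int:
    """Quantize a value in [-1, 1] to a signed code in -127..127."""
    return round(y * 127)


def from_8bit(code: int) -> float:
    """Map a signed code back to [-1, 1]."""
    return code / 127


def roundtrip(x: float) -> float:
    """Compress, quantize to 8 bits, then expand again."""
    return mulaw_expand(from_8bit(to_8bit(mulaw_compress(x))))


def uniform_roundtrip(x: float) -> float:
    """Same 8-bit quantizer, but without companding."""
    return from_8bit(to_8bit(x))
```

For a quiet sample such as 0.01, the companded round trip loses far less precision than plain 8-bit quantization, which is why 8 companded bits can approach the quality of 12-13 linear bits.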

Another commonly used compression method is Adaptive Differential Pulse Code Modulation (ADPCM). A typical example is ITU-T G.726 encoding with 4-bit samples, which gives a bit rate of 32 Kbps (8000 samples per second at 4 bits each). Unlike PCM, the 4 bits do not encode the speech amplitude itself, but only the difference in amplitude and the rate of change of amplitude, using a rather primitive linear prediction.

PCM and ADPCM are examples of waveform codecs, which exploit the redundancy of the waveform itself in their compression methods. Newer compression methods developed over the past 10-15 years also use knowledge of how speech is produced. These methods apply signal processing techniques that compress speech by sending only simplified parametric information about the original excitation signal and the shape of the vocal tract, which requires less bandwidth. Together they form the group known as source codecs. It includes varieties such as Linear Predictive Coding (LPC), Code Excited Linear Prediction (CELP) and Multipulse Maximum Likelihood Quantization (MP-MLQ).

The types of codecs listed above can be divided into subcategories. For example, CELP methods include a low-delay version called LD-CELP (low-delay CELP), as well as a more complex variant that models the vocal tract using a conjugate-structure algebraic codebook. Such codecs are designated CS-ACELP (conjugate-structure algebraic CELP). The list could be continued indefinitely, but network designers only need to know where these approaches are applied in networks and applications.

Sophisticated predictive codecs rely on a mathematical model of the human vocal apparatus and, instead of sending compressed speech, send a mathematical representation of it that allows the recipient to regenerate the speech. Tuning such equipment, however, requires serious research. For example, some of the first codecs reproduced the voices of their developers well and were widely deployed, until it was discovered that they did not reproduce women's speech and Asian dialects very well. These codecs were then redesigned to accommodate a wider range of human voice types.

The ITU has standardized the most common voice coding and packetization techniques in telephony by adopting the following standards.

G.711. The PCM voice coding method with a transmission rate of 64 Kbps, briefly described earlier. G.711-encoded voice is already in the correct format for digital voice transmission over the public telephone network or through a PBX.

G.726. ADPCM encoding method with bit rates of 40, 32, 24 and 16 Kbps. ADPCM-encoded speech can also be transmitted between packet voice networks, public telephone networks, and PBX networks, provided the latter support ADPCM.

G.729. CELP compression method that allows you to encode speech into streams with a transmission rate of 8 Kbps. The two versions of this standard (G.729 and G.729 Annex A) differ significantly in computational complexity, but both provide about as good speech quality as 32 Kbps ADPCM.

G.723.1. A technique for compressing voice and other audio components of multimedia services at very low bit rates. Part of the H.324 family of standards, this coder has two bit rates: 5.3 and 6.3 Kbps. The higher rate is based on MP-MLQ technology and provides higher quality; the lower rate is based on CELP, provides good quality, and gives system designers additional flexibility.

As codecs increasingly rely on subjectively tuned compression techniques, standard objective quality metrics such as total harmonic distortion and signal-to-noise ratio say less about perceived codec quality. A common test for determining the effectiveness of voice codecs is the Mean Opinion Score (MOS). Because voice and sound quality are subjective and listener-dependent, a wide range of listeners and speech samples is important in this method. MOS tests are administered to a group of listeners who rate voice samples from 1 (poor) to 5 (excellent). The ratings are then averaged to obtain the mean opinion score. MOS testing is also used to compare the performance of the same codec under different conditions, such as background noise levels or multiple encoding and decoding passes. This data can then be used for comparison with other codecs.
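The averaging step can be shown as a short sketch. The ratings in the usage note are invented for illustration and are not measurements of any real codec.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average listener ratings given on the 1 (poor) to 5 (excellent) scale."""
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
    return mean(ratings)
```

For example, five hypothetical listeners who rate a sample 4, 4, 5, 3 and 4 produce a MOS of 4.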

Table 19.1 provides MOS scores for several ITU-T codecs and also shows the relationship between several low-bit-rate codecs and the PCM standard.

1 For Texas Instruments DSP 54x.

This table provides information useful for comparing different implementations of common voice codecs. Relative bandwidth and processing complexity, expressed in millions of instructions per second (MIPS), determine where different codecs are applied. In general, a higher MOS score corresponds to a more complex codec or a higher bandwidth.

Literature:

Internetworking Technologies Handbook, 4th Edition. Translated from English. Moscow: Williams Publishing House, 2005. 1040 pp., ill. Parallel title in English.

At the first stage, the voice is digitized. The digitized data is then analyzed and processed to reduce the volume of data transmitted to the recipient. As a rule, unnecessary pauses and background noise are suppressed at this stage, and the data is compressed.

At the next stage, the resulting data sequence is divided into packets and protocol information is added to each one: the recipient's address, the sequence number of the packet (in case packets are delivered out of order), and additional data for error correction. Before a packet is sent to the network, the amount of data needed to fill it is accumulated in a buffer.
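A minimal sketch of this packetization step follows. The `VoicePacket` type, the function names and the 160-byte payload (20 ms of 64 Kbps audio) are assumptions chosen for illustration; error-correction data is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class VoicePacket:
    dest: str       # recipient's address
    seq: int        # sequence number, used to restore the original order
    payload: bytes  # one buffer's worth of encoded voice


def packetize(stream: bytes, dest: str, payload_size: int = 160):
    """Accumulate payload_size bytes per packet and number the packets."""
    offsets = range(0, len(stream), payload_size)
    return [VoicePacket(dest, seq, stream[off:off + payload_size])
            for seq, off in enumerate(offsets)]


def reassemble(packets) -> bytes:
    """Rebuild the byte stream even if the packets arrived out of order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)
```

Because every packet carries its own sequence number, the receiver can sort whatever arrives before playing it out.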

Operators of packet-switched networks benefit from the very nature of a shared telecommunications infrastructure. Simply put, they can sell more capacity than they actually have, based on statistical analysis of network usage. Since subscribers are not expected to use all of their paid bandwidth 24/7, it is possible to serve more subscribers without expanding the backbone infrastructure. Turnover and profits increase accordingly.

For example, suppose a subscriber who has paid for 64 Kbps of bandwidth uses the channel only 25% of the time on average. The operator can then sell the same resource to four times as many users without overloading its network. This scenario benefits both parties, the customer and the seller, since the operator increases its income and can lower subscription fees because its costs fall. This approach has already proved itself in the world of data communications and is now starting to be used in the telephony market.

Bandwidth directly depends on the load on the Internet with packets containing data, voice, graphics, etc., which means that delays in the passage of packets can be very different. By using dedicated channels exclusively for voice packets, a fixed (or nearly fixed) transmission rate can be guaranteed. Due to the widespread use of the Internet, the implementation of an Internet telephony system is of particular interest, although it should be recognized that the quality of telephone communication is not guaranteed by the operator.

To provide long-distance (international) communication using telephone servers, the organization or service operator must have a server at each location where calls are to originate or terminate. The cost of such communication is an order of magnitude lower than the cost of a call over conventional telephone lines. The difference is especially large for international calls.

The general operating principle of Internet telephony servers is as follows. On one side, the server is connected to telephone lines and can reach any telephone in the world. On the other side, the server is connected to the Internet and can communicate with any computer in the world. The server takes a standard telephone signal, digitizes it (if it is not already digital), compresses it heavily, splits it into packets, and sends it over the Internet to its destination using the IP protocol. For packets arriving at the telephone server from the network and going out to the telephone line, the operations are performed in reverse order. Both halves of the operation (the signal entering from the telephone network and the signal leaving to it) run almost simultaneously, which allows a full-duplex conversation.

Since the operator provides a service and charges money for it, the operator is obliged to guarantee its quality. Even if the customer is willing to put up with less than excellent quality from time to time (which is unlikely in a highly competitive telecommunications market), the customer may still file a claim if the problems are serious or persistent. In any case, the operator has to monitor the quality of the services provided. When services are provided on a large scale, this requires appropriate equipment and software, which is quite expensive and is not available at every point in the network.

From a scalability point of view, IP telephony looks like a fairly complete solution. First, an IP-based connection can begin (and end) at any point in the network, from the subscriber to the backbone. Accordingly, IP telephony can be introduced into the network section by section, which is also convenient for migration. Second, the IP telephony solution is modular: the number and capacity of its nodes, such as gateways and gatekeepers (the VoIP term for servers that process the numbering plan), can be increased almost independently, in accordance with current needs.

In recent years, several solutions have been proposed to create a universal infrastructure for transmitting heterogeneous traffic. In conditions of increased requirements for quality of service and bandwidth, networks with high quality services and increased transmission speed are needed.

IP plays a key role in providing service flexibility. To increase overall network profitability, providers must provide services that are IP-based or capable of “understanding” IP, since most applications that require WAN services use IP. And as consumers continue to demand more functionality from their providers, providers must continually seek new services that can complement and enhance consumer applications. It is safe to say that these services must be IP based.

IP is becoming the standard protocol for corporate networks, intranets and extranets. In the 1980s, geographically distributed corporate networks were built on dedicated E1/T1 channels. Multiplexers were used to compress channels and to integrate voice and data in public and private networks. At the same time, the principles of building telephone networks did not change radically. In such networks, telephone connections are established along predefined routes (primary and alternate) and suffer from many limitations: the high cost of maintaining a large number of routing tables for each PBX and of reconfiguring them when telephone traffic flows change, inefficient use of bandwidth, degraded speech quality when compression is applied in networks with multiple PBXs, and others.

In recent years, devices have been developed that provide voice transmission over networks originally intended for data transmission, such as Frame Relay and IP networks. The driving force behind this is the desire to reduce the cost of using leased communication lines and increase the efficiency of using dedicated corporate communications.

A new impetus for the development of telephone networks was given by the emergence of voice transmission technology over ATM networks, which provides for the ability to connect PBXs to ATM switches capable of processing both data streams and telephone signals.

This article describes:

technologies for transmitting voice and data over IP networks;
problems of building integrated networks;
mechanisms that provide increased bandwidth efficiency and flow control flexibility (compression, suppression of speech pauses);
equipment from leading manufacturers.

What is IP telephony

Telephony over IP is a relatively new service that typically uses a managed IP network to transmit telephone traffic.

The VoIP (voice over IP) services market is expected to grow at a phenomenal rate over the next five years. According to Killen & Associates, in Fortune 1000 companies, less than 1% of voice traffic now travels over IP networks; by 2002 this share should reach 18%, and by 2005 - 33%.

Users and service providers are attracted by the economic benefits of using IP for the transmission of telephone traffic, conference calls with simultaneous exchange of information, IP call service centers, and transparent routing of user requests.

A comparison of standard telephone service over public networks with the first generation of VoIP devices does not favor the latter, primarily because of their low reliability and low quality of service. However, second-generation VoIP systems eliminate many of these problems thanks to more sophisticated applications and devices: high-performance switches and routers with advanced quality of service (QoS) mechanisms, and digital signal processors (DSPs).

IP telephony refers to the technology of using an IP network (Internet or any other) as a means of organizing and conducting telephone conversations and transmitting faxes in real time. IP telephony is one of the most complex applications of computer telephony.

In general terms, voice transmission in an IP network occurs as follows. The incoming call and its signaling information from the telephone network are passed to an edge network device called a telephone gateway and processed by a dedicated voice interface card. The gateway, using control protocols of the H.323 family, forwards the signaling information to another gateway located on the receiving side of the IP network. The receiving gateway passes the signaling information to the receiving telephone equipment according to the numbering plan, establishing an end-to-end connection. Once the connection is established, the voice at the ingress device is digitized (if it was not already digital), encoded with a standard ITU algorithm such as G.711 or G.729, compressed, encapsulated into packets, and sent to the remote device using the TCP/IP protocol stack.

Thus, using an IP network, digital information can be exchanged to send voice or fax messages between two computers in real time. The use of the Internet will make it possible to implement this service on a global scale.

The main challenges in building an IP network to carry telephone traffic are latency management mechanisms and maintaining sufficient bandwidth. In addition, methods for setting tariffs for services and billing for their use are important, as well as options for paying for additional services in the IP network, such as call forwarding, caller ID, routing depending on the time of day, etc.

An important issue is assessing the profitability of a new technology. Does IP connectivity really offer significant savings? The answer can only be obtained through a comprehensive consideration of the problem. Perhaps it does. But if the cost of transmitting information over the network is only 15-20% of the total cost of maintaining the network infrastructure, then a 70% saving in transmission costs (roughly 10-14% of the total) may not seem so attractive when set against the amount of work needed to move all functions to a universal platform, the money spent on creating a universal infrastructure, and the option of continuing to use existing equipment.

And this is only a small part of the problems associated with the implementation of universal communication lines. Therefore, service providers usually begin offering integrated networks by building small specialized networks, on which integration technologies are tested and answers are sought to the questions that arise when different types of communications are combined. Nevertheless, building an integrated infrastructure can already be considered realistic.

General approach to building an IP network for transmitting telephone traffic

"computer - computer"
This option is not an example of IP telephony, since voice is transmitted only over a data network, without access to the telephone network. To organize traffic transmission, the user purchases the necessary equipment and software, and also pays the provider for the operation of the communication channel. The advantage of this option is maximum cost savings. Disadvantage: minimal connection quality.
"telephone - telephone"
This kind of communication requires certain network devices and interworking mechanisms. Voice traffic is transmitted over an IP network, usually over a separate, expensive segment. The devices that handle interworking are gateways, connected on one side to the public telephone network and on the other to the IP network. Compared with the computer-to-computer option, voice communication in this mode is more expensive, but its quality is much higher and it is more convenient to use. To use this service, the subscriber calls the provider that operates the gateway, enters an access code and the number of the called subscriber from the telephone set, and talks just as in an ordinary telephone call. All necessary call routing operations are performed by the gateway.
"computer - phone"
This option opens up more use cases for corporate users, since in most cases the corporate network carries calls from computers to the gateway, from which they are passed on over the public telephone network. Enterprise solutions based on computer-to-phone communication can help save money; the equipment needed for this is discussed below.

So, it is obvious that to build an IP telephony network, two main elements are required (Fig. 1).

The first is a gateway, which provides conversion functions between the packet-switched IP network and the public telephone network, analog-to-digital conversion, control of transmission formats and VoIP call procedures. It is possible to use multiple gateways in the network.

The second main element is a control device (gatekeeper), which provides a number of functions for controlling access to and from the IP network, bandwidth and addressing. In addition, the control device monitors all gateways and terminals, performs directory service functions, and controls user accounts.

The gateway can be supplied as a separate network device or installed on a personal computer. When using a gateway, the VoIP function is transparent to the user using a regular telephone or fax machine. Let's take a closer look at the main functions of the gateway when transmitting voice over an IP network.

1. Search function. When the originating IP gateway places a telephone call over the IP network, it takes the called number and converts it to the IP address of the destination gateway, using either a table in the originating gateway or a centralized server. A table lookup in the originating gateway usually takes less time than a query to a centralized server and reduces connection time from 4-5 seconds to 1-2 seconds.
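The local table lookup can be sketched as follows. The table contents, the use of documentation-range IP addresses and the longest-prefix matching rule are all assumptions made for this example; a real gateway may organize its table differently.

```python
# Hypothetical table: dialed-number prefix -> IP address of the destination gateway.
GATEWAY_TABLE = {
    "7495": "192.0.2.10",
    "7": "192.0.2.1",
    "1212": "198.51.100.20",
}


def find_gateway(dialed_number: str, table=GATEWAY_TABLE):
    """Return the gateway IP for the longest matching prefix, or None."""
    digits = "".join(ch for ch in dialed_number if ch.isdigit())
    for length in range(len(digits), 0, -1):
        ip = table.get(digits[:length])
        if ip is not None:
            return ip
    return None
```

A `None` result would be the point at which a gateway falls back to querying a centralized server.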

2. Communication function. The originating gateway establishes a connection with the destination gateway, exchanging information about connection parameters and device compatibility.

3. Digitization. Analog telephone signals are digitized by the gateway and usually converted into a 64 Kbit/s PCM (pulse code modulation) signal. This feature requires the gateway to support a variety of analog telephony interfaces.

In many cases, support for ISDN (Integrated Services Digital Network) and T1/E1 interfaces is also required. ISDN and T1/E1 interfaces already operate in PCM format, so A/D conversion is not required. An ISDN BRI interface has one or two PCM channels, T1 has up to 24 PCM channels, and E1 has up to 30 PCM channels. An ISDN PRI interface can have up to 24 or 30 PCM channels.

4. Demodulation. Since some gateways can accept only voice or only fax signals, the trunks leading to the voice or fax processing modules must be assigned in advance. More sophisticated gateways can handle both types of traffic, automatically determining whether a digitized signal is voice or fax and processing it accordingly. The fax signal is demodulated by the digital signal processor (DSP) back into a 2.4-14.4 Kbps digital stream, that is, into the form the data had inside the sending fax machine before it was modulated for the line (a fax machine outputs an analog signal). This demodulated signal is then placed into IP packets for transmission to the destination gateway (Figure 2).

The demodulated information is then converted back into an analog fax signal by the destination gateway for delivery to the fax machine.

Fax transmission can be carried over either UDP/IP or TCP/IP. UDP/IP, unlike TCP/IP, does not correct errors that occur during packet transmission.

5. Compression. Once a signal is determined to be voice, it is typically compressed by a signal processor using one of the compression/decompression (CODEC) techniques (Table 1) and placed into IP packets. It is important to ensure good speech quality and low delay when digitizing the signal.

Table 1. Speech compression methods

Compression method	Complexity	Quality	Delay
G.726, G.727 ADPCM 40, 32, 24, 16 Kbps	low (8 MIPS)	good (40 Kbps), poor (16 Kbps)	very low (10-17 ms)
G.729 CS-ACELP 8 Kbps	high (30 MIPS)	good	low
G.729A CS-ACELP 8 Kbps	moderate	average	low
G.723.1 MP-MLQ 6.3/5.3 Kbps	moderately high (20 MIPS)	good (6.3 Kbps), average (5.3 Kbps)	high
G.728 LD-CELP 16 Kbps	very high (40 MIPS)	good	low

The audio packet is transmitted as a UDP/IP packet rather than a TCP/IP packet to avoid the rather large delays that occur when retransmitting TCP/IP packets. If FEC (Forward Error Correction) mode is used, a corrupted or missing audio packet can be reconstructed based on the data from the previous audio packet. If FEC is not used, the corrupted packet is simply discarded and the gateway uses the previous good packet. This mechanism works unnoticed by the user in case of low percentage of packet distortion/loss (< 5%).
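The fallback behavior without FEC, where the gateway reuses the previous good packet, can be sketched like this. The function names and the dictionary-based receive buffer are hypothetical and chosen only for illustration.

```python
def playout(received: dict, total: int, silence: bytes = b""):
    """Build the playout sequence for packets 0..total-1.

    `received` maps sequence numbers to payloads that arrived intact.
    A missing packet is replaced by the previous good payload, or by
    `silence` if nothing has arrived yet.
    """
    output = []
    last_good = silence
    for seq in range(total):
        if seq in received:
            last_good = received[seq]
        output.append(last_good)
    return output


def loss_ratio(received: dict, total: int) -> float:
    """Fraction of packets that were lost or discarded."""
    return 1 - len(received) / total
```

With packets 0, 1 and 3 received out of four, packet 2 is filled in with the payload of packet 1. Such repetition is only unobtrusive at low loss ratios, in line with the figure of under 5% given above.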

The bit rate produced by the codec does not include the addressing and control information of the IP packet (the header) (Fig. 3). The header typically adds about 7 Kbps; if the IP router compresses headers, the overhead falls to 2-3 Kbps.
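How header overhead depends on the packet rate can be estimated with a short calculation. The sketch assumes the usual uncompressed header sizes (20 bytes for IPv4 without options, 8 for UDP, 12 for the fixed RTP header); the packet interval is a free parameter, so the results will only match the figures quoted above for a particular packetization setting.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
UDP_HEADER = 8    # bytes
RTP_HEADER = 12   # bytes, fixed RTP header


def header_overhead_kbps(packet_interval_ms: float,
                         header_bytes: int = IPV4_HEADER + UDP_HEADER + RTP_HEADER) -> float:
    """Bandwidth consumed by packet headers alone, in Kbps."""
    packets_per_second = 1000 / packet_interval_ms
    return header_bytes * 8 * packets_per_second / 1000
```

For instance, one packet every 20 ms costs 16 Kbps in headers, and one every 40 ms costs 8 Kbps. Passing a smaller `header_bytes` value shows the effect of header compression.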

The complexity of a CODEC's implementation determines the required signal processor power, measured in millions of instructions per second (MIPS), to process the voice signal, excluding echo cancellation and silence suppression functions.

6. Decompression/demodulation. While performing the steps above, the gateway simultaneously receives packets from other IP gateways and decompresses them into a form understood by the corresponding analog telephone, ISDN, or T1/E1 devices. The gateway also converts received fax data back into an analog fax signal and passes it to the appropriate telephone interface.

In addition, the gateway can perform the functions of matching the interfaces of the call initiator and the call recipient.

IP speech quality

To ensure high speech quality, the VoIP gateway must use a codec with good speech quality and low latency. In addition, there are several additional technologies needed to ensure good speech quality: two of them are packet prioritization and echo cancellation. Echo cancellation is a function of the signal processor, packet priority system is a function of the router and gateway.

When a two-wire telephone cable is connected to a four-wire PBX or central office (CO) telco interface, a special electrical connection called a hybrid circuit is used to match the two-wire and four-wire connections. Although hybrid circuits are very efficient at performing matching functions, a small percentage of the telephone signal energy is not converted but rather reflected back to the caller. This signal is called an "echo signal".

If the caller is near the PBX or central switch, the echo returns quickly enough to be inaudible to humans. However, if the delay is more than 10 ms, the caller may hear a reflected signal. To prevent echo, gateway vendors include special code in signal processors that listen for the echo and remove it from the audio signal. Echo cancellation is especially important for gateway providers because latency on an IP network can easily exceed 40-50 milliseconds, so the echo will be clearly felt at the near end. Compensating for the echo coming from the far end of the line can have a significant impact on signal quality.

The main sources of speech quality degradation are network latency and packet jitter. Network latency is the average time it takes for a packet to travel across the network. Jitter is the deviation from this average packet transmission time. Both parameters are important for determining speech quality.
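Following these definitions, both quantities can be computed from a list of measured one-way delays. Taking jitter as the mean absolute deviation from the average is an assumption made for this sketch; RTP implementations, for example, use a smoothed interarrival estimate instead.

```python
from statistics import mean


def latency_and_jitter(delays_ms):
    """Return (average delay, mean absolute deviation from it) in milliseconds."""
    average = mean(delays_ms)
    jitter = mean(abs(d - average) for d in delays_ms)
    return average, jitter
```

For delays of 100, 120, 80 and 100 ms, the average latency is 100 ms and the jitter is 10 ms.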

When the end-to-end transmission time (the total time, including codec processing) exceeds 150 ms, as it often does, communication between two subscribers increasingly resembles half-duplex communication, in which each speaker must pause before replying. If the pauses are poorly observed, the speech of one party runs into the speech of the other.

One of the main means of combating network congestion should be to ensure Quality of Service (QoS).

What does QoS mean? QoS means dynamically providing guaranteed bandwidth to various applications and transmitting data according to user-defined requirements. There is still no universally accepted interpretation of the term. Most often, QoS refers to one of the following: setting traffic priorities without bandwidth guarantees; providing a fixed bandwidth between two given network nodes over permanent or switched virtual circuits; or a total bandwidth guaranteed by an Internet service provider.

Good voice quality over an IP network depends more on low packet jitter than on low network latency. Jitter is kept low by intelligent routers that can manage the priority of voice packets in the IP network. The router is configured to recognize IP voice packets and place them ahead of data packets waiting to be transmitted. Voice packet prioritization is especially important on wide area links with speeds from 56 to 512 Kbps. At T1/E1 line speeds it may not be necessary.
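The queuing behavior described here, where voice packets are placed ahead of waiting data packets, can be sketched with a strict-priority queue. The class and constant names are hypothetical, and real routers offer more elaborate scheduling disciplines.

```python
import heapq
from itertools import count

VOICE, DATA = 0, 1  # lower value is served first


class PriorityScheduler:
    """Strict-priority output queue: voice first, FIFO within each class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def enqueue(self, kind: int, packet) -> None:
        heapq.heappush(self._heap, (kind, next(self._arrival), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

If data packets d1 and d2 are already queued when voice packet v1 arrives, v1 is still transmitted first.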

Thus, at present, the required quality of service is provided mainly by means of traffic priority control. Note that in IP networks more complex quality management procedures are possible.

IP packet segmentation is another important VoIP latency control mechanism to ensure that a very long data packet does not delay a voice packet leaving the router. This is achieved by configuring the router to segment all outgoing data packets according to the speed of the communication network. The combination of a voice/fax priority system and packet segmentation mechanisms creates good prerequisites for building a VoIP network.
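The reason segmentation is tied to link speed becomes clear from the time a single packet occupies the line. The following sketch computes that time and derives a fragment size from it; the 10 ms target delay is an assumed value for illustration, and Kbps is taken as 1000 bits per second.

```python
def serialization_delay_ms(size_bytes: int, link_kbps: float) -> float:
    """Time needed to clock a packet onto the link, in milliseconds."""
    return size_bytes * 8 / link_kbps


def max_fragment_bytes(link_kbps: float, target_delay_ms: float = 10) -> int:
    """Largest fragment that occupies the link no longer than the target."""
    return int(link_kbps * target_delay_ms / 8)


def fragment(packet: bytes, size: int):
    """Split a data packet into fragments of at most `size` bytes."""
    return [packet[i:i + size] for i in range(0, len(packet), size)]
```

On a 64 Kbps link, a 1500-byte data packet blocks the line for 187.5 ms. Limiting fragments to 80 bytes keeps any single wait at about 10 ms, at the cost of sending 19 fragments instead of one packet.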

Another technology used by some gateways to ensure good speech quality is forward error correction (FEC).

Bandwidth Management

As already noted, the second important problem in implementing speech transmission technologies over an IP network is minimizing the used bandwidth of the communication channel. The mechanisms of compression and pause suppression play an important role here. Mechanisms that use silence suppression technology detect periods of silence between subscribers during a communication session or fax transmission and stop sending IP packets during these periods.
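A very simple form of silence suppression can be sketched with an energy threshold. The threshold value and the function names are assumptions for illustration; practical detectors are more elaborate than this.

```python
def frame_energy(frame) -> float:
    """Mean squared amplitude of one frame of samples in [-1, 1]."""
    return sum(s * s for s in frame) / len(frame)


def suppress_silence(frames, threshold: float = 1e-4):
    """Keep only frames loud enough to be worth sending.

    Returns (index, frame) pairs so the receiver can tell where the
    suppressed gaps were.
    """
    return [(i, f) for i, f in enumerate(frames)
            if frame_energy(f) >= threshold]
```

Frames that fall below the threshold are simply not packetized, so no IP packets are sent during the pause.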

The desire for more efficient use of bandwidth is driving the development of speech compression mechanisms. A standard PCM voice signal, as noted, requires a bandwidth allocation of 64 Kbps (ITU-T Recommendation G.711), which is actually too much.

One of the long-established speech compression algorithms is ADPCM (Adaptive Differential Pulse Code Modulation; the first ITU-T ADPCM standard, G.721, was adopted in 1984 and later superseded by G.726). This algorithm provides almost the same quality of speech reproduction as PCM, yet requires a bandwidth of only 32 Kbps. The method encodes not the signal amplitude itself but its change relative to the previous value, so fewer bits are needed. In ADPCM, a change in signal level is encoded as a four-bit number, while the sampling rate remains unchanged.
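The idea of encoding the change rather than the amplitude can be illustrated with plain differential PCM using a fixed step size. This is a deliberate simplification: real ADPCM also adapts the step size and uses a better predictor, so the sketch below is not an implementation of G.726. The function names and the step value are hypothetical.

```python
def dpcm_encode(samples, step: int = 16):
    """Encode each sample as a 4-bit signed difference (-8..7) from the prediction."""
    codes = []
    predicted = 0
    for sample in samples:
        code = max(-8, min(7, round((sample - predicted) / step)))
        codes.append(code)
        predicted += code * step  # track what the decoder will reconstruct
    return codes


def dpcm_decode(codes, step: int = 16):
    """Rebuild the signal by accumulating the decoded differences."""
    value = 0
    output = []
    for code in codes:
        value += code * step
        output.append(value)
    return output
```

As long as the signal changes slowly enough for the 4-bit range, each reconstructed sample stays within half a step of the original. A sudden jump larger than the range can only be followed over several samples.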

All encoding methods that rely on certain assumptions about the waveform are unsuitable for transmitting signals with sudden changes in amplitude. This is the type of signal generated by modems or fax machines, so equipment that supports compression must automatically recognize signals from fax machines and modems and process them differently from voice traffic.

Many coding methods originate from linear predictive coding (LPC). LPC takes a sequence of digital amplitude values as its input signal, but the encoding is applied not to individual values but to blocks of them. For each block, characteristic parameters are calculated: frequency, amplitude and a number of others. It is these parameters that are transmitted over the network. This approach to speech coding, firstly, raises the requirements for the computing power of the specialized processors used to process the signal and, secondly, increases the transmission delay, since coding is applied not to individual values but to a set of them, which must be accumulated in a buffer before the conversion begins. It is important that the delay in speech transmission is associated not only with the need to process the digital signal (this delay can be reduced by increasing processor power) but is also inherent in the compression method itself. This method achieves very high compression ratios, corresponding to a bandwidth of 2.4 or 4.8 Kbps, but the sound quality suffers greatly. Therefore, it is not used in commercial applications and is used mainly for official communications.

More complex speech compression methods are based on the use of LPC in combination with waveform coding elements. These algorithms use feedback coding, where code optimization occurs during signal transmission. Having encoded the signal, the processor tries to restore its shape and compares the result with the original signal, after which it begins to vary the encoding parameters to achieve the best match. Having achieved a match, the equipment transmits the resulting code over the communication lines; at the opposite end the audio signal is restored. It is clear that this method requires even more serious computing power.

One of the most common varieties of the described encoding method is LD-CELP (Low-Delay Code-Excited Linear Prediction). This method achieves satisfactory playback quality at a bandwidth of 16 Kbps; it was standardized by the International Telecommunication Union (ITU) in 1992 as the G.728 speech coding algorithm. The algorithm is applied to the sequence of samples produced by analog-to-digital conversion of the voice signal with 16-bit resolution. Five consecutive samples are encoded in one 10-bit block, which gives 16 Kbps. This method requires a lot of computing power: in particular, a direct implementation of G.728 requires a processor with a speed of 44 MIPS.
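The bit rates quoted above follow from simple arithmetic, sketched here in Python. The 8 kHz sampling rate is an assumption (it is the standard telephony rate, consistent with 64 Kbps PCM at 8 bits per sample); the function name is illustrative.

```python
def codec_bitrate_kbps(sample_rate_hz, samples_per_block, bits_per_block):
    """Bit rate when every block of samples is coded into a fixed number of bits."""
    blocks_per_second = sample_rate_hz / samples_per_block
    return blocks_per_second * bits_per_block / 1000


SAMPLE_RATE_HZ = 8000  # assumption: standard telephony sampling rate

pcm_kbps = codec_bitrate_kbps(SAMPLE_RATE_HZ, 1, 8)       # 8 bits per sample
adpcm_kbps = codec_bitrate_kbps(SAMPLE_RATE_HZ, 1, 4)     # 4 bits per sample
ld_celp_kbps = codec_bitrate_kbps(SAMPLE_RATE_HZ, 5, 10)  # 10 bits per 5 samples
```

Under these assumptions the three values come out as 64, 32 and 16 Kbps respectively.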

In March 1995, the ITU adopted a new standard, G.723, which is intended for speech compression in videoconferencing over telephone networks. This standard is part of the more general H.324 standard, which describes an approach to organizing such videoconferencing. The purpose of its adoption is to enable videoconferencing using conventional modems. The basis of G.723 is the MP-MLQ (Multipulse Maximum Likelihood Quantization) speech compression method. It achieves very significant speech compression while maintaining fairly high sound quality. The method is based on the optimization procedure described above; with the help of various improvements, speech can be compressed to 4.8, 6.4, 7.2 and 8.0 Kbps. The structure of the algorithm allows the degree of voice compression to be changed during transmission. The delay introduced by coding does not exceed 20 ms.

While speech compression mechanisms increase bandwidth efficiency, they can also lead to decreased speech quality and increased latency. Some basic speech compression algorithms and the delays they introduce are given in Table 1.

Speech quality degradation is quantified in quantization distortion units (QDU). One QDU corresponds to the quality degradation caused by digitizing with the standard PCM procedure; QDU values for the main compression methods are given in Table 2. Additional speech processing leads to further loss of quality. According to ITU-T recommendations, for international calls the QDU value should not exceed 14. Note that transmission of a conversation over international trunk channels degrades speech quality, as a rule, by 4 QDU.

Table 2. Speech quality degradation when using various compression algorithms

Compression method	QDU
ADPCM 32 Kbps	3.5
ADPCM 24 Kbps	7
LD-CELP 16 Kbps	3.5
CS-CELP 8 Kbps	3.5

Since 4 of the 14 permitted QDU are consumed on the international trunk section, no more than 5 QDU may be lost in each of the two national networks involved in a call. Because a single compression stage already costs about 3.5 QDU (Table 2), for high-quality speech transmission it is advisable to use the compression/decompression procedure only once in the network. In some countries this is a mandatory regulatory requirement for networks connected to public networks.
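The QDU budget can be checked with a short Python sketch. The limits of 14 and 4 QDU and the per-codec values come from the text and Table 2; the split of the remainder between two national networks (one at each end of the call) is an assumption, and the names are illustrative.

```python
import math

TOTAL_QDU_LIMIT = 14          # limit for an international call (from the text)
INTERNATIONAL_TRUNK_QDU = 4   # typical loss on international trunks (from the text)
NATIONAL_NETWORKS = 2         # assumption: one national network at each end


def national_budget_qdu():
    """QDU that each national network may consume."""
    return (TOTAL_QDU_LIMIT - INTERNATIONAL_TRUNK_QDU) / NATIONAL_NETWORKS


def max_compression_stages(qdu_per_stage):
    """How many compression/decompression stages fit into the national budget."""
    return math.floor(national_budget_qdu() / qdu_per_stage)


stages_ld_celp = max_compression_stages(3.5)  # LD-CELP 16 Kbps
stages_adpcm24 = max_compression_stages(7)    # ADPCM 24 Kbps
```

With 3.5 QDU per stage only one compression stage fits into the 5 QDU budget, while a 7 QDU codec does not fit at all.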

Silence suppression is an important function of equipment that provides voice transmission over IP networks. The essence of pause suppression technology is to determine the difference between moments of active speech and silence during the connection period. As a result of using this technology, packets are generated only during active conversations. Since pauses account for up to 60% of the time in a typical telephone conversation, a twofold optimization of the amount of data transmitted over the line is possible. Combining speech compression technology and suppressing speech pauses in switches leads to an eight-fold reduction in the data flow in the channel.
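The eight-fold figure can be reproduced with a few lines of Python. This is only a rough estimate: the 16 Kbps codec rate and the 50% speech activity are assumptions chosen to match the four-fold compression and the twofold gain mentioned in the text.

```python
PCM_RATE_KBPS = 64  # uncompressed voice channel (G.711)


def average_rate_kbps(codec_rate_kbps, speech_fraction):
    """Average rate when packets are sent only while someone is speaking."""
    return codec_rate_kbps * speech_fraction


# Assumptions: a 16 Kbps codec and speech present about half of the time.
average = average_rate_kbps(16, 0.5)
reduction = PCM_RATE_KBPS / average
```

Here the average rate is 8 Kbps, an eight-fold reduction compared with the 64 Kbps channel.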

To be continued

ComputerPress 5'1999

Alexey Sheremetyev, Alexander Nepomnyashchiy, Alexey Lyubimov

Traditional approaches to building telecommunication networks for voice and data transmission do not imply a close relationship between these processes. Managing these two different systems naturally involves significant investment in equipment and personnel. In recent years, several fundamental options for implementing a unified network infrastructure for transmitting heterogeneous traffic have been proposed to solve this problem.

In the 1980s, geographically distributed corporate networks were built on the basis of dedicated E1/T1 channels. Multiplexers were used to compress the channels; they became perhaps the very first hardware platforms for integrating voice and data in private and public networks and remain the most common way of transmitting heterogeneous traffic over a single network. At the same time, the principles of building telephone networks did not change radically. In such networks, telephone connections are established along predefined routes (main and alternative) and suffer from many limitations: the high cost of maintaining a large number of routing tables for each PBX and of reconfiguring them when telephone flows change, inefficient use of bandwidth, deterioration in speech quality when compression mechanisms are used in networks with many PBXs, and so on.

In recent years, devices have been developed that provide voice transmission over networks originally intended for data transmission, such as frame relay (FR) and IP networks. The driving force behind this is the desire to reduce the cost of using leased communication lines and increase the efficiency of using dedicated corporate communications.

A new impetus for the development of telephone networks was given by the emergence of mechanisms that provide voice transmission over ATM networks, which provide the ability to connect PBXs to ATM switches capable of switching both data and voice streams.

Initially focused on data transmission, the frame relay protocol quickly became the basis for the transmission of mixed traffic: computer network data, voice and fax messages, while providing high quality and reasonable cost solutions. The widespread use of frame relay is explained by low encapsulation losses (3 - 4%), the ability to allocate guaranteed bandwidth (CIR - Committed Information Rate), as well as predictability and minimal delays in information transfer. Using frame relay allows you to build networks with integration not only on dedicated channels, but also on the basis of existing global networks. Voice transmission over frame relay is built, as a rule, on the basis of proprietary protocols, for example, Voice Relay from Motorola.

It is known that transmission of one voice channel requires a bandwidth of 64 kbit/s (STC). However, this value can be reduced using voice compression mechanisms and pause suppression technology. Algorithms implemented in specialized digital signal processors (DSPs) compress the digital voice signal to 32, 16, 8 kbit/s or less. As a rule, a telephone conversation consists of only 40 - 50% speech, so if pauses are detected and silence is not transmitted along the communication lines, and the freed-up time is used to transmit data, even greater bandwidth savings can be achieved.

To ensure the required quality of voice transmission, ITU-T recommendations require that delays in voice transmission over international communication lines do not exceed 150 ms. This can be achieved, firstly, by using a priority system, secondly, by fragmenting data packets, and thirdly, by reducing the number of compression/decompression procedures. Packets containing a voice signal must be transmitted ahead of data packets, and variable-length network traffic, which causes significant pauses in the reconstructed speech and poor sound quality, must be divided into small fixed-length packets so that the transmission time of each packet is from 5 to 10 ms. The permissible number of compression/decompression procedures depends on the compression algorithm used, the length of the lines and other factors. As will be shown below, for high-quality speech transmission it is advisable to use the compression/decompression procedure only once in the network.
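The relationship between link speed, packet length and transmission time can be sketched in Python. The 64 kbit/s link and the 1500-byte data packet are assumptions used only as an example; the function names are illustrative.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time needed to clock one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000


def max_fragment_bytes(link_bps, target_delay_ms):
    """Largest fragment whose transmission time does not exceed the target."""
    return int(link_bps * target_delay_ms / 1000 / 8)


LINK_BPS = 64_000  # assumption: a 64 kbit/s access channel

long_packet_delay = serialization_delay_ms(1500, LINK_BPS)  # unfragmented data packet
fragment_10ms = max_fragment_bytes(LINK_BPS, 10)
fragment_5ms = max_fragment_bytes(LINK_BPS, 5)
```

On such a link an unfragmented 1500-byte packet would hold up the voice stream for 187.5 ms, more than the whole 150 ms budget, whereas fragments of 40 to 80 bytes keep the transmission time within 5 to 10 ms.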

For successful voice transmission over frame relay, the problem of correct congestion handling must be solved. When data packets are transmitted over these networks, no acknowledgment of packet receipt is sent, and data integrity is checked only by higher-level protocols. Since the reliability of packet transmission in modern networks is quite high, this approach does not lead to significant losses, while significantly reducing the overhead of data transfer. When voice is transmitted, however, packet loss disrupts voice reproduction at the receiving end, so every effort should be made to prevent it. One of the situations that can lead to packet loss is congestion, which occurs when a particular switch is unable to pass all the traffic arriving at it onto its outgoing links. When congestion occurs, the switch sends a special message to all access devices from which the traffic that caused the congestion originated. The response to this message should be a reduction in the rate at which data is sent into the network, but not all access devices have this ability. Effective congestion handling is absolutely necessary for correct voice transmission; otherwise it is difficult to expect that all forwarded voice information will reach its destination when congestion occurs.

Let's see how the above problems are solved by Motorola in the already mentioned Voice Relay technology.

When the MPR 6520/6560 router detects a call on one of its voice ports, it does two things. Firstly, it fixes the length of all packets transmitted over the called FR channel, both voice and data. Secondly, to ensure high-quality voice transmission, it gives voice packets higher transmission priority, and no more than two data packets can be transmitted between two voice packets.
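The rule "no more than two data packets between two voice packets" can be illustrated with a small Python sketch. It is a simplified static model of two queues and is not Motorola's actual algorithm; the function name and the packet labels are hypothetical.

```python
from collections import deque


def interleave(voice_packets, data_packets, max_data_between=2):
    """Send voice first and allow a limited number of data packets after each voice packet."""
    voice_q, data_q = deque(voice_packets), deque(data_packets)
    sent = []
    while voice_q or data_q:
        if voice_q:
            # Voice has priority: it always goes out before waiting data.
            sent.append(voice_q.popleft())
            for _ in range(max_data_between):
                if not data_q:
                    break
                sent.append(data_q.popleft())
        else:
            # No voice is waiting, so the remaining data drains freely.
            sent.extend(data_q)
            data_q.clear()
    return sent


order = interleave(["V1", "V2", "V3"], ["D1", "D2", "D3", "D4", "D5"])
```

In this model the resulting order is V1 D1 D2 V2 D3 D4 V3 D5, so a voice packet never waits behind more than two data packets.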

Motorola equipment uses a mechanism called smoothing delay. If the source sends the packet sequence VDDV→ (where V is a voice packet, D is a data packet, and the arrow indicates the order of packet transmission), then on the receiver side the first voice packet is “held” to even out the delay, and the packet sequence becomes VVDD→. Motorola equipment can also recover lost packets. As already noted, frame relay is a protocol in which data is not verified for correctness during transmission; verification is carried out by upper-level protocols. For voice packets no check is performed (in order to reduce delays), but if a bad packet is received or a packet is lost, the MPR 6520/6560 interpolates the lost packet, which is possible because of the analog nature of voice.
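One simple way to interpolate a lost voice packet is to average the neighbouring packets sample by sample. The Python sketch below illustrates this idea only; the text does not describe how the MPR 6520/6560 actually interpolates, and the function name and data layout are assumptions.

```python
def conceal_lost_frames(frames):
    """Replace lost frames (marked as None) using their neighbours.

    frames is a list of equal-length lists of samples.
    """
    repaired = list(frames)
    for i, frame in enumerate(frames):
        if frame is not None:
            continue
        prev_frame = repaired[i - 1] if i > 0 else None
        next_frame = frames[i + 1] if i + 1 < len(frames) else None
        if prev_frame is not None and next_frame is not None:
            # Both neighbours are known: take the sample-wise average.
            repaired[i] = [(a + b) / 2 for a, b in zip(prev_frame, next_frame)]
        else:
            # Only one neighbour is known: repeat it (stays None if there is none).
            neighbour = prev_frame if prev_frame is not None else next_frame
            repaired[i] = list(neighbour) if neighbour is not None else None
    return repaired


received = [[0, 0], None, [4, 8]]   # the middle frame was lost
restored = conceal_lost_frames(received)
```

Because speech changes smoothly from one short frame to the next, such an estimate is usually inaudible when losses are rare.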

Voice Relay incorporates an echo cancellation mechanism and uses pauses in conversations to transmit data. When the call ends, the MPR 6520/6560 waits for 30 seconds to see whether the conversation starts again and, if it does not, unfreezes the packet length, returning to variable length. Accordingly, all functions for transmitting voice information are disabled, i.e., only data is transmitted.

Despite all these difficulties, the technology of voice transmission over frame relay networks has many supporters and is widely used for communication between the headquarters of a company and its branches.

However, FR networks are limited by T1/E1 bandwidth, which may not be sufficient to carry rapidly growing volumes of voice and data traffic.

IP telephony

IP is becoming the standard protocol for corporate backbone networks when building intranets and extranets, as well as when connecting businesses to the Internet. VoIP (voice over IP) solutions are becoming more and more plug and play, which makes them easy to install and simplifies user training. However, if you are using a software solution, the appropriate software must be installed on each PC with which you want to communicate; it is also necessary to train each user to use the VoIP program. A more sophisticated VoIP solution is a gateway installed on a single server (personal computer) in the central office and in each branch. This server can also perform other functions. When a gateway is used, the VoIP function is transparent to the user, who works with a regular telephone and fax machine. The gateway performs the following functions.

1. Search function. When an outbound IP gateway places a phone call over an IP network, it takes the caller's number and converts it to the destination gateway's IP address based on table data in the outbound gateway or centralized server. Viewing a table in an outgoing gateway often takes less time than viewing it in a centralized server: connection time is 1 - 2 seconds, not 4 - 5, as in the second case.

2. Communication function. The originating gateway establishes a connection with the destination gateway, exchanging information about connection parameters and device compatibility.

3. Digitization function. Analog telephone signals entering the gateway are digitized by the gateway and converted, typically to a 64 kbit/s PCM signal. This feature requires the gateway to support a variety of telephony interfaces.

4. Demodulation function. Some gateways can only accept voice or fax, but not both, so the trunk links to the voice or fax processing modules must be predefined. More sophisticated gateways can handle both types of data, automatically determining whether a digital signal is audio or fax and processing the signal according to its type. If the signal is a facsimile, it is demodulated by a signal processor (DSP) back into a digital format of 2.4 - 14.4 kbit/s. This demodulated signal is then placed into IP packets for transmission to the destination gateway. The demodulated information is then remodulated back into an analog fax signal by the destination gateway for delivery to the fax machine.

Fax transmission can be carried out over either UDP/IP or TCP/IP. UDP/IP, unlike TCP/IP, does not retransmit lost or damaged packets. It would seem that UDP/IP is preferable, since a damaged fax packet would affect only one line of the fax. However, if packet loss occurs during the negotiation phase, the fax transmission may be terminated. With TCP/IP, the host software hides the retransmission of fax data packets, so the document is not affected.

5. Compression function. Once a signal is determined to be voice, it is typically compressed by a signal processor (DSP) using one of the compression/decompression techniques (Table 1) and placed into IP packets. It is important to ensure good speech quality and low delay during digitization.

Table 1. Speech compression methods

Audio packets are transmitted as UDP/IP packets rather than TCP/IP packets to avoid the rather large delays that occur when the latter are retransmitted. If FEC (Forward Error Correction) mode is used, a corrupted or missing audio packet can be reconstructed based on the data from the previous audio packet. If FEC is not used, the malformed packet is simply discarded and the gateway uses the previous packet. This mechanism goes unnoticed by the user when the percentage of packet distortion/loss is low (<5%); otherwise the well-practiced technique of audio correction comes into play: “Huh? I can't hear you... Louder...”.

The data digitized by the codec does not include the address and control information of the IP packet, or “header,” which typically adds about 7 kbps, or 2 to 3 kbps if the IP router compresses the header separately.

Processing complexity is the DSP processing power needed to process the voice signal, excluding echo cancellation and silence suppression functions, measured in millions of instructions per second (MIPS). Lower complexity means lower DSP processing overhead.

6. Decompression/remodulation function. The gateway, while executing steps 1 through 4 above, at the same time receives packets from other IP gateways and decompresses them back into a form that can be understood by the corresponding analog telephone, integrated services digital network (ISDN), or T1/E1 devices. It can also remodulate a digital fax back to its original form and pass it to the appropriate telephone interface.

In addition, the gateway can perform the functions of matching the interfaces of the call initiator and the call recipient, making the necessary transformations.
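The search function (step 1) amounts to a table lookup from the dialed number to the IP address of the destination gateway. A minimal Python sketch with a longest-prefix match is shown below; the prefixes, the addresses (taken from the documentation range 192.0.2.0/24) and the function name are all hypothetical, and a real gateway may organize its table differently.

```python
# Hypothetical table: dialing prefix -> IP address of the destination gateway.
ROUTING_TABLE = {
    "7495": "192.0.2.10",
    "7": "192.0.2.20",
    "1212": "192.0.2.30",
}


def find_gateway(dialed_number, table=ROUTING_TABLE):
    """Return the gateway for the longest matching prefix, or None if there is none."""
    digits = "".join(ch for ch in dialed_number if ch.isdigit())
    # Try the longest prefix first so that specific routes win over general ones.
    for length in range(len(digits), 0, -1):
        gateway = table.get(digits[:length])
        if gateway is not None:
            return gateway
    return None


moscow = find_gateway("+7 495 123-45-67")
other_city = find_gateway("+7 812 123-45-67")
unknown = find_gateway("+44 20 1234 5678")
```

Keeping the table in the originating gateway avoids a round trip to a central server, which is consistent with the shorter connection time mentioned in step 1.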

The voice processing performed by a VoIP gateway, as described above, differs from the voice over frame relay mechanism used by voice/fax routers and frame relay access devices (FRADs). A VoIP gateway is a local area network device that can reduce costs when transmitting voice and faxes over a regional communications network, with connections across that network made by a router rather than by the gateway. A voice/fax router or FRAD, by contrast, is a regional network device that connects the local area network to the regional network, which largely determines its functions, increases the complexity of its implementation, and places higher demands on its fault tolerance and manageability compared to a gateway.

When transmitting voice over an IP network, as in the case of frame relay, the problem of delay of voice packets arises, and unlike frame relay, unpredictable congestion and delays that occur on the Internet, radically reducing the quality of voice transmission, cannot be controlled by the user.

At the same time, users should not forget that the technology for voice transmission over IP networks has not yet become established. If the call initiator and the recipient do not have identical voice over IP software installed, then they most likely will not be able to talk.

In order for software from different vendors to be interoperable, support for standards such as H.323 is required.

Users hope that mass adoption of the Internet will help solve the problem of bandwidth and thereby avoid delays in voice transmission. However, it is impossible to control this process, so manufacturers are working on other problems.

Much hope is placed on the Resource Reservation Protocol (RSVP), which makes it possible to reserve Internet bandwidth end-to-end for latency-sensitive traffic. RSVP was designed to carry multimedia traffic over IP networks, but it can also be used to carry voice. According to this protocol, routers exchange signals, requesting a clear path through the network. The main disadvantage of this approach is that there may be so many requests that routers will not be able to service them all.

ATM has all the prerequisites to become a single highway for various types of traffic. This technology allows flexible use of network bandwidth. Some applications may be given higher priority than others. With ATM, available bandwidth can be dynamically shared among different applications without users even knowing how they exchange phone calls or how data gets from one place to another.

Why do many people prefer ATM over IP? One of the reasons for this is the very low latency (about 20 ms) when transmitting voice over these networks. When transmitting voice over IP, the delay increases to 300 ms, as a result of which the output voice sounds different than in the case of regular telephone communication.

Since 1997, the ATM Forum has ratified six new specifications designed to ensure interoperability between products from different manufacturers and to ease the transition of end users to ATM. Among them are Multiprotocol over ATM (MPOA), LAN Emulation (LANE) 2.0, inverse multiplexing for ATM, dynamic bandwidth utilization, and the frame-based user-network interface (FUNI) 2.0.

One of the most anticipated was the Voice and Telephony over ATM (VTOA) to the Desktop specification, which describes dial-up voice services over the ATM network for regular telephones. Just as MPOA and LANE leverage existing protocols and services, VTOA supports traditional voice traffic over ATM protocols.

VTOA provides access to private and public network voice services and has broadband terminal support capabilities. These terminals can be connected to any telephone on the network that uses G.711 pulse code modulation to encode 64 kbps voice channels. G.711 is adopted by ITU-T and is currently the most common desktop voice service.

To support features such as alerts and call progress messages, the VTOA to the Desktop specification requires the use of either the User-Network Interface (UNI) version 4.0, which defines the connection between an end user or end station and the local switch, or the Private Network-to-Network Interface (PNNI) 1.0, which defines the interaction between two switches.

When there are high bandwidth efficiency requirements, voice traffic is best carried using a Variable Bit Rate (VBR) service.

Where bandwidth efficiency is not a dominant factor, using a CBR service will be a cheaper and simpler solution.

Adaptation mechanisms - ATM Adaptation Layer, AAL (Table 2) - provide transmission of various types of traffic, including voice between PBXs. Thus, the AAL0 level supports non-standard voice adaptation algorithms, the AAL1 level supports voice transmission in CBR mode (an appropriate bandwidth is allocated). AAL1 cells in SDT and UDT formats provide the ability to transmit time stamp information over the network, partially filled data cells to reduce latency, and a cell counter value to detect cell losses. Operating at the AAL5 level is the cheapest way to bring voice over ATM to the workplace using CBR mode. However, it is not suitable for networks using AAL1.

Table 2. ATM adaptation mechanisms

In general, the AAL adaptation mechanism accepts packets from upper-layer protocols, splits them into 48-byte segments, and generates the ATM cell payload field.
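The segmentation step can be sketched in Python as follows. This is a simplified illustration: the last segment is simply padded with zeros, whereas real adaptation layers add their own headers or trailers (which are omitted here); the 5-byte cell header is the standard ATM value, and the function name is an assumption.

```python
CELL_PAYLOAD_BYTES = 48  # payload field of an ATM cell
CELL_HEADER_BYTES = 5    # standard ATM cell header


def segment_into_payloads(packet):
    """Split an upper-layer packet into 48-byte cell payloads, padding the last one."""
    payloads = []
    for offset in range(0, len(packet), CELL_PAYLOAD_BYTES):
        chunk = packet[offset:offset + CELL_PAYLOAD_BYTES]
        # Simplification: pad with zero bytes up to the full payload size.
        payloads.append(chunk.ljust(CELL_PAYLOAD_BYTES, b"\x00"))
    return payloads


packet = bytes(range(100))  # hypothetical 100-byte upper-layer packet
payloads = segment_into_payloads(packet)
bytes_on_wire = len(payloads) * (CELL_PAYLOAD_BYTES + CELL_HEADER_BYTES)
```

A 100-byte packet thus occupies three cells, or 159 bytes on the line, which shows why short, partially filled cells trade bandwidth for lower latency.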

The desire for more efficient use of bandwidth is driving the development of speech compression mechanisms. A standard PCM signal for speech transmission requires, as already noted, a bandwidth allocation of 64 kbit/s (ITU-T recommendation G.711), which is clearly redundant.

One of the oldest speech compression algorithms is ADPCM (Adaptive Differential Pulse Code Modulation; the G.726 standard was adopted in 1984). This algorithm provides almost the same quality of speech reproduction as PCM, but requires a bandwidth of only 16 - 32 kbit/s. The method is based on the fact that sharp jumps in intensity are impossible in an analog signal carrying speech. Therefore, if you encode not the signal amplitude itself but its change compared to the previous value, you can get by with a smaller number of bits. In ADPCM, a change in signal level is encoded as a four-bit number, while the sampling rate of the signal amplitude remains unchanged.

Many coding methods originate from the LPC (Linear Predictive Coding) method. LPC takes a sequence of digital amplitude values as its input signal, but the encoding algorithm is applied not to individual values but to blocks of them. For each block, characteristic parameters are calculated: frequency, amplitude and a number of others. It is these parameters that are transmitted over the network. This approach to speech coding, firstly, raises the requirements for the computing power of the specialized processors used to process the signal and, secondly, increases the transmission delay, since coding is applied not to individual values but to a set of them, which must be accumulated in a buffer before the conversion begins. It is important that the delay in speech transmission is associated not only with the need to process the digital signal (this delay can be reduced by increasing processor power), but also directly with the nature of the compression method. This method achieves very high compression ratios, corresponding to a bandwidth of 2.4 or 4.8 kbit/s, but the sound quality suffers greatly. Therefore, it is not used in commercial applications and is used mainly for official communications.

More sophisticated speech compression methods rely on the use of LPC in combination with waveform coding elements. These algorithms use feedback coding, where code optimization occurs during signal transmission. Having encoded the signal, the processor tries to restore its shape and compares the result with the original signal, after which it begins to vary the encoding parameters, achieving the best match. Having achieved such a match, the equipment transmits the received code over communication lines; at the opposite end the audio signal is restored. It is clear that using this method requires even more serious computing power.

One of the most common varieties of the described encoding method is the LD-CELP (Low-Delay Code-Excited Linear Prediction) method. It allows you to achieve satisfactory playback quality at a bandwidth of 16 kbps. This method was standardized by the International Telecommunications Union (ITU) in 1992 as the G.728 speech coding algorithm. The algorithm is applied to a sequence of digits resulting from the analog-to-digital conversion of a voice signal with 16-bit resolution.

Five consecutive digital values are encoded in one 10-bit block - this gives the same 16 kbit/s. This method requires a lot of computing power; in particular, a straightforward implementation of G.728 requires a 44 MIPS processor.

In March 1995, the ITU adopted a new standard - G.723, which is supposed to be used in speech compression for organizing video conferencing over telephone networks. This standard is part of the more general H.324 standard, which describes an approach to organizing such video conferencing. The goal is to organize video conferences using regular modems. The basis of G.723 is the MP-MLQ (Multipulse Maximum Likelihood Quantization) speech compression method. It allows you to achieve very significant speech compression while maintaining a fairly high sound quality.

The method is based on the optimization procedure described above; with the help of various improvements, speech can be compressed to a level of 4.8; 6.4; 7.2 and 8.0 kbit/s. The structure of the algorithm allows, on the basis of software, to change the degree of voice compression during transmission. The delay introduced by coding does not exceed 20 ms.

While increasing the efficiency of bandwidth use, speech compression mechanisms can at the same time lead to deterioration in speech quality and increased delays. Some basic speech compression algorithms and the delays they introduce are given in Table 1.

Speech quality degradation is quantified in QDU (Quantization Distortion Units): 1 QDU corresponds to the quality degradation caused by digitizing with the standard PCM procedure; QDU values for the main compression methods are given in Table 3. Additional speech processing leads to further loss of quality. According to ITU-T recommendations, for international calls the QDU value should not exceed 14, and transmission of a conversation over international trunk channels degrades speech quality, as a rule, by 4 QDU.

Table 3. Deterioration in speech quality with the use of various compression algorithms

It follows that when a conversation is transmitted over national networks, no more than 5 QDU should be lost. For high-quality speech transmission it is therefore advisable to use the compression/decompression procedure only once in the network. In some countries, this is a regulatory requirement for enterprise networks connected to public networks.

Silence suppression is an important feature of ATM switches. The essence of pause suppression technology is to determine the difference between moments of active speech and silence during the connection period. As a result of using this technology, cells are generated only during active conversations. Since there is silence up to 60% of the time during a typical telephone conversation, there is a twofold optimization in the amount of data that must be transferred over the line. Combining speech compression technology and suppressing speech pauses in switches leads to a reduction in the data flow in the channel by up to eight times.

Newbridge Technologies

Upgraded MainStreet Series devices, such as the MainStreet 3600+, allow you to define different classes of service for voice traffic. They support the HCV (High Capacity Voice, 8/16 kbit/s), LD-CELP (16 kbit/s) and A-CELP (8 kbit/s) algorithms based on ITU standards. A variety of voice interfaces are supported: E&M (Type I, II, III, IV, V), LS/GS Subscriber (FXS), LS/GS Exchange (FXO), T1 D4 and ESF (Extended Superframe) formats, E1 CAS and CCS, R2D (E&M), MRD Channel Unit.

In addition, in accordance with the recently published Frame Relay Forum FRF.11 Implementation Agreement specification, they implement Voice over Frame Relay (VoFR) technology. This method allows you to compress voice traffic and split it into packets for transmission in variable rate mode (VBR).

Newbridge said the products will enable telecommunications companies to provide state-of-the-art telephony services while reducing network operating costs. If the user finds it necessary to change application drivers, the network configuration can be easily changed by downloading the appropriate software to provide a different class of service. Newbridge's comprehensive solutions make it possible to centrally manage the entire network from anywhere.

VoFR, developed by Newbridge Networks and its subsidiary Castleton Network Systems, reduces the cost of operating networks by reducing average bandwidth and using frame relay bandwidth to carry voice and data simultaneously. This method provides more efficient allocation of bandwidth in narrow-band edge networks, where voice and data traffic volumes vary widely and efficient use of available bandwidth is critical. In addition, VoFR packets can be easily transmitted over ATM highways as regular frame relay traffic. Newbridge's overall strategy involves using Castleton's AssuredVoice technology - voice over FR, ATM and IP networks - in its devices.

Newbridge Networks products also implement advanced Quality of Service (QoS) mechanisms defined by the Frame Relay Forum at the frame relay transport layer. These mechanisms give priority to latency-sensitive traffic (such as voice). They control the latency of packets and manage the transmission of voice traffic in conditions of network congestion, which guarantees the quality of audio transmitted over frame relay networks.
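The idea of giving priority to latency-sensitive traffic can be illustrated with a strict-priority queue. This is a minimal sketch of the general technique, not the Frame Relay Forum mechanism or Newbridge's implementation; the class names `VOICE` and `DATA` and the `PriorityScheduler` type are hypothetical.

```python
import heapq
from itertools import count

# Lower number means higher priority (assumed traffic classes).
VOICE, DATA = 0, 1


class PriorityScheduler:
    """Strict-priority frame queue: queued voice frames always leave before data."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that preserves FIFO order within a class

    def enqueue(self, traffic_class, frame):
        heapq.heappush(self._heap, (traffic_class, next(self._seq), frame))

    def dequeue(self):
        """Return the next frame to transmit, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, frame = heapq.heappop(self._heap)
        return frame


sched = PriorityScheduler()
sched.enqueue(DATA, "data-1")
sched.enqueue(VOICE, "voice-1")
sched.enqueue(DATA, "data-2")
sched.enqueue(VOICE, "voice-2")

order = [sched.dequeue() for _ in range(4)]
print(order)  # ['voice-1', 'voice-2', 'data-1', 'data-2']
```

Strict priority keeps voice delay low under congestion but can starve data traffic, which is why real equipment usually combines prioritization with rate limits or weighted scheduling.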

This year, the company introduced the 3608 MainStreet Packet Access Mux, which provides cost-effective, on-demand bandwidth for frame relay voice applications. This multiplexer allows you to combine high-quality telephone communication with the transmission of faxes and data from a remote office over a single frame relay connection. This device also features A-CELP voice compression technology.

Cisco Systems Equipment

Cisco Systems has announced a five-phase open systems and technology strategy to help users integrate data, voice and video end-to-end, from smaller access nodes to larger backbone nodes. This strategy uses Cisco IOS software to provide voice over IP, frame relay, and ATM in a multiservice network environment.

The third phase of the data, voice and video integration strategy focuses on gateways between different environments where diverse services and technologies coexist. These include gateways from network protocols to telephone exchange protocols and to PSTN and ISDN protocols, gateways from low-speed access to broadband backbone switching, and gateways from circuit-switched media to packet-switched (IP, FR) and cell-switched (ATM) media.

Some devices, such as the new Cisco MC3810 Multifunction Access Concentrator, use real-time variable bit rate (VBR-rt) ATM or frame relay to carry compressed voice. Statistical multiplexing and compression algorithms increase the efficiency of bandwidth use. Cisco 3600 and 2600 series routers have similar capabilities, but for IP networks.

Other products, such as the Circuit Emulation modules for the Cisco 7200 Router, LightStream 1010 ATM Switch, and Catalyst 5500 Campus Switch, use constant bit rate (CBR) service for voice over ATM networks. This approach allows ATM equipment from different suppliers to interoperate in a network, and allows systems that do not support PBX protocols to carry telephone traffic over a single enterprise network infrastructure.

The Cisco MC3810 Multifunction Access Concentrator, introduced by Cisco Systems, integrates LAN, synchronous data, video, voice, and fax communications for transport over public or private frame relay, ATM, or time-division multiplexing (TDM) networks.

The MC3810 device software supports ATM and frame relay. Voice, fax and data are transported over ATM using AAL5 (VBR).

Any private branch exchange (PBX) or telephone can be connected to the Cisco MC3810. It compresses voice down to 8 kbit/s using CS-ACELP, the algorithm standardized as G.729. The device uses Voice Activity Detection (VAD) to reduce voice traffic during pauses, which lowers the cost of operating the channel when it is paid for by usage.
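The principle behind VAD can be shown with a simple energy threshold. The sketch below is a generic illustration and not the algorithm used in the MC3810; the threshold of -40 dB, the 160-sample frame (20 ms at 8 kHz) and all function names are assumptions chosen for the example.

```python
import math

FULL_SCALE = 32768.0  # 16-bit PCM full-scale amplitude


def frame_energy_db(samples):
    """RMS level of a frame in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / FULL_SCALE)


def is_speech(samples, threshold_db=-40.0):
    """Classify a frame as speech if its level exceeds the threshold (assumed value)."""
    return frame_energy_db(samples) > threshold_db


def frames_to_send(frames, threshold_db=-40.0):
    """Keep only the frames classified as speech; silent frames are not sent."""
    return [f for f in frames if is_speech(f, threshold_db)]


# Synthetic 160-sample frames (20 ms at 8 kHz, an assumed frame size).
silence = [0] * 160
quiet_noise = [10, -10] * 80     # about -70 dB, below the threshold
speech = [8000, -8000] * 80      # about -12 dB, above the threshold

sent = frames_to_send([silence, speech, quiet_noise, speech])
print(len(sent))  # 2
```

Production detectors add hangover time and adaptive noise estimates so that word endings and soft consonants are not clipped; the fixed threshold here is only meant to show where the traffic saving comes from.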

The Cisco 3600 and 2600 series multifunctional modular routers are designed to connect remote small offices within corporate systems, as well as to connect such offices to global public networks.

The modular architecture of routers allows them to be adapted to the real conditions of customer networks. Wide Area Network (WAN) interface modules for the Cisco 1600, 2600, and 3600 families support multiple serial interfaces, ISDN BRI, and a suite of CSU/DSU link interfaces for primary and backup links. The voice module for Cisco 3600 and 2600 series routers allows for integrated transmission of voice, fax messages and data, and has an interface with existing telephones, fax machines, and office PBXs (FXO, FXS, E&M). The quality of the transmitted voice is ensured through the use of a compression mechanism and Cisco IOS QoS (Quality of Service) technologies, such as RSVP and the WFQ (Weighted Fair Queuing) mechanism. Each voice module supports two or four voice channels, as well as the H.323 standard and several compression protocols, including G.711 and G.729.
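The practical difference between the G.711 and G.729 codecs becomes clearer when the packet headers are counted. The sketch below estimates the IP-layer bandwidth of one call direction, assuming 20 ms packetization and uncompressed IPv4, UDP and RTP headers (20 + 8 + 12 bytes); Layer 2 overhead is ignored, and the function name is hypothetical.

```python
# IPv4 + UDP + RTP headers, without header compression.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def ip_bandwidth_kbps(codec_rate_kbps, packet_ms=20):
    """Per-direction IP-layer bandwidth of one call, excluding Layer 2 overhead."""
    payload_bytes = codec_rate_kbps * packet_ms / 8   # kbit/s * ms = bits per packet
    packets_per_second = 1000 / packet_ms
    bits_per_packet = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
    return bits_per_packet * packets_per_second / 1000


for name, rate in (("G.711", 64), ("G.729", 8)):
    print(name, ip_bandwidth_kbps(rate), "kbit/s")
# G.711 80.0 kbit/s
# G.729 24.0 kbit/s
```

With these assumptions the headers add 16 kbit/s to every call, so the eightfold difference in codec rate shrinks to a little over threefold on the wire. This is one reason queuing and reservation mechanisms such as WFQ and RSVP matter on low-speed links.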

To build multi-service global networks, Cisco IGX 8400 series switches are used with interfaces for transporting ATM data, frame relay, synchronous and asynchronous data transmission systems, the Internet, as well as video signals and voice traffic.

The IGX switch uses an internal bus with a bandwidth of 1.2 Gbit/s to carry ATM cells between the interface and backbone modules of the system. For voice, it provides E1, T1 and Y1 interfaces, voice activity detection, pause suppression and call switching, as well as ADPCM (32, 24 and 16 kbit/s) and LD-CELP (16 kbit/s) speech compression.

Voice Network Switching (VNS) technology provides switched virtual circuits (SVCs) for voice and data transmission over a multiservice network (IGX, BPX). Private branch exchange (PBX) users whose exchanges use Digital Private Network Signaling System (DPNSS), QSIG, or Q.931A (Japanese ISDN) signaling can place voice calls to each other on demand, just as if they were calling over the PSTN. When VNS is used together with the IGX's Universal Voice Module Model C (UVM-C), which performs CAS-to-QSIG conversion, VNS also switches calls from PBXs that use Channel Associated Signaling (CAS). The supported signaling protocols are variants of Integrated Services Digital Network (ISDN) signaling. VNS provides direct PBX-to-PBX connections, eliminating the need for tandem connections. In other words, using VNS in a network reduces the number of E1 trunks required to interconnect PBXs and replaces tandem PBXs.

Admitted to defense.

"___"___________________ 2007

Head of the IS Department

Doctor of Technical Sciences, Prof.

Petrova I.Yu.

Graduation project

Text documentation DP 230201.007.2007

Astrakhan - 2007

FEDERAL AGENCY FOR EDUCATION

State educational institution of higher professional education

"Astrakhan State University"

Faculty of Mathematics and Information Technologies

Speciality "Information systems and technologies"

Department "Information Systems"

I approve

Head of Department __________________

"____" ___________________20__

Assignment for the graduation project of student

Kutepov Peter Viktorovich

1. Project topic: Organization of a voice over IP transmission network based on the ASU distributed local area network

approved by order of the university dated “___” ____________2006. No. __________

2. Date of assignment for the diploma project "_____" ________________20__

3. Initial data for the project.

A general approach to building an IP network for transmitting telephone traffic over a distributed network of ASU. Mechanisms for managing and solving problems of voice over IP. Ensuring IP speech quality. Bandwidth management. Configuring network equipment. Creating an IP network diagram for voice transmission.

4. Functions implemented by the system:

· functions related to data transfer protocols;

· Subject area survey

· Statement of the problem of generating initial data with the subsequent implementation of IP technology.

· Development of a working project - setting up network equipment, debugging, testing, creating documentation for use

· Calculation of economic and social efficiency from the implementation of the developed subsystem

· Determination of ergonomic conditions for the workplace of an employee of the educational unit

6. List of graphic material

ASU IP network structure

1) Connection diagram to the corporate network

2) Network structure of the main building of ASU

3) Structure of the ASU telephone network

5) Integration scheme with the corporate structure and the current telephone system

6) ASU network structure with IP telephony technology

7) Network structure of the main building of ASU with IP telephony technology

Supervisor ________________________________________

The task was accepted for execution ___________________________________________

CALENDAR PLAN

No.	Name of graduation project stage	Stage completion deadline	Completion mark, supervisor's signature
1	Submission of a draft assignment for the diploma project	until October 1, 2006
2	Coordination of the assignment for the diploma project with the thesis supervisor and the head of the department	until November 10, 2006
3	Introduction. Survey of the subject area and preparation of the 1st chapter of the diploma project (10%)	until December 1, 2006
4	Technical project. Chapter 2. Detailed description of the functions of the designed system (25%)	until January 10, 2007
5	Report on pre-diploma practice with demonstration of the work of the created software product (60%)	until April 7, 2007
6	Chapter 3. Development of a detailed design (80%)	until April 28, 2007
7	Chapter 4. Calculation of economic and social effect (90%)	until May 12, 2007
8	Chapter 5. Ensuring workplace ergonomics (100%)	until May 25, 2007
9	Drawing up an explanatory note	until May 25, 2007
10	Preparing a presentation video	until May 25, 2007
11	Preliminary defense of the diploma project	until May 30, 2007

Student ___________________________________________

Supervisor ________________________________________

PROJECT CONSULTANTS

Supervisor _________________________

(signature)

The task was accepted for execution _________________________

(signature)

ABSTRACT

Local area network, telephony, digital automatic telephone exchange, Cisco 3845 router, IP telephone, voice transmission, long-distance communication.

The explanatory note is presented on 92 pages and includes 7 tables and 30 diagrams and images. 28 literature sources were used.

The object of the work is Astrakhan State University.

The objective of the project is to reduce the cost of long-distance and international calls by using IP telephony technology on the local area network of Astrakhan State University.

This project is intended for:

· reduction of costs for communication services

· improving the quality of telephone communications.

Ordinary telephone calls require an extensive communication network of telephone exchanges connected by fixed telephone lines. High phone company costs result in expensive long-distance calls.

Due to the increase in subscription fees for using the telephone network, IP telephony is becoming a more relevant and profitable option for transmitting voice and fax data.

Astrakhan State University has a well-organized IP network. It is built using the Cisco 3845 router and Cisco Systems Catalyst 2950 series switches. The use of this equipment makes it possible to organize a voice and fax data transmission network using the IP protocol.

The economic efficiency of the project was assessed, and the following indicators were calculated:

· Capital expenditures - 101160 rub

· Depreciation - 860 rub

· Saving - 34879 rub

· Project payback - 4 months

A block diagram of the implementation of IP telephony in the ASU network and a diagram for connecting the TOS 120 digital PBX to the Cisco 3845 router have been developed, equipment has been selected for the implementation of the project, and an IP telephony service provider has been chosen.

Introduction

1. Description of the subject area

1.1. Basic concepts of IP telephony and types of structure of IP telephony networks

1.2. ASU network structure

1.3. Cisco Systems solutions for IP telephony

1.4. Cisco Systems Routers

1.5. Catalyst 2950 Series Switch

1.6. IP phone

1.7. Functions of IP phones

1.8. Setting up a VPN network

1.9. Methods and means of protecting information

2. Technical design

2.1. Network structure of the main building of ASU

2.2. Structure of the ASU telephone network

2.3. Description of the organization of the IP telephony network

2.4. Communication quality parameters

3. Working draft

3.1. IP telephony market research

3.2. Companies presenting IP telephony solutions

3.3. Search for the optimal IP provider according to your requirements

3.4. Cisco Call Manager

3.5. Cisco Unity Express Module

3.6. Cisco Systems VWIC-2MFT-E1 module for 60 voice channels

3.7. Connecting the ASU digital PBX to the Cisco 3845 router

3.9. Configuring Cisco CallManager

3.10. Types of connections when using IP telephony

3.11. Selecting IP telephony service operators

3.12. Operating principles of SIPNET

3.13. Configuring SIPNET routing

3.14. SIP protocol. General information

3.15. Principles of the SIP protocol

3.16. Integration of SIP with IP networks

3.17. How VPN works

4. Economic and social effect from the implementation of the project

4.1. Feasibility study of the project

4.2. Savings on long-distance and international calls

4.3. Accelerated return on capital expenditures

4.4. Calculation of current costs

4.5. Depreciation

4.6. Calculation of financial results of the project

4.7. Conclusions

5. Ensuring workplace ergonomics