CN100525297C - Realization method of speech service - Google Patents

Info

Publication number
CN100525297C
Authority
CN
China
Prior art keywords
jitter
buffer
packets
ran
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2004100737457A
Other languages
Chinese (zh)
Other versions
CN1747465A (en)
Inventor
路讴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CNB2004100737457A
Publication of CN1747465A
Application granted
Publication of CN100525297C
Expired - Fee Related
Anticipated expiration

Abstract

A method for implementing a voice service comprises: adding a voice data protocol (VDP) layer to the radio access network (RAN) for converting between speech frame data and voice packet data; the sending-side RAN receives encoded speech frame data from the sending mobile station (MS), converts it to voice packet data at its VDP layer, and sends it to the packet network; the receiving-side RAN receives the voice packet data from the packet network, converts it back to speech frame data at its VDP layer, and sends it to the receiving MS, which decodes and plays it.

Description

A method for implementing a voice service
Technical field
The present invention relates to voice transmission technology, and in particular to a method for implementing a voice service in a packet network.
Background
Traditional circuit-switched voice transmission needs 64 kbps of bandwidth. With the development of 2.5G and 3G wireless networks, packet data services were introduced. Packet voice technology usually uses codecs that run below 8 kbps, and because of the silent periods in a user's conversation, the actual average bandwidth of packet voice on a packet network often falls to 2 to 3 kbps. Packet voice therefore saves bandwidth effectively, which lowers the cost of building and operating a packet network. Packet networks are also flexible to deploy, so they receive growing attention and are widely used in wireless networks.
A system that implements a packet voice service needs a unit that converts between speech frame data and voice packet data. In addition, a packet network exhibits packet jitter. The amount of jitter is reflected in the differences between the end-to-end delays of different packets within a given period: the larger these differences, the more severe the jitter. When voice data is carried over a packet network, jitter can prevent voice data from being played out in time and cause gaps in the audio, which degrades call quality. The voice packet data therefore needs jitter processing.
A CDMA system can implement a packet voice service in several ways. Fig. 1 shows the system structure of a more traditional one. In this networking mode, the voice information is encoded on the sending mobile station (MS) and sent as speech frame data from the sending MS, via the radio access network (RAN) and the mobile switching center (MSC), to the sending-side packet voice gateway. The sending-side packet voice gateway converts the speech frame data to voice packet data and passes it to the receiving side over the packet network. The receiving-side packet voice gateway performs jitter processing on the voice packet data from the packet network, converts it to speech frame data, and sends it through the MSC and RAN to the receiving MS, which decodes the speech frame data into a voice signal.
The drawback of this networking mode is that dedicated packet voice gateway equipment must be deployed to convert between speech frame data and voice packet data and to perform jitter processing, which adds cost.
To overcome this drawback, a second way of implementing a packet voice service in a CDMA system appeared; Fig. 2 shows its system structure. In this networking mode, the sending MS packs the voice data directly into voice packet data and sends it through the sending-side RAN to the packet network. The packet network delivers the voice packet data to the receiving-side RAN, which forwards it to the receiving MS. The receiving MS performs jitter processing on the voice packet data and decodes it directly into a voice signal.
Compared with the first networking mode, the second needs no dedicated packet voice gateway. However, it also has drawbacks:
First, the handset must perform the conversion between voice packet data and speech frame data, and most early handsets do not support this function.
Second, the handset must perform jitter processing, which makes handset processing more complex, requires higher processing performance, and raises the cost of the handset.
Summary of the invention
The main purpose of the present invention is to provide a method for implementing a voice service in which the RAN performs the conversion between speech frame data and voice packet data.
The purpose of the invention is achieved through the following technical solution:
A method for implementing a voice service, comprising the following steps:
adding, to the protocol layers of the radio access network (RAN), a voice data protocol (VDP) layer for converting between voice packet data and speech frame data;
after the sending-side RAN receives encoded speech frame data sent by the sending mobile station (MS), converting the speech frame data to voice packet data at its own VDP layer and sending the voice packet data to the packet network;
after the receiving-side RAN receives voice packet data from the packet network, converting the voice packet data to speech frame data at its own VDP layer and sending the speech frame data to the receiving MS, which decodes and plays the speech frame data.
Wherein the method further comprises: in the RAN, allocating to each established link a jitter buffer of a given capacity for temporarily storing voice packet data;
after the receiving-side RAN receives the voice packet data and before it converts the voice packet data to speech frame data, the method further comprises: the receiving-side RAN stores the received voice packet data in the jitter buffer and, when a take-out condition is satisfied, takes voice packet data out of the jitter buffer and passes it to the VDP layer.
Wherein the receiving-side RAN stores the received voice packet data in the jitter buffer as follows: the receiving-side RAN determines whether the jitter buffer currently has free space; if so, it stores the voice packet data in the jitter buffer; otherwise, it discards the voice packet data that was stored in the jitter buffer earliest and then stores the received voice packet data in the jitter buffer.
Wherein the receiving-side RAN takes voice packet data out of the jitter buffer to the VDP layer as follows:
A. The receiving-side RAN determines whether the jitter buffer is empty; if so, it goes to step B; otherwise, it takes voice packet data out of the jitter buffer to the VDP layer, then returns to step A and checks again whether the jitter buffer is empty;
B. The receiving-side RAN determines whether the condition for taking data out of the jitter buffer is currently satisfied; if so, it takes voice packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it returns to step B and checks the condition again.
Wherein whether the condition for taking data out of the jitter buffer is currently satisfied is determined as follows: the receiving-side RAN determines whether the space occupancy of the jitter buffer has reached the configured buffer occupancy threshold; if so, it takes voice packet data out of the jitter buffer to the VDP layer; otherwise, it continues to check whether the space occupancy of the jitter buffer has reached the configured buffer occupancy threshold.
Wherein the method further comprises: setting a jitter buffer timer in the RAN;
in step B, before determining whether the condition for taking data out of the jitter buffer is currently satisfied, the method further comprises: starting the jitter buffer timer;
whether the condition for taking data out of the jitter buffer is currently satisfied is determined as follows:
B01. The receiving-side RAN determines whether the space occupancy of the jitter buffer has reached the configured buffer occupancy threshold; if so, it takes voice packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it goes to step B02;
B02. The receiving-side RAN determines whether the jitter buffer timer has expired; if so, it takes voice packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it returns to step B01.
Wherein the method further comprises: setting a jitter buffer timer in the RAN;
in step B, before determining whether the condition for taking data out of the jitter buffer is currently satisfied, the method further comprises: starting the jitter buffer timer;
whether the condition for taking data out of the jitter buffer is currently satisfied is determined as follows:
B11. The receiving-side RAN determines whether the jitter buffer timer has expired; if so, it takes voice packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it goes to step B12;
B12. The receiving-side RAN determines whether the space occupancy of the jitter buffer has reached the configured buffer occupancy threshold; if so, it takes voice packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it returns to step B11.
Wherein the capacity of the jitter buffer is a fixed value set according to the predicted jitter of the packet network and the voice quality required by the user.
Wherein the capacity of the jitter buffer is determined dynamically by the receiving-side RAN according to the current jitter of the packet network.
Wherein the capacity of the jitter buffer is determined dynamically according to the current jitter of the packet network as follows:
determining the jitter value of the current voice packet, which reflects the current jitter of the packet network;
determining the current capacity of the jitter buffer according to the jitter value of the current voice packet.
Wherein the jitter value of the current voice packet is determined as follows: determine whether the current voice packet is the first voice packet that the receiving-side RAN has received in this voice service; if so, the receiving-side RAN uses the configured initial jitter value as the jitter value of the current voice packet; otherwise, the receiving-side RAN determines the transmission time of the current voice packet and then determines its jitter value from the transmission time of the current voice packet, the transmission time of the previous voice packet, and the jitter value of the previous voice packet.
Wherein the transmission time of a voice packet is determined as follows: the receiving-side RAN subtracts the generation time carried in the packet from the time at which it received the packet, which gives the transmission time of the voice packet.
Wherein the jitter value of the current voice packet is determined from the transmission time of the current voice packet, the transmission time of the previous voice packet, and the jitter value of the previous voice packet as follows:

$$J_n = J_{n-1} + \frac{\lvert T_n - T_{n-1} \rvert - J_{n-1}}{\sigma}$$

where $J_n$ is the jitter value of the current voice packet, $J_{n-1}$ is the jitter value of the previous voice packet, $T_n$ is the transmission time of the current voice packet, $T_{n-1}$ is the transmission time of the previous voice packet, and $\sigma$ is a convergence coefficient.
Wherein the convergence coefficient is 16.
Wherein the capacity of the jitter buffer is determined according to the jitter value of the current voice packet as follows:
setting a maximum jitter threshold, a minimum jitter threshold, a maximum capacity threshold, and a minimum capacity threshold on the RAN;
comparing the jitter value of the current voice packet with the maximum jitter threshold and the minimum jitter threshold; if the jitter value of the current voice packet is greater than the maximum jitter threshold, determining whether the current jitter buffer capacity is less than the maximum capacity threshold, and if so, increasing the jitter buffer capacity; otherwise, keeping the current jitter buffer capacity unchanged;
if the jitter value of the current voice packet is less than the minimum jitter threshold, determining whether the current jitter buffer capacity is greater than the minimum capacity threshold, and if so, reducing the jitter buffer capacity; otherwise, keeping the current jitter buffer capacity unchanged;
if the jitter value of the current voice packet is greater than or equal to the minimum jitter threshold and less than or equal to the maximum jitter threshold, keeping the current jitter buffer capacity unchanged.
Wherein the jitter buffer capacity is increased as follows:
setting a capacity step value on the RAN;
the increased jitter buffer capacity is the sum of the current jitter buffer capacity and the capacity step value;
and the jitter buffer capacity is reduced as follows:
setting a capacity step value on the RAN;
the reduced jitter buffer capacity is the difference between the current jitter buffer capacity and the capacity step value.
Wherein the jitter buffer capacity is increased as follows:
setting a capacity percentage on the RAN;
the increased jitter buffer capacity is the current jitter buffer capacity plus the product of that capacity and the capacity percentage;
and the jitter buffer capacity is reduced as follows:
setting a capacity percentage on the RAN;
the reduced jitter buffer capacity is the current jitter buffer capacity minus the product of that capacity and the capacity percentage.
Wherein the method further comprises: allocating, in the RAN, a jitter buffer of a given capacity for temporarily storing speech frame data;
after the receiving-side RAN converts the voice packet data to speech frame data, the method further comprises:
the receiving-side RAN stores the speech frame data in the jitter buffer;
the receiving-side RAN takes the speech frame data out of the jitter buffer.
Wherein the packet network is a CDMA network.
The invention provides a method for implementing a voice service. The method adds, to the protocol layers of the RAN, a VDP layer for converting between voice packet data and speech frame data. After the sending-side RAN receives encoded speech frame data sent by the sending MS, it converts the speech frame data to voice packet data at its own VDP layer and sends the voice packet data to the packet network. When the receiving-side RAN receives voice packet data from the packet network, it converts the voice packet data to speech frame data at its own VDP layer and sends the speech frame data to the receiving MS, which decodes and plays it. In the prior art, the conversion between voice packet data and speech frame data is performed either by a packet voice gateway or by the handset. Compared with the prior art, the method of the present invention needs no dedicated equipment for this conversion, which simplifies the network structure and avoids extra cost. In addition, the handset does not have to perform the format conversion, so the method places no extra requirements on the handset and simplifies handset operation.
Furthermore, as the technical solution shows, in the method of the present invention the receiving-side RAN performs jitter processing on the voice packet data. This also avoids the increased network cost or the more complex handset operation that arise in the prior art when a packet voice gateway or the handset performs jitter processing.
Description of drawings
Fig. 1 is a schematic diagram of the system structure for implementing a voice service in prior art 1;
Fig. 2 is a schematic diagram of the system structure for implementing a voice service in prior art 2;
Fig. 3 is a schematic diagram of the protocol layers of the RAN of the present invention;
Fig. 4 is a schematic diagram of the system structure for implementing a voice service according to the present invention;
Fig. 5 is a flow chart of the method for implementing a voice service according to the present invention;
Fig. 6 is a flow chart of the receiving-side RAN storing voice packet data in the jitter buffer;
Fig. 7 is a flow chart of the receiving-side RAN taking voice packet data out of the jitter buffer;
Fig. 8 is a flow chart of the method for dynamically adjusting the jitter buffer size according to the present invention.
Embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is further described below with reference to the drawings and specific embodiments.
The method of the present invention improves on the prior art on the basis of the networking mode shown in Fig. 2 by adding a voice data protocol (VDP) layer to the RAN. Fig. 3 is a schematic diagram of the protocol layers of the RAN of the present invention. As Fig. 3 shows, among the protocol layers through which the RAN interacts with the packet network, in addition to the original packet transport layer, link layer, and physical layer, a VDP layer is added above the packet transport layer to convert between speech frame data and voice packet data. This gives the RAN the ability to convert speech frame data to voice packet data and voice packet data to speech frame data.
Fig. 4 is a schematic diagram of the system structure for implementing a voice service according to the present invention. In the method of the present invention, the sending MS sends voice data to the sending-side RAN in the form of speech frames. The sending-side RAN converts the speech frame format to the voice packet format and sends the result to the packet network. The packet network delivers the voice packet data to the receiving-side RAN. The receiving-side RAN performs jitter processing on the received voice packet data, converts it to speech frame data, and sends it to the receiving MS, which decodes the data in speech frame format directly into a voice signal.
Jitter processing means that the receiving-side RAN stores the received voice packet data in a jitter buffer set up in advance and, when a certain take-out condition is satisfied, takes the voice packet data out of the jitter buffer and then carries out the step of converting the voice packet data to speech frame data.
Fig. 5 is a flow chart of the method for implementing a voice service according to the present invention. As Fig. 5 shows, the method comprises the following steps:
Step 501: add a VDP layer and a jitter buffer for jitter processing on the sending-side RAN and on the receiving-side RAN, and set the space occupancy threshold of the jitter buffer and the buffer timer.
The VDP layer converts between speech frame data and voice packet data. Adding the VDP layer to the RAN gives the RAN the ability to convert speech frame data to voice packet data and voice packet data to speech frame data.
Step 502: the sending MS encodes the user's voice data, converting the speech into a digital bit stream, and sends the voice data to the RAN over the air interface in the form of speech frames.
Voice data can be encoded in several ways, for example with the Enhanced Variable Rate Codec (EVRC) or with QUALCOMM Code Excited Linear Prediction (QCELP), proposed by Qualcomm. Speech frame data can also take different formats; for example, a 20 ms/frame format can be used, in which each frame holds 20 ms of voice data.
Step 503: after the sending-side RAN receives the speech frame data sent by the sending MS, it converts the speech frame data to voice packet data at the VDP layer according to a given packetization scheme, fills in the destination address, and then passes the voice packet data to the packet network.
There are several packetization schemes. For example, one speech frame can be packed into one voice packet, or several speech frames can be packed into one voice packet. The number of speech frames in each voice packet is determined by a configured parameter, and the choice involves a trade-off. The fewer speech frames each voice packet contains, the smaller the network delay, but the more voice packets there are, which increases the load on the packet network. The more speech frames each voice packet contains, the larger the network delay, but the fewer voice packets there are, so the load on the packet network is lighter. Because of this trade-off, the parameter that controls packetization needs to be adjusted dynamically according to the voice quality and network load measured at different times, to determine the number of speech frames per voice packet.
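The packetization trade-off in step 503 can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `VoicePacket`, `packetize`, `depacketize`, and `fill_delay_ms` are hypothetical, frames are treated as opaque byte strings, and the 20 ms frame duration comes from the example frame format in step 502.

```python
from dataclasses import dataclass
from typing import List

FRAME_MS = 20  # assumed frame duration, from the 20 ms/frame example


@dataclass
class VoicePacket:
    dest: str            # destination address filled in by the sending RAN
    frames: List[bytes]  # encoded speech frames carried by this packet


def packetize(frames: List[bytes], frames_per_packet: int, dest: str) -> List[VoicePacket]:
    """Group consecutive speech frames into voice packets (sending-side VDP)."""
    if frames_per_packet < 1:
        raise ValueError("frames_per_packet must be at least 1")
    return [
        VoicePacket(dest, frames[i:i + frames_per_packet])
        for i in range(0, len(frames), frames_per_packet)
    ]


def depacketize(packets: List[VoicePacket]) -> List[bytes]:
    """Recover the speech frames in order (receiving-side VDP)."""
    return [frame for packet in packets for frame in packet.frames]


def fill_delay_ms(frames_per_packet: int) -> int:
    """Time the sender spends collecting frames before a packet can leave."""
    return frames_per_packet * FRAME_MS


frames = [bytes([i]) for i in range(5)]
packets = packetize(frames, frames_per_packet=2, dest="ran-b")
```

With five frames and two frames per packet, the sketch produces three packets, the last one carrying a single frame. Doubling `frames_per_packet` halves the packet count but doubles the time spent filling each packet, which is the trade-off the configured parameter controls.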
Step 504: the receiving-side RAN uses the configured jitter buffer to perform jitter processing on the voice packet data delivered by the packet network.
Step 505: the receiving-side RAN converts the voice packet data to speech frame data at the VDP layer and sends the speech frame data to the receiving MS.
Step 506: the receiving MS decodes the received speech frame data according to the frame format of the sending MS and plays the voice data.
When the receiving-side RAN is not sending voice data to the MS, it can send comfort noise to the receiving MS so that the user does not perceive an obvious interruption in the speech; the receiving MS decodes and plays the comfort noise. Comfort noise is a special noise that is not irritating to the listener.
In step 504, the receiving-side RAN performs jitter processing on the voice packet data delivered by the packet network. Jitter processing consists of two parallel flows: one in which the receiving-side RAN stores voice packet data in the jitter buffer, and one in which the receiving-side RAN takes voice packet data out of the jitter buffer. The two flows are described in detail below.
Fig. 6 is a flow chart of the receiving-side RAN storing voice packet data in the jitter buffer. As the figure shows, the flow comprises the following steps:
Step 601: the receiving-side RAN determines whether it has received voice packet data from the packet network; if so, it goes to step 602; otherwise, it returns to step 601 and checks again.
Step 602: the receiving-side RAN determines whether the jitter buffer has free space; if so, it goes to step 604; otherwise, it goes to step 603.
Step 603: the receiving-side RAN discards the voice packet data that was stored in the jitter buffer earliest, then goes to step 604.
Step 604: the receiving-side RAN stores the voice packet data in the jitter buffer, then repeats step 601.
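Steps 602 to 604 amount to a bounded first-in, first-out queue that evicts its oldest entry when full. A minimal Python sketch follows; the class name `JitterBuffer` and the method `put` are hypothetical, and capacity is counted in packets, which the text does not specify.

```python
from collections import deque


class JitterBuffer:
    def __init__(self, capacity: int):
        self.capacity = capacity  # assumed to be a packet count
        self.packets = deque()

    def put(self, packet):
        """Store a packet; return the evicted packet, or None if none was dropped."""
        dropped = None
        if len(self.packets) >= self.capacity:
            # Step 603: no free space, so discard the earliest stored packet.
            dropped = self.packets.popleft()
        # Step 604: store the newly received packet.
        self.packets.append(packet)
        return dropped


buf = JitterBuffer(capacity=2)
results = [buf.put(p) for p in ("p1", "p2", "p3")]
```

After the three calls, `results` is `[None, None, "p1"]` and the buffer holds `p2` and `p3`. Python's `deque(maxlen=...)` gives the same eviction behaviour implicitly; the explicit branch is kept here only to mirror the flow chart.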
Fig. 7 is a flow chart of the receiving-side RAN taking voice packet data out of the jitter buffer. As the figure shows, the flow comprises the following steps:
Step 701: the receiving-side RAN determines whether the jitter buffer is empty; if so, it goes to step 702; otherwise, it goes to step 704.
Step 702: the receiving-side RAN starts the jitter buffer timer.
Step 703: the receiving-side RAN determines whether the condition for taking voice packet data out of the jitter buffer is currently satisfied. There are two ways to make this determination, and either may be chosen:
(1) The receiving-side RAN determines whether the space occupancy of the jitter buffer has reached the configured occupancy threshold. If the threshold is reached, it goes to step 704. If not, it determines whether the jitter buffer timer has expired: if so, it goes to step 704; if not, it returns to step 703.
(2) The receiving-side RAN determines whether the jitter buffer timer has expired. If so, it goes to step 704. If not, it determines whether the space occupancy of the buffer has reached the occupancy threshold: if so, it goes to step 704; if not, it returns to step 703.
Step 704: the receiving-side RAN takes voice packet data out of the jitter buffer to the VDP layer.
Note that in the scheme for taking voice packet data out of the jitter buffer, the jitter buffer timer may also be omitted. In that case, whether the take-out condition is satisfied is determined directly by checking whether the current space occupancy of the jitter buffer has reached the occupancy threshold.
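The take-out flow of steps 701 to 704 can be sketched as a small polling state machine. The sketch below is an illustration under stated assumptions: `Playout`, `poll`, and `release_allowed` are hypothetical names, time is passed in explicitly as milliseconds so that the logic stays deterministic, and occupancy is measured as the number of queued packets divided by capacity. It checks the threshold before the timer, as in variant (1); setting `timeout_ms=None` models the variant without a timer.

```python
from collections import deque


def release_allowed(fill, capacity, threshold, elapsed_ms, timeout_ms=None):
    """Step 703, variant (1): occupancy threshold first, then the timer."""
    if fill / capacity >= threshold:
        return True
    # timeout_ms=None models the variant without a jitter buffer timer.
    return timeout_ms is not None and elapsed_ms >= timeout_ms


class Playout:
    def __init__(self, capacity, threshold, timeout_ms=None, now_ms=0):
        self.queue = deque()
        self.capacity = capacity
        self.threshold = threshold
        self.timeout_ms = timeout_ms
        self.waiting = True          # the buffer starts empty
        self.timer_start = now_ms

    def poll(self, now_ms):
        """Return the next packet for the VDP layer, or None if none is due."""
        if not self.queue and not self.waiting:
            # Steps 701-702: buffer found empty, so start the timer and wait.
            self.waiting = True
            self.timer_start = now_ms
        if self.waiting:
            elapsed = now_ms - self.timer_start
            if not release_allowed(len(self.queue), self.capacity,
                                   self.threshold, elapsed, self.timeout_ms):
                return None
            self.waiting = False
        # Step 704: hand over the oldest packet (nothing if the timer fired
        # on an empty buffer).
        return self.queue.popleft() if self.queue else None
```

Both orderings in step 703 release data under the same combined condition (threshold reached or timer expired), so the sketch implements only one of them. The timer bounds how long a few buffered packets can wait when traffic is too sparse to reach the occupancy threshold.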
In addition, the capacity of the jitter buffer can be a network configuration parameter, or a parameter that changes dynamically with the jitter of the packet network. The embodiment above is the former case: the buffer capacity is set when the network is configured, according to the predicted jitter of the packet network and the voice quality required by the user, and once configuration is complete the parameter is not adjusted automatically. In the latter case, an initial jitter buffer capacity is set and then changed dynamically according to the jitter of the packet network, as follows: after the receiving-side RAN receives a new voice packet, it calculates a jitter value from the received packet and adjusts the capacity of the jitter buffer according to the jitter value.
The method for dynamically adjusting the jitter buffer capacity is described below.
To adjust the jitter buffer capacity dynamically, in addition to adding the VDP layer and the jitter buffer to the RAN and setting the parameters above, some further parameters are needed, such as the initial jitter value, the maximum jitter threshold, the minimum jitter threshold, and the maximum and minimum capacity thresholds of the jitter buffer.
Dynamic adjustment of the jitter buffer requires recording the time at which the sending-side RAN generates a voice packet and the time at which the receiving-side RAN receives it. Therefore, when the sending-side RAN forms a voice packet, it records the current system time as the packet generation time and attaches that time to the voice packet. When the receiving-side RAN receives voice data from the packet network, it records the current system time as the packet arrival time.
After the receiving-side RAN receives voice packet data from the packet network and before it performs jitter processing on that data, a step that dynamically adjusts the buffer capacity is added. As shown in Fig. 8, the steps for dynamically adjusting the jitter buffer capacity are as follows:
Step 801: the receiving-side RAN calculates the jitter value according to formula (1):

$$J_n = J_{n-1} + \frac{\lvert T_n - T_{n-1} \rvert - J_{n-1}}{\sigma} \tag{1}$$

where $J_n$ is the jitter value obtained by the current calculation; $J_{n-1}$ is the jitter value obtained by the previous calculation; $T_n$ is the transmission time of the current packet on the network, equal to the difference between the arrival time of the current packet and the generation time carried in the packet; $T_{n-1}$ is the transmission time of the previous packet on the network; and $\sigma$ is a convergence coefficient that can be adjusted to suit network conditions, with a preferred value of 16.
If the currently received voice packet is the first voice packet, the RAN first calculates its transmission time, which is the difference between the packet arrival time and the packet generation time. No jitter value is calculated at this point; the jitter value is the configured initial jitter value.
After the receiving-side RAN receives the second voice packet, it takes the transmission time of the previous voice packet as $T_{n-1}$, the jitter value of the previous voice packet as $J_{n-1}$, and the transmission time of the current packet as $T_n$, and calculates $J_n$ with formula (1).
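The running jitter estimate of step 801 can be sketched as follows. The class name `JitterEstimator` and its `update` method are hypothetical; timestamps are assumed to be in milliseconds, and the initial jitter value of 0 is an arbitrary choice for illustration, since the text leaves it as a configured parameter.

```python
class JitterEstimator:
    def __init__(self, initial_jitter: float = 0.0, sigma: float = 16.0):
        self.jitter = initial_jitter  # configured initial jitter value (assumed 0)
        self.sigma = sigma            # convergence coefficient, 16 preferred
        self.prev_transit = None      # transmission time of the previous packet

    def update(self, arrival_ms: float, generated_ms: float) -> float:
        """Feed one packet's timestamps and return the current jitter value."""
        transit = arrival_ms - generated_ms
        if self.prev_transit is not None:
            # Formula (1): move the estimate 1/sigma of the way toward
            # the latest transit-time difference.
            deviation = abs(transit - self.prev_transit)
            self.jitter += (deviation - self.jitter) / self.sigma
        # For the first packet only the transit time is recorded.
        self.prev_transit = transit
        return self.jitter


est = JitterEstimator()
j1 = est.update(arrival_ms=50, generated_ms=0)    # first packet, transit 50
j2 = est.update(arrival_ms=86, generated_ms=20)   # transit 66, deviation 16
j3 = est.update(arrival_ms=106, generated_ms=40)  # transit 66, deviation 0
```

Here `j1` stays at the initial value 0.0, `j2` becomes 1.0, and `j3` decays to 0.9375. Because only the difference between successive transmission times enters the formula, a constant offset between the sending-side and receiving-side clocks cancels out, so the two system clocks do not need to be synchronized for the estimate to be meaningful.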
Step 802: compare the jitter value calculated for the current moment with the maximum jitter threshold and the minimum jitter threshold. If the jitter value is greater than the maximum jitter threshold, go to step 803; if the jitter value is less than the minimum jitter threshold, go to step 804; if the jitter value lies between the minimum and maximum jitter thresholds, inclusive of both thresholds, go to step 805.
Step 803: judge whether the current capacity of the jitter buffer is less than the maximum capacity threshold; if so, go to step 806; otherwise, go to step 805.
Step 804: judge whether the current capacity of the jitter buffer is greater than the minimum capacity threshold; if so, go to step 807; otherwise, go to step 805.
Step 805: keep the current capacity of the jitter buffer unchanged, then end.
Step 806: increase the capacity of the jitter buffer, then end.
There are two ways to increase the capacity: one is to add a fixed step value; the other is to add a certain percentage of the current capacity, for example 20% of the current capacity.
Step 807: decrease the capacity of the jitter buffer.
There are two ways to decrease the capacity: one is to subtract a fixed step value; the other is to subtract a certain percentage of the current capacity, for example 20% of the current capacity.
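Steps 801 to 807 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names, the millisecond and packet units, and every numeric default other than the convergence coefficient 16 are assumptions made only for illustration, as is truncating the percentage change to a whole number of packets.

```python
from dataclasses import dataclass


@dataclass
class JitterConfig:
    # Every default here is an illustrative assumption, except sigma = 16,
    # which the description names as the preferred convergence coefficient.
    sigma: float = 16.0
    initial_jitter: float = 0.0   # configured initial jitter value (ms)
    max_jitter: float = 40.0      # maximum jitter threshold (ms)
    min_jitter: float = 10.0      # minimum jitter threshold (ms)
    max_capacity: int = 50        # maximum capacity threshold (packets)
    min_capacity: int = 5         # minimum capacity threshold (packets)
    step: int = 2                 # fixed step value (packets)


class JitterTracker:
    """Step 801: running jitter value computed with formula (1)."""

    def __init__(self, cfg):
        self.cfg = cfg
        self.jitter = cfg.initial_jitter
        self.prev_transit = None  # no packet seen yet

    def update(self, arrival_time, generation_time):
        transit = arrival_time - generation_time
        if self.prev_transit is not None:
            # Formula (1); skipped for the first packet, which keeps the
            # configured initial jitter value.
            delta = abs(transit - self.prev_transit)
            self.jitter += (delta - self.jitter) / self.cfg.sigma
        self.prev_transit = transit
        return self.jitter


def adjust_capacity(capacity, jitter, cfg, percent=None):
    """Steps 802-807: return the new jitter buffer capacity.

    With percent=None the fixed step value is used; otherwise the change
    is that fraction of the current capacity (truncated, an assumption).
    """
    change = cfg.step if percent is None else int(capacity * percent)
    if jitter > cfg.max_jitter:            # step 802 -> step 803
        if capacity < cfg.max_capacity:
            return capacity + change       # step 806
        return capacity                    # step 805
    if jitter < cfg.min_jitter:            # step 802 -> step 804
        if capacity > cfg.min_capacity:
            return capacity - change       # step 807
        return capacity                    # step 805
    return capacity                        # step 805 (thresholds inclusive)
```

For example, with these assumed defaults, `adjust_capacity(10, 50.0, JitterConfig())` grows a 10-packet buffer by the fixed step, while passing `percent=0.2` grows it by 20% of its current capacity. The sketch does not clamp the result to the capacity thresholds, because the steps above only test the capacity before changing it.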
After the buffer capacity has been determined, the receiving RAN uses the jitter buffer at the VDP layer to perform de-jitter processing on the speech packet data.
In the method for implementing the packet speech service described above, the receiving RAN first performs de-jitter processing on the received speech packet data and then converts the speech packet data into speech frame data. In actual processing, the receiving RAN may instead first convert the received speech packet data into speech frame data and then perform de-jitter processing on the speech frame data. The de-jitter processing of speech frame data likewise uses a jitter buffer; in this case, what is stored in the jitter buffer is speech frame data rather than speech packet data. The capacity of the jitter buffer may be a fixed value set in advance, or, as described above, a jitter value may be calculated from the jitter conditions of the current packet network and the capacity of the jitter buffer changed dynamically according to the calculated jitter value.
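Because the jitter buffer may hold either speech packets or speech frames, it can be sketched as a container that is indifferent to the item type. The Python sketch below is only an illustration under stated assumptions: it uses a fixed capacity, the drop-oldest storage rule of claim 3, and the occupancy-threshold take-out condition of claims 4 and 5. The class name `JitterBuffer`, its method names, and the example numbers are hypothetical.

```python
from collections import deque


class JitterBuffer:
    """Fixed-capacity buffer for speech packets or speech frames (sketch)."""

    def __init__(self, capacity, occupancy_threshold):
        self.capacity = capacity
        # Fraction of the capacity that must be filled before output starts.
        self.occupancy_threshold = occupancy_threshold
        self.items = deque()
        self.draining = False  # False while waiting for the threshold

    def put(self, item):
        # Claim 3: when no space remains, discard the earliest stored item,
        # then store the newly received one.
        if len(self.items) >= self.capacity:
            self.items.popleft()
        self.items.append(item)

    def take(self):
        """Return the next item for the VDP layer, or None if not ready."""
        occupancy = len(self.items) / self.capacity
        if not self.draining and occupancy >= self.occupancy_threshold:
            self.draining = True           # take-out condition met (step B)
        if not self.draining:
            return None
        item = self.items.popleft()
        if not self.items:
            self.draining = False          # empty again: wait for threshold
        return item
```

With `JitterBuffer(capacity=4, occupancy_threshold=0.5)`, nothing is released after the first `put`; once a second item arrives the buffer drains in arrival order until it is empty, then waits for the threshold again. A timer-based release, as in claims 6 and 7, is left out to keep the sketch short.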
In a concrete implementation, the method according to the present invention may be suitably modified to meet the specific needs of a particular situation. It should therefore be understood that the specific embodiments of the present invention are merely exemplary and are not intended to limit the scope of protection of the present invention. For example, the present invention is not limited to use in CDMA networks and can also be applied to other types of packet networks.

Claims (19)

1. A method for implementing a speech service, characterized in that the method comprises the following steps:
adding, to the protocol layers of the radio access network (RAN), a voice data protocol (VDP) layer for converting between speech packet data and speech frame data;
after the sending RAN receives encoded speech frame data sent by the sending mobile station (MS), converting the speech frame data into speech packet data through its own VDP layer, and sending the speech packet data to the packet network;
after the receiving RAN receives speech packet data from the packet network, converting the speech packet data into speech frame data through its own VDP layer and sending the speech frame data to the receiving MS, the receiving MS decoding and playing the speech frame data.
2. The method for implementing a speech service according to claim 1, characterized in that the method further comprises: allocating, in the RAN, for each established link, a jitter buffer of a certain capacity for temporarily storing speech packet data,
and in that, after the receiving RAN receives the speech packet data and before it converts the speech packet data into speech frame data, the method further comprises: the receiving RAN storing the received speech packet data in the jitter buffer, and taking speech packet data out of the jitter buffer to the VDP layer when a take-out condition is met.
3. The method for implementing a speech service according to claim 2, characterized in that the receiving RAN stores the received speech packet data in the jitter buffer as follows: the receiving RAN judges whether the current jitter buffer has remaining space; if so, it stores the speech packet data in the jitter buffer; otherwise, it discards the speech packet data that was stored in the jitter buffer earliest and then stores the received speech packet data in the jitter buffer.
4. The method for implementing a speech service according to claim 2, characterized in that the receiving RAN takes speech packet data out of the jitter buffer to the VDP layer as follows:
A. the receiving RAN judges whether the jitter buffer is empty; if so, it goes to step B; otherwise, it takes speech packet data out of the jitter buffer to the VDP layer, then returns to step A and continues to judge whether the jitter buffer is empty;
B. the receiving RAN judges whether the condition for taking data out of the jitter buffer is currently met; if so, it takes speech packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it returns to step B and continues to judge whether the condition for taking data out of the jitter buffer is currently met.
5. The method for implementing a speech service according to claim 4, characterized in that judging whether the condition for taking data out of the jitter buffer is currently met comprises: the receiving RAN judges whether the space occupancy of the jitter buffer has reached the configured buffer space occupancy threshold; if so, it takes speech packet data out of the jitter buffer to the VDP layer; otherwise, it continues to judge whether the space occupancy of the jitter buffer has reached the configured buffer space occupancy threshold.
6. The method for implementing a speech service according to claim 4, characterized in that the method further comprises: setting a jitter buffer timer in the RAN,
in step B, before judging whether the condition for taking data out of the jitter buffer is currently met, the method further comprises: starting the jitter buffer timer,
and judging whether the condition for taking data out of the jitter buffer is currently met comprises:
B01. the receiving RAN judges whether the space occupancy of the jitter buffer has reached the configured buffer space occupancy threshold; if so, it takes speech packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it goes to step B02;
B02. the receiving RAN judges whether the jitter buffer timer has expired; if so, it takes speech packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it returns to step B01.
7. The method for implementing a speech service according to claim 4, characterized in that the method further comprises: setting a jitter buffer timer in the RAN,
in step B, before judging whether the condition for taking data out of the jitter buffer is currently met, the method further comprises: starting the jitter buffer timer,
and judging whether the condition for taking data out of the jitter buffer is currently met comprises:
B11. the receiving RAN judges whether the jitter buffer timer has expired; if so, it takes speech packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it goes to step B12;
B12. the receiving RAN judges whether the space occupancy of the jitter buffer has reached the configured buffer space occupancy threshold; if so, it takes speech packet data out of the jitter buffer to the VDP layer and then returns to step A; otherwise, it returns to step B11.
8. The method for implementing a speech service according to claim 2, characterized in that the capacity of the jitter buffer is a fixed value set according to the predicted jitter conditions of the packet network and the speech quality required by the user.
9. The method for implementing a speech service according to claim 2, characterized in that the capacity of the jitter buffer is determined dynamically by the receiving RAN according to the jitter conditions of the current packet network.
10. The method for implementing a speech service according to claim 9, characterized in that dynamically determining the capacity of the jitter buffer according to the jitter conditions of the current packet network comprises:
determining the jitter value of the current speech packet data, which reflects the jitter conditions of the current packet network;
determining the capacity of the current jitter buffer according to the jitter value of the current speech packet data.
11. The method for implementing a speech service according to claim 10, characterized in that determining the jitter value of the current speech packet data comprises: judging whether the current speech packet data is the first speech packet data received by the receiving RAN in this speech service; if so, the receiving RAN takes the configured initial jitter value as the jitter value of the current speech packet data; otherwise, the receiving RAN determines the transmission time of the current speech packet data, and then determines the jitter value of the current packet data according to the transmission time of the current speech packet data, the transmission time of the previous speech packet data, and the jitter value of the previous speech packet data.
12. The method for implementing a speech service according to claim 11, characterized in that determining the transmission time of speech packet data comprises: the receiving RAN subtracts the data generation time contained in the speech packet data from the time at which it received the data, obtaining the transmission time of the speech packet data.
13. The method for implementing a speech service according to claim 11, characterized in that the jitter value of the current speech packet data is determined from the transmission time of the current speech packet data, the transmission time of the previous speech packet data, and the jitter value of the previous speech packet data as follows:
$$J_n = J_{n-1} + \frac{\lvert T_n - T_{n-1} \rvert - J_{n-1}}{\sigma},$$
where $J_n$ is the jitter value of the current speech packet data, $J_{n-1}$ is the jitter value of the previous speech packet data, $T_n$ is the transmission time of the current speech packet data, $T_{n-1}$ is the transmission time of the previous speech packet data, and $\sigma$ is a convergence coefficient.
14. The method for implementing a speech service according to claim 13, characterized in that the convergence coefficient is 16.
15. The method for implementing a speech service according to claim 10, characterized in that determining the jitter buffer capacity according to the jitter value of the current speech packet data comprises:
setting a maximum jitter threshold, a minimum jitter threshold, a maximum capacity threshold and a minimum capacity threshold on the RAN;
comparing the jitter value of the current speech packet data with the maximum jitter threshold and the minimum jitter threshold; if the jitter value of the current speech packet data is greater than the maximum jitter threshold, judging whether the current jitter buffer capacity is less than the maximum capacity threshold; if so, increasing the jitter buffer capacity; otherwise, keeping the current jitter buffer capacity unchanged;
if the jitter value of the current speech packet data is less than the minimum jitter threshold, judging whether the current jitter buffer capacity is greater than the minimum capacity threshold; if so, decreasing the jitter buffer capacity; otherwise, keeping the current jitter buffer capacity unchanged;
if the jitter value of the current speech packet data is greater than or equal to the minimum jitter threshold and less than or equal to the maximum jitter threshold, keeping the current jitter buffer capacity unchanged.
16. The method for implementing a speech service according to claim 15, characterized in that increasing the jitter buffer capacity comprises:
setting a capacity step value on the RAN;
the increased jitter buffer capacity being the sum of the current jitter buffer capacity and the capacity step value,
and decreasing the jitter buffer capacity comprises:
setting a capacity step value on the RAN;
the decreased jitter buffer capacity being the difference between the current jitter buffer capacity and the capacity step value.
17. The method for implementing a speech service according to claim 15, characterized in that increasing the jitter buffer capacity comprises:
setting a capacity percentage on the RAN;
the increased jitter buffer capacity being the current jitter buffer capacity plus the product of that capacity and the capacity percentage,
and decreasing the jitter buffer capacity comprises:
setting a capacity percentage on the RAN;
the decreased jitter buffer capacity being the current jitter buffer capacity minus the product of that capacity and the capacity percentage.
18. The method for implementing a speech service according to claim 1, characterized in that the method further comprises: allocating, in the RAN, a jitter buffer of a certain capacity for temporarily storing speech frame data,
and in that, after the receiving RAN converts the speech packet data into speech frame data, the method further comprises:
the receiving RAN storing the speech frame data in the jitter buffer;
the receiving RAN taking the speech frame data out of the jitter buffer.
19. The method for implementing a speech service according to claim 1, characterized in that the packet network is a CDMA network.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2004100737457A CN100525297C (en) 2004-09-09 2004-09-09 Realization method of speech service

Publications (2)

Publication Number Publication Date
CN1747465A CN1747465A (en) 2006-03-15
CN100525297C true CN100525297C (en) 2009-08-05

Family

ID=36166791

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100737457A Expired - Fee Related CN100525297C (en) 2004-09-09 2004-09-09 Realization method of speech service

Country Status (1)

Country Link
CN (1) CN100525297C (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090805

Termination date: 20150909

EXPY Termination of patent right or utility model