CN100407168C - Disc buffer substitution algorithm in layered video request - Google Patents

Info

Publication number
CN100407168C
Authority
CN
China
Prior art keywords
program
algorithm
frequency
cycle
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN03134707XA
Other languages
Chinese (zh)
Other versions
CN1604054A (en)
Inventor
刘志明
张拥军
彭宇行
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN03134707XA priority Critical patent/CN100407168C/en
Publication of CN1604054A publication Critical patent/CN1604054A/en
Application granted granted Critical
Publication of CN100407168C publication Critical patent/CN100407168C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Abstract

The present invention relates to a disk Cache replacement algorithm for hierarchical (multi-level) video on demand, in which data storage adopts a hierarchical (multi-level) structure. The invention is characterized in that, during the initialization period of the video-on-demand system, the LFRU algorithm is adopted, which combines the time and frequency information of data accesses; after the system reaches a steady state, a time-cycle method based on the access-frequency statistics of the two preceding cycles and the current cycle is combined with a linear prediction method that predicts the access frequency of the next cycle at the end of the current cycle and the start of the next. Because it combines access frequency with access time, the algorithm of the invention adapts to changes in the access pattern and solves the Cache pollution problem, so that the Cache hit ratio is greatly improved, server resource consumption is greatly reduced, running and communication costs are greatly reduced, and reliability is enhanced.

Description

Disk Cache replacement algorithm for hierarchical video on demand
Technical field
The present invention relates to a disk Cache (buffer) replacement algorithm for hierarchical (multi-level) video on demand, and belongs to the fields of computer architecture, parallel computing, and network multimedia.
Background technology
Video on demand (VoD) is a multimedia service that combines computing, communication, and multimedia technologies, and it is widely used in office work, teaching, entertainment, and electronic commerce. The core device of a VoD system is the video server, which stores and manages a large number of video programs and must deliver many continuous bit streams to many users in real time. Because continuous media must be delivered continuously in real time, a large-scale VoD system needs a very large storage capacity. Storing everything on high-speed disk arrays is very expensive. Moreover, in a VoD service most requests concentrate on a small number of popular programs, so keeping rarely requested programs on disk arrays is uneconomical. The storage system of a VoD service should hold the large volume of video data and offer users more programs while keeping system cost as low as possible. A hierarchical (multi-level) storage system (Hierarchical Storage) is therefore a reasonable solution: it organizes physical storage media with different characteristics into a hierarchy to meet the system's requirements on capacity, bandwidth, and cost.
In a hierarchical VoD storage system, the storage system is generally composed of expensive, small-capacity, high-speed storage devices together with relatively cheap, large-capacity, low-speed storage devices. Main memory is usually called first-level storage, disk second-level storage, and optical disc or tape third-level storage. The process of dynamically copying popular programs from third-level storage to second-level storage, replacing unpopular programs there, is called disk Cache (disk caching).
The programs with the highest access probability, on which most user accesses concentrate, are generally called hot titles (hot video titles); programs with a very small access probability that go unaccessed for long periods are called cold titles (cold video titles); and programs whose access probability lies between the two are called warm titles (warm video titles). The access probability of a program changes over time: as it increases, a cold title can become a warm title, and a warm title can become a hot title. In the cluster-based parallel video server shown in Fig. 1, the VoD system stores cold titles in fragments on the data nodes, while hot and warm titles are stored whole on the service nodes. The task of the disk Cache is to dynamically copy popular programs from the data nodes to the service nodes. During operation, the disk Cache program therefore continuously collects statistics on user requests, selects programs with a higher request probability and copies them to the service nodes, and at the same time evicts programs with a low request probability from the service nodes.
When a large number of users request programs, the goal of the disk Cache is to raise its hit rate substantially so that hot data gets a faster response. Because video files are very large, each disk Cache miss consumes considerable server resources. An efficient Cache replacement algorithm is therefore very important to a VoD system, and its efficiency directly affects overall system efficiency and service capacity.
Traditional Cache replacement algorithms include LFU (least frequently used), LRU (least recently used), and the LRU variant LEAT (Longest Expected Access Time). LFU and LRU are the two most commonly used replacement algorithms, and their characteristics are complementary. LRU relies only on the time of the last access; it is sensitive to changes in the access pattern but ignores the global characteristics of data access, so it cannot perform disk Cache replacement well. LFU matches the locality of reference of video on demand but easily causes Cache pollution: data that was accessed many times in the past but is no longer used cannot be replaced out of the Cache in time, which lowers the hit rate. LRU and LFU thus represent two extremes. LEAT bases replacement on the expected access time of the video data; because the expected access time changes dynamically, computing it frequently adds system overhead.
Summary of the invention
The technical problem solved by the present invention is to overcome the deficiencies of existing Cache replacement algorithms and, taking into account the locality of user accesses and the large size of video files, to propose a disk Cache replacement algorithm that substantially raises the hit rate of very-large-scale VoD systems, thereby improving the response speed of the system while saving communication cost and increasing system reliability.
The technical solution of the present invention is a disk Cache (disk buffer) replacement algorithm for hierarchical (multi-level) video on demand, in which data storage adopts a hierarchical (multi-level) structure, characterized by comprising the following steps:
(1) during the initialization period of the video-on-demand system, adopting the algorithm LFRU (least frequency and recently used), which combines the time and frequency information of data accesses;
(2) after the system reaches a steady state, adopting the extended least frequently used algorithm ELFU (Extended Least Frequently Used), which combines a time-cycle method, based on the access-frequency statistics of the previous cycle and the current cycle, with a linear prediction method that predicts the access frequency of the next cycle at the end of the current cycle and the start of the next.
The formula of the LFRU algorithm is as follows: the program with the maximum $RFN_k$ value is selected as the replacement object, where
$$RFN_k = \frac{D - t}{D}\,(t - t_k) + \frac{t}{D}\cdot\frac{t - t_0}{c_k} + N_k$$
In the formula, $D$ is the length of the initial time period; $t$ is the time at which the algorithm runs, that is, the time of the current access; $t_k$ is the time of the most recent access to film $P_k$, with initial value $t_0$; and $N_k$ is the number of copies of program $k$ (a video program) on the service nodes (the servers that hold popular video programs). $RFN_k$ is the weighted sum of the access-time value and the access-frequency value plus the number of copies of program $k$ on the service nodes: the first term represents the access-time information, the second term the access-frequency information, and the third term the number of copies on the service nodes. The time information and the frequency information are both normalized to a time "distance" value, and their weighted sum serves as the comparison factor for replacement. This formula applies in the initial stage of the system; once the system enters the stable stage, the ELFU algorithm of step (2) is used.
The formula of the ELFU algorithm in step (2) is as follows: the program with the minimum combined frequency
$$ELFN_k = \frac{\frac{t}{T}\,\hat{W}_i^k + W_i^k}{N_k}, \quad 0 \le t \le T$$
is selected as the replacement object. The combined frequency $ELFN_k$ consists of two parts, a statistical value and a predicted value. In the formula, $t$ is the counter within the cycle, which is reset to zero at the start of each cycle; $\hat{W}_i^k$ is the predicted frequency of each program; $T$ is the length of one time cycle; and $N_k$ is the number of copies of program $k$ on the service nodes. The access count of each cycle starts from 0, so early in the cycle it cannot reflect the true access behaviour. Replacement is therefore decided by the sum of the weighted predicted access count and the counted access count, that is, by a statistical part and a predicted part. The weight of the predicted value becomes smaller and smaller over the cycle: the predicted part decreases linearly as $t$ increases, and by the end of the cycle replacement is decided by the counted access count. The access counts that decide replacement depend only on the current cycle and the two preceding cycles; no expired information is used, which avoids the Cache pollution problem.
The principle of the present invention is as follows. Given the characteristics of a VoD system, accesses to films are unstable and fluctuate in the initial stage, and the Cache is then more sensitive to access-time characteristics. In this period the replacement algorithm therefore incorporates the time information of data accesses, which strengthens its adaptability to changes in access. As time passes and system operation gradually stabilizes, the algorithm transitions to relying mainly on access-frequency information. This algorithm is called the LFRU algorithm. After the system is stable, the ELFU algorithm is adopted: a cycle method masks outdated statistics, and linear prediction accelerates the eviction of stale data, which solves the Cache pollution problem. The traditional LFU algorithm uses the count of all past accesses to a data object; when the data is no longer used, its past usage information still takes effect, so "stale" data remains in the Cache. This is the so-called Cache "pollution" problem. The ELFU algorithm instead uses the access-frequency statistics of the two preceding cycles and the current cycle to predict the access frequency of the next cycle, applying the prediction at the end of the cycle. It uses no expired information, and earlier access frequencies are masked, so disk Cache "pollution" is avoided. The two algorithms suit different operating stages of a VoD system, and the length of the initialization period can be set according to the system's operating environment. During that period system operation is not yet stable and the LFRU algorithm is used; after the system reaches a steady state, the ELFU algorithm is used. In both cases the unit of replacement is the whole file.
Compared with the prior art, the present invention has the following advantages: the LFRU algorithm combines the access frequency and access time information of data and so adapts to changes in the access pattern; the ELFU algorithm uses the cycle method and the prediction method to solve the Cache "pollution" problem of the LFU algorithm. The Cache hit rate is significantly improved, server resource consumption is reduced, running and communication costs are greatly reduced, and the reliability of the system is improved.
Description of drawings
Fig. 1 is a schematic diagram of the structure of a cluster-based parallel video server;
Fig. 2 is a schematic diagram of the division of time cycles and the associated parameters in the ELFU algorithm of the present invention;
Fig. 3 is a simulation comparison of the LFRU algorithm of the present invention with the traditional LFU and LRU algorithms;
Fig. 4 is a simulation comparison of the ELFU algorithm of the present invention with the traditional LFU algorithm;
Fig. 5 is a schematic diagram of the program access probability distribution under Zipf's law in the present invention.
Embodiment
Fig. 1 shows the structure of a cluster-based parallel video server. The algorithm of the present invention is applied in the management system of this server through the following steps:
1. Use LFRU in the initial stage of the VoD system
First, a transition interval D is defined. Within the interval D, the time information of accesses is used first, which improves the algorithm's adaptability to changes in the access pattern. As system operation stabilizes, the algorithm gradually transitions to using the frequency information of accesses.
In the LFRU algorithm, the data replacement policy is determined by the RFN value. The RFN value is the weighted sum of the access-time information and the access-frequency information (accesses per unit time) plus the number of copies of the program on the nodes. The time information is denoted R, the frequency information F, and the number of copies of the program on the service nodes N. The weight is a time-dependent function $F_D(t)$ with $0 \le F_D(t) \le 1$. The data object with the maximum RFN value is the victim of replacement.
Within the interval D, the time information and the frequency information are complementary: $RF = F_D(t) \cdot R + [1 - F_D(t)] \cdot F$. To adapt to a workload that moves from fluctuation to stability, $F_D(t)$ should have the following property: early in D, $F_D(t)$ is greater than $1 - F_D(t)$; late in D, $F_D(t)$ is less than $1 - F_D(t)$. The value of RF thus moves from R toward F, that is, from access-time information, which is adaptive, to frequency information, which is globally optimal.
Let the logical clock of the system be $t$, the time at which the algorithm runs, that is, the time of the current access. Each film has a timer $t_k$ (the time of the most recent access to film $P_k$), whose initial value is $t_0$, the system time taken as the reference starting point. Each program also has a counter $c_k$ (the number of accesses to film $P_k$).
Each film has an RFN value. When an access request arrives, the RFN values of all films are computed, and the film with the maximum RFN value is replaced out of the disk Cache. In this algorithm, $F_D(t) = (D - t)/D$ and $RFN_k = F_D(t) \cdot R_k + (1 - F_D(t)) \cdot F_k + N_k$. Clearly, when $t = 0$, $F_D(t) = 1$ and $RFN_k = R_k + N_k$; when $t = D$, $F_D(t) = 0$ and $RFN_k = F_k + N_k$. Within the interval D, $t$ runs from 0 to D. When $t > D$, the system has entered the steady state, the formula has served its purpose, and the system restarts its timing.
Here the time information of a program access is $R_k = t - t_k$, the distance from the program's most recent access to the present. The frequency information of a program access is $F_k = (t - t_0)/c_k$, the mean distance to the present over all the program's past accesses, which represents the average time between accesses to the film. The time information and the frequency information are both normalized to a time "distance" value, and their weighted sum serves as the comparison factor for replacement.
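The RFN computation described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names `rfn` and `pick_victim` and the tuple layout `(t_k, c_k, n_k)` are assumptions introduced here, and the sketch assumes that every candidate has been accessed at least once.

```python
def rfn(t, D, t_k, t_0, c_k, n_k):
    """RFN_k = (D - t)/D * (t - t_k) + t/D * (t - t_0)/c_k + N_k, for 0 <= t <= D."""
    if not 0 <= t <= D:
        raise ValueError("LFRU applies only inside the transition interval 0 <= t <= D")
    if c_k <= 0:
        raise ValueError("c_k must be positive (at least one recorded access)")
    weight = (D - t) / D          # F_D(t): 1 at t = 0, 0 at t = D
    recency = t - t_k             # R_k: distance since the last access
    mean_gap = (t - t_0) / c_k    # F_k: average distance between accesses
    return weight * recency + (1 - weight) * mean_gap + n_k


def pick_victim(t, D, t_0, programs):
    """programs maps a program id to (t_k, c_k, n_k); the largest RFN is replaced."""
    return max(programs, key=lambda pid: rfn(t, D, programs[pid][0], t_0,
                                             programs[pid][1], programs[pid][2]))
```

At t = 0 the score reduces to the recency term plus N_k, and at t = D to the mean-gap term plus N_k, matching the two boundary cases stated in the text.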
Combining the disk Cache with the disk Cache replacement algorithm gives the LFRU disk Cache management algorithm, whose steps are as follows:
(1) The user requests program $P_k$.
(2) If $P_k$ is in the disk Cache, start reading $P_k$ to serve the user.
(3) Update the statistics of $P_k$: $t_k = t$; $c_k = c_k + 1$.
(4) If $P_k$ is not in the disk Cache, serve it directly from the data node and at the same time perform the following steps.
(5) Update the statistics of $P_k$: $t_k = t$; $c_k = c_k + 1$.
(6) If the disk Cache has free space, copy $P_k$ to the service node with the least disk usage.
(7) Otherwise, compute the RFN value of every program in the disk Cache, where $N_k$ is the number of copies of the program on the service nodes:
$$RFN_k = \frac{D - t}{D}\,(t - t_k) + \frac{t}{D}\cdot\frac{t - t_0}{c_k} + N_k \qquad (1)$$
Select the program with the maximum $RFN_k$ as the replacement object.
(8) Copy $P_k$ to the service node that holds the replaced program (if there are several, choose the node with the lowest load).
(9) Update the statistic $N_k$: $N_k = N_k + 1$.
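The steps above can be sketched as a small simulation. This is a simplified sketch under stated assumptions, not the patented implementation: it models a single service node whose capacity is counted in whole programs, the class and field names (`LfruCache`, `Stats`) are hypothetical, and decrementing N_k when a copy is evicted is an assumption that the text does not state.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    t_k: float     # time of the most recent access
    c_k: int = 0   # number of accesses
    n_k: int = 0   # copies on service nodes (0 or 1 in this single-node sketch)


class LfruCache:
    def __init__(self, capacity, D, t_0=0.0):
        self.capacity, self.D, self.t_0 = capacity, D, t_0
        self.stats = {}      # program id -> Stats, kept for every requested program
        self.cached = set()  # program ids currently held in the disk cache

    def _rfn(self, pid, t):
        s = self.stats[pid]
        w = (self.D - t) / self.D
        return w * (t - s.t_k) + (1 - w) * (t - self.t_0) / s.c_k + s.n_k

    def request(self, pid, t):
        """Return True on a hit; on a miss the program is copied into the cache."""
        s = self.stats.setdefault(pid, Stats(t_k=self.t_0))
        hit = pid in self.cached
        if not hit and len(self.cached) >= self.capacity:
            # Step (7): evict the cached program with the largest RFN value.
            victim = max(self.cached, key=lambda p: self._rfn(p, t))
            self.cached.remove(victim)
            self.stats[victim].n_k -= 1   # assumption: the evicted copy is no longer counted
        s.t_k, s.c_k = t, s.c_k + 1       # steps (3) and (5)
        if not hit:
            self.cached.add(pid)          # steps (6) and (8)
            s.n_k += 1                    # step (9)
        return hit
```

The caller is expected to pass times with 0 <= t <= D, since the LFRU score is only defined inside the transition interval.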
The algorithm above shows that, in the system initialization stage, every requested program enters the disk Cache. To limit the chance that a newly accessed film enters the Cache, a threshold can be added to its RFN value.
Formula (1) shows that, within the period D, the algorithm transitions gradually from LRU to LFU as $t$ increases. Although both the LFRU algorithm and the LRFU algorithm are compromises between LFU and LRU, their emphases differ. LRFU multiplies the access time by a frequency-related weight and is a replacement algorithm based on access time, whereas LFRU multiplies the access frequency by a weight related to the access time and is based on access frequency. In addition, LRFU is essentially independent of time, whereas LFRU changes over time and finally becomes the LFU algorithm; its compromise acts only within the set time period, which is entirely different from LRFU.
2. After the system stabilizes, adopt the ELFU algorithm, which combines a time-cycle method, based on the access-frequency statistics of the previous cycle and the current cycle, with a linear prediction method that predicts the access frequency of the next cycle at the end of the current cycle and the start of the next.
After the VoD system enters its stable period, the Cache is more sensitive to access frequency. Film data in a VoD system has a long life cycle, and its access frequency changes in a regular way: the access frequency of a film passes through three stages, a growth period, a maturity period, and a decline period. Based on this, the time axis is divided into cycles of a fixed length, and replacement is decided using the access frequencies of the previous cycle and the current cycle. Earlier access frequencies are thereby masked, which avoids the Cache pollution problem of the LFU algorithm.
When the access frequencies of the previous cycle and the current cycle are used to predict the next cycle, the counted frequency of the current cycle is used in the middle of the cycle, and the prediction method is used at the end of the current cycle and the start of the next, which improves the hit rate of the replacement algorithm. The main idea is to exploit the trend of the access frequency: films in the growth period are swapped onto disk as early as possible, and films in the decline period are swapped off disk as early as possible. In other words, the ELFU algorithm uses the cycle method to phase out stale data and uses linear prediction to accelerate this selection, so that new hot titles, and warm titles about to become hot, enter the Cache.
The formula of the ELFU algorithm is derived as follows. Consider first a single program. Define a cycle length $T$ and let $T_i$ be the $i$-th cycle. Count the number of accesses $W_i$ to the program in the current cycle and the total number of accesses $F_i$ to the program before $T_i$, as shown in Fig. 2.
$$F_i = \sum_{j=0}^{i-1} W_j, \quad F_0 = 0, \quad W_i = F_{i+1} - F_i$$
For replacement, only the statistics of the previous cycle and the current cycle are used to decide the replacement in the next cycle. For prediction, define the increment $\Delta W$ of the access count between cycles, so that $W_i = W_{i-1} + \Delta W$. When $T_i$ is the current cycle, both $W_i$ and $\Delta W$ are unknown. In linear prediction, $\Delta W_{i-1}$ can be used to estimate $\Delta W_i$, that is, $\Delta\hat{W} = \Delta W_{i-1}$, so the predicted value of $W_i$ is $\hat{W}_i = W_{i-1} + \Delta W_{i-1}$.
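As a worked illustration of the linear prediction, suppose a program was accessed 40 times in cycle $i-2$ and 55 times in cycle $i-1$. These counts are chosen purely for illustration and do not come from the experiments reported in this document.

```latex
\begin{aligned}
\Delta W_{i-1} &= W_{i-1} - W_{i-2} = 55 - 40 = 15,\\
\Delta\hat{W}  &= \Delta W_{i-1} = 15,\\
\hat{W}_i      &= W_{i-1} + \Delta W_{i-1} = 55 + 15 = 70.
\end{aligned}
```

A program whose count fell from 55 to 40 would instead be predicted at $40 - 15 = 25$, which is how the prediction pushes declining programs toward eviction.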
In the algorithm, the global information kept for all programs does not store the per-cycle access count $W_i^k$ directly. Instead a triple $\langle F_i^k, F_{i-1}^k, F_{i-2}^k \rangle$ is kept, that is, the access counts relating to the previous cycle, the current cycle, and the next cycle, and the other required values are computed from these as functions of time.
The present invention predicts the access count in two situations:
(1) In the middle of the current cycle, that is, at a point within a cycle rather than at its boundary, the cycle method is used: replacement is not decided by the predicted access count alone but jointly with the statistics counted so far. Let the present time be $t$, let the total access count of the $k$-th film be $F^k$, and let the access count of the current cycle so far be $W_i$, with $W_i = F_{i+1} - F_i$.
The films in the disk Cache are divided into a rising set $S_a$ and a falling set $S_d$: $S_a = \{P_k \mid \Delta W_i^k > 0\}$; $S_d = \{P_k \mid \Delta W_i^k < 0\}$. Films on the data nodes that are in the rising stage and whose access count exceeds a threshold $\Omega$ are placed in the swap-in set $S_{in} = \{P_k \mid W_i^k > \Omega,\ \Delta W_i^k > 0\}$. When a request for the $k$-th film arrives and misses, and $P_k \in S_{in}$, the swap-in condition is met. If the disk has enough space, the film is swapped in; otherwise a film is swapped out in the following order:
(a) In $S_d$, take the access count of the current cycle so far plus the access count of the previous cycle ($W_{i-1} + W_i$), and let the film with the smallest sum be $P_{k'}$. If $W_i^k > W_i^{k'}$, swap out $P_{k'}$ and stop; otherwise go to (b);
(b) In $S_a$, take the access count of the current cycle so far plus the access count of the previous cycle ($W_{i-1} + W_i$), and let the film with the smallest sum be $P_{k'}$. If $W_i^k > W_i^{k'}$, swap out $P_{k'}$.
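The two-stage victim search in (a) and (b) can be sketched as follows. The function name `choose_victim` and the tuple layout `(w_prev, w_cur, delta)` are assumptions made for this illustration; films whose increment is exactly zero belong to neither set in the text and are therefore never chosen here.

```python
def choose_victim(request_w, cached):
    """Pick the film to swap out for a requested film that met the swap-in condition.

    request_w: W_i^k, the current-cycle access count of the requested film.
    cached:    maps a film id to (w_prev, w_cur, delta), i.e. W_{i-1}, W_i and delta W_i.
    Returns the film id to swap out, or None if no cached film should give way.
    """
    falling = {p: v for p, v in cached.items() if v[2] < 0}   # S_d
    rising = {p: v for p, v in cached.items() if v[2] > 0}    # S_a
    for group in (falling, rising):                           # (a) first, then (b)
        if not group:
            continue
        # Candidate with the smallest W_{i-1} + W_i in this set.
        cand = min(group, key=lambda p: group[p][0] + group[p][1])
        if request_w > group[cand][1]:
            return cand
    return None
```

Searching the falling set first means a declining film is always sacrificed before a rising one, which is the preference the text describes.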
(2) At the end of the current cycle, that is, at the boundary where one time cycle ends and the next begins, the Cache content is adjusted automatically using the predicted access counts even when there are no data requests (when the system is idle). This is called the preset method. The preset method predicts the access frequency of the next time cycle with a first-order linear rule, $W_{i+1}^k = W_i^k + \Delta W_i^k$, and defines a swap-in threshold $\Omega_{in}$ and a swap-out threshold $\Omega_{out}$. Films that satisfy the swap-in condition $W_{i+1}^k > \Omega_{in}$, $\Delta W_i^k > 0$ are placed in the swap-in set $S_{in}$; films that satisfy the swap-out condition $W_{i+1}^k < \Omega_{out}$, $\Delta W_i^k < 0$ are placed in the swap-out set $S_{out}$. At the end of the cycle, the films in the swap-out set are swapped out, and the films in the swap-in set are swapped in one by one in order of $W_{i+1}^k$ until the space freed by the swap-out set is used up.
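The preset method can be sketched as a function that builds the two sets at a cycle boundary. The name `preset_sets` and the tuple layout `(w_cur, delta)` are assumptions for illustration, and ordering the swap-in set from the largest predicted count downward is an assumed reading of "in order of $W_{i+1}^k$".

```python
def preset_sets(data_node, cached, omega_in, omega_out):
    """Build the swap-in and swap-out sets at the end of a cycle.

    data_node, cached: map a film id to (w_cur, delta), i.e. W_i^k and delta W_i^k.
    Returns (swap_in, swap_out); swap_in is ordered by predicted count, largest first.
    """
    def predicted(v):
        return v[0] + v[1]   # first-order rule: W_{i+1} = W_i + delta W_i

    swap_in = [p for p, v in data_node.items()
               if predicted(v) > omega_in and v[1] > 0]
    swap_in.sort(key=lambda p: predicted(data_node[p]), reverse=True)
    swap_out = [p for p, v in cached.items()
                if predicted(v) < omega_out and v[1] < 0]
    return swap_in, swap_out
```

A caller would then evict everything in `swap_out` and copy films from `swap_in` in order until the freed space runs out.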
Combining the two situations above, the time-cycle method and the linear prediction method, into one management algorithm gives the ELFU management algorithm for the disk Cache, whose steps are as follows:
(1) When $t = T_i$, compute the predicted frequency $\hat{W}_i^k$ of each program in the disk Cache, sort the programs by $\hat{W}_i^k$, and place them in the swap-out set $S_{out}$.
(2) The user requests program $P_k$.
(3) If $P_k$ is in the disk Cache, start reading $P_k$ to serve the user.
(4) Update the statistics of $P_k$: $W_i^k = W_i^k + 1$.
(5) If $P_k$ is not in the disk Cache, serve it directly from the data node and at the same time perform the following steps.
(6) Construct the data structure $\langle F_i^k, F_{i-1}^k, F_{i-2}^k \rangle$ and $W_i^k$ for $P_k$.
(7) If the disk Cache has free space, copy $P_k$ to the service node with the least disk usage.
(8) Otherwise, compute the ELFN value of every program in the swap-out set $S_{out}$, where $N_k$ is the number of copies of program $k$ on the service nodes:
$$ELFN_k = \frac{\frac{t}{T}\,\hat{W}_i^k + W_i^k}{N_k}, \quad 0 \le t \le T \qquad (2)$$
Re-sort the programs by ELFN value, and select the program with the minimum ELFN value in $S_{out}$ as the replacement object.
(9) Copy $P_k$ to the service node that holds the replaced program (if there are several, choose the node with the lowest load).
(10) Update the statistic $N_k$: $N_k = N_k + 1$.
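The ELFN score of formula (2) and the minimum-value selection of step (8) can be sketched as follows. The function names and the tuple layout are assumptions for illustration. The weight t/T on the predicted count and the division of the whole sum by N_k follow formula (2) as printed; the grouping under N_k is an assumed reading of the flattened formula.

```python
def elfn(t, T, w_hat, w_cur, n_k):
    """ELFN_k = (t/T * W_hat + W_cur) / N_k for 0 <= t <= T, as printed in formula (2)."""
    if not 0 <= t <= T:
        raise ValueError("t is the in-cycle counter and must satisfy 0 <= t <= T")
    if n_k <= 0:
        raise ValueError("n_k must be positive for a program held on a service node")
    return ((t / T) * w_hat + w_cur) / n_k


def pick_victim(t, T, swap_out):
    """swap_out maps a program id to (w_hat, w_cur, n_k); the smallest ELFN is replaced."""
    return min(swap_out, key=lambda pid: elfn(t, T, *swap_out[pid]))
```

Because the score is divided by the number of copies, a program that is already replicated on several service nodes becomes a likelier victim than an equally popular program with a single copy.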
The algorithm requires every program to have a triple; program $P_k$, for example, has $\langle F_i^k, F_{i-1}^k, F_{i-2}^k \rangle$, which records its access history. This data structure is kept whether or not the program is in the Cache. Programs in the Cache additionally have a data structure that counts accesses in the current cycle: if $P_k$ is in the Cache, $W_i^k$ is the access count of $P_k$ in the current cycle.
The predicted access frequency $\hat{W}_i^k$ of $P_k$ in the current cycle $T_i$ is computed from the triple as follows:
$$W_{i-1}^k = F_i^k - F_{i-1}^k, \quad W_{i-2}^k = F_{i-1}^k - F_{i-2}^k$$
$$\Delta W_{i-1}^k = W_{i-1}^k - W_{i-2}^k, \quad \Delta W_i^k = \Delta W_{i-1}^k$$
The predicted value of $W_i^k$ is then $\hat{W}_i^k = W_{i-1}^k + \Delta W_i^k$.
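The computation from the triple can be sketched directly. The function name `predict_from_triple` is an assumption introduced for illustration; the arithmetic follows the three relations above.

```python
def predict_from_triple(f_i, f_i1, f_i2):
    """Predict W_i from the cumulative counts <F_i, F_{i-1}, F_{i-2}>."""
    w_prev1 = f_i - f_i1          # W_{i-1}: accesses in the previous cycle
    w_prev2 = f_i1 - f_i2         # W_{i-2}: accesses in the cycle before that
    delta = w_prev1 - w_prev2     # delta W_{i-1}, used as the estimate of delta W_i
    return w_prev1 + delta        # predicted W_i
```

For a sharply declining program the linear rule can return a negative value. The text does not say how that case is handled, so a caller may wish to clamp the result at zero.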
Predicting the access frequency at the end of the current cycle and the start of the next accelerates the eviction of stale data. On the service nodes, the predicted values can be used to replace programs whose access frequency is falling; on the data nodes, programs with larger predicted values can also be sent proactively to the service nodes, which is the so-called push Cache. A push Cache can push data when the network load is low, which greatly benefits overall system performance. Push Cache has another use, namely the data node's recommendation of new films: new programs are pushed proactively to the service nodes so that the data is close to the users, which avoids possible network congestion.
3. Testing and evaluation of the algorithm of the present invention
To verify the effectiveness of the algorithm of the present invention, an event-driven simulation model was developed. The access sequence is produced by a random number generator whose probability distribution follows Zipf's law, which represents the access probability distribution of programs well. Under Zipf's law, suppose the system has $N$ programs sorted in descending order of access probability, $p_1, p_2, p_3, \dots, p_N$, where program $p_i$ has access probability $f_i$, $f_1 \ge f_2 \ge f_3 \ge \dots \ge f_N$, and the access probabilities of all programs sum to 1. Under Zipf's law, the access probability of the $k$-th program is given by:
$$f_k = \frac{c}{k^{(1-\alpha)}}, \quad c = \frac{1}{\sum_{i=1}^{N} \dfrac{1}{i^{(1-\alpha)}}} \qquad (3)$$
Here $\alpha$ is the skew factor of the program probability distribution, $0 < \alpha < 1$; the smaller its value, the stronger the locality of program accesses. Fig. 5 shows the program access probability distribution for $\alpha = 0.1$.
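A request generator of the kind described can be sketched with the Python standard library. This is an illustrative sketch rather than the simulator used in the experiments: the function names are assumptions, and the probabilities follow the usual Zipf form $f_k = c / k^{1-\alpha}$.

```python
import random


def zipf_probs(n, alpha):
    """Access probabilities f_1..f_n with f_k = c / k**(1 - alpha)."""
    weights = [1.0 / (k ** (1.0 - alpha)) for k in range(1, n + 1)]
    c = 1.0 / sum(weights)   # normalising constant so the probabilities sum to 1
    return [c * w for w in weights]


def sample_requests(n, alpha, count, seed=0):
    """Draw `count` program ranks (1-based) from the Zipf distribution."""
    rng = random.Random(seed)
    return rng.choices(range(1, n + 1), weights=zipf_probs(n, alpha), k=count)
```

Feeding the sampled ranks to a cache model one request at a time gives a reproducible access sequence, and lowering `alpha` concentrates requests on the top-ranked programs.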
In the simulation model used to verify the algorithm, the program library is represented by an ordered table, and the cache replacement algorithm is embodied in the ordering rule of the table. The experiments of the present invention consider only the hit rate of the replacement algorithm and the mean access time of program accesses.
To compare the efficiency of LFRU with that of LRU and LFU, the probability distribution of the access sequence is varied, including changes to the $\alpha$ value in Zipf's law and changes to the ordering of film access probabilities. The load used in the experiment changes sharply at the beginning and stabilizes later. The experiment mainly examines the adaptability of the LFRU algorithm in different stages of the system. As shown in Fig. 3, the simulation results show that the hit rate of LFRU is higher than those of both LFU and LRU. Early in the experiment LFRU is close to LRU, and late in the experiment it is close to LFU, which agrees with the theoretical analysis.
To compare the efficiency of the LFU and ELFU algorithms, the experiment mainly examines how well ELFU overcomes data staleness when the access pattern changes. At fixed intervals (for example one day), a random permutation is applied to the ordering of the program library to simulate the change of film access probabilities over time. The simulation result for this case is shown in Fig. 4. The experiment shows that the ELFU algorithm adapts well to changes in program access probability and overcomes the data staleness problem. The LFU algorithm is sensitive to changes in program access probability, and its hit rate fluctuates considerably when the access pattern changes. Under suitable load conditions, the hit rate of ELFU is about 30% higher than that of LFU.
The application test was carried out on a cluster-based VoD system (shown in Fig. 1) with 8 nodes (2 data nodes and 6 service nodes). In the implementation, a transition interval D (for example 15 days) is defined in the main program; within D the LFRU algorithm is used, and when the time $t$ exceeds D the ELFU algorithm is used. If the program a user requests is on a service node, it is served directly; otherwise it is served by a data node while the program is dynamically copied from the data node to a service node, that is, the dynamic disk Cache procedure is carried out. This is done by the function DynamicCache (file ID number, file path Var destination service node number, file path). The steps are as follows:
Function DynamicCache (file ID, file path, Var target service node number, file path)
1. From the database, count the number of requests for this file over the most recent period
2. From the database, look up the number of service nodes that hold this file
3. Compute the "copy Cache value" from the two values above
4. if "copy Cache value" > "standard Cache value required for a copy" then
       Determine the program to be replaced (minimum "copy Cache value" and not currently in use)
   else
       Return "no copy needed"
5. When a copy is needed, determine the file on the service node that is to be replaced
   if some service node has free disk space then
       Choose the service node with the lowest disk usage
   else (the disks of all service nodes are full)
       Compare the RFN or ELFN values of the programs and replace the least recently used program file
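The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the test system: the source does not give the formula for the "copy Cache value", so `copy_value` simply uses requests per existing copy; `ServiceNode`, `victim_score` and `in_use` are hypothetical names; and `victim_score` stands in for an RFN-style score where a higher value means a better candidate for replacement.

```python
from dataclasses import dataclass, field

@dataclass
class ServiceNode:
    node_id: int
    capacity: int                              # disk capacity, arbitrary units
    files: dict = field(default_factory=dict)  # file_id -> size on disk

    @property
    def used(self):
        return sum(self.files.values())

def copy_value(recent_requests, replica_count):
    # ASSUMPTION: formula not given in the source; requests per copy,
    # with +1 so that a file with no copy yet does not divide by zero.
    return recent_requests / (replica_count + 1)

def dynamic_cache(file_id, size, recent_requests, nodes, threshold,
                  victim_score, in_use=frozenset()):
    """Return (target_node_id, victim_file_id or None).

    Returns None when no copy is needed or no node can take the copy.
    """
    # Steps 1-3: request count, number of copies, resulting copy value.
    replicas = sum(1 for node in nodes if file_id in node.files)
    # Step 4: copy only when the value exceeds the threshold.
    if copy_value(recent_requests, replicas) <= threshold:
        return None
    targets = [node for node in nodes if file_id not in node.files]
    if not targets:
        return None
    # Step 5a: prefer a node with free space, lowest disk usage first.
    with_space = [n for n in targets if n.capacity - n.used >= size]
    if with_space:
        return min(with_space, key=lambda n: n.used).node_id, None
    # Step 5b: all disks full, replace the program with the highest score
    # among those not currently being served.
    evictable = [(n, f) for n in targets for f in n.files if f not in in_use]
    if not evictable:
        return None
    node, victim = max(evictable, key=lambda pair: victim_score(pair[0], pair[1]))
    return node.node_id, victim
```

The sketch evicts a single whole program and does not check that the freed space is large enough for the new copy; a real system would repeat the selection until the copy fits.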
The system was opened to about 100 users for one month of trial use with random requests. The system's film library was adjusted by 50 films to form different content combinations, and three experiments were carried out. The hit rate of the service-node Cache was recorded; the results are shown in Table 1.
Table 1
Figure C0313470700141
The results show that the LFRU and ELFU algorithms support a layered (multi-level) VoD application system well. The experimental results also show that the hit rate of the disk Cache is directly related to its size: for most requests to be satisfied from the disk Cache, the disk Cache must have sufficient capacity.

Claims (3)

1. A disk cache replacement algorithm for layered video on demand, in which data storage adopts a layered structure, characterized in that it comprises the following steps:
(1) during the initialization period of the video-on-demand system, the least frequently and recently used algorithm LFRU, which combines the time and frequency information of data accesses, is adopted;
(2) after the system reaches a steady state, the extended least frequently used algorithm ELFU is adopted, which combines the time-cycle method, using the access frequency statistics of the previous cycle and the current cycle, with a linear prediction method that predicts the access frequency of the next cycle before the accesses of the current cycle end and the next cycle begins; the ELFU algorithm takes as the replacement object the program with the minimum combined frequency

$$ELFN_k = \frac{\frac{t}{T}\,\hat{W}_i^k + W_i^k}{N_k} \qquad (0 \le t \le T)$$

where $t$ is the counter within the cycle, $\hat{W}_i^k$ is the predicted frequency of each program, $T$ is one time cycle, and $N_k$ is the number of copies of program K on the service nodes; a program is a video program, and a service node is a server that stores popular video programs.
2. The disk cache replacement algorithm for layered video on demand according to claim 1, characterized in that the formula of the LFRU algorithm in step (1) is: the program with the maximum $RFN_k$ is selected as the replacement object, where

$$RFN_k = \frac{D - t}{D}\,(t - t_k) + \frac{t}{D}\cdot\frac{t - t_0}{c_k} + N_k$$

in which $D$ is the initial period; $t$ is the time at which the algorithm runs, i.e. the time of the current access; $t_k$ is the time of the most recent access to film $P_k$, with initial value $t_0$; $c_k$ is the number of accesses to film $P_k$; and $N_k$ is the number of copies of program K on the service nodes. $RFN_k$ is the weighted sum of the access-time value and the access-frequency value, plus the number of copies of program K on the service nodes; a program is a video program, and a service node is a server that stores popular video programs.
3. The disk cache replacement algorithm for layered video on demand according to claim 2, characterized in that the replacement object is replaced with a whole program as the unit, where a program is a video program.
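The two scores named in claims 1 and 2 can be sketched in Python as follows. This is a hedged illustration rather than the claimed implementation: the function and parameter names are hypothetical, the predicted frequency is taken as an input because the prediction step is not shown, and ELFN is computed on the assumption that the whole sum of the weighted prediction and the current frequency is divided by the number of copies.

```python
def rfn(t, d, t_last, t_initial, count, copies):
    """LFRU score RFN_k; the program with the LARGEST value is replaced.

    t         current time (time of this access)
    d         length of the initial period D
    t_last    time of the most recent access to the program (t_k)
    t_initial initial value of t_k (t_0)
    count     number of accesses to the program (c_k), must be >= 1
    copies    number of copies on the service nodes (N_k)
    """
    recency = t - t_last                 # time since the last access
    mean_gap = (t - t_initial) / count   # average time per access
    return (d - t) / d * recency + t / d * mean_gap + copies

def elfn(t, period, predicted, current, copies):
    """ELFU combined frequency ELFN_k; the SMALLEST value is replaced.

    ASSUMPTION: the division by N_k applies to the whole numerator.
    """
    if not 0 <= t <= period:
        raise ValueError("t must lie within the current cycle")
    return (t / period * predicted + current) / copies

def pick_victim(scores, use_lfru):
    """scores maps program -> score; LFRU takes the max, ELFU the min."""
    chooser = max if use_lfru else min
    return chooser(scores, key=scores.get)

if __name__ == "__main__":
    # Within the initial period D = 15, LFRU scores are compared.
    lfru = {"A": rfn(5, 15, 3, 0, 5, 2), "B": rfn(5, 15, 1, 0, 1, 1)}
    print(pick_victim(lfru, use_lfru=True))
    # After D, ELFU scores are compared instead.
    elfu = {"A": elfn(5, 10, 8, 6, 2), "B": elfn(5, 10, 2, 1, 1)}
    print(pick_victim(elfu, use_lfru=False))
```

Early in the initial period the recency term dominates RFN, so the choice resembles LRU; as t approaches D the average-interval term dominates, which resembles LFU.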
CN03134707XA 2003-09-29 2003-09-29 Disc buffer substitution algorithm in layered video request Expired - Fee Related CN100407168C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN03134707XA CN100407168C (en) 2003-09-29 2003-09-29 Disc buffer substitution algorithm in layered video request

Publications (2)

Publication Number Publication Date
CN1604054A CN1604054A (en) 2005-04-06
CN100407168C true CN100407168C (en) 2008-07-30

Family

ID=34659079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN03134707XA Expired - Fee Related CN100407168C (en) 2003-09-29 2003-09-29 Disc buffer substitution algorithm in layered video request

Country Status (1)

Country Link
CN (1) CN100407168C (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101944068A (en) * 2010-08-23 2011-01-12 中国科学技术大学苏州研究院 Performance optimization method for sharing cache
CN102122303A (en) * 2011-03-15 2011-07-13 浪潮(北京)电子信息产业有限公司 Method for data migration, service system and sever equipment
CN102323898A (en) * 2011-09-02 2012-01-18 深圳中兴网信科技有限公司 Cache calling method and system
CN103973752A (en) * 2013-02-06 2014-08-06 上海华师京城高新技术开发有限公司 File storage system for cloud computing system
CN104021226B (en) * 2014-06-25 2018-01-02 华为技术有限公司 Prefetch the update method and device of rule
US9740635B2 (en) * 2015-03-12 2017-08-22 Intel Corporation Computing method and apparatus associated with context-aware management of a file cache
CN108153783B (en) * 2016-12-06 2020-10-02 腾讯科技(北京)有限公司 Data caching method and device
CN106909518B (en) * 2017-01-24 2020-06-26 朗坤智慧科技股份有限公司 Real-time data caching mechanism
CN107463514B (en) * 2017-08-16 2021-06-29 郑州云海信息技术有限公司 Data storage method and device
CN107563514A (en) * 2017-09-25 2018-01-09 郑州云海信息技术有限公司 A kind of method and device of prediction data access frequency

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5150472A (en) * 1989-10-20 1992-09-22 International Business Machines Corp. Cache management method and apparatus for shared, sequentially-accessed, data
US5381539A (en) * 1992-06-04 1995-01-10 Emc Corporation System and method for dynamically controlling cache management
CN1206150A (en) * 1996-12-24 1999-01-27 国际商业机器公司 Improved high-speed buffer storage system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Disk Cache replacement algorithm for large-scale video on demand. Li Yong, Peng Yuxing, Chen Fujie. Journal of Computer Research and Development, Vol. 37, No. 2, 2000 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee