CN104168332A - Load balance and node state monitoring method in high performance computing - Google Patents
- Publication number: CN104168332A
- Application number: CN201410440328.5A
- Authority: CN (China)
- Prior art keywords: node, load, performance, server, monitoring
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention relates to a load balancing and node state monitoring method for high-performance computing. The method aims to reduce the cluster resources consumed by load balancing itself, increase resource utilization, improve the performance of the server cluster, and provide high-quality service to users. Specifically, the method first computes a load weight for every node from the running state, load, and performance parameters of the server nodes, and selects a candidate node set for the next task allocation. It then computes, from the load differences and the allocation probability formula, the probability that each node in the candidate set receives a task, and assigns each new request to a node chosen by random probability allocation, which addresses uneven load distribution. Finally, it corrects the recorded load of the node that received the task using a load correction formula, improving the balancing effect and the reliability and stability of the cluster system.
Description
Technical field
The present invention relates to the field of load balancing in computer networks, and in particular to a load balancing and node state monitoring method for high-performance computing.
Background art
With the growing adoption of network applications and the rapid increase in Internet services, traffic has outgrown earlier estimates in enterprise networks, campus networks, and wide area networks alike. Enterprises depend more and more on the network, and demand for scalable, reliable distributed systems keeps rising. When an enterprise provides Web services, the web server must be able to serve a large number of concurrent requests as the number of visitors grows quickly. The data traffic and computational intensity are so large that a single device cannot cope. At the same time, traffic must be distributed sensibly among the multiple network devices that perform the same function, so that no device is overloaded while others are underused; this is an equally urgent problem. Load balancing mechanisms arose to address this situation. Enterprises also place ever higher requirements on response time, content, service reliability, and timeliness, and a website supported by a single server can no longer meet customer needs, so a group of servers takes its place.
Server load balancing (Server Load Balance) builds a server cluster out of multiple servers arranged symmetrically: every server has equal status and can provide service to the outside on its own. Using a specific load balancing technique, external requests are assigned to a particular server according to the load on each server in the cluster. This substantially speeds up data retrieval, raises the overall processing capacity of the servers, solves the problem of massive concurrent access, and improves reliability, availability, and maintainability. The ultimate goal is to shorten server response time and thereby improve the user experience. Such cluster technology can deliver performance close to that of a mainframe for a minimal investment.
A reasonable solution is server cluster load balancing. A server cluster connects many servers into a single whole over a local area network; the group of machines provides service to the outside and appears to clients as a traditional single server. A cluster system can make full use of existing resources, strengthen network data processing capacity, improve the overall service performance of the system, and make the system more reliable and stable. Solving the load balancing problem of the server cluster, however, is the key to improving cluster performance.
Load balancing technology is mainly used to improve the scalability and availability of mission-critical service programs running on servers. By implementation method it is usually divided into hardware implementations and software implementations. Hardware load balancers offer stronger processing capacity and load performance, but they are expensive. Although they integrate multiple load balancing algorithms, they are not very flexible and do not support more refined balancing strategies or more complex application protocols. In addition, they judge data traffic only at the network layer and cannot effectively track the state of servers and applications. Software implementations can distribute load better according to the state of the system and applications; they are flexible, cost-effective, and easy to upgrade to the latest and best balancing strategies. However, the load balancing algorithm itself affects server performance, so strict requirements are placed on the complexity of the algorithm.
In software implementations, the core component of cluster load balancing is the load balancing algorithm. According to design philosophy, algorithms fall into two classes: static and dynamic. A static algorithm assigns tasks with probabilities derived from fixed statistical information about the cluster and ignores the actual running state of the system, so its balancing effect is poor. A dynamic algorithm evaluates the load state of the system from real-time running information collected from the cluster and adjusts task allocation accordingly, which prevents load skew during long-running operation. Experiments show that dynamic algorithms outperform static ones and still perform reasonably well even in extreme situations. In general, a dynamic algorithm achieves a 30% to 40% performance improvement over a static algorithm, and as research on dynamic load balancing advances, dynamic techniques are expected to replace static ones.
However, dynamic load balancing must solve several problems: how to obtain system running information more conveniently, how to minimize the resources consumed by the information exchange, and how to reduce the impact of the balancing algorithm itself on the cluster system. It is therefore necessary to find a better load balancing algorithm that occupies as few cluster resources as possible, raises resource utilization, effectively improves server cluster performance, and provides high-quality service to users.
Summary of the invention
The primary technical problem solved by the present invention is to provide a load balancing and node state monitoring method for high-performance computing that reduces the cluster resources consumed, raises resource utilization, effectively improves server cluster performance, and provides high-quality service to users.
To solve the above technical problem, the load balancing and node state monitoring method provided by the invention uses the load and performance information of the server nodes to distribute the requests of the cluster system evenly across the nodes, so that the load carried by each node is proportional to its performance. System resources are thereby fully used and cluster performance is maximized.
Specifically: the load weight of each node is computed from the running state, load, and performance parameters of the server nodes, and a candidate node set for the next task allocation is selected. The probability that each node in the candidate set receives a task is computed from the load differences and the allocation probability formula, and each new request is assigned to a node selected by random probability allocation, which addresses uneven load distribution. Finally, a load correction formula is used to correct the load of the node that received the task, improving the balancing effect and strengthening the reliability and stability of the cluster system.
The load balancing algorithm based on load weights is described as follows:
1. Set a threshold ε.
2. Each time a new request arrives, check against a timer whether the state of the server nodes needs to be updated, and update it if so.
3. Compute the performance C(Si) and the load L(Si) of each node from its running state, and from these compute the load weight W(Si) and the load difference ΔL(Si).
4. Select the candidate allocation node set J according to the load weights of the nodes. First select the node Sm that satisfies W(Sm) = min{W(Si)}, i = 0, 1, ..., n-1.
If any other node Si satisfies W(Si) < W(Sm) + ε, i = 0, 1, ..., n-1,
add node Si to the set J.
5. Compute the load allocation probability P(Si) of each node in the candidate set J.
6. Select a node from J by random probability allocation and assign the task to it.
7. Correct the load of the selected node for use when allocating the next request.
The performance C(Si) of node Si is evaluated from the following indices of the server node: number of CPUs n, CPU frequency C(Ci), disk I/O rate C(Di), memory size C(Mi), and network bandwidth C(Ni). It is computed with formula (8.1):
$$C(S_i) = k_1 \cdot n \cdot C(C_i) + k_2 \cdot C(D_i) + k_3 \cdot C(M_i) + k_4 \cdot C(N_i) \qquad (8.1)$$
where k denotes the weighting parameters of the indices, which reflect how strongly each index influences server node performance.
The load L(Si) of node Si is evaluated from CPU utilization L(Ci), memory utilization L(Mi), disk I/O utilization L(Di), and network bandwidth utilization L(Ni), and is computed with formula (8.2):
$$L(S_i) = r_1 \cdot L(C_i) + r_2 \cdot L(M_i) + r_3 \cdot L(D_i) + r_4 \cdot L(N_i) \qquad (8.2)$$
where r denotes the weighting parameters of the indices, which reflect how strongly each index matters for different types of service.
The load weight W(Si) of a node is defined as the ratio of the node load L(Si) to the node performance C(Si), and is computed with formula (8.3):
$$W(S_i) = \frac{L(S_i)}{C(S_i)} \qquad (8.3)$$
The load difference ΔL(Si) of a node is defined as the difference between the maximum load weight WMAX over all nodes and the load weight of this node, multiplied by the performance of this node. It is computed with formula (8.4):
$$\Delta L(S_i) = \bigl(W_{MAX} - W(S_i)\bigr) \cdot C(S_i) \qquad (8.4)$$
The load allocation probability P(Si) of a node is the share of its load difference ΔL(Si) in the sum of the load differences of all nodes, computed with formula (8.5):
$$P(S_i) = \frac{\Delta L(S_i)}{\sum_{j=0}^{n-1} \Delta L(S_j)} \qquad (8.5)$$
The allocation probability is computed for the nodes in the set J, and a node is chosen from J to receive the load. The node allocation probability is computed with formula (8.6):
$$P(S_i) = \frac{\Delta L(S_i)}{\sum_{S_j \in J} \Delta L(S_j)}, \quad S_i \in J \qquad (8.6)$$
The load increment δ is the amount of load added to a server node when one request of a given service type is assigned to it. It is computed with formula (8.7):
$$\delta = \frac{L(S)}{N} \qquad (8.7)$$
where N is the number of requests of this service type on the node and L(S) is the load that the N requests place on the node.
The correction of server node load is computed with formula (8.8), where δ is the load increment, C(S) is the performance of the node used when measuring the load increment, and L(Si) and C(Si) are the load and performance of the node being corrected:
$$L'(S_i) = L(S_i) + \delta \cdot \frac{C(S)}{C(S_i)} \qquad (8.8)$$
Compared with the prior art, the invention has the following technical effect. Building on several existing load balancing algorithms, the method adopts a dynamic-feedback load balancing algorithm based on load weights. It mainly uses factors such as the processing capacity and actual load of each node, introduces the concepts of load weight and load difference, and uses them to guide task allocation, so that each server node works as far as possible according to its capacity. This exploits the advantages of the cluster system, ensures system stability, and improves reliability and availability. Using mainly the load and performance information of the server nodes, the method assigns the requests of the cluster system to the nodes as evenly, fairly, and reasonably as possible, keeps the load carried by each node proportional to its performance, makes full use of system resources, and maximizes cluster performance.
Brief description of the drawings
To explain the principle of the invention and its technical advantages over existing products clearly, possible embodiments are described below with reference to the accompanying drawings, as non-limiting examples of applying that principle. In the drawings:
Fig. 1 shows the probability space distribution of the candidate allocation nodes of the invention;
Fig. 2 shows the dynamic feedback model of the invention.
Embodiment
The load balancing and node state monitoring method for high-performance computing of the invention adopts a dynamic-feedback load balancing algorithm based on load weights. It mainly uses factors such as the processing capacity and actual load of each node, introduces the concepts of load weight and load difference, and uses them to guide task allocation, so that each server node works as far as possible according to its capacity, the advantages of the cluster system are exploited, system stability is ensured, and reliability and availability are improved.
The dynamic-feedback load balancing algorithm mainly uses the load and performance information of the server nodes to assign the requests of the cluster system to the nodes as evenly, fairly, and reasonably as possible. It keeps the load carried by each node proportional to its performance, so that system resources are fully used and cluster performance is maximized.
The core idea of the dynamic-feedback load balancing algorithm is as follows. Taking into account the running state, load, and performance parameters of the server nodes, it computes the load weight of each node and selects a candidate node set for the next task allocation. Using the formulas for load difference and allocation probability, it computes the probability that each node in the candidate set receives a task and assigns each new request to a node selected by random probability allocation, which largely resolves uneven load distribution. Finally, it corrects the load of the node that received the task using the load correction formula, improving the balancing effect as far as possible and strengthening the reliability and stability of the cluster system.
The load balancing algorithm based on load weights is described as follows:
1. Set a threshold ε (the range of admissible values for ε and the basis for choosing its size need to be specified here).
2. Each time a new request arrives, check against a timer whether the state of the server nodes needs to be updated, and update it if so (the conditions or basis for "needing to update the state of the server nodes" need to be specified here).
3. Compute the performance C(Si) and the load L(Si) of each node from its running state, and from these compute the load weight W(Si), the load difference ΔL(Si), and so on.
4. Select the candidate allocation node set J according to the load weights of the nodes. First select the node Sm that satisfies W(Sm) = min{W(Si)}, i = 0, 1, ..., n-1.
If any other node Si satisfies W(Si) < W(Sm) + ε, i = 0, 1, ..., n-1,
add node Si to the set J.
5. Use formula (8.6) to compute the load allocation probability P(Si) of each node in the candidate set J.
6. Select a node from J by random probability allocation and assign the task to it.
7. Use formula (8.8) to correct the load of the selected node for use when allocating the next request.
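The seven steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `WeightedDispatcher`, the `query_loads` probe, the injected `clock`, and the fallback to a uniform choice when every load difference is zero are all hypothetical, and the step 7 correction assumes the form L(Si) + δ·C(S)/C(Si), which is one reading of the symbol definitions given for formula (8.8).

```python
import random
import time


class WeightedDispatcher:
    """Illustrative dispatcher following steps 1 to 7 (all names are hypothetical)."""

    def __init__(self, perfs, query_loads, eps, delta, perf_ref,
                 period_s, clock=time.monotonic, rng=random):
        if eps <= 0:
            raise ValueError("eps must be positive so that J is never empty")
        self.perfs = dict(perfs)          # node -> C(Si), fixed by hardware
        self.query_loads = query_loads    # callable returning node -> L(Si)
        self.eps = eps                    # step 1: threshold
        self.delta = delta                # load increment measured beforehand
        self.perf_ref = perf_ref          # C(S) of the node used to measure delta
        self.period_s = period_s
        self.clock = clock
        self.rng = rng
        self.loads = dict(query_loads())
        self.last_update = clock()

    def dispatch(self):
        # Step 2: refresh node state only when the update period has expired.
        now = self.clock()
        if now - self.last_update >= self.period_s:
            self.loads = dict(self.query_loads())
            self.last_update = now
        # Step 3: load weights W(Si) = L(Si) / C(Si).
        w = {n: self.loads[n] / self.perfs[n] for n in self.perfs}
        w_min, w_max = min(w.values()), max(w.values())
        # Step 4: candidate set J of nodes within eps of the minimum weight.
        j = [n for n in w if w[n] < w_min + self.eps]
        # Step 5: allocation probabilities from load differences, restricted to J.
        diffs = [(w_max - w[n]) * self.perfs[n] for n in j]
        total = sum(diffs)
        if total > 0:
            probs = [d / total for d in diffs]
        else:
            probs = [1.0 / len(j)] * len(j)   # assumption: uniform fallback
        # Step 6: random probability allocation over the cumulative distribution.
        x, acc, chosen = self.rng.random(), 0.0, j[-1]
        for node, p in zip(j, probs):
            acc += p
            if x < acc:
                chosen = node
                break
        # Step 7: correct the recorded load of the chosen node (assumed form).
        self.loads[chosen] += self.delta * self.perf_ref / self.perfs[chosen]
        return chosen
```

A caller would construct one dispatcher per service type and call `dispatch()` once per incoming request; the clock and the random source are injected only so that the behavior can be checked deterministically.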
The dynamic-feedback load balancing algorithm approaches the problem from two aspects of a node, load and performance, and combines them in measuring node state and distributing load. The computation of node load and performance is therefore particularly important. The nodes S = (S1, S2, ..., Sn) of a cluster system are often heterogeneous: server hardware configurations vary widely, especially in clusters built from old equipment or expanded step by step, and processing capacity differs greatly. Given these hardware differences, the load state of the nodes cannot be treated uniformly. Otherwise, high-configuration nodes would sit idle for long periods while low-configuration nodes would degrade system performance through overload, or even crash.
We therefore define the performance of a server node to distinguish the processing capacity of different nodes, and treat nodes differently according to their performance when allocating load, so that more capable nodes do more work and the load is balanced. Server node performance can be computed in several ways. For example, the same computing task can be run on different servers and node performance characterized by the time needed to finish it. This method is targeted and fairly accurate, but because the computing task is fixed, the resulting performance figure does not generalize well and cannot be tuned freely. The most common approach derives node performance from the server's hardware configuration parameters through a specific calculation (such as a weighted sum). Performance computed this way is more general, leaves ample room for tuning, and is easy to update while the system is running.
Because cluster nodes are heterogeneous, different hardware parameters contribute very differently to performance. A weighted sum of hardware parameters is therefore used here. The performance C(Si) of node Si is evaluated mainly from the number of CPUs n, CPU frequency C(Ci), disk I/O rate C(Di), memory size C(Mi), and network bandwidth C(Ni) of the server node, using formula (8.1):
$$C(S_i) = k_1 \cdot n \cdot C(C_i) + k_2 \cdot C(D_i) + k_3 \cdot C(M_i) + k_4 \cdot C(N_i) \qquad (8.1)$$
Here k denotes the weighting parameters of the indices, which reflect how strongly each index influences server node performance. Server performance also depends on the type of service the server provides, because different services depend on the indices to different degrees. For example, a hypertext transfer service (HTTP service) mainly places high demands on CPU speed and memory, while a file transfer service (FTP service) stresses the server's disk I/O and network throughput.
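As an illustration of the weighted sum in formula (8.1), the sketch below computes C(Si) in Python. It rests on assumptions that the text does not state: the CPU term is taken as count times frequency, every index is normalized by a hypothetical reference node so that the different units become comparable, and the example weights (which favor CPU and memory, as the text suggests for an HTTP service) are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hardware:
    """Hardware indices of one server node (field names are illustrative)."""
    cpu_count: int        # n
    cpu_freq_ghz: float   # C(Ci)
    disk_io_mbps: float   # C(Di)
    memory_gb: float      # C(Mi)
    net_mbps: float       # C(Ni)


def node_performance(hw, ref, k=(0.4, 0.1, 0.4, 0.1)):
    """Weighted sum of hardware indices, each normalized by a reference node.

    The normalization and the default weights are assumptions made so that
    the example is runnable; they are not taken from the patent text.
    """
    k_cpu, k_disk, k_mem, k_net = k
    cpu_term = (hw.cpu_count * hw.cpu_freq_ghz) / (ref.cpu_count * ref.cpu_freq_ghz)
    return (
        k_cpu * cpu_term
        + k_disk * hw.disk_io_mbps / ref.disk_io_mbps
        + k_mem * hw.memory_gb / ref.memory_gb
        + k_net * hw.net_mbps / ref.net_mbps
    )


# Two hypothetical nodes: BIG has twice the CPUs and memory of REF.
REF = Hardware(cpu_count=4, cpu_freq_ghz=2.0, disk_io_mbps=200.0,
               memory_gb=16.0, net_mbps=1000.0)
BIG = Hardware(cpu_count=8, cpu_freq_ghz=2.0, disk_io_mbps=200.0,
               memory_gb=32.0, net_mbps=1000.0)
```

With these weights the reference node scores about 1.0 and `BIG` about 1.8; choosing FTP-style weights that favor disk and network would shrink that gap, which is the dependence on service type described above.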
The indices used to evaluate server node performance are relatively fixed and rarely change once the hardware configuration is settled. The load state of a node, however, changes constantly and is strongly affected by many factors; it is a key factor in cluster load balancing. The ultimate goal of the cluster system is to adjust the distribution of cluster traffic promptly and accurately according to the load state of the nodes, achieving overall load balance. A standard for measuring load state must therefore be defined, and enough load information collected to compute the load state of each node and then guide a reasonable allocation of cluster tasks.
There are many ways to compute node load state, and they have matured along with dynamic load balancing algorithms. Initially, the number of processes was used as the index of server load. Modern operating systems, however, are generally multi-user, multi-process systems; even a fairly idle node may run many processes, such as critical system processes or the daemons of other services. Measuring node load by process count is therefore inaccurate.
The approach now generally agreed upon evaluates the load state of a node from the utilization of the indices that most strongly affect node performance; in special applications, other specific factors must also be considered. Such a calculation can be adopted here, with indices chosen to match those used for node performance: the load L(Si) of node Si is evaluated mainly from CPU utilization L(Ci), memory utilization L(Mi), disk I/O utilization L(Di), and network bandwidth utilization L(Ni), using formula (8.2):
$$L(S_i) = r_1 \cdot L(C_i) + r_2 \cdot L(M_i) + r_3 \cdot L(D_i) + r_4 \cdot L(N_i) \qquad (8.2)$$
As in formula (8.1) for node performance, r denotes the weighting parameters of the indices and reflects how strongly each index matters for different types of service. The weighting parameters depend on the service type, and different service types use different values. Moreover, these parameters are not identical to k, and in practice they can be tuned according to how the system is running, in order to achieve a better balancing effect.
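The load calculation can be sketched the same way. The function below is a minimal illustration that assumes each utilization is already expressed as a fraction in [0, 1]; the function name and the default weights `r` are hypothetical and would be tuned per service type, as the text describes.

```python
def node_load(cpu, mem, disk, net, r=(0.4, 0.3, 0.2, 0.1)):
    """Weighted sum of utilizations L(Ci), L(Mi), L(Di), L(Ni).

    All four inputs are fractions in [0, 1]; the weights r are example
    values chosen for illustration, not values given in the source.
    """
    utilizations = (cpu, mem, disk, net)
    if any(not 0.0 <= u <= 1.0 for u in utilizations):
        raise ValueError("utilizations must lie in [0, 1]")
    return sum(weight * u for weight, u in zip(r, utilizations))
```

Because the weights in this example sum to 1, the result stays in [0, 1], which keeps loads from nodes with different hardware on a common scale before they are divided by node performance.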
Load weights and load difference:
With the load and performance evaluation methods for a server defined, the load state and performance of each node can be obtained. How to use this information to guide the distribution of cluster traffic then becomes one of the problems the algorithm must solve. The simplest approach selects a node for a newly arrived task directly from the computed load and performance. For example, a subset of lightly loaded nodes is filtered out by load state, the probability of assigning the new request to each of them is computed from node performance, and finally a node is chosen by a specific selection method and the task is dispatched to it. This approach is simple in principle and easy to implement, but it does not perform very well, because the load state of a node is tied to its performance. For example, two nodes with different performance but the same load state have different remaining processing capacity, so the busy or idle state of a server cannot be determined from load state alone.
Therefore, to allocate cluster tasks in a more balanced and reasonable way, the concept of node load weight is introduced to compute a combined load figure that reflects more accurately how busy a server node is. The load weight W(Si) of a node is defined as the ratio of the node load L(Si) to the node performance C(Si), computed with formula (8.3). The larger the weight, the busier the node and the smaller its remaining processing capacity; conversely, the smaller the weight, the more idle the node and the larger its remaining capacity. The load weight can thus be used to judge how busy a server is, gauge its remaining processing capacity, and guide task allocation.
$$W(S_i) = \frac{L(S_i)}{C(S_i)} \qquad (8.3)$$
The load weight reflects whether a node is busy and gives a clear picture of the load state of each node, but the weight alone does not quantify how much capacity the node has left. To express the degree of busyness more precisely and compute the remaining processing capacity of a node soundly, the concept of load difference is introduced. The load difference ΔL(Si) of a node is defined as the difference between the maximum load weight WMAX over all nodes and the load weight of this node, multiplied by the performance of this node, computed with formula (8.4):
$$\Delta L(S_i) = \bigl(W_{MAX} - W(S_i)\bigr) \cdot C(S_i) \qquad (8.4)$$
By definition, the load difference of the busiest server node is zero. The load difference reflects the remaining processing capacity of each node, so using it gives a more accurate result when computing the probability of assigning load to a node.
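The two definitions translate directly into code. The sketch below follows formulas (8.3) and (8.4) as described in the text; the function names and the dictionary-based representation of nodes are illustrative choices.

```python
def load_weights(loads, perfs):
    """W(Si) = L(Si) / C(Si) for every node."""
    return {node: loads[node] / perfs[node] for node in loads}


def load_differences(loads, perfs):
    """Delta L(Si) = (W_MAX - W(Si)) * C(Si) for every node."""
    weights = load_weights(loads, perfs)
    w_max = max(weights.values())
    return {node: (w_max - weights[node]) * perfs[node] for node in weights}


# Hypothetical two-node cluster: node "b" has twice the performance of "a".
example_loads = {"a": 0.6, "b": 0.3}
example_perfs = {"a": 1.0, "b": 2.0}
```

In this example node "a" has the larger weight (0.6 against 0.15), so its load difference is zero, while node "b" has a difference of about 0.9, reflecting both its lower weight and its higher performance.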
Allocation probability calculation and random probability allocation:
Probabilistic allocation is a relatively fair way to distribute load. To distribute system load more evenly, the allocation probability of each server node is computed first, and a node is then selected according to the probabilities and the chosen selection method. To make the allocation more reasonable, the load allocation probability P(Si) of a node is defined as the share of its load difference ΔL(Si) in the sum of the load differences of all nodes, computed with formula (8.5):
$$P(S_i) = \frac{\Delta L(S_i)}{\sum_{j=0}^{n-1} \Delta L(S_j)} \qquad (8.5)$$
In practice, however, cluster systems are usually large and have many server nodes. To reduce the system resources consumed in computing allocation probabilities and to make the allocation more sound, the probabilities are not computed for all nodes. Instead, the nodes with lower load weights are selected to form the candidate allocation node set J, the allocation probability is computed for the nodes in J, and a node is chosen from J to receive the load. The improved node allocation probability is computed with formula (8.6):
$$P(S_i) = \frac{\Delta L(S_i)}{\sum_{S_j \in J} \Delta L(S_j)}, \quad S_i \in J \qquad (8.6)$$
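A short sketch of the candidate set and the restricted probabilities follows. It assumes that WMAX is still taken over all nodes, as in the definition of the load difference, and it adds one behavior the text does not specify: when every candidate has a zero load difference (all nodes equally busy), the probabilities fall back to a uniform split. The function names are hypothetical.

```python
def candidate_set(weights, eps):
    """Nodes whose load weight lies within eps of the minimum weight."""
    if eps <= 0:
        # With a strict '<' test, eps <= 0 would exclude even the minimum node.
        raise ValueError("eps must be positive")
    w_min = min(weights.values())
    return [node for node, w in weights.items() if w < w_min + eps]


def allocation_probabilities(weights, perfs, eps):
    """P(Si) over the candidate set J, from load differences restricted to J."""
    j = candidate_set(weights, eps)
    w_max = max(weights.values())          # maximum over ALL nodes
    diffs = {node: (w_max - weights[node]) * perfs[node] for node in j}
    total = sum(diffs.values())
    if total == 0:
        # Assumption: equal split when no candidate has spare capacity.
        return {node: 1.0 / len(j) for node in j}
    return {node: d / total for node, d in diffs.items()}
```

For weights `{"s0": 0.6, "s1": 0.15, "s2": 0.2}`, performances `{"s0": 1.0, "s1": 2.0, "s2": 1.0}`, and `eps = 0.1`, the candidate set contains only `s1` and `s2`, and `s1` receives the larger probability because it has both the lower weight and the higher performance.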
There are many ways to select the node from the allocation probabilities P(Si) of the nodes in J. The most common is to pick the node with the highest probability, but this may cause the load of that node to rise sharply, after which the node with the next highest probability is picked, and so on, producing an uneven allocation. To avoid such a fixed allocation pattern, the node is selected by random probability allocation.
Random probability allocation, put simply, determines the node that receives a task from the allocation probabilities of the nodes and a random number generated by the system. Suppose the set J contains n nodes and the allocation probability of each node Si, obtained from formula (8.6), is P(Si). The allocation probabilities of the n nodes sum to 1, and the probability space they form is shown in Fig. 1.
When a new request arrives, the system generates a random number in the interval [0, 1] and, according to where this number falls in the probability space of the candidate nodes, selects the corresponding server node to receive the request. This achieves random allocation, avoids a fixed allocation pattern, and reduces the likelihood of load skew.
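The drop-point selection described above is commonly implemented as a lookup in the cumulative distribution. The sketch below is one such implementation, assuming the probabilities are given as a dictionary in a fixed order; the function name `pick_node` and the injectable `rng` parameter are illustrative.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pick_node(probs, rng=random):
    """Select a node by where a random number falls in the probability space.

    probs maps node -> P(Si); rng is any object with a random() method
    returning a float in [0, 1).
    """
    nodes = list(probs)
    cumulative = list(accumulate(probs[node] for node in nodes))
    # Scale by the final cumulative value to tolerate rounding in the sum.
    x = rng.random() * cumulative[-1]
    # First interval whose upper bound exceeds x; zero-width intervals are skipped.
    index = bisect_right(cumulative, x)
    return nodes[min(index, len(nodes) - 1)]
```

Over many requests each node is chosen in proportion to its probability, so no single node absorbs every new request between two state updates.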
Dynamic feedback and load correction:
The ultimate goal of the cluster system is to allocate tasks reasonably to the server nodes. Ideally, the state of every node would be queried on each allocation and a suitable node then selected. Although this is the most accurate approach, it adds system overhead and lowers cluster performance; during peak request periods in particular, querying node state greatly increases response time. The cluster system can therefore only update node state information periodically. Within an update period, the actual load of each server node may deviate from the value recorded on the load dispatcher at the last update. Hence, within an update period, the load of a node must be corrected promptly after a new task is assigned to it, so that the deviation does not grow too large. When the update period expires, the records kept by the load balancer are refreshed by querying the nodes. This "query, update, correct" cycle is called the dynamic feedback mechanism.
Fig. 2 shows the dynamic feedback model of the dynamic-feedback load balancing algorithm based on load weights. F(L, C) represents the load regulation module of the cluster system. It computes the load weights, allocation probabilities, and related values of the nodes from their load and performance, and supplies them to the load dispatcher, which processes client requests and completes their allocation. The server cluster consists of many heterogeneous server nodes, which respond to the requests sent by the dispatcher and periodically feed their state information back to the load regulation module.
Because the load information of every node cannot be queried in real time within an update period, a good method is needed to correct the load state of the nodes promptly. To this end, the concept of load increment is introduced to predict node load and ensure that the system does not become skewed within an update period. The load increment δ is the amount of load added to a server node when one request of a given service type is assigned to it, computed with formula (8.7):
$$\delta = \frac{L(S)}{N} \qquad (8.7)$$
Here N is the number of requests of this service type on the node and L(S) is the load that the N requests place on the node. To improve system stability, the load increment can be adjusted dynamically while the system is running. When measuring the load increment, to reduce the influence of other factors, the server node should provide only a single type of service, the number of requests should reach a sufficient order of magnitude, and methods such as averaging over repeated measurements should be used to ensure that the computed load increment is valid.
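The measurement procedure can be illustrated with a small helper that applies δ = L(S)/N to each measurement run and averages the results. The function name, the tuple layout of a measurement, and the sample figures are hypothetical; the text only requires a single service type, a sufficiently large N, and repeated measurements.

```python
from statistics import mean


def load_increment(measurements):
    """Average per-request load increment over repeated measurement runs.

    measurements is an iterable of (load, request_count) pairs, where load is
    the load L(S) that request_count requests of one service type placed on
    the measurement node.
    """
    per_run = [load / count for load, count in measurements if count > 0]
    if not per_run:
        raise ValueError("at least one run with a positive request count is required")
    return mean(per_run)


# Hypothetical runs on one node serving a single service type.
sample_runs = [(0.50, 1000), (0.30, 500)]
```

Here the two runs give 0.0005 and 0.0006 per request, so the estimate is about 0.00055; re-running the measurement after each state update is one way to realize the dynamic adjustment the text mentions.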
Once the load increment of a service has been obtained, within an update period the increment would be added to the load of a node that receives a new task and subtracted from the load of a node that completes a task; only then would the recorded values reflect the real-time load of each node more faithfully. This approach is fairly simple and corrects node load well, but it requires either real-time monitoring of task completion on every node or having each node report task completion to the load dispatcher. Either option burdens the service nodes, consumes network bandwidth, and lengthens the response time of the cluster system. To avoid this while still correcting node load information, the dynamic-feedback load balancing algorithm based on load weights does not monitor task completion in real time, but instead accounts for it in the dynamic adjustment of the load increment.
Within an update cycle, if a node completes more tasks than it is newly assigned, its total load decreases. When the update cycle ends and the node load is refreshed by query, the new load is lower than at the previous update, so the corrected load increment also decreases. In the next update cycle this compensates for not having reduced the node load when tasks completed, and achieves the same effect of lowering the recorded server load. The method assumes that requests to the cluster change slowly, without sharp fluctuations. When the request pattern changes, the load increment corrected after the next node load update changes with it and regulates the node load well in the following cycle. This largely compensates for not monitoring task completion in real time and achieves the same regulating effect with lower system overhead. The correction of server node load is therefore calculated with formula (8.8), where δ is the load increment, C(S) is the performance of the node used when the load increment was measured, and L(Si) and C(Si) are the load and performance of the node being corrected.
……(8.8)。
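The load correction step might look like the sketch below. The body of formula (8.8) does not appear in this text, so the expression used here, L(Si) + δ · C(S) / C(Si), is an assumption inferred only from the symbol definitions above: the increment measured on a reference node of performance C(S) is scaled by the relative performance of the target node. The function name `corrected_load` is likewise hypothetical.

```python
def corrected_load(load_i, perf_i, delta, perf_ref):
    """Return the predicted load of node Si after one more request is assigned.

    ASSUMED form of formula (8.8): L(Si) + delta * C(S) / C(Si).
    load_i   -- current recorded load L(Si)
    perf_i   -- performance C(Si) of the node receiving the request
    delta    -- load increment measured on the reference node
    perf_ref -- performance C(S) of that reference node
    """
    if perf_i <= 0:
        raise ValueError("node performance must be positive")
    return load_i + delta * perf_ref / perf_i


# Hypothetical values: the target node is twice as capable as the reference
# node, so one request adds only half of the measured increment.
new_load = corrected_load(load_i=0.5, perf_i=2.0, delta=0.002, perf_ref=1.0)
```

Under this assumed form, a more capable node accumulates predicted load more slowly between updates, which matches the stated goal of keeping the recorded load close to the real load without per-task completion reports.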
To address the shortcomings of traditional balancing algorithms, the load-balancing algorithm based on load weights makes combined use of server node information such as performance and load. It guides task allocation through the calculation of load weights and allocation probabilities, which improves the load-balancing effect, and it prevents the system from becoming skewed through load correction, which improves system stability.
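The selection part of the algorithm can be sketched as below, using the verbal definitions given in the claims: load weight W = L / C, candidate set J of nodes with W < W_min + ε, load difference ΔL = (W_max − W) · C, and allocation probability equal to a node's share of the summed load differences. Normalizing over the candidate set J, the uniform fallback when all differences are zero, and all function names and sample values are assumptions made for illustration.

```python
import random


def allocation_probabilities(nodes, eps):
    """Map each candidate node to its allocation probability.

    nodes -- dict of name -> (load L, performance C)
    eps   -- threshold epsilon; must be positive so the minimum-weight node
             always qualifies for the candidate set J
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    weights = {name: load / perf for name, (load, perf) in nodes.items()}
    w_min = min(weights.values())
    w_max = max(weights.values())  # maximum over all nodes, not only J
    candidates = [name for name, w in weights.items() if w < w_min + eps]
    diffs = {name: (w_max - weights[name]) * nodes[name][1] for name in candidates}
    total = sum(diffs.values())
    if total == 0:
        # Assumed fallback: every node has the same weight, so choose uniformly.
        return {name: 1.0 / len(candidates) for name in candidates}
    return {name: d / total for name, d in diffs.items()}


def pick_node(nodes, eps, rng=None):
    """Randomly select one candidate node according to its probability."""
    rng = rng or random
    probs = allocation_probabilities(nodes, eps)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Hypothetical cluster: (load, performance) per node.
cluster = {"a": (0.2, 1.0), "b": (0.5, 2.0), "c": (0.9, 1.0)}
probs = allocation_probabilities(cluster, eps=0.1)
chosen = pick_node(cluster, eps=0.1, rng=random.Random(0))
```

In this example node "c" is excluded from the candidate set because its weight exceeds the minimum by more than ε, and node "b" receives the larger share because it has more spare capacity in absolute terms. Choosing randomly among several lightly loaded nodes, rather than always taking the single lightest one, avoids sending every request in an update cycle to the same node.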
Obviously, the above embodiments are merely examples given to describe the present invention clearly and do not limit its embodiments. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious changes or variations derived from the spirit of the present invention remain within its scope of protection.
Claims (10)
1. A load balancing and node state monitoring method in high-performance computing, characterized by comprising:
A. calculating the load weight of each node according to the running status, load, and performance parameters of the server nodes, and selecting the candidate node set for the next task allocation;
B. calculating, according to the load differences and allocation probabilities, the probability that each node in the candidate node set is allocated a task, and assigning the new request to the selected node by random probability allocation;
C. correcting the load of the node to which the task was allocated by using a load correction formula.
2. The load balancing and node state monitoring method in high-performance computing according to claim 1, characterized in that the load-balancing algorithm based on load weights comprises:
(1) setting a threshold ε;
(2) each time a new request arrives, checking against a timer whether the state of the server nodes needs to be updated, and updating it if so;
(3) calculating the performance C(Si) and load L(Si) of each node according to its running status, and from these calculating the load weight W(Si) and load difference ΔL(Si);
(4) selecting the candidate allocation node set J according to the load weight of each node: first selecting the node Sm that satisfies W(Sm) = min{W(Si)}, i = 0, 1, ..., n-1;
then, for any other node Si that satisfies W(Si) < W(Sm) + ε, i = 0, 1, ..., n-1,
adding node Si to set J;
(5) calculating the load allocation probability P(Si) of each node in the candidate allocation node set J;
(6) selecting a suitable node from set J by random probability allocation and assigning the task to that node;
(7) correcting the load of the selected node for use when the next request is allocated.
3. The load balancing and node state monitoring method in high-performance computing according to claim 2, characterized in that the performance C(Si) of node Si is evaluated from the following indices of the server node: number of CPUs n, CPU frequency C(Ci), disk I/O rate C(Di), memory size C(Mi), and network bandwidth C(Ni), using the following formula (8.1):
….(8.1)
where k denotes the weighting parameters of the indices, reflecting the degree of influence of each index on server node performance.
4. The load balancing and node state monitoring method in high-performance computing according to claim 2, characterized in that the load L(Si) of node Si is evaluated from the following indices: CPU utilization L(Ci), memory utilization L(Mi), disk I/O utilization L(Di), and network bandwidth utilization L(Ni), using the following formula (8.2):
……(8.2)
where r denotes the weighting parameters of the indices, reflecting the degree of influence of each index on different types of service.
5. The load balancing and node state monitoring method in high-performance computing according to claim 2, characterized in that the load weight W(Si) of a node is defined as the ratio of the node load L(Si) to the node performance C(Si), calculated with formula (8.3):
……(8.3)。
6. The load balancing and node state monitoring method in high-performance computing according to claim 5, characterized in that the load difference ΔL(Si) of a node is defined as the difference between the maximum load weight WMAX over all nodes and the load weight of the node, multiplied by the performance of the node, calculated with formula (8.4):
……(8.4)。
7. The load balancing and node state monitoring method in high-performance computing according to claim 5, characterized in that the load allocation probability P(Si) of a node is the proportion of its load difference ΔL(Si) in the sum of the load differences of all nodes, calculated with formula (8.5):
……(8.5)。
8. The load balancing and node state monitoring method in high-performance computing according to claim 5, characterized in that the allocation probability of each node in set J is calculated and a suitable node is selected from the set to receive the load; the node allocation probability is calculated with formula (8.6):
……(8.6)。
9. The load balancing and node state monitoring method in high-performance computing according to claim 5, characterized in that the load increment δ is the amount of load added to a server node when one request of a given service type is assigned to that node, calculated as: δ = L(S) / N ……(8.7)
where N is the number of requests of this service type on the node and L(S) is the load that the N requests impose on the node.
10. The load balancing and node state monitoring method in high-performance computing according to claim 5, characterized in that the correction of server node load is calculated with formula (8.8), where δ is the load increment, C(S) is the performance of the node used when the load increment was measured, and L(Si) and C(Si) are the load and performance of the node being corrected:
(8.8).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410440328.5A CN104168332A (en) | 2014-09-01 | 2014-09-01 | Load balance and node state monitoring method in high performance computing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104168332A true CN104168332A (en) | 2014-11-26 |
Family
ID=51911953
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410440328.5A Pending CN104168332A (en) | 2014-09-01 | 2014-09-01 | Load balance and node state monitoring method in high performance computing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104168332A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103327072A (en) * | 2013-05-22 | 2013-09-25 | 中国科学院微电子研究所 | Method for cluster load balance and system thereof |
Non-Patent Citations (1)
Title |
---|
张玉芳 等: "基于负载权值的负载均衡算法", 《计算机应用研究》 * |
Cited By (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104618743A (en) * | 2014-12-30 | 2015-05-13 | 北京国双科技有限公司 | Method, device and system for allocating code rate resource |
CN104580476A (en) * | 2015-01-13 | 2015-04-29 | 北京京东尚科信息技术有限公司 | Method and device for selecting node in distributed system |
CN104580476B (en) * | 2015-01-13 | 2018-09-14 | 北京京东尚科信息技术有限公司 | The method and apparatus for choosing node in a distributed system |
CN105141541A (en) * | 2015-09-23 | 2015-12-09 | 浪潮(北京)电子信息产业有限公司 | Task-based dynamic load balancing scheduling method and device |
CN105260245A (en) * | 2015-11-04 | 2016-01-20 | 浪潮(北京)电子信息产业有限公司 | Resource scheduling method and device |
CN105260245B (en) * | 2015-11-04 | 2018-11-13 | 浪潮(北京)电子信息产业有限公司 | A kind of resource regulating method and device |
CN105491138B (en) * | 2015-12-15 | 2020-01-24 | 国网智能电网研究院 | Distributed load scheduling method based on load rate graded triggering |
CN105491138A (en) * | 2015-12-15 | 2016-04-13 | 国网智能电网研究院 | Load rate based graded triggering distributed load scheduling method |
CN105389392A (en) * | 2015-12-18 | 2016-03-09 | 浪潮(北京)电子信息产业有限公司 | Metadata load statistical method and system |
CN105763636A (en) * | 2016-04-15 | 2016-07-13 | 北京思特奇信息技术股份有限公司 | Optimal host selection method and system in distributed system |
CN105763636B (en) * | 2016-04-15 | 2019-01-15 | 北京思特奇信息技术股份有限公司 | The selection method and system of optimal host in a kind of distributed system |
CN107783860A (en) * | 2016-08-31 | 2018-03-09 | 阿里巴巴集团控股有限公司 | The recovery point objectives monitoring method and equipment of a kind of data transfer |
CN106682980A (en) * | 2017-01-18 | 2017-05-17 | 北京云知科技有限公司 | Method for designing probability generator |
CN107707612A (en) * | 2017-08-10 | 2018-02-16 | 北京奇艺世纪科技有限公司 | A kind of appraisal procedure and device of the resource utilization of load balancing cluster |
CN107547650A (en) * | 2017-08-29 | 2018-01-05 | 中国民航大学 | Towards the improved weighted least-connection scheduling algorithm of SWIM systems |
CN109426646A (en) * | 2017-08-30 | 2019-03-05 | 英特尔公司 | For forming the technology of managed node based on telemetry |
CN107707680A (en) * | 2017-11-24 | 2018-02-16 | 北京永洪商智科技有限公司 | A kind of distributed data load-balancing method and system based on node computing capability |
CN108449394A (en) * | 2018-03-05 | 2018-08-24 | 北京华夏电通科技有限公司 | A kind of dispatching method of data file, dispatch server and storage medium |
CN108449394B (en) * | 2018-03-05 | 2021-08-13 | 北京华夏电通科技股份有限公司 | Data file scheduling method, scheduling server and storage medium |
CN110505109A (en) * | 2018-05-17 | 2019-11-26 | 阿里巴巴集团控股有限公司 | The method, apparatus and storage medium of test macro isolation performance |
CN109343942B (en) * | 2018-09-03 | 2020-11-03 | 北京邮电大学 | Task scheduling method based on edge computing network |
CN109343942A (en) * | 2018-09-03 | 2019-02-15 | 北京邮电大学 | Method for scheduling task based on edge calculations network |
WO2020062277A1 (en) * | 2018-09-30 | 2020-04-02 | 华为技术有限公司 | Management method and apparatus for computing resources in data pre-processing phase of neural network |
CN112753016A (en) * | 2018-09-30 | 2021-05-04 | 华为技术有限公司 | Management method and device for computing resources in data preprocessing stage in neural network |
CN109542586A (en) * | 2018-11-19 | 2019-03-29 | 郑州云海信息技术有限公司 | A kind of node resource state update method and system |
CN109614228A (en) * | 2018-11-27 | 2019-04-12 | 南京轨道交通系统工程有限公司 | Comprehensively monitoring front-end system and working method based on dynamic load leveling mode |
CN109711526A (en) * | 2018-12-20 | 2019-05-03 | 广东工业大学 | Server cluster dispatching method based on SVM and ant group algorithm |
CN110113399A (en) * | 2019-04-24 | 2019-08-09 | 华为技术有限公司 | Load balancing management method and relevant apparatus |
CN110928676B (en) * | 2019-07-18 | 2022-03-11 | 国网浙江省电力有限公司衢州供电公司 | Power CPS load distribution method based on performance evaluation |
CN110928676A (en) * | 2019-07-18 | 2020-03-27 | 国网浙江省电力有限公司衢州供电公司 | Power CPS load distribution method based on performance evaluation |
CN110545450A (en) * | 2019-09-09 | 2019-12-06 | 深圳市网心科技有限公司 | Node distribution method, system, electronic equipment and storage medium |
WO2021052199A1 (en) * | 2019-09-18 | 2021-03-25 | 中兴通讯股份有限公司 | Server load balancing method and apparatus, and cdn node |
CN111049919A (en) * | 2019-12-19 | 2020-04-21 | 上海米哈游天命科技有限公司 | User request processing method, device, equipment and storage medium |
CN111049919B (en) * | 2019-12-19 | 2022-09-06 | 上海米哈游天命科技有限公司 | User request processing method, device, equipment and storage medium |
CN111459677A (en) * | 2020-04-01 | 2020-07-28 | 北京顺达同行科技有限公司 | Request distribution method and device, computer equipment and storage medium |
CN111597041A (en) * | 2020-04-27 | 2020-08-28 | 深圳市金证科技股份有限公司 | Calling method and device of distributed system, terminal equipment and server |
CN111897816A (en) * | 2020-07-16 | 2020-11-06 | 中国科学院上海微系统与信息技术研究所 | Interactive method for computing information between satellites and generation method of information table applied by interactive method |
CN111897816B (en) * | 2020-07-16 | 2024-04-02 | 中国科学院上海微系统与信息技术研究所 | Interaction method of calculation information between satellites and generation method of information table applied by same |
CN114584565B (en) * | 2020-12-01 | 2024-01-30 | 中移(苏州)软件技术有限公司 | Application protection method and system, electronic equipment and storage medium |
CN114584565A (en) * | 2020-12-01 | 2022-06-03 | 中移(苏州)软件技术有限公司 | Application protection method and system, electronic equipment and storage medium |
CN113329067A (en) * | 2021-05-21 | 2021-08-31 | 广州爱浦路网络技术有限公司 | Edge computing node load distribution method, core network, device and storage medium |
CN113992691A (en) * | 2021-12-24 | 2022-01-28 | 苏州浪潮智能科技有限公司 | Method, device and equipment for distributing edge computing resources and storage medium |
CN114079656A (en) * | 2022-01-19 | 2022-02-22 | 之江实验室 | Probability-based load balancing method and device, electronic equipment and storage medium |
CN115174583B (en) * | 2022-06-28 | 2024-03-29 | 福州大学 | Server load balancing method based on programmable data plane |
CN115174583A (en) * | 2022-06-28 | 2022-10-11 | 福州大学 | Server load balancing method based on programmable data plane |
CN116382892B (en) * | 2023-02-08 | 2023-10-27 | 深圳市融聚汇信息科技有限公司 | Load balancing method and device based on multi-cloud fusion and cloud service |
CN116382892A (en) * | 2023-02-08 | 2023-07-04 | 深圳市融聚汇信息科技有限公司 | Load balancing method and device based on multi-cloud fusion and cloud service |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20141126 |