Publication number: CN104580476 A
Publication type: Application
Application number: CN 201510016624
Publication date: 29 Apr 2015
Filing date: 13 Jan 2015
Priority date: 13 Jan 2015
Also published as: WO2016112831A1
Publication number: 201510016624.7, CN 104580476 A, CN 104580476A, CN 201510016624, CN-A-104580476, CN104580476 A, CN104580476A, CN201510016624, CN201510016624.7
Inventors: 吕信, 郭李明
Applicant: 北京京东尚科信息技术有限公司, 北京京东世纪贸易有限公司
Method and device for selecting node in distributed system
CN 104580476 A
Abstract
The invention provides a method and device for selecting a node in a distributed system, which improve the performance of an entire PrestoDB cluster at relatively low workload and cost by increasing the memory capacity of only the None nodes. In the method, a designated subset of the nodes in the distributed system serves as candidate None nodes; when data slices from a plurality of Source nodes need to be gathered onto one node, a node is chosen from the candidate None nodes, and the data slices of the Source nodes are then gathered onto the selected node.
Claims(10)  translated from Chinese
1. A method for selecting a node in a distributed system, the distributed system being a PrestoDB cluster, characterized in that the method comprises: in the distributed system, designating a specified subset of nodes as candidate None nodes; and, when data slices from a plurality of Source nodes need to be gathered onto one node, selecting a node from the candidate None nodes and then gathering the data slices of the plurality of Source nodes onto the selected node.
2. The method according to claim 1, characterized in that, after the specified subset of nodes is designated as candidate None nodes, the method further comprises: when data slices from a plurality of Source nodes need to be gathered onto a plurality of Fixed nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting a plurality of nodes in the distributed system as Fixed nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Fixed nodes.
3. The method according to claim 1, characterized in that, after the specified subset of nodes is designated as candidate None nodes, the method further comprises: when sliced data needs to be stored on Source nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting a plurality of nodes in the distributed system as Source nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes.
4. The method according to claim 3, characterized in that the step of randomly selecting a plurality of nodes in the distributed system as Source nodes comprises: when a hardware-aware mode is currently in use, selecting a plurality of nodes in the distributed system as Source nodes according to the locality principle.
5. The method according to claim 3 or 4, characterized in that the step of randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes comprises: when a hardware-aware mode is currently in use, selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle.
6. A device for selecting a node in a distributed system, the distributed system being a PrestoDB cluster, characterized in that the device comprises: a configuration module for recording the subset of nodes in the distributed system that are designated as candidate None nodes; and a None node selection module for selecting, when data slices from a plurality of Source nodes need to be gathered onto one node, a node from the candidate None nodes as the None node.
7. The device according to claim 6, characterized in that it further comprises a Fixed node selection module for determining, when data slices from a plurality of Source nodes need to be gathered onto a plurality of Fixed nodes, whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting a plurality of nodes in the distributed system as Fixed nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Fixed nodes.
8. The device according to claim 6, characterized in that it further comprises a Source node selection module for determining, when sliced data needs to be stored on Source nodes, whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting a plurality of nodes in the distributed system as Source nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes.
9. The device according to claim 8, characterized in that the Source node selection module is further configured to select, when a hardware-aware mode is currently in use, a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle.
10. The device according to claim 8 or 9, characterized in that the Source node selection module is further configured to select, when a hardware-aware mode is currently in use, a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle.
Description  translated from Chinese
Method and device for selecting a node in a distributed system

TECHNICAL FIELD

[0001] The present invention relates to the field of computer technology, and in particular to a method and device for selecting a node in a distributed system.

BACKGROUND

[0002] With the rise of big data, the volume of business data at Internet companies grows year by year. Major Internet companies are therefore rolling out big data technology internally and building data warehouses for their core business systems. Data warehouses currently fall into two types: offline data warehouses and real-time data warehouses.

[0003] The representative offline data warehouse product is Hive. Because its underlying computing framework is MapReduce, it is suited to offline analysis and computation over very large data sets, but not to data analysis and computation with strict real-time requirements.

[0004] The representative real-time data warehouse product is PrestoDB. Developed by Facebook, it uses a pipelined model for distributed data computation and transfer, and can complete analysis and computation over big data within 100ms-20m, which satisfies the requirements of real-time data analysis and computation.

[0005] Because PrestoDB is a memory-based distributed computing framework, when it performs data analysis and computation it first divides the data to be analyzed into data slices and reads each slice into the memory of a PrestoDB Source node. The data in each Source node's memory is then gathered over the network onto one None node or onto a plurality of Fixed nodes. Whether the data is gathered onto a None node or onto Fixed nodes depends on the type of aggregate function. For example, if the query contains an order by clause, all results must be sorted as a whole, so the data in the memory of every Source node must be gathered onto a single None node and then sorted globally. If the query contains a group by clause, all results must be grouped, so the data in the memory of every Source node must be gathered onto a plurality of Fixed nodes for grouping.

[0006] At present, PrestoDB randomly selects one node from the entire cluster as the None node. The selection algorithm for the various PrestoDB node types is shown in FIG. 1, which is a schematic diagram of the prior-art flow for selecting nodes in a PrestoDB cluster. As shown in FIG. 1, the type of node to be selected is determined first. If a None node or Fixed nodes are to be selected, they are chosen at random from the cluster. If Source nodes are to be selected, it is first determined whether a hardware-aware mode is to be used; if so, nodes are selected according to data locality, and otherwise a plurality of nodes are selected at random as Source nodes. Here, hardware awareness means being aware of where the data to be processed is located, and locality means preferring the nodes that hold the data as worker nodes. If an assigned worker node happens to be the node that holds the data to be processed, less time is spent transferring data over the network, and the computing task takes less time. A hardware-aware mode can therefore be used in some cases, with nodes selected according to the locality principle.
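
The prior-art flow described for FIG. 1 can be sketched roughly as follows. This is an illustrative sketch only, not PrestoDB code: the function name `select_nodes_prior_art`, its parameters, the string node identifiers, and the way locality is approximated (nodes holding the data come first, the remainder is filled at random) are all assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence


def select_nodes_prior_art(
    node_type: str,
    cluster: Sequence[str],
    count: int = 1,
    hardware_aware: bool = False,
    data_locations: Sequence[str] = (),
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Hypothetical sketch of the prior-art selection flow of FIG. 1."""
    if rng is None:
        rng = random.Random()
    nodes = list(cluster)
    if node_type == "none":
        # A single None node is drawn at random from the whole cluster.
        return [rng.choice(nodes)]
    if node_type == "fixed":
        # Fixed nodes are also drawn at random from the whole cluster.
        return rng.sample(nodes, min(count, len(nodes)))
    if node_type == "source":
        if hardware_aware:
            # Assumed locality rule: nodes that hold the data come first.
            local = [n for n in nodes if n in data_locations]
            remote = [n for n in nodes if n not in data_locations]
            rng.shuffle(remote)
            return (local + remote)[:count]
        return rng.sample(nodes, min(count, len(nodes)))
    raise ValueError(f"unknown node type: {node_type!r}")
```

Because the None node is drawn from the whole cluster, any node may end up holding the gathered data, which is the problem the following paragraph raises.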

[0007] It can thus be seen that a node selected as the None node needs a relatively large memory capacity. To ensure that PrestoDB can analyze and compute over large data volumes smoothly, the memory of every node in the cluster would have to be upgraded so that any node can meet the computing requirements when it is selected as the None node. The workload and cost of such an upgrade are both relatively high.

SUMMARY OF THE INVENTION

[0008] In view of this, the present invention provides a method and device for selecting a node in a distributed system, which improve the performance of the entire PrestoDB cluster at relatively low workload and cost by increasing the memory capacity of only the None nodes.

[0009] To achieve the above object, according to one aspect of the present invention, a method for selecting a node in a distributed system is provided.

[0010] In the method of the present invention for selecting a node in a distributed system, the distributed system is a PrestoDB cluster, and the method comprises: in the distributed system, designating a specified subset of nodes as candidate None nodes; and, when data slices from a plurality of Source nodes need to be gathered onto one node, selecting a node from the candidate None nodes and then gathering the data slices of the plurality of Source nodes onto the selected node.

[0011] Optionally, after the specified subset of nodes is designated as candidate None nodes, the method further comprises: when data slices from a plurality of Source nodes need to be gathered onto a plurality of Fixed nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting a plurality of nodes in the distributed system as Fixed nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Fixed nodes.

[0012] Optionally, after the specified subset of nodes is designated as candidate None nodes, the method further comprises: when sliced data needs to be stored on Source nodes, determining whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting a plurality of nodes in the distributed system as Source nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes.

[0013] Optionally, the step of randomly selecting a plurality of nodes in the distributed system as Source nodes comprises: when a hardware-aware mode is currently in use, selecting a plurality of nodes in the distributed system as Source nodes according to the locality principle.

[0014] Optionally, the step of randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes comprises: when a hardware-aware mode is currently in use, selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle.

[0015] According to another aspect of the present invention, a device for selecting a node in a distributed system is provided.

[0016] In the device of the present invention for selecting a node in a distributed system, the distributed system is a PrestoDB cluster, and the device comprises: a configuration module for recording the subset of nodes in the distributed system that are designated as candidate None nodes; and a None node selection module for selecting, when data slices from a plurality of Source nodes need to be gathered onto one node, a node from the candidate None nodes as the None node.

[0017] Optionally, the device further comprises a Fixed node selection module for determining, when data slices from a plurality of Source nodes need to be gathered onto a plurality of Fixed nodes, whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, randomly selecting a plurality of nodes in the distributed system as Fixed nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Fixed nodes.

[0018] Optionally, the device further comprises a Source node selection module for determining, when sliced data needs to be stored on Source nodes, whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, randomly selecting a plurality of nodes in the distributed system as Source nodes; otherwise, randomly selecting a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes.

[0019] Optionally, the Source node selection module is further configured to select, when a hardware-aware mode is currently in use, a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle.

[0020] Optionally, the Source node selection module is further configured to select, when a hardware-aware mode is currently in use, a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle.

[0021] According to the technical solution of the present invention, a subset of the nodes in the PrestoDB cluster is designated as candidate None nodes, which confines None node selection to a defined range. The memory of the nodes in that range can then be upgraded and expanded so that they meet the computing requirements. This approach does not require a memory upgrade on every node of the PrestoDB cluster, so the upgrade workload is relatively low while the performance of the entire PrestoDB cluster is still improved.

BRIEF DESCRIPTION OF THE DRAWINGS

[0022] The drawings are provided for a better understanding of the present invention and do not constitute an undue limitation on it. In the drawings:

[0023] FIG. 1 is a schematic diagram according to an embodiment of the present invention;

[0024] FIG. 2 is a schematic diagram of a method for selecting a node in a distributed system according to an embodiment of the present invention;

[0025] FIG. 3 is a schematic diagram of the main modules of a device for selecting a node in a distributed system according to an embodiment of the present invention.

DETAILED DESCRIPTION

[0026] Exemplary embodiments of the present invention are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the invention. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.

[0027] In the solution of this embodiment, a subset of the nodes in the PrestoDB cluster is designated in advance as candidate None nodes, and whenever a None node needs to be selected it is chosen from this subset. A configuration item can also be provided to control whether the None node is selected from this subset or at random from the PrestoDB cluster. When PrestoDB starts, the configuration item is parsed and, based on the configuration information it contains, a list of the corresponding IP-port pairs is built and used when allocating None nodes. An example of the configuration format is: None aggregation node = IP address 1:port 1; IP address 2:port 2. This designates the two nodes with IP address 1 and IP address 2 as candidate None nodes, with port 1 and port 2 respectively. The configuration can also specify whether the candidate None nodes may serve as candidate Fixed nodes, and whether they may serve as candidate Source nodes. When nodes then need to be selected, the flow shown in FIG. 2 is followed. FIG. 2 is a schematic diagram of a method for selecting a node in a distributed system according to an embodiment of the present invention. The method can be executed by the Coordinator node in PrestoDB.
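
As a rough illustration of how the configuration value above could be turned into a list of IP-port pairs at startup, consider the following minimal sketch. The function name `parse_none_candidates` is hypothetical, and the exact property syntax (semicolon-separated `IP:port` entries, optional surrounding spaces) is an assumption based only on the example format given in the paragraph; it is not an actual PrestoDB configuration key or parser.

```python
from typing import List, Tuple


def parse_none_candidates(value: str) -> List[Tuple[str, int]]:
    """Parse 'ip1:port1;ip2:port2' into [(ip1, port1), (ip2, port2)]."""
    pairs: List[Tuple[str, int]] = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';' or an empty value
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected IP:port, got {item!r}")
        pairs.append((host.strip(), int(port)))
    return pairs


# Hypothetical addresses, used only to show the resulting list.
candidates = parse_none_candidates("10.0.0.1:8080; 10.0.0.2:8081")
```

The resulting list is what a Coordinator-side selector would consult when a None node has to be allocated.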

[0028] Step S21: Determine the type of node to be selected. When sliced data needs to be stored on Source nodes, the nodes to be selected are Source nodes, and the flow proceeds to step S24. When data slices need to be gathered, the type of node to be selected is determined by the type of aggregate function: if a None node is to be selected, proceed to step S22; if Fixed nodes are to be selected, proceed to step S23.

[0029] Step S22: Determine whether the None node is to be selected from the designated range. This determination is made according to the configuration item described above. If so, one of the candidate None nodes recorded in the configuration item is selected as the None node (step S221); otherwise, one node is selected at random as the None node (step S222).

[0030] Step S23: Determine whether the candidate None nodes are allowed to serve as candidate Fixed nodes. If so, a plurality of nodes may be selected at random as Fixed nodes (step S231); otherwise, a plurality of nodes are selected at random from the distributed system, excluding the candidate None nodes, as Fixed nodes.

[0031] Step S24: Determine whether a hardware-aware mode is used. If so, proceed to step S241; otherwise, proceed to step S242.

[0032] Step S241: Determine whether the candidate None nodes are allowed to serve as candidate Source nodes. If so, a plurality of nodes may be selected as Source nodes according to the locality principle (step S2411); otherwise, a plurality of nodes are selected from the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle (step S2412).

[0033] Step S242: Determine whether the candidate None nodes are allowed to serve as candidate Source nodes. If so, a plurality of nodes may be selected at random as Source nodes (step S2421); otherwise, a plurality of nodes are selected at random from the distributed system, excluding the candidate None nodes, as Source nodes (step S2422).
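
Steps S21 to S242 can be read as a single decision procedure. The sketch below is one possible rendering under stated assumptions: `SelectionConfig`, `select_nodes`, the flag names, and the string node identifiers are all hypothetical, and the locality principle is approximated by placing nodes that hold the data first. It is not the actual Coordinator implementation.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class SelectionConfig:
    """Hypothetical configuration flags mirroring steps S22 to S242."""
    none_candidates: List[str] = field(default_factory=list)
    use_none_candidates: bool = True    # S22: pick None node from the range?
    allow_none_as_fixed: bool = False   # S23
    allow_none_as_source: bool = False  # S241 / S242
    hardware_aware: bool = False        # S24


def _pool(cluster: Sequence[str], cfg: SelectionConfig, allow: bool) -> List[str]:
    # Exclude the candidate None nodes unless the configuration allows them.
    if allow:
        return list(cluster)
    return [n for n in cluster if n not in cfg.none_candidates]


def select_nodes(
    node_type: str,
    cluster: Sequence[str],
    cfg: SelectionConfig,
    count: int = 1,
    data_locations: Sequence[str] = (),
    rng: Optional[random.Random] = None,
) -> List[str]:
    if rng is None:
        rng = random.Random()
    if node_type == "none":  # S22 -> S221 / S222
        pool = cfg.none_candidates if cfg.use_none_candidates else list(cluster)
        return [rng.choice(pool)]
    if node_type == "fixed":  # S23
        pool = _pool(cluster, cfg, cfg.allow_none_as_fixed)
        return rng.sample(pool, min(count, len(pool)))
    if node_type == "source":  # S24 -> S241 / S242
        pool = _pool(cluster, cfg, cfg.allow_none_as_source)
        if cfg.hardware_aware:
            # Assumed locality rule: nodes holding the data are preferred.
            local = [n for n in pool if n in data_locations]
            remote = [n for n in pool if n not in data_locations]
            rng.shuffle(remote)
            return (local + remote)[:count]
        return rng.sample(pool, min(count, len(pool)))
    raise ValueError(f"unknown node type: {node_type!r}")
```

With `none_candidates=["a", "b"]` and the default flags, a None request only ever returns `a` or `b`, while Fixed and Source requests draw from the remaining nodes, so only the designated nodes need the larger memory.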

[0034] FIG. 3 is a schematic diagram of the main modules of a device for selecting a node in a distributed system according to an embodiment of the present invention. As shown in FIG. 3, the device 30 for selecting a node in a distributed system mainly comprises a configuration module 31 and a None node selection module 32. The configuration module 31 records the subset of nodes in the distributed system that are designated as candidate None nodes. The None node selection module 32 selects, when data slices from a plurality of Source nodes need to be gathered onto one node, a node from the candidate None nodes as the None node.

[0035] The device 30 may further comprise a Fixed node selection module (not shown in the figure) for determining, when data slices from a plurality of Source nodes need to be gathered onto a plurality of Fixed nodes, whether the candidate None nodes are currently allowed to serve as candidate Fixed nodes; if so, a plurality of nodes in the distributed system are selected at random as Fixed nodes; otherwise, a plurality of nodes in the distributed system, excluding the candidate None nodes, are selected at random as Fixed nodes.

[0036] The device 30 may further comprise a Source node selection module (not shown in the figure) for determining, when sliced data needs to be stored on Source nodes, whether the candidate None nodes are currently allowed to serve as candidate Source nodes; if so, a plurality of nodes in the distributed system are selected at random as Source nodes; otherwise, a plurality of nodes in the distributed system, excluding the candidate None nodes, are selected at random as Source nodes.

[0037] The Source node selection module may further be used to select, when a hardware-aware mode is currently in use, a plurality of nodes in the distributed system, excluding the candidate None nodes, as Source nodes according to the locality principle.
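
The module split of device 30 can be pictured with a short sketch. Only the configuration module 31, the None node selection module 32, and the optional Fixed node selection module are shown. The class and attribute names are hypothetical and chosen for illustration; the patent does not prescribe an implementation.

```python
import random
from typing import List, Sequence


class ConfigurationModule:
    """Sketch of module 31: records the designated candidate None nodes."""

    def __init__(self, none_candidates: Sequence[str],
                 allow_none_as_fixed: bool = False) -> None:
        self.none_candidates: List[str] = list(none_candidates)
        self.allow_none_as_fixed = allow_none_as_fixed


class NoneNodeSelectionModule:
    """Sketch of module 32: picks the None node from the recorded candidates."""

    def __init__(self, config: ConfigurationModule) -> None:
        self.config = config

    def select(self) -> str:
        return random.choice(self.config.none_candidates)


class FixedNodeSelectionModule:
    """Sketch of the optional Fixed node selection module."""

    def __init__(self, config: ConfigurationModule, cluster: Sequence[str]) -> None:
        self.config = config
        self.cluster = list(cluster)

    def select(self, count: int) -> List[str]:
        if self.config.allow_none_as_fixed:
            pool = self.cluster
        else:
            # Keep the candidate None nodes out of the Fixed pool.
            pool = [n for n in self.cluster
                    if n not in self.config.none_candidates]
        return random.sample(pool, min(count, len(pool)))
```

Both selection modules read from the same configuration module, so the candidate list is recorded in one place.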

[0038] According to the technical solution of this embodiment, a subset of the nodes in the PrestoDB cluster is designated as candidate None nodes, which confines None node selection to a defined range. The memory of the nodes in that range can then be upgraded and expanded so that they meet the computing requirements. This approach does not require a memory upgrade on every node of the PrestoDB cluster, so the upgrade workload is relatively low while the performance of the entire PrestoDB cluster is still improved.

[0039] The specific embodiments described above do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Patent Citations
Cited Patent | Filing date | Publication date | Applicant | Title
CN101102225A * | 26 Jul 2007 | 9 Jan 2008 | 北京航空航天大学 | Management method of wireless sensor network nodes
CN101924777A * | 17 Jun 2009 | 22 Dec 2010 | 中国移动通信集团公司 | Method, system and equipment for searching active nodes in P2P streaming media system
CN103188161A * | 30 Dec 2011 | 3 Jul 2013 | 中国移动通信集团公司 | Method and system of distributed data loading scheduling
CN104168332A * | 1 Sep 2014 | 26 Nov 2014 | 广东电网公司信息中心 | Load balance and node state monitoring method in high performance computing
US7710884 * | 1 Sep 2006 | 4 May 2010 | International Business Machines Corporation | Methods and system for dynamic reallocation of data processing resources for efficient processing of sensor data in a distributed network
Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
WO2016112831A1 * | 11 Jan 2016 | 21 Jul 2016 | 北京京东尚科信息技术有限公司 | Method and device of selecting distributed system node
Classifications
International Classification: H04L29/08
Cooperative Classification: H04L29/08
Legal Events
Date | Code | Event | Description
29 Apr 2015 | C06 | Publication |
27 May 2015 | C10 | Entry into substantive examination |