CN103389715A - High-performance distributed data center monitoring framework - Google Patents
- Publication number
- CN103389715A
- Authority
- CN
- China
- Prior art keywords
- monitoring
- framework
- alarm
- data center
- scheduling process
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Abstract
The invention discloses a high-performance distributed data center monitoring framework. The architecture of the framework comprises a monitoring core engine, monitoring scheduling processes, active monitoring pollers, passive monitoring receivers, an alarm engine, and a monitoring data processing center. The monitoring framework is designed in a distributed manner: the processing stages involved are separated, refined, and modularized into six modules that each complete the work of one stage, with one monitoring core engine retained to schedule the modules. This reduces the resources consumed when the whole monitoring framework runs and spreads that consumption evenly across the modules. As a result, high-performance data center monitoring is achieved, and the monitoring scale can be extended to tens of thousands or even hundreds of thousands of nodes.
Description
Technical field
The present invention relates to the fields of distributed monitoring and data center monitoring, and specifically to a high-performance, distributed data center monitoring framework capable of monitoring at large scale.
Background art
Data centers keep growing in scale, and the demand for high-performance data center monitoring grows with them. Traditional monitoring frameworks, however, are bloated and have an inefficient monitoring core: data collection, processing, and analysis all run slowly, and the frameworks suffer from performance bottlenecks that cannot be overcome, so they cannot monitor a large-scale data center. In practice, as data centers are built larger and the requirements on monitoring rise, traditional frameworks no longer meet user needs and the bottleneck becomes severe. Such a traditional, bloated framework, which integrates all processing in one place, is very inefficient and consumes resources heavily when monitoring device resources in bulk, and can only monitor a data center of about 2,000 nodes.
Summary of the invention
The technical problem to be solved by the present invention is to provide a data center monitoring framework that monitors large-scale data centers more efficiently and more accurately.
The technical solution adopted by the present invention is a high-performance distributed data center monitoring framework whose architecture comprises a monitoring core engine, monitoring scheduling processes, active monitoring pollers, passive monitoring receivers, an alarm engine, and a monitoring data processing center, wherein:
The monitoring core engine is the core of the framework. It drives and schedules every module, reads all configuration required for monitoring, and automatically divides that configuration according to the number of scheduling processes before distributing it to the monitoring scheduling modules;
The monitoring scheduling process mainly drives and schedules the active monitoring pollers or passive monitoring receivers to collect or receive monitoring data, according to the monitoring configuration distributed by the monitoring core engine. Multiple scheduling processes can be started automatically according to the scale of the configuration, so that each monitoring scheduling process can work efficiently;
The active monitoring poller actively collects monitoring data;
The passive monitoring receiver passively receives monitoring data;
The alarm engine mainly watches for alarm notifications or event-handling actions concerning the monitored device resources and, depending on what it observes, sends e-mail alarms or SMS alarms, or handles the events produced;
The monitoring data processing center collects and records the monitoring data produced, writing it to logs, a database, or an RRD database (RRD is short for Round Robin Database, which stores a fixed number of records in a circular fashion, each holding a specific value at a point in time, for example a temperature recorded once per day). It also processes and analyzes the data to obtain fault trends, historical monitoring status curves, availability analysis reports, and the like.
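The round-robin behavior described for the RRD database can be illustrated with a minimal Python sketch. This is not the storage format of any real RRD tool; the class name `RoundRobinStore` and its methods are hypothetical, and the only property shown is that a fixed number of slots is kept and the oldest sample is overwritten once the store is full.

```python
from __future__ import annotations

from collections import deque


class RoundRobinStore:
    """Hypothetical fixed-size store: when full, a new sample replaces the oldest."""

    def __init__(self, slots: int) -> None:
        if slots <= 0:
            raise ValueError("slots must be positive")
        # A deque with maxlen discards the oldest entry automatically on append.
        self._samples: deque[tuple[int, float]] = deque(maxlen=slots)

    def record(self, timestamp: int, value: float) -> None:
        """Store one (timestamp, value) sample, e.g. one temperature per day."""
        self._samples.append((timestamp, value))

    def history(self) -> list[tuple[int, float]]:
        """Return the retained samples, oldest first."""
        return list(self._samples)


# Record four daily temperatures into a store that only keeps three.
store = RoundRobinStore(slots=3)
for day, temperature in enumerate([21.5, 22.0, 19.8, 20.4]):
    store.record(day, temperature)
```

Because the store has a fixed size, its footprint does not grow with monitoring time, which is presumably why the patent lists an RRD database alongside logs and ordinary databases.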
Monitoring scheduling processes are enabled according to the node scale of the data center and the monitoring capacity of each scheduling process. The monitoring core engine automatically divides the monitoring configuration according to the number of scheduling processes and distributes the resulting shares to the individual monitoring scheduling processes. The monitoring core engine then drives the scheduling processes to start working; once running, each scheduling process drives active monitoring pollers or passive monitoring receivers to collect or receive monitoring data. After the data has been collected, the scheduling process sends it to the monitoring data processing center, where it is processed, analyzed, and recorded. The alarm engine, as the alarm core of the whole framework, works in a self-driven manner: it watches for alarm notifications or event-handling actions of the monitored nodes and, depending on what it observes, takes the corresponding action, that is, sends an alarm e-mail or SMS or handles the event produced.
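The division of the monitoring configuration among scheduling processes can be sketched as follows. The patent does not say how the split is made, so the round-robin assignment here is an assumption, and `partition_config` and the `node-NNN` target names are hypothetical.

```python
from __future__ import annotations


def partition_config(targets: list[str], process_count: int) -> list[list[str]]:
    """Split monitored targets into one share per scheduling process.

    Round-robin assignment is an assumption; it keeps share sizes within
    one target of each other.
    """
    if process_count <= 0:
        raise ValueError("process_count must be positive")
    shares: list[list[str]] = [[] for _ in range(process_count)]
    for index, target in enumerate(targets):
        shares[index % process_count].append(target)
    return shares


# 25 monitored nodes divided among 4 scheduling processes.
targets = [f"node-{i:03d}" for i in range(25)]
shares = partition_config(targets, 4)
share_sizes = [len(share) for share in shares]
```

Each inner list stands for the configuration share that the core engine would hand to one scheduling process.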
Architecturally, the monitoring core engine, monitoring scheduling processes, active monitoring pollers, passive monitoring receivers, alarm engine, and monitoring data processing center are all modularized. The whole data center monitoring framework is deployed in a distributed manner on different servers, each contributing its own resources, to form a high-performance monitoring system able to monitor a data center of hundreds of thousands of nodes.
Each of the six major modules of the framework is provided with a standby module in the monitoring system application, to ensure the fault tolerance and stability of the system.
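One way to picture the standby modules is the sketch below. The patent only states that each module has a spare; how a failure is detected and how the spare takes over are not described, so the promotion logic and the names `ModuleSlot` and `fail_over` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModuleSlot:
    """Hypothetical pairing of an active module instance with its standby."""

    name: str
    active: str
    standby: Optional[str]

    def fail_over(self) -> str:
        """Promote the standby instance after the active one has failed."""
        if self.standby is None:
            raise RuntimeError(f"{self.name}: no standby instance left")
        self.active, self.standby = self.standby, None
        return self.active


# The alarm engine loses its primary instance and the standby takes over.
slot = ModuleSlot(name="alarm engine", active="alarm-primary", standby="alarm-standby")
promoted = slot.fail_over()
```

With a single standby per module, a second failure of the same module leaves nothing to promote, which the sketch reports as an error.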
The active monitoring poller is designed to scale out horizontally and automatically: the number of active monitoring pollers is adjusted automatically according to the monitoring tasks issued by the monitoring scheduling processes, so that the load on each poller stays moderate and monitoring data is collected efficiently and accurately.
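The automatic adjustment of the poller count can be sketched as a simple sizing rule. The patent gives no formula, so the per-poller capacity parameter and the function name `pollers_needed` are assumptions; the rule merely keeps every poller at or below a chosen load.

```python
def pollers_needed(pending_tasks: int, tasks_per_poller: int, minimum: int = 1) -> int:
    """Return how many active pollers keep each one at or below its capacity.

    tasks_per_poller is an assumed tuning parameter, not a value from the patent.
    """
    if pending_tasks < 0:
        raise ValueError("pending_tasks must not be negative")
    if tasks_per_poller <= 0:
        raise ValueError("tasks_per_poller must be positive")
    # Integer ceiling division avoids floating-point rounding.
    required = (pending_tasks + tasks_per_poller - 1) // tasks_per_poller
    return max(minimum, required)


# Poller counts for a few task volumes at an assumed capacity of 20,000 tasks each.
counts = [pollers_needed(n, 20_000) for n in (0, 20_000, 20_001, 100_000)]
```

A scheduler could re-evaluate this rule whenever the task volume changes and start or stop pollers to match the result.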
Beneficial effects of the present invention: the invention overcomes the drawbacks of the traditional, bloated monitoring framework that integrates all processing in one place and therefore suffers from very low efficiency, heavy resource consumption, and a performance bottleneck (for example, being able to monitor only a data center of about 2,000 nodes) when monitoring device resources in bulk. By designing the monitoring framework in a distributed manner, the processing stages in the framework are separated, refined, and modularized into six major modules that complete the work of each stage, with one monitoring core engine retained to schedule the modules. This reduces the resources consumed when the whole framework runs and spreads that consumption evenly across the modules. As a result, high-performance data center monitoring is achieved, and the scale that can be monitored is extended to tens of thousands or even hundreds of thousands of nodes.
Description of drawings
Figure 1 is a diagram of a traditional monitoring architecture;
Figure 2 is a diagram of the distributed high-performance data center monitoring architecture;
Figure 3 is a schematic diagram of a distributed data center monitoring system at a scale of 100,000 nodes.
Embodiment
The invention is explained in detail below with a concrete example, with reference to the accompanying drawings:
Taking the construction of the 100,000-node monitoring system shown in Figure 3 as an example, the specific implementation of the high-performance distributed data center monitoring framework is set out as follows.
For a data center of 100,000 nodes, with each monitoring scheduling process able to monitor about 10,000 nodes, the deployment needs 10 monitoring scheduling processes plus 1 standby scheduling process. The monitoring core engine automatically divides the monitoring configuration into 10 shares according to the number of scheduling processes and distributes them to the individual scheduling processes. The monitoring core engine then drives the scheduling processes to start working. Once running, each scheduling process drives active monitoring pollers or passive monitoring receivers to collect or receive monitoring data, according to the active or passive monitoring mode set in the configuration. Given the node scale, 5 active monitoring pollers can be started to serve the data collection tasks issued by the 10 scheduling processes above them, that is, each active monitoring poller serves the tasks of two scheduling processes. After the data has been collected, the scheduling processes send it to the monitoring data processing center, where it is processed, analyzed, and recorded. The alarm engine, as the alarm core of the whole framework, works in a self-driven manner: it watches for alarm notifications or event-handling actions of the monitored nodes and, depending on what it observes, takes the corresponding action, such as sending an alarm e-mail or SMS or handling the event produced. As shown in the figure, each of the six major modules of the framework is provided with a standby module in the monitoring system application, to ensure the fault tolerance and stability of the system.
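The sizing arithmetic of this embodiment can be reproduced with a short sketch for the 100,000-node system of Figure 3. The capacity of about 10,000 nodes per scheduling process and the ratio of one poller per two scheduling processes come from the text; the function name `size_deployment` and the dictionary keys are hypothetical.

```python
from __future__ import annotations


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division."""
    return (numerator + denominator - 1) // denominator


def size_deployment(
    node_count: int,
    nodes_per_process: int = 10_000,  # per-process capacity stated in the embodiment
    processes_per_poller: int = 2,    # one poller serves two scheduling processes
) -> dict[str, int]:
    """Derive the module counts of the embodiment from the node scale."""
    processes = ceil_div(node_count, nodes_per_process)
    return {
        "scheduling_processes": processes,
        "standby_scheduling_processes": 1,
        "active_pollers": ceil_div(processes, processes_per_poller),
    }


plan = size_deployment(100_000)
```

For 100,000 nodes this yields 10 scheduling processes, 1 standby scheduling process, and 5 active pollers, matching the figures given in the embodiment.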
Claims (4)
1. A high-performance distributed data center monitoring framework, characterized in that the architecture of the framework comprises a monitoring core engine, monitoring scheduling processes, active monitoring pollers, passive monitoring receivers, an alarm engine, and a monitoring data processing center, wherein:
The monitoring core engine is the core of the framework; it drives and schedules every module, reads all configuration required for monitoring, and automatically divides that configuration according to the number of scheduling processes before distributing it to the monitoring scheduling modules;
The monitoring scheduling process mainly drives and schedules the active monitoring pollers or passive monitoring receivers to collect or receive monitoring data, according to the monitoring configuration distributed by the monitoring core engine, and multiple monitoring scheduling processes can be started automatically according to the scale of the configuration;
The active monitoring poller actively collects monitoring data;
The passive monitoring receiver passively receives monitoring data;
The alarm engine mainly watches for alarm notifications or event-handling actions concerning the monitored device resources and, depending on what it observes, sends e-mail alarms or SMS alarms, or handles the events produced;
The monitoring data processing center collects and records the monitoring data produced, writes it to logs, a database, or an RRD database, and processes and analyzes the data to obtain fault trends, historical monitoring status curves, and availability analysis reports;
Monitoring scheduling processes are enabled according to the node scale of the data center and the monitoring capacity of each scheduling process; the monitoring core engine automatically divides the monitoring configuration according to the number of scheduling processes and distributes the resulting shares to the individual monitoring scheduling processes; the monitoring core engine then drives the scheduling processes to start working, and once running, each scheduling process drives active monitoring pollers or passive monitoring receivers to collect or receive monitoring data; after the data has been collected, the scheduling process sends it to the monitoring data processing center, where it is processed, analyzed, and recorded; the alarm engine, as the alarm core of the whole framework, works in a self-driven manner, watches for alarm notifications or event-handling actions of the monitored nodes and, depending on what it observes, takes the corresponding action, that is, sends an alarm e-mail or SMS or handles the event produced.
2. The high-performance distributed data center monitoring framework according to claim 1, characterized in that the monitoring core engine, monitoring scheduling processes, active monitoring pollers, passive monitoring receivers, alarm engine, and monitoring data processing center are all modularized, that is, the whole data center monitoring framework is deployed in a distributed manner on different servers.
3. The high-performance distributed data center monitoring framework according to claim 2, characterized in that each of the six major modules, namely the monitoring core engine, monitoring scheduling processes, active monitoring pollers, passive monitoring receivers, alarm engine, and monitoring data processing center, is provided with a standby module in the monitoring system application.
4. The high-performance distributed data center monitoring framework according to claim 1, 2 or 3, characterized in that the active monitoring poller is designed to scale out horizontally and automatically, the number of active monitoring pollers being adjusted automatically according to the monitoring tasks issued by the monitoring scheduling processes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310318176.7A CN103389715B (en) | 2013-07-26 | 2013-07-26 | A kind of high performance distributive data center monitoring framework |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310318176.7A CN103389715B (en) | 2013-07-26 | 2013-07-26 | A kind of high performance distributive data center monitoring framework |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103389715A true CN103389715A (en) | 2013-11-13 |
CN103389715B CN103389715B (en) | 2016-03-23 |
Family
ID=49534014
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310318176.7A Active CN103389715B (en) | 2013-07-26 | 2013-07-26 | A kind of high performance distributive data center monitoring framework |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103389715B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103618644A (en) * | 2013-11-26 | 2014-03-05 | 曙光信息产业股份有限公司 | Distributed monitoring system based on hadoop cluster and method thereof |
CN105094698A (en) * | 2015-07-08 | 2015-11-25 | 浪潮(北京)电子信息产业有限公司 | Method for predicting disc capacity based on historical monitoring data |
CN106027306A (en) * | 2016-05-26 | 2016-10-12 | 浪潮(北京)电子信息产业有限公司 | Resource monitoring method and device |
CN106100938A (en) * | 2016-08-19 | 2016-11-09 | 浪潮(北京)电子信息产业有限公司 | The monitoring of a kind of distributed cluster system and alarm method and system |
CN106202324A (en) * | 2016-06-30 | 2016-12-07 | 北京奇虎科技有限公司 | The data processing method of a kind of real-time calculating platform and device |
CN106354616A (en) * | 2016-08-18 | 2017-01-25 | 北京并行科技股份有限公司 | Method and device for monitoring application execution performance and high-performance computing system |
CN106407078A (en) * | 2016-09-26 | 2017-02-15 | 中国工商银行股份有限公司 | An information interaction-based client performance monitoring device and method |
CN107508731A (en) * | 2017-10-10 | 2017-12-22 | 郑州云海信息技术有限公司 | A kind of large-scale data center monitoring method and system |
CN108234150A (en) * | 2016-12-09 | 2018-06-29 | 中兴通讯股份有限公司 | For the data acquisition and processing (DAP) method and system of data center's monitoring system |
CN108259270A (en) * | 2018-01-11 | 2018-07-06 | 郑州云海信息技术有限公司 | A kind of data center's system for unified management design method |
CN108563550A (en) * | 2018-04-23 | 2018-09-21 | 上海达梦数据库有限公司 | A kind of monitoring method of distributed system, device, server and storage medium |
WO2018199817A1 (en) * | 2017-04-24 | 2018-11-01 | Telefonaktiebolaget Lm Ericsson (Publ) | Message queue performance monitoring |
CN108809701A (en) * | 2018-05-23 | 2018-11-13 | 郑州云海信息技术有限公司 | A kind of data center's wisdom data platform and its implementation |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020002443A1 (en) * | 1998-10-10 | 2002-01-03 | Ronald M. Ames | Multi-level architecture for monitoring and controlling a functional system |
US20080082181A1 (en) * | 2006-09-29 | 2008-04-03 | Fisher-Rosemount Systems, Inc. | Statistical signatures used with multivariate analysis for steady-state detection in a process |
CN101232515A (en) * | 2008-02-25 | 2008-07-30 | 浪潮电子信息产业股份有限公司 | Distributed type colony management control system based on LDAP |
CN102591282A (en) * | 2012-02-14 | 2012-07-18 | 浙江鼎丰实业有限公司 | Distributed data collection and transmission system |
CN102608970A (en) * | 2012-03-05 | 2012-07-25 | 浪潮通信信息系统有限公司 | Distributed data acquisition method based on centralized management and automatic scheduling |
CN102970183A (en) * | 2012-11-22 | 2013-03-13 | 浪潮(北京)电子信息产业有限公司 | Cloud monitoring system and data reflow method thereof |
-
2013
- 2013-07-26 CN CN201310318176.7A patent/CN103389715B/en active Active
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103618644A (en) * | 2013-11-26 | 2014-03-05 | 曙光信息产业股份有限公司 | Distributed monitoring system based on hadoop cluster and method thereof |
CN105094698A (en) * | 2015-07-08 | 2015-11-25 | 浪潮(北京)电子信息产业有限公司 | Method for predicting disc capacity based on historical monitoring data |
CN105094698B (en) * | 2015-07-08 | 2018-09-11 | 浪潮(北京)电子信息产业有限公司 | A kind of disk size prediction technique based on Historical Monitoring data |
CN106027306A (en) * | 2016-05-26 | 2016-10-12 | 浪潮(北京)电子信息产业有限公司 | Resource monitoring method and device |
CN106202324A (en) * | 2016-06-30 | 2016-12-07 | 北京奇虎科技有限公司 | The data processing method of a kind of real-time calculating platform and device |
CN106202324B (en) * | 2016-06-30 | 2020-10-30 | 北京奇虎科技有限公司 | Data processing method and device for real-time computing platform |
CN106354616B (en) * | 2016-08-18 | 2019-05-03 | 北京并行科技股份有限公司 | Monitor the method, apparatus and high performance computing system of application execution performance |
CN106354616A (en) * | 2016-08-18 | 2017-01-25 | 北京并行科技股份有限公司 | Method and device for monitoring application execution performance and high-performance computing system |
CN106100938A (en) * | 2016-08-19 | 2016-11-09 | 浪潮(北京)电子信息产业有限公司 | The monitoring of a kind of distributed cluster system and alarm method and system |
CN106407078A (en) * | 2016-09-26 | 2017-02-15 | 中国工商银行股份有限公司 | An information interaction-based client performance monitoring device and method |
CN106407078B (en) * | 2016-09-26 | 2019-06-25 | 中国工商银行股份有限公司 | Client performance monitoring device and method based on information exchange |
CN108234150A (en) * | 2016-12-09 | 2018-06-29 | 中兴通讯股份有限公司 | For the data acquisition and processing (DAP) method and system of data center's monitoring system |
WO2018199817A1 (en) * | 2017-04-24 | 2018-11-01 | Telefonaktiebolaget Lm Ericsson (Publ) | Message queue performance monitoring |
US10853153B2 (en) | 2017-04-24 | 2020-12-01 | Telefonaktiebolaget Lm Ericsson (Publ) | Message queue performance monitoring |
CN107508731A (en) * | 2017-10-10 | 2017-12-22 | 郑州云海信息技术有限公司 | A kind of large-scale data center monitoring method and system |
CN108259270A (en) * | 2018-01-11 | 2018-07-06 | 郑州云海信息技术有限公司 | A kind of data center's system for unified management design method |
CN108563550A (en) * | 2018-04-23 | 2018-09-21 | 上海达梦数据库有限公司 | A kind of monitoring method of distributed system, device, server and storage medium |
CN108809701A (en) * | 2018-05-23 | 2018-11-13 | 郑州云海信息技术有限公司 | A kind of data center's wisdom data platform and its implementation |
Also Published As
Publication number | Publication date |
---|---|
CN103389715B (en) | 2016-03-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103389715B (en) | A kind of high performance distributive data center monitoring framework | |
CN106651633B (en) | Power utilization information acquisition system based on big data technology and acquisition method thereof | |
CN104407964B (en) | A kind of centralized monitoring system and method based on data center | |
CN108845878A (en) | The big data processing method and processing device calculated based on serverless backup | |
CN109873499B (en) | Intelligent power distribution station management terminal | |
CN105608223A (en) | Hbase database entering method and system for kafka | |
CN107302466A (en) | A kind of power & environment supervision system big data analysis platform and method | |
CN101673100B (en) | Acquisition method and system of parameters of technique process | |
CN105320757A (en) | Business intelligent analysis method for quickly processing data | |
CN105094698A (en) | Method for predicting disc capacity based on historical monitoring data | |
CN106097161A (en) | Water affairs management system and data processing method thereof | |
CN105430030A (en) | OSG-based parallel extendable application server | |
CN102355696A (en) | Large scale Internet of things gateway system and realization method thereof | |
CN112462724A (en) | Data monitoring system based on industrial internet | |
CN107480027A (en) | A kind of distributed deep learning operational system | |
CN103973516A (en) | Method and device for achieving monitoring function in data processing system | |
CN105373620A (en) | Mass battery data exception detection method and system for large-scale battery energy storage power stations | |
US10331484B2 (en) | Distributed data platform resource allocator | |
CN104391990A (en) | Multi-task type collecting and harvesting method based on vertical industry | |
CN114598586B (en) | Multi-cloud scene computing power gridding method and system | |
CN102571424A (en) | Processing method, device and system for engineering event | |
CN113487170A (en) | Full link monitoring system with layered technical architecture | |
CN202231739U (en) | Large-scale internet of things gateway system | |
CN109302723A (en) | A kind of multinode real-time radio pyroelectric monitor control system Internet-based and control method | |
CN205158617U (en) | Equipment inspection maintains and data acquisition system based on RFID radio frequency technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20190715. Address after: 250100 North 3-storey North District, No. 1036 Tidal Road, Tidal Science Park S05 Building, Jinan High-tech Zone, Shandong Province. Patentee after: Shandong Yingxin Computer Technology Co., Ltd. Address before: 250014 Shandong Province, Ji'nan City hi tech Development Zone, Nga Road No. 1036. Patentee before: Langchao Electronic Information Industry Co., Ltd. |