WO2015170365A1

WO2015170365A1 - Server device and computer system

Info

Publication number: WO2015170365A1
Application number: PCT/JP2014/002435
Authority: WO
Inventors: 誠司関
Original assignee: 三菱電機株式会社
Priority date: 2014-05-08
Filing date: 2014-05-08
Publication date: 2015-11-12
Also published as: TW201543224A

Abstract

The present invention provides a computer system in which the results of processing performed by an application running on a slave device can be taken over when the slave device is reset after the communication between the slave device and the master device has been interrupted. A slave server constituting the computer system of the present invention is provided with an application monitoring unit which, after a watchdog timer has timed out, monitors whether processing initiated by an application prior to the time-out is completed, wherein if the processing is completed, the application monitoring unit outputs a notification signal indicating the completion, and transmits output data to another server device via a network, said output data including both the results of the completed processing performed by the application and an added identification flag which indicates that the processing has been completed after the time-out of the watchdog timer.

Description

Server device and computer system

The present invention relates to a computer system composed of a master server and a slave server, and relates to a technique for initializing a slave server when a communication error occurs between the master server and the slave server.

In a computer system consisting of a master server and a slave server, if a communication error occurs in which the slave server cannot receive data sent from the master server, the slave server hangs. The master server can then no longer control the slave server via the network, so the slave server cannot be reset.

To solve this problem, the following patent documents disclose a control method in which an initialization command (clear command) is transmitted from the master device to the slave device, the watchdog timer is initialized and starts counting down each time the slave device receives the command, and the slave device is initialized (reset) when the watchdog timer times out without a clear command having been received.

JP 2006-110150 A
JP 2004-274275 A

The computer system control method disclosed in the above patent documents assumes that a master device and a slave device communicate one-to-one. It is not intended for a system in which a plurality of slave devices executing a plurality of applications exist on a network and a slave device performs data communication with other slave devices. As a result, a slave device is initialized regardless of the state of data reception between the slave devices.

FIG. 5 is a diagram for explaining a problem in resetting a slave device in a computer system having a plurality of slave devices. As shown in FIG. 5, the slave device 1 that has received the input data 1 from the master device executes an application for the input data 1. During the execution of application processing, if a reception error occurs in the slave device 1 that prevents receiving data from the master device, the next input data 2 cannot be received. Furthermore, since the clear command from the master device cannot be received within a predetermined time, the watchdog timer (WDT) of the slave device 1 times out.

Here, if the slave device 1 is reset regardless of the operation status of its application and the data transfer status between the slave devices, the output data of the application process executed in the slave device 1 for the input data 1 cannot be taken over, and the application process that has already been executed must be executed again.

The present invention has been made to solve the above-described problem. An object of the present invention is to provide a computer system capable of taking over the results of application processing executed in the slave device when the slave device is reset after communication between the slave device and the master device has been interrupted.

A computer system according to the present invention is a computer system in which a master server and a slave server that executes an application according to input data transmitted from the master server are connected via a network.
The master server periodically transmits, to the slave server, a clear command for detecting a communication abnormality with the slave server.
The slave server initializes a watchdog timer in response to the clear command and starts a countdown. When the watchdog timer times out, the slave server completes all processing of any application that started executing before the timeout, and transmits, to another server via the network, output data in which an identification flag indicating that the processing was completed after the watchdog timer timed out is added to the processing result of the completed application.
The other server that has received the output data transmits the received output data to the master server based on the identification flag added to the output data.

According to the computer system of the present invention, when the watchdog timer of the slave server times out, the processing of the application being executed in the slave server is completed, and the processing result is transmitted to another server device with an added identification flag indicating that the processing was completed after the watchdog timer timed out. The processing result of the application executed in the slave device can therefore be taken over when a communication error with the master server occurs.

FIG. 1 is a block diagram of the computer system according to an embodiment of the present invention. FIG. 2 is a flowchart showing the operation of the slave server. FIG. 3 is a diagram showing an example of the output data format of the slave server. FIG. 4 is a diagram showing the operation of the computer system according to an embodiment of the present invention. FIG. 5 is a diagram showing the operation of a computer system.

FIG. 1 is a configuration diagram of a computer system according to an embodiment of the present invention. The computer system shown in FIG. 1 is configured by connecting a plurality of slave servers 110 and 120 to a master server 100 via a network. The slave server 120 is assumed to have the same configuration as the slave server 110.

The slave server 110 can execute a plurality of applications, and communicates with the master server 100 and the slave server 120. The master server 100 includes a clear command transmission unit 101 that periodically transmits a clear command for detecting a communication abnormality with the slave server 110 and a communication unit 102 that transmits the clear command to the slave server 110.

The slave server 110 includes an application monitoring unit 111, a reset unit 112, a watchdog timer (hereinafter referred to as WDT) 113, and a communication unit 114. The slave server 110 can process a plurality of applications, receives input data from the master server 100 for each application, and transmits a processing result of each application as output data to the master server.

The communication unit 114 of the slave server 110 transmits and receives data to and from the master server 100 and other slave servers 120. WDT 113 initializes a timer in response to a clear command from master server 100 input via communication unit 114 and starts a countdown. If the clear command is not received within a predetermined period after the countdown is started and time-out occurs, the WDT 113 notifies the application monitoring unit 111 that time-out has occurred.
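The behavior of the WDT 113 described above can be sketched as follows. This is a minimal illustration only: the class name `WatchdogTimer`, the tick-based countdown, and the `on_timeout` callback (standing in for the notification to the application monitoring unit 111) are assumptions made for the example and do not appear in the source.

```python
from typing import Callable, Optional


class WatchdogTimer:
    """Countdown timer re-armed by each clear command (hypothetical model)."""

    def __init__(self, timeout_ticks: int, on_timeout: Callable[[], None]) -> None:
        if timeout_ticks <= 0:
            raise ValueError("timeout_ticks must be positive")
        self.timeout_ticks = timeout_ticks
        self.on_timeout = on_timeout          # stands in for notifying unit 111
        self.remaining: Optional[int] = None  # None: countdown not started yet
        self.timed_out = False

    def clear(self) -> None:
        """Handle a clear command: initialize the timer and start counting down."""
        self.remaining = self.timeout_ticks
        self.timed_out = False

    def tick(self) -> None:
        """Advance the countdown by one tick; fire the callback once on expiry."""
        if self.remaining is None or self.timed_out:
            return
        self.remaining -= 1
        if self.remaining == 0:
            self.timed_out = True
            self.on_timeout()


events = []
wdt = WatchdogTimer(timeout_ticks=3, on_timeout=lambda: events.append("timeout"))
wdt.clear()            # clear command received: countdown starts at 3
wdt.tick()
wdt.tick()
wdt.clear()            # next clear command arrives in time: countdown restarts
wdt.tick()
wdt.tick()
assert events == []    # one tick still left
wdt.tick()             # no clear command within the period
assert events == ["timeout"]
```

Driving the timer with explicit ticks rather than wall-clock time keeps the sketch deterministic; a real device would typically use a hardware or OS timer instead.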

The application monitoring unit 111 monitors the processing status of the applications. On receiving a notification that the WDT 113 has timed out, it confirms that the processing of all running applications has been completed and notifies the reset unit 112 accordingly. The reset unit 112 resets the slave server 110 on receiving this notification. In addition, the application monitoring unit 111 transmits, to the other slave server 120 via the communication unit 114, output data in which an identification flag indicating that the processing was completed after the timeout is added to the processing result of the application. Furthermore, the application monitoring unit 111 transmits a notification signal indicating that the processing of all applications has been completed to the master server 100 and the slave server 120 via the communication unit 114.

FIG. 3 is a diagram illustrating an example of a format of data output from the slave server 110. As shown in FIG. 3, output data is given an output data ID, a data size, an identification flag, and a sequence number.
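One possible encoding of these four fields is sketched below. The source names the fields but not their order, widths, or byte order, so the layout chosen here (network byte order, 4-byte output data ID, 4-byte data size, 1-byte identification flag, 4-byte sequence number) and the names `HEADER` and `OutputData` are assumptions for illustration only.

```python
import struct
from dataclasses import dataclass

# Assumed header layout (not specified in the source): network byte order,
# 4-byte output data ID, 4-byte data size, 1-byte flag, 4-byte sequence number.
HEADER = struct.Struct("!IIBI")


@dataclass(frozen=True)
class OutputData:
    output_id: int
    after_timeout: bool   # identification flag
    sequence: int
    payload: bytes        # processing result of the application

    def encode(self) -> bytes:
        """Serialize the header followed by the payload."""
        header = HEADER.pack(self.output_id, len(self.payload),
                             int(self.after_timeout), self.sequence)
        return header + self.payload

    @classmethod
    def decode(cls, raw: bytes) -> "OutputData":
        """Parse a message produced by encode(); reject truncated payloads."""
        output_id, size, flag, sequence = HEADER.unpack_from(raw)
        payload = raw[HEADER.size:HEADER.size + size]
        if len(payload) != size:
            raise ValueError("truncated payload")
        return cls(output_id, bool(flag), sequence, payload)


msg = OutputData(output_id=1, after_timeout=True, sequence=7, payload=b"result")
assert OutputData.decode(msg.encode()) == msg
```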

FIG. 2 is a flowchart showing the operation of the slave server 110 shown in FIG. 1. It shows the operation of initializing the slave server 110 when it hangs because input data from the master server 100 cannot be received due to a communication error or the like.

Slave server 110 detects whether or not a clear command periodically transmitted from master server 100 has been received (ST1). When the clear command is received, the slave server 110 initializes the WDT 113 and starts counting down (ST2). If the clear command cannot be received within the predetermined period, WDT 113 times out (ST3).

If the WDT 113 times out, the application monitoring unit 111 waits for completion of the application being processed (ST4). When the processing of the application is completed (ST5), the application monitoring unit 111 transmits output data, to which an identification flag indicating that the processing result was produced after the timeout is added, to other predetermined servers (master server 100, slave server 120) (ST6). Here, the slave server 120 that has received the output data with the identification flag detects the identification flag in its application monitoring unit and transmits the output data to the master server 100.

When all the applications being processed in the slave server 110 are terminated (ST7), the application monitoring unit 111 notifies the other server of the completion of the processing of the application via the communication unit 114 (ST8). Further, the application monitoring unit 111 notifies the reset unit 112 that the processing of the application has been completed, and the reset unit 112 resets the slave server 110 (ST9).
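The sequence ST1 to ST9 can be traced with a small tick-based simulation. This is a simplified sketch under stated assumptions: the function `run_slave`, its parameters, the discrete tick model, and the treatment of applications that finish before the timeout as ordinary unflagged output are all hypothetical and are not taken from the source.

```python
from typing import Dict, List, Optional, Set


def run_slave(clear_ticks: Set[int], timeout: int,
              finish_ticks: Dict[str, int], max_ticks: int = 100) -> List[str]:
    """Return the flowchart steps taken by a simplified slave server model.

    clear_ticks:  ticks at which a clear command is received.
    timeout:      WDT period in ticks.
    finish_ticks: tick at which each running application completes.
    """
    trace: List[str] = []
    remaining: Optional[int] = None    # WDT countdown; None until first clear
    timed_out = False
    pending = dict(finish_ticks)       # applications still being processed
    for tick in range(max_ticks):
        if not timed_out:
            # Before the timeout, results go out as ordinary output (assumption).
            for app in [a for a, t in pending.items() if t <= tick]:
                del pending[app]
                trace.append(f"t={tick} normal output: {app}")
            if tick in clear_ticks:                                   # ST1
                remaining = timeout                                   # ST2
                trace.append(f"t={tick} ST1-ST2 clear received, WDT restarted")
            elif remaining is not None:
                remaining -= 1
                if remaining == 0:                                    # ST3
                    timed_out = True
                    trace.append(f"t={tick} ST3 WDT timeout")
                    trace.append(f"t={tick} ST4 waiting for applications")
        if timed_out:
            for app in [a for a, t in pending.items() if t <= tick]:  # ST5
                del pending[app]
                trace.append(f"t={tick} ST5-ST6 send {app} with flag")
            if not pending:                                           # ST7
                trace.append(f"t={tick} ST7-ST8 notify completion")
                trace.append(f"t={tick} ST9 reset")
                return trace
    return trace


# Clear commands arrive at ticks 0 and 2, then stop; the application ends at tick 7.
for step in run_slave(clear_ticks={0, 2}, timeout=3, finish_ticks={"app1": 7}):
    print(step)
```

In this run the reset is deferred from the timeout until the application has finished and its flagged result has been sent, which is the ordering the flowchart describes.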

FIG. 4 is a diagram showing the operation of the computer system shown in FIG. 1. It shows the operation of initializing the slave server 110 when the slave server 110 cannot receive input data from the master server 100 due to a communication error or the like and hangs.

The master server 100 transmits input data 1 to the slave server 110. The slave server 110 that has received the input data 1 performs an application process on the input data 1. If a reception abnormality that prevents the slave server 110 from receiving data from the master server 100 occurs during application processing, the next input data 2 cannot be received. Furthermore, since the clear command from the master server 100 cannot be received within a predetermined time, the WDT 113 times out.

When the WDT 113 times out, the application monitoring unit 111 of the slave server 110 waits for the application process for the input data 1 to complete, and then transmits output data 1, in which an identification flag indicating a post-timeout processing result is added to the processing result, to the master server 100 and the slave server 120. The slave server 120 that has received the output data 1 detects the identification flag and transmits the output data 1 to the master server 100.

When the application processing of the slave server 110 is completed, the application monitoring unit 111 of the slave server 110 notifies the other server of the completion of application processing via the communication unit 114. The slave server 120 that has received the end notification transmits the received end notification to the master server 100. Then, the reset unit 112 of the slave server 110 resets the slave server 110 in response to the notification from the application monitoring unit 111.
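The relay role of the slave server 120 described above can be sketched as follows. The class and field names (`PeerSlave`, `FlaggedOutput`, `EndNotification`) are hypothetical, the lists stand in for network links, and keeping unflagged data locally is an assumption, since the source does not say what the slave server 120 does with output data that carries no flag.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class FlaggedOutput:
    output_id: int
    after_timeout: bool   # identification flag from the output data format
    result: str


@dataclass(frozen=True)
class EndNotification:
    source: str           # server whose applications have all completed


Message = Union[FlaggedOutput, EndNotification]


class PeerSlave:
    """Hypothetical model of slave server 120 acting as a relay."""

    def __init__(self) -> None:
        self.to_master: List[Message] = []    # stands in for the link to master 100
        self.local: List[FlaggedOutput] = []  # data kept locally (assumption)

    def receive(self, msg: Message) -> None:
        if isinstance(msg, EndNotification):
            self.to_master.append(msg)        # end notifications are forwarded
        elif msg.after_timeout:
            self.to_master.append(msg)        # flagged output is forwarded
        else:
            self.local.append(msg)            # unflagged data stays here (assumption)


peer = PeerSlave()
peer.receive(FlaggedOutput(1, True, "result for input 1"))
peer.receive(FlaggedOutput(2, False, "ordinary inter-slave data"))
peer.receive(EndNotification("slave-110"))
assert [type(m).__name__ for m in peer.to_master] == ["FlaggedOutput", "EndNotification"]
```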

As described above, by completing the application processing before resetting the slave server 110 and transmitting the processing result to another slave server, the master server 100 can receive the output data 1 for the input data 1 via the slave server 120, in which no abnormality has occurred.

As a result, the data processed by the slave server 110 can be taken over, which prevents data inconsistency. Because processing already completed on the slave server need not be performed again, the decrease in system performance associated with re-execution can be suppressed.
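The difference between resetting immediately at the timeout (the problem case of FIG. 5) and deferring the reset until the application completes (FIG. 4) can be illustrated with a toy model. The function `reset_outcome`, its parameters, and the single-application scenario are assumptions for illustration, as is the normal delivery of a result that is ready before the timeout.

```python
from typing import List, Tuple


def reset_outcome(defer_reset: bool, timeout_tick: int,
                  finish_tick: int) -> Tuple[List[str], List[str]]:
    """Return (outputs the master receives, inputs that must be re-executed).

    Toy model of one application on slave server 110 working on "input 1".
    """
    if finish_tick <= timeout_tick:
        # The application finished before the WDT expired: normal delivery
        # (assumption: the result was sent before any reset).
        return ["output 1"], []
    if defer_reset:
        # The reset waits for the application; the flagged result reaches
        # the master via slave server 120.
        return ["output 1 (flagged, via slave 120)"], []
    # An immediate reset at the timeout discards the unfinished work.
    return [], ["input 1"]


# Timeout at tick 5, application would finish at tick 7.
assert reset_outcome(defer_reset=False, timeout_tick=5, finish_tick=7) == ([], ["input 1"])
assert reset_outcome(defer_reset=True, timeout_tick=5, finish_tick=7) == (
    ["output 1 (flagged, via slave 120)"], [])
```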

100 master server; 101 clear command transmission unit; 102, 114 communication unit; 110, 120 slave server; 111 application monitoring unit; 112 reset unit; 113 WDT

Claims

A server device that executes an application according to input data transmitted from a master server, the server device comprising:
a watchdog timer that is initialized in response to a clear command periodically transmitted from the master server and starts a countdown; and
an application monitoring unit that, when the watchdog timer times out, monitors whether processing of an application that started execution before the timeout has been completed, outputs a notification signal indicating completion when the processing has been completed, and transmits, to another server device via a network, output data in which an identification flag indicating that the processing was completed after the watchdog timer timed out is added to the processing result of the completed application.
The server device according to claim 1, further comprising a reset unit that performs a reset process in response to a notification signal output from the application monitoring unit.
The server apparatus according to claim 1, wherein the application monitoring unit transmits a notification signal indicating that the application process is completed after the watchdog timer has timed out to another server via the network.
A computer system in which a master server and a slave server that executes an application according to input data transmitted from the master server are connected via a network,
The master server periodically transmits a clear command for detecting a communication abnormality with the slave server to the slave server,
The slave server initializes a watchdog timer in response to the clear command and starts a countdown, and, when the watchdog timer times out, completes the processing of an application that started execution before the timeout and transmits, to another server via the network, output data in which an identification flag indicating that the processing was completed after the watchdog timer timed out is added to the processing result of the completed application,
The other server that has received the output data transmits the received output data to the master server based on the identification flag added to the output data.
The computer system according to claim 4, wherein the slave server, after a timeout of the watchdog timer, transmits to the master server a notification signal indicating that all processing of an application that started execution before the timeout has been completed.