WO2015003627A1

WO2015003627A1 - Method and device for detecting malicious uniform resource locator (URL)

Info

Publication number: WO2015003627A1
Application number: PCT/CN2014/081861
Authority: WO
Inventors: 申飞龙; 张辉; 刘健
Original assignee: 腾讯科技（深圳）有限公司
Priority date: 2013-07-09
Filing date: 2014-07-09
Publication date: 2015-01-15
Also published as: CN103327029B; CN103327029A

Abstract

In an embodiment of the present invention, a device for detecting a malicious uniform resource locator (URL) divides a URL to be detected into several parts, allocates a corresponding detection score to each part, and according to the detection score of each part, determines an overall score of the URL to be detected; if the overall score is within a predetermined first score range for a malicious URL, then the URL to be detected is identified as a malicious URL.

Description

TECHNICAL FIELD

The present invention relates to the field of information processing technologies, and in particular, to a method and a device for detecting a malicious URL.

BACKGROUND OF THE INVENTION

When a user accesses a web server by using a client, the user generally enters the server's web address, such as a Uniform Resource Locator (URL), into the client, which connects to the server through that address. If a malicious web address is entered, the user's information may be put at risk, so malicious URLs need to be detected.

In the prior art, to detect a malicious URL, the detecting device first connects to the server through the URL and obtains the content provided by that server, that is, the page content. It then determines whether the content is malicious by matching the page content or a screenshot of the page; if so, the URL is a malicious URL. Because the content corresponding to the URL must be obtained first, detection is relatively inefficient. In practice, moreover, the server behind a malicious URL may block the addresses of detecting devices that belong to security software systems, so that the detecting device cannot obtain the content and detection fails.

SUMMARY OF THE INVENTION

Embodiments of the present invention provide a method and a device for detecting a malicious URL, which improve the efficiency of malicious URL detection.

An embodiment of the present invention provides a method for detecting a malicious URL, which includes:

dividing the to-be-detected URL into multiple components, where the multiple components include any combination of a domain name, a port, a path, a file name, a data parameter, and an anchor;

assigning a corresponding detection score to each of the multiple components;

determining an overall score of the to-be-detected URL according to the detection score of each component; and

if the overall score is within a preset first score range for malicious URLs, determining that the to-be-detected URL is a malicious URL.

An embodiment of the present invention provides a device for detecting a malicious URL, which includes:

a dividing unit, configured to divide the to-be-detected URL into multiple components, where the multiple components include any combination of a domain name, a port, a path, a file name, a data parameter, and an anchor;

a score assigning unit, configured to assign a corresponding detection score to each of the multiple components divided by the dividing unit;

an overall score determining unit, configured to determine an overall score of the to-be-detected URL according to the detection score of each component assigned by the score assigning unit; and

a malicious URL determining unit, configured to determine that the to-be-detected URL is a malicious URL if the overall score determined by the overall score determining unit is within a preset first score range for malicious URLs.

It can be seen that the malicious URL detection device divides the to-be-detected URL into multiple components, assigns a corresponding detection score to each component, and determines the overall score of the to-be-detected URL according to the detection scores of the components. If the overall score is within the preset first score range for malicious URLs, the to-be-detected URL is determined to be a malicious URL. In this way, whether a URL is malicious can be determined by operating directly on the URL itself, without operating on the content corresponding to the URL. This saves the time needed to obtain that content, thereby improving detection efficiency, and also avoids detection failures caused by being unable to obtain the content.

BRIEF DESCRIPTION OF THE DRAWINGS

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of a network environment in an embodiment of the present invention.

FIG. 2 is a flowchart of a method for detecting a malicious URL according to an embodiment of the present invention.

FIG. 3A and FIG. 3B are schematic diagrams of dividing a to-be-detected URL in an embodiment of the present invention.

FIG. 4 is a schematic structural diagram of a device for detecting a malicious website according to an embodiment of the present invention.

FIG. 5 is a schematic structural diagram of a terminal to which a method for detecting a malicious URL according to an embodiment of the present invention is applied.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present invention are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.

FIG. 1 is a schematic diagram of a network environment to which a malicious URL detection method according to an embodiment of the present invention is applied. As shown in FIG. 1, the user can access the network 110 through the terminal 120 and thereby access the web server 130 by entering a URL on the terminal 120. The terminal 120 may be a smartphone, a tablet, an e-book reader, a Moving Picture Experts Group Audio Layer III (MP3) player, a Moving Picture Experts Group Audio Layer IV (MP4) player, a laptop computer, a desktop computer, or the like.

As shown in FIG. 1, the terminal 120 can be connected to the network 110 through a wired or wireless link. The wireless link may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), etc.

The detecting device 140 of the malicious website provided by the embodiment of the present invention is included in the terminal 120. The detection device 140 includes computer readable instructions stored in the terminal 120.

When the user accesses a URL (for example, by entering it in a browser), the malicious URL detection device 140 divides the URL into multiple components, which may include any combination of a domain name, a port, a path, a file name, a data parameter, and an anchor. The detection device 140 assigns a corresponding detection score to each of the components and determines the overall score of the to-be-detected URL according to the detection scores of the components. If the overall score is within the preset first score range for malicious URLs, the to-be-detected URL is determined to be a malicious URL. In this way, whether a URL is malicious can be determined by operating directly on the URL itself, without operating on the content corresponding to the URL, which saves the time needed to obtain that content and thereby improves detection efficiency.

FIG. 2 is a flowchart of a method for detecting a malicious URL according to an embodiment of the present invention. The method can be performed by the malicious URL detection device 140 shown in FIG. As shown in FIG. 2, the method includes the following steps.

Step 201: The website to be detected is divided into multiple components, and the plurality of component parts may be any combination of a domain name, a port, a path, a file name, a data parameter, and an anchor point.

In general, web addresses, especially URLs, can be divided into components such as a domain name, a port, a path, a file name, a data parameter (query), and an anchor.

In the embodiment of the present invention, the multiple components may include any combination of a domain name, a port, a path, a file name, a data parameter, and an anchor point. For example, the plurality of components can include a domain name, a port, a path, and a file name.

Where:

A domain name is the name of a computer or group of computers on a network. It consists of a series of dot-separated names and identifies the electronic location (and sometimes the geographic location) of the computer during data transmission. Domain names can have multiple levels, and the levels are separated by dots ("."). In short, the number of dots gives the number of levels of the domain name, and the rightmost word is the top-level domain.

A port is an outlet for communication between a computer and the outside world. A path usually refers to a location of a file or directory on a network server. A file name indicates a specific file on the accessed server. The data parameter is the information that begins with a question mark (?) and whose items are separated by ampersands (&). An anchor is a string that refers to a particular fragment of the content corresponding to the to-be-detected URL.

For example, for the URL:

http://video.google.co.uk:80/videoplay/index.html?docid=10086&hl=en#00h02m30s, the domain name is google.co.uk, the port is 80, the path is /videoplay, the file name is index.html, the data parameter name is docid, the value of the data parameter is 10086, and the anchor is #00h02m30s.
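The division in this example can be sketched in Python with the standard library. This is only an illustrative sketch, not the implementation of the embodiment: the function name `split_url` and the dictionary keys are hypothetical. Note also that `urlsplit` returns the full host name (`video.google.co.uk`) and the anchor without its leading `#`; reducing the host to `google.co.uk` as in the text would need additional logic that is not attempted here.

```python
from urllib.parse import urlsplit
import posixpath


def split_url(url):
    """Divide a URL into the six components named in the description.

    Hypothetical helper: the key names are chosen for illustration only.
    """
    parts = urlsplit(url)
    # Separate the directory part of the path from the trailing file name.
    directory, filename = posixpath.split(parts.path)
    return {
        "domain": parts.hostname,    # full host name, lower-cased by urlsplit
        "port": parts.port,          # int, or None if no explicit port is given
        "path": directory,
        "filename": filename,
        "query": parts.query,        # data parameters, without the leading "?"
        "anchor": parts.fragment,    # anchor, without the leading "#"
    }


components = split_url(
    "http://video.google.co.uk:80/videoplay/index.html?docid=10086&hl=en#00h02m30s"
)
```

Each entry of `components` can then be scored separately, as described in the following steps.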

Besides the domain name, port, path, file name, data parameter, and anchor mentioned above, a URL can be divided into further components. In practical applications, scores may also be assigned to these further components as a basis for determining whether the URL is a malicious URL.

Step 202: Assign a corresponding detection score to each of the multiple components.

When assigning a detection score to a component, the features of malicious URLs preset in the malicious URL detection device can be compared with the features of the corresponding component; if they match, a negative detection score is assigned to that component. For example, the preset features of malicious domain names are compared with the features of the domain name obtained in step 201, and a detection score is assigned to the domain name component accordingly.

Step 203: Determine an overall score of the to-be-detected web address according to the detection scores of the respective components.

Specifically, the detection scores of the components may be added to obtain the overall score. Alternatively, the importance of each component within the URL may be taken into account: the detection score of each component is multiplied by a weighting coefficient to obtain a weighted value, and the weighted values are added to obtain the overall score. The more important the component, the larger its weighting coefficient.
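Both ways of combining the scores can be written as one small function. The sketch below is illustrative only; the function name `overall_score` and the convention that a component without an explicit weight gets weight 1 are assumptions, not part of the description.

```python
def overall_score(scores, weights=None):
    """Combine per-component detection scores into an overall score.

    scores  -- mapping of component name to detection score
    weights -- optional mapping of component name to weighting coefficient;
               components that are not listed default to 1 (an assumption),
               so omitting weights gives the plain sum.
    """
    weights = weights or {}
    return sum(score * weights.get(name, 1) for name, score in scores.items())


plain_total = overall_score({"domain": -10, "port": -5})
weighted_total = overall_score({"domain": -10, "port": -5}, {"domain": 3})
```

With these inputs the plain sum is -15, and weighting the domain name by 3 gives -35.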

Step 204: Determine whether the overall score obtained in step 203 is within the preset score range for malicious URLs. If so, perform step 205, that is, determine that the to-be-detected URL is a malicious URL; if not, the to-be-detected URL is not a malicious URL.

Specifically, the score range for malicious URLs may be preset by the user in the malicious URL detection device according to actual needs; for example, a URL whose overall score is lower than a preset first threshold is a malicious URL.

Further, in a specific embodiment, the method may further include steps 206-207, as shown in FIG.

Step 206: When it is determined that the overall score is not within the first score range of the preset malicious website, the detecting device of the malicious website may continue to determine whether the total score is within the second score range of the preset suspicious website.

Step 207: If the overall score is within the second score range, determine that the to-be-detected URL is a suspicious URL, that is, a suspected malicious URL.

In this way, suspicious URLs can be handled accordingly. For example, if the overall score is higher than the preset first threshold and lower than a preset second threshold, the URL is a suspicious URL.

It can be seen that, in this embodiment, the malicious URL detection device divides the to-be-detected URL into multiple components, assigns a corresponding detection score to each component, and determines the overall score of the to-be-detected URL according to the detection scores of the components. If the overall score is within the preset first score range for malicious URLs, the to-be-detected URL is determined to be a malicious URL. In this way, whether a URL is malicious can be determined by operating directly on the URL itself, without operating on the content corresponding to the URL. This saves the time needed to obtain that content, thereby improving detection efficiency, and also avoids detection failures caused by being unable to obtain the content.
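The decision in steps 204 to 207 can be sketched as a comparison against two thresholds. The sketch is illustrative only: the function name `classify`, the label strings, and the second threshold of -5 are hypothetical, the first threshold of -10 is borrowed from the worked examples later in the description, and the handling of a score exactly equal to a threshold is an assumption because the description does not specify it.

```python
def classify(total, first_threshold=-10, second_threshold=-5):
    """Map an overall score to a verdict.

    Below the first threshold: malicious (first score range).
    From the first threshold up to, but not including, the second threshold:
    suspicious (second score range). Treating a score equal to the first
    threshold as suspicious is an assumption of this sketch.
    """
    if total < first_threshold:
        return "malicious"
    if total < second_threshold:
        return "suspicious"
    return "normal"


verdicts = [classify(-53), classify(-7), classify(0)]
```

Here the three calls yield "malicious", "suspicious", and "normal" respectively.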

In a specific embodiment, when performing the foregoing step 202, the malicious URL detection device assigns detection scores according to different policies for different components, as follows:

(1) The component is a domain name.

If the number of levels of the domain name exceeds a preset number (for example, 4), the domain name is given a negative score, and the more levels there are, the lower the score. The levels are separated by dots ("."); in short, the number of dots gives the number of levels of the domain name.

If the spelling of the domain name does not conform to the spelling logic, the domain name is given a negative score. In general, a domain name that conforms to the spelling logic looks like ABC, ab12, or 12ab; that is, it consists only of letters, or of letters and digits that are not interleaved. If letters and digits are interleaved, as in a1b2, the domain name does not conform to the spelling logic. Alternatively, the malicious URL detection device can determine whether the domain name conforms to the spelling logic by comparing it with preset features that do not conform to the spelling logic.

If the similarity between the domain name and a preset domain name is higher than a preset similarity, the domain name is given a negative score. Domain names that are easily spoofed may be preset, and the similarity refers to the percentage of characters that match when the domain name is compared with the preset domain name.

If the domain name is a free domain name outside China, the domain name may also be given a negative score.
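The domain name rules above can be sketched as follows. Everything specific in this sketch is an assumption made for illustration: the function names, the penalty values, the similarity threshold of 0.6, and the list of preset domain names. Levels are counted as the number of dots, following the rule stated above. The similarity measure is `difflib.SequenceMatcher.ratio()` from the Python standard library, used here as a stand-in for the percentage of matching characters; the description does not say how that percentage is computed. Labels made up only of digits or containing hyphens are not addressed by the description and are treated as non-conforming here. The free-domain rule is omitted because it would need a list of free domain providers.

```python
import re
from difflib import SequenceMatcher


def follows_spelling_logic(label):
    """True for letters only, letters then digits, or digits then letters."""
    return re.fullmatch(r"[A-Za-z]+[0-9]*|[0-9]+[A-Za-z]+", label) is not None


def domain_feature_scores(domain, known_domains=("qzone.qq.com",),
                          max_levels=4, min_similarity=0.6):
    """Return a dict of triggered domain features and hypothetical penalties."""
    scores = {}
    levels = domain.count(".")  # number of dots = number of levels
    if levels > max_levels:
        # More levels give a lower score.
        scores["too_many_levels"] = -2 * (levels - max_levels)
    if not all(follows_spelling_logic(label) for label in domain.split(".")):
        scores["spelling"] = -5
    for known in known_domains:
        ratio = SequenceMatcher(None, domain, known).ratio()
        # An exact match is the genuine domain, so only near misses count.
        if domain != known and ratio > min_similarity:
            scores["similar_to_known"] = -8
            break
    return scores


spoofed = domain_feature_scores("qz0ne.qq.com.8866.org")
genuine = domain_feature_scores("qzone.qq.com")
```

In this sketch the spoofed name triggers both the spelling and the similarity features, while the genuine name triggers none.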

(2) The component is a path.

If the path contains special characters, the path is given a negative score, where special characters are characters other than letters, digits, and a limited set of punctuation marks (%, ?, /, =, #, ., -, _).

Perform symbol segmentation on the path. If the spelling of more than two of the resulting parts does not conform to the spelling logic, the path is given a negative score.

Perform symbol segmentation on the path. If the length of more than two of the resulting parts is less than a preset length (for example, 2), the path is given a negative score. Here, symbols are characters other than digits and letters, such as / and ?.
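The three path rules can be sketched as below. The function names and penalty values are hypothetical, and `follows_spelling_logic` encodes only the letters/digits rule from the domain name section (so purely numeric parts count as non-conforming, which is an assumption). The sketch uses the "more than two parts" wording of this section.

```python
import re

# Letters, digits, and the limited punctuation set listed in the description.
ALLOWED_CHARS = re.compile(r"[A-Za-z0-9%?/=#.\-_]*")


def follows_spelling_logic(part):
    """True for letters only, letters then digits, or digits then letters."""
    return re.fullmatch(r"[A-Za-z]+[0-9]*|[0-9]+[A-Za-z]+", part) is not None


def path_feature_scores(path, min_length=2):
    """Return a dict of triggered path features and hypothetical penalties."""
    scores = {}
    if not ALLOWED_CHARS.fullmatch(path):
        scores["special_characters"] = -5
    # Symbol segmentation: split on every run of non-alphanumeric characters.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", path) if p]
    if sum(not follows_spelling_logic(p) for p in parts) > 2:
        scores["spelling"] = -5
    if sum(len(p) < min_length for p in parts) > 2:
        scores["short_parts"] = -3
    return scores


clean = path_feature_scores("/wiki/index")
short = path_feature_scores("/a/b/c/d1e")
odd = path_feature_scores("/wiki/<script>")
```

The second path is flagged only for its three one-character parts, and the third only for the characters `<` and `>`.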

(3) The component is a file name.

If the file name contains special characters, the file name is given a negative score. If the spelling of the file name does not conform to the spelling logic, the file name is given a negative score.

(4) The component is a port.

If the port does not match a preset port, the port is given a negative score. Commonly used ports are 80, 8080, and 8081.

(5) The component is a data parameter.

If the data parameter is not in key-value form, that is, a data parameter name and a data parameter value, the data parameter is given a negative score. If the data parameter value contains a slash ("/"), the data parameter is given a negative score.

(6) The component is an anchor.

If the anchor contains a slash ("/"), the anchor is given a negative score.
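The port, data parameter, and anchor rules are simple enough to sketch directly. The function names and the penalty values (taken from the magnitudes used in the FIG. 3B example) are illustrative assumptions, as is the choice to give a score of 0 when the URL has no explicit port or no data parameter.

```python
COMMON_PORTS = {80, 8080, 8081}  # the commonly used ports named in the description


def port_score(port, penalty=-5):
    """Negative score if an explicit port is not one of the preset ports."""
    return 0 if port is None or port in COMMON_PORTS else penalty


def query_score(query, penalty=-3):
    """Negative score if any item is not name=value or its value contains '/'."""
    if not query:
        return 0
    for item in query.split("&"):
        name, separator, value = item.partition("=")
        if not separator or not name or "/" in value:
            return penalty
    return 0


def anchor_score(anchor, penalty=-5):
    """Negative score if the anchor contains a slash."""
    return penalty if "/" in anchor else 0


results = [port_score(80), port_score(6799),
           query_score("uid=1212"), query_score("2121&1312"),
           anchor_score("head"), anchor_score("a/b")]
```

The six calls evaluate to 0, -5, 0, -3, 0, and -5.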

It should be noted that, when assigning detection scores to the components, it may be necessary to determine whether the spelling of some components conforms to the spelling logic. In a specific implementation, the features of a component may be compared with features that conform to the spelling logic; if they match, the component conforms to the spelling logic, and otherwise it does not. Alternatively, a component in which letters and digits are interleaved can be directly determined not to conform to the spelling logic.

In addition, when the malicious URL detection device assigns scores according to the features of each component, the scores can reflect the importance of each feature: for a more important feature, such as the spelling logic, the negative score given when the feature is violated is lower. Further, after obtaining different scores from multiple features of a component, the malicious URL detection device may select the lowest score as the detection score of that component.
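Selecting the lowest feature score as the component's detection score amounts to taking a minimum. In the sketch below the function name, the feature names, and the penalty values are hypothetical; the only point illustrated is that the most heavily penalised feature determines the component's score, and that a component with no triggered feature scores 0 (an assumption consistent with the FIG. 3A example).

```python
def component_score(feature_scores):
    """Detection score of one component: the lowest of its feature scores.

    feature_scores -- mapping of triggered feature name to negative score;
                      an empty mapping yields 0.
    """
    return min(feature_scores.values(), default=0)


domain_score = component_score(
    {"too_many_levels": -2, "spelling": -10, "similar_to_known": -8}
)
```

Here the spelling feature carries the largest penalty, so the component's score is -10.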

FIG. 3A is a schematic diagram of dividing a to-be-detected URL according to an embodiment of the present invention. As shown in FIG. 3A, the to-be-detected URL 310 is http://zh.wikipedia.org:80/wiki/TCP/UDP port list/index.html?uid=1212#head, and the URL after division is shown as 320 in FIG. 3A. The domain name is zh.wikipedia.org (see 321 in FIG. 3A); its number of levels is less than 4, so the domain name is assigned a score of 0. The port is 80 (see 322 in FIG. 3A), which is a commonly used server port, so the port is assigned a score of 0. The path is wiki/TCP/UDP port list (see 323 in FIG. 3A); the name of each part is a common name with a definite meaning, so the path is assigned a score of 0. The file name is index.html (see 324 in FIG. 3A), which is a default home page name, so the file name is assigned a score of 0. The data parameter is uid=1212 (see 325 in FIG. 3A), which is in the form of a data parameter name and a data parameter value, so the data parameter is assigned a score of 0. The anchor head (see 326 in FIG. 3A) indicates that, after the page corresponding to the URL is opened, it automatically scrolls to the position of the anchor named head, so the anchor is assigned a score of 0.

In this embodiment of the present invention, the overall score is calculated by adding the scores of the components. Thus, according to the scores of the components, the overall score of the to-be-detected URL 310 is 0+0+0+0+0+0=0. Assume that the preset threshold for malicious URLs is -10. Since the overall score of the to-be-detected URL 310 is greater than the threshold, the to-be-detected URL 310 is determined to be a non-malicious URL.

FIG. 3B is another schematic diagram of dividing a to-be-detected URL according to an embodiment of the present invention. As shown in FIG. 3B, for the to-be-detected URL 330:

http://qz0ne.qq.com.8866.org:6799/s3u/a/q.asp?2121&1312#^AAAAA, the URL after division is shown as 340 in FIG. 3B. The domain name is qz0ne.qq.com.8866.org (see 341 in FIG. 3B); its number of levels is greater than 4, it uses the free top-level domain name 8866.org, and it is highly similar to the preset domain name qzone.qq.com, so the domain name receives a low detection score. In this embodiment, the domain name is assigned a score of -10. The port is 6799 (see 342 in FIG. 3B), which is not a commonly used server port, so the port is assigned a score of -5. The path is s3u/a (see 343 in FIG. 3B); s3u has no meaning and does not match how users normally name paths, so the path is assigned a score of -5. The file name is q.asp (see 344 in FIG. 3B). The data parameter is 2121&1312 (see 345 in FIG. 3B), which is not in the form of a data parameter name and a data parameter value, so the data parameter is assigned a score of -3. The anchor contains the special characters ^AAAAA (see 346 in FIG. 3B), so the anchor is assigned a score of -5.

In this embodiment of the present invention, the overall score is calculated by weighting the scores of the components to obtain weighted values and adding the weighted values. Assume that the weighting coefficients are as follows: the domain name has a weight of 3, the port has a weight of 2, and every other component has a weight of 1.

Thus, according to the scores of the above components, the overall score of the to-be-detected URL 330 is 3*(-10) + 2*(-5) + (-5) + (-3) + (-5) = -53. Assume that the preset threshold for malicious URLs is -10. Since the overall score of the to-be-detected URL 330 is less than the threshold, the to-be-detected URL 330 is determined to be a malicious URL.
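The arithmetic of the two worked examples can be checked with a few lines of Python. The dictionary keys and variable names are illustrative; the scores, weights, and threshold are the ones stated for FIG. 3A and FIG. 3B. The file name of URL 330 is left out because the example assigns it no score.

```python
THRESHOLD = -10  # preset threshold for malicious URLs used in both examples

# FIG. 3A: every component of URL 310 scores 0 and the scores are simply added.
scores_310 = {"domain": 0, "port": 0, "path": 0,
              "filename": 0, "data_parameter": 0, "anchor": 0}
total_310 = sum(scores_310.values())

# FIG. 3B: the scores of URL 330 are weighted before being added.
scores_330 = {"domain": -10, "port": -5, "path": -5,
              "data_parameter": -3, "anchor": -5}
weights = {"domain": 3, "port": 2}  # every other component has weight 1
total_330 = sum(score * weights.get(name, 1)
                for name, score in scores_330.items())

is_malicious_310 = total_310 < THRESHOLD
is_malicious_330 = total_330 < THRESHOLD
```

The totals come out as 0 and -53, so only URL 330 falls below the threshold.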

FIG. 4 is a schematic structural diagram of a malicious website detecting device according to an embodiment of the present invention. As shown in FIG. 4, the malicious website detecting device 400 includes:

The dividing unit 410 is configured to divide the to-be-detected web address into multiple components, and the multiple components may include any combination of a domain name, a port, a path, a file name, a data parameter, and an anchor point.

The score distribution unit 411 is configured to allocate a corresponding detection score to each component of the plurality of component parts divided by the division unit 410.

The overall score determining unit 412 is configured to determine the overall score of the to-be-detected URL according to the detection scores of the components assigned by the score assigning unit 411. Specifically, the overall score determining unit 412 may add the detection scores of the components to obtain the overall score, or add the weighted values of the detection scores of the components to obtain the overall score.

The malicious URL determining unit 413 is configured to determine that the to-be-detected URL is a malicious URL if the overall score determined by the overall score determining unit 412 is within the preset first score range for malicious URLs; for example, when the overall score is lower than a preset first threshold, the URL is determined to be a malicious URL.

Further, in another embodiment of the present invention, the device 400 may further include a suspicious URL determining unit 414, configured to determine that the to-be-detected URL is a suspicious URL if the overall score determined by the overall score determining unit 412 is within a preset second score range for suspicious URLs; for example, when the overall score is higher than the preset first threshold and lower than a preset second threshold, the URL is determined to be a suspicious URL.

It should be noted that the above-mentioned score distribution unit 411 may specifically allocate detection scores to different components according to different strategies, specifically:

When the dividing unit 410 obtains a domain name from the to-be-detected URL, the score assigning unit 411 is specifically configured to give the domain name a negative score if the number of levels of the domain name exceeds the preset number of levels, or if the similarity between the domain name and a preset domain name is higher than a preset similarity, or if the spelling of the domain name does not conform to the spelling logic.

When the dividing unit 410 obtains a file name from the to-be-detected URL, the score assigning unit 411 is specifically configured to give the file name a negative score if the file name contains a special character, or if the spelling of the file name does not conform to the spelling logic.

When the dividing unit 410 obtains a path from the to-be-detected URL, the score assigning unit 411 is specifically configured to: give the path a negative score if the path contains a special character; or perform symbol segmentation on the path and give the path a negative score if the spelling of two or more of the resulting parts does not conform to the spelling logic; or perform symbol segmentation on the path and give the path a negative score if the length of two or more of the resulting parts is less than the preset length.

When the dividing unit 410 obtains a data parameter from the to-be-detected URL, the score assigning unit 411 is specifically configured to give the data parameter a negative score if the data parameter is not in the form of a data parameter name and a data parameter value, or if the data parameter value contains a slash.

When the dividing unit 410 obtains a port from the to-be-detected URL, the score assigning unit 411 is specifically configured to give the port a negative score if the port does not match the preset port.

When the dividing unit 410 obtains an anchor from the to-be-detected URL, the score assigning unit 411 is specifically configured to give the anchor a negative score if the anchor contains a slash.

It can be seen that, in the malicious URL detection device 400 of this embodiment, the dividing unit 410 divides the to-be-detected URL into multiple components, the score assigning unit 411 assigns a corresponding detection score to each component, and the overall score determining unit 412 determines the overall score of the to-be-detected URL according to the detection scores of the components. If the overall score is within the preset first score range for malicious URLs, the malicious URL determining unit 413 determines that the to-be-detected URL is a malicious URL. In this way, whether a URL is malicious can be determined by operating directly on the URL itself, without operating on the content corresponding to the URL, which saves the time needed to obtain that content and can improve detection efficiency.

The method and device for detecting a malicious URL provided by the embodiments of the present invention can be applied to a user terminal. FIG. 5 is a schematic diagram of a user terminal according to an embodiment of the present invention. As shown in FIG. 5, the user terminal 500 includes a processor 510, a memory 520, a communication unit 530, and an input/output unit 540.

Computer readable instructions are stored in the memory 520. By running the computer readable instructions stored in the memory 520, the processor 510 performs the functions of the dividing unit 410, the score assigning unit 411, the overall score determining unit 412, the malicious URL determining unit 413, and the suspicious URL determining unit 414 of FIG. 4. The specific functions and operations have been described in detail in the foregoing embodiments with reference to FIG. 2 and FIG. 4, and are not described here again.

The memory 520 may be a read only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, or the like.

Communication unit 530 is for communicating with the network over a wired link or a wireless link. For example, communication unit 530 can include a radio frequency (RF) circuit. Thus, the terminal 500 can be connected to the network via a wireless link.

The input/output unit 540 is connected to the input/output device through an interface, receives numeric or character information input by the user, and displays information to the user.

In addition, the terminal 500 may further include other components, such as a camera, a Bluetooth module, and the like, and details are not described herein again.

The method and device for detecting a malicious URL provided by the embodiments of the present invention are described in detail above. The description of the above embodiments is only intended to help in understanding the method and core idea of the present invention, and, for those skilled in the art, the content of this description should not be construed as limiting the present invention.

Claims

1. A method for detecting malicious URLs, which is characterized by including:

Divide the URL to be detected into multiple components, and the multiple components include any combination of domain name, port, path, file name, data parameter and anchor point;

Assigning a corresponding detection score to each component among the plurality of components;

Determining the overall score of the URL to be detected based on the detection score of each component;

If the overall score is within the first score range of the preset malicious URL, it is determined that the URL to be detected is a malicious URL.

2. The method of claim 1, wherein the components include a domain name, and assigning a corresponding detection score to each of the plurality of components specifically includes:

If the number of levels of the domain name exceeds the preset number of levels, a negative score is given to the domain name; or, if the similarity between the domain name and the preset domain name is higher than the preset similarity, a negative score is given to the domain name; or, if the spelling of the domain name does not comply with the spelling logic, a negative score is given to the domain name.

3. The method of claim 1, wherein the components include a file name, and assigning a corresponding detection score to each of the multiple components specifically includes:

If the file name contains special characters, a negative score is given to the file name; or, if the spelling of the file name does not comply with the spelling logic, a negative score is given to the file name.

4. The method of claim 1, wherein the components include a path, and assigning a corresponding detection score to each of the multiple components specifically includes:

if the path contains special characters, giving a negative score to the path; or,

performing symbol splitting on the path, and if the spelling of more than two split parts does not comply with the spelling logic, giving a negative score to the path; or,

performing symbol splitting on the path, and if the length of more than two split parts is less than a preset length, giving a negative score to the path.
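A minimal sketch of two of the path checks in claim 4 (special characters and short split parts) is given below. The set of special characters, the splitting symbols, the preset length and the penalty value are all assumptions for illustration; the spelling logic check is left out because the claim does not specify it.

```python
import re

SPECIAL_CHARS = set("@$!*'(),")   # assumed set of special characters
MIN_PART_LENGTH = 3               # assumed preset length
PENALTY = -5                      # assumed negative score


def score_path(path):
    """Return a detection score for a URL path (illustrative only)."""
    if any(ch in SPECIAL_CHARS for ch in path):
        return PENALTY
    # Symbol splitting: split on "/", ".", "_" and "-" (assumed separators).
    parts = [p for p in re.split(r"[/._-]", path) if p]
    short_parts = [p for p in parts if len(p) < MIN_PART_LENGTH]
    # "More than two" split parts shorter than the preset length.
    if len(short_parts) > 2:
        return PENALTY
    return 0
```

For example, "/a/b/c/d" has four one-character parts and is penalized, whereas "/a/b/news" has only two short parts and is not.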

5. The method of claim 1, wherein:

if the components include a data parameter, assigning a corresponding detection score to each of the multiple components specifically includes: if the data parameter is not in the form of a data parameter name and a data parameter value, giving a negative score to the data parameter; or, if the data parameter value contains a slash, giving a negative score to the data parameter;

if the components include a port, assigning a corresponding detection score to each of the multiple components specifically includes: if the port does not match a preset port, giving a negative score to the port;

if the components include an anchor point, assigning a corresponding detection score to each of the multiple components specifically includes: if the anchor point contains a slash, giving a negative score to the anchor point.
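The three checks of claim 5 are simple enough to sketch directly. The penalty value and the set of preset ports below are hypothetical, as is the decision to treat a URL without an explicit port (None) as matching the preset.

```python
PENALTY = -5                      # assumed negative score
ALLOWED_PORTS = {None, 80, 443}   # assumed preset ports; None = no explicit port


def score_data_parameter(query):
    """Penalize parameters not of the form name=value, or values with a slash."""
    if not query:
        return 0
    for pair in query.split("&"):
        name, sep, value = pair.partition("=")
        if not sep or not name:
            return PENALTY        # not in name=value form
        if "/" in value:
            return PENALTY        # value contains a slash
    return 0


def score_port(port):
    """Penalize a port that does not match any preset port."""
    return 0 if port in ALLOWED_PORTS else PENALTY


def score_anchor(anchor):
    """Penalize an anchor point that contains a slash."""
    return PENALTY if "/" in anchor else 0
```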

6. The method according to any one of claims 1 to 5, wherein determining the overall score of the URL to be detected based on the detection score of each component specifically includes:

The overall score is obtained by adding the detection scores of each component; or, the overall score is obtained by adding the weighted values of the detection scores of each component.
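The two alternatives in claim 6, a plain sum and a weighted sum, might be sketched as follows. The component scores and weights are made-up values for illustration, and defaulting a missing weight to 1.0 is an assumption.

```python
def overall_score(scores, weights=None):
    """Combine per-component scores by plain or weighted addition."""
    if weights is None:
        return sum(scores.values())
    # Assumption: a component without an explicit weight uses weight 1.0.
    return sum(weights.get(name, 1.0) * value for name, value in scores.items())


scores = {"domain": -10, "path": -5, "file_name": 0}     # hypothetical scores
weights = {"domain": 2.0, "path": 1.0, "file_name": 0.5}  # hypothetical weights

plain = overall_score(scores)              # -15
weighted = overall_score(scores, weights)  # -25.0
```

Weighting lets a component considered more indicative, here the domain name, contribute more to the overall score than the others.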

7. The method according to any one of claims 1 to 5, characterized in that the method further includes:

if the overall score is within a preset second score range for suspicious URLs, determining that the URL to be detected is a suspicious URL.
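The mapping from overall score to result in claims 1 and 7 could look like the sketch below. The range boundaries are hypothetical, and the "not flagged" label for scores outside both ranges is an assumption, since the claims do not name that case.

```python
MALICIOUS_MAX = -20    # assumed first score range: overall <= -20
SUSPICIOUS_MAX = -5    # assumed second score range: -20 < overall <= -5


def classify(overall):
    """Map an overall score to a detection result (illustrative only)."""
    if overall <= MALICIOUS_MAX:
        return "malicious"
    if overall <= SUSPICIOUS_MAX:
        return "suspicious"
    return "not flagged"
```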

8. A device for detecting a malicious URL, characterized by comprising:

a dividing unit, configured to divide the URL to be detected into multiple components, the multiple components including any combination of a domain name, a port, a path, a file name, a data parameter and an anchor point;

a score allocation unit, configured to assign a corresponding detection score to each of the multiple components divided by the dividing unit;

an overall score determination unit, configured to determine an overall score of the URL to be detected based on the detection score of each component assigned by the score allocation unit; and

a malicious URL determination unit, configured to determine that the URL to be detected is a malicious URL if the overall score determined by the overall score determination unit is within a preset first score range for malicious URLs.

9. The device according to claim 8, characterized in that,

the dividing unit is specifically configured to divide out the domain name in the URL to be detected; and the score allocation unit is specifically configured to: give a negative score to the domain name if the level of the domain name exceeds a preset level; or, give a negative score to the domain name if the similarity between the domain name and a preset domain name is higher than a preset similarity; or, give a negative score to the domain name if the spelling of the domain name does not comply with the spelling logic.

10. The device according to claim 8, characterized in that,

the dividing unit is specifically configured to divide out the file name in the URL to be detected; and the score allocation unit is specifically configured to: give a negative score to the file name if the file name contains special characters; or, give a negative score to the file name if the spelling of the file name does not comply with the spelling logic.

11. The device according to claim 8, characterized in that,

the dividing unit is specifically configured to divide out the path in the URL to be detected; and the score allocation unit is specifically configured to: give a negative score to the path if the path contains special characters; or, perform symbol splitting on the path, and give a negative score to the path if the spelling of more than two split parts does not comply with the spelling logic; or, perform symbol splitting on the path, and give a negative score to the path if the length of more than two split parts is less than a preset length.

12. The device according to claim 8, characterized in that,

the dividing unit is specifically configured to divide out the data parameter in the URL to be detected; and the score allocation unit is specifically configured to: give a negative score to the data parameter if the data parameter is not in the form of a data parameter name and a data parameter value; or, give a negative score to the data parameter if the data parameter value contains a slash; or,

the dividing unit is specifically configured to divide out the port in the URL to be detected; and the score allocation unit is specifically configured to give a negative score to the port if the port does not match a preset port; or,

the dividing unit is specifically configured to divide out the anchor point in the URL to be detected; and the score allocation unit is specifically configured to give a negative score to the anchor point if the anchor point contains a slash.

13. The device according to any one of claims 8 to 12, wherein the overall score determination unit is specifically configured to add the detection scores of each component to obtain the overall score; or, to add the weighted values of the detection scores of each component to obtain the overall score.

14. The device according to any one of claims 8 to 12, further comprising: a suspicious URL determination unit, configured to determine that the URL to be detected is a suspicious URL if the overall score is within a preset second score range for suspicious URLs.