WO2003056704A1

WO2003056704A1 - Method and apparatus for adapting data compression according to previous exchange of information

Info

Publication number: WO2003056704A1
Application number: PCT/FI2002/001061
Authority: WO
Inventors: Heikki Mannila
Original assignee: Nokia Corporation
Priority date: 2001-12-31
Filing date: 2002-12-31
Publication date: 2003-07-10
Also published as: FI20012597A0; EP1470645A1; AU2002352313A1; FI20012597A

Abstract

A method and apparatuses are disclosed for compressing digital data before transmitting it from a communications device of a first party to a communications device of a second party. There is maintained (302, 303) a collection of digital data (105, 115) that represents an archive of information (104, 114, 201) that has been exchanged between the first party and the second party. When new information (202) to be transmitted from the communications device of the first party to the communications device of the second party is composed (308), it is represented as digital data. Said digital data is compressed (311) by utilizing correspondences between the new information (202) and the information in said archive (201), so that at least some of the resulting compressed digital data (203) refers to information in said archive (201).

Description

Method and apparatus for adapting data compression according to previous exchange of information

The invention concerns generally the field of compressing data before transmission over a communications link and decompressing data after such compressing. Especially the invention concerns the task of compressing and decompressing when performed by communicating parties between which some other information has already been previously transmitted.

The processing and storage capacity of data-transferring digital devices, even lightweight portable ones, has become large enough to justify the statement that the available bandwidth of the communication channel is by far the most severe limiting factor regarding the volume of data to be transmitted. For example a single SMS (Short Message Service) message has a maximum length of 160 characters. Also the data rates available for many WAP (Wireless Application Protocol) and GPRS (General Packet Radio Service) services may be quite low. There is clearly a need for compressing all kinds of data for the purposes of transmission over a communications channel that has a limited bandwidth. Even stationary computers that are fixedly coupled to data transfer networks with cable or fiber connections would benefit from effective data compression.

A vast variety of known algorithms exist for compressing and decompressing digital data. Simple examples include run length encoding, in which sequences of repetitively occurring symbols are effectively represented with symbols that denote the length of such sequences, as well as various codeword mapping schemes where frequently occurring symbol sequences have shorter codeword representations. More complicated compression schemes take advantage of the physical nature of the signals that the digital bit streams represent: prominent examples are the MPEG (Moving Picture Experts Group) encoding and decoding arrangements where the encoding process is partly based on the relative amounts of oscillating components of different frequencies present in the signal to be encoded.

Despite the many advantages of known universally applicable data compression systems, it has been the opinion of the present inventors that the optimal solution for compressing digital data under specific circumstances remains to be found.

It is an objective of the present invention to provide methods, arrangements and apparatuses for compressing and decompressing digital data that is transferred between devices between which there has already been some previous exchanging of information. It is another objective of the invention that the same principle could be widely applied regardless of the types of devices in question, and regardless of the other programs that are used for exchanging the information. It is a further objective of the invention that it can also include cryptographic aspects.

The objectives of the invention are met by using an archive of previously exchanged information as the basis for compressing and decompressing.

The methods according to the various method aspects of the invention are characterised by the features recited in the corresponding independent method claims.

The apparatuses according to the various apparatus aspects of the invention are characterised by the features recited in the corresponding independent apparatus claims.

The invention is based on the insight that data compression methods that are meant to be universally applicable are by definition excluded from taking into account the specific circumstances in which they are used at any particular time, and thus cannot benefit from anything that would be specific to that particular communication connection only. Remarkable advantage in compression performance can be gained by noting that in many cases the information that is transmitted between two devices recycles certain features of information that has been previously exchanged between these two devices. According to the present invention such previously exchanged information is seen as a dynamically changing source of ingredients for subsequently composed messages.

In a simple embodiment of the invention passages of previously exchanged information that are known to both communicating parties are taken and represented with codes that are remarkably shorter than the passages themselves. Further pieces of information are then transmitted in compressed form by taking advantage of the codes. Both devices use the same logic for associating words and/or passages from previously exchanged information with short codes. Whenever the uncompressed form of a subsequently composed message includes an exact copy of something that already exists in the stock of code-associated message ingredients, the compression algorithm replaces that passage with the appropriate code. Correspondingly the decompression algorithm recognizes the code and retrieves the original word or passage as a part of its decompressing work. The invention only works if the compression and decompression algorithms both refer to exactly the same "dictionary" collection of previously exchanged information, or at least to such part of it where the associations between pieces of uncompressed information and the corresponding codes are the same. A simple and bandwidth-efficient way of ensuring the required similarity is to make the devices use a standard algorithm to calculate some kind of a compressed hash of the dictionary, and to incorporate into some part of initiating a connection a step of transmitting said hash over to the other device. The fact that the dictionary is dynamically changing means that each time a new piece of information has been successfully exchanged, both devices augment their previous dictionaries with this new piece of information and calculate a new hash.

An example of the applicability of the present invention is the exchanging of e-mail (electronic mail) or SMS messages between two communicating parties. A human user transmits an overwhelming majority of all his messages to a conversation partner, i.e. to a recipient with which he has exchanged information also before, using the same form of transmission. The messages that were exchanged previously with this particular recipient constitute a dictionary of words and passages. When the human user writes yet another message, the compressing algorithm goes into the dictionary that consists of previously exchanged messages and looks for similarities between words and phrases in the newly written message and occurrences of the same words and phrases in the dictionary. The compressing, which may take place completely without the user even knowing about data compression being in use, involves replacing the found occurrences of previously known words and phrases with short codes that appear in association with these words and phrases in the dictionary. The recipient of the message may also remain unaware of any data compression being in use, because before presenting him with the arrived message, the decompression algorithm restores its original plaintext information content with a dictionary lookup operation that is effectively the inverse of that performed by the compression algorithm. The compression and decompression algorithms maintain their dictionaries in an up-to-date state by accepting each new successfully exchanged message into the dictionary, so that the words and phrases contained therein are available for compressing and decompressing subsequent messages.
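The way both ends keep their dictionaries up to date can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class name `SharedDictionary`, the word-level granularity and the rule of numbering words in order of first appearance are all hypothetical choices made for the example.

```python
class SharedDictionary:
    """Archive of exchanged messages plus codes derived from it (hypothetical design)."""

    def __init__(self):
        self.archive = []   # plaintext messages, in order of exchange
        self.codes = {}     # word -> integer code

    def augment(self, message):
        """Add a successfully exchanged message and give codes to its new words."""
        self.archive.append(message)
        for word in message.split():
            if word not in self.codes:
                # Assumed rule: the next free integer, in order of first appearance.
                self.codes[word] = len(self.codes)


# Both parties apply the same rule to the same messages in the same order.
sender, recipient = SharedDictionary(), SharedDictionary()
for msg in ["see you at the station", "the train leaves at noon"]:
    sender.augment(msg)      # after transmitting
    recipient.augment(msg)   # after receiving
```

Because the code assignment is a deterministic function of the exchanged messages, the two dictionaries stay identical without any codes being transmitted.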

In addition to simple dictionary-based compressing and decompressing schemes the invention allows the use of more sophisticated compressing and decompressing algorithms, as long as they all have the common feature of using a stock of previously exchanged information as the basis of establishing congruence between uncompressed information and compressed data.
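One readily available algorithm of this more general kind is DEFLATE with a preset dictionary, which Python exposes through the `zdict` argument of `zlib.compressobj` and `zlib.decompressobj`. The sketch below assumes that the archive of previously exchanged information is available to both ends as one byte string; the helper names and the sample texts are hypothetical, and the text does not name this particular algorithm.

```python
import zlib


def compress_with_archive(message: bytes, archive: bytes) -> bytes:
    """Compress `message`, letting the compressor refer back into `archive`."""
    comp = zlib.compressobj(zdict=archive)
    return comp.compress(message) + comp.flush()


def decompress_with_archive(data: bytes, archive: bytes) -> bytes:
    """Inverse operation; works only with exactly the same `archive`."""
    decomp = zlib.decompressobj(zdict=archive)
    return decomp.decompress(data) + decomp.flush()


archive = b"Meeting moved to Thursday at 10. Please confirm the meeting room."
message = b"Please confirm the meeting room for Thursday at 10."
packed = compress_with_archive(message, archive)
restored = decompress_with_archive(packed, archive)
```

Passages of the new message that already occur in the archive are encoded as back-references, so a short message that would hardly shrink on its own can still be compressed.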

The novel features which are considered as characteristic of the invention are set forth in particular in the appended claims. The invention itself, however, both as to its construction and its method of operation, together with additional objects and advantages thereof, will be best understood from the following description of specific embodiments when read in connection with the accompanying drawings.

Fig. 1 illustrates a communication connection where the invention is applied,

fig. 2 illustrates dictionary-based compressing according to an embodiment of the invention,

figs. 3a, 3b and 3c illustrate the exchanging of messages in a method according to an embodiment of the invention, and

fig. 4 illustrates a method according to an embodiment of the invention.

The exemplary embodiments of the invention presented in this patent application are not to be interpreted to pose limitations to the applicability of the appended claims. The verb "to comprise" is used in this patent application as an open limitation that does not exclude the existence of also unrecited features. The features recited in depending claims are mutually freely combinable unless otherwise explicitly stated.

Fig. 1 illustrates a communications connection where the invention would most typically be applied. Two users, human or other, communicate with each other over a communications connection of the so-called point-to-point type in order to exchange information that has a certain plaintext, uncompressed form. Such information typically consists of character strings that constitute words, phrases and numbers. A further typical feature is that the exchange of information takes place in the form of passing discrete information sequences, referred to as messages, over the communications connection. A communications connection between two particular users is by no means unique, but typically each user has a multitude of possible partners to communicate with.

We will refer to "information" as being something that is directly representable to a user, and thus has a certain informative content that a user can appreciate. In the context of the present invention, information comes most typically in the form of text. Similarly we refer to "data" as something that machines use to represent information digitally. In other words data consists of strings of digital symbols. Compression and decompression algorithms essentially handle data, because the objective of compression is primarily to make the total number of transmitted digital symbols as small as possible.

Each user has a user interface 101, 111 in order to input information to be transmitted and to extract (read) received information. Each user has also a compressing and decompressing entity 102, 112 that takes care of compressing the data that represents information to be transmitted and decompressing the data that represents received information. The communication device of each user maintains a dictionary 103, 113 that comprises an archive of plaintext information 104, 114 that has been previously transmitted to or received from a particular other user. The dictionary 103, 113 also comprises certain codes 105, 115 that are associated with pieces of information in the plaintext archive 104, 114. The information and the associated codes that are held in the dictionaries 103, 113 are necessarily arranged (or at least adaptable to be arranged) on a per communication counterpart basis, because a single communication counterpart can only be aware of the information that has been exchanged with him, and thus only such information can be used as a basis of compression coding in further communication with that counterpart.

The user interface 101, 111 only needs to have a bidirectional connection to the plaintext part 104, 114 of the dictionary, while the compressing and decompressing entity 102, 112 must be able to access both the plaintext part 104, 114 and the codes 105, 115.

Fig. 2 illustrates schematically the principle of using a user-specific dictionary for compressing. We assume that there exists a dictionary 201 where the left-hand column represents the archive of plaintext expressions (here in some fictitious language) that have been used previously in communication between the user in question and another party to which the present user is about to transmit a new message 202. The right-hand column of the dictionary 201 represents the unambiguous codes that have been associated with each of the previously used plaintext expressions. When the compression algorithm examines the newly composed plaintext message 202, it notices that several parts of it appear directly in the dictionary 201, so it replaces those parts with the appropriate codes to produce the compressed message 203. Here we assume also that end-of-lines appear in the compressed message 203 as codes (E). Decompression is a simple inverse operation where the compressed message 203 is converted into the plaintext form by replacing each code in it with the appropriate plaintext expression.
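The substitution principle of fig. 2 can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the figure: the dictionary entries are invented, codes are integers, the compressed message is a list of tokens in which unknown words remain as literal strings, and end-of-line codes are not modeled (whitespace is normalized to single spaces).

```python
def compress(message, dictionary):
    """Replace known expressions with codes; unknown words stay as literal strings."""
    words = message.split()
    longest = max((len(p.split()) for p in dictionary), default=1)
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(longest, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in dictionary:
                out.append(dictionary[phrase])   # integer code
                i += n
                break
        else:
            out.append(words[i])                 # literal word, not in the dictionary
            i += 1
    return out


def decompress(tokens, dictionary):
    """Inverse lookup: every integer code is replaced by its plaintext expression."""
    inverse = {code: phrase for phrase, code in dictionary.items()}
    return " ".join(inverse[t] if isinstance(t, int) else t for t in tokens)


dictionary = {"good morning": 1, "see you": 2, "tomorrow": 3}   # hypothetical entries
message = "good morning Anna see you tomorrow"
compressed = compress(message, dictionary)       # [1, 'Anna', 2, 3]
```

Keeping codes and literals as different token types avoids any ambiguity between a code and a literal word that happens to look like one; a real wire format would need an equivalent escape mechanism.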

The invention does not require the users to be human users. Figs. 3a to 3c illustrate the exchange of certain messages between the terminal device of a human user and a server, which in this example is an e-mail server. Step 301 represents generally the exchange of e-mail messages between these two devices. On the basis of the previously exchanged messages the terminal and the server both compose their own dictionaries at steps 302 and 303 respectively. In this example we denote the dictionary composed by the terminal as D and the dictionary composed by the server as D(U) to emphasize the fact that this dictionary is only one of those possibly composed by the server, and specifically relevant to user U. The terminal and the server both also calculate certificates (or "hashes") from their own dictionaries at steps 304 and 305 respectively. A certificate is a kind of fingerprint of a certain composition of digital data. The algorithm that is used for calculating the certificate is always the same, so the certificate assumes a value that depends essentially unambiguously on the initial data. Many suitable algorithms for calculating certificates or hashes are known from the field of digital cryptography. It is essential that two different sets of initial data are most unlikely to produce the same certificate, so that if two certificates are the same, it is reasonable to assume that they were calculated from exactly the same initial data.
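A certificate of this kind can be sketched with a standard cryptographic hash. The choice of SHA-256 and the serialization of the dictionary entries below are assumptions made for the illustration; the description only requires that both ends use the same algorithm.

```python
import hashlib


def certificate(dictionary: dict) -> str:
    """Fingerprint of a dictionary that maps plaintext expressions to codes."""
    h = hashlib.sha256()
    # Sort the entries so that the result depends only on the content,
    # not on the order in which the entries were inserted.
    for phrase, code in sorted(dictionary.items()):
        # The length prefix keeps the serialization unambiguous.
        h.update(f"{len(phrase)}:{phrase}={code};".encode("utf-8"))
    return h.hexdigest()


terminal_dict = {"good morning": 1, "see you": 2}
server_dict = {"see you": 2, "good morning": 1}
in_sync = certificate(terminal_dict) == certificate(server_dict)
```

Only the fixed-length certificate has to be transmitted in order to compare two dictionaries of arbitrary size.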

At step 306 the terminal receives from its user a request for obtaining certain information from the server. As an example we may assume that the user wants to retrieve the headers of the e-mail messages that are currently in his "incoming" folder at the e-mail server. Typically e-mail transmission protocols are very well optimized for composing very tightly compressed standard requests; for example protocols are known where the request "give me the headers of messages in the incoming folder" can be expressed with a single letter "m". Therefore we may assume that no additional compressing in the sense meant by the present invention is necessary for transmitting the appropriate request to the server. It is also self-evident that the request must contain some identification of the user whose mail is requested. The part of the request message transmitted at step 307 that has the most to do with the invention is the certificate c(D). The terminal transmits this certificate to the server so that the latter can then check that the dictionaries are equal at both ends.

At step 308 the server receives the request and finds the appropriate answer. At step 309 the server compares the certificate c(D) that it received from the terminal with the certificate c(D(U)) that it has previously calculated from its own dictionary. From this step onwards the action follows either the steps shown in fig. 3b or those shown in fig. 3c depending on whether the server found the certificates to be equal or not at step 309.

Let us first assume that the certificates were the same, indicating that the terminal's current dictionary D is the same as the server's corresponding current dictionary D(U). This means that after step 309 the steps illustrated in fig. 3b take place. At step 311 the server compresses the answer it found at step 308, using the dictionary D(U) for compressing. At step 312 the server transmits the compressed answer to the terminal, which receives it and decompresses it with the help of the dictionary D at step 313.

The steps 311, 312 and 313 represent the transmission of a newly composed message from the server to the terminal, which means that fresh input is now available to the dictionaries at both ends. Therefore both the terminal and the server augment the previous composition of their respective dictionaries at steps 314 and 315 respectively by adding the information that was contained in the server's response. If augmenting the dictionary involves other actions, like recalculating codes, these are also performed at steps 314 and 315. Now the contents of the dictionaries have changed and the previous certificates are not valid any more, which means that new certificates must be calculated at both ends as is illustrated by steps 316 and 317. At step 318 the terminal displays the information content of the requested answer to the user. The location in time of step 318 as regards steps 314 and 316 is not important to the invention. In many cases it is even better to display the requested information to the user as soon as it is available in plaintext form at the terminal, in order to minimize the delays that the user experiences.

In principle a dictionary-augmenting and certificate-calculating round like that illustrated as steps 314, 315, 316 and 317 could have taken place already after the server had successfully received the request that the terminal transmitted at step 307. A newly calculated certificate could then have been included in the message that the server transmits at step 312, just to inform the terminal that the server has indeed augmented its dictionary before compressing the requested answer. However, since the request message of step 307 is typically of a very strictly standardized and compressed form, it contains only few such ingredients that could be useful in user-dependent compressing of further messages. In the embodiment of figs. 3a to 3c we assume that dictionaries are only augmented after there has been exchanged something the form and/or content of which is at least somehow dependent on the actual users.

Let us then assume that the certificates were not the same at step 309, indicating that the terminal's current dictionary D was for some reason not the same as the server's corresponding current dictionary D(U). This means that after step 309 now the steps illustrated in fig. 3c take place. It also means that the current dictionaries cannot be used as the basis of compressing and decompressing, because compressing made on the basis of one dictionary is basically impossible to properly decompress with a different dictionary. At step 321 the server compresses the answer it found at step 308, but uses now some general compression method that does not depend on the user-specific dictionaries. At step 322 the server transmits the compressed answer along with a command for the terminal to reset its dictionary. At step 323 the terminal receives the answer, notes the command for resetting its dictionary and decompresses the answer with a predefined general-purpose decompression method that corresponds to the compression method used by the server.

It is generally not possible to know on the basis of available information, what made the dictionaries differ from each other. Therefore the only reliable way of bringing the dictionaries D and D(U) back into synchronism is to reset them both so that only the most recently exchanged information, i.e. that contained in the answer that the server transmits at step 322 (as well as the request that the terminal transmitted at step 307, if that is considered to contain useful ingredients to a dictionary) remain in the dictionaries. The resetting steps at the terminal and the server are shown as steps 324 and 325 respectively, and the following recalculation of the certificates is shown as steps 326 and 327 respectively. As was the case in fig. 3b, the step 328 of displaying the content of the answer to the user might also occur immediately after the decompression step 323.
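The two branches of figs. 3b and 3c can be sketched as a pair of Python functions. The sketch rests on several assumptions: certificates are treated as opaque strings, the archive is a single byte string, `zlib` with a preset dictionary stands in for the user-specific method and plain `zlib` for the general-purpose method, and the reset command is modeled as a boolean flag. None of these choices comes from the description itself.

```python
import zlib


def build_response(answer: bytes, client_cert: str, own_cert: str, archive: bytes):
    """Server side: return (reset_flag, payload) for the answer to be transmitted."""
    if client_cert == own_cert:
        # Fig. 3b: the dictionaries match, so user-specific compression is safe.
        comp = zlib.compressobj(zdict=archive)
        return False, comp.compress(answer) + comp.flush()
    # Fig. 3c: mismatch, so use the general method and command a reset.
    return True, zlib.compress(answer)


def read_response(reset: bool, payload: bytes, archive: bytes) -> bytes:
    """Terminal side: choose the decompression method according to the reset flag."""
    if reset:
        return zlib.decompress(payload)
    decomp = zlib.decompressobj(zdict=archive)
    return decomp.decompress(payload) + decomp.flush()


archive = b"Subject: weekly report\nFrom: anna@example.com\n"
answer = b"Subject: weekly report\nFrom: ben@example.com\n"

reset, payload = build_response(answer, "cert-1", "cert-1", archive)
plain = read_response(reset, payload, archive)

# After a mismatch both ends would restart their archives from `answer`.
reset2, payload2 = build_response(answer, "cert-1", "cert-2", archive)
plain2 = read_response(reset2, payload2, b"some other archive")
```

In the mismatch case the terminal can decompress the answer even though its own archive differs, because the general-purpose method does not depend on the archive at all.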

Fig. 4 illustrates schematically a method to be executed by an apparatus according to an embodiment of the invention. State 401 is an idle state where some dictionary D and the corresponding certificate c(D) exist. Repeated negative answers at steps 402 and 403 only keep the device in question in the idle state, until action is triggered either by a user requesting something that requires the device to transmit a message (positive answer at step 402) or a message arriving from another device (positive answer at step 403).

As an example, the method steps that the terminal performed in the case of figs. 3a and 3b represent a straight line through the middle of fig. 4. After the user has requested some information at step 402, the terminal notes at step 404 that the current dictionary D is at least assumed to be valid, and composes the required message at step 405. Above we assumed that this message is not compressed because of its inherent ultimate simplicity, so the terminal just transmits it at step 406, the certificate c(D) included in it, and moves on to step 407 to wait for the server to respond. When the server responds, the terminal notes at step 408 that the response did not contain a reset command, so it can use the dictionary D for decompressing at step 409. The terminal uses the freshly obtained message to augment its dictionary at step 410 and calculates a new certificate at step 411. It displays the requested information to the user at step 412 and decides at step 413 that no further exchanging of messages is necessary, which means a return to the idle state 401.

The actions of the server in the case of figs. 3a and 3b start from a positive answer at step 403. At step 414 the server checks that the certificate it received in the message matches its own corresponding certificate c(D(U)). It goes trivially through steps 408, 409, 410, 411 and 412 (no reset command received, no decompressing needed, no augmenting of dictionary or recalculating certificate because of the simplicity of the received message, no need to display anything to a user) and finds at step 413 that a response is indeed due. Having thus returned to step 404 to confirm that the current dictionary is to be regarded valid, the server composes and compresses the answer message at step 405 and transmits it at step 406. At step 407 it will receive no further response from the terminal, but the check of step 415 shows that none is even required. Thereafter the server augments its dictionary at step 410, calculates a new certificate at step 411, passes through step 412 and returns from step 413 to the idle state 401.

If the server had found at step 414 that the certificates do not match, it would have passed through step 416 to step 417 where it would have reset its own dictionary. After passing thereafter through steps 411, 412 and 413 to step 404 the server would then notice that the current dictionary is now known to be invalid, which means that the composed answer message would have to be compressed with the general purpose compression method at step 418, followed by transmitting together with a reset command at step 419. The rest of the server's actions would proceed through steps 407, 415, 410 (dictionary not augmented any more, because it was just reset), 411 (the new certificate being calculated for the reset dictionary), 412 and 413 back to the idle state 401.

The method diagram of fig. 4 takes also into account that if the device has transmitted something and finds at step 407 that it has not received a response although it should expect one, there may be a dictionary error that confused the device at the other end enough to impede transmitting a response. In that case the device resets its dictionary at step 420 and tries transmitting again. Similarly the diagram of fig. 4 takes into account a situation where the request made by a user only causes something to be transmitted without a response to be expected at all: in such a case the method proceeds through steps 402, 404, 405, 406, 407, 415, 410, 411, 412 and 413 back to the idle state 401.

The description given so far has referred to a dictionary as being something that associates words or phrases directly with certain codes. Such an assumption should not be construed to pose limitations to the applicability of the invention, because the way in which an archive of previously exchanged information is used as the basis of compressing is not important to the general idea of the invention. For example, the dictionary might arrange previously used words, phrases and expressions into a descending order according to observed frequency or recency of use, so that the code associated with each word, phrase or expression would be simply its sequence number in said order. Or the dictionary could consist of the lines of characters as they appeared in exchanged information in temporal order, so that a code of a word would include a line number and the sequence number on that line of the appropriate word. Or certain words, characters or other entities that are taken from the archive of previously exchanged information could act as the basic set of "generating functions", from which all other words, characters or other entities in a newly composed message can be generated according to a parameterized formula: compressing would then involve finding the suitable values of such parameters while decompressing would involve mapping a set of parameter values back into the original information.
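The first of these alternatives, where the code of a word is its sequence number in a frequency ordering, can be sketched in Python as follows. The word-level granularity, the alphabetical tie-breaking rule and the sample archive are assumptions added for the illustration; some deterministic tie-breaking is needed so that both ends arrive at the same order.

```python
from collections import Counter


def rank_codes(archive):
    """Map each word of the archive to its rank in descending order of frequency."""
    counts = Counter(word for message in archive for word in message.split())
    # Ties are broken alphabetically (assumed rule) so the order is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ordered)}


archive = ["buy NOK at ten", "sell NOK at noon", "NOK at ten again"]
codes = rank_codes(archive)
```

With this ordering the most frequently used words receive the smallest numbers, which a variable-length integer encoding could then represent with the fewest symbols.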

Speaking about dictionaries suggests that the invention would only be applicable to the compression of text or at most some character-based data, but this is strictly speaking not true. The invention can be applied even to graphical data, if such data involves regularity that can be utilized as the basis of composing a "dictionary". As an example we may consider vector graphics where the presentation of graphical objects involves a great number of repetitively occurring regular expressions. Such regular expressions can be used as the building blocks of a "dictionary" quite similarly as words or phrases of textual information.

Departing completely from the concept of dictionary-based compression we may generally consider a compression method M that involves a data structure DD the size of which is reasonable considering its storage and use in communications devices and possible transmission therebetween. The nature of DD is such that it holds some information about what a certain character, word or longer sequence of characters should be or how it should be handled when that character, word or longer sequence of characters occurs subsequently to some previously exchanged information. DD is not necessarily composed of parts of information content proper: for example DD may be a template that describes merely how certain information should be organized rather than disclosing the actual content of such information. A certificate c(DD) can be calculated for a certain data structure DD in the way that was described earlier in the case of the certificate c(D) for a dictionary D. A dictionary D can constitute a subpart of a more extensively defined data structure DD.

If there exists a certificate c(DD) of the data structure DD, the preparatory transmissions between two communications devices that aim at ensuring the proper use of user-specific compression may involve transmitting only the certificate c(DD), only the certificate c(D) or both. The selection of what is actually transmitted depends on whether e.g. D constitutes a subpart of DD, whether the data structure DD can be considered as constant or whether the variable and potentially large number of constant entries in a data structure (like empty cells in a spreadsheet table) makes it attractive to certify and use both the data structure and the entries therein for user-specific compression.

An interesting aspect of the invention involves encryption in addition to or in place of compressing. From the technology of symmetric key cryptography there are known several methods and arrangements where the same pseudorandom string of characters is used both as the encryption key and the decryption key. Some of the known methods involve inherent compressing of data in addition to encrypting. The above-presented general principle of using an archive of previously exchanged information as the basis of compressing a newly composed message is easily generalized to cover symmetric key cryptography so that the communicating devices use the archive of previously exchanged information or a certain predefined part of it as the encrypting and decrypting key. This aspect of the invention does not necessarily require that the encrypting algorithm involves inherent compressing, although compressing is usually an advantageous additional feature. However, in some cases good secrecy takes priority over minimizing the number of transmitted symbols. In some cases it is even more advantageous to expand a message during encrypting, so that the resulting apparently large number of transmission symbols mask the fact that the actual message was only very short, e.g. because it only consisted of a password.
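The idea of using the archive as a symmetric key can be illustrated with a toy sketch in Python. It is not a vetted cipher and should not be used to protect real data: it provides no authentication, and the key derivation, the use of SHAKE-256 as a keystream generator and the per-message `nonce` parameter are all assumptions introduced for this example rather than features taken from the description.

```python
import hashlib


def keystream_xor(data: bytes, archive: bytes, nonce: bytes) -> bytes:
    """Encrypt or decrypt `data` with a keystream derived from the shared archive."""
    # The prefix keeps the key distinct from any certificate (hash) that might be
    # computed over the same archive and transmitted in the clear.
    key = hashlib.sha256(b"enc-key|" + archive).digest()
    # Assumed per-message nonce, so that the keystream is not reused.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


archive = b"all previously exchanged messages, concatenated"
ciphertext = keystream_xor(b"password", archive, b"msg-0001")
recovered = keystream_xor(ciphertext, archive, b"msg-0001")   # XOR is its own inverse
```

Because the same function serves for both directions, the two devices need nothing beyond the shared archive and the nonce in order to encrypt and decrypt.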

The practice of sending a certificate of the dictionary over an assumedly nonsecure transmission link does not compromise the security aspects related to the encrypting embodiments of the invention, because it is typical to known certificates of this kind that although each certificate corresponds almost unambiguously to only one set of initial information, it is practically impossible to derive the initial information by just looking at a particular certificate. So even if a nonauthorized party had access to a message that contained the certificate of a dictionary that is (or a part of which is) to be used as an encrypting key, it would not help him to deduce what the dictionary looked like, and thus would not enable him to decrypt the corresponding encrypted message.

The practice of augmenting the dictionary with each successfully exchanged piece of information is also good from the cryptographic viewpoint, because it means that the key is changed very often, which makes it even more difficult for unauthorized persons to retrieve the plaintext form of the transferred information.

The description of figs. 3a and 3c also involved preparing for a situation where the certificates do not match. In the cryptographic embodiments of the invention this means that either there must be a fall-back encrypting and decrypting scheme that is taken into use if a comparison of the certificates shows a mismatch, or alternatively a detected mismatch between certificates must trigger a process of not only resetting the dictionaries but also filling them with a sufficient amount of fresh, successfully exchanged information so that a dictionary is again available as a source of encrypting and decrypting keys.
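The handling of a mismatch can be sketched in Python as follows. The text presents the fall-back scheme and the reset-and-refill process as alternatives; this sketch combines them purely for illustration, using a fall-back key while the reset archive is being refilled. The class name, the method names and the threshold `MIN_KEY_BYTES` are all hypothetical.

```python
MIN_KEY_BYTES = 16  # hypothetical threshold for "sufficient" archive content

class KeySource:
    def __init__(self, fallback_key: bytes):
        self.archive = b""
        self.fallback_key = fallback_key

    def record_exchange(self, data: bytes) -> None:
        """Append successfully exchanged information to the archive."""
        self.archive += data

    def on_certificate_check(self, certificates_match: bool) -> None:
        """A mismatch resets the archive; it must then be refilled."""
        if not certificates_match:
            self.archive = b""

    def current_key(self) -> bytes:
        """Use the archive when it holds enough data, else the fall-back key."""
        if len(self.archive) >= MIN_KEY_BYTES:
            return self.archive[:MIN_KEY_BYTES]
        return self.fallback_key

source = KeySource(fallback_key=b"F" * 16)
source.record_exchange(b"previously exchanged text")
key_before = source.current_key()
source.on_certificate_check(certificates_match=False)
key_after = source.current_key()
```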

Compressing according to the invention can also be utilized by allowing the actual contents of certain message types to vary dynamically according to how well the compression succeeded. As an example we may consider a service where a user requests a stock exchange information server to send quotes of his favourite stocks in an SMS message. The number of characters in a single SMS message has been traditionally limited to 160, and even if such limitations may be relaxed in some future developments of short messaging, compressing will in any case help to put more information into a given amount of transmission capacity. If similar messages have been exchanged earlier between the user's terminal and the stock exchange information server, dictionaries have been accumulated at both ends, which enables compressing at least the data that represents repeatedly occurring information such as the acronyms of the most frequently quoted stocks. When a subsequent message is being composed at the server end, the server notes that it can compress some parts of the message according to the present invention. The server can then utilize the characters left free below the upper limit (e.g. 160 characters in traditional SMS) to transmit some additional information, like more ticker symbols, in the same message.

Fig. 5 illustrates schematically such an application of the invention. At step 501 a transmitting end device composes a message that contains at least the basic amount of information that was requested. At step 502 it compresses the message. At step 503 it examines whether, after the compression, any transmission capacity is left from a fixed amount of transmission capacity that is to be reserved for transmitting the message. A negative finding at step 503 leads to just transmitting the message at step 504, while a positive finding at step 503 causes additional information to be added into the message at step 505 before transmitting it. The additional information may naturally also require compressing before transmission. Additionally the invention does not require that the basic message is compressed before the examining step 503, if it is possible to deduce from the amount of basic information and the knowledge about the compressing scheme that transmission capacity will be left over; in such a case compressing the whole message takes place after the additional information has been added.
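The flow of steps 501 to 505 can be sketched in Python as follows. This is a simplified illustration, not the method of Fig. 5 itself: the substitution-based `compress` function, the one-character codes and the loop that repeats the check of step 503 for each extra item are assumptions of this example.

```python
SMS_LIMIT = 160  # characters; the traditional SMS limit mentioned in the text

def compress(text: str, dictionary: dict) -> str:
    """Toy dictionary compression: replace archived phrases by short codes."""
    for phrase, code in dictionary.items():
        text = text.replace(phrase, code)
    return text

def build_message(basic: list, extras: list, dictionary: dict,
                  limit: int = SMS_LIMIT) -> str:
    message = ";".join(basic)                    # step 501: basic information
    compressed = compress(message, dictionary)   # step 502: compress
    for item in extras:                          # step 503: capacity left?
        candidate = compress(message + ";" + item, dictionary)
        if len(candidate) > limit:
            break                                # negative finding: stop adding
        message, compressed = message + ";" + item, candidate  # step 505
    return compressed                            # step 504: ready to transmit

dictionary = {"NOKIA": "\x01", "ERICSSON": "\x02"}  # hypothetical shared codes
result = build_message(
    basic=["NOKIA 17.20", "ERICSSON 5.10"],
    extras=["NOKIA 16.90", "ERICSSON 5.00"],
    dictionary=dictionary,
    limit=25,
)
```

In this example the uncompressed basic message already occupies all 25 characters of the (artificially small) limit, whereas after compression it needs 14, which leaves room for one additional quote.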

Claims

1. A method for compressing digital data before transmitting it from a communications device of a first party to a communications device of a second party, characterized in that it comprises the steps of:

- maintaining (302, 303) a collection of digital data (105, 115) that represents an archive of information (104, 114, 201) that has been exchanged between the first party and the second party

- composing (308) new information (202) to be transmitted from the communications device of the first party to the communications device of the second party

- representing the composed new information as digital data and

- compressing (311) the digital data that represents the composed new information (202) by utilizing correspondences between the new information (202) and information from said archive (201), so that at least some of the resulting compressed digital data (203) refers to information in said archive (201).

2. A method according to claim 1, characterized in that

- the step of maintaining (302, 303) a collection of digital data (105, 115) that represents an archive of information (104, 114, 201) comprises maintaining a lookup table (201) that associates pieces of information that have been exchanged between the first party and the second party with codes that are shorter than character-by- character digital representations of said pieces of information, and

- the step of compressing (311) the digital data that represents the composed new information (202) comprises finding at least one piece of the new information (202) that is the same as a certain piece of information in the archive (201), and selecting the code associated with such piece of information in the archive (201) to constitute a part of the compressed digital data (203).

3. A method according to claim 1, characterized in that it comprises the steps of:

a) formulating (304, 305) a first indication about a currently valid state of said collection of digital data that represents, at the communications device of the first party, an archive of information that has been exchanged between the first party and the second party

b) receiving (307) from the communications device of a second party a second indication about a currently valid state of a collection of digital data that represents, at the communications device of the second party, an archive of information that has been exchanged between the first party and the second party, and

c) comparing (309) the first indication with the second indication in order to find out, whether they indicate the currently valid states of the collections of digital data at the communications devices of the first and second parties to be the same;

so that the step of compressing (311) the digital data that represents the composed new information (202) by utilizing correspondences between the new information (202) and information from said archive (201) is only performed after a positive finding at step c).

4. A method according to claim 3, characterized in that it comprises the steps of:

- after having transmitted the compressed digital data to the communications device of the second party, augmenting (315) at the communications device of the first party the archive of information with said new information and

- thereafter reformulating (317) said first indication so that it comes to indicate the state of said collection of digital data after augmenting the archive of information with said new information.

5. A method according to claim 1, characterized in that it comprises the steps of:

- observing a certain maximum amount of transmission capacity that is available for transmitting compressed digital data from the communications device of the first party to the communications device of the second party

- compressing (502) digital data that represents a certain basic set of information (501) to be transmitted to the communications device of the second party, thus producing a basic set of compressed digital data,

- comparing (503) the amount of data in the basic set of compressed digital data with said maximum amount of transmission capacity in order to find out, whether said maximum amount of transmission capacity could accommodate more than said basic set of compressed digital data and

- if said maximum amount of transmission capacity is found capable of accommodating more than said basic set of compressed digital data, adding (505) more digital data to the basic set of compressed digital data before transmitting it to the communications device of the second party.

6. A method according to claim 1, characterized in that:

- the step of maintaining (302, 303) a collection of digital data (105, 115) involves maintaining a data structure that represents organization of information (104, 114, 201) that has been exchanged between the first party and the second party and

- the step of compressing (311) the digital data that represents the composed new information (202) involves arranging the new information (202) according to the organization represented by said data structure.

7. A method for decompressing digital data after receiving it from a communications device of a first party to a communications device of a second party, characterized in that it comprises the steps of:

- receiving compressed digital data (312) from the communications device of the first party to the communications device of the second party

- finding a part in the compressed digital data that refers to a piece of information in said archive and

- decompressing (313) the received compressed digital data by utilizing the found reference to information in said archive (104, 114, 201), so that at least some of the resulting decompressed digital data represents information that is same as information in said archive (104, 114, 201).

8. A method according to claim 7, characterized in that

- the step of maintaining (302, 303) a collection of digital data (105, 115) that represents an archive of information (104, 114, 201) comprises maintaining a lookup table (201) that associates pieces of information that have been exchanged between the first party and the second party with codes that are shorter than character-by-character digital representations of said pieces of information, and

- the step of decompressing (313) the received compressed digital data comprises finding at least one code in the received compressed digital data that is associated with a certain piece of information in the archive, and selecting digital data that represents said piece of information in the archive to constitute a part of the decompressed digital data.

9. A method according to claim 8, characterized in that it comprises the steps of:

- after having decompressed the received compressed digital data, augmenting (314) at the communications device of the second party the archive of information with new information represented by the decompressed digital data and

- thereafter reformulating (316) an indication so that it comes to indicate the state of said collection of digital data after augmenting the archive of information with said new information.

10. A method according to claim 7, characterized in that:

- the step of decompressing (313) the received compressed digital data involves utilizing found references to information in said data structure, so that at least some of the resulting decompressed digital data represents information that is organized according to the organization represented by said data structure.

11. A method for communicating digital data in compressed form between a communications device of a first party and a communications device of a second party, characterized in that it comprises the steps of:

a) maintaining (302, 303) at both said communications devices collections of digital data that represent an archive of information that has been exchanged (301) between the first party and the second party

b) composing (308) at the communications device of the first party new information to be transmitted to the communications device of the second party

c) representing the composed new information as digital data

d) compressing (311) the digital data that represents the composed new information by utilizing correspondences between the new information and information from said archive, so that at least some of the resulting compressed digital data refers to information in said archive,

e) transmitting (312) the compressed digital data from the communications device of the first party to the communications device of the second party,

f) finding at the communications device of the second party a part in the compressed digital data that refers to a piece of information in said archive and

g) decompressing (313) the received compressed digital data by utilizing the found reference to information in said archive, so that at least some of the resulting decompressed digital data represents information that is same as information in said archive.

12. A method according to claim 11, characterized in that it comprises, before step b), the steps of:

- formulating (305) at the communications device of the first party a first indication about a currently valid state of said collection of digital data that represents an archive of information that has been exchanged between the first party and the second party,

- formulating (304) at the communications device of the second party a second indication about a currently valid state of said collection of digital data that represents an archive of information that has been exchanged between the first party and the second party, and

- transmitting (307) said second indication from the communications device of the second party to the communications device of the first party;

and before step d) the step of comparing (309) the first indication with the second indication in order to find out, whether they indicate the currently valid states of the collections of digital data at the communications devices of the first and second parties to be the same;

so that step d) is only performed after the first and second indications are found to indicate the currently valid states of the collections of digital data at the communications devices of the first and second parties to be the same.

13. A method for communicating digital data in compressed form between a communications device of a first party and a communications device of a second party, characterized in that it comprises the steps of:

a) maintaining (302, 303) at both said communications devices collections of digital data that represent the organization of information that has been exchanged (301) between the first party and the second party

c) representing the composed new information as digital data

d) compressing (311) the digital data that represents the composed new information by utilizing correspondences between the organization of the new information and the organization of data represented by said archive,

f) finding at the communications device of the second party a part in the compressed digital data that refers to the organization of data represented by said archive and

g) decompressing (313) the received compressed digital data by utilizing the found reference to the organization of data represented by said archive, so that at least some of the resulting decompressed digital data is organized according to the organization of data represented by information in said archive.

14. A method for encrypting digital data before transmitting it from a communications device of a first party to a communications device of a second party, characterized in that it comprises the steps of:

- maintaining (302, 303) a collection of digital data that represents an archive of information that has been exchanged between the first party and the second party

- composing (308) new information to be transmitted from the communications device of the first party to the communications device of the second party

- representing the composed new information as digital data

- taking a certain part of said collection of digital data to constitute a key and

- encrypting (311) the digital data that represents the composed new information by utilizing a symmetric cryptographic method in which a certain key acts both as an encryption key and decryption key, and using said key taken as a certain part of said collection of digital data as the encryption key.

15. A method for decrypting digital data after receiving it from a communications device of a first party to a communications device of a second party, characterized in that it comprises the steps of:

- receiving (312) encrypted digital data from the communications device of the first party to the communications device of the second party

- taking a certain part of said collection of digital data to constitute a key and

- decrypting (313) the received encrypted digital data by utilizing a symmetric cryptographic method in which a certain key acts both as an encryption key and decryption key, and using said key taken as a certain part of said collection of digital data as the decryption key.

16. A communications device for the use of a first party for compressing digital data and transmitting it to a communications device of a second party, characterized in that it comprises:

- means for maintaining a collection of digital data (105, 115) that represents an archive of information that has been exchanged between the first party and the second party

- means (101, 111) for composing new information to be transmitted from the communications device of the first party to the communications device of the second party

- means for representing the composed new information as digital data and

- means (102, 112) for compressing the digital data that represents the composed new information by utilizing correspondences between the new information and information from said archive, so that at least some of the resulting compressed digital data refers to information in said archive.

17. A communications device for the use of a second party for receiving digital data from a communications device of a first party and decompressing said received digital data, characterized in that it comprises:

- means for maintaining a collection of digital data (105, 115) that represents an archive of information that has been exchanged between the first party and the second party

- means for receiving compressed digital data from the communications device of the first party to the communications device of the second party

- means for finding a part in the compressed digital data that refers to a piece of information in said archive and

- means (102, 112) for decompressing the received compressed digital data by utilizing the found reference to information in said archive, so that at least some of the resulting decompressed digital data represents information that is same as information in said archive.