WO1986002650A1

WO1986002650A1 - Novel polypeptides having growth factor activity and nucleic acid sequences encoding the polypeptides

Info

Publication number: WO1986002650A1
Application number: PCT/US1985/002103
Authority: WO
Inventors: Joseph P. Brown; Daniel R. Twardzik; Hans Marquardt; George J. Todaro
Original assignee: Oncogen
Priority date: 1984-10-30
Filing date: 1985-10-23
Publication date: 1986-05-09
Also published as: AU6655190A; ES8802185A1; IL76866A0; ES8702925A1; NO862613D0; ES557708A0; ES8801379A1; ES554630A0; ES548316A0; ES8801378A1; JPS62501071A; ES8802074A1; NO862613L; PT81408A; PT81408B; FI862734A0; AU633377B2; ES8802061A1; ES8707741A1; ES557145A0

Abstract

Polypeptides having growth factor activity, and precursors thereof, are synthesized using conventional or recombinant techniques. The framework structure of these peptides is shown in Figures 1 and 2. The structures, particularly those involving six cysteines, provide loops analogous to those of mammalian growth factors. The peptides are disclosed as being useful as mitogens, as additives in nutrient media, in assays for epidermal growth factor (EGF) receptors, and in wound healing.

Description

NOVEL POLYPEPTIDES HAVING GROWTH FACTOR ACTIVITY AND NUCLEIC ACID SEQUENCES ENCODING THE POLYPEPTIDES

BACKGROUND OF THE INVENTION

Field of the Invention

A significant number of polypeptides secreted by mammalian cells are found to have growth factor activity. These compounds have been found to be substantially conserved over a wide range of mammals. Much of the interest in these compounds is stimulated by their association with oncogenicity. There is also interest in how the production of the growth factors is regulated and how they in turn regulate cellular activity.

It has also been noted that infection of mammalian cells by viruses results in proliferation of the cells. For the purpose of this invention, of particular interest are members of the poxvirus family, such as variola and vaccinia, and such viruses associated with particular diseases as Shope fibroma virus, Yaba tumor virus, and Molluscum contagiosum virus (MCV).

It would be of interest to determine whether viruses that cause cellular proliferation upon infection produce polypeptides involved with growth factors, acting either as the growth factor or as the growth factor receptor. These compounds could then be used in studies of viral action, in assays for the presence of the virus, in nutrient media, as mitogens, and in the development of therapeutic agents for treating viral infection. It is also of interest to develop compounds which may be agonists or antagonists of growth factors for use in vitro in growing cell cultures, in investigating mitotic processes, and in therapy.

Description of the Prior Art

Venkatesan et al., J. Virol. (1982) 44:637-646, describe the DNA sequencing of structural genes encoding vaccinia virus proteins. Cooper et al., ibid (1981) 37:284-294, report the translation of mRNAs encoded within the inverted terminal repetition of the vaccinia virus genome. Proliferative diseases for members of the poxvirus family have been reported for Shope fibroma virus (Shope, J. Exp. Med. (1932) 56:793-822), Yaba tumor virus (Niven et al., J. Path. Bacteriol. (1961) 81:1-14) and Molluscum contagiosum virus (MCV) (Postlethwaite, Arch. Environ. Health (1970) 21:432-452). Descriptions of epidermal growth factor (EGF) may be found in Scott et al., Science (1983) 223:236-240 and Gray et al., Nature (1983) 303:722-725. The presence of three disulfide bridges in EGF and transforming growth factor (TGF) is reported by Savage et al., J. Biol. Chem. (1973) 248:7669-7692. See also Doolittle et al., Nature (1984) 307:558-560. The receptor binding region of the EGF molecule has been suggested as lying in the loop between the third and fourth cysteine residues (Komoriya et al., Proc. Natl. Acad. Sci. USA (1983) 1:1351-1355).

New descriptions of vaccinia virus growth factor (VGF) may be found in Brown et al., Nature (1985) 313:491-492; Reisner, Nature (1985) 313:801-803 and Blomquist et al., Proc. Natl. Acad. Sci. USA (1984) 1:7263-7367. The disclosure of these references is incorporated herein by reference. Expression of foreign peptides employing a baculovirus vector in insect cells is described by Maeda et al., Nature (1985) 315:592-594 and Carbonell et al., J. of Virology (1985) 56:153-160. These disclosures are also incorporated herein by reference.

SUMMARY OF THE INVENTION

Novel polypeptide compositions are provided finding analogy in fragments of viral proteins, which compositions act as mitogens and can be used in nutrient media, as reagents for the detection of growth factor receptors or the presence of growth factors, and as competitors for transforming growth factor and epidermal growth factor. The compositions find therapeutic use, for example, to promote epithelialization and healing of burns and wounds. The novel compositions may be synthesized. The subject peptides may be formed as oligopeptides or fused proteins employing recombinant techniques in a wide variety of hosts.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is an amino acid sequence comparison of the vaccinia virus protein with known growth factors, where the initial amino acid of the VGF is aspartic acid (D) at position 20; and

Figure 2 is a fragment of the vaccinia virus protein, beginning at amino acid 44 and terminating at amino acid 90, with residues which are identical in the polypeptide fragments, mEGF, hEGF and rTGF being cross-hatched.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Novel compositions which can act as agonists or antagonists of epidermal growth factor (EGF) or transforming growth factor (TGF) are provided. They have a wide variety of applications in cell culture, diagnostics, in vivo therapy, and in combination with other peptides in the formation of hybrid polypeptides as immunogens for production of antibodies. Polynucleotide sequences may be isolated or prepared which may be introduced into an expression vector for expression of the subject compositions.

The compositions find analogy in fragments of viral proteins, particularly poxvirus proteins. The poxvirus proteins are found to have regions which have structures analogous to surface membrane proteins. The structural gene coding for the polypeptide has regions analogous to regions functioning as a leader sequence and processing signal and as a transmembrane integrator sequence, with the growth factor fragment sequence intermediate the leader and integrator sequences.

The subject polypeptides are characterized by being capable of having at least one loop or circle, usually three loops or circles, as a result of cysteines which form bridges, where the physiologically active portion of the molecule providing a growth factor related activity will be from about 12 to 65 amino acids, usually 15 to 60 amino acids. Of the three loops, two are of from 12 to 15 amino acids (14-17 annular amino acids), exclusive of the cysteine bridge; more particularly, one is of 12 or 13 amino acids (14 or 15 annular amino acids) and the other of 15 amino acids (17 annular amino acids), where the N-terminal proximal loop will usually be of from 12 to 13 amino acids, more usually of 12 amino acids, and the middle loop will be of 15 amino acids. The third loop or C-proximal loop will be of 8 amino acids (10 annular amino acids) and, including flanking amino acids, will have the following formula:

PP³-(aa²⁵-C-aa²⁷)ₘ-C-aa²⁹-aa³⁰-G-Y

PP⁴-(aa⁴⁰-aa³⁹-aa³⁸)ₙ-C-R-aa³⁵-G-Al¹

wherein:

the one letter symbols for amino acids have their conventional meaning, wherein C is cysteine, G is glycine, Y is tyrosine, and R is arginine;

Al¹ intends a neutral amino acid, which will be described in more detail below, particularly wherein Al¹ is an aliphatic amino acid of from 2 to 6, more usually of from 3 to 6 carbon atoms having from 0 to 1 hydroxyl group, more particularly serine, threonine, valine, leucine or isoleucine, wherein the hydroxy substituted amino acids have from 3 to 4 carbon atoms, and the unsubstituted or alkyl substituted glycine amino acids have from 5 to 6 carbon atoms;

aa²⁵ is a neutral amino acid which may be aliphatic, particularly of from 3 to 4 carbon atoms, or aromatic, particularly of 9 carbon atoms, and having from 0 to 1 oxy substituents, e.g., alanine, serine, threonine or tyrosine;

aa²⁷ may be neutral or basic, being particularly of from 4 to 6 carbon atoms, and when neutral having from 0 to 1 carboxamide group, e.g., arginine, valine, leucine, isoleucine, asparagine or glutamine;

aa²⁹ is a neutral amino acid, being aliphatic or aromatic, wherein aromatic is exemplified by histidine and aliphatic is of from 2 to 6, more usually of from 3 to 6 carbon atoms, having from 0 to 1 hydroxyl group, e.g., serine, threonine, leucine, valine or isoleucine;

aa³⁰ is a neutral amino acid, being aliphatic or aromatic, wherein aromatic is exemplified by histidine and aliphatic is of from 3 to 6 carbon atoms, having a chain of other than hydrogen or from 5 to 6 atoms, and having from 0 to 1 hydroxyl group, particularly serine, isoleucine and valine, and wherein aa²⁹ and aa³⁰ will usually be different, particularly one being histidine and the other being serine;

aa³⁵ is a neutral or acidic amino acid of from 5 to 6 carbon atoms, where neutral is exemplified by valine, leucine and isoleucine and where acidic is exemplified by aspartic and glutamic acids;

aa³⁸ is an aliphatic neutral substituted amino acid or acidic amino acid, wherein the substituent is carboxamide and is of from 4 to 5 carbon atoms, e.g., asparagine, glutamine, aspartic acid or glutamic acid;

aa³⁹ is an aromatic amino acid or a neutral aliphatic amino acid of from 3 to 4 carbon atoms having an hydroxyl substituent, preferably aromatic, e.g., histidine, tyrosine, serine or threonine;

aa⁴⁰ is a neutral unsubstituted aliphatic or basic amino acid, wherein the neutral amino acids are of from 3 to 5 carbon atoms, e.g., alanine, valine, lysine or arginine;

m and n are 0 or 1;

PP³ and PP⁴ are either hydrogens, indicating the termination of the polypeptide, or may be polypeptide chains of not greater than a total of 1000 amino acids, usually of not greater than a total of about 500 amino acids, preferably where at least 90% of the amino acids, more preferably at least about 95% of the amino acids present are in one of the two polypeptide chains; in some instances, the chain may be of only one amino acid and not more than 100 amino acids, frequently of not more than about 50 amino acids, depending upon the use of the polypeptide and the role of the extended chain; the polypeptide chains may be related to the naturally occurring polypeptide chains associated with naturally occurring growth factors and pox virus proteins or may be other than the naturally occurring chains or fragments thereof associated with the polypeptide chain specifically set forth in the formula, usually unrelated.

(By unsubstituted is intended no other heterosubstituents than the carboxy and amino group present in glycine. All the amino acids are the natural L-stereomer.) The definitions of the amino acids are set forth below.

Neutral (Ne)
    aliphatic (Al)
        unsubstituted        G, A
        substituted
            oxy              S, T
            thio             C, M
            amido            N, Q
    aromatic (Ar)
        unsubstituted        F
        substituted          Y
        heterocyclic         H, W

Charged
    basic (Ba)               K, R
    acidic (Ac)              D, E

The abbreviations in the parentheses refer to the particular amino acid groups. The amino acids are the naturally occurring L-amino acids. The loop between the cysteines in the above formula will have from 0 to 1 acidic amino acid, preferably no acidic amino acids, and the amino acids immediately flanking the cysteines outside the loop will include at least one charged amino acid, preferably a basic amino acid, more preferably having in addition an amido substituted aliphatic amino acid. Of the amino acids in the loop, from 4 to 6, usually 5, will be neutral aliphatic amino acids and not more than 2, usually 2, will be aromatic amino acids.
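The third-loop formula and the composition limits stated above can be checked mechanically. The following Python sketch is illustrative only: the residue sets are taken from the "e.g." lists in the definitions, the class sets follow the grouping above (with valine, leucine and isoleucine added to the unsubstituted aliphatic group on the strength of the examples in the text, which is an assumption), and the test string `CSHGYTGIRC` is constructed from the stated preferences rather than quoted from the figures. The helper name `check_third_loop` is hypothetical.

```python
import re

# Residue classes following the grouping in the text (one-letter codes).
# V, L and I are placed with the unsubstituted aliphatics as an assumption,
# based on the examples given in the definitions.
ALIPHATIC = set("GAVLI") | set("ST") | set("CM") | set("NQ")
AROMATIC = set("FYHW")
ACIDIC = set("DE")

# Core of the third loop: C-aa29-aa30-G-Y-Al1-G-aa35-R-C, using the
# exemplified residues for each variable position.
THIRD_LOOP = re.compile(
    r"C"
    r"[STLVIH]"   # aa29
    r"[SIVH]"     # aa30
    r"GY"
    r"[STVLI]"    # Al1
    r"G"
    r"[VLIDE]"    # aa35
    r"RC"
)

def check_third_loop(segment: str) -> dict:
    """Report whether a segment fits the third-loop formula and the
    composition limits stated for the residues between the cysteines."""
    interior = segment[1:-1]  # residues between the two bridged cysteines
    n_aliphatic = sum(r in ALIPHATIC for r in interior)
    n_aromatic = sum(r in AROMATIC for r in interior)
    n_acidic = sum(r in ACIDIC for r in interior)
    return {
        "matches_formula": THIRD_LOOP.fullmatch(segment) is not None,
        "loop_size": len(interior),
        "aliphatic_ok": 4 <= n_aliphatic <= 6,
        "aromatic_ok": n_aromatic <= 2,
        "acidic_ok": n_acidic <= 1,
    }

# Hypothetical test segment built from the stated preferences
# (aa29 = S, aa30 = H, Al1 = T, aa35 = I); not quoted from the source.
result = check_third_loop("CSHGYTGIRC")
```

Restricting each position to its exemplified residues makes the pattern narrower than the prose definitions, which describe classes rather than closed lists; a broader check would substitute the class sets for the bracketed alternatives.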

Of particular interest are compounds where the polypeptide is of fewer than 130 amino acids, more particularly of fewer than 50 amino acids, and at least 40 amino acids, preferably at least about 42 amino acids. Of particular interest are polypeptides including amino acids 44 to 86 of the vaccinia virus protein, as indicated in Figure 1. Preferably aa²⁹ is a substituted or unsubstituted aliphatic amino acid of from 3 to 6 carbon atoms having from 0 to 1 hydroxyl group, particularly serine; aa³⁰ is preferably histidine, serine or an unsubstituted aliphatic amino acid of from 5 to 6 carbon atoms, particularly where one of aa²⁹ and aa³⁰ is histidine and the other is serine; Al¹ is preferably a substituted aliphatic amino acid having one hydroxyl group and of from 3 to 4 carbon atoms; aa³⁵ is preferably an unsubstituted aliphatic amino acid of from 5 to 6 carbon atoms; aa³⁸ is preferably a substituted aliphatic amino acid of from 4 to 5 carbon atoms, wherein the substitution is carboxamide; aa³⁹ is preferably aromatic, more preferably histidine.

Of particular interest is the presence of two loops, with the C-proximal loop joined to the second loop, where the amino acids of the second loop may be widely varied. The second loop is preferably of from about 14 to 16 amino acids exclusive of the cysteine bridge, more preferably of about 15 amino acids. Of the amino acids, from 6 to 9, preferably 7 to 9, more preferably about 8 are aliphatic amino acids either substituted or unsubstituted, preferably not more than about 3 of the amino acids being substituted, more preferably from 1 to 2 amino acids being substituted; there will be from 2 to 4 aromatic amino acids, more particularly 3 aromatic amino acids, desirably histidine and tyrosine; there will be from 2 to 4 acidic amino acids, preferably 3 acidic amino acids, more particularly aspartic acid; and there will be from 0 to 2, more usually 1 basic amino acid, particularly arginine. Desirably, the cysteine forming the subject loop closest to the cysteine of the other loop, will be separated by from 0 to 2, more usually from 0 to 1 amino acids, particularly arginine.

Of particular interest for some applications for the subject compounds are compounds of the following formula:

PP¹-aa¹-C-aa³-aa⁴-aa⁵-aa⁶-aa⁷*-aa⁸*-Y-C-aa¹¹*-aa¹²-(aa¹²ᵃ)ₚ-G-aa¹⁴-C-aa¹⁶-Ar¹-Al²-aa¹⁹-aa²⁰-aa²¹-Ac-aa²³-aa²⁴-(aa²⁵-C-aa²⁷-C-aa²⁹-aa³⁰-G-Y-Al¹-G-aa³⁵-R-C-aa³⁸*-aa³⁹-aa⁴⁰)-aa⁴¹-L-aa⁴³-aa⁴⁴-PP²

wherein:

the symbols between the parentheses (except for aa¹²ᵃ) in the formula have already been described;

PP¹ and PP² are the same or different and may be hydrogens, indicating the terminal portion of the indicated polypeptide, or may be polypeptides having a total of up to about 1000 amino acids, more usually of up to about 500 amino acids, and may have a total of as few as 1 amino acid, or may individually or separately be polypeptides of from 1 to 100 amino acids, more usually from about 1 to 75 amino acids, more particularly from about 5 to 50 amino acids; these polypeptides will have specific applications in modifying the specifically described sequence for a predetermined purpose; PP¹ and PP² may be the same or different from the natural pox virus polypeptide, usually different;

the * indicates the conservation of the amino acid of the vaccinia virus peptide and mouse and human epidermal growth factor, where a space is introduced between amino acids 55 and 56 of the vaccinia virus peptide in order to provide for the alignment of cysteines defining the loops, as well as other amino acids;

aa¹ may be any amino acid, more particularly an aliphatic amino acid, basic amino acid or acidic amino acid, preferably an unsubstituted aliphatic amino acid of from 2 to 6 carbon atoms, more particularly glycine and leucine;

aa³ is a neutral amino acid, particularly of from 2 to 4 carbon atoms, more particularly glycine and proline;

aa⁴ may be any amino acid, particularly of from 2 to 6 carbon atoms, which may be neutral, acidic or basic, more particularly of from 3 to 6 carbon atoms, including proline, aspartic acid, serine and arginine;

aa⁵ is an acidic or a neutral substituted aliphatic amino acid, particularly glutamic acid or an hydroxy substituted amino acid of from 3 to 4 carbon atoms, more particularly serine;

aa⁶ is an unsubstituted aliphatic amino acid of from 2 to 6, usually 2 to 3, carbon atoms, or an aromatic amino acid, particularly glycine, histidine or tyrosine;

aa⁷ is an acidic or neutral substituted aliphatic amino acid, particularly of from 3 to 5, more particularly 4 carbon atoms, such as aspartic acid and threonine;

aa⁸ is a neutral unsubstituted aliphatic amino acid of from 2 to 5, usually 2 to 3, carbon atoms, particularly substituted with carboxamide (4 to 5 carbon atoms), e.g., glutamine, particularly glycine;

aa¹¹ is a neutral amino acid, either aliphatic or aromatic, more particularly aliphatic, particularly of from 5 to 6 carbon atoms, more particularly leucine or phenylalanine;

aa¹² is an aromatic amino acid or a carboxamide substituted aliphatic amino acid of from 4 to 5 carbon atoms, particularly histidine or asparagine;

aa¹²ᵃ is a neutral aliphatic amino acid or acidic amino acid, more particularly glycine or aspartic acid;

p is 0 or 1;

aa¹⁴ is an acidic amino acid or neutral aliphatic amino acid, substituted or unsubstituted, of from 4 to 5 carbon atoms, having from 0 to 1 hydroxyl group, particularly aspartic acid, threonine or valine;

aa¹⁶ is a neutral or basic aliphatic amino acid, either substituted or unsubstituted, of from 4 to 6 carbon atoms, particularly isoleucine, arginine or methionine;

Ar¹ intends an aromatic amino acid, which may have a carbocyclic or heterocyclic ring, and includes histidine, phenylalanine or tyrosine;

Al² is a neutral aliphatic amino acid of from 2 to 6, preferably of from 3 to 6 carbon atoms, particularly alanine, leucine and isoleucine;

aa¹⁹ may be any amino acid, particularly aliphatic acidic or basic amino acids, more particularly of from 4 to 6 carbon atoms, more preferably of from 5 to 6 carbon atoms, particularly arginine, valine and glutamic acid;

aa²⁰ is an acidic amino acid or neutral aliphatic amino acid, either substituted or unsubstituted, of from 3 to 6, usually 3 to 5, carbon atoms, having from 0 to 1 hydroxyl group, and includes aspartic acid, glutamic acid, glutamine, serine or alanine;

aa²¹ is a neutral unsubstituted aliphatic amino acid or acidic amino acid of from 5 to 6 carbon atoms, particularly a neutral aliphatic amino acid, which includes isoleucine, leucine or glutamic acid;

Ac is an acidic amino acid which is aspartic acid or glutamic acid;

aa²³ is a neutral aliphatic or basic amino acid, of from 2 to 6 carbon atoms, when neutral, of from 2 to 4 carbon atoms having from 0 to 1 hydroxyl substituent, e.g., glycine and serine; and when basic, being lysine or arginine, particularly lysine;

aa²⁴ is an aliphatic amino acid, proline or an aromatic amino acid, particularly a thiosubstituted aliphatic amino acid, more particularly methionine, or tyrosine;

aa⁴¹ is a neutral unsubstituted aliphatic or acidic amino acid of from 4 to 6 carbon atoms, particularly valine or aspartic acid;

aa⁴³ is a neutral aliphatic or basic amino acid, particularly a neutral aliphatic amino acid which may be substituted or unsubstituted of from 4 to 6 carbon atoms, either unsubstituted or carboxamide substituted, and includes valine, leucine, arginine and lysine; and

aa⁴⁴ may be any amino acid, particularly other than a basic amino acid, and may be acidic, neutral or aromatic, when other than aromatic, being of from 3 to 4 carbon atoms, particularly aspartic acid, alanine, or tryptophan.
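As a cross-check on the general formula, its fixed and variable positions can be laid out in order and the cysteine positions counted. The Python sketch below is a minimal illustration: the token names are simply labels for the symbols in the formula, the terminal chains PP¹ and PP² are left out, and the optional aa¹²ᵃ is omitted so that list positions follow the formula's own indices. Under those assumptions the cysteines fall at positions 2, 10, 15, 26, 28 and 37, and 8 amino acids lie between the last two.

```python
# Symbols of the general formula in order, omitting PP1/PP2 and the
# optional aa12a so that list position equals the formula's index.
FRAMEWORK = (
    "aa1 C aa3 aa4 aa5 aa6 aa7 aa8 Y C "
    "aa11 aa12 G aa14 C aa16 Ar1 Al2 aa19 aa20 "
    "aa21 Ac aa23 aa24 aa25 C aa27 C aa29 aa30 "
    "G Y Al1 G aa35 R C aa38 aa39 aa40 "
    "aa41 L aa43 aa44"
).split()

def cysteine_positions(tokens):
    """1-based positions of the cysteine symbols in the token list."""
    return [i for i, tok in enumerate(tokens, start=1) if tok == "C"]

def residues_between(first, second):
    """Number of residues lying strictly between two positions."""
    return second - first - 1

cys = cysteine_positions(FRAMEWORK)
third_loop_size = residues_between(cys[4], cys[5])
```

Keeping the framework as a list of labelled positions, rather than a concrete sequence, makes it easy to confirm that every numbered symbol sits at its own index before any residues are chosen.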

Of particular interest is when PP¹ has the following sequence:

PP¹'-aa⁻²⁷-aa⁻²⁶-aa⁻²⁵-aa⁻²⁴-aa⁻²³-aa⁻²²-aa⁻²¹-aa⁻²⁰-aa⁻¹⁹-aa⁻¹⁸-aa⁻¹⁷-aa⁻¹⁶-aa⁻¹⁵-aa⁻¹⁴-aa⁻¹³-aa⁻¹²-aa⁻¹¹-aa⁻¹⁰-aa⁻⁹-aa⁻⁸-aa⁻⁷-aa⁻⁶-aa⁻⁵-aa⁻⁴-aa⁻³-aa⁻²-aa⁻¹

wherein:

PP¹' in combination with the subsequent amino acid symbols in the above sequence is the equivalent of PP¹. The above sequence comes within the definition of PP¹; PP¹' may be hydrogen or an amino acid sequence. One or more of the amino acids symbolized by aa⁻ˣ (x is any number) may be a bond, so as to serve to reduce the number of amino acids in the N-terminal chain. Therefore, all or a portion of the amino acid sequence indicated may be present. When a portion of the sequence is present, preferably the sequence will involve contiguous amino acids, that is, amino acids in their numerical order without deletions. Usually, there will be at least one amino acid, more usually at least three, more usually at least five, where the remaining upstream amino acids may be absent, so that PP¹' may be joined to aa⁻¹, aa⁻⁵, or the like;

aa⁻¹ may be any amino acid, particularly aliphatic, of from about 3 to 6 carbon atoms, preferably neutral or basic, more particularly neutral or basic of from about 5 to 6 carbon atoms;

aa⁻² may be any amino acid, either aliphatic or aromatic, particularly aliphatic, more particularly of from about 4 to 6 carbon atoms;

aa⁻³ may be aliphatic or aromatic, particularly aliphatic, either polar or non-polar, of from 2 to 6, more usually of from 2 to 5 carbon atoms, generally having from 0 to 1 hydroxyl substituent;

aa⁻⁴ may be any aliphatic amino acid, particularly neutral, generally of from 3 to 6 carbon atoms, usually of from 3 to 5 carbon atoms, and may be polar or non-polar, particularly proline and asparagine;

aa⁻⁵ may be any aliphatic amino acid, particularly a neutral or basic amino acid, more particularly of from about 4 to 6 carbon atoms, preferably an aliphatic neutral amino acid;

aa⁻⁶ is an aliphatic amino acid of from about 3 to 6, more usually of from about 4 to 6 carbon atoms, particularly acidic or aliphatic, more particularly acidic;

aa⁻⁷, ⁻⁸, ⁻¹¹, ⁻¹⁵, ⁻¹⁶, ⁻¹⁷, ⁻²³ and ⁻²⁷ are aliphatic amino acids of from 3 to 5 carbon atoms, either polar or non-polar, particularly polar having from 0 to 1 hydroxyl substituent;

aa⁻⁹, ⁻²⁰, ⁻²³ and ⁻²⁵ are non-polar aliphatic amino acids of from 2 to 6, more usually of from 2 to 3 carbon atoms;

aa⁻¹⁰ and ⁻²¹ are polar aliphatic amino acids of from 4 to 5 carbon atoms, particularly having a carboxamido substituent;

aa⁻¹², ⁻¹⁴ and ⁻¹⁹ are aliphatic amino acids of from 2 to 6 carbon atoms, more particularly of from 4 to 6 carbon atoms;

aa [indices illegible] are aliphatic acidic amino acids; and

aa⁻²⁶ is an aromatic amino acid, particularly phenylalanine.

Of particular interest is an N-terminal fragment of from aa⁻¹ to aa⁻²⁴ and aa⁻¹ to aa⁻⁷, more particularly aa [index illegible] to aa [index illegible]. PP may conveniently be an unrelated amino acid sequence which may serve as a fusion protein, particularly to provide VGF as an immunogen for production of antibodies to VGF and its congeners, such as EGF and TGF-α.

The primary aspects of the subject compositions are the sequence in the parentheses, particularly between the cysteine at position 28 and the cysteine at position 37 and the loops generated by the cysteines at positions 2 and 10 and the cysteines at positions 15 and 26. Thus, desirably, the subject compositions have the loop created by the cysteines at positions 28 and 37 in conjunction with the loop created by the cysteines at positions 15 and 26. Varying combinations of polypeptide sequences may be prepared by employing as one fragment the C-terminal portion of one sequence with the N-terminal portion of another sequence as a second fragment, whereby polypeptides of from about 40 to 65 amino acids, usually about 50 to 60 amino acids are provided.

Desirably, the juncture will be made at some point between aa²⁰ and aa²⁷, particularly aa²⁰ to aa²⁵. Preferably the fragments are joined at a site not more than 5 amino acids from the sequence C-X-C, where X is any amino acid. In each case, the framework structure of the cysteines is retained with the cystine bridges defining loops of the sizes described previously.

Thus, a fragment may be employed from any EGF, TGF, VGF (portion of the vaccinia virus having substantial homology with other growth factors and depicted in Fig. 2) , or other growth factor having the same framework structure of the subject compositions and similar physiologic activity. This will be regardless of the mammalian source, such as primate, e.g., human, rodent, e.g., rat and mouse, bovine, avian, porcine, etc.

The fragments will generally be of from about 15 to 50, usually 15 to 45 amino acids, since aa²⁰ does not intend the twentieth amino acid of the compound, but only of the specific sequence which has been specifically defined.
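The fragment-joining scheme described above can be sketched in a few lines of Python. Both input strings are hypothetical 44-residue sequences assembled from the exemplified residues of the general formula (they are not sequences quoted from the figures), and `join_near_cxc` is an illustrative helper name. The junction is placed a chosen number of residues upstream of the C-X-C motif, within the stated limit of 5, and the result is checked to confirm that the cysteine framework is retained.

```python
import re

# Hypothetical sequences built from the general formula's exemplified residues.
SEQ_A = "GCPPEGDGYCLHGDCIHARDIDGMYCRCSHGYTGIRCQHVVLVD"
SEQ_B = "LCGSSYTGYCFNGVCMYIESLEKYACNCVIGYSGDRCEYRDLRW"

def cxc_start(seq: str) -> int:
    """0-based index of the first cysteine of the C-X-C motif."""
    match = re.search(r"C.C", seq)
    if match is None:
        raise ValueError("no C-X-C motif found")
    return match.start()

def join_near_cxc(n_donor: str, c_donor: str, upstream: int = 1) -> str:
    """Join the N-terminal part of n_donor to the C-terminal part of c_donor,
    with the junction `upstream` residues before the C-X-C motif (0 to 5)."""
    if not 0 <= upstream <= 5:
        raise ValueError("junction must lie within 5 residues of C-X-C")
    cut_n = cxc_start(n_donor) - upstream
    cut_c = cxc_start(c_donor) - upstream
    return n_donor[:cut_n] + c_donor[cut_c:]

def cysteines(seq: str) -> list:
    """1-based positions of cysteine residues."""
    return [i for i, r in enumerate(seq, start=1) if r == "C"]

chimera = join_near_cxc(SEQ_A, SEQ_B, upstream=1)
```

Cutting both donors at the same offset from the motif is what keeps the cysteine spacing unchanged; donors with different spacing upstream of C-X-C would need the offsets chosen separately.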

Of particular interest is a modified VGF in which an alanine is substituted for a tyrosine, which introduces an SphI restriction site. This provides a convenient site for linking one fragment from the VGF gene to a fragment from another growth factor. Of course, the entire sequence could be synthesized without any amino acid changes. Thus the subject polypeptides may be synthesized in accordance with conventional polypeptide synthesizing techniques, using a commercially available polypeptide synthesizer.

The subject compositions may be unglycosylated or glycosylated. Depending upon the particular amino acid sequence, one or more glycosylation sites may be present, usually not more than six, more usually not more than three. The glycosylation sites may or may not be glycosylated. The saccharides will usually add not more than 6 kDal (kilodaltons), usually not more than about 5 kDal, to the total molecular weight of the active polypeptide. Sugars involved will be mannose, glucose, N-acetylglucosamine, galactose, glucuronic acid, galacturonic acid, sialic acid, etc.

The subject compositions find a wide variety of applications in vitro and in vivo as agonists or antagonists for growth factors, such as epidermal growth factor (EGF) and transforming growth factor (TGF), particularly TGF-I.

A discussion of growth factors, used by themselves or in combination with other compositions, particularly polypeptide compositions, for regulating the growth of cells, and other activities, may be found, for example, in Handbook of Experimental Pharmacology, Tissue Growth Factors, ed. Baserga, Vol. 57, Springer-Verlag, Berlin, 1981, Chapter 3, particularly pages 98-109; and Carpenter, Ann. Rev. Biochem. (1979) 48:193-216. hEGF appears to be identical to human urogastrone. EGF has been found to exert a variety of effects on prenatal and neonatal tissue growth. Among the effects are precocious eye-opening, wound healing, incisor eruption, and accelerated maturation of the lung. EGF receptors are found in a wide variety of adult tissues. EGF is found to stimulate phosphorylation of its own receptor. EGF is also found to be related to increased bone resorption.

Transforming growth factor, particularly TGF-I or TGF-α, has many activities analogous to those of EGF. TGF binds to the EGF receptor, leading to phosphorylation of the receptor, enhancement of its tyrosine-specific kinase activity and stimulation of cell growth. Cohen, in: Biological Response Mediators and Modulators (ed. August, J.T.), Academic, New York, 1983, pp. 7-12; Tam et al., Nature (1984) 309:376-378; Ibbotson et al., Science (1983) 221:1292-1294.

The subject compounds have particular application as drugs, as agonists for EGF, and for wound healing, such as epithelialization of wounds, including burns, eye wounds, surgical incisions, and the like. The active ingredient may be employed in a convenient vehicle, e.g., Silvadene, in amounts ranging from about 0.01 to 0.5, usually from about 0.075 to 0.2 μg/ml. The formulation is spread over the wound, so as to provide a complete coating of the wound with the formulation. Treatments may be as frequent as four times a day or as infrequent as every other day or less, depending upon the nature of the wound, its response to the treatment, the concentration of the active ingredient, and the like.

The subject compositions can be used as reagents in diagnostic assays or for the preparation of reagents, such as polyclonal or monoclonal antibodies. As reagents, they may be used for the detection of analogous growth factors or for the detection of antibodies to the growth factors in physiological fluids, such as blood. Depending upon the particular protocol and the purpose of the reagent, the polypeptide may be labeled or unlabeled. A wide variety of labels have been used which provide for, directly or indirectly, a detectable signal. These labels include radionuclides, enzymes, fluorescers, particles, chemiluminescers, enzyme substrates or cofactors, enzyme inhibitors, magnetic particles, etc. See for example, U.S. Patent Nos. 3,654,090, 3,817,837, 3,935,074, 3,996,345, 4,277,437, 4,374,925, and 4,366,241. A wide variety of methods exist for linking the labels to the polypeptides, which may involve use of the N-terminal amino group for functionalization to form a pyrazolone, while other free amino groups are protected, where the pyrazolone may then be contacted with various reagents, e.g., amino groups, to link to the detectable signal generating moiety. By protecting the arginine amino acids associated with the third loop or proximal thereto, other arginines may be functionalized for conjugation to amino groups or thio groups in accordance with known ways. Alternatively, the polypeptide may be contacted with an active agent, e.g., an activated carboxylic acid and randomly substituted, where biologically active material may be separated from biologically inactivated material as a result of the random substitution. Finally, depending upon the method of synthesis, the polypeptide may be modified to provide for the desired functionality as part of the synthetic procedure.

The subject compositions can also be used for monitoring EGF receptors. The subject compositions can also be used for monitoring cellular response to EGF and/or TGF by providing for competition between these naturally occurring materials and a composition according to the subject invention. In this way, changes in the receptor conformation can be monitored. Depending upon the particular composition of this invention which is employed and the purpose for the additive, for in vitro use, the concentration of the additive will vary widely, due to fluctuations in activity, varied purpose, and variations in receptors.

The subject compositions can also be used for various therapeutic purposes involving growth stimulation or control of bone formation. These compounds may be administered in appropriate physiological carriers intraperitoneally, subcutaneously, intravenously, intraarterially, or by application to the site of interest. In addition, the subject compositions can be introduced into liposomes, which may or may not involve the use of antibodies for site direction. Various carriers include phosphate buffered saline, saline, water, or the like. The concentration of the additive will vary widely, depending upon its ultimate use and activity. Other additives may also be included in the formulations, such as EGF, TGF, other growth factors, bactericides, e.g., antibiotics, bacteriostats, buffers, etc.

For preparing antibodies, the subject polypeptides where PP¹⁻⁴ are hydrogen or short oligopeptide chains (fewer than five amino acids) may be joined to antigenic polypeptides or proteins for injection into mammalian hosts. The antigenic protein will have at least about 60 amino acids and will usually be not more than 10 kilodaltons (kDal). Numerous techniques exist for joining to polypeptides, either at a specific site or randomly, using bifunctional reagents, e.g., p-maleimidobenzoic acid, glutaraldehyde, p,p'-benzidine, etc. Common antigenic proteins include bovine serum albumin, keyhole limpet hemocyanin, tetanus toxoid, etc. The subject polypeptides are joined to the antigenic protein in sufficient number to provide the desired immunogenic response. Usually there will be two or more booster injections after the initial injection. For antisera, blood is removed from the immunized host and the immunoglobulin fraction isolated. For monoclonal antibodies, the spleen is isolated and splenocytes fused with an appropriate fusion partner in accordance with conventional ways. The resulting hybridomas are then screened for antibodies binding to the epitopic sites of the subject polypeptide. These antibodies may be used for a variety of purposes, such as diagnostic reagents, therapy, etc. The antibodies when used as reagents may be labeled or unlabeled, as described for the polypeptides.

The subject compositions can be prepared in a variety of ways depending on the size of the composition. Particularly where the composition is below about 80, more particularly below about 60, amino acids, it can be prepared by synthesis in accordance with conventional ways. See, for example, Merrifield, Solid-Phase Peptide Synthesis, "The Peptides: Analysis, Synthesis, Biology," Special Methods in Peptide Synthesis, Part A, Vol. 2, Gross and Meienhofer eds., Academic Press, NY, 1980, pp. 1-284. See also, U.S. Patent No. 4,127,526.

Alternatively, the use of hybrid DNA technology can be employed, where DNA sequences can be used which code for the desired polypeptide or precursor thereof.

DNA sequences can be synthesized employing conventional techniques such as overlapping single strands which may be ligated together to define the desired coding sequence. The termini can be designed to provide restriction sites or one or both termini may be blunt-ended for ligation to complementary ends of an expression vector. For expression of the sequence an initial methionine is provided. Expression vectors are generally available and are amply described in the literature.

Instead of synthesizing the structural gene, a poxvirus may be isolated and by various techniques, the mRNA isolated, which codes for the polypeptide including the growth factor, the mRNA reverse transcribed, the resulting single-stranded (ss) DNA used as a template to prepare double-stranded (ds) DNA and the ds DNA gene isolated. This gene may then be manipulated in a variety of ways to remove undesired untranslated region or undesired codons, for example, employing primer repair, in vitro mutagenesis to introduce a restriction site or different amino acid at one or more appropriate sites, or introduction into a vector, followed by restriction and exonuclease digestion to remove the terminal bases. In this way, the gene may be reduced to the desired number of codons. Where a convenient restriction site is internal to the coding region, the coding region may be restricted and the lost nucleotides replaced by employing an adapter for joining the coding region to the desired flanking region, such as another coding region which codes for a foreign polypeptide to be joined to the subject polypeptide to provide a fused protein.

The DNA coding for the growth factor sequence may be excised from the VV genome by cleavage with AluI and HpaII to give a 190bp fragment. This may then be manipulated as described above.

Expression vectors are characterized by having transcriptional and translational regulatory initiation and termination signal regions, where a DNA sequence having an open reading frame may be inserted between them and will be under the transcriptional and translational control of the signals. In addition, the expression vector may have one or more markers which allow for selection of the host having the expression vector and maintenance of the expression vector in the host.

The transcriptional initiation may be subject to inducible control, by temperature change, chemicals, or the like. In this manner, one can grow the host cells to high density, prior to initiation of the production of the desired product.

In addition, there may be one or more replication systems. For extrachromosomal maintenance, it will be necessary to have a replication system which is functional in the host to be used for expression. Where the host is other than a bacterium, it will frequently be desirable to have a second replication system which allows for cloning in a bacterium, to enhance the availability of the plasmid and allow for purification and characterization.

A wide variety of prokaryotic or eukaryotic hosts may be employed, including unicellular microorganisms, both prokaryotic and eukaryotic, cold-blooded and warm-blooded eukaryotes, such as insects and mammals, or the like. By employing different constructs for expression of the gene, one can introduce the constructs into different hosts. The host may provide for different products by providing for glycosylation to varying degrees with the same or different sugars or providing for an unglycosylated product. Usually, the glycosylation may occur at one or more, usually not more than two, glycosylation sites present in the VGF sequence, where for glycosylation at least one sugar will be added and usually not more than about 6kDal of sugars, usually not more than about 4kDal of sugars, frequently not more than about 2kDal of sugars.

Desirably, in the subject invention, one can use the leader sequence naturally present with the polypeptide of the subject invention to provide for secretion in a eukaryotic host, particularly a mammalian host, plus other maturation processes, e.g., glycosylation. Thus, by employing the DNA sequence encoding for the subject oligopeptide joined to the naturally occurring secretory leader and processing signal, with the subject DNA sequence by itself or fused to a DNA sequence coding for a foreign polypeptide not naturally joined to the subject oligopeptide, one can provide for secretion of the subject polypeptides with concomitant removal of the secretory leader processing signal in an appropriate eukaryotic host. Alternatively, where the leader and processing signal are not functional in the intended expression host, the leader and processing signal may be substituted with a leader and processing signal recognized by the intended expression host.

Where integration into the host genome is desired, a stable replication system is not required. Normally, the sequences of interest will be flanked by sequences homologous with a sequence present in the host genome to enhance the probability for recombination. Desirably, a marker is included, particularly one which allows for amplification, such as genes expressing metallothioneins, dihydrofolate reductase or thymidine kinase, so that the insertion sequence is not only maintained but amplified.

The subject polypeptides may be used with various hosts for inducing an immunogenic response, enhancing cellular proliferation, wound healing or the like. For wounds, epithelialization and vascularization are observed, as well as rapid restoration of the strength of the wound, that is, resistance to separation or tearing. Hosts include mammals, such as rodents, domestic animals, primates and humans.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL

The Dayhoff protein sequence library of 2676 sequences was obtained on magnetic tape from the Protein Identification Resource (Georgetown, Washington, D.C.), and searched for sequences related to rat TGF (rTGF), where the three most closely related sequences were mouse EGF (mEGF) (Scott et al., 1983, supra; Gray et al., 1983, supra) and human EGF (hEGF) (Gregory et al., Int. J. Pept. Protein Res. (1977) 9:107-118), which are known to be homologous to human, mouse and rat TGF (Marquardt et al., Proc. Natl. Acad. Sci. USA (1983) 80:4684-4688), and residues 45 to 85 of a 140 residue polypeptide encoded by the vaccinia virus genome (Venkatesan et al., 1982, supra). The alignment of the VV polypeptide, rTGF, mEGF, and hEGF is shown in Figure 1. The observed homology has a probability due to chance of less than 0.00301 by Fisher's exact test.

The VV peptide has uncharged and hydrophobic residues near the N-terminus between residues 5 and 15 and near the C-terminus between residues 100 and 124. These residues may be considered by analogy to integral membrane glycoproteins as forming an N-terminal hydrophobic signal sequence, which is removed proteolytically during or immediately after translation, and a C-terminal transmembranous sequence, which serves to anchor the mature protein in the membrane. Cleavage of the viral polypeptide at arg-43 and arg-90 would lead to release of a soluble polypeptide. The 140 residue VV polypeptide may be assumed to give rise first to a membrane-associated protein of approximately 120 residues after removal of a signal peptide and then to a soluble growth factor peptide of about 47 residues.

The following DNA sequences were designed for synthesis and expression in an E. coli host. The sequences were designed to employ primarily bacterial preferred codons. The sequences contain several useful restriction sites, a few being indicated. The VGF sequence differs from the natural sequence in being DGMACRC rather than DGMYCRC. The subsequent chimeric sequences are prepared by synthesizing the two fragments bounded by BssHII and SphI sites and SphI and BamHI sites and ligating these fragments into the plasmid vector containing the modified TGF or VGF structural gene.

Synthetic human TGF gene

BssHII

CGCGCCATGGTTGTTTCTCACTTTAACGACTGCCCGGACTCTCATACTCAGTTT M V V S H F N D C P D S H T Q F

TGCTTTCATGGTACCTGCCGTTTTCTGGTTCAGGAAGAAAAACCGGCATGCGTT C F H G T C R F L V Q E D K P A C V

BamHI

TGCCATTCTGGCTACGTTGGCGCACGTTGCGAACACGCTGACCTGCTGGCTTAAGGATCC C H S G Y V G A R C E H A D L L A Ter

Synthetic human TGF/VGF gene

BssHII

CGCGCCATGGTTGTTTCTCACTTTAACGACTGCCCGGACTCTCATACTCAGTTT M V V S H F N D C P D S H T Q F

SphI

TGCTTTCATGGTACCTGCCGTTTTCTGGTTCAGGAAGAAAAACCGGCATGCGTT C F H G T C R F L V Q E D K P A C V

(TGF-derived sequence above, VGF-derived sequence below)

BamHI

TGCTCTCATGGCTACACTGGAATTCGTTGCCAGCATGTTGTTCTGGTCGACTACCAGCGTTAAGGATCC C S H G Y T G I R C Q H V V L V D Y Q R Ter

Synthetic VGF/human TGF gene

BssHII

CGCGCCATGCTGTGCGGCCCGGAAGGCGACGGCTAC M L C G P E G D G Y

SphI

TGCCTGCATGGCGACTGCATCCATGCACGTGACATCGACGGCATGGCATGCGTT C L H G D C I H A R D I D G M A C V

(VGF-derived sequence up to the SphI site, TGF-derived sequence after it)

BamHI

TGCCATTCTGGCTACGTTGGCGCACGTTGCGAACACGCTGACCTGCTGGCTTAAG C H S G Y V G A R C E H A D L L A Ter

Synthetic VGF gene

BssHII

CGCGCCATGCTGTGCGGCCCGGAAGGCGACGGCTAC M L C G P E G D G Y

SphI

TGCCTGCATGGCGACTGCATCCATGCACGTGACATCGACGGCATGGCATGCCGT C L H G D C I H A R D I D G M A C R

BamHI
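The reading frame of the two VGF segments listed above can be checked with a short translation sketch. The Python below is illustrative only and is not part of the original disclosure: the helper name `translate` and the compact construction of the codon table are assumptions of this example, which simply applies the standard genetic code to the two segments as printed.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the end of the DNA."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# The two segments of the synthetic VGF gene as printed in the listing.
SEGMENT_1 = "CGCGCCATGCTGTGCGGCCCGGAAGGCGACGGCTAC"
SEGMENT_2 = "TGCCTGCATGGCGACTGCATCCATGCACGTGACATCGACGGCATGGCATGCCGT"

vgf_n_terminal = translate(SEGMENT_1 + SEGMENT_2)
print(vgf_n_terminal)
```

Under these assumptions the translation reproduces the one-letter sequence printed beside the DNA, including the DGMAC substitution noted in the text and the SphI recognition sequence (GCATGC) near the end of the second segment.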

A synthetic VGF gene is prepared from the following fragments:

         SphI                        EcoRI
               V  C  S  H  G  Y  T  G
VGFs1 5'     CGTTTGCTCTCATGGCTACACTGG
VGFs2 3' GTACGCAAACGAGAGTACCGATGTGACCTTAA

         EcoRI                    HincII            BamHI
           I  R  C  Q  H  V  V  L  V  D  Y  Q  R  *
VGFs3 5' AATTCGTTGCCAGCATGTTGTTCTGGTCGACTACCAGCGTTAAG
VGFs4 3'     GCAACGGTCGTACAACAAGACCAGCTGATGGTCGCAATTCCTAG

The VGF gene may be assembled according to conventional procedures and used for insertion into an appropriate expression vector, joined to other fragments to extend the N- and/or C-terminus, or the like.
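As a sanity check on the fragment design, the pairing of the four oligonucleotides can be verified computationally. The sketch below is an illustration added here, not a procedure from the disclosure; the function names `complement` and `best_duplex` are hypothetical. It takes each top strand as written 5' to 3' and each bottom strand as written 3' to 5', and searches for the register in which the two pair perfectly over the longest stretch.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the strand."""
    return seq.translate(COMPLEMENT)


def best_duplex(top: str, bottom: str):
    """Find the register with the longest perfectly paired stretch.

    `top` is written 5'->3' and `bottom` 3'->5', as in the listing.
    Returns (shift, paired_length), where `shift` is the index on `top`
    that pairs with the first base of `bottom` (negative if `bottom`
    overhangs on the left), or None if no register pairs perfectly.
    """
    best = None
    for shift in range(-len(bottom) + 1, len(top)):
        start = max(0, shift)
        end = min(len(top), shift + len(bottom))
        if complement(top[start:end]) == bottom[start - shift:end - shift]:
            if best is None or end - start > best[1]:
                best = (shift, end - start)
    return best


# Oligonucleotides as printed in the listing.
VGFS1 = "CGTTTGCTCTCATGGCTACACTGG"
VGFS2 = "GTACGCAAACGAGAGTACCGATGTGACCTTAA"
VGFS3 = "AATTCGTTGCCAGCATGTTGTTCTGGTCGACTACCAGCGTTAAG"
VGFS4 = "GCAACGGTCGTACAACAAGACCAGCTGATGGTCGCAATTCCTAG"

pair_1 = best_duplex(VGFS1, VGFS2)
pair_2 = best_duplex(VGFS3, VGFS4)
print(pair_1, pair_2)
```

With the sequences as printed, VGFs1 pairs along its full 24 bases with VGFs2, leaving four-base single-stranded ends on VGFs2, and VGFs3 and VGFs4 pair over 40 bases with a four-base end on each strand. This is consistent with the cohesive ends implied by the SphI, EcoRI and BamHI labels.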

Isolation of Naturally Occurring VGF

Cell culture and virus. Cercopithecus monkey kidney (BSC-1) cell monolayers were maintained in Eagle's basal medium supplemented with 10% fetal calf serum. VV (strain WR) was grown in HeLa cells and purified by sucrose density gradient sedimentation (Moss, B., (1981) in Gene Amplification and Analysis, eds. Chirikjian, J.G. and Papas, T.S. (Elsevier/North-Holland, NY), Vol. 2, pp. 253-266). BSC-1 cell monolayers were infected with 15 plaque-forming units (pfu) per cell of purified virus, and incubated at 37°C with approximately 1ml of Eagle's basal medium supplemented with 2% fetal calf serum per 2x10⁶ cells. Mock-infected cells were treated in an identical manner. Cell culture supernatants were clarified by low speed centrifugation and lyophilized. The residue was then resuspended in 1M acetic acid and dialyzed extensively against 0.2M acetic acid. Insoluble material was removed by centrifugation and the supernatant was lyophilized and resuspended in 1/100th of the original volume of 1M acetic acid and stored at 4°C.

Chromatography. Gel filtration was performed on columns of Bio-Gel P-10 (Bio-Rad) equilibrated in 1M acetic acid. Sizing by HPLC utilized two Bio-Sil TSK-250 columns (Bio-Rad) in series.

Radioreceptor Assay. The assay for binding of ¹²⁵I-labeled EGF (¹²⁵I-EGF) to its receptor on monolayers of A431 cells was modified from the procedure described by Cohen and Carpenter, Proc. Natl. Acad. Sci. USA (1975) 72:1317-1321. Cells (1x10 per well) were fixed on 24-well plates (Linbro, Flow Laboratories) with 10% formalin in phosphate-buffered saline prior to assay. Formalin-fixed cells do not slough off plates as easily as do unfixed cells, and replicate values were thus more consistent. Under these assay conditions, ¹²⁵I-EGF (1x10¹⁰ cpm/nmol) saturates the binding assay at 3nM; assays were performed at 10% of the saturation value. TGF and VGF concentrations are expressed as ng equivalents of EGF per ml, i.e., the amount required to produce an inhibition of ¹²⁵I-EGF binding equivalent to that produced by a known amount of EGF.

Radioimmunoassay. Each 50μl reaction contained the following: 20mM sodium phosphate at pH 7.4, 200nM NaCl, 40mM dithiothreitol, 0.1% bovine serum albumin, 0.1% NaN₃, ¹²⁵I-labeled peptide (2x10⁴ cpm) corresponding to the 17 carboxyl-terminal residues of TGF-α (Linsley et al., Proc. Natl. Acad. Sci. USA (1985) 82:356-369), antiserum at a final dilution of 1:5000, and other additions as specified. The reaction was initiated by the addition of antiserum and was continued at 23°C for 90min. An equal volume of 10% formalin-fixed S. aureus (Pansorbin, Calbiochem) was then added, and incubation was continued for an additional 30min at 23°C. The immunoadsorbant was removed by sedimentation, and the amount of bound ¹²⁵I-labeled peptide was measured. The amount of bound peptide was corrected for nonspecific binding measured in the absence of antibody (less than 5% of the total) and expressed as a percentage of maximal binding.
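The correction described for the radioimmunoassay amounts to a simple normalization. The following sketch shows one way to express it; the function name and the counts used in the example are hypothetical values chosen for illustration and are not data from this disclosure.

```python
def percent_of_maximal_binding(bound_cpm: float,
                               nonspecific_cpm: float,
                               maximal_cpm: float) -> float:
    """Express bound counts as a percentage of maximal specific binding.

    nonspecific_cpm: counts bound in the absence of antibody.
    maximal_cpm:     counts bound with antibody and no competing antigen.
    """
    maximal_specific = maximal_cpm - nonspecific_cpm
    if maximal_specific <= 0:
        raise ValueError("maximal binding must exceed nonspecific binding")
    return 100.0 * (bound_cpm - nonspecific_cpm) / maximal_specific


# Hypothetical counts for a reaction containing 2x10^4 cpm of labeled peptide.
example = percent_of_maximal_binding(bound_cpm=4800,
                                     nonspecific_cpm=800,
                                     maximal_cpm=8800)
print(example)
```

In this made-up example the competing sample displaces half of the specifically bound label, which corresponds to the 50% displacement point used when comparing antigens.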

Cellular DNA synthesis assay. Diploid human fibroblasts obtained from explant of newborn foreskin were seeded at a density of 3x10⁴ cells per well (96-well plates, Nunclon, Roskilde, Denmark) and were grown to confluency in Dulbecco's modified Eagle's medium (GIBCO)/10% newborn calf serum. Cultures were then placed in medium containing 0.2% newborn calf serum, and two days later EGF (10ng/ml) or VGF (10ng equivalents of EGF per ml) was added. After 8hr, cultures were labeled with 5-[¹²⁵I]iodo-2'-deoxyuridine (Amersham, 10μCi/ml, 5Ci/mg; 1Ci = 37GBq), and the amount of isotope incorporated into trichloroacetate-insoluble material was determined as described (Van Zoelen et al., Proc. Natl. Acad. Sci. USA (1984) 81:4085-4089).

Results

The medium derived from BSC-1 cells 24hr after infection with VV was tested for the presence of material that could compete with ¹²⁵I-EGF for binding to EGF receptor-rich human epidermoid carcinoma cells (A431). VV-infected cells released a potent activity that competed with EGF, which activity is designated VGF. Control medium from mock-infected BSC-1 cultures contained minimal activity in competition with EGF.

When VGF production was monitored, enhanced levels of VGF were observed in the culture medium at the earliest time examined, 2hr after infection. By 12hr, maximum amounts of this activity were found in culture supernatant with only a slight increase noted at 24hr.

The level of VGF production was found to be a function of the multiplicity of virus infection, as demonstrated by the following Table 1.

TABLE 1: Effect of Multiplicity of Infection on VGF Release

Virus multiplicity,     VGF released,
pfu per cell            ng eq. of EGF per ml

 0
10                       2.7
20                       4.5
40                       6.1
80                      10.1

Cultures of BSC-1 cells were infected at the pfu-to-cell ratio indicated, and the supernatants (approx. 1ml/2x10⁶ cells) were harvested at 24hr after infection. Samples of each were acidified, lyophilized, and tested in duplicate radioreceptor assays for EGF as described (Delarco and Todaro, Proc. Natl. Acad. Sci. USA (1978) 75:401-405). ng eq., nanogram equivalents.

Partial purification of VGF. The activity competing with EGF found in VV-infected BSC-1 cells was partially purified from acid-extracted culture supernatants at 24hr after infection as described above. Acid-solubilized polypeptides (10.5mg) from the supernatants were applied to a Bio-Gel P-10 column equilibrated with 1M acetic acid and samples of each fraction tested for activity competing with EGF. The major active peak (fraction 42) eluted slightly after the Mr 29,000 carbonic anhydrase marker with an apparent Mr of 25,000. The molecular weight was confirmed by utilizing tandemly linked Bio-Sil TSK 250 HPLC sizing columns, with all of the activity eluting as a major peak in the region of the Mr 25,000 protein marker. One microgram of partially purified VGF was equivalent to 90ng of EGF in the radioreceptor competition assay.
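The notion of "ng equivalents of EGF" used throughout can be illustrated with a small interpolation sketch. Everything in it other than the 90ng-per-microgram figure is an assumption made for illustration: the standard-curve points are hypothetical, the function names are invented here, and log-linear interpolation between standards is one common choice rather than the method stated in the disclosure.

```python
import math


def egf_equivalents(inhibition: float, standard_curve) -> float:
    """Interpolate ng EGF/ml giving the same inhibition of labeled-EGF binding.

    standard_curve: (egf_ng_per_ml, percent_inhibition) pairs sorted by
    concentration, with inhibition increasing along the curve.
    Interpolation is linear in log10(concentration).
    """
    for (c0, i0), (c1, i1) in zip(standard_curve, standard_curve[1:]):
        if i0 <= inhibition <= i1:
            fraction = (inhibition - i0) / (i1 - i0)
            log_c = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("inhibition lies outside the standard curve")


def specific_activity(egf_eq_ng: float, protein_ug: float) -> float:
    """EGF equivalents (ng) per microgram of protein."""
    return egf_eq_ng / protein_ug


# Hypothetical EGF standard curve: (ng EGF per ml, % inhibition of binding).
STANDARDS = [(0.1, 10.0), (1.0, 50.0), (10.0, 90.0)]

sample_eq = egf_equivalents(70.0, STANDARDS)
print(round(sample_eq, 2))
print(specific_activity(90.0, 1.0))  # the 90 ng eq. per microgram stated in the text
```

A sample whose inhibition falls between two standards is thus assigned the EGF concentration that would produce the same inhibition, and dividing by the protein mass assayed gives a specific activity in ng equivalents per microgram.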

Immunological comparison of VGF and TGF. In the radioimmunoassay described above, a 50% displacement of antigen from antibody to the carboxyl-terminal 17 amino acids of rat TGF-α molecule is observed at an antigen concentration of approximately 0.2-0.3ng equivalents of EGF, where 125I-labeled TGF-α competes with TGF-α. When VGF was tested at equivalent concentrations, no competition was observed. In a competitive radioimmunoassay for native EGF, VGF preparations, even when tested at 50ng of EGF equivalents/ml, exhibited a minimal displacement (<10%) of 125I-EGF from a polyclonal antibody to native EGF.

Biological activity of VGF. A comparison of levels of VGF produced by BSC-1 cells infected with 15pfu of VV per cell with that of TGF-α produced by retrovirus-transformed cells is shown in the following Table 2.

TABLE 2: Comparison of the biological activity of VGF with TGF-α and EGF

Growth     EGF receptor binding,²   Stimulation of DNA synthesis,³   Induction of anchorage-independent
factor¹    ng eq. of EGF/ml         [¹²⁵I]IdU incorporated           cell growth,⁴
           of medium                (cpm/dish)                       soft-agar colonies/plate

None       ---                      1,779                            <20
VGF        2.3                      3,760                            108
TGF-α      0.2                      NT                               294
EGF        ---                      8,482                            346

NT, not tested.

¹ VGF [90ng equivalents (eq) of EGF per μg of protein] was purified by gel filtration followed by elution from a C₁₈ μBondapak (Waters Associates) column. TGF-α was purified from Snyder-Theilen feline sarcoma virus-transformed Fisher rat embryo cells as described in Marquardt et al., Proc. Natl. Acad. Sci. USA (1983) 80:4684-4688; EGF was purified from mouse submaxillary gland (Cohen et al., J. Biol. Chem. (1980) 255:4834-4842).

² Quantitation of EGF equivalents was based on a standard ¹²⁵I-EGF binding competition curve as described.

³ Mitogenesis assays were as described; quiescent cultures of diploid human fibroblasts received 10ng of purified EGF/ml or the same number of EGF equivalents of VGF/ml. Values for [¹²⁵I]iodo-deoxyuridine ([¹²⁵I]IdU) incorporated represent the average of triplicate determinations.

⁴ The number of soft-agar colonies represents the average number of colonies containing a minimum of 20 NRK cells per six random low-power fields 10 days after seeding (1.5X10¹* cell/ml) with purified EGF (5ng/ml) or the same number of EGF equivalents of VGF/ml and 2.0ng of TGF-β/ml purified from human platelets as described in Assoian et al., ibid. (1983) 258:7155-7160. Plates of NRK cells treated with TGF-β alone did not form colonies.
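The mitogenesis figures in Table 2 are easier to compare when expressed relative to the untreated control. The short sketch below does that arithmetic on the values reported in the table; the variable names are illustrative only.

```python
# [125I]IdU incorporation (cpm/dish) as reported in Table 2.
CONTROL_CPM = 1779
INCORPORATION_CPM = {"VGF": 3760, "EGF": 8482}

# Fold stimulation of DNA synthesis over the untreated control.
fold_stimulation = {
    factor: cpm / CONTROL_CPM for factor, cpm in INCORPORATION_CPM.items()
}

for factor, fold in fold_stimulation.items():
    print(f"{factor}: {fold:.2f}-fold over control")
```

On these numbers, VGF at 10ng equivalents of EGF per ml roughly doubles incorporation, while the same nominal dose of EGF gives a nearly five-fold increase.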

Further purification of VGF. Pooled fractions from the gel filtration column (25-35) were concentrated by vacuum centrifugation, resuspended in 0.05% trifluoroacetic acid (TFA), clarified and injected into a 3.9mm x 30cm μBondapak C₁₈ column (Waters, Milford, MA). Peptides were eluted with a linear 20-60% gradient of acetonitrile in 0.05% TFA at a flow rate of 1.0ml/min at 22°C. Aliquots of each fraction were assayed in a radioreceptor assay for EGF-competing activity. The peptide corresponding to the peak activity was collected and diluted with 0.05% TFA and reinjected into a μBondapak column and eluted utilizing isocratic conditions. The acetonitrile concentration was about 22 to 25%.

Amino-Terminal Sequence of VGF

VGF (18pmol), purified as described from vaccinia virus-infected monkey cells, was subjected to automated repetitive Edman degradation in the Model 470A protein sequencer (Applied Biosystems). The phenylthiohydantoin amino acids were analyzed by rpHPLC.

The amino-terminal amino acid sequence is as follows:

D-S-G-N-A-I-E-X-X-X-P-E-I-X-N-A (X, unidentified residues)

A comparison of these data with those deduced from the vaccinia virus DNA sequence shows that VGF begins with aspartic acid at residue 20 of the primary translation product (Fig. 1) .
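The comparison between the directly determined sequence and the DNA-deduced sequence can be reproduced by treating each unidentified residue (X) as a wildcard. In the sketch below, the first 20 residues of the predicted mature sequence given later in this description stand in for the DNA-deduced sequence, and the 19-residue signal peptide length is taken from the text; the variable names are assumptions of this example.

```python
import re

# Amino-terminal sequence from Edman degradation (X = unidentified residue).
EDMAN = "DSGNAIEXXXPEIXNA"

# First 20 residues of the predicted mature VGF sequence given later in the text.
PREDICTED_N_TERMINUS = "DSGNAIETTSPEITNATTDI"

# Residues 1-19 of the open reading frame form the signal peptide.
SIGNAL_PEPTIDE_LENGTH = 19

pattern = re.compile(EDMAN.replace("X", "."))
match = pattern.search(PREDICTED_N_TERMINUS)

# Residues the DNA-deduced sequence supplies at the unidentified positions.
unidentified = "".join(
    predicted for observed, predicted in zip(EDMAN, match.group())
    if observed == "X"
)
orf_position = match.start() + 1 + SIGNAL_PEPTIDE_LENGTH

print(orf_position, unidentified)
```

The match begins at the first residue of the predicted sequence, i.e., residue 20 of the open reading frame, and the four positions not identified by Edman degradation correspond to threonine or serine residues in the deduced sequence.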

Preparation of VGF as TrpE Fusion Protein

pVG3, a plasmid containing part of the early region of vaccinia virus, is available from B. Moss (NIH). A 260bp Sau3AI-HpaII fragment was inserted into the TrpE expression plasmid pJH14, using the BamHI and HindIII sites. The recombinant plasmid was introduced into E. coli by transfection, and expression of the recombinant protein was induced by indoleacetic acid. The fusion protein was purified from the bacteria, and the VGF sequence was excised by digestion with the enzyme Lys C. The digest was shown to contain an activity that competed with EGF in the radioreceptor assay. Rabbit antiserum to vaccinia virus precipitates polypeptides of 10kDal and 25kDal from lysates of these cells. The EGF receptor binding activity was found to be about 110ng eq./mg protein. By comparison, TGF-α expressed in yeast is 115ng eq./mg protein.

Preparation of VGF in Silkworm using Baculovirus Vector

An additional expression system which is used to express the VGF recombinant protein is an insect system. See Maeda et al. and Carbonell et al., supra. In one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The envelope gene can be cloned into non-essential regions (for example, the polyhedrin gene) of the virus and is placed under control of an AcNPV promoter (for example, the polyhedrin promoter). Successful insertion of the VGF gene construct will result in inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the proteinaceous coat coded for by the polyhedrin gene). These recombinant viruses are then used to infect Spodoptera frugiperda cells in which the inserted gene is expressed.

Construction. The VGF gene of interest is placed under the control of a suitable promoter for an insect cell system. Plasmids pAc610 and pAc611 contain the polyhedrin gene cloned into a plasmid vector possessing an ampicillin resistance marker. Polylinkers are inserted into this gene 50 bases downstream from the transcriptional start site of the polyhedrin gene and 7 bases before the first ATG. The VGF gene is cloned into a convenient polylinker site, so that it is under the control of the polyhedrin promoter. The ATG initiation methionine codon and the translational termination codons are those of the VGF gene; the transcriptional start sites and polyadenylation signals are those of the polyhedrin gene.

Transformation. The host cell for AcNPV is Spodoptera frugiperda (SF) . In order to produce a recombinant virus stock, the VGF containing plasmid is mixed with AcNPV DNA and transfected into SF cells using the calcium phosphate technique. Recombinant viruses are isolated from the medium and plaque purified on SF cells. Recombinant plaques are identified by hybridization using radiolabeled VGF DNA as probe.

Expression. Once recombinant virus is identified, it is expanded on SF cells. The recombinant virus stock is then used to infect SF cells, the cells are lysed, and supernatants are screened for production of VGF protein, which is purified as described above for vaccinia virus-infected mammalian cells.

The predicted sequence of VGF produced in insect cells is:

CHO

DSGNAIETTSPEITNATTDIPAIRLCGPEGDGYCLHGDCIHARDIDGMYCRCSHGYTGIR CQHVVLVDYQRSENPNTTTSYIPSPGIMLVLVGIIIITCCLLSVYRFTRRTKLPIQDMVVP

This 121-residue sequence starts with the N-terminal sequence that has been determined directly for VGF purified from VV-infected monkey cells (that is, at residue 20 of the VGF open reading frame) and continues to the last residue of the open reading frame. It thus lacks the signal peptide (residues 1-19 of the open reading frame) but includes the transmembranous sequence (YIPSPGIMLVLVGIIIITCCLLSVY).
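Because the description refers to glycosylation sites in the VGF sequence, it can be useful to locate candidate N-linked sites in the predicted peptide. The sketch below scans the 80 residues that precede the transmembranous sequence for the consensus Asn-X-Ser/Thr (X other than proline). That consensus is general biochemical knowledge rather than something stated in this disclosure, and the variable names are assumptions of the example.

```python
import re

# The 80 residues of the predicted VGF sequence that precede the
# transmembranous sequence (YIPSPGIM...).
ECTODOMAIN = (
    "DSGNAIETTSPEITNATTDIPAIRLCGPEGDGYCLHGDCIHARDIDGMYCRCSHGYTGIR"
    "CQHVVLVDYQRSENPNTTTS"
)

# Residues 1-19 of the open reading frame form the signal peptide.
SIGNAL_PEPTIDE_LENGTH = 19

# Lookahead so that overlapping candidate sites are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

# 1-based positions of the asparagine within the predicted peptide.
sites = [m.start() + 1 for m in SEQUON.finditer(ECTODOMAIN)]
# The same positions numbered along the open reading frame.
orf_sites = [position + SIGNAL_PEPTIDE_LENGTH for position in sites]

print(sites, orf_sites)
```

Under this assumed consensus the scan reports two candidate sites, one near the N-terminus and one just before the transmembranous sequence, which agrees with the statement that usually not more than two glycosylation sites are present in the VGF sequence.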

The predicted MW of the 121-residue peptide is 13,304, not counting carbohydrate. The apparent MW of the baculovirus-produced VGF is 17,000, significantly smaller than that obtained from VV-infected cells, indicating that the two forms have probably been processed differently. The baculovirus-produced VGF may lack an additional portion of the N-terminal sequence, or a portion of the C-terminal sequence, or both, and/or could differ in the type and extent of glycosylation.

Burn Treatment with Natural VGF

Three female piglets of approximately 10 pounds were anesthetized with Ketamine and Rompun, their backs shaved and the hair totally removed with a commercial depilatory cream. A brass template (3x3cm, 147gm) was equilibrated in a 70°C water bath and then placed in firm contact with the skin for exactly 10sec. Five wounds were placed on each side of the spine and were separated from each other by approximately one inch. The top of the resulting blister was totally removed, and the wound was treated twice a day with vehicle (Silvadene) alone or vehicle containing the factor, or left untreated. The pigs ate and drank at will. After 9 or 10 days of treatment, the pigs were anesthetized again and the eschar was removed as much as possible. All burns were photographed and a punch biopsy was taken in each burn in an area judged to be epithelialized.

The following Table 3 represents the approximate percentage of each burn epithelialized as judged visually.

TABLE 3

% Epithelialized

Left Side Right Side

Untreated Silvadene Growth Factor Untreated Silvadene Growth Factor

μg/ml μg/ml human EGF natural VGF

Pig 1 9 days 55 75 30,60,70 0.1 15 70,65 0.1 Post-Burn rVGF* rVGF*

Pig 2 10 days 50 60 0.1 50 95 0.1 Post-Burn 75 0.5 95 0.5

40 1.0 60 1.0 human TGF rat TGF

Pig 3 9 days 90 0.1 15 20 90 0.1 Post-Burn 85 1.0 65 1.0

* rVGF - VGF prepared in insect cells described above

It is evident from the above results, that the VGF is a potent epithelializing agent comparing satisfactorily with other growth factors which have been previously tested and demonstrated to have mitogenic activity.

In accordance with the subject invention, novel compositions are provided having a wide ranging capability in a variety of fields, such as diagnostics, in vitro and in vivo effects on cells, acting as mitogens, additives in nutrient media, use as agonists and antagonists to EGF and TGF, acting as immunogens, acting as therapeutics, enhancing wound healing, and the like. Furthermore, by providing for a small oligopeptide having binding activity, the oligopeptide can be used by itself or in combination with other polypeptides, fused to the other polypeptides to vary the properties of the other polypeptides, resulting in binding of the polypeptide to cells having growth factor receptors. Thus, one can reversibly bind various polypeptides to cells having growth factor receptors, affecting the properties of the cells in predetermined ways.

Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claims.

Claims

WHAT IS CLAIMED IS:

1. A polypeptide of the formula

PP³-(aa²⁵-C-aa²⁷)m-C-aa²⁹-aa³⁰-G-Y-Al¹-G-aa³⁵-R-C-(aa³⁸-aa³⁹-aa⁴⁰)n-PP⁴

wherein:

PP³ and PP⁴ are hydrogen;

Al¹ is a neutral amino acid of from 2 to 6 carbon atoms having from 0 to 1 hydroxyl group;

aa²⁵ is a neutral amino acid of from 3 to 4 carbon atoms having from 0 to 1 hydroxyl group or tyrosine;

aa²⁷ is of from 4 to 6 carbon atoms and is a neutral amino acid having from 0 to 1 carboxamide group or a basic amino acid;

aa²⁹ is an aliphatic amino acid of from 3 to 6 carbon atoms having from 0 to 1 hydroxyl group or histidine;

aa³⁰ is an aliphatic amino acid of from 3 to 6 carbon atoms having from 0 to 1 hydroxyl group or histidine, wherein aa²⁹ and aa³⁰ are different;

aa³⁵ is an aliphatic amino acid of from 5 to 6 carbon atoms or an acidic amino acid;

aa³⁸ is an aliphatic carboxamide substituted amino acid of from 4 to 5 carbon atoms or an acidic amino acid;

aa³⁹ is an aromatic amino acid or an aliphatic amino acid of from 3 to 4 carbon atoms having an hydroxyl substituent;

aa⁴⁰ is an unsubstituted aliphatic amino acid of from 3 to 5 carbon atoms or a basic amino acid;

m and n are 0 or 1.

2. A polypeptide according to Claim 1 joined to at least one amino acid at either of its termini, wherein the resulting extended amino acid has a sequence other than an envelope protein of vaccinia virus and is of less than about 1000 amino acids.

3. A polypeptide according to Claim 1, glycosylated with sugars in an amount up to about 6kDal.

4. A polypeptide of the formula

PP¹-aa¹-C-aa³-aa⁴-aa⁵-aa⁶-aa⁷-aa⁸-Y-C-aa¹¹-aa¹²-(aa¹²ᵃ)p-G-aa¹⁴-C-aa¹⁶-Ar¹-Al²-aa¹⁹-aa²⁰-aa²¹-Ac-aa²³-aa²⁴-(aa²⁵-C-aa²⁷-C-aa²⁹-aa³⁰-G-Y-Al¹-G-aa³⁵-R-C-aa³⁸-aa³⁹-aa⁴⁰)-aa⁴¹-L-aa⁴³-aa⁴⁴-PP²

wherein:

the symbols in the parentheses from aa²⁵ to aa⁴⁰ are as defined in Claim 1;

aa¹ is any amino acid;

aa³ is a neutral amino acid of from 2 to 4 carbon atoms;

aa⁴ is any amino acid;

aa⁵ is an acidic or hydroxy substituted aliphatic amino acid of from 3 to 4 carbon atoms;

aa⁶ is an unsubstituted aliphatic amino acid of from 2 to 3 carbon atoms or an aromatic amino acid;

aa⁷ is an acidic amino acid or hydroxy substituted aliphatic amino acid of from 3 to 4 carbon atoms;

aa⁸ is an unsubstituted aliphatic amino acid of from 2 to 3 carbon atoms or a carboxamide substituted amino acid of from 4 to 5 carbon atoms;

aa¹¹ is an unsubstituted neutral aliphatic amino acid of from 5 to 6 carbon atoms or phenylalanine;

aa¹² is histidine or neutral carboxamide substituted aliphatic amino acid of from 4 to 5 carbon atoms;

aa¹²ᵃ is glycine or aspartic acid;

p is 0 or 1;

aa¹⁴ is an acidic amino acid or neutral aliphatic amino acid of from 4 to 5 carbon atoms having from 0 to 1 hydroxyl group;

aa¹⁶ is a neutral aliphatic amino acid of from 4 to 6 carbon atoms, or arginine;

Ar¹ is an aromatic amino acid;

Al² is a neutral aliphatic amino acid of from 2 to 6 carbon atoms;

aa¹⁹ is any amino acid;

aa²⁰ is an acidic amino acid or neutral aliphatic amino acid of from 3 to 5 carbon atoms having from 0 to 1 hydroxyl group;

aa²¹ is an acidic amino acid or neutral aliphatic amino acid of from 5 to 6 carbon atoms;

Ac is an acidic amino acid;

aa²³ is a neutral aliphatic amino acid of from 2 to 6 carbon atoms having from 0 to 1 hydroxyl substituent or a basic amino acid;

aa²⁴ is methionine, proline or tyrosine;

aa⁴¹ is a neutral unsubstituted aliphatic or acidic amino acid of from 4 to 6 carbon atoms;

aa⁴³ is a neutral aliphatic amino acid of from 4 to 6 carbon atoms or a basic amino acid;

aa⁴⁴ is any amino acid; and

PP¹ and PP² are the same or different and are hydrogen, an amino acid or a polypeptide of up to 1000 amino acids, at least one of PP¹ and PP² being other than the natural flanking region of vaccinia virus growth factor, when aa¹⁻⁴⁴ define the naturally occurring amino acids of vaccinia virus growth factor.

5. A polypeptide according to Claim 4, joined to an immunogenic polypeptide at either terminus, said immunogenic polypeptide having at least about 60 amino acids.

6. A polypeptide according to Claim 5, wherein: aa¹ is leucine, aa³ is glycine, aa⁴ is proline, aa⁵ is glutamic acid, aa⁶ is glycine, aa⁷ is aspartic acid, aa⁸ is glycine, aa¹¹ is leucine, aa¹² is histidine, p is 0, aa¹⁴ is aspartic acid, aa¹⁶ is isoleucine, Ar¹ is histidine, Al² is alanine, aa¹⁹ is arginine, aa²⁰ is aspartic acid, aa²¹ is isoleucine, Ac is aspartic acid, aa²³ is glycine, aa²⁴ is methionine, the amino acids in the parentheses are as defined in Claim 3, aa⁴¹ is valine, aa⁴³ is valine, and aa⁴⁴ is aspartic acid.

7. A hybrid growth factor wherein the N-terminal fragment of from about 15 to 50 amino acids of one growth factor is joined to a C-terminal fragment of from 15 to 45 amino acids of a different growth factor, said fragments being obtained from growth factor polypeptides having a formula according to Claim 4.

8. A DNA sequence encoding a hybrid growth factor according to Claim 7.

9. A DNA sequence encoding a polypeptide according to Claim 1, wherein said sequence includes a methionine codon at the 3' terminus of the coding sequence and a stop codon at the 5' terminus of the coding sequence.

10. An expression vector having a replication system functional in a microorganism host; transcriptional and translational initiation and termination regulatory signals functional in said host; and a DNA sequence according to any of Claims 8 or 9, between said initiation and termination signals and under their regulatory control.

11. Antibodies prepared in immunogenic response to an immunogen according to Claim 5.

12. A composition for use in a method for inducing epithelialization of a wound of a host, said method comprising: applying to the site of said wound an amount sufficient to cause epithelialization of said composition, said composition comprising a polypeptide of the formula:

PP¹-aa¹-C-aa³-aa⁴-aa⁵-aa⁶-aa⁷-aa⁸-Y-C-aa¹¹-aa¹²(aa¹²ᵃ)ₚ-G-aa¹⁴-C-aa¹⁶-Ar¹-Al²-aa¹⁹-aa²⁰-aa²¹-Ac-aa²³-aa²⁴-(aa²⁵-C-aa²⁷-C-aa²⁹-aa³⁰-G-Y-Al¹-G-aa³⁵-R-C-aa³⁸-aa³⁹-aa⁴⁰)-aa⁴¹-L-aa⁴³-aa⁴⁴-PP²

wherein:
the symbols in the parentheses from aa²⁵ to aa⁴⁰ are as defined in Claim 1;
aa¹ is any amino acid;
aa³ is a neutral amino acid of from 2 to 4 carbon atoms;
aa⁴ is any amino acid;
aa⁵ is an acidic or hydroxy substituted aliphatic amino acid of from 3 to 4 carbon atoms;
aa⁶ is an unsubstituted aliphatic amino acid of from 2 to 3 carbon atoms or an aromatic amino acid;
aa⁷ is an acidic amino acid or hydroxy substituted aliphatic amino acid of from 3 to 4 carbon atoms;
aa⁸ is an unsubstituted aliphatic amino acid of from 2 to 3 carbon atoms or a carboxamide substituted amino acid of from 4 to 5 carbon atoms;
aa¹¹ is an unsubstituted neutral aliphatic amino acid of from 5 to 6 carbon atoms or phenylalanine;
aa¹² is histidine or neutral carboxamide substituted aliphatic amino acid of from 4 to 5 carbon atoms;
aa¹²ᵃ is glycine or aspartic acid;
p is 0 or 1;
aa¹⁴ is an acidic amino acid or neutral aliphatic amino acid of from 4 to 5 carbon atoms having from 0 to 1 hydroxyl group;
Ar¹ is an aromatic amino acid;
Al² is a neutral aliphatic amino acid of from 2 to 6 carbon atoms;
aa¹⁹ is any amino acid;
aa²⁰ is an acidic amino acid or neutral aliphatic amino acid of from 3 to 5 carbon atoms having from 0 to 1 hydroxyl group;
aa²¹ is an acidic amino acid or neutral aliphatic amino acid of from 5 to 6 carbon atoms;
Ac is an acidic amino acid;
aa²³ is a neutral aliphatic amino acid of from 2 to 6 carbon atoms having from 0 to 1 hydroxyl substituent or a basic amino acid;
aa²⁴ is methionine, proline or tyrosine;
aa⁴¹ is a neutral unsubstituted aliphatic or acidic amino acid of from 4 to 6 carbon atoms;
aa⁴³ is a neutral aliphatic amino acid of from 4 to 6 carbon atoms or a basic amino acid;
aa⁴⁴ is any amino acid.

13. VGF of a purity of at least 90 ng equivalents of EGF per microgram of protein.
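
By way of illustration only, the fragment-length limits recited in Claim 7 can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the claimed subject matter: the function name `join_hybrid` and the placeholder sequences are hypothetical, "about 15" is read as a hard lower bound of 15, and the sketch does not check that the fragments derive from polypeptides of the Claim 4 formula.

```python
# Illustrative only: the length bounds come from Claim 7; the function name
# and the placeholder sequences are hypothetical, not taken from the disclosure.

N_FRAGMENT_RANGE = (15, 50)  # N-terminal fragment length ("about 15" read as 15)
C_FRAGMENT_RANGE = (15, 45)  # C-terminal fragment length


def join_hybrid(n_fragment: str, c_fragment: str) -> str:
    """Concatenate an N-terminal fragment of one growth factor with a
    C-terminal fragment of a different growth factor.

    Raises ValueError if either fragment is outside its Claim 7 length range.
    """
    checks = (
        ("N-terminal", n_fragment, N_FRAGMENT_RANGE),
        ("C-terminal", c_fragment, C_FRAGMENT_RANGE),
    )
    for name, fragment, (low, high) in checks:
        if not low <= len(fragment) <= high:
            raise ValueError(
                f"{name} fragment has {len(fragment)} residues; "
                f"expected {low} to {high}"
            )
    return n_fragment + c_fragment


if __name__ == "__main__":
    # Placeholder one-letter sequences, chosen only for their lengths.
    hybrid = join_hybrid("A" * 20, "G" * 30)
    print(len(hybrid))
```

A 20-residue N-terminal fragment joined to a 30-residue C-terminal fragment passes both checks; a 10-residue fragment on either side would be rejected.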