T7 Phage DNA replication:

 

T7 belongs to the T-odd group of phages, which includes T1, T3, T5 and T7. These phages have an icosahedral head and a short tail with fibers. The genomes of T1, T3 and T7 are about 40 kbp, and that of T5 is about 120 kbp. The T7 genome is linear, double-stranded DNA of 39,937 bp, with 160 bp direct repeats at both ends. It carries 56 genes with the capacity to code for 55 proteins, of which only 44 have been identified so far. In this linear configuration, the left end can be distinguished from the right end.

 

 

The electron micrograph of the virus shows the structural features of the T7 phage; the elaborate neck and tail structures found in T4 phages are missing.

 

[Figures: T1 phage, T2 phage, T4 phage, T5 phage, T7 phage]

 

                                            

The phage first binds to the bacterial surface through its cone-like tail and fibers. This leads to injection of the left end of the genome. Even before the rest of the genome enters the cell, the left part is transcribed by host RNA polymerase, because its promoters resemble bacterial promoters.

The genome has three types of promoters: class I, class II and class III.

Transcription by host RNAP uses class I promoters and continues until it reaches the rho-dependent termination region at 1.3-1.4 mpu. This first phase of transcription covers about 20% of the genome from the left end. This part of the genome contains no sites for the host restriction enzymes (REs) to act on.

The DNA end that enters the host cell first during infection is the last part to enter the capsid during phage assembly and maturation.

 

 

              

 Class-1                Class-2                 Class-3

  5’AGG---II--------ori-I---------------I-----------------II---AGG3’

 

 

The genome is divided into three classes of genes based on their temporal expression. The classification also reflects the kind of promoter used and the RNA polymerase employed for transcription.

 

 

 

This is a line diagram of the T7 genome showing the class I and class II promoter regions. Host RNA polymerase transcribes from 0.0 to 1.3 mpu; viral RNA polymerase transcribes from 1.3 to 7.3 mpu and from 7.3 mpu to the end of the genome.

 

 

First Phase: 

In the first five minutes or so of infection, the phage binds to specific surface receptors with the help of the tail fibers. A channel is then produced through which the left end of the DNA, containing the early genes, enters the host cell first. The DNA is injected slowly. This part contains few genes and is transcribed by host RNA polymerase. The required promoter elements, typical of E. coli, are found at the extreme left end.

 

Class I promoter:

 

--I -------(-35) TTGACA----------(-10) TATAAT------+1>>>
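The class I layout above (a -35 hexamer, a spacer, then a -10 hexamer) can be illustrated with a small pattern search. This is only a sketch: the two hexamers are the ones quoted in the text, while the 16-18 bp spacer range, the function name `find_class1_like` and the toy test sequence are assumptions introduced for illustration, not values taken from the T7 genome.

```python
import re

# Consensus hexamers quoted in the text for E. coli-type (class I) promoters.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def find_class1_like(seq, min_gap=16, max_gap=18):
    """Return (start, spacer_length) for every exact -35/-10 pair in seq.

    The spacer range is an assumption made for this sketch; real promoters
    also tolerate mismatches in both hexamers, which is not modelled here.
    """
    # A lookahead lets overlapping candidates be reported as well.
    pattern = re.compile(
        f"(?=({MINUS_35})([ACGT]{{{min_gap},{max_gap}}}){MINUS_10})"
    )
    return [(m.start(), len(m.group(2))) for m in pattern.finditer(seq.upper())]


# Toy sequence (hypothetical, not from T7): a 17 bp spacer between the boxes.
toy = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GG"
print(find_class1_like(toy))
```

An exact-match search like this is deliberately strict; it is meant to show the geometry of the promoter, not to predict real promoters.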

 

The part of the DNA that has entered has no sites for host restriction enzymes to act on. The very early genes generate a transcript about 7000 nt long, which ends at the rho-dependent terminator (Ter) between map positions 1.3 and 1.4. The transcript consists of cistrons separated by hairpin structures; RNase III cleaves these hairpins and releases the individual cistrons, which engage with cellular ribosomes and are translated.

 

 

The above diagram is a simple representation of the transcription and translation of a given gene.

 

The products are anti-restriction proteins (which inhibit the host's EcoB and EcoK restriction enzymes) and a protein kinase (gp0.7) that inactivates host RNAP by phosphorylating its β and β′ subunits. This segment also encodes T7 RNAP and DNA ligase. These genes are expressed within 6 minutes of infection. The proteins are called class I proteins and their promoters class I promoters. Expression of T7 RNAP is essential for the expression of the rest of the genes, because the next part of the genome has promoters for T7 RNAP only.

 

0-----(-)-35—(-)10------------------------------ 1.0 -----1.3-Ter-1.4---I

III = Host RNAP-promoters.      

1.0 mpu = T7 RNAP.

1.3-1.4 = Host RNAP terminator region.

 

The products of the genes transcribed at this first stage have a great impact on phage production. Among the five genes transcribed, the RNAP gene product is unique to the phage. A kinase (gp0.7) that inactivates host RNAP is also produced, and some products inhibit host restriction enzymes such as EcoK and EcoB. Together they redirect the host's cellular metabolism. The mRNA produced is polycistronic.

Between each pair of cistrons there is a small hairpin structure, which is processed by host RNase III to release individual mRNAs.

 

 

Second Phase:

As more DNA enters the cell, the DNA from 1.0 mpu onwards contains promoters for T7 RNAP. These differ slightly from host RNAP promoters and are considered weak promoters. The consensus sequence of the class II promoter is as follows.

 

TAATCCG (-10) ACTCACTATA +1 GGGAG A>>>.

 

These promoters do not have distinct -35 and -10 regions like those found in the host's promoters. There are nearly ten of them; they are called class II promoters, and their products class II proteins. They are expressed about 4-12 minutes after infection. The products are generally involved in DNA metabolism, i.e. DNA replication. One of the products acts as an inhibitor of host RNAP. The other proteins are SSB, endonuclease, lysozyme, primase/helicase, DNA polymerase and 5’ exonuclease. The endonuclease and 5’ exonuclease degrade the host's DNA into individual nucleotides, which are used for the synthesis of viral DNA. SSBs stabilize the separated DNA strands, and the primase provides the primers for DNA polymerase.

 

Third Phase: 

In this phase the genome is completely inside the host cell. T7 RNAP uses class III promoters and generates a large number of transcripts for the production of structural proteins for the head and tail. T7 RNAP uses this sequence very efficiently, so these are all strong promoters. There are five such promoters, and the transcripts produced are polycistronic. The consensus sequence of the class III promoter is given below.

 

(-17) TAATACGACTCACTATA +1 GGGAGA (+6)

 

This generates proteins for the head and tail and also for DNA maturation.

 

Occasionally T7 RNAP starts at a class II promoter and reads through to the terminator region at the end of the genome. This produces a long polycistronic transcript of approximately 32,000 nt, extending from 1.1 mpu to the end at 20 mpu, which is processed by ribonuclease III to release individual cistronic mRNAs for translation.
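As a rough arithmetic check, the two transcript lengths quoted in the text (about 7000 nt for the early host-RNAP transcript and about 32,000 nt for the long read-through transcript) can be expressed as fractions of the genome. The sketch below assumes a genome length of 39,937 bp; the constant and function names are illustrative only.

```python
# Lengths quoted in the text; the genome length of 39,937 bp is an assumption
# of this sketch (the usual figure for T7).
GENOME_BP = 39937
EARLY_TRANSCRIPT_NT = 7000    # host-RNAP transcript ending at Ter
LATE_TRANSCRIPT_NT = 32000    # T7-RNAP read-through transcript


def genome_fraction(length_nt, genome_bp=GENOME_BP):
    """Fraction of the genome covered by a transcript of the given length."""
    return length_nt / genome_bp


print(f"early transcript: {genome_fraction(EARLY_TRANSCRIPT_NT):.1%} of the genome")
print(f"late transcript:  {genome_fraction(LATE_TRANSCRIPT_NT):.1%} of the genome")
```

The early transcript comes out somewhat under the "20%" quoted for the first phase, and the two transcripts together account for nearly the whole genome, which agrees with the description that host RNAP covers the left end and T7 RNAP the rest.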

 

Third Phase:

In this phase T7 DNA is replicated. More and more dsDNA molecules are produced, all of them concatemeric.

These molecules are used for transcription from class II and class III promoters. Translation of the transcripts yields structural proteins for the head and tail, as well as proteins for capsid assembly and DNA maturation. Once the viral particles have assembled, T7 lysozyme causes cell lysis and releases the phage particles.

 

 

The genomic DNA is 39,937 bp long and is divided into 20 map units (mpu), numbered 1, 2 and so on up to 20. Class I proteins are coded by genes located from 0.1 to 1.3 mpu, class II proteins by genes from 2.0 to 7.3 mpu, and class III proteins by genes from 7.3 mpu to the end of the DNA (20 mpu). The class I promoter is located at the extreme left end of the genome and has a tripartite structure; host RNA polymerase recognizes it, and its transcription is terminated at 1.3 mpu in a rho-dependent manner. Viral T7 RNAP transcripts start at phi 1.1A and extend up to phi 3.8.
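The map-unit intervals given above can be written as a small lookup, which makes the gap between 1.3 and 2.0 mpu and the shared boundary at 7.3 mpu explicit. This is a minimal sketch: the table and the function `temporal_class` are hypothetical names, and treating the interval ends as inclusive (with 7.3 assigned to the first matching class) is an assumption, since the text does not say.

```python
# Map-unit (mpu) intervals as stated in the text. Inclusive boundaries are an
# assumption of this sketch; 7.3 mpu matches class II first because of ordering.
CLASS_RANGES = [
    ("Class I", 0.1, 1.3),
    ("Class II", 2.0, 7.3),
    ("Class III", 7.3, 20.0),
]


def temporal_class(mpu):
    """Return the temporal class for a map position, or None if unassigned."""
    for name, low, high in CLASS_RANGES:
        if low <= mpu <= high:
            return name
    return None


for position in (1.0, 1.5, 5.0, 10.0):
    print(position, temporal_class(position))
```

Positions between 1.3 and 2.0 mpu return `None` here because the text does not assign them to any class.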

 

Replication Origin:

 

The replication origin is located between 1.1b and 1.1 mpu. It is an AT-rich region located after two T7 RNAP promoters.

 

5814: T7 first promoter, +1>> ------- 5905

T7 second promoter: (5905) TTAATACGACTCACTAT (5921)

(5927) CCTTAAGGTTTAACTTTAA’GACCC’TTAAGTGTTAATTAGATTTAAATTAAAGAATTACTAAG (5990), followed by GAGAGGA and the ATG initiation site for the 1.1 protein.

GACCC is the primase binding site and GAGAGGA is the Shine-Dalgarno ribosome binding sequence.

The following segment forms the core of the ORIGIN.

 

5’TTTAAGACCCTTAAGTGTTAATGAGATTTAAATTAAAGAATTACTAAGAGAGGACTTTAA3’

 

                                                                 1.1b (A/T rich)1.1

Left 5’---------------I---I--------I-----I--------I=========I------II--

                                                                           (61 bp)

The primary origin of replication lies between 1.1b and 1.1 mpu and is AT-rich.
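The claim that the origin is AT-rich can be checked directly on the core sequence quoted above, and the two motifs named in the text (GACCC and GAGAGGA) can be located in it. The sketch below copies the core sequence exactly as printed in these notes; the function names are illustrative, and the printed sequence itself should be treated as approximate.

```python
# Origin core as printed in these notes (5' to 3'); treat it as illustrative.
ORIGIN_CORE = "TTTAAGACCCTTAAGTGTTAATGAGATTTAAATTAAAGAATTACTAAGAGAGGACTTTAA"


def at_fraction(seq):
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def motif_positions(seq, motif):
    """0-based start index of every (possibly overlapping) occurrence of motif."""
    seq, motif = seq.upper(), motif.upper()
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


print(f"AT fraction of origin core: {at_fraction(ORIGIN_CORE):.2f}")
print("GACCC (primase site) at:", motif_positions(ORIGIN_CORE, "GACCC"))
print("GAGAGGA (Shine-Dalgarno) at:", motif_positions(ORIGIN_CORE, "GAGAGGA"))
```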

 

To the left of the origin there is a promoter for T7 RNAP. Transcription initiates there and progresses into a noncoding region that contains the AT-rich origin sequences; the DNA opens into a bubble, which the replication components use to initiate replication. SSBs (viral products) bind to the single strands and stabilize them, and the primase (gene 4 product) and helicase bind to the lagging strand at each end of the fork. The primase lays down primers for the leading strands. The helicase moves in the 5’ to 3’ direction and unwinds the dsDNA ahead of the replication bubble. The replication bubble moves bidirectionally at first, but once the left end of the genome has been replicated, replication proceeds unidirectionally through the right part of the genome.

 

 

The replication origin lies in the left part of the T7 genome. Replication is initially bidirectional, but the left part of the genome is replicated early, and replication of the rest of the DNA appears unidirectional.

 

 

 

T7 helicase is a hexameric protein complex.

 

 

 

 

 

 

 

 

 

 

This provides the structure for assembly of the viral DNA polymerase; whether a host polymerase-loading complex is required is unclear. Whether the enzyme operates as a dimer at each fork, one unit on the leading strand and the other on the lagging strand, is also an open question, but replication is very efficient. The primers are removed, host DNA Pol I fills the gaps and viral ligase seals the nicks.

 

The helicase-primase, bound to the lagging strand, also recognizes specific sequences and generates four-nucleotide primers (5’pppACCC3’ or 5’pppACCA3’) along the length of the lagging strand as the replication complex progresses. These are used for synthesizing the discontinuous strand.
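Since a primer must be complementary and antiparallel to its template, the two primer sequences quoted above imply which template tetramers could direct their synthesis. The sketch below scans a template strand for such sites using nothing more than base pairing; it is an illustration under that assumption, the function names are hypothetical, and it ignores any additional recognition bases that the real primase may require.

```python
# Primer sequences quoted in the text, written 5' to 3'.
PRIMERS = ("ACCC", "ACCA")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (used here as the primer's letters)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_sites(template):
    """Return (index, primer) for each 4-nt template window (read 5' to 3')
    whose complementary strand would be one of the quoted primers."""
    template = template.upper()
    hits = []
    for i in range(len(template) - 3):
        primer = reverse_complement(template[i:i + 4])
        if primer in PRIMERS:
            hits.append((i, primer))
    return hits


# Toy template (hypothetical): contains 5'-GGGT-3', which pairs with ACCC.
print(primer_sites("AAGGGTAA"))
```

In this model a template window 5’-GGGT-3’ pairs with the primer ACCC and 5’-TGGT-3’ pairs with ACCA.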

 

The viral RNAP per se has cryptic RNase H activity, so it may nibble its own RNA transcript and leave a short piece of RNA as a primer for continuous strand synthesis. Initially replication is bidirectional. As the left part of the genome is hardly 20% of the total, its replication is over quickly, and the rest of the genome is replicated as if unidirectionally.

 

Host DNA Pol I removes the primers and fills the gaps. Viral T7 DNA ligase seals the nicks.

With the completion of the first round of replication, another cycle is initiated, and the process is repeated to generate a substantial number of DNA molecules.

 

However, the ends of each dsDNA molecule are left with single-stranded tails. The primer at the 3’ end of the parental template is removed by E. coli Pol I after replication, but there is no provision for filling this gap. Replication of the viral dsDNA therefore generates sticky but compatible ends, and individual genomes base pair through these ends to produce dimeric, tetrameric or octameric concatemers.

 

Because the replicated DNA ends have complementary overhanging tails, they base pair and generate concatemeric dimers or tetramers. These are used for packaging the DNA into preformed procapsids.
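The length of such concatemers can be estimated with simple arithmetic. The sketch below assumes that the overhanging tails correspond to the 160 bp terminal repeat mentioned at the start of these notes, so that neighbouring genomes in a concatemer share one copy of the repeat at each junction; both that assumption and the default genome length of 39,937 bp are labelled in the code and are not stated explicitly in the text.

```python
def concatemer_length(n_units, genome_bp=39937, repeat_bp=160):
    """Length in bp of a head-to-tail concatemer of n_units genomes.

    Assumption of this sketch: each junction between two genomes is formed by
    pairing of the terminal repeats, so one repeat copy is shared per junction.
    """
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    return n_units * genome_bp - (n_units - 1) * repeat_bp


for units, label in ((1, "monomer"), (2, "dimer"), (4, "tetramer"), (8, "octamer")):
    print(f"{label}: {concatemer_length(units)} bp")
```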

 

 

 

 

 

 

 

 

 

 

 

Of the 20 required proteins, seven are involved in replication, others build the capsid, and one or two have regulatory roles; gp2, for example, exerts negative control over the expression of early genes.

 

 

T7 DNA Polymerase (T7 Sequenase):

 

Gp5, with a molecular mass of 98 kDa, is the DNA polymerase. It possesses 5’-3’ polymerase and 3’-5’ exonuclease activities, but no 5’-3’ exonuclease activity. This protein is coupled to the host's 11.7 (13) kDa thioredoxin in a 1:1 ratio. This complex is an efficient enzyme with high processivity and high fidelity. In combination with gp6 it exhibits properties similar to those of E. coli Pol I, having 5’-3’ polymerase, 5’-3’ exonuclease and 3’-5’ exonuclease activities. Purification of this enzyme without EDTA causes it to lose its 3’-5’ exonuclease activity. Selective oxidation of amino acids in the vicinity of the iron-binding sites also leads to the loss of 3’-5’ exonuclease activity. Iron is an essential component for the exonuclease activity of T7 polymerase. Mutants in gp5 lack the 3’-5’ exonuclease activity. Chemically modified T7 DNA polymerase does not discriminate between dideoxy-NTPs and dNTPs.

 

Substituting Mn2+ for Mg2+ removes nucleotide bias. This property has made the enzyme popular among molecular biologists, who use it in sequencing reactions. The modified enzyme is called Sequenase.

 

The diagram is a 3-D model of T7 DNA polymerase, one a ball-and-stick model and the other a ribbon model, showing the thioredoxin, the bound DNA and the exonuclease domain.

 
