Replication of ΦX174 Phage DNA:


The replication of single-stranded DNA (ssDNA) is a fascinating subject. A large number of DNA viruses have single-stranded genomes. The PhiX174 family of phages, for example, includes PhiX-S13, phi-A, phi-C, phi-K, phi-R, 6SR, BR2, G4, st-2 and alpha-3; they are similar in nature, but the last three show a different mode of replication.


PhiX174 has been studied extensively. It is an isometric virus with 20 triangular faces and 12 vertices (the pointed regions where five faces meet). Each triangular face is made up of 3 subunits of gp-F (gp means gene product). Every vertex carries 5 gp-G subunits, and at the center of the five gp-G subunits sits one fiber-like spike protein called gp-H.

The phage infects by binding, through its spike protein, to a lipopolysaccharide receptor on the host cell surface. The exact mechanism of entry is not known (we often say that we know a lot about viruses, but when it comes to specifics we keep saying that we do not know).

Figure: Electron micrographs of PhiX viral particles

Figure: Simple representation of PhiX phage

The genome is 5386 nucleotides long. It is a single-stranded, circular DNA. It carries 10 genes but generates 11 proteins. This genome exemplifies how reading frames, if organized in a specific way, can generate more proteins from a minimum length of DNA. This is accomplished by overlapping reading frames: where one gene ends, the succeeding gene starts a few nucleotides before the end of the first, so that the two share part of their sequence. These are classically called overlapping genes.


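                                       |--- 2nd gene --->

 5’----ATGAAA----1st gene------------TAATGGAG----3’

 (TAA is the stop codon of the 1st gene; ATG is the start codon of the 2nd gene; the two codons share one A)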
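
The overlap drawn above can be reproduced on a toy sequence. The following is a minimal Python sketch, not PhiX174 data: `SEQ` is a made-up 23-nucleotide string, and `find_orf`, `all_orfs` and `shared_nucleotides` are hypothetical helper names chosen for illustration. It looks for ATG...stop reading frames in all three frames and shows two of them sharing one nucleotide inside the motif TAATG.

```python
# Toy illustration of overlapping genes. SEQ is a made-up sequence, not PhiX174 data.
SEQ = "ATGAAACGTTAATGGAGTTCTGA"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(seq, start):
    """Return (start, end) of the reading frame opening at `start`, or None.

    Indices are 0-based and `end` is exclusive (it includes the stop codon).
    """
    if seq[start:start + 3] != "ATG":
        return None
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (start, i + 3)
    return None


def all_orfs(seq):
    """Collect every ATG...stop reading frame, in any of the three frames."""
    orfs = []
    for start in range(len(seq) - 2):
        orf = find_orf(seq, start)
        if orf is not None:
            orfs.append(orf)
    return orfs


def shared_nucleotides(orf_a, orf_b):
    """Number of positions covered by both reading frames."""
    return max(0, min(orf_a[1], orf_b[1]) - max(orf_a[0], orf_b[0]))


gene1, gene2 = all_orfs(SEQ)
print(SEQ[gene1[0]:gene1[1]])            # ATGAAACGTTAA
print(SEQ[gene2[0]:gene2[1]])            # ATGGAGTTCTGA
print(shared_nucleotides(gene1, gene2))  # 1
print(gene1[0] % 3, gene2[0] % 3)        # 0 2
```

The two frames start at offsets 0 and 2 modulo 3, so the second gene is read in a different frame from the first, which is the situation the text describes.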



Gene   Map position   Size (nt)   Function and overlaps

A                                 Role in (+) strand synthesis
A*     4497-136       1025        Shuts off host DNA replication; A* overlaps A
B      5075-51        362         Capsid morphogenesis; B overlaps A
K      51-221         170         Stimulates phage production and increases phage burst size; overlaps A
C      133-393        260         DNA maturation
D      390-848        458         Capsid morphogenesis and assembly
E      568-843        275         Host cell lysis
J      848-964        116         DNA condensation and packaging
F      1001-2284      1283        Major coat protein
G      2395-2922      527         Major pentameric spike protein
H      2931-3917      986         Minor spike protein (central)

Sizes are the end position minus the start position, counted around the circular map of 5386 nucleotides.
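
Some entries in the gene list run through the zero map unit (gene B, for instance, is listed as 5075-51), so a plain subtraction gives a negative number. A minimal sketch of the arithmetic, assuming sizes are taken as end minus start around the circular map; `span` is a hypothetical helper name, and the coordinates are the ones listed above.

```python
GENOME_NT = 5386  # genome length given in the text


def span(start, end, genome_nt=GENOME_NT):
    """Distance from start to end going forward around the circular map."""
    return (end - start) % genome_nt


# Coordinates as listed in the gene table.
GENES = {"B": (5075, 51), "K": (51, 221), "J": (848, 964), "F": (1001, 2284)}

for gene, (start, end) in GENES.items():
    print(gene, span(start, end))
# B 362
# K 170
# J 116
# F 1283
```

The modulo step only matters for genes that cross position 0; for the others it leaves the simple difference unchanged.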









There are four distinct intergenic regions (IR): one between J and F (1001 to 1070), a second between F and G (2200 to 2300; the numbers are counted from the zero map unit position), a third between G and H, and a fourth between H and A. The IR between F and G is used as the origin for (-) strand synthesis. The origin for (+) strand synthesis lies at the left end of the A gene, approximately between positions 4299 and 4328.

The reading frames of several genes overlap. The small A* reading frame starts at about position 4494 and ends at about position 133, past the zero map unit. Gene B starts at the right side of A and ends in the terminal part of A. Gene K starts in the last part of A and slightly overlaps C. The end of gene A overlaps the first codon of gene C. Gene E overlaps more than half of D.


Figure: Linearized PhiX genome with genes, intergenic regions and the overlapping portions of certain genes


Transcription appears to start at 3 sites, and the transcripts appear to be polycistronic. A shift in the reading frame, for example, generates E from within D. This is an excellent example of how a genome of minimum size can generate more proteins. A length of 5386 nucleotides produces 11 proteins, i.e. about 490 nucleotides per protein. By comparison, most prokaryotic genes have an average size of about 1000 bp.
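
The per-protein figure is a simple division: 5386 nucleotides shared among 11 proteins comes to roughly 490 nucleotides each. A short Python check, using only the three numbers quoted in the text (the constant names are illustrative):

```python
GENOME_NT = 5386        # genome length from the text
PROTEINS = 11           # number of proteins produced, from the text
TYPICAL_GENE_BP = 1000  # average prokaryotic gene size quoted in the text

nt_per_protein = GENOME_NT / PROTEINS
print(round(nt_per_protein))                       # 490
print(round(TYPICAL_GENE_BP / nt_per_protein, 1))  # 2.0
```

On these numbers the phage spends about half as much DNA per protein as a typical prokaryotic gene would.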









Replication Cycle events:





Stage                    Time               Events

ssDNA(+) > ds RF DNA     0 to 5 minutes     Adsorption, penetration, complementary strand synthesis
                         5 to 20 minutes    Transcription of the (-) strand
RF > ssDNA(+)            20 to 40 minutes   Generates (+) strands by the rolling circle mode

Figure: PhiX genome with individual genes, including overlapping genes such as A, A*, B, K, C, D and E

Life cycle:


Infection begins when the gp-G spike binds to the lipopolysaccharide of the host cell’s outer membrane, especially its N-acetylglucosamine component.

The penetration site is a region where the outer and inner membranes are joined, like the attachment sites in higher forms.

A distinctive phospholipase found in this region has a role in viral infectivity.

The DNA penetrates along with gp-H, so this protein is called the pilot protein.

As soon as the ssDNA enters the cytoplasm it is coated with host SSB proteins, and the DNA is immediately rendered supercoiled.


Within 20-30 minutes of infection the viral genome produces a sufficient amount of A protein and also of the A* product.

A* has a non-specific ssDNA endonuclease activity.

The single-stranded DNA generated during host DNA replication is subjected to this endonuclease digestion.

In this way A* virtually shuts off host DNA synthesis. The supercoiled ssDNA is replicated to produce ds RF DNA, which in turn produces ssDNA (+) forms, which in turn generate more RF forms.

Ultimately 500 or more viral particles are produced and released by cell lysis.


Synthesis of (-) strand DNA:


The supercoiled ssDNA forms stem-loop structures at the origin for minus strand synthesis.

The origin is a region about 100 nucleotides long, found between genes F and G. This region has two stem-loop structures.


          ORI (-):

                               Loop 2: 5’-TT GTC CTT-3’ (-) strand

                                           <---- GA-ppp-5’ primer



The stem-loop structures act as the recognition point for the assembly of the primosome complex, which initiates the synthesis of the minus strand.

At the base of the second loop there are sequences that the primase requires in order to lay down a primer.

Primosome formation starts with Pri-A, supported by Pri-B, Pri-C and Dna-T, which assemble on the loop 2 structure.

Dna-B, a helicase, then joins the Pri-A complex, assisted by Dna-C in an ATP-dependent manner.

Then Dna-G joins the helicase. Dna-G is an RNA polymerase called primase. It differs from the normal RNA polymerase involved in gene transcription, which is sensitive to rifamycin, whereas primase is insensitive to rifamycin.

This primosome complex, with its helicase and primase, moves in the 5’--->3’ direction on the (+) strand.

When it encounters a 5’GTC sequence on its way, it halts and produces a complementary RNA strand 10 to 11 nucleotides long. It slides backwards to lay the primer, just as the E. coli primase does when producing primers on the lagging strand.
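
What "complementary RNA strand" means can be shown with a few lines of Python. This is only a sketch of base pairing: the 10-nucleotide `template` is a hypothetical stretch of DNA, not the real origin sequence, and `rna_primer` is an illustrative helper name. The primer is antiparallel to its template, so it is read off the template in reverse, with U in place of T.

```python
# DNA template base -> RNA base that pairs with it
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def rna_primer(template_5_to_3):
    """RNA strand (written 5'->3') complementary to a DNA template written 5'->3'."""
    return "".join(RNA_PAIR[base] for base in reversed(template_5_to_3))


template = "TTGTCCTTAC"  # hypothetical 10-nucleotide stretch of template DNA
primer = rna_primer(template)
print(primer)       # GUAAGGACAA
print(len(primer))  # 10
```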

The primosome keeps moving in the 5’ to 3’ direction, producing primers at intervals of 1000 to 1500 nucleotides.

The host DNA pol-III binds to a primer and extends its 3’ OH group until it reaches the next primer, where it stops.

DNA pol-I removes the primers and fills the gaps. DNA ligase seals the nicks.

The synthesis of the (-) strand is thus akin to the synthesis of the discontinuous strand in E. coli.
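
The primer spacing quoted above gives a rough idea of how many Okazaki-like fragments one (-) strand needs. The sketch below assumes evenly spaced primers around the whole 5386-nucleotide circle, which the text does not state; `fragment_count` is a hypothetical helper name.

```python
import math

GENOME_NT = 5386  # genome length from the text


def fragment_count(interval_nt, genome_nt=GENOME_NT):
    """Fragments needed to cover the circle if one primer is laid every `interval_nt`."""
    return math.ceil(genome_nt / interval_nt)


print(fragment_count(1500))  # 4
print(fragment_count(1000))  # 6
```

Under that assumption the (-) strand would be stitched together from about four to six pieces, each needing primer removal by pol-I and sealing by ligase.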


The double-stranded DNA produced is now called the RF (replicative form).

The replicative form also becomes supercoiled. Supercoiling is the most important prerequisite for replication and transcription.

The replicative form is now transcribed for some genes. Transcription is known to be initiated at three sites and to generate polycistronic RNAs.

The prominent promoter for the initiation of transcription lies to the left of the A gene.

Though the transcript looks polycistronic, its translation mostly generates gp-A, a 60 kDa protein.

This protein has both sequence-specific endonuclease and ligase properties.

A* is the truncated product of the A gene. Its translation initiates about halfway from the 5’ end of the A transcript.

A* recognizes host DNA and digests it without affecting the translational machinery.

At this stage the RF DNA produces protein A in sufficient numbers and A* in smaller quantity.

The mode of replication that produced the RF is exactly like E. coli lagging strand synthesis, producing Okazaki-like fragments.



(+) ssDNA + Pri-A, Pri-B, Pri-C + Dna-T + Dna-B + Dna-G at nPAS ---> primosome ---> primosome ABC complex


ssDNA + primosome complex + 4 dNTPs + DNAP-III + DNAP-I + DNA ligase ---> ds replicative form of DNA (RF-DNA)



The supercoiled replicative forms are called RF I and the nicked replicative forms are called RF II.





A list of proteins required for (-) strand synthesis:




Protein            Size (kd)   Form       Function

Pri-A              76          Monomer    Has a helicase-like activity
Pri-B                                     Assists Pri-A
Pri-C              23          Monomer    Assists Pri-A
Dna-T              22          Trimer     Assists primosome assembly
Dna-B              50          Hexamer
Dna-C                                     Assists Dna-B
                                          A helicase, 3’ > 5’
SSB                19          Tetramer   Binds ss regions all the time; replaced when other factors bind
Top and ligases                           Removal of supercoils, sealing of nicks
DNA pol-III                               Host DNA polymerase
DNA pol-I                                 Host enzyme responsible for removing primers and filling the gaps





Synthesis of (+) strand DNA:


The origin for (+) strand synthesis is located at the 5’ end of gene A.

The site is about 29 nucleotides long and contains a recognition sequence, an A-T rich region, a consensus sequence and a gp-A binding sequence.


              5’---G (4305) | A (4306)---3’


                    (A-T rich)     (gp-A binding site)


gp-A is a globular protein; its active site has the amino acid sequence -tyr-val-ala-lys-tyr-val-asn-lys-.

gp-A binds to a site in the (+) strand of the RF DNA. It also binds to a recognition region, where it identifies the cleavage site.

Binding melts the RF at the AT-rich region and creates a replication bubble with a replication fork.

This leads the enzyme (gp-A) to cut the DNA between G at 4305 and A at 4306, producing a G 3’OH end and a 5’P-A end.

The enzyme also binds covalently to the 5’P of the A nucleotide through the OH group of one of its own tyrosine residues, leaving the 3’G-OH free.

With a nick in the (+) strand and gp-A bound to the 5’-A, yet another protein, Rep-A, which is a helicase, binds to the (-) strand at this position.

This complex acts as the primosome for the assembly of the host’s DNAP-III.

With Rep-A bound to the (-) strand and gp-A bound to the 5’ A end of the (+) strand, the DNAP-III holoenzyme binds to the nick region in the RF, where the G 3’OH is the free end. At the same time the holoenzyme becomes associated with Rep-A and gp-A.

This complex now drives through the RF DNA in the 3’--->5’ direction on the (-) strand, because Rep-A, a helicase and motor protein, moves in the 3’ to 5’ direction.

As the other components are associated with the helicase, they move along with the motor protein.

DNA pol-III uses the 3’G-OH as the primer end and extends the chain in complementary fashion without interruption. The (+) strand synthesis is thus like continuous strand synthesis in E. coli.

As the complex progresses, the ds replicative form unwinds ahead of the fork.

The unwinding is facilitated by the action of topoisomerase-II (gyrase) ahead of the fork.

The (+) strand, whose 5’ end is bound to gp-A, peels off in the form of a loop.

When the replication complex has travelled the entire length and encounters the newly synthesized (+) origin sequence, gp-A, which is still bound to the 5’ end of the old (+) strand, performs three exquisite reactions.

First, it cuts the newly made (+) strand at exactly the same site, between the same nucleotides (5’G and 3’A), where the first cut was made, again producing a G with a 3’OH group.

Second, the 5’P-A end held covalently by the first tyrosine is transferred to the freshly cut 3’G-OH and ligated to it. The old (+) strand is thus converted into circular ssDNA.

Third, while the 5’P-A end is being joined to the 3’G-OH of the same strand (which becomes circular), the 5’P of the newly made strand is covalently attached to another tyrosine residue located in the same catalytic site of the enzyme.

Rep-A and the holoenzyme associated with the (-) strand continue to produce the complementary copy (i.e. the + strand).

With each further round of replication one more circular ssDNA is released, and replication progresses relentlessly.

In this way any number of copies can be produced.
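
The cycle of nicking, displacement and recircularization described above can be mimicked with strings. This is a toy model under loud assumptions: `PLUS` is a made-up 10-nucleotide circle with a single G|A site, strand polarity is ignored, and the function names are hypothetical. It only shows that every round releases a strand that starts with A, ends with G and closes into the same circle as the original (+) strand.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base complement; strand polarity is ignored in this toy model."""
    return "".join(PAIR[base] for base in strand)


def open_circle(circle, cut):
    """Cut a circular strand just before index `cut` and return it as a linear strand."""
    return circle[cut:] + circle[:cut]


def same_circle(a, b):
    """True if two strings are rotations of each other, i.e. the same circular sequence."""
    return len(a) == len(b) and a in b + b


def rolling_circle(plus, cut, rounds):
    """Return the (+) strands released over `rounds` turns of the toy rolling circle."""
    minus = complement(plus)                # (-) strand of the RF, the template throughout
    released = []
    for _ in range(rounds):
        displaced = open_circle(plus, cut)  # old (+) strand, 5' A ... 3' G
        plus = complement(minus)            # new (+) strand copied from the (-) template
        released.append(displaced)          # gp-A would join its two ends into a circle
    return released


PLUS = "TTCGATGGCA"  # made-up circular (+) strand with one G|A site
CUT = 4              # index of the A that follows the G
copies = rolling_circle(PLUS, CUT, rounds=3)
print(copies[0])                                 # ATGGCATTCG
print(all(same_circle(c, PLUS) for c in copies)) # True
```

Because the (-) template never changes, the number of rounds is the only limit on how many identical (+) circles are produced.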

This process is augmented as the + ssDNA produced is itself replicated to form RF DNA. The RF DNA immediately undergoes supercoiling. This is the stable form of the DNA.









The RF DNA builds up in number, so that ds RF DNA and ssDNA are found together in the same cytoplasm.

The RF DNA is transcribed to generate a few polycistronic mRNAs. It is important to note that the strand used as the template for transcription is the (-) strand, not the (+) strand.

Generation of the RF is very important, for it is this form that is stable and capable of transcription.

Transcription and translation lead to the production of viral particles.

The viral proteins required for assembly and production of viral particles are gp-F, G, B, H and D; they assemble in sequence to produce a pro-head, a structure that is not yet fully formed conformationally.

Another protein that plays a key role is gp-C, which binds to the cut 5’ end of the (+) DNA while, by the rolling circle mode, the (+) strand is peeled off from the (-) strand.

With gp-C bound to the 5’ end, and while the entire length unwinds, another protein called gp-J associates with the (+) strand all along its length.

This leads to the packaging of the single-stranded DNA into the not yet fully formed pro-head.

Once the DNA is packaged into the pro-head, conformational changes are induced and the final form of the phage particle is produced. Cell lysis leads to the release of the viruses.