From tglaser@umich.edu Tue Jun 8 14:43:09 2004
Date: Wed, 02 Jun 2004 18:00:32 -0400
From: Tom Glaser
To: Dan Burns
Cc: Miriam Meisler
Subject: Re: Ribosomal proteins

Dear Dan,

Sure, I remember you, and am glad to hear from you. I think this is really an interesting topic. Unfortunately, I missed Hopkins' talk but telephoned her afterwards and exchanged some papers. It turns out her work was published this month in a new "open-access" journal, PLoS Biology (Public Library of Science):

Amsterdam A, Sadler KC, Lai K, Farrington S, Bronson RT, Lees JA and Hopkins N (2004) Many ribosomal protein genes are cancer genes in zebrafish. PLoS Biol 2: E139.

We have generated several hundred lines of zebrafish (Danio rerio), each heterozygous for a recessive embryonic lethal mutation. Since many tumor suppressor genes are recessive lethals, we screened our colony for lines that display early mortality and/or gross evidence of tumors. We identified 12 lines with elevated cancer incidence. Fish from these lines develop malignant peripheral nerve sheath tumors, and in some cases also other tumor types, with moderate to very high frequencies. Surprisingly, 11 of the 12 lines were each heterozygous for a mutation in a different ribosomal protein (RP) gene, while one line was heterozygous for a mutation in a zebrafish paralog of the human and mouse tumor suppressor gene, neurofibromatosis type 2. Our findings suggest that many RP genes may act as haploinsufficient tumor suppressors in fish. Many RP genes might also be cancer genes in humans, where their role in tumorigenesis could easily have escaped detection up to now.

(2004) Defects in Ribosomal Protein Genes Cause Cancer in Zebrafish. PLoS Biol 2: E159.

It is a fascinating story, and quite mysterious. In one tumor in their paper, at least, the mutated copy of the RP gene was lost, as if the initial imbalance initiated tumor formation and then secondary mutations allowed the tumor to grow.
Generally speaking, heterozygous RP mutations cause cells in a somatic clone to grow slower, and thus to be negatively selected compared to surrounding 'normal' cells. Quite the opposite of Hopkins' result.

There has been some discussion in the literature regarding "extraribosomal" functions of ribosomal genes but, as you say, this has mostly been the province of serendipity:

Wool IG (1996) Extraribosomal functions of ribosomal proteins. Trends Biochem Sci 21: 164-165.

The discovery of DNA-binding motifs in ribosomal proteins has led to the conjecture that the transition of the ribosome from an RNA to an RNP machine occurred by adding pre-existing proteins. Supportive, but circumstantial, evidence for the hypothesis is adduced from the finding that many ribosomal proteins have a second function apart from the particle. These extraribosomal functions are enumerated.

Ardini E, Pesole G, Tagliabue E, Magnifico A, Castronovo V, Sobel ME, Colnaghi MI and Menard S (1998) The 67-kDa laminin receptor originated from a ribosomal protein that acquired a dual function during evolution. Mol Biol Evol 15: 1017-1025.

The 67-kDa laminin receptor (67LR) is a nonintegrin cell surface receptor that mediates high-affinity interactions between cells and laminin. Overexpression of this protein in tumor cells has been related to tumor invasion and metastasis. Thus far, only a full-length gene encoding a 37-kDa precursor protein (37LRP) has been isolated. The finding that the cDNA for the 37LRP is virtually identical to a cDNA encoding the ribosomal protein p40 has suggested that 37LRP is actually a component of the translational machinery, with no laminin-binding activity. On the other hand, a peptide of 20 amino acids deduced from the sequence of 37LR/p40 was shown to exhibit high laminin-binding activity. The evolutionary relationship between 23 sequences of 37LRP/p40 proteins was analyzed.
This phylogenetic analysis indicated that all of the protein sequences derive from orthologous genes and that the 37LRP is indeed a ribosomal protein that acquired the novel function of laminin receptor during evolution. The evolutionary analysis of the sequence identified as the laminin-binding site in the human protein suggested that the acquisition of the laminin-binding capability is linked to the palindromic sequence LMWWML, which appeared during evolution concomitantly with laminin.

Draptchinskaia N, Gustavsson P, Andersson B, Pettersson M, Willig TN, Dianzani I, Ball S, Tchernia G, Klar J, Matsson H, Tentler D, Mohandas N, Carlsson B and Dahl N (1999) The gene encoding ribosomal protein S19 is mutated in Diamond-Blackfan anaemia. Nat Genet 21: 169-175.

Diamond-Blackfan anaemia (DBA) is a constitutional erythroblastopenia characterized by absent or decreased erythroid precursors. The disease, previously mapped to human chromosome 19q13, is frequently associated with a variety of malformations. To identify the gene involved in DBA, we cloned the chromosome 19q13 breakpoint in a patient with a reciprocal X;19 chromosome translocation. The breakpoint occurred in the gene encoding ribosomal protein S19. Furthermore, we identified mutations in RPS19 in 10 of 40 unrelated DBA patients, including nonsense, frameshift, splice site and missense mutations, as well as two intragenic deletions. These mutations are associated with clinical features that suggest a function for RPS19 in erythropoiesis and embryogenesis.

I think the laminin receptor example (Ardini et al.) is most striking; both proteins were purified and then the researchers were forced to confront the fact that they had identified the same protein, relying on very different functions.

I don't think there is any way to predict extraribosomal functions a priori, any more than one can guess the function of a newly discovered gene, without mutational data.
As for riboproteins, it is especially hard to know, since homozygotes die early in development (this occurs much earlier in mice [~32-cell stage] than in fish, which have large maternal stores of ribosomes in the egg cytoplasm and can survive for a few days with no zygotic ribosome synthesis). We are studying one such mouse riboprotein, associated with a particular phenotype, and are trying to assess its potential extraribosomal function(s). The S19 riboprotein evidently has an extraribosomal function in hematopoiesis in humans, judging from the Diamond-Blackfan anemia story.

cDNA microarrays (RNA profiling) are one strategy; some have found that expression of different RPs varies from tissue to tissue, and condition to condition. RPs are thought to be co-regulated so as to match the products stoichiometrically, even though the genes are dispersed throughout the genome:

Hariharan N, Kelley DE and Perry RP (1989) Equipotent mouse ribosomal protein promoters have a similar architecture that includes internal sequence elements. Genes Dev 3: 1789-1800.

The promoters of the mouse ribosomal protein genes rpL30, rpL32, and rpS16 are of equal strength, as indicated by in vivo measurements of polymerase loading and by their relative efficiency in driving the expression of a linked reporter gene. The equipotency of these promoters appears to derive from a remarkably similar architecture in which five or more elements are distributed over a 200-bp region that spans a polypyrimidine-embedded cap site. Three trans-acting factors are shared by the rpL30 and rpL32 promoters, one of which, delta, recognizes a common CNGCCATCT motif in the first (untranslated) exons. Site-specific mutagenesis demonstrated that delta-factor binding is critical for rpL30 promoter function. The repeated occurrence of this novel promoter architecture among ribosomal protein genes with very different coding specificities is most readily explained by convergent evolution.

Niehrs C and Pollet N (1999) Synexpression groups in eukaryotes. Nature 402: 483-487.

In 1960, Jacob and Monod described the bacterial operon, a cluster of functionally interacting genes whose expression is tightly coordinated. Global expression analysis has shown that the highly coordinate expression of genes functioning in common processes is also a widespread phenomenon in eukaryotes. These sets of co-regulated genes, or 'synexpression groups', show a striking parallel to the operon, and may be a key determinant facilitating evolutionary change leading to animal diversity.

Another issue is the finding of huge numbers of processed ribosomal protein pseudogenes, cDNA copies that have inserted into the genome and mutated, for the most part. These are believed to be non-functional, and are not found in fish. There may be more RP pseudogenes than any other such class -- they have been used as a case-study (below), and these pseudogenes are widely dispersed:

Zhang Z, Harrison P and Gerstein M (2002) Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome. Genome Res 12: 1466-1482.

Mammals have 79 ribosomal proteins (RP). Using a systematic procedure based on sequence-homology, we have comprehensively identified pseudogenes of these proteins in the human genome. Our assignments are available at http://www.pseudogene.org or http://bioinfo.mbb.yale.edu/genome/pseudogene. In total, we found 2090 processed pseudogenes and 16 duplications of RP genes. In relation to the matching parent protein, each of the processed pseudogenes has an average relative sequence length of 97% and an average sequence identity of 76%. A small number (258) of them do not contain obvious disablements (stop codons or frameshifts) and, therefore, could be mistaken as functional genes, and 178 are disrupted by one or more repetitive elements.
On average, processed pseudogenes have a longer truncation at the 5' end than the 3' end, consistent with the target-primed-reverse-transcription (TPRT) mechanism. Interestingly, on chromosome 16, an RPL26 processed pseudogene was found in the intron region of a functional RPS2 gene. The large-scale distribution of RP pseudogenes throughout the genome appears to result, chiefly, from random insertions with the numbers on each chromosome, consequently, proportional to its size. In contrast to RP genes, the RP pseudogenes have the highest density in GC-intermediate regions (41%-46%) of the genome, with the density pattern being between that of LINEs and Alus. This can be explained by a negative selection theory as we observed that GC-rich RP pseudogenes decay faster in GC-poor regions. Also, we observed a correlation between the number of processed pseudogenes and the GC content of the associated functional gene, i.e., relatively GC-poor RPs have more processed pseudogenes. This ranges from 145 pseudogenes for RPL21 down to 3 pseudogenes for RPL14. We were able to date the RP pseudogenes based on their sequence divergence from present-day RP genes, finding an age distribution similar to that for Alus. The distribution is consistent with a decline in retrotransposition activity in the hominid lineage during the last 40 Myr. We discuss the implications for retrotransposon stability and genome dynamics based on these new findings.

I think this might interfere with direct DNA microarray screening strategies, aimed at finding coordinately destabilized chromatin regions, for example.

For the most part, RP genes are not duplicated in mammals; there are a couple of exceptions -- owing to tandem segmental duplications in the genome (see above) -- but it is not yet known whether both copies are expressed. In most cases, there is a solitary gene for each RP. (Interestingly, most yeast RP genes are duplicated.)
The RP genes are perhaps the most highly expressed Pol II transcription units in the genome.

So it doesn't seem like RP genes are clustered within MAR segments, but that is a neat idea. I was not aware of the SIDD work or website.

Last week, Dave Engelke gave a beautiful talk about tRNA genes in yeast; these are transcribed at high levels via a different RNA polymerase (Pol III), and this occurs within a specialized part of the nucleus -- the nucleolus. Genes located next to tRNA genes are 'pulled into' the nucleolus and effectively 'silenced' as a result, and so location must have been selected for or against during evolution. Similar to your thinking, a crude 'parsing' of sorts. In our experience, though, the functional RP genes are pretty thoroughly mixed in with other genes, and are not obviously set apart. In bacteria, they are clustered into operons.

Anyway, I find this stuff riveting.

Tom

Dan Burns wrote:

Dear Tom,

I don't know if you remember me - I was the old math guy in your genetics class several years ago. I was wondering whether I could bother you with a question which Miriam Meisler suggested I might ask you. It arises from some remarks of Nancy Hopkins in her lecture at the opening of the LSI Building, in particular the finding of her lab that in zebrafish, some ribosomal proteins might be linked to cancers appearing in (mutant) early development via a "secondary" (i.e., non-RP) role for the protein. I don't have a good reference for that since it seemed to be the current topic of her lab and she was speaking about things being prepared for publication as I understood it.

Miriam suggested that you were interested in such "other roles" for RP's, and I was wondering whether you could give me any pointers as to where one could get informed about these roles. My main question is how do people come up with candidates for functions in which RP's play a role?
It is easy to imagine that somebody came across this analyzing a function of interest and working back to an ID of an RP as playing a role, though that is good scientific serendipity, and couldn't be carried out systematically. On the other hand, discovery-based methods, mainly DNA microarrays, would seem to have real problems because the RP's are so pervasive just because of their primary housekeeping role.

I had an idea to try to set up a computational screen for potential gene correlations with RP's, following work of Craig Benham (he works on stress-induced destabilization of DNA duplexes = "SIDD"). The idea out there that Benham and coworkers are following is this: SIDD is correlated with matrix attachment regions in nuclear DNA, and Benham has developed a computational scan for these regions (there is even a web server for this computation, which I just learned today). The "major guess/working hypothesis" is that genomic intervals delimited by attachment regions should "parse" the DNA into correlated (colocalized, certainly, but coregulated? co-expressed?) genes.

My question is: could one look for functional correlations with RP's via this kind of colocation heuristic? In the simplest case, one would be hoping that if the RP had a significantly different function, it might have another gene copy for that function which might be colocated with the genes shared in that function within some matrix attachment segment.

Does this sound like nonsense? Any comments would be appreciated!

Thanks, and best wishes,

Dan Burns

---------- Forwarded message ----------
Date: Tue, 18 May 2004 10:37:55 -0400
From: meislerm@umich.edu
To: Dan Burns
Subject: Re: MMSS

Tom Glaser is interested in ribosomal proteins. When I mentioned it to him, he said that the idea of 'other functions' is definitely out there now, but I didn't ask about cancer specifically.

...

Quoting Dan Burns:

Re: Nancy Hopkins. Yes, the story about asking for so little is touching/breathtaking.
Especially after seeing that her whole gallery of fish the next day, after "cooking" for 10 years, didn't take much space at all, especially for the kind of results she was getting. By the way, is that possible link of ribosomal proteins to cancer new? Do people have anything about the putative (or in this case implied?) other functions of RP's?

...

--
Tom Glaser, MD PhD
University of Michigan Medical Center
4520 MSRB I, Box 0651
1150 West Medical Center Drive
Ann Arbor, MI 48109-0651
(734) 764-4580 (office)
(734) 647-3255 (lab)
(734) 615-9712 (secretary Sherry Taylor)
(734) 763-2162 (fax)
tglaser@umich.edu