
Pól Ó Catnaigh's PhD Summary


This section has been cut and pasted directly from my thesis and then formatted for this site.

N.B. as addressed under 'caveats', GO term analyses are time dependent due to regular changes in the GO term databases. For example, the ‘poly(A) RNA binding’ GO term (GO:0044822) was deleted on 26/01/17, after my thesis was submitted.

7 Summary and conclusions


The idea that there are two UNR genes in humans was suggested by Jeffers et al. (1990). No subsequent evidence was forthcoming to substantiate that claim. It was shown here that a region on chromosome 10 had a similar degree of identity to Jeffers’ probe as did the equivalent region on chromosome 1 that overlapped the human UNR gene. It is suggested here that Jeffers had mistakenly inferred that the region on chromosome 10 was a second UNR gene (section 1.1.5). A search of the human genome failed to locate a second UNR gene but did locate a previously discovered partial processed pseudogene of UNR on chromosome 5 (over 600 base pairs) and a stretch of over 300 base pairs on chromosome 7 that was almost 90% identical to a part of the UNR gene comprising a single exon (section 1.1.5).


It was noted that a disproportionately high number of arginine residues from the UNR protein are recorded as having at least one non-silent mutation in the COSMIC database (section 1.6.1).


Whilst it is not a novel finding, per se, the homemade ECL solution recipes stated in section 2.2.7 offer other labs the potential to save money compared to buying commercial ECL solutions. Whilst some optimisation was carried out within the lab, most credit is due to Dr Andrew Turnell for the suggestion and to Haan & Behrmann (2007) for the original research.


It was shown that UNR levels fall in cultured HeLa cells as they become more confluent (section 3.1.3) but this relationship is reversed in cultured U2OS cells (section 3.3.2).


It was shown that UNR and TP53 colocalise within stress granules in HeLa cells at one or two hours post-treatment with 1 mM sodium arsenite (section 3.1.6).


UNR levels were shown to be lower in 50%- or 70%-confluent U2OS cells that had been treated with 1 mM sodium arsenite for one hour compared to similar cells that were mock-treated with sterile PBS. This mirrored a similar reduction in TP53 levels (section 3.3.2).


Among the most reproducible novel UNR-interacting proteins discovered by RIP-mass spectrometry were the ubiquitin E3 ligase, HUWE1; the nucleolar protein, NARR; the multifunctional protein, SQSTM1; and the erythropoiesis-related protein, LDB1. Of these, HUWE1 was validated by Western blot and immunofluorescence microscopy and SQSTM1 was validated by Western blot. Each of these proteins would be an interesting topic for future research. HUWE1 targets proteins involved in many processes, including cellular proliferation (e.g. MYC, which represses UNR transcription and has its own translation promoted by UNR), and promotes restart at stalled replication forks (Choe et al., 2016). Whilst the functional purpose of an interaction between UNR and HUWE1 is currently unknown, it is possible that UNR modulates HUWE1 function. Among other things, this would add an additional layer of complexity to the UNR-MYC relationship. As discussed below, UNR appears to regulate protein expression on multiple levels. It is further possible that interaction between UNR and a ubiquitin E3 ligase, such as HUWE1, could implicate UNR in protein turnover. Little work has been published on the NARR protein, which is the product of an alternative reading frame of an alternatively spliced RAB34 transcript. It has been shown to be a nucleolar protein that is heavily phosphorylated during M phase (Zougman et al., 2011). UNR levels spike during M phase and this has been linked to the control of the cell cycle. It would be interesting to see if NARR and UNR play a joint role in the regulation of mitosis. SQSTM1 has been shown to be involved in a number of interesting physiological pathways and pathologies, including autophagy (Katsuragi et al., 2015), tumour metastasis (Qiang et al., 2014) and Alzheimer’s disease (Salminen et al., 2012). As UNR has also been reported to influence many of the same pathways as SQSTM1, it would be interesting to investigate whether or not the UNR-SQSTM1 interaction is important with respect to these processes. LDB1 has been shown to be involved in insulin expression (Hunter et al., 2013) and erythropoiesis (Li et al., 2010). UNR was shown to be differentially expressed in patients with uncontrolled diabetes (Xavier et al., 2014) and work by the von Lindern group in Holland has discovered links between UNR and Diamond Blackfan anaemia (Horos et al., 2012). As LDB1 influences transcription by LIM domain-containing transcription factors, an interaction between UNR and LDB1 could lead to UNR affecting pathways such as erythropoiesis and pathologies such as diabetes at the transcriptional level.


Groups of statistically significant putative UNR-interacting proteins from each cell type, either arsenite stressed or unstressed, were subjected to gene ontology overrepresentation analysis. The most overrepresented biological process GO term for all conditions was ‘mRNA metabolic process’ and the top two most overrepresented molecular function GO terms were ‘RNA binding’ and ‘poly(A) RNA binding’ (section 4.13). It was noted that ‘ribonucleoprotein complex’ was a recurring highly overrepresented cellular component GO term (section 4.13). It is also suggested that UNR may be involved in selenium metabolism which was considered interesting in light of the findings of Xavier et al. (2014) (sections 4.13 and 1.7). Some evidence was found that UNR may bind to a number of proteins that are located at adherens junctions (e.g. section 4.10.5). This was interesting, given that some immunofluorescence microscopy evidence suggested that UNR can be found concentrated at cell-cell junctions under certain conditions (section 4.12.1). The adherens junction cellular component GO term was also significantly overrepresented among proteins that were upregulated in siUNR-treated HeLa cells or either unstressed or arsenite-stressed siUNR-treated U2OS cells (sections 6.4.1, 6.6.5 and 6.8.1). It was also overrepresented among proteins that were downregulated in either unstressed or arsenite-stressed siUNR-treated U2OS cells (sections 6.6.7 and 6.8.3). These findings suggested that UNR can both interact with proteins at adherens junctions and also modulate the expression levels of proteins at adherens junctions, making UNR a potential key player in the control of the adherens junction. This is suggested as a potential avenue of future research.


RNA was then extracted from a portion of the RIP samples and sequenced in an attempt to identify and quantify UNR-interacting transcripts. PCA was used to validate samples and remove outliers. DESeq2 was then used to find significant UNR-interacting transcripts. The significant hits were then entered into a GO term overrepresentation analysis tool to explore whether or not UNR interacts with specific groups of transcripts.
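
The significance calls described here rest on p-values adjusted for multiple testing. DESeq2's `results` function applies the Benjamini-Hochberg procedure by default; whether that default was kept for this work is not stated in this summary. The following Python sketch only illustrates how that procedure works. The function name and the example p-values are hypothetical and are not taken from the thesis.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative sketch only; this is not the thesis analysis code.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotonic (each is the minimum of itself and those above it).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four transcripts.
example = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(example))
```

Each raw p-value is scaled by the number of tests divided by its rank, which is why a hit needs a smaller raw p-value to stay significant as the number of tested transcripts grows.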


Multiple novel UNR-interacting transcripts are suggested in chapter 5, although there was little identity between the transcripts flagged up as binding to UNR in each of the three cell types. Notwithstanding that observation, some highly overrepresented GO terms were generated using DESeq2-suggested UNR-interacting transcripts merged from arsenite-stressed and unstressed SaOS-2 cells and from merged arsenite-stressed and unstressed cells from all three cell lines.


Among the top hits for SaOS-2 were ‘RNA processing’, ‘RNA binding’ and ‘poly(A) RNA binding’ (adjusted p-values ranging from 6.52 × 10⁻¹¹ to 3.12 × 10⁻¹⁰). The cellular component GO term ‘nuclear part’ had an adjusted p-value of 6.79 × 10⁻²¹ (section 5.6.2). Among all cell types, the most overrepresented biological process GO term was ‘nucleobase-containing compound metabolic process’ (adjusted p-value = 1.97 × 10⁻¹¹). Other top hits included ‘chromosome organization’ and ‘chromatin modification’ (adjusted p-values = 6.52 × 10⁻⁹ and 1.07 × 10⁻⁷, respectively). Among the most overrepresented molecular function GO terms were ‘RNA binding’ (adjusted p-value = 2.51 × 10⁻⁹) and ‘poly(A) RNA binding’ (adjusted p-value = 1.08 × 10⁻⁶). The most overrepresented cellular component GO term was ‘nuclear part’ with an adjusted p-value of 4.51 × 10⁻²³ (section 5.7.3). This evidence suggests that UNR interacts with transcripts that encode proteins that are involved in RNA- and nuclear-related processes. It is interesting to consider that UNR interacts with both proteins and transcripts that encode proteins involved in processes such as poly(A) RNA binding. This implies a potential fundamental role for UNR in the control of protein expression, potentially at the transcriptional, translational and post-translational levels.


A large number of proteins that are differentially regulated following siUNR-treatment in HeLa and U2OS cells (the latter with or without arsenite stress prior to harvesting) are presented in chapter 6. Akin to the analysis carried out with data from chapters 4 and 5, GO term overrepresentation analysis was carried out using these proteins.


Among the most overrepresented biological process GO terms generated using proteins upregulated in siUNR-treated HeLa cells were ‘posttranscriptional regulation of gene expression’ and terms related to viral processes. Twenty-six proteins annotated with ‘poly(A) RNA binding’ were observed, whereas only 7.07 were expected (adjusted p-value = 2.03 × 10⁻⁵). ‘Extracellular exosome’ and, as stated above, ‘adherens junction’ were among the most overrepresented cellular component GO terms (section 6.4.1). Among those proteins downregulated by siUNR treatment in HeLa cells, the most overrepresented biological process GO term was ‘oxidative phosphorylation’. The top molecular function GO term was ‘protein binding involved in cell-cell adhesion’ and others included ‘RNA binding’ and ‘poly(A) RNA binding’. Mitochondrion-associated GO terms were among the most overrepresented cellular component GO terms (section 6.4.3). These findings further corroborate the hypothesis that UNR has a fundamental role to play in the overall control of protein synthesis. They also suggest that UNR may play a role in cellular respiration.
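
To make the observed-versus-expected counts above concrete, the sketch below shows how an overrepresentation test of this kind can be computed with a one-sided hypergeometric test. It is an illustration under stated assumptions, not the GO tool that was used: the background size, annotated-gene count and input list size are hypothetical values chosen only so that the expected count comes out at 7.07, and the resulting raw p-value is not comparable with the adjusted p-value reported in the thesis.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n proteins are drawn without replacement from a
    background of N proteins, K of which carry the GO term."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment(k, N, K, n):
    """Return (expected count, fold enrichment, raw one-sided p-value)."""
    expected = n * K / N
    return expected, k / expected, hypergeom_sf(k, N, K, n)


# Hypothetical inputs: 20,000 background proteins, 1,414 annotated with
# the term, 100 proteins in the input list, 26 of them annotated.
expected, fold, p = enrichment(k=26, N=20000, K=1414, n=100)
print(expected, fold, p)
```

The expected count is simply the input list size multiplied by the fraction of the background that carries the term, so 26 observed against 7.07 expected corresponds to a fold enrichment of roughly 3.7.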


The two most overrepresented biological process GO terms generated using proteins upregulated in unstressed siUNR-treated U2OS cells were ‘mRNA processing’ and ‘RNA splicing’. ‘Poly(A) RNA binding’ was the top molecular function GO term with an adjusted p-value of 5.11 × 10⁻¹². ‘Adherens junction’ was among the most overrepresented cellular component GO terms, with 19 proteins observed against 3.26 expected (section 6.6.5).


Among the proteins downregulated by siUNR treatment in unstressed U2OS cells, there were no overrepresented biological process GO terms with adjusted p-values <0.01 and only two very general overrepresented molecular function GO terms. ‘Anchoring junction’ and ‘extracellular exosome’ were the most overrepresented cellular component GO terms (section 6.6.7).


No biological process or molecular function GO terms were significantly overrepresented among proteins upregulated in siUNR-treated U2OS cells that were arsenite-stressed prior to being harvested. There were four significant cellular component GO terms, all of which had adjusted p-values >0.01, of which the most significant was ‘adherens junction’ (section 6.8.1).


A large number of proteins were significantly downregulated on siUNR treatment in U2OS cells that were stressed with arsenite prior to harvesting; 86 were entered into the GO tool (section 6.8.3). These yielded a large number of significantly overrepresented GO terms. The most significant biological process GO terms included terms related to translation and protein targeting. Another top ten hit by adjusted p-value was ‘viral transcription’. ‘RNA binding’ (adjusted p-value = 2.79 × 10⁻¹⁷) and ‘poly(A) RNA binding’ (adjusted p-value = 5.69 × 10⁻¹⁴) were the most overrepresented molecular function GO terms by p-value. ‘Cytosolic ribosome’ (adjusted p-value = 1.78 × 10⁻¹⁶) and ‘adherens junction’ (adjusted p-value = 4.25 × 10⁻¹⁵) were the most overrepresented cellular component GO terms by p-value (section 6.8.3).


It was found that the method used to detect significant proteins among those discovered by mass spectrometry had been quite stringent as, when the criteria were relaxed, the p-values associated with significantly overrepresented GO terms became much stronger. The same was not true when putatively non-significant genes were added (section 6.9).
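
One way to picture this observation is with a toy hypergeometric calculation. The sketch below is not the analysis from section 6.9; all of the counts are hypothetical. It compares a baseline input list with a relaxed list that keeps the same proportion of annotated proteins, and with a list padded out by proteins annotated only at the background rate.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for n draws without replacement from N items, K annotated."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Hypothetical background: 20,000 proteins, 1,000 (5%) carrying the GO term.
N, K = 20000, 1000

# Baseline list: 40 proteins, 10 annotated (25%).
p_base = hypergeom_sf(10, N, K, 40)

# Relaxed criteria: list doubles to 80 and the annotated fraction holds at 25%.
p_relaxed = hypergeom_sf(20, N, K, 80)

# Padding: 40 extra proteins annotated at the 5% background rate,
# adding only 2 annotated proteins (12 of 80 in total).
p_padded = hypergeom_sf(12, N, K, 80)

print(p_base, p_relaxed, p_padded)
```

Under these assumed numbers the relaxed list gives a smaller p-value than the baseline, while the padded list gives a larger one, which is the pattern expected if the proteins admitted by relaxed criteria carry genuine signal and the non-significant ones do not.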


This work has produced evidence that supports a role for UNR as a potential master regulator of protein expression. The exact cellular function(s) of UNR remain(s) to be discovered but it is hoped that this work will both widen the current knowledge of UNR and, potentially, help guide future researchers both in the UNR field and beyond.


7.1 Caveats


It is considered useful to conclude with a number of caveats. Firstly, mass spectrometry data can be biased in terms of differential detectability of processed tryptic digest peptides (Fricker, 2015). The peptides that were detected here were fed into software that utilises databases, both of which are constantly changing and, thereby, changing the probabilities associated with protein detection. For example, the Scaffold report from section 4.3.1 mentions the use of version 4.5.3 that used Mascot version 2.5.0 and the human UniProt database from 18th June 2015. As of 17th May 2017, Scaffold is on version 4.7.5, Mascot is on 2.6 and UniProt last updated its human proteome data on 25th March 2017. Ultimately, a parsimony-based approach was used to assign peptides to proteins and this is another potential source of error (Cottrell, 2011). Likewise, RNA-Seq has many limitations, from sequencing depth to similar issues surrounding databases. Finally, gene ontology databases also change over time, both with respect to the number of terms and the group of terms annotated to each individual protein (Khatri & Draghici, 2005). Possible errors are therefore magnified, as false positives are likely to be present in the input data and some true hits will have been rejected as false negatives. This could potentially hide true associations or cause false relationships to be suggested. This is expected to have been particularly pronounced with the RNA-Seq data as it was not known whether a putative interaction with UNR would lead to a transcript being expressed at a greater rate or translationally repressed. Furthermore, the putative UNR-interacting transcripts were fed into a database that ultimately looked at protein attributes (biological process, molecular function and cellular component). It is hoped that a deeper exploration of the knockdown data provided in Chapter 6 could be used to strengthen future analysis of the RNA-Seq data from Chapter 5.



References


Choe, K. N., Nicolae, C. M., Constantin, D., Imamura Kawasawa, Y., Delgado-Diaz, M. R., De, S., Freire, R., Smits, V. A. & Moldovan, G.-L. (2016). HUWE1 interacts with PCNA to alleviate replication stress. EMBO Rep e201541685. EMBO Press.

Cottrell, J. S. (2011). Protein identification using MS/MS data. J Proteomics 74, 1842–1851.

Fricker, L. D. (2015). Limitations of Mass Spectrometry-Based Peptidomic Approaches. J Am Soc Mass Spectrom 26, 1981–1991. Springer US.

Haan, C. & Behrmann, I. (2007). A cost effective non-commercial ECL-solution for Western blot detections yielding strong signals and low background. J Immunol Methods 318, 11–19.

Horos, R., Ijspeert, H., Pospisilova, D., Sendtner, R., Andrieu-Soler, C., Taskesen, E., Nieradka, A., Cmejla, R., Sendtner, M. & other authors. (2012). Ribosomal deficiencies in Diamond-Blackfan anemia impair translation of transcripts essential for differentiation of murine and human erythroblasts. Blood 119, 262–72.

Hunter, C. S., Dixit, S., Cohen, T., Ediger, B., Wilcox, C., Ferreira, M., Westphal, H., Stein, R. & May, C. L. (2013). Islet α-, β-, and δ-cell development is controlled by the Ldb1 coregulator, acting primarily with the islet-1 transcription factor. Diabetes 62, 875–86.

Jeffers, M., Paciucci, R. & Pellicer, A. (1990). Characterization of unr; a gene closely linked to N-ras. Nucleic Acids Res 18, 4891–9.

Katsuragi, Y., Ichimura, Y. & Komatsu, M. (2015). p62/SQSTM1 functions as a signaling hub and an autophagy adaptor. FEBS J 282, 4672–8.

Khatri, P. & Draghici, S. (2005). Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics 21, 3587–3595. Oxford University Press.

Li, L., Lee, J. Y., Gross, J., Song, S.-H., Dean, A. & Love, P. E. (2010). A requirement for Lim domain binding protein 1 in erythropoiesis. J Exp Med 207, 2543–50.

Qiang, L., Zhao, B., Ming, M., Wang, N., He, T.-C., Hwang, S., Thorburn, A. & He, Y.-Y. (2014). Regulation of cell proliferation and migration by p62 through stabilization of Twist1. Proc Natl Acad Sci U S A 111, 9241–6.

Salminen, A., Kaarniranta, K., Haapasalo, A., Hiltunen, M., Soininen, H. & Alafuzoff, I. (2012). Emerging role of p62/sequestosome-1 in the pathogenesis of Alzheimer’s disease. Prog Neurobiol 96, 87–95.

Xavier, D. J., Takahashi, P., Manoel-Caetano, F. S., Foss-Freitas, M. C., Foss, M. C., Donadi, E. A., Passos, G. A. & Sakamoto-Hojo, E. T. (2014). One-week intervention period led to improvements in glycemic control and reduction in DNA damage levels in patients with type 2 diabetes mellitus. Diabetes Res Clin Pract 105, 356–63.

Zougman, A., Mann, M. & Wisniewski, J. R. (2011). Identification and characterization of a novel ubiquitous nucleolar protein ‘NARR’ encoded by a gene overlapping the rab34 oncogene. Nucleic Acids Res 39, 7103–13.




Dr Pól Roibeárd Ó Catnaigh