Systems Biology: Data Analysis and Modeling



We continue to apply gene expression microarray, metabolomics and other high dimensional data platforms to identify molecular profiles associated with specific breast cancer phenotypes and responsiveness/resistance to selected systemic therapies. Data are obtained from both in vitro and in vivo experimental models and from human breast cancer specimens. For example, we initially optimized a method for the collection and processing of breast core needle biopsies for gene expression profiling (Ellis et al. Clin Cancer Res, 2002). Our early studies implicated several genes and processes (e.g., the unfolded protein response) in acquired antiestrogen resistance (Gu et al. Cancer Res, 2002). Our more recent collaborations with Dr. Louis Weiner’s laboratory (Georgetown University) have enabled us to explore data from functional genomic (RNAi) analyses of our cell models.

Working with our colleagues in bioinformatics (Dr. Yue Wang, Dr. Jianhua Xuan, Computational Bioinformatics and Bioimaging Laboratory (CBIL), Virginia Tech; Dr. Subha Madhavan, Dr. Yuriy Gusev, Innovation Center for Biomedical Informatics (ICBI), Georgetown University), biostatistics (Dr. Aiyi Liu, NICHD; Dr. Edmund Gehan, Georgetown University; Dr. Bruce Trock, Johns Hopkins University), mathematical modeling (Dr. John Tyson, Dr. Bill Baumann, Virginia Tech), and medical oncology (Dr. Claudine Isaacs, Georgetown University; Dr. Mike Dixon, University of Edinburgh; Dr. Minetta Liu, Mayo Clinic), we have continued to develop and apply novel approaches and algorithms for the visualization and analysis of complex multidimensional data sets (see also Dr. Wang's website).

Our approaches to modeling address the specific challenges and opportunities offered by working in high dimensions, which we have described in detail (Clarke et al. Nature Rev Cancer, 2008; Wang et al. Br J Cancer, 2008). The tools that we continue to develop are targeted at specific tasks in data analysis and computational biology. All tools must outperform existing approaches when tested on the same data sets. More recently, we have begun to use mathematical modeling to explore, in more detail, specific components of endocrine resistance signaling in breast cancer (Tyson et al. Nature Rev Cancer, 2011). Our overall approach to the integration of computational and mathematical modeling is described in a recent report (Clarke et al. Horm Mol Biol Clin Invest, 2011). We have reported some of our initial models of components of elasticity (Chen et al. Interface Focus, 2014; Chen et al. FEBS Lett, 2013) and key unfolded protein response components (Parmar et al. Interface Focus, 2013).

Working with the team of bioinformaticians led by Dr. Subha Madhavan at Georgetown University, and following a long engagement in the caBIG community, we collaborated on the development of a powerful new informatics platform, G-DOC (Madhavan et al. Neoplasia, 2011). This work is funded in part by two NCI-funded centers: the In Silico Center of Research Excellence at Georgetown, led by Drs. Clarke and Madhavan, and the Center for Cancer Systems Biology, led by Dr. Clarke.

Selected Publications

Publications on gene expression profiling, data visualization, data analysis, and related areas from our Bioinformatics and Biostatistics Working Group. Many of the tools we have created can be obtained freely from CBIL.

Year 10s

  1. Tian Y, Zhang B, Hoffman EP, Clarke R, Zhang Z, Shih IM, Xuan J, Herrington DM, Wang Y. Knowledge-fused differential dependency network models for detecting significant rewiring in biological networks. BMC Syst Biol. 2014 Jul 24;8(1):87. PubMed
  2. Chen C, Baumann WT, Xing J, Xu L, Clarke R, Tyson JJ. Mathematical models of the transitions between endocrine therapy responsive and resistant states in breast cancer. Interface Focus. 2014 May 7;11(96):20140206. doi: 10.1098/rsif.2014.0206. Print 2014 Jul 6. PubMed
  3. Chen X, Xuan J, Wang C, Shajahan AN, Riggins RB, Clarke R. Reconstruction of transcriptional regulatory networks by stability-based network component analysis. IEEE/ACM Trans Comput Biol Bioinform. 2013 Nov-Dec;10(6):1347-58. doi: 10.1109/TCBB.2012.146. PubMed
  4. Zhang B, Hou X, Yuan X, Shih IeM, Zhang Z, Clarke R, Wang RR, Fu Y, Madhavan S, Wang Y, Yu G. AISAIC: a software suite for accurate identification of significant aberrations in cancers. Bioinformatics. 2014 Feb 1;30(3):431-3. doi: 10.1093/bioinformatics/btt693. Epub 2013 Nov 29. PubMed
  5. Parmar JH, Cook KL, Shajahan-Haq AN, Clarke PA, Tavassoly I, Clarke R, Tyson JJ, Baumann WT. Modelling the effect of GRP78 on anti-oestrogen sensitivity and resistance in breast cancer. Interface Focus. 2013 Aug 6;3(4):20130012. doi: 10.1098/rsfs.2013.0012. PubMed
  6. Shi X, Gu J, Chen X, Shajahan A, Hilakivi-Clarke L, Clarke R, Xuan J. mAPC-GibbsOS: an integrated approach for robust identification of gene regulatory networks. BMC Syst Biol. 2013;7 Suppl 5:S4. doi: 10.1186/1752-0509-7-S5-S4. Epub 2013 Dec 9. PubMed
  7. Chen C, Baumann WT, Clarke R, Tyson JJ. Modeling the estrogen receptor to growth factor receptor signaling switch in human breast cancer cells. FEBS Lett. 2013 Oct 11;587(20):3327-34. doi: 10.1016/j.febslet.2013.08.022. Epub 2013 Aug 28. PubMed
  8. Chen, L., Xuan, J., Riggins, R.B., Wang, Y. & Clarke, R. Identifying protein interaction subnetworks by a bagging Markov random field-based method. Nucleic Acids Res, 2013 Jan;41(2):e42. doi: 10.1093/nar/gks951. Epub 2012 Nov 17. PubMed
  9. Wang, C., Xuan, J., Shih, I.-M., Clarke, R. & Wang, Y. “Regulatory component analysis: a semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge.” Signal Process, doi:10.1016/j.sigpro.2011.11.028, 2012.
  10. Madhavan, S., Gusev, Y., Harris, M., Tanenbaum, D.M., Gauba, R., Bhuvaneshwar, K., Shinohara, A., Rosso, K., Carabet, L.A., Song, S., Riggins, R.B., Dakshanamurthy, S., Wang, Y., Byers, S.W., Clarke, R. & Weiner, L.M. “G-DOC: a systems medicine platform for personalized oncology.” Neoplasia 13: 771-783, 2011.
  11. Tyson, J.J., Baumann, W.T., Chen, C., Verdugo, A., Tavassoly, I., Wang, Y., Weiner, L.M. & Clarke, R. “Dynamic models of estrogen signaling and cell fate in breast cancer cells.” Nature Rev Cancer, 11: 523-532, 2011.
  12. Zhang, B., Tian, Y., Jin, L., Li, H., Shih, I.-M., Madhavan, S., Clarke, R., Hoffman, E.P., Xuan, J., Hilakivi-Clarke, L. & Wang, Y. “DDN: a caBIG™ analytical tool for differential network analysis.” Bioinformatics, 27: 1036-1038, 2011.
  13. Gong, T., Xuan, J., Chen, L., Riggins, R.B., Li, H., Hoffman, E.P., Clarke, R. & Wang, Y. “Motif-guided sparse decomposition of gene expression data for regulatory module identification.” BMC Bioinformatics, 12:82 (doi:10.1186/1471-2105-12-82; 16 pages as published on-line), 2011.
  14. Yu, G., Li, H., Ha, S., Shih, I.-M., Clarke, R., Hoffman, E.P., Madhavan, S., Xuan, J. & Wang, Y. “PUGSVM: a caBIG™ analytical tool for multiclass gene selection and predictive classification.” Bioinformatics, 27: 736-738, 2011.
  15. Wang, C., Xuan, J., Li, H., Wang, Y., Zhan, M., Hoffman, E.P. & Clarke, R. “Knowledge-guided gene ranking by coordinative component analysis.” BMC Bioinformatics, 11:162, (13 pages as published on-line), 2010.
  16. Yu, G., Feng, Y., Miller, D.J., Xuan, J., Hoffman, E.P., Clarke, R. & Wang, Y. “Matched gene selection and committee classifier for molecular classification of heterogeneous diseases.” J Mach Learn Res, 11: 2141-2167, 2010.
  17. Zhang, Y., Xuan, J., de los Reyes, B.G., Clarke, R. & Ressom, H.W. “Reconstruction of gene regulatory modules in cancer cell cycle by multi-source data integration.” PLoS ONE, 5 (4): e10268, 2010.
  18. Chen, L., Xuan, J., Riggins, R.B., Wang, Y., Hoffman, E.P. & Clarke, R. “Multi-level support vector regression analysis to identify condition-specific regulatory networks.” Bioinformatics, 26: 1416-1422, 2010.
  19. Zhang, B., Li, H., Clarke, R. & Wang, Y. “Differential dependency network analysis to identify topological rewiring in biological networks.” In: “Medical Biostatistics for Complex Diseases.” Eds. Emmert-Streib, F. & Dehmer, M., Wiley-VCH, Berlin, Germany, pp. 185-203, 2010.
  20. Xuan, J., Wang, Y., Hoffman, E. & Clarke, R. “Cross phenotype normalization of microarray data.” Front Biosci, E2: 171-186, 2010.

Year 00s

  1. Gong, T., Xuan, J., Riggins, R.B. & Clarke, R. “A systems biology approach to identify affected regulatory and signaling circuits in protein interaction networks.” Proc Int Conf Bioinf, Sys Biol, Intell Comp, 297-300, 2009.
  2. Chen, L., Xuan, J., Wang, C., Shih, I.-M., Wang, T.-L., Zhang, Z., Clarke, R., Hoffman, E.P. & Wang, Y. “Biomarker identification by knowledge driven multilevel ICA and motif analysis.” Int J Data Mining Bioinformatics, 3: 365-381, 2009.
  3. Zhang, Y., Xuan, J., de los Reyes, B.G., Clarke, R. & Ressom, H.W. “Reverse engineering module networks by PSO-RNN hybrid modeling.” BMC Genomics, 10: S15 (10 pages as published on-line), 2009.
  4. Chen, L., Xuan, J., Wang, Y., Hoffman, E.P., Riggins, R.B. & Clarke, R. “Identification of condition-specific regulatory modules through multi-level motif and mRNA expression analysis.” Int J Comp Biology Drug Design, 2(1): 1-20, 2009.
  5. Clarke, R., Shajahan, A.N., Riggins, R.B., Cho, Y., Crawford, A., Xuan, J., Wang, Y., Zwart, A., Nehra, R. & Liu, M.C. “Gene network signaling in hormone responsiveness modifies apoptosis and autophagy in breast cancer cells.” J Steroid Biochem Mol Biol, 114: 8-20, 2009. PubMed 
  6. Zhang, B., Li, H., Riggins, R.B., Zhan, M., Xuan, J., Zhang, Z., Hoffman, E.P., Clarke, R., Wang, Y., “Differential dependency network analysis to identify condition-specific topological changes in biological networks” Bioinformatics, 25: 526-532, 2009. PubMed 
  7. Olivo, S., Zhu, Y., Lee, R.Y., Cabanes, A., Khan, G., Zwart, A., Wang, Y, Clarke, R., Hilakivi-Clarke, L.A., “Identification of gene signaling pathways mediating the opposite effects of prepubertal low and high fat n-3 PUFA diets on breast cancer risk.” Cancer Prev Res, 1: 522-531, 2008. PubMed
  8. Zhu, Y., Li, H., Miller, D., Wang, Z., Xuan, J., Clarke, R., Hoffman, E.P. & Wang, Y., “caBIG VISDA: modeling, visualization and discovery for cluster analysis of genomic data.” BMC Bioinformatics, 9: 383 (18 pages as published on-line), 2008. PubMed
  9. Chen, L., Xuan, J., Wang, Y., Riggins, R.B., Clarke, R. “Network-constrained support vector machine for classification.” 7th Intl Conf Machine Learning Applications (ICMLA), 60-65, 2008.
  10. Zhang, Y., Xuan, J., de los Reyes, B.G., Clarke, R. & Ressom, H.W. “Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data.” BMC Bioinformatics, 9:203 (18 pages as published on-line), 2008. PubMed
  11. Wang, C., Xuan, J., Chen, L., Zhao, P., Wang, Y., Clarke, R. & Hoffman, E. “Integrative network component analysis for regulatory network reconstruction.” Proc. 4th Intl Symp Bioinformatics Res Applications (ISBRA), in press, 2008.
  12. Wang, Y., Miller, D.J. & Clarke, R. “Approaches to working in high dimensional data spaces: gene expression microarrays.” Br J Cancer, 98: 1023-1028, 2008. PubMed
  13. Zhu, Y., Wang, Z., Miller, D.J., Clarke, R., Xuan, J., Hoffman, E.P. & Wang, Y. “A ground truth based comparative study on clustering of gene expression data.” Front Biosci, 13: 3839-3849, 2008. PubMed
  14. Clarke, R., Ressom, H., Wang, A., Xuan, J., Liu, M.C., Gehan, E. & Wang, Y. “The properties of high-dimensional data spaces: implications for exploring gene and protein expression data.” Nature Rev Cancer, 8: 37-49, 2008. PubMed
  15. Wang, C., Chen, L., Zhao, P., Hoffman, E., Clarke, R., Wang, Y. & Xuan, J. “Motif-directed component analysis for regulatory network inference.” BMC Bioinformatics, 9: S21 (9 pages as published on-line), 2008. PubMed
  16. Ressom, H.W., Varghese, R.S., Zhang, Z., Xuan, J. & Clarke, R. “Classification algorithms for phenotype prediction in genomics and proteomics.” Front Biosci, 13: 691-708, 2008. PubMed
  17. Gomez, B.P. Riggins, R.B., Shajahan A., Klimach, U., Zhu, Y., Zwart, A., Wang, M., Wang, A. & Clarke, R. “Human X-box binding protein-1 confers both estrogen-independence and antiestrogen resistance in breast cancer.” FASEB J, 21:4013-27, 2007. PubMed
  18. Gong, T., Xuan, J., Wang, C., Li, H., Hoffman, E., Clarke, R. & Wang, Y. “Gene module identification from microarray data using nonnegative independent component analysis.” Gene Regulat Sys Biol, 1: 349-363, 2007. Gene Regulat Sys Biol
  19. Xuan, J., Wang, Y., Dong, Y., Feng, Y., Wang, B., Khan, J., Bakay, M., Wang, Z., Pachman, L., Winokur, S., Chen, Y.-W., Clarke, R., & Hoffman E. “Gene selection for multiclass prediction by weighted Fisher criterion.” EURASIP J Bioinformat System Biol, 2007: Article ID 64628, 15 pages, 2007. PubMed
  20. Xuan, J., Wang, Y., Clarke, R. & Hoffman, E., “An iterative nonlinear regression method for microarray data normalization,” Open Appl Informatics J, 1: 11-19, 2007. 
  21. Wang, J., Li, H., Zhu, Y., Yousef, M., Nebozhyn, M., Showe, N., Xuan, J., Clarke, R. & Wang, Y. “VISDA: an open-source caBIG™ analytical tool for data clustering and beyond.” Bioinformatics, 23: 2024-2027, 2007. PubMed
  22. Ressom, H.W., Zhang, Y., Xuan, J., Wang, Y. & Clarke, R. "Inferring network interactions using recurrent neural networks and swarm intelligence," Proc 28th IEEE EMBS Intl Conf, pp. 4241-4244, 2006. PubMed
  23. Gong, T., Zhu, Y., Xuan, J., Li, H., Clarke, R., Hoffman, E.P. & Wang, Y. “Latent variable and nICA modeling of pathway gene module composite.” Proc 28th IEEE EMBS Intl Conf, pp. 5872-5875, 2006. PubMed
  24. Feng, Y., Wang, Z., Zhu, Y., Xuan, J., Miller, D., Clarke, R., Hoffman, E.P. & Wang, Y. “Learning the tree of phenotypes using genomic data and VISDA.” 6th IEEE Symp Bioinf Bioeng (BIBE´06), 165-170, 2006.
  25. Ressom, H.W., Zhang, Y., Xuan, J., Wang, Y. & Clarke, R. “Inference of gene regulatory networks from time course gene expression data using neural networks and swarm intelligence.” IEEE Symp Compl Intel Bioinf Comput Biol, 435-442, 2006.
  26. Wang, Z., Wang, Y., Xuan, J., Dong, Y., Bakay, M., Khan, J., Clarke, R. & Hoffman, E.P. “Optimized multilayer perceptrons for molecular classification and diagnosis using genomic data.” Bioinformatics, 22: 755-761, 2006. PubMed
  27. Zhu, Y., Wang, A., Liu, M.C., Zwart, A., Lee, R.Y., Gallagher, A., Wang, Y., Miller, W.R., Dixon, J.M. & Clarke, R. “Estrogen receptor alpha (ER) positive breast tumors and breast cancer cell lines share similarities in their transcriptome data structures.” Int J Oncol, 29: 1581-1589, 2006. PubMed
  28. Zhang, J., Huang, K., Khan, J., Li, K., Bhujwalla, Z., Clarke, R., Gu, Z., Szabo, Z., Xuan, J., and Wang, Y. Computational decomposition of molecular signatures. Proc. Intl. Conf. Diagnostic Imaging and Analysis. IEEE Press, in press.
  29. Xuan, J., Dong, Y., Khan, J., Hoffman, E., Clarke, R. & Wang, Y. “Robust feature selection by weighted Fisher criterion for multiclass prediction in gene expression profiling.” Proc Intl Conf Pattern Recon, in press.
  30. Espinoza, L. A., Li, P., Lee, R., Wang, Y., Boulares, A. H., Clarke, R., and Smulson, M. E. "Evaluation of gene expression profiles of keratinocytes in response to JP-8 jet fuel," Toxicol Appl Pharmacol, 200:93-102, 2004.
  31. Zhang, J., Wang, Y., Khan, J. & Clarke, R. "Gene selection in class space for molecular classification of cancer." Science in China Series F-Information Sciences, 47:301-314, 2004.
  32. Wang, Z., Zhang, J., Lu, J., Lee, R., Kung, S.-Y., Clarke R. & Wang Y. Discriminatory mining of gene expression microarray data. J VLSI Signal Process Sys, 35:255-272, 2003. Link to Journal
  33. Liu, A., Zhang, Y., Gehan, E. & Clarke, R. Block principal components analysis with application to gene microarray data classification. Stat Med, 21: 3465-3474, 2002. Link to Abstract (at Statistics in Medicine).
  34. Zhang, J., Huang, K., Khan, J., Li, K., Bhujwalla, Z., Clarke, R., Gu, Z., Szabo, Z., Xuan, J, & Wang, Y. Computational decomposition of molecular signatures. Proc Intl Conf Diagnostic Imaging Analysis, in press. Link to IEEE Publications
  35. Ellis, M., Davis, N., Coop, A., Liu, M., Schumaker, L., Lee, R.Y., Srikanchana, R., Russell, C., Singh, B., Miller, W.R., Stearns, V., Pennanen, M., Tsangaris, T., Gallagher, A., Liu, A., Zwart, A., Hayes, D.F., Lippman, M.E., Wang, Y. & Clarke, R. Development and validation of a method for using breast core needle biopsies for gene expression microarray analyses. Clin Cancer Res, 8: 1155-1166, 2002. Link to Abstract (at Clinical Cancer Research).
  36. Gu, Z., Lee, R.Y., Skaar, T.C., Bouker, K.B., Welch, J.N., Lu, J., Liu, A., Davis, N., Leonessa, F., Brünner, N., Wang, Y. & Clarke, R. Association of interferon regulatory factor-1, nucleophosmin, nuclear factor kappa-B and cAMP response element binding with acquired resistance to Faslodex (ICI 182,780). Cancer Res, 62: 3428-3437, 2002. Link to Abstract (at Cancer Research).
  37. Wang, Y., Lu, J., Lee, R. & Clarke, R. Iterative normalization of cDNA microarray data. IEEE Trans Inf Technol Biomed, 6: 29-37, 2002. Link to Abstract (PubMed).
  38. Lu, J., Wang, Y., Xuan, J., Kung, S.Y., Gu, Z. & Clarke, R. Discriminative analysis of gene microarray data. Proc IEEE Neural Netw Signal Process, 11: 218-227, 2001. Link to IEEE Publications.

Year 90s

  1. Skaar, T.C., Prasad, S.C., Sharareh, S., Brünner, N., Lippman, M.E. & Clarke, R. Two-dimensional gel electrophoresis analyses identify nucleophosmin as an estrogen regulated protein associated with acquired estrogen-independence in human breast cancer cells. J Steroid Biochem Mol Biol, 67:391-402, 1998. Link to Abstract (PubMed).