EPBI 473: Integrative Cancer Biology (Fall Semester, 2005)


Instructor:    Tomas Radivoyevitch, Ph.D., Assistant Professor

Department of Epidemiology and Biostatistics, School of Medicine, BRB-G19

Case Western Reserve University

Tel: 216-368-1965; Fax: 216-368-1969

e-mail: radivot@hal.cwru.edu ; website: http://epbi-radivot.cwru.edu/

Course website: http://epbi-radivot.cwru.edu/ICB/


Prerequisites: BIOC 407 (general biochemistry), EPBI 432 (introductory statistics).

Required Reading: Introductory Statistics with R (Dalgaard, 2002; Springer); Theoretical Foundations of Cancer Therapy (Jackson, 1992; http://epbi-radivot.cwru.edu/); papers assigned in class.  

***   Please purchase Dalgaard ASAP. It will be covered at a fast pace in the first lecture. *****


Meeting Place and Times: Wednesdays, classroom: NOA 130 (4:00 pm to 6:30 pm).

Other: Cancer Center Seminars, Thursdays (4:00 pm), Wolstein Auditorium

Computer Laboratory: WG-63

Office Hours (in BRB G-19): 2:00pm–5:30pm (Mondays and Fridays)
Grading: 40% class projects, 20% homework, and 40% exams and quizzes.
Homework solutions are due by email by 4:00 pm on the following Tuesday (i.e., 24 hours before the next class).

Links:
(ICB) http://icbp.nci.nih.gov/ ; http://plan.cancer.gov/biology.shtml
(software) http://www.r-project.org/ ; http://www.bioconductor.org/ ; http://www.sbml.org/
(data) http://www.rerf.or.jp/ ; http://seer.cancer.gov/ ; http://www.ncbi.nlm.nih.gov/geo/

***   Please obtain the SEER and A-bomb data sets from the SEER and RERF links ASAP. The SEER data should be obtained as text files. *****


Course Description: This is a project-focused, research-level course in integrative cancer biology (ICB), an emergent field in which mathematical models and computer simulations are used to synthesize various forms of cancer data into experimentally testable scientific hypotheses. The course is designed for oncologists and cancer biologists who want to learn how to apply mathematics and a high-level programming language (the freeware R) to analyses of cancer research data. Data at all levels will be considered, ranging from epidemiological datasets to DNA microarray datasets.


Lecture slides with audio are available collectively here.


Supplementary material (http://epbi-radivot.cwru.edu/ICSB2005/) will be derived from the ICB workshops of ICSB meetings; see http://csbi.mit.edu/icsb-2005.



Tentative Schedule:


Week 1:    Introduction to integrative cancer biology and programming in R (Dalgaard)

Install R (http://www.r-project.org/), the R packages ISwR and survival [by clicking packages and install (from CRAN) in R], and Textpad (http://www.textpad.com/) with the R syntax file r.syn (http://www.textpad.com/add-ons/synn2t.html/) for code editing.

This lecture will cover these slides (of this manuscript) and this R script from Dalgaard. 

Homework 1: Dalgaard problems 6.1 and 10.3.  Solutions



Week 2:    Statistical analyses of the SEER data using R                    

Please read the SEER data documentation, then download and examine the SEER data tools in R available here.  The first half of this class will be spent on these overview codes. The second half will focus on the incidence of APL and CML using these R codes.  Plots generated by these codes are available as a PowerPoint file or as HTML.

Homework 2: Using Poisson regression as in the sample codes for CML and APL, fit exponential incidence models to the SEER data for stomach cancer, brain cancer, lung cancer, and non-Hodgkin lymphoma (NHL) over the respective age intervals of (30,70), (20,70), (20,60), and (20,75). Plot the model fits against the data as log incidence versus age. Compare the aging rate constants to those for CML and APL found using the sample codes. Solutions
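As a rough illustration of the exponential incidence model behind Homework 2, the sketch below fits cases ~ Poisson(PY * exp(a + k*age)) by Newton-Raphson, where PY is person-years at risk and k is the aging rate constant. It is not the course's sample code: in R the same fit would typically be done with glm(..., family = poisson) and a log person-years offset. The sketch is written in Python only so that it is self-contained. The age groups, person-years, and the "true" values a = -12 and k = 0.08 are made-up, noise-free numbers, not SEER data, and the function name is hypothetical.

```python
import math

def fit_exponential_incidence(ages, cases, person_years, n_iter=50):
    """Fit cases ~ Poisson(PY * exp(a + k*age)) by Newton-Raphson.

    Returns (a, k), where k is the aging rate constant per year of age.
    """
    # Center ages so the intercept and slope are less correlated.
    mean_age = sum(ages) / len(ages)
    x = [t - mean_age for t in ages]

    # Start from the overall crude rate and no age dependence.
    a = math.log(sum(cases) / sum(person_years))
    k = 0.0

    for _ in range(n_iter):
        mu = [py * math.exp(a + k * xi) for py, xi in zip(person_years, x)]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(y - m for y, m in zip(cases, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(cases, mu, x))
        # Fisher information matrix (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve H * d = g.
        da = (h11 * g0 - h01 * g1) / det
        dk = (h00 * g1 - h01 * g0) / det
        a += da
        k += dk
        if abs(da) + abs(dk) < 1e-12:
            break

    # Undo the centering so that a refers to age 0.
    return a - k * mean_age, k

# Synthetic, noise-free illustration data (NOT SEER data).
ages = [30, 35, 40, 45, 50, 55, 60, 65, 70]
person_years = [1.0e6] * len(ages)
true_a, true_k = -12.0, 0.08
cases = [py * math.exp(true_a + true_k * t) for py, t in zip(person_years, ages)]

a_hat, k_hat = fit_exponential_incidence(ages, cases, person_years)
print("intercept a:", a_hat)
print("aging rate constant k:", k_hat)
```

On a plot of log incidence versus age, such a fit appears as a straight line with slope k, which is the quantity to compare across cancer types.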


Week 3:    Statistical analyses of the Japanese A-bomb survivor data using R

Examine these R codes which produce these plots.

Homework 3: See the last slide in the PowerPoint file.  Solution


Week 4:    Biologically-based mathematical modeling of radiation-induced CML risk and background colorectal cancer incidence

The CML risk portion of this lecture will cover these slides.  Please read through these papers and these R codes which reproduce these plots.

Exam 1:  This take-home exam is due (by email) on Sunday, September 25th at 8:00 pm. Solutions


Week 5: Bioconductor, Gene Expression Omnibus (GEO), caBIG

Install Bioconductor (http://www.bioconductor.org/) and examine this R script.  Also examine GEO at http://www.ncbi.nlm.nih.gov/geo/gds/gds_browse.cgi and, optionally, caBIG at https://cabig.nci.nih.gov/.

No homework this week


Week 6: Bioconductor (affy and limma)

Examine the affy and limma packages of Bioconductor and these R scripts.

Homework 4: Use limma to perform a two-way (MTX vs. MP) ANOVA of the dataset of Cheok et al. 2003 (available as an R data package from http://epbi-radivot.cwru.edu/).  Solutions



Week 7:    Lymphoma microarray data; survival prediction

Read these papers: (Alizadeh et al. 2000); (Shipp et al. 2002); (Rosenwald et al. 2002); (Wright et al. 2003); (Hans et al. 2004); (Poulsen et al. 2005) and examine this R script and these R objects of the data of Shipp et al. and Rosenwald et al.



Week 8:    Systems Biology Markup Language (SBML), SBMLR and folate system correlations in DNA microarray data

Please upgrade to R 2.2, Bioconductor 1.7 and SBMLR 1.25.1 (from http://epbi-radivot.cwru.edu/SBMLR/).  Without unzipping it, install SBMLR from within the R GUI via packages and install from local zip. SBMLR needs the XML package, so it should also be installed. Under Windows, if an error message arises from library(XML), try copying the *.dll files in the XML package libs directory to C:\windows.  Work through the example R codes provided with the SBMLR help, and have a look at the manuscript and corresponding R scripts in the BMCcancerFolates directory of SBMLR (see also http://epbi-radivot.cwru.edu/folates/).  Also read the folate modeling paper of Morrison and Allegra, 1989.



Homework 5: Pick a GEO dataset of biological interest to you and convert it into a corresponding R data package. Hint: create the equivalent of these starter files (written for Cheung et al.’s GDS479 data) for your dataset and run the R script therein (setup.r) in a new directory. This will generate a “data” subdirectory containing the eset object of interest (as a binary data file), a “man” (i.e., help) subdirectory with a skeleton file which you must edit (you can delete the .package help file), a “src” directory which you should delete (since we don’t have C code to compile), and a DESCRIPTION file which you must edit. To avoid overwriting any edited files (by rerunning setup.r), add “Eset” to the package name (in the DESCRIPTION file) and to the package directory name. At this point, you should have a file structure that looks like this.

Now make sure that R’s bin directory is on your DOS path and run the DOS batch file (in the starter files) from just above the package parent directory. This should yield an echo to the DOS screen which looks like this.  If successful, a Windows binary R package zip file (which can be “installed from local zip”) will appear next to the package source directory. Getting this last step to work error-free will require some one-time setup effort, which is spelled out at http://www.murdoch-sutherland.com/Rtools/.  My machine also has MiKTeX installed; see http://www.murdoch-sutherland.com/Rtools/miktex.html.

Finally, the first two chapters of the R documentation “Writing R Extensions” are worth a look, but they contain much more information than is needed to build the simple data package asked for here, so they may not be the best place to start.


Week 9:    Purine and pyrimidine metabolism modeling in R

Read these purine metabolism papers of Curto et al. 1997, 1998a, 1998b and chapter 10 of Theoretical Foundations of Cancer Therapy (Jackson, 1992; http://epbi-radivot.cwru.edu/). Also read Hofmeyr’s MCA in a nutshell.


Exam 2: Apply the R data package built in HW5 to Morrison’s folate model using the approach described in the BMCcancerFolates directory of SBMLR.


Week 10:  Models of DNA damage and repair

Read these papers and study and run this R script.


Homework 6: Use Plot Digitizer (http://plotdigitizer.sourceforge.net/) to convert the thymidine phosphorylation reaction rate data plotted in Figure 5 of E. E. McKee et al., Cardiovasc Toxicol 4, 155-67 (2004) [use the main plot of the lower panel] into an ASCII file of numbers; e.g., cut-and-paste the plot into PowerPoint and save the single slide as a GIF file to be opened in the program. Now use read.table to read the data into R and plot them as V versus TdR for comparison with the upper panel of Figure 5.  Use nonlinear least squares to fit a simple Michaelis-Menten rate law to these data to reproduce (approximately) the Vm and apparent Km given in the upper panel of Figure 5.  Solutions
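For orientation, the nonlinear least squares step in Homework 6 can be sketched as a damped Gauss-Newton fit of the rate law V = Vm*S/(Km + S). In R the natural tool would be nls; the Python below is only a self-contained illustration of the same idea. The substrate concentrations and the values Vm = 10 and Km = 5 are invented, noise-free numbers in arbitrary units, not the McKee et al. data, and all function names are hypothetical.

```python
def mm_rate(s, vm, km):
    """Michaelis-Menten rate law V = Vm*S/(Km + S)."""
    return vm * s / (km + s)

def sse(s_vals, v_vals, vm, km):
    """Sum of squared residuals between data and model."""
    return sum((v - mm_rate(s, vm, km)) ** 2 for s, v in zip(s_vals, v_vals))

def fit_michaelis_menten(s_vals, v_vals, n_iter=100, tol=1e-12):
    """Estimate (Vm, Km) by Gauss-Newton with step halving."""
    # Crude start: Vm ~ largest observed rate, Km ~ S nearest half-maximal rate.
    vm = max(v_vals)
    km = min(zip(s_vals, v_vals), key=lambda sv: abs(sv[1] - vm / 2))[0]

    for _ in range(n_iter):
        resid = [v - mm_rate(s, vm, km) for s, v in zip(s_vals, v_vals)]
        # Partial derivatives of the model with respect to Vm and Km.
        j_vm = [s / (km + s) for s in s_vals]
        j_km = [-vm * s / (km + s) ** 2 for s in s_vals]
        # Normal equations (J^T J) d = J^T r for the 2-parameter step d.
        a11 = sum(p * p for p in j_vm)
        a12 = sum(p * q for p, q in zip(j_vm, j_km))
        a22 = sum(q * q for q in j_km)
        b1 = sum(p * r for p, r in zip(j_vm, resid))
        b2 = sum(q * r for q, r in zip(j_km, resid))
        det = a11 * a22 - a12 * a12
        d_vm = (a22 * b1 - a12 * b2) / det
        d_km = (a11 * b2 - a12 * b1) / det

        # Halve the step until the fit does not get worse and Km stays positive.
        current = sse(s_vals, v_vals, vm, km)
        step = 1.0
        while step > 1e-8:
            new_vm, new_km = vm + step * d_vm, km + step * d_km
            if new_km > 0 and sse(s_vals, v_vals, new_vm, new_km) <= current:
                break
            step /= 2
        vm, km = vm + step * d_vm, km + step * d_km
        if abs(step * d_vm) + abs(step * d_km) < tol:
            break
    return vm, km

# Synthetic, noise-free illustration data (NOT digitized from the paper).
substrate = [0.5, 1, 2, 5, 10, 20, 50, 100]
rates = [mm_rate(s, 10.0, 5.0) for s in substrate]

vm_hat, km_hat = fit_michaelis_menten(substrate, rates)
print("Vm:", vm_hat)
print("Km:", km_hat)
```

With digitized data the residuals will not vanish, so the estimates should only approximately match the published Vm and apparent Km.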


Week 11:  p53 modeling

Read these papers and study and run these R scripts.  Now install JDesigner and Jarnac and reproduce this plot using this SBML implementation of the model of Chickarmane et al. (plot and model kindly provided by Herbert Sauro). Note that Jarnac simulates the model much faster than R. We will go over these slides in class.


Week 12:  Cell cycle models and flow cytometry data

Read these papers and work through this Word file. We will go over these slides in class.


Week 13:  Happy Thanksgiving!


Week 14:  CML and tumor growth modeling

We will go over the models of Michor and Hahnfeldt described at the ICSB 2005 Workshop on Integrative Cancer Biology, see http://epbi-radivot.cwru.edu/ICSB2005/.


Week 15:  Class project presentations



This course is being developed as part of the Case Integrative Cancer Biology Program, http://epbi-radivot.cwru.edu/caseICBP/.