EPBI 473: Integrative Cancer Biology (Fall Semester, 2005)
Instructor: Tomas Radivoyevitch, Ph.D., Assistant Professor
Department of Epidemiology and Biostatistics
Tel: 216-368-1965; Fax: 216-368-1969
e-mail: radivot@hal.cwru.edu ; website: http://epbi-radivot.cwru.edu/
Course website: http://epbi-radivot.cwru.edu/ICB/
Prerequisites: BIOC 407 (general biochemistry), EPBI 432 (introductory statistics).
Required
*** Please purchase Dalgaard
ASAP. It will be covered at a fast pace in the first lecture. *****
Meeting Place and Times: Wednesdays, 4:00 pm to 6:30 pm, classroom NOA 130.
Other:
Computer Laboratory: WG-63
Office Hours (in BRB G-19): 2:00pm–5:30pm (Mondays and Fridays)
Grading: 40% class projects, 20% homework, and 40% exams and quizzes. Homework solutions
are due by email by 4 pm the following Tuesday (i.e., 24 hours before the next
class).
Links: (ICB) http://icbp.nci.nih.gov/ ; http://plan.cancer.gov/biology.shtml
(software) http://www.r-project.org/ ; http://www.bioconductor.org/ ; http://www.sbml.org/
(data) http://www.rerf.or.jp/ ; http://seer.cancer.gov/ ; http://www.ncbi.nlm.nih.gov/geo/
*** Please obtain the SEER and A-bomb data sets from the SEER and RERF links ASAP. The SEER
data should be obtained as text files. *****
Course Description:
This is a project-focused, research-level course in integrative cancer biology (ICB), an emergent
field in which mathematical models and computer simulations are used to
synthesize various forms of cancer data to yield experimentally testable
scientific hypotheses. The course is designed for oncologists and cancer
biologists who are interested in learning how to apply mathematics and a
high-level programming language (the freeware R) to analyses of cancer research
data. Data on all levels will be considered, ranging from epidemiological
datasets to DNA microarray datasets.
Lecture slides with audio are available collectively here.
Supplementary material (http://epbi-radivot.cwru.edu/ICSB2005/)
will be derived from ICB workshops of ICSB meetings, see http://csbi.mit.edu/icsb-2005.
Week 1: Introduction to integrative cancer biology and programming in R (Dalgaard)
Install R (http://www.r-project.org/), the R packages ISwR and survival
[by clicking packages and install (from CRAN) in R], and Textpad (http://www.textpad.com/)
with the R syntax file r.syn (http://www.textpad.com/add-ons/synn2t.html/) for code editing.
This lecture will cover
these slides (of this manuscript) and this R script from Dalgaard.
Homework 1: Dalgaard problems 6.1 and 10.3. Solutions
Week 2: Statistical analyses of the SEER data using R
Please read the SEER data documentation and download and examine the SEER data tools in R
available here. The first half of this class will be spent on
these overview codes. The second half will focus on the incidence of APL and
CML using these R codes. Plots generated by these codes are available
as a PowerPoint file or as HTML.
Homework 2: Using Poisson regression as in the sample codes for
CML and APL, fit exponential incidence models to the SEER stomach, brain, lung, and NHL
data over the respective age intervals of
(30,70), (20,70), (20,60) and (20,75).
Plot the model fits against the data using log incidence versus age.
Compare the aging rate constants to those for CML and APL found using the
sample codes. Solutions
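As a rough illustration of the kind of fit Homework 2 asks for, the sketch below fits an exponential incidence model by Poisson regression with a person-years offset. It does not reproduce the course's sample codes. The counts are simulated rather than taken from SEER, and the names age, py, cases, k_true and k_hat are hypothetical choices made only for this example.

```r
# Minimal sketch: exponential incidence, rate(age) = A * exp(k * age),
# fitted by Poisson regression. All data below are SIMULATED (not SEER).
set.seed(1)
age    <- seq(32.5, 67.5, by = 5)          # midpoints of 5-year age groups in (30,70)
py     <- rep(1e6, length(age))            # hypothetical person-years at risk
k_true <- 0.08                             # assumed aging rate constant (per year)
cases  <- rpois(length(age), 1e-5 * exp(k_true * (age - 30)) * py)

# log E[cases] = log(py) + log(A) + k * age, so the slope on age is k
fit   <- glm(cases ~ age + offset(log(py)), family = poisson)
k_hat <- unname(coef(fit)["age"])          # estimated aging rate constant

# Log incidence versus age, with the fitted line overlaid
if (interactive()) {
  plot(age, log(cases / py), xlab = "Age (years)", ylab = "log incidence")
  lines(age, log(fitted(fit) / py))
}
```

With real data, cases and py would come from the SEER text files restricted to the age interval given for each cancer, and k_hat would be the quantity to compare with the CML and APL aging rate constants.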
Week 3: Statistical analyses of the Japanese A-bomb survivor data using R
Examine these R codes which produce these plots.
Homework 3: See the last slide in the PowerPoint file. Solution
Week 4: Biologically-based mathematical modeling of radiation-induced CML risk and background colorectal cancer incidence
The CML risk portion of this
lecture will cover these slides. Please read through these papers and these R codes
which reproduce these plots.
Exam 1: This take-home exam is due (by email) on Sunday, September 25th at 8 pm. Solutions
Week 5: Bioconductor, Gene Expression Omnibus (GEO), caBIG
Install Bioconductor (http://www.bioconductor.org/)
and examine this R script. Also examine GEO at http://www.ncbi.nlm.nih.gov/geo/gds/gds_browse.cgi
and, optionally, caBIG at https://cabig.nci.nih.gov/.
No homework this week.
Week 6: Bioconductor (Affy and Limma)
Examine the Affy and limma packages of Bioconductor and these R scripts.
Homework 4: Use limma to perform a two-way (MTX vs. MP) ANOVA of the dataset of
Cheok et al 2003 (available as an R data package from http://epbi-radivot.cwru.edu/). Solutions
Week 7: Lymphoma microarray data; survival prediction
Read these papers: (Alizadeh et al 2000); (Shipp et al 2002); (Rosenwald et al 2002);
(Wright et al 2003); (Hans et al 2004); (Poulsen et al 2005)
and examine this R script and these R objects of the data of Shipp et al. and Rosenwald et al.
Week 8: Systems Biology Markup Language (SBML), SBMLR and folate system correlations in DNA microarray data
Please upgrade to R 2.2, Bioconductor 1.7 and SBMLR 1.25.1 (from http://epbi-radivot.cwru.edu/SBMLR/). Without unzipping it, install SBMLR from
within the R GUI via packages and install from local zip. The XML package is needed by SBMLR,
so it should also be installed. Under Windows, if an error message arises from
library(XML), try copying the *.dll files in the XML
package libs directory to C:\windows. Work through the example R codes that have
been provided with SBMLR help and also have a look at the manuscript and corresponding R scripts in the BMCcancerFolates directory of SBMLR (see also http://epbi-radivot.cwru.edu/folates/). Also read the folate modeling paper of Morrison and Allegra, 1989.
Homework 5: Pick a GEO dataset of biological interest to you and
convert it into a corresponding R data package.
Hint: create the equivalent of these starter
files (written for Cheung et al.’s GDS479 data) for your dataset and run
the R script therein (setup.r) in a new directory.
This will generate a “data” subdirectory containing the eset
object of interest (as a binary data file), a man (i.e. help) subdirectory with
a skeleton file which you must edit (you can delete the .package help file), a src directory which you should delete (since we don’t have
C codes to compile), and a Description file which you must edit. To avoid
overwriting any edited files (by rerunning setup.r),
add “Eset” to the package name (in the Description
file) and to the package directory name. At this point, you should have a file
structure that looks like this. Now make sure that R’s bin directory
is on your DOS path and run the DOS batch file (in starter
files) from just above the package parent directory. This should yield an
echo to the DOS screen which looks like this. If successful, a Windows binary R package zip
file (which can be “installed from local zip”) will appear next to the package
source directory. Getting this last step to work error free will require some
one-time setup effort which is spelled out at http://www.murdoch-sutherland.com/Rtools/. My machine also has MiKTeX
installed, see http://www.murdoch-sutherland.com/Rtools/miktex.html.
Finally, the first two chapters in the R documentation “Writing R Extensions”
are worth a look, but they contain much more information than is needed to
build the simple data package asked for here, so they may not be the best place
to start.
Week 9: Purine and pyrimidine metabolism modeling in R
Read these purine metabolism papers of Curto et al 1997, 1998a, 1998b
and chapter 10 of Theoretical Foundations of Cancer Therapy (Jackson, 1992; http://epbi-radivot.cwru.edu/).
Also read Hofmeyr’s MCA in a nutshell.
Exam 2: Apply the R data package built in HW5 to Morrison’s folate model using the approach described in the BMCcancerFolates directory of SBMLR.
Week 10: Models of DNA damage and repair
Read these papers and study and run this R script.
Homework 6: Use plot digitizer (http://plotdigitizer.sourceforge.net/)
to convert the thymidine phosphorylation
reaction rate data plotted in Figure 5 of E. E. McKee et al. Cardiovasc Toxicol 4, 155-67
(2004) [use the main plot of the lower panel] into an ASCII
file with numbers; e.g. cut-and-paste the plot into PowerPoint
and save the single slide as a GIF file to be opened in the program. Now use read.table to read the data into R and plot it as V versus TdR for comparisons with the upper panel of Figure 5. Use nonlinear least squares to fit a simple Michaelis-Menten rate law to these data to reproduce
(approximately) the Vm and apparent Km given in the
upper panel of Figure 5. Solutions
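The nonlinear least squares step of Homework 6 can be sketched along the following lines. This is only an illustration: the data are simulated, not digitized from McKee et al., and the values Vm_true and Km_true, the units, and the data frame layout are hypothetical assumptions made for the example. With a digitized file, dat would instead come from read.table.

```r
# Minimal sketch: fit V = Vm * TdR / (Km + TdR) by nonlinear least squares.
# All values below are SIMULATED stand-ins for digitized data.
set.seed(2)
TdR     <- c(1, 2, 5, 10, 20, 50, 100, 200)   # hypothetical substrate concentrations
Vm_true <- 100                                # assumed maximal rate (arbitrary units)
Km_true <- 15                                 # assumed Michaelis constant
V       <- Vm_true * TdR / (Km_true + TdR) * (1 + rnorm(length(TdR), sd = 0.03))
dat     <- data.frame(TdR = TdR, V = V)       # with real data: dat <- read.table(...)

# Crude starting values: largest observed rate for Vm, median concentration for Km
fit <- nls(V ~ Vm * TdR / (Km + TdR), data = dat,
           start = list(Vm = max(dat$V), Km = median(dat$TdR)))
est <- coef(fit)                              # named vector with elements Vm and Km

# V versus TdR with the fitted Michaelis-Menten curve overlaid
if (interactive()) {
  plot(dat$TdR, dat$V, xlab = "TdR", ylab = "V")
  curve(est[["Vm"]] * x / (est[["Km"]] + x), add = TRUE)
}
```

nls can fail to converge when the starting values are far from the solution, so it helps to read rough values of Vm and Km off the plot before fitting.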
Week 11: p53 modeling
Read these papers and study and run these R scripts. Now
load/install JDesigner and Jarnac and reproduce this plot using this SBML
implementation of the model of Chickarmane et al (plot and model kindly provided by
Herbert Sauro). Note that Jarnac simulates the model much faster than R. We will go over these slides in class.
Week 12: Cell cycle models and flow cytometry data
Read these papers and work through this Word file. We will go over these slides in class.
Week 13: Happy Thanksgiving!
Week 14: CML and tumor growth modeling
We will go over the models of Michor and Hahnfeldt
described at the ICSB 2005 Workshop on Integrative Cancer Biology, see http://epbi-radivot.cwru.edu/ICSB2005/.
Week 15: Class project presentations
This course is being developed as part of the Case Integrative Cancer Biology Program, http://epbi-radivot.cwru.edu/caseICBP/.