EPBI 473: Integrative Cancer Biology (Fall Semester, 2005)
Instructor: Tomas Radivoyevitch, Ph.D., Assistant Professor,
Department of Epidemiology and Biostatistics
Tel: 216-368-1965; Fax: 216-368-1969
e-mail: radivot@hal.cwru.edu
website: http://epbi-radivot.cwru.edu/
Course website: http://epbi-radivot.cwru.edu/ICB/
Prerequisites: BIOC 407 (general biochemistry), EPBI 432 (introductory
statistics).
Required text: Dalgaard, Introductory Statistics with R.
*** Please purchase Dalgaard ASAP. It will be covered at a fast pace in the
first lecture. ***
Meeting Place and Times: Wednesdays, 4:00 pm to 6:30 pm, in classroom NOA 130.
Other:
Computer Laboratory: WG-63
Office Hours (in BRB G-19): Mondays and Fridays, 2:00 pm to 5:30 pm
Grading: 40% class projects, 20% homework, and 40% exams and quizzes. Homework
solutions are due by email by 4 pm the following Tuesday (i.e. 24 hours before
the next class).
Links:
(ICB) http://icbp.nci.nih.gov/ ; http://plan.cancer.gov/biology.shtml
(software) http://www.r-project.org/ ; http://www.bioconductor.org/ ; http://www.sbml.org/
(data) http://www.rerf.or.jp/ ; http://seer.cancer.gov/ ; http://www.ncbi.nlm.nih.gov/geo/
*** Please obtain the SEER and A-bomb data sets from the SEER and RERF links
ASAP. The SEER data should be obtained as text files. ***
Course Description: This is a project-focused, research-level course in
integrative cancer biology (ICB), an emergent field in which mathematical
models and computer simulations are used to synthesize various forms of cancer
data into experimentally testable scientific hypotheses. The course is designed
for oncologists and cancer biologists who are interested in learning how to
apply mathematics and a high-level programming language (the freeware R) to
analyses of cancer research data. Data at all levels will be considered,
ranging from epidemiological datasets to DNA microarray datasets.
Lecture slides with audio are available collectively here.
Supplementary material (http://epbi-radivot.cwru.edu/ICSB2005/) will be derived
from ICB workshops of ICSB meetings; see http://csbi.mit.edu/icsb-2005.
Week 1: Introduction to integrative cancer biology and programming in R (Dalgaard)
Install R (http://www.r-project.org/), the R packages ISwR and survival (from
within R, via the packages menu and install from CRAN), and Textpad
(http://www.textpad.com/) with the R syntax file r.syn
(http://www.textpad.com/add-ons/synn2t.html/) for code editing.
This lecture will cover these slides (of this manuscript) and this R script from Dalgaard.
Homework 1: Dalgaard problems 6.1 and 10.3. Solutions
Week 2: Statistical analyses of the SEER data using R
Please read the SEER data documentation and download and examine the SEER data
tools in R available here. The first half of this class will be spent on these
overview codes. The second half will focus on the incidence of APL and CML
using these R codes. Plots generated by these codes are available as a
PowerPoint file or as HTML.
Homework 2: Using Poisson regression as in the sample codes for CML and APL,
fit exponential incidence models to the SEER stomach, brain, lung, and NHL data
over the respective age intervals of (30,70), (20,70), (20,60) and (20,75).
Plot the model fits against the data as log incidence versus age. Compare the
aging rate constants to those for CML and APL found using the sample codes.
Solutions
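A minimal sketch of the kind of fit Homework 2 asks for, using made-up counts
rather than SEER data. It assumes incidence rises exponentially with age, so
that log incidence is linear in age and the slope of a Poisson regression with
a log person-years offset plays the role of the aging rate constant. The
variable names, the age grid, the person-years, and the value 0.08 per year are
all illustrative assumptions; the course's sample codes for CML and APL may be
organized differently.

```r
# Synthetic (made-up) age-specific counts; these are NOT SEER data.
set.seed(1)
age    <- seq(32.5, 67.5, by = 5)   # midpoints of 5-year age groups in (30,70)
py     <- rep(1e6, length(age))     # hypothetical person-years at risk per group
k_true <- 0.08                      # assumed aging rate constant (per year)
cases  <- rpois(length(age), lambda = py * 1e-6 * exp(k_true * age))

# Exponential incidence: log(E[cases] / py) = a + k * age.
# The offset log(py) turns a model for counts into a model for rates.
fit   <- glm(cases ~ age + offset(log(py)), family = poisson)
k_hat <- coef(fit)[["age"]]         # estimated aging rate constant

# Observed and fitted log incidence, for plotting against age.
log_inc_obs <- log(cases / py)
log_inc_fit <- log(fitted(fit) / py)

# Plot only in an interactive session, so the script also runs in batch mode.
if (interactive()) {
  plot(age, log_inc_obs, xlab = "Age (years)", ylab = "log incidence")
  lines(age, log_inc_fit)
}

print(k_hat)
```

With real SEER counts, cases and py would be the case counts and person-years
for the cancer and age interval in question, and k_hat would be the quantity to
compare across cancers.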
Week 3: Statistical analyses of the Japanese A-bomb survivor data using R
Examine these R codes, which produce these plots.
Homework 3: See the last slide in the PowerPoint file. Solution
Week 4: Biologically based mathematical modeling of radiation-induced CML risk and background colorectal cancer incidence
The CML risk portion of this lecture will cover these slides. Please read
through these papers and these R codes, which reproduce these plots.
Exam 1: This take-home exam is due (by email) on Sunday, September 25th, at
8 pm. Solutions
Week 5: Bioconductor, Gene Expression Omnibus (GEO), caBIG
Install Bioconductor (http://www.bioconductor.org/)
and examine this R script. Also examine GEO at http://www.ncbi.nlm.nih.gov/geo/gds/gds_browse.cgi
and, optionally, caBIG at https://cabig.nci.nih.gov/.
No homework this week
Week 6: Bioconductor (affy and limma)
Examine the affy and limma packages of Bioconductor and these R scripts.
Homework 4: Use limma to perform a two-way (MTX vs. MP) ANOVA of the dataset of
Cheok et al. 2003 (available as an R data package from
http://epbi-radivot.cwru.edu/). Solutions
Week 7: Lymphoma microarray data; survival prediction
Read these papers: (Alizadeh et al. 2000); (Shipp et al. 2002); (Rosenwald et
al. 2002); (Wright et al. 2003); (Hans et al. 2004); (Poulsen et al. 2005), and
examine this R script and these R objects of the data of Shipp et al. and
Rosenwald et al.
Week 8: Systems Biology Markup Language (SBML), SBMLR and folate system correlations in DNA microarray data
Please upgrade to R 2.2, Bioconductor 1.7 and SBMLR 1.25.1 (from
http://epbi-radivot.cwru.edu/SBMLR/). Without unzipping it, install SBMLR from
within the R GUI via packages and install from local zip. SBMLR needs the XML
package, so it should also be installed. Under Windows, if library(XML) gives
an error message, try copying the *.dll files in the XML package libs directory
to C:\windows. Work through the example R codes provided with the SBMLR help,
and have a look at the manuscript and corresponding R scripts in the
BMCcancerFolates directory of SBMLR (see also
http://epbi-radivot.cwru.edu/folates/). Also read the folate modeling paper of
Morrison and Allegra, 1989.
Homework 5: Pick a GEO dataset of biological interest to you and convert it
into a corresponding R data package. Hint: create the equivalent of these
starter files (written for Cheung et al.’s GDS479 data) for your dataset and
run the R script therein (setup.r) in a new directory. This will generate:
a “data” subdirectory containing the eset object of interest (as a binary data
file); a man (i.e. help) subdirectory with a skeleton file, which you must edit
(you can delete the .package help file); a src directory, which you should
delete (since we have no C code to compile); and a Description file, which you
must edit. To avoid overwriting any edited files by rerunning setup.r, add
“Eset” to the package name (in the Description file) and to the package
directory name. At this point, you should have a file structure that looks like
this. Now make sure that R’s bin directory is on your DOS path and run the DOS
batch file (in starter files) from just above the package parent directory.
This should yield an echo to the DOS screen which looks like this. If
successful, a Windows binary R package zip file (which can be “installed from
local zip”) will appear next to the package source directory. Getting this last
step to work error-free will require some one-time setup effort, which is
spelled out at http://www.murdoch-sutherland.com/Rtools/. My machine also has
MiKTeX installed; see http://www.murdoch-sutherland.com/Rtools/miktex.html.
Finally, the first two chapters of the R documentation “Writing R Extensions”
are worth a look, but they contain much more information than is needed to
build the simple data package asked for here, so they may not be the best place
to start.
Week 9: Purine and pyrimidine metabolism modeling in R
Read these purine metabolism papers of Curto et al. 1997, 1998a, 1998b and
chapter 10 of Theoretical Foundations of Cancer Therapy (Jackson, 1992;
http://epbi-radivot.cwru.edu/). Also read Hofmeyr’s MCA (metabolic control
analysis) in a nutshell.
Exam 2: Apply the R data package built in HW5
to Morrison’s folate model using the approach described in the BMCcancerFolates directory of SBMLR.
Week 10: Models of DNA damage and repair
Read these papers and study and run this R script.
Homework 6: Use Plot Digitizer (http://plotdigitizer.sourceforge.net/) to
convert the thymidine phosphorylation reaction rate data plotted in Figure 5 of
E. E. McKee et al. Cardiovasc Toxicol 4, 155-67 (2004) [use the main plot of
the lower panel] into an ASCII file of numbers; e.g. cut and paste the plot
into PowerPoint and save the single slide as a GIF file to be opened in the
program. Now use read.table to read the data into R and plot it as V versus TdR
for comparison with the upper panel of Figure 5. Use nonlinear least squares to
fit a simple Michaelis-Menten rate law to these data and reproduce
(approximately) the Vm and apparent Km given in the upper panel of Figure 5.
Solutions
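A minimal sketch of the read.table and nonlinear least squares steps of
Homework 6. The numbers below are placeholders invented for illustration, not
values digitized from McKee et al.; the column names TdR and V, the units, and
the starting values are likewise assumptions. The rate law fitted is the simple
Michaelis-Menten form V = Vm * TdR / (Km + TdR).

```r
# Placeholder data standing in for the digitized ASCII file (NOT real data).
# With a real file, use read.table("yourfile.txt", header = TRUE) instead.
txt <- "TdR V
1 0.64
2 1.15
5 2.55
10 3.95
20 5.80
50 7.60
100 8.75
200 9.25"
dat <- read.table(text = txt, header = TRUE)

# Fit V = Vm * TdR / (Km + TdR) by nonlinear least squares.
# Starting values: the largest observed rate for Vm, a mid-range TdR for Km.
fit <- nls(V ~ Vm * TdR / (Km + TdR), data = dat,
           start = list(Vm = max(dat$V), Km = median(dat$TdR)))
est <- coef(fit)   # named vector with elements Vm and Km

# Plot data and fitted curve only in an interactive session.
if (interactive()) {
  plot(dat$TdR, dat$V, xlab = "TdR", ylab = "V")
  curve(est[["Vm"]] * x / (est[["Km"]] + x), add = TRUE)
}

print(est)
```

If nls fails to converge on real digitized data, rough starting values read off
the plot (the plateau for Vm, the TdR at half the plateau for Km) are usually a
reasonable thing to try.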
Week 11: p53 modeling
Read these papers and study and run these R scripts. Now install and load
JDesigner and Jarnac and reproduce this plot using this SBML implementation of
the model of Chickarmane et al. (plot and model kindly provided by Herbert
Sauro). Note that Jarnac simulates the model much faster than R. We will go
over these slides in class.
Week 12: Cell cycle models and flow cytometry data
Read these papers and work through this Word file. We will go over these slides
in class.
Week 13: Happy Thanksgiving!
Week 14: CML and tumor growth modeling
We will go over the models of Michor and Hahnfeldt described at the ICSB 2005
Workshop on Integrative Cancer Biology; see http://epbi-radivot.cwru.edu/ICSB2005/.
Week 15: Class project presentations
This course is being developed as part of the Case Integrative Cancer Biology
Program, http://epbi-radivot.cwru.edu/caseICBP/.