ems {ccems}    R Documentation

Equilibrium Model Selection

Description

This is the main automation function of this package. It generates a space of combinatorially complex equilibrium models and fits them to data.

Usage

ems(d, g, cpusPerHost=c("localhost" = 1), ptype="",chunkParams=list(size=500,n=1,maxnPs=2,extend2maxP=TRUE),
          smart=FALSE,pRows=FALSE,doTights=FALSE,doGrids=TRUE,doSpurs=TRUE,topN=5,showConstr=FALSE,atLeastOne=TRUE,IC=1)

Arguments

d The data, as a data frame.
g The list output of mkg.
cpusPerHost A named integer vector: names are host names and values are the number of CPUs to use on each host.
ptype Parallelization type: "" for a single CPU; "SOCK" or "NWS" (NetWorkSpaces) for the corresponding snow cluster types.
chunkParams A list with components: size, the batch size of spur model chunks (see mkSpurs); n, the number of spur model chunks requested (this may increase internally if extend2maxP=TRUE or smart=TRUE); maxnPs, the maximum number of parameters in models that will be fitted (larger models may be generated internally but are not fitted); and extend2maxP, set to TRUE if n should be extended (if needed) to reach maxnPs.
smart Set to TRUE to stop when models with lastCompleted parameters (see mkSpurs) have a higher AIC than models with lastCompleted-1 parameters; otherwise the entire model space defined by chunkParams is fitted.
pRows Set to TRUE if models with estimated inactive protein fractions p are wanted in the model space, else p=1 will be fixed for all models generated.
doTights Set to TRUE if spur models with infinitely tight binding single edges (with K=0) are wanted in the model space.
doGrids Leave TRUE (the default) if grid models are wanted, set to FALSE if not (e.g. if only spur models are wanted).
doSpurs Leave TRUE if the spur model space is wanted, set to FALSE if not (e.g. if only grid models are wanted).
topN The number of best models in the current batch that are carried over to compete with the next batch; such carryovers allow fitting model spaces that are too large to reside in memory at one time. This is also the number of best models summarized in HTML in the results folder after each batch is fitted.
showConstr Set to TRUE if constrained (fixed and tracking) parameters are to be included in the HTML report in results.
atLeastOne Leave TRUE if only models with at least one complex of maximal size are to be considered. Set to FALSE if there is no prior knowledge that the largest oligomer must be in the model.
IC The initial condition of all K parameters optimized. The default is IC=1.
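
As a minimal sketch of how the two structured arguments described above might be assembled in base R before calling ems: cpusPerHost is a named integer vector and chunkParams is a list with four components. The host name "node2" and the variable names hosts, defaults and chnkPs are hypothetical; the default values are those shown under Usage.

```r
# Named integer vector: names are host names, values are CPU counts.
# "node2" is a hypothetical host used only for illustration.
hosts <- c("localhost" = 2L, "node2" = 4L)
totalCpus <- sum(hosts)  # total number of CPUs across all hosts

# Default chunkParams components, copied from the Usage section.
defaults <- list(size = 500, n = 1, maxnPs = 2, extend2maxP = TRUE)

# Override only the components that differ; modifyList() keeps the rest.
chnkPs <- modifyList(defaults, list(size = 4, maxnPs = 1))
str(chnkPs)
```

These objects would then be passed as cpusPerHost=hosts (together with ptype="SOCK" or ptype="NWS") and chunkParams=chnkPs. Because a user-supplied list replaces the default list as a whole, spelling out all four components in this way avoids passing a list with missing entries.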

Details

This is the highest level function in ccems. The other functions serve this function, though they may also be used to fit individual models manually.

Value

A list of the topN best (lowest AIC) models. Assign it to a variable to avoid large screen dumps. The HTML reports written to the results folder as a side effect are the main output and purpose of this function.
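
The idea of carrying the topN best models from one batch to the next can be illustrated with a small base R sketch. This is not the internal code of ems: the model records, their field names id and aic, the helper functions mkBatch and keepTop, and the AIC values are all hypothetical and chosen only to show the selection logic.

```r
# Hypothetical fitted-model records: each has an id and an AIC value.
# The field names `id` and `aic` are assumptions for this sketch only.
mkBatch <- function(ids, aics) {
  lapply(seq_along(ids), function(i) list(id = ids[i], aic = aics[i]))
}

# Keep the topN lowest-AIC models from a list of model records.
keepTop <- function(models, topN) {
  aics <- vapply(models, function(m) m$aic, numeric(1))
  models[order(aics)][seq_len(min(topN, length(models)))]
}

# Two made-up batches of three models each.
batches <- list(
  mkBatch(c("A", "B", "C"), c(12.1, 10.4, 15.0)),
  mkBatch(c("D", "E", "F"), c(9.8, 11.2, 10.9))
)

# Carry the current best models into the competition with each new batch,
# so that only topN models plus one batch are held in memory at a time.
top <- list()
for (b in batches) top <- keepTop(c(top, b), topN = 2)

vapply(top, function(m) m$id, character(1))
```

After the first batch the survivors are B and A; after the second batch D displaces A, so the final list holds D and B in order of increasing AIC.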

Note

Spur graph models have network topologies that radiate from the hub, whereas grid graph models have topologies that can be overlaid on a city-block layout. Although head node spur graph edges can be superimposed in curtain rods (see ccems) to give these graphs a grid appearance, it is better to replace the curtain rod with a set of nested arches and to call such spur-grid hybrids K equality graphs or simply hybrids (a term that is more tolerant than grid). Another option is to tolerate spur edges to head nodes in a broadened definition of the term grid. Advantages of this latter option include an emphasis on parallel edges, and thus on the equality aspects of the graph (compared to the term hybrid), greater compactness and better looks (compared to the term K equality), and usage inertia. Readers are thus asked to accept this broadened definition of the term grid, i.e. to allow head node spur edges in grid graphs.

This work was supported by the National Cancer Institute (K25CA104791).

Author(s)

Tom Radivoyevitch (txr24@case.edu)

References

Radivoyevitch, T. (2009) Automated model generation and selection methods for combinatorially complex biochemical equilibriums. (In preparation)

See Also

ccems, mkg

Examples

library(ccems)
topology <- list(  
        heads=c("R1t0","R2t0"),  
        sites=list(       
                s=list(                     # s-site    thread #
                        m=c("R1t1"),        # monomer      1
                        d=c("R2t1","R2t2")  # dimer        2
                )
        )
) 
g <- mkg(topology,TCC=TRUE) 
data(RNR)
d1 <- subset(RNR,(year==2001)&(fg==1)&(G==0)&(t>0),select=c(R,t,m,year))
d2 <- subset(RNR,year==2006,select=c(R,t,m,year)) 
dd <- rbind(d1,d2)
names(dd)[1:2] <- paste(strsplit(g$id,split="")[[1]],"T",sep="") # e.g. to form "RT"
rownames(dd) <- seq_len(nrow(dd)) # drop the large row names inherited from the parent data frame
chnkPs <- list(size=4,n=1,maxnPs=1,extend2maxP=TRUE) # stop early if maxnPs is reached; otherwise add chunks (i.e. increase n)
## The next line can be commented out to speed up package check times.
## Not run: 
 
top <- ems(dd,g,chunkParams=chnkPs)  # this takes roughly one minute 
## End(Not run)

[Package ccems version 1.0 Index]