ems                  package:ccems                  R Documentation

_E_q_u_i_l_i_b_r_i_u_m _M_o_d_e_l _S_e_l_e_c_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     This is the main automation function of this package. It generates
     a space of combinatorially complex equilibrium models and fits
     them to data.

_U_s_a_g_e:

     ems(d, g, cpusPerHost = c("localhost" = 1), ptype = "",
         chunkParams = list(size = 500, n = 1, maxnPs = 2, extend2maxP = TRUE),
         smart = FALSE, pRows = FALSE, doTights = FALSE, doGrids = TRUE,
         doSpurs = TRUE, topN = 5, showConstr = FALSE, atLeastOne = TRUE,
         IC = 1)

_A_r_g_u_m_e_n_t_s:

       d: The data, as a data frame.

       g: The list output of 'mkg'. 

cpusPerHost: A named integer vector whose names are host names and
          whose values are the number of CPUs on each host.

   ptype: Parallelization type: '""' for a single CPU; '"SOCK"' and
          '"NWS"' (networkspaces) for 'snow' options.

chunkParams: List with components: 'size', which is the 'batchSize' of
          spur model chunks (see 'mkSpurs'); 'n', which is the number
          of spur model chunks requested (this may increase internally
          if 'extend2maxP=TRUE' or 'smart=TRUE'); 'maxnPs', which is
          the maximum number of parameters of models that will be
          fitted (internally, larger models may be generated but not
          fitted); and 'extend2maxP', which is set to 'TRUE' if 'n'
          should be extended (if needed) to reach 'maxnPs'.

   smart: Set to 'TRUE' to stop when models with 'lastCompleted'
          parameters (see 'mkSpurs') have a higher AIC than models
          with 'lastCompleted-1' parameters; otherwise the entire
          model space as defined by 'chunkParams' is fitted.

   pRows: Set to 'TRUE' if models with estimated inactive protein
          fractions 'p' are wanted in the model space; otherwise 'p=1'
          will be fixed for all models generated.

doTights: Set to 'TRUE' if spur models with infinitely tight binding
          single edges (with K=0) are wanted in the model space.

 doGrids: Leave 'TRUE' (the default) if grid models are wanted, set to
          'FALSE' if not (e.g. if only spur models are wanted). 

 doSpurs: Leave 'TRUE' if the spur model space is wanted, set to
          'FALSE' if not (e.g. if only grid models are wanted). 

    topN: The number of best models of the current batch of models that
          will be carried over to compete with the next batch; such
          carryovers are needed to allow fits of model spaces that are
          too large to reside in memory at one time. This number is
          also the number of best models summarized in HTML in the
          'results' folder after fitting each batch.

showConstr: Set to 'TRUE' if constrained (fixed and tracking)
          parameters are to be included in the HTML report in
          'results'.

atLeastOne: Leave 'TRUE' if only models with at least one complex of
          maximal size are to be considered. Set to 'FALSE' if there
          is no prior knowledge supporting the assertion that the
          largest oligomer must be in the model.

      IC: The initial condition for all optimized K parameters. The
          default is 'IC=1'.
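
     As an illustration of the 'cpusPerHost' format described above,
     the sketch below builds a named integer vector and expands it
     into one entry per worker. The host name "node1" and the CPU
     counts are hypothetical, and the expansion is only an assumption
     about how such a vector could be turned into the host list that
     'snow' socket clusters accept; it is not taken from the 'ems'
     source.

```r
# Hypothetical hosts and CPU counts; only "localhost" appears in the
# documented default.
cpusPerHost <- c("localhost" = 2L, "node1" = 4L)

# One entry per worker process: each host name repeated by its CPU count.
# (Assumption: how ems() expands this vector internally is not documented.)
workers <- rep(names(cpusPerHost), times = cpusPerHost)

length(workers)   # total number of workers
table(workers)    # number of workers per host
```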

_D_e_t_a_i_l_s:

     This is the highest level function in 'ccems'. The other functions
     serve this function, though they may also be used to fit
     individual models manually.
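
     The stopping rule selected by 'smart=TRUE' can be pictured with
     a toy example in base R. The AIC values below are made up, the
     helper name 'smartStop' is hypothetical, and the sketch assumes
     that "the AIC" of a model size means the lowest AIC among models
     of that size; the actual bookkeeping in 'ems' may differ.

```r
# Toy AIC values grouped by number of parameters (made-up numbers).
fits <- data.frame(
  nParams = c(1, 1, 2, 2, 3, 3),
  AIC     = c(120.4, 118.9, 101.2, 99.7, 102.5, 100.3)
)

# Assumption: compare the best (lowest) AIC at each model size.
best  <- tapply(fits$AIC, fits$nParams, min)
sizes <- as.integer(names(best))

# Return the first size whose best AIC is higher than that of the
# previous size, or NA if the AIC never gets worse.
smartStop <- function(sizes, best) {
  worse <- which(diff(as.vector(best)) > 0)
  if (length(worse) == 0) return(NA_integer_)
  sizes[worse[1] + 1]
}

stopAt <- smartStop(sizes, best)
stopAt
```

     In this toy data the best 3-parameter model is worse than the
     best 2-parameter model, so fitting would stop after the
     3-parameter models.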

_V_a_l_u_e:

     A list of the 'topN' best (lowest AIC) models. This should be
     assigned to a variable to avoid large screen dumps. The HTML
     reports written as a side effect to 'results' are the main output
     and purpose of this function.
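
     The batch-to-batch carryover controlled by 'topN' can be sketched
     in base R as follows. The model ids, AIC values and the helper
     name 'carryOver' are hypothetical; the sketch only illustrates
     the idea of keeping the 'topN' lowest-AIC models while batches
     are processed one at a time, not the internals of 'ems'.

```r
# Keep the topN lowest-AIC models from the carried-over set plus a new batch.
carryOver <- function(best, batch, topN = 5) {
  pool <- rbind(best, batch)
  pool <- pool[order(pool$AIC), , drop = FALSE]
  head(pool, topN)
}

# Three made-up batches of fitted models.
batches <- list(
  data.frame(id = c("m1", "m2", "m3"), AIC = c(110.2, 98.4, 105.0),
             stringsAsFactors = FALSE),
  data.frame(id = c("m4", "m5", "m6"), AIC = c(97.1, 120.3, 99.9),
             stringsAsFactors = FALSE),
  data.frame(id = c("m7", "m8", "m9"), AIC = c(101.5, 96.0, 130.8),
             stringsAsFactors = FALSE)
)

# Only topN models plus one batch are held in memory at any time.
best <- NULL
for (b in batches) best <- carryOver(best, b, topN = 2)
best$id
```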

_N_o_t_e:

     Spur and grid graph models have network topologies that,
     respectively, radiate from the hub or can be overlaid on a
     city-block layout. Although head node spur graph edges can be
     superimposed in curtain rods (see 'ccems') to give these graphs a
     grid appearance, it is better to replace the curtain rod with a
     set of nested arches and call such spur-grid hybrids K equality
     graphs or simply hybrids (i.e. a term that is more tolerant than
     grid). Another option is to tolerate spur edges to head nodes in a
     broadened definition of the term grid. Advantages of the latter
     option include an emphasis on parallel edges, and thus on the
     equality aspects of the graph (compared to the term hybrid),
     greater compactness and better looks (compared to the term K
     equality), and usage inertia. Readers are thus asked to accept
     this broadened definition of the term grid, i.e. to allow head
     node spur edges in grid graphs.

     This work was supported by the National Cancer Institute
     (K25CA104791).

_A_u_t_h_o_r(_s):

     Tom Radivoyevitch (txr24@case.edu)

_R_e_f_e_r_e_n_c_e_s:

     Radivoyevitch, T. (2009) Automated model generation and selection
     methods for combinatorially complex biochemical equilibriums. (In
     preparation)

_S_e_e _A_l_s_o:

     'ccems', 'mkg'

_E_x_a_m_p_l_e_s:

     library(ccems)
     topology <- list(  
             heads=c("R1t0","R2t0"),  
             sites=list(       
                     s=list(                     # s-site    thread #
                             m=c("R1t1"),        # monomer      1
                             d=c("R2t1","R2t2")  # dimer        2
                     )
             )
     ) 
     g <- mkg(topology,TCC=TRUE) 
     data(RNR)
     d1 <- subset(RNR,(year==2001)&(fg==1)&(G==0)&(t>0),select=c(R,t,m,year))
     d2 <- subset(RNR,year==2006,select=c(R,t,m,year)) 
     dd <- rbind(d1,d2)
     names(dd)[1:2] <- paste(strsplit(g$id, split = "")[[1]], "T", sep = "") # e.g. to form "RT"
     rownames(dd) <- seq_len(nrow(dd)) # drop the large row names of the parent data frame
     # End sooner if maxnPs is reached; add chunks (i.e. increase n) if not.
     chnkPs <- list(size = 4, n = 1, maxnPs = 1, extend2maxP = TRUE)
     ## The next line can be commented out to speed up package check times.
     ## Not run: 
      
     top <- ems(dd,g,chunkParams=chnkPs)  # this takes roughly one minute 
     ## End(Not run)

