| Title: | DeFries-Fulker Analysis and Univariate Bootstrapping |
|---|---|
| Description: | Implements the Univariate Bootstrap and the Traditional (Naive) Bootstrap for resampling multivariate data while preserving covariance structure. Also provides functions for DeFries-Fulker behavioral genetics models, including the Rodgers-Kohler formulation with robust standard errors. |
| Authors: | Patrick O'Keefe [aut, cre] |
| Maintainer: | Patrick O'Keefe |
| License: | GPL-3 |
| Version: | 0.2.0 |
| Built: | 2026-05-31 08:07:30 UTC |
| Source: | https://github.com/cran/Omisc |
aboot
aboot(boot)
boot |
a vector of bootstrap resample statistics used to calculate the acceleration parameter. |
a vector of acceleration parameters for use in BCa bootstrap intervals
data<-DFSimulated()
boots<-NaiveBoot(data, groups="Rs", keepgroups=TRUE)
boots<-bootAnalysis(boots, cbind, DFanalysis, 1,2,3, robust=FALSE)
boots<-t(boots)
aboot(boots)
Calculates the acceleration estimate "a" from a vector of jackknife results, for use in BCa confidence intervals.
aCalc(X)
X |
A vector of jackknife results |
An estimate of a for use in BCa.
X<-rchisq(100,2)
aCalc(X)
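For orientation, the acceleration is commonly computed from leave-one-out estimates as the sum of cubed deviations from the jackknife mean, divided by six times the sum of squared deviations raised to the power 1.5. The sketch below assumes that standard formula. `accel_from_jackknife` is a hypothetical helper, not an Omisc function, and whether `aCalc` uses exactly this form is not confirmed by this page.

```r
# Hypothetical helper (not part of Omisc): acceleration from jackknife estimates,
# assuming the standard formula sum(d^3) / (6 * sum(d^2)^1.5).
accel_from_jackknife <- function(theta_jack) {
  d <- mean(theta_jack) - theta_jack   # deviations from the jackknife mean
  sum(d^3) / (6 * sum(d^2)^1.5)
}

set.seed(1)
x <- rchisq(100, 2)
# Leave-one-out estimates of the mean
theta_jack <- vapply(seq_along(x), function(i) mean(x[-i]), numeric(1))
a_hat <- accel_from_jackknife(theta_jack)
```

A symmetric set of jackknife estimates gives an acceleration of zero under this formula.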
add
add(x)
x |
a list to be summed. Useful for doing elementwise summation of a list of matrices. |
returns a single summed object (e.g., a matrix)
x<-list(matrix(c(1:4),nrow=2),matrix(c(1:4),nrow=2))
add(x)
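The same elementwise summation can be written in base R with `Reduce`. This is only a sketch of the idea and makes no claim about how `add` is implemented.

```r
# Elementwise sum of a list of equally sized matrices using base R only.
x <- list(matrix(1:4, nrow = 2), matrix(1:4, nrow = 2))
total <- Reduce(`+`, x)
```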
ajack
ajack(data, FUN, ...)
data |
data for which the acceleration parameter (a) is to be estimated |
FUN |
a function to be applied to the data |
... |
additional arguments passed to FUN |
a vector of acceleration parameters for use in BCa bootstrap intervals
data<-DFSimulated()
ajack(data,DFanalysis, betasonly=TRUE, robust=FALSE)
AllBootResults
AllBootResults(boot, lower = 0.025, upper = 0.975, data, FUN, ...)
boot |
A matrix of bootstrap results |
lower |
the lower alpha |
upper |
the upper alpha |
data |
the data used for analysis |
FUN |
the function used for analysis |
... |
additional arguments to pass to FUN |
a matrix of results. Includes the baseline results, all output from standardBootIntervals, and all results from BCa for both the jackknife and bootstrap acceleration methods. The bootstrap acceleration method is experimental.
data<-DFSimulated()
boots<-NaiveBoot(data, groups="Rs", keepgroups=TRUE)
boots<-bootAnalysis(boots, cbind, DFanalysis, 1,2,3, robust=FALSE)
AllBootResults(boots, .025,.975, data, DFanalysis, 1,2,3, robust=FALSE)
Gives just the beta weights from a linear model.
BarebonesBetas(data)
data |
Data to be analyzed. Dependent variable MUST BE THE FIRST VARIABLE. |
A vector of beta coefficients
Data<-TestData()
BarebonesBetas(Data)
BCa
BCa(Boot, data, alphalower = 0.025, alphaupper = 0.975, accelleration = "jack", FUN, ...)
Boot |
A vector of bootstrap estimates of Theta |
data |
The data that was analyzed via the bootstrap |
alphalower |
The lower alpha for CI creation |
alphaupper |
The upper alpha for CI creation |
accelleration |
can currently take two values, "jack" and "bootstrap". "jack" returns the jackknife estimate of the acceleration parameter. "bootstrap" is an experimental option that uses the bootstrap estimates in the calculation of the acceleration parameter. "bootstrap" is many times faster (approximately n times faster, where n is the number of observations). |
FUN |
The function used to get estimates of Theta |
... |
Additional arguments to FUN |
A matrix of BCa bootstrap CIs, the bias parameter and the acceleration parameter
data<-DFSimulated()
boots<-NaiveBoot(data, groups="Rs", keepgroups=TRUE)
boots<-bootAnalysis(boots, cbind, DFanalysis, 1,2,3, robust=FALSE)
BCa(boots, data, .025,.975, accelleration="bootstrap", DFanalysis, 1,2,3, robust=FALSE)
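As background, a BCa interval is a percentile interval whose quantile levels are shifted by the bias parameter z0 and the acceleration a. The sketch below assumes the textbook adjustment `pnorm(z0 + (z0 + z) / (1 - a * (z0 + z)))`. `bca_interval` is a hypothetical helper and is not claimed to match the internals of `BCa`.

```r
# Hypothetical helper (not part of Omisc): BCa-adjusted percentile interval
# for a single statistic, given z0 and a.
bca_interval <- function(boot, z0, a, lower = 0.025, upper = 0.975) {
  z <- qnorm(c(lower, upper))
  adj <- pnorm(z0 + (z0 + z) / (1 - a * (z0 + z)))  # adjusted quantile levels
  quantile(boot, probs = adj, names = FALSE)
}

set.seed(2)
boot <- rnorm(2000, mean = 1)
# With z0 = 0 and a = 0 the result reduces to the plain percentile interval.
ci_plain <- bca_interval(boot, z0 = 0, a = 0)
# Positive z0 and a shift both quantile levels upward.
ci_shift <- bca_interval(boot, z0 = 0.1, a = 0.02)
```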
bias
bias(boot, theta)
boot |
A vector of bootstrap estimates of theta |
theta |
the sample estimate of theta |
z0 the bias parameter for BCa CI
X<-data.frame(rnorm(1000))
theta<-mean(X)
boot<-NaiveBoot(X)
boot<-lapply(boot, mean)
boot<-do.call(rbind, boot)
bias(boot, theta)
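The bias parameter z0 is conventionally the standard normal quantile of the proportion of bootstrap estimates that fall below the sample estimate. The sketch below assumes that definition and uses base R only. `bias_z0` is a hypothetical name, and whether `bias` handles ties or matrices differently is not stated here.

```r
# Hypothetical helper (not part of Omisc): BCa bias parameter z0.
bias_z0 <- function(boot, theta) qnorm(mean(boot < theta))

set.seed(3)
x <- rnorm(1000)
theta <- mean(x)
# Bootstrap distribution of the mean, resampling x with replacement
boot <- replicate(1000, mean(sample(x, replace = TRUE)))
z0 <- bias_z0(boot, theta)
```

For a roughly symmetric bootstrap distribution, as here, z0 should be close to zero.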
bootAnalysis
bootAnalysis(boot, collapse, FUN, ...)
boot |
A list of bootstrap resamples from NaiveBoot or uniboot. |
collapse |
Should the results be collapsed from list form? Can take values of NULL, cbind or rbind. |
FUN |
The function to apply to the bootstrap resamples |
... |
additional arguments to be passed to FUN |
A list or matrix of results
data<-DFSimulated()
data<-doubleEnter(data[,1],data[,2],data[,3])
boots<-uniboot(data, 1000, "Rs", TRUE, .5, NULL)
results<-bootAnalysis(boots, cbind, FUN=DFanalysis, 1,2,3,TRUE,FALSE,FALSE,TRUE,FALSE)
bootsample
bootsample(data, size = 1)
data |
a dataset to be bootstrapped |
size |
the size of the bootstrap sample relative to the original sample |
a dataset
X<-TestData()
Y<-bootsample(X)
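A single naive bootstrap sample draws whole rows with replacement, so each observation keeps its own values on every variable. The sketch below is a base R illustration under that assumption. `boot_rows` is a hypothetical helper and does not reproduce `bootsample` itself.

```r
# Hypothetical helper (not part of Omisc): resample whole rows with replacement.
boot_rows <- function(data, size = 1) {
  n_out <- round(size * nrow(data))            # size is relative to nrow(data)
  idx <- sample.int(nrow(data), n_out, replace = TRUE)
  data[idx, , drop = FALSE]
}

set.seed(4)
d <- data.frame(X = rnorm(10), Y = rnorm(10))
resampled <- boot_rows(d)             # same number of rows as d
half <- boot_rows(d, size = 0.5)      # half as many rows
```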
cent
cent(X)
X |
vector to be centered |
Returns a centered vector
X<-c(1:10)
cent(X)
centerData
centerData(data)
data |
The data to be centered |
The centered data
X<-data.frame(X=c(1:4),Y=c(6:9))
centerData(X)
cholcors
cholcors(X)
X |
A matrix of data. |
Returns the Cholesky decomposition of the correlation matrix of the data.
X<-stats::rnorm(100)
Y<-stats::rnorm(100)+X
Z<-cbind(X,Y)
cholcors(Z)
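In base R the same quantity can be obtained by composing `cor` and `chol`. The sketch below only illustrates the operation described above. Whether `cholcors` returns the upper or the lower triangular factor is not stated on this page, and base `chol` returns the upper one.

```r
# Cholesky factor of a correlation matrix using base R.
set.seed(5)
X <- rnorm(100)
Y <- rnorm(100) + X
Z <- cbind(X, Y)
U <- chol(cor(Z))          # upper triangular factor
rebuilt <- crossprod(U)    # t(U) %*% U recovers cor(Z)
```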
DFanalysis
DFanalysis(data = NULL, proband, sibling, Rs, RK = TRUE, robust = TRUE, DE = TRUE, betasonly = FALSE, typicalSE = FALSE)
data |
A dataframe. This is not necessary as the variables can be passed directly via the other arguments. |
proband |
Called "proband" for historical reasons; this is the variable on the left-hand side of the regression. |
sibling |
The right hand side version of proband. This would be the matched sibling scores. |
Rs |
This is the vector of relatedness coefficients |
RK |
Use the Rodgers and Kohler simplified version of the DF model (recommended). Data should not be double entered prior to analysis. |
robust |
Use the Kohler and Rodgers robust standard errors (recommended when using double entered data) |
DE |
Will the data need to be double entered? |
betasonly |
If TRUE only the beta weights from the regression analysis will be returned. |
typicalSE |
Should the typical regression standard errors be used? Default is false. |
The results from MyLM
TwinData<-DFSimulated(2000,2000,.3,.3)
p<-TwinData[,1]
s<-TwinData[,2]
r<-TwinData[,3]
DFanalysis(data=NULL, p,s,r)
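To make the regression concrete, the sketch below fits a simplified DeFries-Fulker model with `lm` on simulated pairs. It assumes the Rodgers-Kohler form with double-entered, mean-centered scores, no intercept, the sibling score and its product with relatedness as predictors. The simulation (`sim_pairs`, an ACE model with unit variance) is hypothetical and stands in for `DFSimulated`, and none of this is claimed to match the internals of `DFanalysis`.

```r
set.seed(6)
n <- 20000; a2 <- 0.3; c2 <- 0.3; e2 <- 1 - a2 - c2

# Hypothetical simulator: n pairs with genetic relatedness R (1 = MZ, 0.5 = DZ).
sim_pairs <- function(n, R) {
  A_shared <- rnorm(n); C <- rnorm(n)
  A1 <- sqrt(R) * A_shared + sqrt(1 - R) * rnorm(n)
  A2 <- sqrt(R) * A_shared + sqrt(1 - R) * rnorm(n)
  data.frame(
    proband = sqrt(a2) * A1 + sqrt(c2) * C + sqrt(e2) * rnorm(n),
    sibling = sqrt(a2) * A2 + sqrt(c2) * C + sqrt(e2) * rnorm(n),
    Rs = R
  )
}
twins <- rbind(sim_pairs(n, 1), sim_pairs(n, 0.5))

# Double enter the pairs, then center both scores on the grand mean.
de <- rbind(twins,
            data.frame(proband = twins$sibling, sibling = twins$proband, Rs = twins$Rs))
m <- mean(de$proband)
de$p <- de$proband - m
de$s <- de$sibling - m

# No-intercept regression: coefficient on s estimates c2, on Rs * s estimates a2.
fit <- lm(p ~ 0 + s + I(Rs * s), data = de)
est <- coef(fit)
```

The default `lm` standard errors are too small here because every pair appears twice, which is the situation the robust option described above is meant for.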
DFSimulated
DFSimulated(MZ = 250, DZ = 250, a2 = 0.3, c2 = 0.3)
MZ |
Number of MZ twins to simulate |
DZ |
Number of DZ twins to simulate |
a2 |
Heritability (proportion of variance) |
c2 |
Shared environment (proportion of variance) |
A dataframe
TwinData<-DFSimulated(200,200,.3,.3)
DFSimulatedChisq
DFSimulatedChisq(MZ = 250, DZ = 250, a2 = 0.3, c2 = 0.3, df = 10)
MZ |
Number of MZ twins to simulate |
DZ |
Number of DZ twins to simulate |
a2 |
Heritability (proportion of variance) |
c2 |
Shared environment (proportion of variance) |
df |
Total degrees of freedom for the Chi-Square variable |
A dataframe of Chi-Square distributed outcome observations for MZ and DZ twins
TwinData<-DFSimulatedChisq(200,200,.3,.3, 10)
DoubleEnter
doubleEnter(proband, sibling, Rs)
proband |
The proband scores |
sibling |
The matched sibling scores |
Rs |
The relatedness coefficients |
A dataframe
X<-DFSimulated(10,10,.2,.2)
Y<-doubleEnter(X[,"proband"], X[,"sibling"], X[,"Rs"])
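Double entry means each pair appears twice, once with each member in the proband role. The sketch below stacks the swapped copy under the original. `double_enter` is a hypothetical helper, and the row order used by `doubleEnter` (stacked or interleaved) is an assumption here.

```r
# Hypothetical helper (not part of Omisc): stack each pair with roles swapped.
double_enter <- function(proband, sibling, Rs) {
  data.frame(
    proband = c(proband, sibling),
    sibling = c(sibling, proband),
    Rs = c(Rs, Rs)
  )
}

de <- double_enter(proband = c(1, 2), sibling = c(3, 4), Rs = c(1, 0.5))
```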
endparallel
endparallel(clust)
clust |
dummy variable so that the function executes |
NA
print("NA")
This is an implementation of the YHY bootstrap covariance matrix.
findSa(S, fitted, p, a = 0.5, df, n, tau = NULL, tol = 1e-07)
S |
Sample covariance matrix |
fitted |
The fitted covariance matrix |
p |
the number of columns in the covariance matrix |
a |
the starting value for the a parameter |
df |
the degrees of freedom in the model |
n |
the number of participants in the model |
tau |
the population tau. If no tau is provided, the estimated tau from the model will be used |
tol |
the difference between ga and tau at which the function will converge |
a list of the "a" adjusted covariance matrix, Sa, the tau, ga, and the number of iterations.
require(Omisc)
require(lavaan)
set.seed(2^7-1)
modelTest<-'
LV1=~ .7*x1+.8*x2+.75*x3+.6*x4
LV2=~ .7*y1+.8*y2+.75*y3+.6*y4
LV1~~.3*LV2
LV1~~1*LV1
LV2~~1*LV2
'
modelFit<-'
LV1=~ x1+x2+x3+x4
LV2=~ y1+y2+y3+y4
LV1~~start(.5)*LV2
LV1~~1*LV1
LV2~~1*LV2
'
testdata<-simulateData(modelTest, sample.nobs = 250)
fit<-cfa(modelFit, testdata)
fitted<-fitted(fit)$cov
fitted<-fitted[,1:ncol(fitted)]
S<-cov(testdata)
p<-8
a<-.5
n<-250
df<-21
findSa(S, fitted, p, .5, df, n)
HoffPseudoStandard
HoffPseudoStandard(betas, SDX, interceptvar)
betas |
A vector of betas from a multilevel model |
SDX |
A vector of the standard deviations of the X values, one for each X associated with the betas |
interceptvar |
A vector of the intercept variances at the level associated with the betas |
A vector of pseudostandardized coefficients
print("none")
jackknife
jackknife(data)
data |
The data to jackknife |
a list of jackknife datasets
data<-cbind(1:10,1:10)
result<-jackknife(data)
lapply(result,mean)
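A jackknife list holds one dataset per observation, each with that observation's row removed. The base R sketch below shows the idea. `jackknife_sets` is a hypothetical name and is not claimed to match the package's implementation.

```r
# Hypothetical helper (not part of Omisc): all leave-one-row-out datasets.
jackknife_sets <- function(data) {
  lapply(seq_len(nrow(data)), function(i) data[-i, , drop = FALSE])
}

data <- cbind(1:10, 1:10)
sets <- jackknife_sets(data)
```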
justBetas
justBetas(data, Y, X)
data |
A data frame |
Y |
The name or column number of the Y variable |
X |
The name(s) or column number(s) of the X variables |
A vector of unstandardized beta weights
X<-stats::rnorm(100)
Y<-stats::rnorm(100)+5*(X)
data<-cbind(Y,X)
justBetas(data,1,2)
#if you want an intercept
Y<-stats::rnorm(100)+5*(X)+5
data<-cbind(Y,X,1)
justBetas(data,1,c(2:3))
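Unstandardized regression weights without any other output can be obtained by solving the normal equations directly, which is why an intercept needs an explicit column of ones in the example above. The sketch below assumes that approach. `ols_betas` is a hypothetical helper, not the package function.

```r
# Hypothetical helper (not part of Omisc): OLS weights from the normal equations.
ols_betas <- function(data, Y, X) {
  y <- as.matrix(data[, Y, drop = FALSE])
  x <- as.matrix(data[, X, drop = FALSE])
  drop(solve(crossprod(x), crossprod(x, y)))   # solves (X'X) b = X'y
}

set.seed(7)
X <- rnorm(100)
Y <- rnorm(100) + 5 * X + 5
data <- cbind(Y, X, 1)            # third column of ones supplies the intercept
b <- ols_betas(data, 1, 2:3)      # slope first, then intercept
```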
leave1out
leave1out(x, data)
x |
Which row(s) of data to leave out |
data |
A dataframe or matrix. |
The reduced dataframe or matrix
data<-cbind(1:10,1:10)
leave1out(5,data)
MyLM
MyLM(Y, X, robust = FALSE, betasonly = FALSE, typicalSE = TRUE)
Y |
The Y variable |
X |
A matrix of X variables |
robust |
Should robust standard errors be calculated? Assumes a double entered twin dataset with twins evenly spaced in the dataset. |
betasonly |
Should only the betas be returned? Good for bootstrapping |
typicalSE |
Should the typical standard errors be included? Default is TRUE. Can be TRUE even when robust is TRUE. |
Returns a matrix of betas and standard errors
X<-DFSimulated(100,100,.4,.4)
Y<-RK(X[,1],X[,2],X[,3])
MyLM(Y[,1],Y[,c(2:3)],TRUE)
The Naive Bootstrap
NaiveBoot(data, B = 1000, groups = NULL, keepgroups = FALSE, size = 1)
data |
data to be bootstrapped |
B |
number of bootstrap samples to take |
groups |
grouping variable if there is one |
keepgroups |
keep the grouping variable? |
size |
size of the bootstrap resamples relative to the original sample |
a list of bootstrap resamples
X<-TestData()
Y<-NaiveBoot(X)
Parboot
Parboot(X, data, groups = NULL, keepgroups = FALSE, size = 1, HIcor = NULL)
X |
A dummy variable to make parLapply happy |
data |
The data frame to be resampled |
groups |
A grouping variable name |
keepgroups |
Should the grouping variable be kept in the final datasets? |
size |
The size of the bootstrap sample to be returned. Should be as a proportion and must be evenly divided into nrow(data). |
HIcor |
If hypothesis imposed correlations are to be used, this is where the HI correlation matrix goes. |
A list of bootstrap samples
#A single univariate bootstrap sample
X<-TestData()
Y<-Parboot(data=X)
parUniboot
parUniboot(data, B, clust, groups = NULL, keepgroups = FALSE, size = 1, HIcor = NULL, ...)
data |
data to be bootstrapped |
B |
the number of bootstrap replications |
clust |
The list of clusters to use. Should be initialized using startparallel() |
groups |
Groups to be independently bootstrapped |
keepgroups |
Should the grouping variable be kept in the final dataset? |
size |
Size of the bootstrap sample relative to the original sample |
HIcor |
A hypothesis imposed correlation matrix to be used. Default is NULL |
... |
additional arguments to be passed. Currently does nothing |
A list of dataframes of size (size*nrow(data))
#data<-TestData()
#clust<-startparallel("data")
#results<-parUniboot(data,1000,clust)
#endparallel(clust)
resample
resample(X, size)
X |
A vector to be resampled |
size |
The size of the resulting vector. Should be a number such that size*nrow(X) is a whole number |
A vector of resampled X values
X<-c(1:10)
resample(X,.5)
RK
RK(proband, sibling, Rs, DE = TRUE)
proband |
column name or number of the proband |
sibling |
column name or number of the siblings |
Rs |
column name or number of the relatedness coefficients |
DE |
Should the data be double entered? |
A dataframe
X<-DFSimulated(100,100,.3,.3)
Y<-RK(X[,1],X[,2],X[,3])
function for calculating the matrices for the Kohler Rodgers SE
Sfunc(X, e)
X |
A matrix of X variables |
e |
A matrix of error terms |
A matrix
print("Nah")
standardBootIntervals
standardBootIntervals(boot, lower = 0.025, upper = 0.975)
boot |
A vector of bootstrap results |
lower |
the lower alpha |
upper |
the upper alpha |
A matrix of the mean, median, min, max, lower and upper CI values
data<-DFSimulated()
boots<-NaiveBoot(data, groups="Rs", keepgroups=TRUE)
boots<-bootAnalysis(boots, cbind, DFanalysis,1,2,3,TRUE,FALSE,TRUE,TRUE,FALSE)
apply(boots,1, standardBootIntervals)
DFanalysis(data,1,2,3)
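The summary described above can be approximated in base R with `quantile` for the percentile limits. This is a sketch only. `boot_summary` is a hypothetical helper, and the exact layout returned by `standardBootIntervals` is not reproduced.

```r
# Hypothetical helper (not part of Omisc): summary statistics and percentile CI.
boot_summary <- function(boot, lower = 0.025, upper = 0.975) {
  c(mean = mean(boot), median = median(boot),
    min = min(boot), max = max(boot),
    lower = unname(quantile(boot, lower)),
    upper = unname(quantile(boot, upper)))
}

set.seed(8)
boot <- rnorm(1000, mean = 10, sd = 2)   # stand-in for bootstrap estimates
s <- boot_summary(boot)
```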
startparallel
startparallel(data)
data |
data to pass to the clusters. Must be the name of the data in quotes |
NA
#data<-TestData()
#clust<-startparallel("data")
#endparallel(clust)
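The requirement that the data be named in quotes resembles how the base `parallel` package exports objects to workers. The sketch below shows that general pattern with `makeCluster`, `clusterExport`, `parLapply` and `stopCluster`. It is an assumption that `startparallel` and `endparallel` wrap something similar, and the two-worker cluster is an arbitrary choice.

```r
library(parallel)

data <- data.frame(X = rnorm(50), Y = rnorm(50))

clust <- makeCluster(2)                                   # start two workers
clusterExport(clust, "data", envir = environment())       # export by quoted name

# Each worker task draws one naive bootstrap sample of the exported data.
boots <- parLapply(clust, 1:20, function(i) {
  data[sample(nrow(data), replace = TRUE), ]
})

stopCluster(clust)                                        # always release workers
```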
Simple function for creating a dataset of two related variables.
TestData(nobs = 1000, intercept = 0, beta = 5, meanX = 0, sdX = 1, sdYerr = 1)
nobs |
Number of observations, defaults to 1000 |
intercept |
Intercept of the regression. Defaults to 0 |
beta |
Beta for the regression equation, defaults to 5 |
meanX |
Mean of X, defaults to 0 |
sdX |
Standard deviation of X, defaults to 1 |
sdYerr |
Variance of the error term of Y, defaults to 1 |
A dataframe with an X and Y variable produced by the entered parameters
X<-TestData()
WARNING: This function can't be used with data that have already been passed through the RK function, because the correlation matrix will not be positive definite.
uniboot(data, B = 1000, groups = NULL, keepgroups = FALSE, size = 1, HIcor = NULL, sampleframe = "group")
data |
The data frame to be resampled |
B |
The number of bootstrap samples. Alternatively "sampleframe" which will return the univariate sampling frame. "sampleframe" is not advised when there are many observations and/or many variables as the returned dataframe will be quite large. |
groups |
A grouping variable name |
keepgroups |
Should the grouping variable be kept in the final datasets? |
size |
The size of the bootstrap sample to be returned. Should be as a proportion and must be evenly divided into nrow(data). |
HIcor |
If a hypothesis imposed correlation matrix is to be used, this argument takes a list of hypothesized correlation matrices. IT MUST BE A LIST OF ONE OR MORE MATRICES. Multiple matrices can be entered in the case of grouped data (one for each group). If the nil-null correlation is to be used an identity matrix can be entered here (the same size as the appropriate correlation matrix). |
sampleframe |
Takes one of either "group" or "whole". When doing bootstrapping of grouped data this tells uniboot if the whole sample should be used as the sampling frame for each group (whole), or not (group). "group" should be used unless it is believed that all groups share the same underlying marginal distribution for each variable (e.g., the same mean and variance in the case of normally distributed data). |
A list of bootstrap samples
data<-TestData()
X<-uniboot(data,1000)
unibootsample
unibootsample(data, size)
data |
A dataframe or matrix to be univariately bootstrapped |
size |
size of each bootstrap sample as a fraction of the total sample size. Total sample size must be evenly divisible by "size". |
A matrix or dataframe with nrow=nrow(data)*size
X<-c(0:9)
Y<-c(20:29)
Z<-cbind(X,Y)
unibootsample(Z,1)
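As a rough picture of the univariate bootstrap, each variable is resampled independently, which removes the association between columns, and a target correlation is then imposed with a Cholesky factor. The sketch below assumes that general recipe and returns standardized scores. `uni_boot_sketch` is a hypothetical helper, and the package's handling of scaling, groups and `size` is not reproduced.

```r
# Hypothetical helper (not part of Omisc): one univariate bootstrap sample
# on the standardized scale, with correlation matrix R imposed.
uni_boot_sketch <- function(data, R = cor(data)) {
  data <- as.matrix(data)
  n <- nrow(data)
  # Resample every standardized column on its own (columns become uncorrelated)
  z <- apply(scale(data), 2, function(col) sample(col, n, replace = TRUE))
  # Post-multiplying by the upper Cholesky factor imposes correlation R
  z %*% chol(R)
}

set.seed(9)
X <- rnorm(2000)
Y <- X + rnorm(2000)
Z <- cbind(X, Y)
boot <- uni_boot_sketch(Z)
```

Passing an identity matrix as `R` would correspond to the nil-null case, where no correlation is imposed.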
unibootVar
unibootVar(X, times)
X |
The variable |
times |
The number of times the variable is repeated in the univariate sampling frame. This is going to be equal to the number of variables being univariately sampled |
The variance of the variable in the univariate sampling frame
X<-c(1,2)
times<-100
unibootVar(X,times)
var(X)
zScore
zScore(X, times)
X |
vector to be converted to z scores |
times |
exponent controlling the denominator scaling |
Returns a vector of z scores
X<-c(1:10)
zScore(X, times=1)
zScoreData
zScoreData(data)
data |
The data to be converted to z scores |
Data converted to z scores
X<-data.frame(X=c(1:4),Y=c(6:9))
zScoreData(X)