--- title: "Overview of `simcdm`" author: "James Joseph Balamuta" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Overview of simcdm} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- # Overview Within this document, we highlight the different features of the `simcdm` package as it relates to simulating cognitive diagnostic modeling data. ## Notation For consistency, we aim to use the following notation. Denoting individuals: - $N$ is the total number of individuals taking the assessment. - $i$ is the current individual. Denoting items: - $J$ is the total number of items on the assessment. - $j$ is the current item - $Y_{ij}$ is the observed binary response for individual $i$ ($1\leq i \leq N$) to item $j$ ($1\leq j\leq J$). - $s_j$ is the probability of slipping on item $j$. - $g_j$ is the probability of guessing on item $j$. Denoting attributes: - $K$ is the total number of attributes for the assessment item. - $k$ is the current attribute. - $\boldsymbol\alpha_i=\left(\alpha_{i1},\dots,\alpha_{iK}\right)^\prime$ where $\boldsymbol\alpha_i\in \left\{0,1\right\}^K$ and $\alpha_{ik}$ is the latent binary attribute for individual $i$ on attribute $k$ ($1\leq k\leq K$). Denoting the skill/attribute "Q" matrix: - $\boldsymbol q_{j}=\left(q_{j1},\dots,q_{jK}\right)^\prime$ be the $j$th row of $\boldsymbol Q$ such that $q_{jk}=1$ if attribute $k$ is required for item $j$ and zero otherwise. # Usage To use `simcdm`, please load the package. ```{r load-r-pkg} library(simcdm) ``` ## Matrix Simulation Simulations within this section are done underneath the following settings. ```{r setup-matrix-sims} # Set a seed for reproducibility set.seed(888) # Setup Parameters N = 15 # Number of Examinees / Subjects J = 10 # Number of Items K = 2 # Number of Skills / Attributes ``` ### Identifiable Q Matrix Simulation Simulate an identifiable $Q$ matrix ($J$ items by $K$ skills). ```{r sim-q-matrix} # Set a seed for reproducibility set.seed(1512) # Simulate an identifiable Q matrix Q = sim_q_matrix(J, K) Q ``` ### $\eta$ Matrix Simulation Create the ideal response matrix for each trait ($J$ items by $2^K$ latent classes). ```{r sim-eta-matrix} # Set a seed for reproducibility set.seed(4421) # Simulate an eta matrix eta = sim_eta_matrix(K, J, Q) eta ``` ### Attribute profile simulation Generate latent attribute profile classes ($2^K$ latent classes by $K$ skills). ```{r attribute-classes-gen} # Create a listing of all attribute classes class_alphas = attribute_classes(K) class_alphas ``` Generate latent attribute profile class for each subject ($N$ subjects by $K$ skills). ```{r sim-subject-attributes} # Set a seed for reproducibility set.seed(5126) # Create attributes for a subject subject_alphas = sim_subject_attributes(N, K) subject_alphas # Equivalent to: # subject_alphas = class_alphas[sample(2 ^ K, N, replace = TRUE),] ``` ### DINA Simulation Simulations within this section are done underneath the following settings. ```{r setup-sim-dina} # Set a seed for reproducibility set.seed(888) # Setup Parameters N = 15 # Number of Examinees / Subjects J = 10 # Number of Items K = 2 # Number of Skills / Attributes # Assign slipping and guessing values for each item ss = gs = rep(.2, J) # Simulate identifiable Q matrix Q = sim_q_matrix(J, K) # Simulate subject attributes subject_alphas = sim_subject_attributes(N, K) ``` ### DINA Item Simulation Simulate item data, $Y$, under DINA model ($N$ by $J$) ```{r sim-dina-items} # Set a seed for reproducibility set.seed(2019) # Simulate items under the DINA model items_dina = sim_dina_items(subject_alphas, Q, ss, gs) items_dina ``` ### DINA Attribute Simulation Simulate attribute data under DINA model ($N$ by $J$) ```{r sim-dina-attributes} # Set a seed for reproducibility set.seed(51823) # Simulate attributes under the DINA model attributes = sim_dina_attributes(subject_alphas, Q) attributes ``` ## rRUM Simulation The rRUM simulations are done using the following settings. ```{r rrum-sim-setup} # Set a seed for reproducibility set.seed(888) # Setup Parameters N = 15 # Number of Examinees / Subjects J = 10 # Number of Items K = 2 # Number of Skills / Attributes # The probabilities of answering each item correctly for individuals # who do not lack any required attribute pistar = rep(.9, J) # Simulate an identifiable Q matrix Q = sim_q_matrix(J, K) # Penalties for failing to have each of the required attributes rstar = .5 * Q # Latent Class Probabilities pis = c(.1, .2, .3, .4) # Generate latent attribute profile with custom probability (N subjects by K skills) subject_alphas = sim_subject_attributes(N, K, prob = pis) # Equivalent to: # class_alphas = attribute_classes(K) # subject_alphas = class_alphas[sample(2 ^ K, N, replace = TRUE, prob = pis),] ``` ### Simulate rRUM items Simulate rRUM item data $Y$ ($N$ by $J$) ```{r sim-rrum} # Set a seed for reproducibility set.seed(912) # Generate rRUM items rrum_items = sim_rrum_items(Q, rstar, pistar, subject_alphas) rrum_items ```