# Cohort simulations

This repository contains all code files related to our ROSALI/Residuals cohort study simulation project. To save disk space, data files are not stored on this server and are instead available on https://osf.io.

## File Structure

```
📦 simul_these
├─ catalogue.md - List and description of scenarios
├─ 🗂️ Analysis - ANALYSIS RESULTS
├─ 🗂️ Data - GENERATED DATASETS
│  ├─ 🗂️ DIF - DATASETS WITH DIF
│  └─ 🗂️ noDIF - DATASETS WITHOUT DIF
├─ 🗂️ Modules - R AND STATA MODULES
│  └─ 🗂️ rosali_custom - DATASETS WITH DIF
├─ 🗂️ RProject - R SCRIPTS FOR VARIOUS TASKS
└─ 🗂️ Scripts - R AND STATA SCRIPTS
   ├─ 🗂️ Analysis - PCM ANALYSIS SCRIPTS
   └─ 🗂️ Data_generation - SIMULATION SCENARIO SCRIPTS
      ├─ 🗂️ DIF
      └─ 🗂️ noDIF
```

## Naming conventions

### Initial Datasets

**XXX_N** - Scenario XXX / N individuals per group

### Analyzed Datasets

**noDIF / XXX_N.csv** - Analysis for scenario XXX_N by PCM __without__ accounting for DIF, with confounding accounted for

**DIF / XXX_N.xls** - Analysis for scenario XXX_N by PCM __with__ DIF and confounding accounted for

**noDIF_prop / XXX_N.csv** - Analysis for scenario XXX_N by PCM __without__ accounting for DIF, with confounding accounted for by propensity score

**DIF_prop / XXX_N.xls** - Analysis for scenario XXX_N by PCM __with__ DIF accounted for and confounding accounted for by propensity score

**ROSALI-DIF_prop / XXX_N_original.xls** - Analysis for scenario XXX_N by PCM __with__ DIF accounted for after detection by ROSALI and confounding accounted for by propensity score

**RESIDUALS_prop / XXX_N_original.xls** - Analysis for scenario XXX_N by PCM __with__ DIF accounted for after detection by Andrich & Hagquist's residuals method and confounding accounted for by propensity score

## Reproduction - TO BE MODIFIED

1. Run **/Scripts/Data_generation/NoDIF/scenarios_noDIF_baseline.do** to simulate data without DIF
2. Run the files in 🗂️ **/Scripts/Data_generation/DIF/** to simulate data with DIF
3. Run **/RProject/Scripts/Analysis/pcm_nodif.R** to analyze the data without accounting for DIF
4. Run the files in 🗂️ **/Scripts/Analysis/DIF/** to analyze the data while accounting for DIF
5. Run **/Scripts/Analysis/DIF-ROSALI/pcm_dif_rosali.do** to analyze the data after accounting for DIF as detected by ROSALI
6. Run **/RProject/Scripts/Analysis/resali_analysis.R** to perform residuals DIF detection and prepare the data for PCM analysis
7. Run **/Scripts/Analysis/DIF-RESIDUALS/pcm_dif_residus.do** to analyze the data after accounting for DIF as detected by the residuals method
8. Run **/RProject/Scripts/Analysis/aggregation.R** to compile and visualize the results

**OR**

1. [BASH ONLY] Run **prepare_file_structure.sh** (only on first run) and **autorun.sh** (by default, this will take multiple weeks to run; modify it to run in parallel if necessary)
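
For the parallel option mentioned above, the following is a minimal sketch of one possible approach, not the contents of **autorun.sh**. The helper name `run_parallel` is hypothetical, and the sketch assumes that the `.do` files in a directory are independent of one another and can safely run at the same time.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): run a command on
# every .do file in a directory, in batches of at most JOBS at a time.
# Usage: run_parallel DIR JOBS COMMAND [ARGS...]
run_parallel() {
  local dir="$1" jobs="$2"
  shift 2                      # the remaining arguments form the command
  local f running=0
  for f in "$dir"/*.do; do
    [ -e "$f" ] || continue    # skip the literal pattern if nothing matches
    "$@" "$f" &                # launch the command on this file in the background
    running=$((running + 1))
    if [ "$running" -ge "$jobs" ]; then
      wait                     # let the current batch finish before starting more
      running=0
    fi
  done
  wait                         # wait for the last, possibly partial, batch
}
```

As a dry run, `run_parallel Scripts/Data_generation/DIF 4 echo` only prints the files that would be processed. Replacing `echo` with `stata -b do` would launch them in Stata's batch mode, assuming a `stata` executable is on the `PATH` (the executable name depends on the installed Stata flavor).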