Cohort simulations

This repository contains all code files for our ROSALI/Residuals cohort study simulation project. To save disk space, data files are not stored on this server and are instead available on https://osf.io.

File Structure

📦 simul_these
├─ catalogue.md           - List and description of scenarios
├─ 🗂️ Analysis               - ANALYSIS RESULTS
├─ 🗂️ Data                   - GENERATED DATASETS
│  ├─ 🗂️ DIF                 - DATASETS WITH DIF
│  └─ 🗂️ noDIF               - DATASETS WITHOUT DIF
├─ 🗂️ Modules                - R AND STATA MODULES
│  └─ 🗂️ rosali_custom       - CUSTOM ROSALI MODULE
├─ 🗂️ RProject               - R SCRIPTS FOR VARIOUS TASKS
└─ 🗂️ Scripts                - R AND STATA SCRIPTS
   ├─ 🗂️ Analysis            - PCM ANALYSIS SCRIPTS
   └─ 🗂 Data_generation     - SIMULATION SCENARIO SCRIPTS
      ├─ 🗂️ DIF
      └─ 🗂️ noDIF
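Because the data files are not stored with the code, the directories in the tree above may need to be created before any script is run. The following is a minimal sketch of one way to do that. `make_layout` is a hypothetical helper, not the contents of prepare_file_structure.sh, which may behave differently.

```shell
# Create the directory layout shown in the tree above under a given root.
# make_layout is a hypothetical helper; the repository's own
# prepare_file_structure.sh may do more (or something different).
make_layout() {
  root="$1"
  mkdir -p \
    "$root/Analysis" \
    "$root/Data/DIF" \
    "$root/Data/noDIF" \
    "$root/Modules/rosali_custom" \
    "$root/RProject" \
    "$root/Scripts/Analysis" \
    "$root/Scripts/Data_generation/DIF" \
    "$root/Scripts/Data_generation/noDIF"
}

# Try it in a throwaway directory first, then list what was created.
tmp_root="$(mktemp -d)"
make_layout "$tmp_root/simul_these"
find "$tmp_root/simul_these" -type d | sort
```

`mkdir -p` leaves existing directories untouched, so the function is safe to call again on a partially populated checkout.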

Naming conventions

Initial Datasets

XXX_N - Scenario XXX / N individuals per group

Analyzed Datasets

noDIF / XXX_N.csv - Analysis for scenario XXX_N by PCM, without accounting for DIF, with confounding accounted for

DIF / XXX_N.xls - Analysis for scenario XXX_N by PCM, with DIF and confounding accounted for

noDIF_prop / XXX_N.csv - Analysis for scenario XXX_N by PCM, without accounting for DIF, with confounding accounted for by propensity score

DIF_prop / XXX_N.xls - Analysis for scenario XXX_N by PCM, with DIF accounted for and confounding accounted for by propensity score

ROSALI-DIF_prop / XXX_N_original.xls - Analysis for scenario XXX_N by PCM, with DIF accounted for after detection by ROSALI and confounding accounted for by propensity score

RESIDUALS_prop / XXX_N_original.xls - Analysis for scenario XXX_N by PCM, with DIF accounted for after detection by Andrich & Hagquist's residuals method and confounding accounted for by propensity score
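The naming scheme above can be sketched as a small lookup from analysis method and scenario name to result file. This assumes the part before the slash is a folder and the part after it is the file name. `result_file` is a hypothetical helper and is not part of the repository.

```shell
# Map an analysis method and a scenario name to the result file
# described in the naming conventions. result_file is hypothetical.
result_file() {
  method="$1"    # noDIF, DIF, noDIF_prop, DIF_prop, ROSALI-DIF_prop or RESIDUALS_prop
  scenario="$2"  # XXX_N: scenario XXX, N individuals per group
  case "$method" in
    noDIF|noDIF_prop)
      printf '%s/%s.csv\n' "$method" "$scenario" ;;
    DIF|DIF_prop)
      printf '%s/%s.xls\n' "$method" "$scenario" ;;
    ROSALI-DIF_prop|RESIDUALS_prop)
      printf '%s/%s_original.xls\n' "$method" "$scenario" ;;
    *)
      echo "unknown method: $method" >&2
      return 1 ;;
  esac
}

# The placeholder XXX_N is used literally here.
result_file noDIF XXX_N
result_file RESIDUALS_prop XXX_N
```

The `case` branches mirror the three patterns in the list: `.csv` for the analyses without DIF, `.xls` for those with DIF, and `_original.xls` for the two detection-based methods.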

Reproduction - TO BE MODIFIED

  1. Run /Scripts/Data_generation/noDIF/scenarios_noDIF_baseline.do to simulate data without DIF
  2. Run files in 🗂️ /Scripts/Data_generation/DIF/ to simulate DIF data
  3. Run /RProject/Scripts/Analysis/pcm_nodif.R to analyze without accounting for DIF
  4. Run files in 🗂️ /Scripts/Analysis/DIF/ to analyze while accounting for DIF
  5. Run /Scripts/Analysis/DIF-ROSALI/pcm_dif_rosali.do to analyze data after accounting for DIF as detected by ROSALI
  6. Run /RProject/Scripts/Analysis/resali_analysis.R to perform residuals-based DIF detection and prepare data for PCM analysis
  7. Run /Scripts/Analysis/DIF-RESIDUALS/pcm_dif_residus.do to analyze data after accounting for DIF as detected by the residuals method
  8. Run /RProject/Scripts/Analysis/aggregation.R to compile and visualize results

OR

  1. [BASH ONLY] Run prepare_file_structure.sh (first run only), then autorun.sh. By default, the full run takes multiple weeks; modify the script to run in parallel if necessary.
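One possible way to parallelize is to hand the script list to `xargs -P`, which caps the number of concurrent processes. The sketch below is an assumption about how this could be done, not a description of what autorun.sh contains. `run_parallel`, the file names `a.do` and `b.do`, and the `stata -b do` batch command are illustrative and should be adapted to the local Stata executable.

```shell
# Run one command per input line, with at most $1 jobs at a time.
# run_parallel is a hypothetical helper, not part of autorun.sh.
run_parallel() {
  jobs="$1"
  shift
  # -P sets the maximum number of concurrent processes (GNU and BSD xargs);
  # -n 1 passes exactly one script path to each invocation.
  xargs -P "$jobs" -n 1 "$@"
}

# Dry run: print the commands instead of launching Stata.
# a.do and b.do are placeholder names; drop the leading `echo`
# once the printed commands look right.
printf '%s\n' a.do b.do | run_parallel 2 echo stata -b do
```

Jobs may finish in any order, so this only suits steps whose scripts are independent of each other, such as separate scenario files within a single step. The numbered steps themselves still have to run in sequence.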