Introduction to effectcheck

Important notice

This package is under active development and has not been fully validated. Results should be independently verified before use in any consequential context. Use is at your sole responsibility. Corrections, verification reports, and contributions are welcome at https://github.com/giladfeldman/escicheck or by contacting Gilad Feldman (giladfel@gmail.com).

Overview

effectcheck is a conservative, assumption-aware statistical consistency checker for APA-style research results. It parses test statistics from text, PDF, HTML, and Word documents, recomputes effect sizes and p-values, and flags discrepancies.

Key design principles:

Quick start

library(effectcheck)
#> ===========================================================================
#>   effectcheck v0.2.3 - DEVELOPMENT VERSION
#> ===========================================================================
#>   WARNING: This package is under heavy development and has NOT yet been
#>   fully validated. Results should be independently verified before use in
#>   any consequential context. Use is at your sole responsibility.
#> 
#>   Report issues & help verify: https://github.com/giladfeldman/escicheck
#>   Contact: Gilad Feldman <giladfel@gmail.com>
#> ===========================================================================

# Check a single APA-style result
result <- check_text("t(28) = 2.21, p = .035, d = 0.80")
print(result)
#> 
#> === EffectCheck Results ===
#> 
#> Total statistics found: 1
#>   PASS: 1 | WARN: 0 | ERROR: 0 | INSUFFICIENT: 0
#> 
#> Test types: t(1)
#> Uncertainty: high(1)
#> 
#> Use print(x, short = FALSE) for detailed results.
#> Use summary(x) for comprehensive statistics.

The output shows whether the reported effect size is consistent with the value effectcheck recomputes from the test statistic and degrees of freedom. Here the single detected statistic is classified as PASS.
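The kind of recomputation involved can be sketched in base R. This is a minimal illustration, not effectcheck's own code: it assumes an independent-samples design with equal group sizes (so N = df + 2), and the variable names are made up for the example.

```r
# Reported values from the example: t(28) = 2.21, p = .035, d = 0.80
t_val      <- 2.21
df_val     <- 28
d_reported <- 0.80

# Two-tailed p-value from the t distribution
p_recomputed <- 2 * pt(abs(t_val), df_val, lower.tail = FALSE)

# Assumption: two independent groups of (near-)equal size, N = df + 2
n_total <- df_val + 2
n1 <- floor(n_total / 2)
n2 <- n_total - n1
d_recomputed <- t_val * sqrt(1 / n1 + 1 / n2)

round(p_recomputed, 3)            # 0.035
round(d_recomputed, 3)            # 0.807
abs(d_recomputed - d_reported)    # difference between reported and recomputed d
```

Under these assumptions the recomputed p-value agrees with the reported one to three decimals, and the recomputed d differs from the reported 0.80 by less than 0.01.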

Checking multiple statistics

effectcheck handles multiple statistics in the same text:

text <- "
Study 1 found a significant effect, t(45) = 3.12, p = .003, d = 0.91.
The ANOVA revealed a main effect, F(2, 87) = 5.44, p = .006.
The correlation was significant, r(48) = .42, p = .003.
"

results <- check_text(text)
summary(results)
#> 
#> ========================================
#>      EffectCheck Summary Report
#> ========================================
#> 
#> Version: 0.2.3
#> Generated: 2026-03-20 17:41:01.777336
#> 
#> Total statistics analyzed: 3
#> 
#> --- Status Distribution ---
#>   OK                 2 ( 66.7%) ####################
#>   PASS               1 ( 33.3%) ##########
#> 
#> --- Test Types ---
#>   F             1
#>   r             1
#>   t             1
#> 
#> --- Uncertainty Levels ---
#>   high          1
#>   low           1
#>   medium        1
#> 
#> --- Design Inferred ---
#>   independent        1
#>   unclear            2
#> 
#> --- Key Metrics ---
#>   Error rate:             0.0%
#>   Decision error rate:    0.0% (0 cases)
#>   Mean effect delta:    0.0004
#>   Median effect delta:  0.0004
#>   Max effect delta:     0.0004
#> 
#> ========================================
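As a rough illustration of how several statistics can be located in one passage, the sketch below uses a single base R regular expression to pull the t, F, and r results out of the text above. The pattern and the `found` data frame are assumptions made for this example; effectcheck's own parser covers more test types and formats and may work quite differently.

```r
text <- "
Study 1 found a significant effect, t(45) = 3.12, p = .003, d = 0.91.
The ANOVA revealed a main effect, F(2, 87) = 5.44, p = .006.
The correlation was significant, r(48) = .42, p = .003.
"

# Hypothetical pattern: test letter, degrees of freedom in parentheses,
# an equals sign, then the test statistic (leading zero optional).
pattern <- "\\b([tFr])\\(([0-9., ]+)\\)[[:space:]]*=[[:space:]]*(-?[0-9]*\\.?[0-9]+)"

# Find every match, then split each match into its capture groups
hits  <- regmatches(text, gregexpr(pattern, text))[[1]]
parts <- regmatches(hits, regexec(pattern, hits))

found <- data.frame(
  test  = vapply(parts, `[`, character(1), 2),
  df    = vapply(parts, `[`, character(1), 3),
  value = as.numeric(vapply(parts, `[`, character(1), 4)),
  stringsAsFactors = FALSE
)
found
```

Each row of `found` corresponds to one detected statistic, which mirrors the one-row-per-statistic layout of the results object.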

Understanding the output

Each row in the results represents one detected statistic. Key columns:

Working with results

effectcheck returns S3 objects that come with helper functions for filtering and tabulating results:

# Filter by status
errors <- get_errors(results)
warnings <- get_warnings(results)

# Filter by test type
t_tests <- filter_by_test_type(results, "t")

# Get counts
count_by(results, "test_type")
#>   test_type count percent
#> 1         F     1    33.3
#> 2         r     1    33.3
#> 3         t     1    33.3

Examining variants

When effectcheck cannot determine the study design, it computes the effect size under several candidate designs (variants), such as independent samples and paired samples. You can inspect these:

result <- check_text("t(28) = 2.21, p = .035, d = 0.80")

# See all computed variants for the first result
cat(format_variants(result, 1))
#> Same-Type Variants:
#>   d_ind_equalN = 0.807 (Independent samples, equal group sizes assumed)
#>   d_ind_min = 2.248 (Independent samples, extreme imbalance (1 vs N-1))
#>   d_ind_max = 2.248 (Independent samples, extreme imbalance (N-1 vs 1))
#>   dz = 0.4104 (Paired/within-subjects design)
#>   dav = 0.4104 (Paired design, average SD method)
#>   drm = 0.4104 (Repeated measures design)
#> 
#> Alternative Suggestions:
#>   g_ind = 0.7852 [Hedges' g]
#>       Bias-corrected version of d, recommended for small samples
#>   r = 0.3854 [Correlation (r)]
#>       Alternative way to express t-test effect size, useful for meta-analysis
#>   dz = 0.4104 [Cohen's dz]
#>       If this is actually a paired/within-subjects design

# Get metadata about a specific variant type
get_variant_metadata("d_ind")
#> $name
#> [1] "Cohen's d (independent)"
#> 
#> $assumptions
#> [1] "Independent samples, known group sizes"
#> 
#> $when_to_use
#> [1] "Between-subjects design with known n1 and n2"
#> 
#> $formula
#> [1] "d = t * sqrt(1/n1 + 1/n2)"
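Several of the variant values listed for t(28) = 2.21 can be reproduced with standard textbook conversions. The sketch below is an assumption about which formulas are involved, not a copy of effectcheck's implementation; it uses the d formula quoted above, the usual t-to-dz and t-to-r conversions, and the common approximate Hedges correction. Variable names are illustrative.

```r
t_val  <- 2.21
df_val <- 28

# Independent-samples readings: total N = df + 2
n_total   <- df_val + 2
d_equal   <- t_val * sqrt(1 / (n_total / 2) + 1 / (n_total / 2))  # 15 vs 15
d_extreme <- t_val * sqrt(1 / 1 + 1 / (n_total - 1))              # 1 vs 29

# Paired reading: n = df + 1 pairs
dz <- t_val / sqrt(df_val + 1)

# Alternative effect sizes
g_ind <- d_equal * (1 - 3 / (4 * df_val - 1))  # approximate Hedges correction
r_es  <- t_val / sqrt(t_val^2 + df_val)        # t converted to r

round(c(d_ind_equalN = d_equal, d_ind_extreme = d_extreme,
        dz = dz, g_ind = g_ind, r = r_es), 4)
```

These come out at roughly 0.807, 2.248, 0.410, 0.785, and 0.385, in line with the corresponding entries in the format_variants() output. The spread between 0.41 and 2.25 shows why the assumed design matters when judging whether a reported d is plausible.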

Checking files

To check PDF, HTML, DOCX, or plain-text files, pass file paths instead of a text string:

# Single file
result <- check_file("manuscript.pdf")

# Multiple files
results <- check_files(c("study1.pdf", "study2.html"))

# Entire directory
results <- check_dir("manuscripts/")
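To try file checking without a real manuscript at hand, one option is to write a small plain-text file to a temporary directory first. The sketch below assumes that a .txt extension is treated as plain text; the file name is made up for the example, and the effectcheck call is skipped if the package is not installed.

```r
# Write a one-line plain-text "manuscript" to a temporary file
path <- file.path(tempdir(), "example_results.txt")
writeLines("Study 1 found an effect, t(28) = 2.21, p = .035, d = 0.80.", path)

# Only call effectcheck if it is installed
if (requireNamespace("effectcheck", quietly = TRUE)) {
  result <- effectcheck::check_file(path)
  print(result)
}
```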

Exporting results

results <- check_text("t(28) = 2.21, p = .035, d = 0.80")

# HTML report
generate_report(results, out = file.path(tempdir(), "report.html"))

# CSV/JSON export
export_csv(results, out = file.path(tempdir(), "results.csv"))
export_json(results, out = file.path(tempdir(), "results.json"))

Comparison with statcheck

If the statcheck package is installed, you can run both tools side-by-side:

if (requireNamespace("statcheck", quietly = TRUE)) {
  comp <- compare_with_statcheck("t(28) = 2.21, p = .035, d = 0.80")
  print(comp)
}

Status thresholds

effectcheck uses three status levels, based on the difference (delta) between the reported and the recomputed value:

Status  Meaning
PASS    Delta within tolerance (or APA rounding match)
WARN    Delta within 3x tolerance, or ambiguous match
ERROR   Delta exceeds 5x tolerance
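The thresholds in the table can be read as a simple decision rule. The helper below is hypothetical (classify_delta() is not an effectcheck function), the tolerance of 0.02 is an arbitrary value chosen for the example, and the "ambiguous match" case is not modelled. The table does not say how a delta between 3x and 5x tolerance is treated, so the sketch returns NA for that band rather than guessing.

```r
# Hypothetical helper, not part of effectcheck
classify_delta <- function(reported, computed, tol, digits = 2) {
  delta <- abs(reported - computed)

  # APA rounding match: the computed value rounds to the reported one
  rounding_match <- isTRUE(all.equal(round(computed, digits), reported))

  if (delta <= tol || rounding_match) return("PASS")
  if (delta <= 3 * tol) return("WARN")
  if (delta > 5 * tol) return("ERROR")
  NA_character_  # between 3x and 5x tolerance: not specified in the table
}

classify_delta(0.91, 0.9104, tol = 0.02)  # "PASS"
classify_delta(0.80, 0.85,   tol = 0.02)  # "WARN"
classify_delta(0.80, 0.95,   tol = 0.02)  # "ERROR"
```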

Supported test types

Type            Example APA format
t-test          t(28) = 2.21, p = .035
F-test          F(2, 87) = 5.44, p = .006
Correlation     r(48) = .42, p = .003
Chi-square      chi2(1, N = 100) = 4.50, p = .034
z-test          z = 2.33, p = .020
Mann-Whitney    U = 145.00, p = .023
Wilcoxon        W = 89.00, p = .041
Kruskal-Wallis  H(2) = 8.73, p = .013
Regression      b = 0.45, SE = 0.12, t(98) = 3.75, p < .001