Check for duplicate rows in response data — check_duplicates • rcicrdiagnostics

Duplicate rows in response data almost always indicate a merge bug or an accidental double-write. This check flags two distinct cases:

Usage

check_duplicates(
  responses,
  method = c("2ifc", "briefrc"),
  col_participant = "participant_id",
  col_stimulus = "stimulus",
  ...
)

Arguments

responses: A data frame of trial-level responses.
method: Either "2ifc" or "briefrc". Currently informational.
col_participant, col_stimulus: Column names.
...: Unused.

Value

An rcdiag_result() object. The data element contains full_duplicates and pair_duplicates data frames of the flagged rows.

Details

Fully duplicated rows (identical across every column).
Duplicated (participant, stimulus) pairs where the response or RT differs. These are harder to interpret: either the design legitimately repeats stimuli (in which case the check should be skipped), or the data pipeline glued together two runs for the same participant.

Examples

responses <- data.frame(
  participant_id = c(1, 1, 1, 2),
  stimulus       = c(1, 2, 2, 1),
  response       = c(1, -1, -1, 1)
)
check_duplicates(responses)
#> [WARN] Duplicates
#>   1 of 4 rows are fully duplicated.
#>   1 rows share a (participant, stimulus) pair with differing response or RT.