Dataset for: Peer assessment using criteria or comparative judgement? A replication study on the learning effect of two peer assessment methods