We introduced two measures: the rediscovery rate (RDR) and the false discovery rate in a validation population (vFDR).
-The RDR is the expected proportion of findings validated among those declared significant in the training sample.
-The vFDR is the expected proportion of false validated features among all those taken forward in the validation study.
RDR and vFDR are estimated using the training sample alone. They can also be computed using both the training and validation samples (if available); in that case they are referred to as the observed RDR and observed vFDR.
In the example (in STEP 1 select 'Use example') I select all the features from the training set with a P-value < 0.001 [c.t.=-log10(P-value)=3] to be taken forward in the validation set. In the validation set, I declare significant and validated all the features with a P-value < 0.1 [c.v.=-log10(P-value)=1].
With these settings we expect 80% (RDR=0.80) of the features taken forward to validation to be validated (i.e. to have a P-value < 0.1, or equivalently -log10(P-value) > c.v.). The expected proportion of false positives among the features taken forward to validation approaches 0 (vFDR=0).
Since we collected a validation set and also tested all the features in it, we can calculate the observed RDR and observed vFDR. They are 0.79 and 0, respectively. These values are similar to those estimated using the training sample alone, indicating that the inference is correct.
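As a rough illustration of how an observed RDR can be computed when both samples are available, the sketch below applies the two thresholds from the example (c.t.=3, c.v.=1) to paired P-values. The function name `observed_rdr` and the toy P-values are hypothetical and are not taken from the tool or its example data. The observed vFDR is left out, because the text above does not describe how false validated features are identified.

```python
import math


def observed_rdr(p_train, p_valid, ct=3.0, cv=1.0):
    """Proportion of features selected in training that are also validated.

    ct and cv are thresholds on the -log10(P-value) scale, as in the text:
    a feature is selected if its training P-value is below 10**-ct and
    validated if its validation P-value is below 10**-cv.
    Hypothetical helper for illustration only.
    """
    if len(p_train) != len(p_valid):
        raise ValueError("p_train and p_valid must have the same length")

    # Convert the -log10 thresholds back to P-value cutoffs.
    alpha_train = 10 ** (-ct)
    alpha_valid = 10 ** (-cv)

    # Features taken forward to the validation set.
    selected = [i for i, p in enumerate(p_train) if p < alpha_train]
    if not selected:
        return math.nan  # nothing was taken forward, so the ratio is undefined

    # Features among those selected that are also significant in validation.
    validated = [i for i in selected if p_valid[i] < alpha_valid]
    return len(validated) / len(selected)


# Made-up P-values for five features (illustrative assumption, not real data).
p_train = [0.0001, 0.0005, 0.01, 0.00001, 0.2]
p_valid = [0.05, 0.5, 0.01, 0.09, 0.001]

rdr = observed_rdr(p_train, p_valid)
print(f"observed RDR = {rdr:.2f}")  # 2 of the 3 selected features validate: 0.67
```

In this toy input, three features pass the training threshold and two of them pass the validation threshold, so the observed RDR is 2/3. Features that are significant only in the validation set do not enter the calculation, since they were never taken forward.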