segmentation {mpss}    R Documentation

Segmentation by thresholding to obtain CNV regions

Description

Performs segmentation by applying a threshold on the intensity values.

Usage

segmentation(data = data, location = location, chromosome = cc,
    del.lim = -0.15, dup.lim = 0.15, min.probes = 10, min.probe.density = 5,
    min.length = 1000, min.dist = 5000, fdr.limit = 1e-05, chi = chi,
    pos1 = pos1, pos2 = pos2, pos3 = NA, pos4 = NA,
    data1 = data1, data2 = data2, data3 = NA, data4 = NA)

Arguments

data A list of estimated smoothed intensities, with one component per chromosome.
location A list of genomic locations corresponding to each value in data.
chromosome A vector of the same length as the unlisted data and location, giving the chromosome number of each value.
del.lim Threshold for deletions.
dup.lim Threshold for duplications.
min.probes Minimum number of probes in region.
min.probe.density Minimum number of probes per kb in region.
min.length Minimum length of region.
min.dist Two regions will be merged if the distance between them is less than min.dist.
fdr.limit False discovery rate (FDR) threshold used to define significant segments. See Details.
chi A list of contribution of each probe to the chi-squared statistic, each component corresponding to each chromosome.
pos1 Probe positions for platform 1. A list where each component corresponds to each chromosome.
pos2 Same as pos1, but for platform 2.
pos3 Same as pos1, but for platform 3. Set to NA if there is no platform 3 available.
pos4 Same as pos1, but for platform 4. Set to NA if there is no platform 4 available.
data1 Normalized intensity ratio for platform 1, corresponding to each position in pos1. A list where each component corresponds to each chromosome.
data2 Same as data1, but for platform 2.
data3 Same as data1, but for platform 3. Set this to NA if there is no platform 3 available.
data4 Same as data1, but for platform 4. Set this to NA if there is no platform 4 available.
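
The region-level arguments (min.dist, min.probes, min.length, min.probe.density) can be illustrated with a small sketch. This is not the package's implementation. The toy regions, the order of operations (merge first, then filter), the rule that only regions with the same copy number status are merged, and the definition of length as end minus start in base pairs are all assumptions made for illustration.

```r
# Hypothetical candidate regions on one chromosome (positions in bp).
regions <- data.frame(
  start     = c(10000, 16000, 40000, 90000),
  end       = c(14000, 21000, 40500, 96000),
  numprobes = c(25, 35, 4, 40),
  cn        = c(1, 1, 3, 3)
)
min.probes <- 10
min.probe.density <- 5   # probes per kb
min.length <- 1000
min.dist <- 5000

# Assumed merge rule: join a region to the previous one when the gap
# between them is below min.dist and both have the same cn.
n <- nrow(regions)
gap <- regions$start[-1] - regions$end[-n]
same.cn <- regions$cn[-1] == regions$cn[-n]
group <- cumsum(c(TRUE, !(gap < min.dist & same.cn)))

merged <- data.frame(
  start     = as.vector(tapply(regions$start, group, min)),
  end       = as.vector(tapply(regions$end, group, max)),
  numprobes = as.vector(tapply(regions$numprobes, group, sum)),
  cn        = as.vector(tapply(regions$cn, group, function(x) x[1]))
)

# Assumed filters: probe count, length in bp, and probes per kb.
merged$length <- merged$end - merged$start
density <- merged$numprobes / (merged$length / 1000)
keep <- merged$numprobes >= min.probes &
  merged$length >= min.length &
  density >= min.probe.density
filtered <- merged[keep, ]
print(filtered)
```

In this toy input the first two regions are 2000 bp apart and are merged, while the short four-probe region fails both the min.probes and min.length filters.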

Details

Deletion segments are sets of consecutive probes for which the smoothed intensity (data) is consistently smaller than del.lim, and duplication segments are sets of consecutive probes for which it is consistently larger than dup.lim.
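
The thresholding rule can be sketched in a few lines of base R. This is a minimal illustration on made-up values, not the code used by segmentation: the vectors intensity and position are hypothetical, and the sketch applies none of the FDR, probe-count, length or merging steps.

```r
# Toy smoothed intensities and probe positions (made-up values).
intensity <- c(0.02, -0.20, -0.31, -0.25, 0.01, 0.05, 0.22, 0.30, 0.18, 0.03)
position  <- seq(1000, by = 500, length.out = length(intensity))
del.lim <- -0.15
dup.lim <- 0.15

# Label each probe: 1 = below del.lim, 3 = above dup.lim, 2 = neither.
state <- ifelse(intensity < del.lim, 1, ifelse(intensity > dup.lim, 3, 2))

# Run-length encode the labels to find runs of consecutive probes.
runs <- rle(state)
end.idx   <- cumsum(runs$lengths)
start.idx <- end.idx - runs$lengths + 1

# Keep only the runs that are deletions or duplications.
keep <- runs$values != 2
segments <- data.frame(
  start     = position[start.idx[keep]],
  end       = position[end.idx[keep]],
  numprobes = runs$lengths[keep],
  cn        = runs$values[keep]
)
print(segments)
```

The cn coding (1 for deletion, 3 for duplication) follows the convention described under Value.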

Value

cnv A data frame with 23 columns:
start: Start position of region.
end: End position of region.
chr: Chromosome number of region.
start.loc: Index of start position in unlist(location).
end.loc: Index of end position in unlist(location).
length: Length of region.
p: p-value.
fdr: False discovery rate.
numprobes: Number of probes in region.
scaled_x2: Scaled chi-squared value.
cn: Copy number status, 1 for deletion and 3 for duplication.
m: Mean of the probe intensities in the region.
sd: Standard deviation of the probe intensities in the region.
pt: p-value for test of discrepant regions. See Teo et al., 2010 for details.
t_fdr: FDR for test of discrepant regions. See Teo et al., 2010 for details.
m1: Mean of the probe intensities from platform 1 in the region.
m2: Same as m1 but for platform 2.
m3: Same as m1 but for platform 3.
m4: Same as m1 but for platform 4.
sd1: Standard deviation of the probe intensities from platform 1 in the region.
sd2: Same as sd1 but for platform 2.
sd3: Same as sd1 but for platform 3.
sd4: Same as sd1 but for platform 4.

References

Teo S M et al. (2010). Multi-Platforms Segmentation Approach for Joint Detection of Copy Number Variations. Submitted.

See Also

smoothseg2, mpsmooth

Examples

# Locate the example data in the installed package's data directory
# (avoids changing the working directory).
data.dir = system.file("data", package = "mpss")
load(file.path(data.dir, "illum12056.Rdata")) # Illumina platform
# norm_y contains the normalized intensity ratios for chromosomes 1 and 2.
# norm_x contains the corresponding probe locations.
illum = norm_y
illumx = norm_x
load(file.path(data.dir, "affy12056.Rdata")) # Affymetrix platform
affy = norm_y
affyx = norm_x

# Smooth each chromosome jointly across the two platforms and collect
# the results as lists with one component per chromosome.
data = list()
location = list()
chi = list()
for (chr in 1:2) {
    ss = smoothseg2(pos1 = affyx[[chr]], pos2 = illumx[[chr]], pos3 = NA, pos4 = NA,
        data1 = affy[[chr]], data2 = illum[[chr]], data3 = NA, data4 = NA,
        maxiter = 50, lambda = 100, lambda.range = c(20, 600))

    data[[chr]] = ss$y
    location[[chr]] = ss$pos
    chi[[chr]] = ss$chi
}

# Chromosome number for every smoothed location, in the order of unlist(location).
cc = NULL
for (i in 1:2) {
    cc = c(cc, rep(i, length(location[[i]])))
}

# Call CNV regions from the smoothed intensities.
s = segmentation(data = data, location = location, chromosome = cc,
    del.lim = -0.15, dup.lim = 0.15, min.probes = 10, min.probe.density = 5,
    min.length = 1000, min.dist = 5000, fdr.limit = 1e-05, chi = chi,
    pos1 = affyx, pos2 = illumx, pos3 = NA, pos4 = NA,
    data1 = affy, data2 = illum, data3 = NA, data4 = NA)

[Package mpss version 1.2 Index]