###########################
# Thibault Leroy - 190118 #
###########################

A summary of the analyses made to detect somatic mutations - Plomion et al. 2018

Individual DNA samples from the same branch were then pooled in an equimolar solution with a final concentration of 769-1,388 ng/μL. We prepared tightly sized paired-end libraries (600 bp in size) as described in the "sequencing" section (Plomion et al. 2018 Nature Plants) and sequenced each of these libraries on one to four lanes of a HiSeq2000 or HiSeq2500 sequencer (Illumina) (Supplementary Table 19, 100 bp or 250 bp paired-end reads). We obtained 284- (L1), 250.5- (L2) and 264.9-fold (L3) haploid genome coverage for these samples.

For each of the three branches (L1, L2 and L3), reads were mapped against the reference genome sequence with BWA MEM using the default parameters, except for the minimum seed length (k = 79). After sorting, PCR duplicates were removed with Picard (http://broadinstitute.github.io/picard/).

We searched for somatic mutations using MuTect (a program developed for the detection of somatic point mutations in heterogeneous cancer samples) to compare the three libraries (Supplementary Table 20). All VCF files generated by MuTect are available ("BBX_[BranchX]vs[BranchY]_Mutect_[date].vcf"), as are all candidate somatic variants detected by MuTect (list of SNPs labeled as "KEEP" in files BBX_[BranchX]vs[BranchY]_Mutect_[date].KEEP.txt).

Based on these candidate mutations, the accuracy of the somatic point mutation calls was ensured by considering as reliable somatic mutations only those sites with the following characteristics:
(i) a minimum depth of 50x in both the reference and potentially mutated libraries,
(ii) no mutant (i.e. alternative) allele in the reference library, and
(iii) a minimum frequency of 20% for the mutant allele in the potentially mutated library.
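As an illustration only, the three criteria (i)-(iii) above could be expressed as a small Python check. This is a sketch under stated assumptions, not the script used for the paper: the class and field names are hypothetical, depth is assumed to be the sum of reference and alternative read counts at the site, and both thresholds are assumed to be inclusive.

```python
from dataclasses import dataclass

MIN_DEPTH = 50            # criterion (i): minimum depth in both libraries
MIN_MUTANT_PERCENT = 20   # criterion (iii): minimum mutant allele frequency, in percent


@dataclass(frozen=True)
class CandidateSite:
    """Read counts at one candidate site (hypothetical field names)."""
    ref_lib_ref_count: int  # reference-allele reads in the reference library
    ref_lib_alt_count: int  # mutant-allele reads in the reference library
    mut_lib_ref_count: int  # reference-allele reads in the potentially mutated library
    mut_lib_alt_count: int  # mutant-allele reads in the potentially mutated library


def is_reliable(site: CandidateSite) -> bool:
    """Return True if the site passes criteria (i), (ii) and (iii)."""
    # Assumption: depth = reference-allele reads + mutant-allele reads.
    ref_depth = site.ref_lib_ref_count + site.ref_lib_alt_count
    mut_depth = site.mut_lib_ref_count + site.mut_lib_alt_count

    # (i) at least 50x in both libraries
    if ref_depth < MIN_DEPTH or mut_depth < MIN_DEPTH:
        return False
    # (ii) no mutant allele in the reference library
    if site.ref_lib_alt_count != 0:
        return False
    # (iii) mutant allele frequency of at least 20%; integer arithmetic
    # avoids floating-point rounding at the threshold
    return site.mut_lib_alt_count * 100 >= MIN_MUTANT_PERCENT * mut_depth
```

For example, a site with 80 reference reads and no mutant read in the reference library, and 60 reference plus 20 mutant reads in the tested library, would pass all three filters under these assumptions.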
Finally, we checked the consistency of the results over all analyses, to obtain a final list of highly reliable somatic mutations (see Supplementary Table 5 for details).

Plomion et al. 2018 Nature Plants, Online Methods section "Detection of somatic mutations"

Before reading the files, please keep in mind the following:

1/ Nomenclature in files:
AOSW = Branch L1
COSW = Branch L2
EOSW = Branch L3

2/ Reference vs. mutated comparisons
[BranchX] is always the "normal" tissue ("reference")
[BranchY] is always the "mutated" tissue tested
e.g. file "BBX_L1vsL2_Mutech_call_stats_060316.txt.KEEP" lists all candidate somatic mutations between L1 and L2 (SNPs found in L2 as compared to the reference branch L1).

For any other questions, please send an email to both thibault.leroy@inra.fr & christophe.plomion@inra.fr
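
For readers who process the files programmatically, the two naming conventions above (library codes and the [BranchX]vs[BranchY] order) could be captured as follows. This is a minimal sketch: the function and variable names are hypothetical, and it assumes only that file names start with "BBX_", followed by the reference branch, "vs", the tested branch and an underscore, as in the example file name given above.

```python
import re

# Library codes used in the files and the branch each one refers to.
LIBRARY_TO_BRANCH = {"AOSW": "L1", "COSW": "L2", "EOSW": "L3"}

# Assumed prefix: BBX_<reference branch>vs<tested branch>_
_COMPARISON_PREFIX = re.compile(r"^BBX_(L[123])vs(L[123])_")


def parse_comparison(filename: str) -> dict:
    """Return the reference ("normal") and tested ("mutated") branch of a file."""
    match = _COMPARISON_PREFIX.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    reference, tested = match.groups()
    return {"reference": reference, "tested": tested}
```

Applied to "BBX_L1vsL2_Mutech_call_stats_060316.txt.KEEP", this returns L1 as the reference branch and L2 as the tested branch.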