FabElm_BarcodeDb: matK barcode database of legumes
Abstract
DNA barcoding uses the chloroplast rbcL and matK regions as standard molecular barcodes for species identification. MatK is highly conserved in plants and has been used extensively as a phylogenetic marker for plant classification. In this study, matK sequences of Leguminosae were retrieved for variant analysis and phylogenetics. Maturase sequences were retrieved from online resources; redundant sequences, partial sequences, and poor-quality reads were filtered out, and the remaining 3639 complete, non-redundant matK sequences were compiled into a database for ready reference. The resulting database, FabElm_BarcodeDb, is available at app.bioelm.com. In the chloroplast genome of plants, the matK gene is about 1500 bp long, is located within the intron of trnK, and is associated with group II intron splicing. Mitochondrial matR and genomic matN sequences were compared with chloroplast matK. These maturase sequences share regions of homology with chloroplast and mitochondrial regions and are expected to be regulated by miRNA in producing splice variants that contribute to speciation. Base substitution rates of the nuclear maturase were comparable with those of the mitochondrial maturase and differed from those of matK sequences. Hence, a few of the species identified in this investigation clustered with other tribes when analysed using matK. MatK is effective in resolving species-level variation, as splicing contributes to speciation; however, using matK alone as a barcode marker for legumes is questionable, as it could not resolve the identity of some species.
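The filtering step described above (removing partial, poor-quality, and redundant sequences) can be illustrated with a minimal Python sketch. This is not the pipeline used to build FabElm_BarcodeDb; the function names `parse_fasta` and `filter_sequences` and both thresholds are hypothetical and chosen only for illustration.

```python
from typing import Iterator, List, Tuple

# Hypothetical thresholds; the study does not state its actual cutoffs.
MIN_LENGTH = 1400              # assumed cutoff for a "complete" matK sequence (~1500 bp gene)
MAX_AMBIGUOUS_FRACTION = 0.01  # assumed cutoff for a "poor quality" sequence


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks).upper()
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


def filter_sequences(
    records,
    min_length: int = MIN_LENGTH,
    max_ambiguous: float = MAX_AMBIGUOUS_FRACTION,
) -> List[Tuple[str, str]]:
    """Keep sequences that are long enough, mostly unambiguous, and not exact duplicates."""
    kept = []
    seen = set()
    for header, seq in records:
        # Drop empty and partial sequences.
        if not seq or len(seq) < min_length:
            continue
        # Drop sequences with too many non-ACGT characters (e.g. N).
        ambiguous = sum(1 for base in seq if base not in "ACGT")
        if ambiguous / len(seq) > max_ambiguous:
            continue
        # Drop exact duplicates, keeping the first occurrence.
        if seq in seen:
            continue
        seen.add(seq)
        kept.append((header, seq))
    return kept
```

Exact-match deduplication is the simplest possible notion of redundancy; a real curation pipeline might instead cluster near-identical sequences, which this sketch does not attempt.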