INTRODUCTION
Accurate species identification is essential for biodiversity monitoring, conservation strategies, ecological research, and quality control of plant-based products (Hebert et al., 2003; Hollingsworth et al., 2016). However, traditional morphological identification methods often face significant limitations, particularly with juvenile, sterile, fragmented, or degraded plant material, and when taxonomic expertise is lacking (Bell et al., 2017; Deiner et al., 2017). These challenges underscore the need for more efficient and reliable identification approaches.
DNA barcoding, a molecular method that employs short, standardized DNA regions to identify species, has been widely adopted to address these issues (Hebert et al., 2003). In plants, DNA barcoding has proven useful in various ecological and applied contexts, including biodiversity assessments (Hollingsworth et al., 2016; Kuzmina et al., 2017), detection of botanical and entomological sources of foods (Prosser and Hebert, 2017), niche partitioning among large mammalian herbivores (Kartzinel et al., 2015), and authentication of commercial herbal products (Ichim, 2019). The technique also supports the reconstruction of ecological interactions, such as pollination and herbivory networks (Deiner et al., 2017).
Central to the success of DNA barcoding is the establishment of robust reference libraries, comprising high-quality, taxonomically verified sequences, voucher specimens, and comprehensive metadata (Hebert et al., 2003; Pentinsaari et al., 2020). These libraries ensure accurate species assignments and provide the foundation for quality control in molecular analyses. While reference databases for animals (e.g., cytochrome c oxidase subunit I–based) are now extensive and highly curated, plant barcode libraries lag behind because individual loci such as rbcL and matK have relatively low discriminatory power, and plant hybridization and polyploidy add complexity (Bell et al., 2017).
To address these limitations, efforts continue to build well-curated, taxonomically representative plant barcode reference libraries, particularly for floras of high ecological and economic importance. In this study, we developed an rbcL reference library through extensive sampling of native Korean vascular plant genera, including herbaceous and woody taxa across major lineages. All barcode sequences are linked to authenticated voucher specimens and deposited in publicly accessible databases, ensuring reproducibility and traceability. While rbcL is known for its universality and ease of amplification, the curated genus-level reference set established in this study enhances its utility for large-scale biodiversity assessments, particularly when used in conjunction with high-throughput sequencing platforms. This resource offers broad taxonomic coverage and will facilitate species identification in environmental samples, floristic surveys, and quality control of plant-based products. It also contributes to biodiversity conservation efforts under frameworks such as the Kunming-Montreal Global Biodiversity Framework (GBF).
MATERIALS AND METHODS
Sample collection
This study aimed to establish a genus-level DNA barcode reference library for Korean vascular plants, based on 3,440 species of flowering plants, conifers, and ferns cataloged by the Flora of Korea Editorial Committee (2007), representing 1,045 genera and 217 families. Taxonomic classifications followed the Cronquist system for angiosperms, the Engler system for gymnosperms, and, for ferns, the system presented in Vol. 16, Pteridophyta, of Illustrated Encyclopedia of Fauna and Flora of Korea (Park, 1975), with modifications reflecting more recent taxonomic evidence (Angiosperm Phylogeny Group, 2016).
A total of 1,167 samples representing 1,033 genera were collected from the Korean Peninsula (On-line Supplemental Data Table S1). To ensure comprehensive genus-level coverage, taxon selection included not only indigenous species but also well-established introduced species, cultivated plants frequently found outside managed cultivation, and horticultural taxa commonly observed in domestic gardens. With the exception of genera comprising fewer than three species, a minimum of three species per genus was targeted for collection. Within each genus, species selection was primarily based on the availability of tissue samples. Additional criteria for sample selection were applied sequentially: morphological typicality of the species, the most recent collection date among available specimens, collection from geographically distinct locations, and the presence of additional taxonomic verification.
For DNA extraction, leaf tissues were either collected fresh and dried in silica gel or obtained from preserved herbarium specimens. Most specimens were morphologically identified by co-authors and professional taxonomists and subsequently deposited in the herbarium of the National Institute of Biological Resources (NIBR; herbarium code KB). Additionally, some tissue samples were obtained from the NIBR germplasm resource bank and are associated with discrete NIBR accession numbers (accession numbers prefixed with ‘WBN’ are provided in On-line Supplemental Data Table S1).
DNA extraction, PCR amplification, and sequencing
Genomic DNA was extracted from silica gel-dried leaves using the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. A portion of the chloroplast rbcL region was amplified using primer pairs (1F and 724R, and aF [Levin et al., 2003] and aR [Kress et al., 2009]) and then sequenced. PCR amplification was performed in 20-μL reaction mixtures (AccuPower PCR Premix, Bioneer Co., Daejeon, Korea) containing approximately 30 ng of genomic DNA template, 1× PCR buffer with 1.5 mM MgCl2, 250 μM of each dNTP, and 1 U Taq DNA Polymerase in a Mastercycler Nexus (Eppendorf Inc., Hamburg, Germany). The PCR products were purified using the PCRquick-spin PCR Product Purification Kit (Intron, Seongnam, Korea) and then sequenced in both directions with the primers used for PCR amplification on a 3730XL sequencer (Applied Biosystems, Foster City, CA, USA).
Sequence alignment and data analysis
In total, 3,832 accessions representing 1,142 genera were incorporated into the reference library (On-line Supplemental Data Table S1). Of these, 1,167 accessions, representing 1,033 genera, were newly sequenced for the rbcL barcode region in this study (On-line Supplemental Data Table S1). To accurately evaluate the discriminatory power of the rbcL region at the genus level, an additional 2,665 sequences from 1,107 genera were retrieved from the National Center for Biotechnology Information (NCBI) Nucleotide database using BLASTn searches. To avoid overrepresentation of closely related congeners and to minimize sampling bias in the assessment of rbcL barcode performance for genus-level identification, species from the NCBI dataset were randomly selected. Notably, five parasitic genera reported as native to Korea (Aeginetia, Boschniakia, Cyrtosia, Monotropastrum, and Phacellanthus) were excluded from this analysis due to loss of the rbcL gene or its pseudogenization in their chloroplast genomes (Kim et al., 2019; Choi and Park, 2021).
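The random selection of congeners described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the record layout, the function name `subsample_species_per_genus`, the cap `max_per_genus`, the fixed seed, and the toy accession labels are all assumptions introduced for illustration.

```python
import random
from collections import defaultdict


def subsample_species_per_genus(records, max_per_genus, seed=0):
    """Randomly keep at most `max_per_genus` species for each genus.

    `records` is an iterable of (genus, species, accession) tuples.
    This layout and the cap are hypothetical, not taken from the study.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    species_by_genus = defaultdict(dict)
    for genus, species, accession in records:
        # Keep only the first accession encountered for each species.
        species_by_genus[genus].setdefault(species, accession)

    selected = []
    for genus in sorted(species_by_genus):
        species = sorted(species_by_genus[genus])
        chosen = rng.sample(species, min(max_per_genus, len(species)))
        for sp in sorted(chosen):
            selected.append((genus, sp, species_by_genus[genus][sp]))
    return selected


# Toy input with placeholder names and accession labels (not real data).
toy_records = [
    ("GenusA", "sp1", "ACC001"),
    ("GenusA", "sp2", "ACC002"),
    ("GenusA", "sp3", "ACC003"),
    ("GenusA", "sp4", "ACC004"),
    ("GenusB", "sp1", "ACC005"),
]
subset = subsample_species_per_genus(toy_records, max_per_genus=2)
```

Fixing the seed makes the draw repeatable, which is useful when the selected accessions must be reported alongside the results.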
In this study, a subset of the rbcL gene data matrix, spanning 64 to 498 bp at the 5′ end, was used for analysis because this region was conserved for alignment yet contained sufficient information to capture sequence variation. All sequences were aligned in Geneious Prime 2025.1.3 (https://www.geneious.com) (Kearse et al., 2012), and neighbor-joining (NJ) analysis with the Kimura 2-parameter model (Kimura, 1980) was employed to assess whether the sequence datasets formed genus-specific clusters. The NJ tree was inferred and visualized using Geneious Prime. Inter- and intra-specific variation, genetic distances, and nucleotide divergences were calculated using BioEdit v7.2.5 (Hall, 1999). To estimate the species resolution ability of the barcode region, we determined the percentage of species correctly identified in the NJ tree relative to the total number of species examined in this study.
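For readers unfamiliar with the Kimura 2-parameter (K2P) correction, the following Python sketch shows how a pairwise distance is obtained from the proportions of transitions (P) and transversions (Q) using d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. It is an illustration only and not the Geneious Prime implementation: the function names are hypothetical, sites with gaps or ambiguity codes are simply skipped, and the 64 to 498 bp window is assumed to use 1-based, inclusive alignment coordinates.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def trim_columns(aligned_seq, start=64, end=498):
    """Return alignment columns start..end (assumed 1-based, inclusive)."""
    return aligned_seq[start - 1:end]


def k2p_distance(seq_a, seq_b):
    """Kimura (1980) two-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped
    (an assumption; other software may treat such sites differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten compared sites: P = 0.1, Q = 0.
d_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
# Length of the analyzed window under the stated coordinate assumption.
window_length = len(trim_columns("N" * 600))
```

Under the coordinate assumption stated above, the analyzed window is 435 columns long; pairwise K2P distances of this kind are the input to NJ clustering.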
RESULTS AND DISCUSSION
The newly constructed DNA barcode library comprises 3,832 rbcL sequences representing 1,142 genera (Tables 1, 2, On-line Supplemental Data Table S1) and is available upon request. The dataset covers Lycopodiopsida, Polypodiopsida, gymnosperms (Table 1), and angiosperms (Table 2), encompassing 100% of families and 99.3% of genera recognized by the Flora of Korea Editorial Committee (2007). The library comprises 1,045 genera listed in the Genera of Vascular Plants of Korea (Flora of Korea Editorial Committee, 2007), supplemented with additional taxa, including well-established introduced species, cultivated plants occurring beyond managed habitats, and horticultural taxa commonly observed in domestic gardens across the Korean Peninsula. The genera Aeginetia, Boschniakia, Monotropastrum, Phacellanthus, and Cyrtosia were excluded because, owing to a parasitic lifestyle that relies on fungal networks or other vascular plants, they have lost rbcL or retain it only as a pseudogene. The average number of accessions per genus was 3.65, ranging from one to a maximum of eight.
Using the rbcL gene region alone, most vascular plant genera could be identified, although resolution varied among lineages. The current analyses showed that the overall identification rate based on NJ analysis of rbcL sequences was 93.2%, with most taxa forming distinct monophyletic clades (On-line Supplemental Data Fig. S1). Furthermore, rbcL sequences achieved 100% identification accuracy at the genus level for all sampled genera within Lycopodiopsida, Polypodiopsida, and Pinopsida. The conserved nature of the rbcL coding region provides strong discriminatory power at the genus level, particularly in non-flowering vascular plants, where it reliably forms distinct, well-supported clades that correspond to morphologically recognized genera (Hollingsworth et al., 2011; Dong et al., 2014). These findings reinforce the utility of rbcL as a core barcode marker in lycophytes, ferns, and gymnosperms, where genus-level identification success rates often exceed 90%.
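The criterion of genus-specific clustering can be made concrete with a short Python sketch. It is an assumption-laden illustration rather than the workflow used here: the tree is given as a rooted, nested tuple (NJ trees are unrooted, so a rooting is presupposed), tip labels follow a hypothetical "Genus_species" pattern, and a genus counts as resolved when its tips form exactly one clade. All names in the toy tree are placeholders.

```python
def collect_clades(node, clades):
    """Record the tip set of every clade in a nested-tuple tree."""
    if isinstance(node, str):
        tips = frozenset([node])
    else:
        tips = frozenset().union(*(collect_clades(child, clades) for child in node))
    clades.append(tips)
    return tips


def genus_of(label):
    """Genus name from a hypothetical 'Genus_species' tip label."""
    return label.split("_", 1)[0]


def monophyletic_genera(tree):
    """Map each genus to True if its tips form exactly one clade."""
    clades = []
    all_tips = collect_clades(tree, clades)
    clade_set = set(clades)
    tips_by_genus = {}
    for tip in all_tips:
        tips_by_genus.setdefault(genus_of(tip), set()).add(tip)
    return {g: frozenset(t) in clade_set for g, t in tips_by_genus.items()}


# Toy rooted tree with placeholder names: GenusB is nested inside GenusA.
toy_tree = (
    (("GenusA_sp1", "GenusB_sp1"), "GenusA_sp2"),
    ("GenusC_sp1", "GenusC_sp2"),
)
status = monophyletic_genera(toy_tree)
resolved = sum(status.values())
```

In this toy tree, GenusA is not resolved because GenusB is nested within it, whereas GenusC forms its own clade. A genus represented by a single tip is trivially counted as resolved under this definition, which is one reason multiple accessions per genus are informative.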
Despite the broad applicability of rbcL in vascular plant DNA barcoding, its resolution at the genus level is often insufficient for species-rich and morphologically complex angiosperm families, primarily because of low intergeneric sequence divergence (Michaels et al., 1993; Kajita et al., 2001).
In this study, among the 1,142 genera analyzed, 69 genera across 11 families could not be reliably discriminated using rbcL sequences alone (Table 3). For instance, within the family Magnoliaceae, the genera Magnolia and Michelia were indistinguishable. Similarly, in Rosaceae, genera such as Cotoneaster, Crataegus, Aria, and Pyrus exhibited highly conserved rbcL sequences, precluding their separation with this marker. These observations are congruent with earlier findings that noted the conserved nature of rbcL in these groups and its limited effectiveness for resolving generic boundaries (Morgan et al., 1994; Pang et al., 2011). Apiaceae, a family whose members are widely used as medicinal herbs in East Asia, showed only about 57.5% genus-level identification success with rbcL in our analysis. These findings suggest that rbcL is not suitable for genus-level discrimination within Apiaceae, and that alternative molecular markers, such as the internal transcribed spacer (ITS) region, are recommended. This conclusion is supported by previous studies (Chen et al., 2010; Downie et al., 2010; Liu et al., 2014) that reported low resolution of rbcL in Apiaceae and highlighted the superior discriminatory power of ITS markers. Furthermore, in Asteraceae, the most heavily sampled family in this study, only 78 of 102 genera were resolved at the genus level, corresponding to an identification success rate of 76.5%. These findings align with prior assessments that reported limited genus-level resolution using rbcL in large and taxonomically challenging angiosperm families (Michaels et al., 1993; Hollingsworth et al., 2011).
Unlike some angiosperm families where rbcL shows limited taxonomic resolution, the families Poaceae and Orchidaceae achieved high genus-level identification success rates of 96.0% and 95.1%, respectively (Table 3). Ninety-five of the 99 Poaceae genera analyzed were distinguished using rbcL sequences; however, closely related genera such as Miscanthus, Imperata, Pseudopogonatherum, and Sorghum were not resolved, underscoring rbcL’s limitations for discrimination at finer taxonomic levels. Consistent with these findings, previous studies have shown that although rbcL provides useful phylogenetic information across Poaceae, its limited sequence variation yields insufficient resolution to reliably discriminate closely related genera within the family (Saadullah et al., 2016; Wang et al., 2022). In Orchidaceae, rbcL sequences formed well-supported monophyletic clades for 39 of 41 genera, with the exceptions of Epipogium and Lecanorchis. This indicates that rbcL resolves most, but not all, genera in Orchidaceae. By comparison, Kim et al. (2014) analyzed 89 Korean orchid species and reported that rbcL had a relatively low species-level discrimination rate of 60.5%, though genus-level resolution was not explicitly quantified. Taken together, these findings suggest that while rbcL may be effective in resolving many genera, it is often insufficient for distinguishing closely related species or genera with recent divergence.
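The family-level success rates reported above are simple proportions of resolved genera. As a worked check, the Python sketch below recomputes them from the counts given in the text; the function name `identification_rate` is hypothetical and the rounding to one decimal place is an assumption about how the percentages were reported.

```python
def identification_rate(resolved, total):
    """Percentage of genera resolved as distinct clusters."""
    if total <= 0 or not 0 <= resolved <= total:
        raise ValueError("counts must satisfy 0 <= resolved <= total and total > 0")
    return 100.0 * resolved / total


# (resolved genera, genera analyzed) as stated in the text.
counts = {
    "Asteraceae": (78, 102),
    "Poaceae": (95, 99),
    "Orchidaceae": (39, 41),
}
rates = {family: round(identification_rate(*c), 1) for family, c in counts.items()}
```

The recomputed values (76.5% for Asteraceae, 96.0% for Poaceae, and 95.1% for Orchidaceae) agree with the percentages given in the text.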
In this study, we constructed a comprehensive DNA barcoding reference dataset for all vascular plant genera in Korea, using nucleotide sequences of the rbcL gene, one of the most widely used markers in plant systematics and phylogenetic research. Although numerous studies have employed molecular markers—including rbcL, matK, ITS, and others—to delimit species and genera across plant lineages, this work focuses solely on rbcL to evaluate its utility for genus-level identification and taxonomic resolution across the Korean vascular flora. As part of this effort, we also assessed the systematic placement and taxonomic distinctiveness of plant genera considered endemic to the Korean Peninsula. According to Chung et al. (2017, 2023), six vascular plant genera are recognized as endemic to Korea: Abeliophyllum, Coreanomecon, Hanabusaya, Mankyua, Megaleranthis, and Sillaphyton. This study integrates findings from our rbcL-based analyses with previously published morphological and molecular studies to review the phylogenetic positions and taxonomic boundaries of these genera and to evaluate their current classification in light of recent evidence.
The present study confirms the monophyly of Abeliophyllum, supporting its recognition as a distinct evolutionary lineage within Oleaceae (Fig. 1A). Morphologically, Abeliophyllum distichum Nakai is characterized by small, fragrant white flowers and flat samara fruits—traits that distinguish it from closely related genera such as Forsythia. Previous phylogenetic analyses based on chloroplast and nuclear DNA regions (e.g., rbcL, matK, and ITS) consistently placed Abeliophyllum near Forsythia but as a separate clade (Kim et al., 2000; Ha et al., 2018). More recent plastome-scale studies have reinforced this distinction, further supporting the taxonomic independence of Abeliophyllum (Park et al., 2019). The findings of this study are congruent with these earlier reports and provide additional evidence for maintaining Abeliophyllum as a monotypic, phylogenetically distinct genus endemic to Korea.
Coreanomecon, an endemic genus of the Korean Peninsula, was identified in the present rbcL-based phylogenetic analysis as most closely related to Hylomecon (Fig. 1B). Multiple molecular phylogenetic analyses have strongly supported the recognition of Coreanomecon as a distinct genus, separate from Hylomecon, within the Papaveraceae. Studies using chloroplast markers such as rbcL and matK (Yun and Oh, 2018; Ghimire et al., 2019) or plastome-level analyses (Wu et al., 2019; Zhang et al., 2019) consistently show that Coreanomecon hylomeconoides clusters with either Hylomecon or Chelidonium, another morphologically similar genus within the tribe Chelidonieae. Plastome analysis of the three genera revealed low divergence among them (On-line Supplemental Data Fig. S2), highlighting the need to incorporate nuclear DNA data to resolve their relationships.
Hanabusaya shares several gross morphological traits with Adenophora, including an erect growth habit and overall floral architecture. However, it is distinguished by reproductive features, such as fused stamens and pubescence on the ovary, both of which are absent or less pronounced in Adenophora. These diagnostic traits, commonly used in Campanulaceae systematics (Crowl et al., 2016), align with molecular phylogenetic evidence supporting its independent evolutionary lineage. Notably, Hanabusaya is resolved as a distinct monotypic genus in multiple genetic studies using nuclear ITS sequences and chloroplast genome data (Kim et al., 1999; Cheon and Yoo, 2013). Furthermore, the results of this study confirm that Hanabusaya is a monotypic genus, reinforcing its taxonomic distinctiveness within the family (Fig. 1C).
Mankyua, a monotypic genus endemic to Jejudo Island, Korea, and comprising only M. chejuense B. Y. Sun, M. H. Kim & C. H. Kim, is morphologically characterized by palmately divided fronds, thick, fleshy stipes, and well-separated fertile segments bearing sporangia. These traits are intermediate between Ophioglossum and Botrychium yet do not fully align with either genus. Previous molecular phylogenetic studies using plastid markers such as rbcL and trnL-F (Sun et al., 2001; Hauk et al., 2003) and plastome sequences (Kim and Kim, 2018) consistently placed Mankyua as an early-diverging lineage within the family Ophioglossaceae, supporting its recognition as an independent genus. The present rbcL analysis yielded results congruent with this morphological and molecular evidence (Fig. 1D), reinforcing the taxonomic distinctiveness and evolutionary significance of Mankyua as a unique genus endemic to the Korean Peninsula.
Megaleranthis, traditionally recognized as a monotypic genus, is endemic to the Korean Peninsula. However, DNA barcoding analyses using the plastid rbcL marker in the present study revealed that Megaleranthis saniculifolia Ohwi is nested within the Trollius clade (Fig. 2A). This finding challenges the long-standing classification of Megaleranthis as a distinct genus and provides molecular evidence for its inclusion within Trollius. Earlier studies have also questioned the distinctiveness of Megaleranthis. Lee and Yeau (1985) examined pollen morphology and karyotype, reporting that Megaleranthis exhibits striate pollen ornamentation, a rare feature in Ranunculaceae, that closely resembles that of Trollius and Calathodes, suggesting a close phylogenetic relationship. Subsequently, Jang and Heo (2005) conducted a detailed anatomical comparison of reproductive structures, including pollen grains, ovules, embryo sacs, and seeds, and found no substantial differences between Megaleranthis and Trollius in these traits. Based on these findings, they recommended that M. saniculifolia be treated as Trollius chosenensis Bunge, while provisionally maintaining it as a separate genus pending molecular confirmation. Kim et al. (2009) later supported this treatment through plastome sequence analysis of M. saniculifolia and multiple Trollius species. Collectively, evidence ranging from pollen morphology to reproductive anatomy and molecular phylogenetics, including the present study, provides strong support for the taxonomic reassignment of Megaleranthis to Trollius.
Pentactina is confirmed in the present study to be most closely related to Spiraea (Fig. 2B) while retaining distinct morphological and molecular traits that delineate it as a separate lineage within the tribe Spiraeeae of the family Rosaceae. Traditionally treated as a monotypic genus endemic to the Korean Peninsula—comprising P. rupicola Nakai—Pentactina has been supported as taxonomically distinct by previous palynological and phylogenetic studies based on trnL and ITS markers (Lee et al., 1993; Lee and Hong, 2011). However, a recent phylogenetic revision incorporating Pentactina schlothauerae (Ignatov & Vorosch.) Jakubov from the Russian Far East (Jeon et al., 2025) challenges its monotypic and endemic status, instead supporting its recognition as an oligotypic lineage within Spiraeeae. Molecular analyses of P. rupicola and P. schlothauerae—incorporating nuclear ribosomal DNA, plastome protein-coding genes (Jeon et al., 2025), and complete plastome data presented here (On-line Supplemental Data Fig. S3)—provide additional support for this idea, demonstrating that the two species form a distinct clade. These findings call for a reassessment of its systematic position and biogeographical distribution.
Both Wangsania and Sillaphyton were independently proposed as new monotypic genera endemic to Korea, based on the same species originally described as Peucedanum insolens Kitag. These treatments were published nearly simultaneously: Sillaphyton by Pimenov et al. (2016) and Wangsania by Lee et al. (2017). However, under the principle of priority in the International Code of Nomenclature for algae, fungi, and plants (ICN), Sillaphyton is accepted as the legitimate name, making Wangsania its taxonomic synonym. Molecular phylogenetic analysis of nuclear ITS sequences showed that Wangsania (= Sillaphyton) did not cluster with Peucedanum or any of the 14 recognized tribes in Apioideae, indicating a distinct phylogenetic position (Lee et al., 2017). The present study confirms this placement, as Sillaphyton does not form a monophyletic clade with Peucedanum but is most closely related to Sphallerocarpus (Fig. 3), supporting its recognition as a separate genus.
This study confirms the taxonomic distinctiveness of several Korean endemic genera using rbcL barcoding and comparative phylogenetic analysis. While genus-level resolution is achievable with plastid markers, accurate species-level discrimination—especially in closely related or recently diverged taxa—often requires additional DNA markers beyond standard plastid regions such as trnL-F and matK. Nuclear markers such as ITS and low-copy nuclear genes (e.g., LEAFY, G3PDH) have proven useful in resolving these cases (Wall, 2002; Yue et al., 2009). Therefore, there is an urgent need to expand the current barcoding effort to include species-level reference libraries that incorporate multi-locus data. Completing such a library for the Korean vascular flora will not only strengthen taxonomic resolution and phylogenetic accuracy but also enable broader applications in ecological monitoring, conservation prioritization, and policymaking. Accelerating the development of a species-level DNA barcode library should thus be recognized as a critical step toward maximizing the utility of biodiversity research in support of the GBF.








