INTRODUCTION
Many species of the Fabaceae play vital roles in global ecosystems, contributing significantly to soil fertility through nitrogen fixation (Jithesh et al., 2024). Many are also widely used as food, forage, biofuel, and medicinal resources (Lewis et al., 2005; Ketcha Wanda et al., 2015). Consequently, numerous studies, including analyses of chemical components, genetic characteristics, and breeding programs, have investigated this family (Cho, 2006; Seong et al., 2017; Choi et al., 2019). The genus Lespedeza Michx. belongs to the Fabaceae, and many of its species have been used as medicinal resources for invigorating organs (stomach, liver, and kidneys) and for reducing inflammatory diseases (Huang et al., 2010; Kim and Kim, 2010; Chitiala et al., 2025). However, Lespedeza presents taxonomic challenges owing to frequent interspecific hybridization and extensive morphological variation within species (Xu et al., 2017). Therefore, accurate species identification is crucial to prevent the misuse of resources and maintain product quality.
The chloroplast, a uniparentally inherited organelle, performs critical functions such as photosynthesis, carbon fixation, and stress response (Kusnetsov, 2017; Song et al., 2021). A typical chloroplast genome exhibits a well-conserved quadripartite structure and gene content; however, several studies have reported structural changes, including variations in the position of inverted repeat boundaries, gene content, and gene order (e.g., Yoo et al., 2021). These rearrangements in the chloroplast genome provide valuable information for elucidating the phylogenetic relationships and evolutionary history of lineages. Consequently, an increasing number of chloroplast genomes are being reported.
Recent chloroplast genome analyses of Lespedeza (e.g., Jin et al., 2019a; Park et al., 2024) have shed light on the divergence of this genus from related genera. For instance, specific structural changes, such as the inversion of the trnD-GUC–trnY-GUA–trnE-UUC gene cluster, have been instrumental in clarifying the evolutionary relationships among Lespedeza species. Phylogenetic trees constructed from chloroplast genomes have also revealed important taxonomic relationships among Lespedeza species (Jin et al., 2019a).
In this study, we focused on three Lespedeza species of section Junceae: L. pilosa (Thunb.) Siebold & Zucc., L. tomentosa (Thunb.) Siebold ex Maxim., and L. virgata (Thunb.) DC. These three species are used for various medicinal purposes and hold significant value as genetic resources (Huang et al., 2010; Li et al., 2010; Kwon et al., 2021). Sequencing these three species is particularly important because it completes the chloroplast genome sequencing of all native Korean species of section Junceae (Choi, 2007). Our comprehensive analysis can serve as a crucial resource for further clarifying the phylogenetic relationships and evolutionary history of this taxonomically challenging group.
MATERIALS AND METHODS
Fresh leaves of L. pilosa were collected from the Healing Forest in Jeju, Korea (33°17′48.01″N, 126°31′10.10″E), L. tomentosa from the Sejong National Arboretum in Sejong, Korea (36°29′45.39″N, 127°17′17.31″E), and L. virgata from the Ganghwa Haenuri Park in Incheon, Korea (37°42′27.34″N, 126°21′30.40″E). All voucher specimens (collection numbers: L. pilosa, 2411002; L. tomentosa, 2411001; L. virgata, 2309008) have been deposited in the herbarium of the National Institute of Biological Resources (KB) in Incheon, Korea.
Genomic DNA was extracted using the Axen Plant DNA Mini Kit (Macrogen, Seoul, Korea) and sequenced on a NovaSeq platform (Phyzen, Seoul, Korea). In total, 18,013,454 (L. pilosa), 17,858,836 (L. tomentosa), and 77,600,378 (L. virgata) paired-end reads (150 bp each) were generated. These raw reads were trimmed using Trimmomatic 4.0 (Bolger et al., 2014), yielding 17,286,054 (L. pilosa), 17,243,794 (L. tomentosa), and 34,750,528 (L. virgata) clean reads. The clean reads were then assembled de novo using GetOrganelle 1.7.7.1 (Jin et al., 2020), with the chloroplast genome of L. cuneata (GenBank accession no. NC057455) as the reference, and the draft genome sequences were checked with the map-to-reference application in Geneious Prime 2024.0.5 (Biomatters Ltd., Auckland, New Zealand). Genes in the confirmed sequences were annotated where the nucleotide sequences of the plastid genes of the tested species showed >90% similarity to the reference genome. Some protein-coding genes were identified manually by considering their start and stop codons. tRNAs were confirmed using tRNAscan-SE (Lowe and Chan, 2016) and compared with the tRNAs of other species. The completed chloroplast genome sequences have been deposited in the database of the Korea Bioinformation Center (accession nos. KAS2474094–KAS2474096) and are available from https://kbds.re.kr/.
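The read counts before and after trimming can be compared as retained fractions, which makes the three samples directly comparable. The short Python sketch below uses only the counts stated above; the dictionary layout and the function name `retained_fraction` are illustrative assumptions and not part of the original workflow.

```python
# Raw and clean paired-end read counts, copied from the text above
READ_COUNTS = {
    "L. pilosa": (18_013_454, 17_286_054),
    "L. tomentosa": (17_858_836, 17_243_794),
    "L. virgata": (77_600_378, 34_750_528),
}


def retained_fraction(raw, clean):
    """Fraction of raw reads that remain after trimming."""
    if raw <= 0:
        raise ValueError("raw read count must be positive")
    return clean / raw


if __name__ == "__main__":
    for species, (raw, clean) in READ_COUNTS.items():
        pct = 100 * retained_fraction(raw, clean)
        print(f"{species}: {clean:,} of {raw:,} reads retained ({pct:.1f}%)")
```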
Circular maps of the three analyzed chloroplast genomes were generated using OGDRAW v1.31 (Greiner et al., 2019). For phylogenetic analyses of Lespedeza, the following 11 additional genome sequences were obtained from GenBank: L. bicolor Turcz. (NC046836), L. buergeri Miq. (NC061375), L. cuneata (NC057455), L. davurica (Laxm.) Schindl. (NC042748), L. floribunda Bunge (MH800327), L. inschanica (Maxim.) Schindl. (PQ652319), L. juncea (L. f.) Pers. (PQ652320), L. maritima Nakai (MG867570), L. tricolor (Nakai) D. P. Jin & J. W. Park & B. H. Choi (NC064210), Kummerowia striata (Thunb.) Schindl. (MG867569), and Campylotropis macrocarpa (Bunge) Rehder (MG867566) (Jin et al., 2019a; Somaratne et al., 2019; Kim et al., 2022; Wang et al., 2022; Park et al., 2024). Two species, K. striata and C. macrocarpa, were used as outgroups. For each species, 67 conserved protein-coding genes were selected and aligned using Geneious Alignment with default settings. The best-fit nucleotide substitution model for maximum likelihood (ML) analysis was determined using jModelTest 2.1.6 (Darriba et al., 2012). Among 88 candidate models, including those incorporating gamma-distributed rate heterogeneity, the TVM + F + I model was selected based on the Akaike information criterion. The ML tree was inferred using IQ-TREE 1.6.12 (Trifinopoulos et al., 2016) with 1,000 bootstrap replicates. Polymorphic simple sequence repeats (SSRs) in chloroplast genomes have been widely studied for applications in species identification, population genetics, and phylogenetic research. SSR loci were identified using MISA (Beier et al., 2017) with minimum thresholds of ten repeats for mononucleotides, five for dinucleotides, four for trinucleotides, and three for tetra-, penta-, and hexanucleotides.
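To illustrate how the SSR thresholds above operate, the following Python sketch scans a sequence for perfect tandem repeats using the same minimum repeat numbers. It is a simplified stand-in written for illustration, not MISA itself: it does not merge compound SSRs, and the names `MIN_REPEATS`, `is_primitive`, and `find_ssrs` are hypothetical. Counts from such a sketch may therefore differ from those reported by MISA.

```python
import re

# Minimum repeat counts per motif length, as stated in the text
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """Return True if the motif is not a repetition of a shorter unit.

    For example, 'AT' is primitive, whereas 'AA' and 'ATAT' are not.
    """
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Start positions are 0-based. Only the bases A, C, G, and T are considered.
    """
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_rep - 1) more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
                pos = match.end()
            else:
                # Skip motifs such as 'AA' that belong to a shorter class,
                # but keep scanning from the next base so nothing is missed
                pos = match.start() + 1
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: A x12, AT x6, and AAG x4 pass; AGCT x2 is below threshold
    toy = "G" + "A" * 12 + "C" + "AT" * 6 + "C" + "AAG" * 4 + "C" + "AGCT" * 2 + "T"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

The primitive-motif check keeps a homopolymer run such as A×12 from also being counted as a di-, tri-, or tetranucleotide repeat.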
RESULTS AND DISCUSSION
The chloroplast genomes of L. pilosa, L. tomentosa, and L. virgata were 149,065, 149,056, and 148,957 bp long, respectively. All genomes showed a typical quadripartite structure, comprising a large single-copy region (82,495 bp for L. pilosa, 82,460 bp for L. tomentosa, and 85,443 bp for L. virgata), a small single-copy region (18,912 bp for L. pilosa, 18,932 bp for L. tomentosa, and 18,944 bp for L. virgata), and two inverted repeat regions (each 23,829 bp for L. pilosa, 23,832 bp for L. tomentosa, and 22,285 bp for L. virgata). The overall GC content of each genome was ca. 35% (Fig. 1, Table 1). Each genome harbored 128 genes, including 83 protein-coding, 8 rRNA, and 37 tRNA genes (Table 1, Appendix 1). Chloroplast genome lengths and gene counts were highly similar to those previously reported for other Lespedeza species (Jin et al., 2019a; Somaratne et al., 2019; Kim et al., 2022; Park et al., 2024). Furthermore, our analysis revealed several genomic changes shared with other species of the tribe Desmodieae. The inversion of the trnD-GUC–trnY-GUA–trnE-UUC (trnD-Y-E) gene cluster, a characteristic of this group (Jin et al., 2019a), was also confirmed in our study. The numbers of SSRs detected in the chloroplast genomes of L. pilosa, L. tomentosa, and L. virgata were 78, 73, and 84, respectively (Fig. 2). As in other Lespedeza species, the mononucleotide repeat motif (especially A/T) was the most abundant (Fig. 2), occurring 42, 35, and 52 times in L. pilosa, L. tomentosa, and L. virgata, respectively. This was followed by the dinucleotide repeat motif (especially AT/AT), which occurred 23, 24, and 21 times in L. pilosa, L. tomentosa, and L. virgata, respectively (Fig. 2).
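In a quadripartite chloroplast genome, the total length equals the large single-copy region plus the small single-copy region plus twice the length of one inverted repeat. The Python sketch below applies this identity, together with the gene-count sum, to the values reported above; the data layout and the function name `quadripartite_length` are illustrative assumptions rather than part of the original analysis.

```python
# Region lengths (bp) and total genome lengths, copied from the text above
GENOMES = {
    "L. pilosa": {"LSC": 82_495, "SSC": 18_912, "IR": 23_829, "total": 149_065},
    "L. tomentosa": {"LSC": 82_460, "SSC": 18_932, "IR": 23_832, "total": 149_056},
    "L. virgata": {"LSC": 85_443, "SSC": 18_944, "IR": 22_285, "total": 148_957},
}

# Gene counts reported for each genome
GENE_COUNTS = {"protein_coding": 83, "rRNA": 8, "tRNA": 37}
REPORTED_GENE_TOTAL = 128


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome with one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


if __name__ == "__main__":
    for species, g in GENOMES.items():
        computed = quadripartite_length(g["LSC"], g["SSC"], g["IR"])
        status = "consistent" if computed == g["total"] else "MISMATCH"
        print(f"{species}: computed {computed:,} bp, reported {g['total']:,} bp ({status})")

    gene_sum = sum(GENE_COUNTS.values())
    print(f"Gene categories sum to {gene_sum}; reported total is {REPORTED_GENE_TOTAL}")
```

A check of this kind is a convenient way to catch transcription errors when region lengths are copied into a table.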
Section Junceae formed a monophyletic group in the ML tree, with the exception of L. virgata (Fig. 3). Within the Junceae clade, L. pilosa was sister to the other species (excluding L. virgata), and L. tomentosa was closely related to L. inschanica and L. juncea (Fig. 3). The divergence between clade 1 and clade 2 was consistent with the morphological classification, with clade 1 typically exhibiting inflorescences shorter than the leaves (sessile or nearly sessile) and clade 2 typically exhibiting inflorescences longer than the leaves (pedunculate). Interestingly, L. virgata was closer to L. bicolor of section Macrolespedeza than to the other species of section Junceae. In a previous phylogenetic study, L. virgata, along with L. chinensis G. Don and L. floribunda, showed close relationships with species of section Macrolespedeza, such as L. fordii Schindl. and L. dunnii Schindl. (Xu et al., 2012). This relationship was inferred to be associated with chromosome number. In this genus, two chromosome numbers have been reported: 2n = 22 and 2n = 20. Generally, 2n = 22 is common in section Macrolespedeza, whereas 2n = 20 is common in section Junceae (summarized in Han et al., 2010). However, three species of section Junceae (L. virgata, L. chinensis, and L. floribunda) exhibit 2n = 22, similar to the species of section Macrolespedeza. Additionally, L. virgata is shrubbier than other species of section Junceae, such as L. cuneata and L. pilosa (Lee, 2003).
However, it is difficult to determine whether the closeness of L. virgata to L. bicolor in the phylogenetic tree is related to chromosomal traits, especially considering that L. floribunda, which shares similar traits with L. virgata (2n = 22 and woody habit), clustered closely with L. davurica of section Junceae (2n = 20 and herbaceous habit). Moreover, the closeness of L. virgata to L. bicolor was detected in a phylogenetic tree based on chloroplast DNA, whereas the suggested explanatory factors relate to nuclear DNA. The closeness of L. virgata to L. bicolor, relative to other species in section Junceae, is possibly the result of complex evolutionary processes, such as incomplete lineage sorting or past hybridization/introgression events, particularly given that Lespedeza is known for frequent interspecific hybridization (Xu et al., 2017). Therefore, additional analyses using nuclear markers (e.g., single-copy nuclear genes) are necessary to clarify the phylogenetic relationships among Lespedeza species. In a previous DNA barcoding study, the internal transcribed spacer showed better resolution than partial chloroplast DNA (Jin et al., 2019b). Nevertheless, our study is valuable because it completes the chloroplast genome sequencing of three native Korean species of section Junceae (L. pilosa, L. tomentosa, and L. virgata).








