In silico comparative analysis of the complete chloroplast genome sequences in the mulberry family (Moraceae)

Article information

Korean J. Pl. Taxon. 2024;54(2):110-120
Publication date (electronic): 2024 June 30
doi: https://doi.org/10.11110/kjpt.2024.54.2.110
Ho Chi Minh City University of Industry and Trade, 140 Le Trong Tan, Tan Phu district, Ho Chi Minh City 700000, Vietnam
Corresponding author: Viet The HO, E-mail: thehv@huit.edu.vn
Received 2024 February 9; Revised 2024 March 29; Accepted 2024 May 8.

Abstract

The mulberry family (Moraceae) encompasses a wide range of angiosperms and includes several important fruit plants, such as jackfruit, breadfruit, and figs. Although multiple complete chloroplast (cp) genomes have been published for this family, no comprehensive study has summarized the characteristics of these sequences. In the current study, a comprehensive analysis was conducted using a dataset of 32 cp genomes obtained from NCBI GenBank. These genomes belong to 32 species, encompassing 12 genera within the mulberry family. The collected data were used to conduct genomic comparisons and phylogenetic analyses with a range of bioinformatics tools. The findings revealed length variations among the cp genomes of different species, ranging from 158,459 bp in Morus mongolica to 162,594 bp in Broussonetia luzonica. Additionally, the study detected variation in simple sequence repeat motifs and structural variations, including gene losses. Overall, the phylogenetic results generally supported the existing taxonomy of species within this family, with the exception that Malaisia scandens and Trophis scandens clustered within the clade of the genus Broussonetia. This investigation provides further evidence of the effectiveness of characterizing the entire cp genome when classifying species and genera within the mulberry family.

INTRODUCTION

The mulberry family, also known as the Moraceae family, is a large and widely distributed group of angiosperm plants that inhabit tropical to subtropical regions. Currently, there are over 37 genera comprising 1,050 reported species within this family (Berg, 2005). Plants in this family have several economic uses, such as silk production (Morus and Maclura), paper making (Broussonetia, Maclura, and Morus), edible fruit (Artocarpus, Ficus, and Morus), and timber (Artocarpus and Broussonetia) (Zhekun and Gilbert, 2003). The plants within the mulberry family exhibit diverse variations in morphological traits, inflorescence architectures, and breeding systems (Clement and Weiblen, 2009). Several studies have evaluated the genetic diversity of this family. The majority of these have focused on morphological traits such as pollen (Teleb and Salah-El-din, 2014), trichomes (Schnetzler et al., 2017), and morphological and anatomical characteristics (Erarslan et al., 2021), while others have used molecular evidence such as ndhF and the 26S nuclear ribosomal subunit (Zerega et al., 2005), or internal transcribed spacer and trnL-F sequences (Weiguo et al., 2005).

However, relying solely on morphological features for classification is not entirely reliable as these traits can be influenced by various factors such as environmental conditions and developmental stages. To overcome the limitations of morphological methods, the use of DNA barcodes has been proposed. However, DNA barcoding has its own drawbacks as it relies on the sequencing of a limited number of genome regions, which may not provide sufficient discriminatory power due to the similarity of sequences between species (Galimberti et al., 2014). Additionally, DNA barcodes are specific to individual species and may not be reliable when applied to higher taxonomic levels. Currently, the highest level of discrimination achieved through DNA barcoding is only around 70%, and this accuracy may be further reduced in plants with complex genomes (Besse et al., 2021).

The variation observed in chloroplast (cp) genomes of plants has been extensively utilized in studies of population genetics, evolutionary relationships, and genetic connections. Such studies support the conservation of endangered plant species and facilitate the development of molecular markers that make breeding more efficient. In recent times, next-generation sequencing (NGS) has become a commonly employed method for DNA sequencing, replacing the Sanger method in applications that require the simultaneous sequencing of multiple target DNA or RNA molecules or the determination of complete genomes. Numerous studies have demonstrated that NGS can address the remaining challenges associated with DNA barcode technology, particularly in determining the origin of plants, detecting the presence of low-quality ingredients in products, and establishing traceability of plant-derived materials (Galimberti et al., 2014). Although several phylogenetic studies based on complete cp sequences have been published, these studies have primarily focused on individual genera such as Ficus (Zhang et al., 2022) or Morus (Zeng et al., 2022), or on a limited number of genera analyzed simultaneously (He et al., 2020; De Souza et al., 2021). In this investigation, the complete cp genome sequences of 32 species, spanning 12 genera within the mulberry family, were acquired from GenBank and subjected to analysis. By comparing the sequences, the study identified clustering patterns and specific variable DNA regions within each species and across the entire family. These findings have the potential to enhance the accuracy of species classification within the mulberry family.

MATERIALS AND METHODS

Sequence annotation and comparison of chloroplast genomes

A total of 32 complete cp genome sequences, representing 32 species from 12 genera within the mulberry family, were obtained from NCBI GenBank. For genera with multiple available sequences (Artocarpus, Broussonetia, Ficus, and Morus), only five cp sequences were randomly selected for further analysis (Table 1). The GeSeq program (https://chlorobox.mpimp-golm.mpg.de/geseq.html) was used to annotate and locate genes in the cp genomes (Tillich et al., 2017). Chloroplot (https://irscope.shinyapps.io/Chloroplot/) was used to identify the number of protein-coding genes, rRNA genes, and tRNA genes, as well as the GC content, of each cp genome (Zheng et al., 2020). All 32 cp sequences were then compared using the mVISTA program (https://genome.lbl.gov/vista/index.shtml) in Shuffle-LAGAN mode (Brudno et al., 2003).
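As a rough illustration of the GC content statistic mentioned above, the following minimal Python sketch computes the percentage of G and C bases in a nucleotide sequence. It is not Chloroplot's implementation; the function name `gc_content` and the toy sequence are assumptions made for this example, and ambiguous symbols such as N are simply left out of the calculation.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a sequence, counting only A, C, G, and T."""
    seq = sequence.upper()
    # Only unambiguous bases contribute to the denominator (an assumption of this sketch).
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence for illustration only; it is not taken from any cp genome.
toy_sequence = "ATGCATTTAAGC"
print(f"GC content: {gc_content(toy_sequence):.1f}%")
```

For a real cp genome, the sequence string would first be read from the downloaded FASTA or GenBank record.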

Repeat element analysis

The MAFFT program (https://mafft.cbrc.jp/alignment/server/) was used to align the 32 genome sequences with default settings. The IRscope tool (https://irscope.shinyapps.io/irapp/) was then employed to visualize the large single-copy region (LSC)/inverted repeat B (IRB)/small single-copy region (SSC)/inverted repeat A (IRA) junctions among these sequences, based on the annotations of the cp genomes available in GenBank (Amiryousefi et al., 2018). To detect simple sequence repeat (SSR) motifs, the MISA tool (http://pgrc.ipk-gatersleben.de/misa/misa.html) was used with the following minimum repeat thresholds: ten repeats for mononucleotides, six for dinucleotides, five for trinucleotides, four for tetranucleotides, and three each for penta- and hexanucleotides (Beier et al., 2017).
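The SSR thresholds listed above can be illustrated with a short regular-expression search. This is a simplified sketch, not the MISA algorithm itself: it reports only perfect repeats, does not merge compound SSRs, and the names `MIN_REPEATS`, `find_ssrs`, and the toy sequence are assumptions introduced for this example.

```python
import re

# Minimum number of repeats per motif length (1-6 nt), as stated in the text.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def _is_multiple_of_shorter_unit(motif: str) -> bool:
    """True if the motif is itself a repetition of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(sequence: str, min_repeats=MIN_REPEATS):
    """Return sorted (0-based start, motif, repeat count) tuples for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_count - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if _is_multiple_of_shorter_unit(motif):
                continue  # already reported under the shorter motif length
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence for illustration only: (A)12, (AT)6, and (TTC)5 with short spacers.
toy = "GC" + "A" * 12 + "CG" + "AT" * 6 + "GGA" + "TTC" * 5 + "G"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

Counting the motifs returned for each genome would give per-species tallies of the kind summarized in the Results.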

Phylogenetic analysis

The TimeTree tool (http://www.timetree.org) was used to determine the divergence times and initial phylogenetic relatedness of all species in the mulberry family based on the molecular sequences available on NCBI (Kumar et al., 2017). The 32 cp genomes were then aligned using the MAFFT alignment tool (https://mafft.cbrc.jp/alignment/server), and phylogenetic trees were constructed with the maximum likelihood method, a discrete character-based approach (Kang et al., 2017), in MEGA11 (Tamura et al., 2021) with 500 bootstrap replicates. The cp genomes of four species from different families were included as outgroups (MK361034: Rosa banksiae; NC_058887: Elaeagnus pungens; NC_040984: Barbeya oleoides; and NC_026562: Cannabis sativa), following the outgroup selection strategies described by Luo et al. (2010). To assess the resolution of the cp genomes, the resulting phylogenetic tree was examined using the following criterion: if all species belonging to a genus were grouped together in a single monophyletic clade, the genus was considered to have clear resolution; if species from a genus were scattered across different clades, the genus was considered unresolved (Sikdar et al., 2018).
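The resolution criterion described above can be expressed as a small check on a rooted tree. The sketch below is one possible reading of that criterion, not the procedure used in MEGA11: the tree is written as nested Python tuples, the function names are assumptions, and the toy topology is invented for illustration and is not the tree reported in this study.

```python
def collect_clades(tree):
    """Return (leaf set of this node, leaf sets of every node at or below it)."""
    if isinstance(tree, str):  # a leaf is a plain "Genus species" string
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves, clades = frozenset(), []
    for child in tree:
        child_leaves, child_clades = collect_clades(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)
    return leaves, clades


def genus_resolution(tree):
    """Map each genus to True (single monophyletic clade) or False (unresolved)."""
    all_leaves, clades = collect_clades(tree)
    clade_set = set(clades)
    members = {}
    for name in all_leaves:
        members.setdefault(name.split()[0], set()).add(name)
    # A genus is resolved when its members are exactly the leaves under one node.
    return {genus: frozenset(taxa) in clade_set for genus, taxa in members.items()}


# Hypothetical rooted topology, used only to exercise the function.
toy_tree = (
    (("Morus alba", "Morus notabilis"), ("Ficus carica", "Ficus hirta")),
    (("Broussonetia papyrifera", "Malaisia scandens"), "Broussonetia luzonica"),
)
result = genus_resolution(toy_tree)
for genus, resolved in sorted(result.items()):
    print(genus, "resolved" if resolved else "unresolved")
```

In this toy case the Broussonetia sequences do not form a clade of their own because another genus is nested among them, so the genus is flagged as unresolved.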

RESULTS AND DISCUSSION

Sequence annotation and comparison of chloroplast genomes

Through a search conducted on the NCBI database, complete cp genome sequences of 32 species from 12 genera within the mulberry family were obtained for analysis. As of February 2, 2024, the genus Ficus had the highest number of available sequences, with 34 cp genomes. This was followed by Morus, Artocarpus, Broussonetia, Maclura, Milicia, Trophis, Afromorus, Antiaris, Malaisia, Pseudostreblus, and Streblus, with 16, 12, 6, 3, 2, 2, 1, 1, 1, 1, and 1 sequences, respectively. To ensure balanced representation, up to five sequences per genus were selected for further analysis, resulting in a total of 32 cp sequences (Table 1). The GeSeq program was used to obtain the structural characteristics and gene contents of the cp genomes, as shown in Fig. 1. Like other cp genomes, all cp genomes in this study exhibited a four-part structure, comprising an LSC, an SSC, and two IRs.
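The arithmetic behind the final dataset size can be checked with a few lines of Python. This is only a sketch of the cap-per-genus rule described above; the variable names are assumptions, and the counts are those quoted in the preceding paragraph.

```python
# Number of complete cp genomes available per genus, as quoted in the text.
available = {
    "Ficus": 34, "Morus": 16, "Artocarpus": 12, "Broussonetia": 6,
    "Maclura": 3, "Milicia": 2, "Trophis": 2, "Afromorus": 1,
    "Antiaris": 1, "Malaisia": 1, "Pseudostreblus": 1, "Streblus": 1,
}

MAX_PER_GENUS = 5  # cap applied to genera with many available sequences

# Keep every sequence for small genera and at most five for large ones.
selected = {genus: min(count, MAX_PER_GENUS) for genus, count in available.items()}

print("genera:", len(available))
print("sequences available:", sum(available.values()))
print("sequences selected:", sum(selected.values()))
```

Only the four largest genera (Ficus, Morus, Artocarpus, and Broussonetia) are affected by the cap.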

Fig. 1.

Typical map of a mulberry chloroplast (cp) genome. Genes located outside and inside the circle are transcribed in the clockwise and counterclockwise directions, respectively. The major regions of the cp genome are labeled as large single-copy region (LSC), small single-copy region (SSC), inverted repeat A (IRA), and inverted repeat B (IRB). The inner circle of the map illustrates the GC content in dark gray and the AT content in light gray.

The size of the cp genomes in the 32 sequences ranged from 158,459 bp in Morus mongolica to 162,594 bp in Broussonetia luzonica, with an average size of 160,432 bp per cp genome. The cp genome size of the mulberry family is slightly smaller than that of other land plants, as described by De Las Rivas et al. (2002). The number of protein-coding genes varies from 80 in Ficus religiosa to 92 in Morus mongolica. There seems to be no consistent rule governing the number of these genes among species within the same genus or across different genera, as the numbers vary considerably even within a single genus. Previous studies have also reported substantial variation in the number of protein-coding genes within genera such as Prunus (Xue et al., 2019), Cycas (Chang et al., 2020), and Rhus (Xu et al., 2022). As the number of complete cp genomes available for each species continues to increase, it should become feasible to determine the specific gene counts for each species and to look for patterns in the number of these genes within the same genus. Most of the cp genomes contain 8 rRNA genes and 37 tRNA genes, except for Ficus racemosa, which has only 4 rRNA and 27 tRNA genes. Artocarpus petelotii is the only species with 36 tRNA genes. The average GC content of the cp genomes in the 32 species is approximately 36%.
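The size range and average quoted above can be recomputed directly from the genome sizes listed in Table 1. The sketch below is an illustrative recalculation only; the variable names are assumptions, and the sizes are copied from Table 1, keyed by accession number.

```python
from statistics import mean

# Genome sizes (bp) copied from Table 1, keyed by GenBank accession number.
genome_sizes = {
    "NC_057287.1": 160952, "NC_059002.1": 160184, "NC_054247.1": 160096,
    "NC_056286.1": 161009, "NC_080592.1": 160743, "MH430880.1": 160290,
    "MH223641": 160592, "MH223642": 160841, "NC_047180": 162594,
    "NC_047181": 160777, "NC_033979": 160627, "NC_035237.1": 160602,
    "NC_028185": 159473, "NC_051532": 160357, "MZ128521": 160331,
    "KM491711": 158459, "MT577029": 159184, "NC_027110": 158680,
    "NC_079702": 159282, "NC_079652": 159184, "MW732703": 161355,
    "NC_066228": 161295, "NC_056295": 161348, "MZ274134.1": 160014,
    "NC_042884": 161412, "NC_047182": 161313, "MZ274133.1": 160232,
    "MZ274132": 160136, "MN065161": 159853, "NC_053933": 159853,
    "NC_057293": 161445, "MH189568": 161313,
}

smallest = min(genome_sizes, key=genome_sizes.get)  # accession with the shortest genome
largest = max(genome_sizes, key=genome_sizes.get)   # accession with the longest genome
average = mean(list(genome_sizes.values()))

print(f"genomes: {len(genome_sizes)}")
print(f"smallest: {smallest} ({genome_sizes[smallest]:,} bp)")
print(f"largest: {largest} ({genome_sizes[largest]:,} bp)")
print(f"mean size: {average:,.0f} bp")
```

KM491711 and NC_047180 correspond to Morus mongolica and Broussonetia luzonica in Table 1.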

When the cp genome of Artocarpus hypargyreus (NC_057287.1) served as the reference for aligning the cp genomes with mVISTA, noticeable distinctions were observed between the cp sequences of the Artocarpus species (NC_057287.1: A. hypargyreus; NC_059002.1: A. altilis; NC_054247.1: A. camansi; NC_056286.1: A. petelotii; and NC_080592.1: A. gomezianus) and the other cp sequences (Fig. 2). A large gap identified in the NC_080592.1 sequence warrants further investigation to determine whether this fragment was altered in the cp genome by mutation or resulted from incomplete genome assembly.

Fig. 2.

Sequence identity plot comparing the 32 chloroplast genomes, with NC_057287.1 as the reference, generated using mVISTA.

Repeat element analysis

Using the default parameters of the MISA program, tandem repeat sequences consisting of 1–6 nucleotide repeat units were analyzed to determine the relative abundance of SSRs (Fig. 3). A total of 2,140 SSRs were detected across the 32 cp genomes, ranging from 51 SSRs in Artocarpus petelotii to 97 SSRs in Maclura cochinchinensis, with an average of approximately 66 SSRs per cp genome. Among the detected SSRs, 11 different motifs were identified, namely A, T, C, G, AT, TA, TAA, TTC, TTA, AAT, and AAAT. The most dominant mononucleotide types were A and T, accounting for 35.4% (758) and 56.3% (1,206) of the total SSRs, respectively. In contrast, the mononucleotide types C and G were rarely detected, with only 22 and 16 instances, respectively. The dinucleotide (AT and TA), trinucleotide (TAA, TTC, TTA, and AAT), and tetranucleotide (AAAT) motifs were identified at lower frequencies. The variation in SSR motifs among species within this family can provide valuable information for species identification, population genetics, and phylogenetic studies (Androsiuk et al., 2020).

Fig. 3.

The different repeat types in the chloroplast genomes of 32 species in the mulberry family. SSR, simple sequence repeat.

While the IR regions in cp genomes are typically highly conserved, several instances of gene loss have been recorded in diverse plant species, such as barley, bamboo, cassava, and chickpea (Dobrogojski et al., 2020). By aligning the LSC/IRb/SSC/IRa borders and adjacent genes in the 32 cp genomes, significant variations were identified (Fig. 4). Notably, ndhF displayed the highest variability, ranging from translocation to the right side of the SSC in Artocarpus gomezianus, to truncation in Antiaris toxicaria, Streblus indicus, and Pseudostreblus indicus, to complete loss in Ficus concinna and F. religiosa. The loss of rpl22 was observed in all species of the genus Broussonetia, as well as in Malaisia scandens and Trophis scandens. These findings corroborate a previous study by Mohanta et al. (2020), which examined 2,511 cp genomes and identified ndhF and rpl22 as among the most frequently deleted genes in cp genomes. Variations among cp genomes are commonly observed, and several explanations have been proposed, including gene loss, expansions or contractions of IR regions, and intron loss (Mower and Vickrey, 2018).

Fig. 4.

Comparison of the border regions of the large single-copy (LSC), inverted repeat (IR), and small single-copy (SSC) regions among the 32 chloroplast genomes. Genes located at the IR/SC borders are depicted as boxes above or below the main lines, and the numbers above the genes indicate the distance from the gene terminus to the boundary.

Phylogenetic analyses

Using the TimeTree tool, information was collected from the NCBI GenBank database for 390 species across 37 genera within the mulberry family. The divergence times of these 37 genera are presented in Fig. 5. The data indicate that speciation events within the mulberry family took place over a wide range of time, from approximately 59 million years ago to around 6 million years ago, spanning various geologic periods. However, three genera, namely Afromorus, Malaisia, and Pseudostreblus, were not included in this phylogenetic tree, indicating a discrepancy between the genera listed in the TimeTree database and the genera represented among the available cp genomes.

Fig. 5.

Phylogenetic tree and timeline chronogram of 390 species belonging to 37 genera in the mulberry family, generated using the TimeTree tool.

Furthermore, a phylogenetic analysis was conducted using the 32 available cp genomes from 12 genera, and the resulting tree, which has high bootstrap values, is presented in Fig. 6. Among these genera, Artocarpus, Milicia, Morus, Maclura, and Ficus were found to exhibit the highest conservation, forming distinct monophyletic clades. Previously, the Royal Botanic Gardens Kew considered Streblus indicus and Pseudostreblus indicus as two separate species (https://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:856410-1). However, our analysis reveals that they cluster together, suggesting a close relationship and the possibility that they are a single species. In contrast, although the five cp sequences of the genus Broussonetia were grouped within a single clade, the insertion of the Malaisia scandens and Trophis scandens sequences fragmented their clustering. The inconsistency in the phylogenetic results may be attributed to variations in the number of markers used in different studies. Phylogenetic studies based on morphological traits or specific genes/markers can yield erroneous outcomes due to the different evolutionary rates of these markers (Wu and Ge, 2011). On the other hand, analyzing the complete cp genome provides a wealth of information and higher resolution, making it valuable for classifying organisms below the species level and conducting phylogenetic studies (Long et al., 2023).

Fig. 6.

Phylogenetic tree of 32 chloroplast (cp) genomes of the Moraceae family. The cp sequences of Rosa banksiae, Elaeagnus pungens, Barbeya oleoides, and Cannabis sativa are used as outgroups. Numbers near branches are bootstrap values.

CONCLUSIONS

As the sequencing of cp genomes becomes increasingly accessible, there is a notable surge in the number of sequenced genomes. Consequently, it becomes essential to conduct in silico studies to comprehensively assess the cp genomes generated from various independent studies. This study aims to elucidate the common structure and content of cp genomes within the mulberry family by analyzing 32 species. By examining the similarities and differences among these cp genomes, we can enhance our understanding of the genetic structure of this plant family.

Acknowledgements

This work was supported by the Ho Chi Minh City University of Industry and Trade, Vietnam, through the HUIT fund for Science and Technology under Contract No. 97/HD-DCT.

Notes

CONFLICTS OF INTEREST

The authors declare that there are no conflicts of interest.

References

Amiryousefi A., Hyvönen J., Poczai P. 2018. IRscope: An online program to visualize the junction sites of chloroplast genomes. Bioinformatics 34:3030–3031.
Androsiuk P., Jastrzębski J. P., Paukszto Ł., Makowczenko K., Okorski A., Pszczółkowska A., Chwedorzewska K. J., Górecki R., Giełwanowska I. 2020. Evolutionary dynamics of the chloroplast genome sequences of six Colobanthus species. Scientific Reports 10:11522.
Beier S., Thiel T., Münch T., Scholz U., Mascher M. 2017. MISA-web: A web server for microsatellite prediction. Bioinformatics 33:2583–2585.
Berg C. C. 2005. A new species of Artocarpus (Moraceae) from Thailand. Blumea 50:513–533.
Besse P., Da Silva D., Grisoni M. 2021. Plant DNA barcoding principles and limits: A case study in the genus Vanilla. Methods in Molecular Biology 2222:131–148.
Brudno M., Malde S., Poliakov A., Do C. B., Couronne O., Dubchak I., Batzoglou S. 2003. Glocal alignment: Finding rearrangements during alignment. Bioinformatics 19(Suppl 1):i54–i62.
Chang A. C. G., Lai Q., Chen T., Tu T., Wang Y., Agoo E. M. G., Duan J., Li N. 2020. The complete chloroplast genome of Microcycas calocoma (Miq.) A. DC. (Zamiaceae, Cycadales) and evolution in Cycadales. PeerJ 8:e8305.
Clement W. L., Weiblen G. D. 2009. Morphological evolution in the mulberry family (Moraceae). Systematic Botany 34:530–552.
De Las Rivas J., Lozano J. J., Ortiz A. R. 2002. Comparative analysis of chloroplast genomes: Functional annotation, genome-based phylogeny, and deduced evolutionary patterns. Genome Research 12:567–583.
De Souza U. J. B., dos Santos R. N., de Araújo Filho R. N., dos Santos G. R., Sarmento R. A., De Bellis F., Campos F. S. 2021. The complete chloroplast genome of Artocarpus altilis (Moraceae) and phylogenetic relationships. Mitochondrial DNA Part B Resources 6:2291–2293.
Dobrogojski J., Adamiec M., Luciński R. 2020. The chloroplast genome: A review. Acta Physiologiae Plantarum 42:98.
Erarslan Z. B., Karagöz S., Kültür S. 2021. Comparative morphological and anatomical studies on Morus species (Moraceae) in Turkey. Turkish Journal of Pharmaceutical Sciences 18:157–166.
Galimberti A., Labra M., Sandionigi A., Bruno A., Mezzasalma V., De Mattia F. 2014. DNA barcoding for minor crops and food traceability. Advances in Agriculture 2014:831875.
He S.-L., Tian Y., Yang Y., Shi C.-Y. 2020. Chloroplast genome and phylogenetic analyses of Morus alba (Moraceae). Mitochondrial DNA Part B Resources 5:2203–2204.
Kang Y., Deng Z., Zang R., Long W. 2017. DNA barcoding analysis and phylogenetic relationships of tree species in tropical cloud forests. Scientific Reports 7:12564.
Kumar S., Stecher G., Suleski M., Hedges S. B. 2017. TimeTree: A resource for timelines, timetrees, and divergence times. Molecular Biology and Evolution 34:1812–1819.
Long L., Li Y., Wang S., Liu Z., Wang J., Yang M. 2023. Complete chloroplast genomes and comparative analysis of Ligustrum species. Scientific Reports 13:212.
Luo A.-R., Zhang Y.-Z., Qiao H.-J., Shi W.-F., Murphy R. W., Zhu C.-D. 2010. Outgroup selection in tree reconstruction: A case study of the family Halictidae (Hymenoptera: Apoidea). Acta Entomologica Sinica 53:192–201.
Mohanta T. K., Mishra A. K., Khan A., Hashem A., Abd_Allah E. F., Al-Harrasi A. 2020. Gene loss and evolution of the plastome. Genes 11:1133.
Mower J. P., Vickrey T. L. 2018. Structural diversity among plastid genomes of land plants. Advances in Botanical Research 85:263–292.
Schnetzler B. N., Teixeira S. P., Marinho C. R. 2017. Trichomes that secrete substances of a mixed nature in the vegetative and reproductive organs of some species of Moraceae. Acta Botanica Brasilica 31:392–402.
Sikdar S., Tiwari S., Thakur V. V., Sapre S. 2018. An in silico approach for evaluation of rbcL and matK loci for DNA barcoding of Fabaceae family. International Journal of Chemical Studies 6:2446–2451.
Tamura K., Stecher G., Kumar S. 2021. MEGA11: Molecular Evolutionary Genetics Analysis version 11. Molecular Biology and Evolution 38:3022–3027.
Teleb S. S., Salah-El-din R. M. 2014. Pollen morphology of some species of genus Ficus L. (Moraceae) from Egypt. Egyptian Journal of Botany 54:87–102.
Tillich M., Lehwark P., Pellizzer T., Ulbricht-Jones E. S., Fischer A., Bock R., Greiner S. 2017. GeSeq: Versatile and accurate annotation of organelle genomes. Nucleic Acids Research 45:W6–W11.
Weiguo Z., Yile P., Zhifang Z., Shihai J., Xuexia M., Yongping H. 2005. Phylogeny of the genus Morus (Urticales: Moraceae) inferred from ITS and trnL-F sequences. African Journal of Biotechnology 4:563–569.
Wu Z.-Q., Ge S. 2011. The phylogeny of the BEP clade in grasses revisited: Evidence from the whole-genome sequences of chloroplasts. Molecular Phylogenetics and Evolution 62:573–578.
Xu Y., Wen J., Su X., Ren Z. 2022. Variation among the complete chloroplast genomes of the Sumac species Rhus chinensis: Reannotation and comparative analysis. Genes 13:1936.
Xue S., Shi T., Luo W., Ni X., Iqbal S., Ni Z., Huang X., Yao D., Shen Z., Gao Z. 2019. Comparative analysis of the complete chloroplast genome among Prunus mume, P. armeniaca, and P. salicina. Horticulture Research 6:89.
Zeng Q., Chen M., Wang S., Xu X., Li T., Xiang Z., He N. 2022. Comparative and phylogenetic analyses of the chloroplast genome reveal the taxonomy of the Morus genus. Frontiers in Plant Science 13:1047592.
Zerega N. J. C., Clement W. L., Datwyler S. L., Weiblen G. D. 2005. Biogeography and divergence times in the mulberry family (Moraceae). Molecular Phylogenetics and Evolution 37:402–416.
Zhang Z., Zhang D.-S., Zou L., Yao C.-Y. 2022. Comparison of chloroplast genomes and phylogenomics in the Ficus sarmentosa complex (Moraceae). PLoS ONE 17:e0279849.
Zhekun Z., Gilbert M. G. 2003. Moraceae. In Flora of China. Vol. 5. Ulmaceae through Basellaceae. Wu Z., Raven P. H., Hong D. Y. (eds.), Science Press, Beijing and Missouri Botanical Garden Press, St. Louis, MO. Pp. 21–73.
Zheng S., Poczai P., Hyvönen J., Tang J., Amiryousefi A. 2020. Chloroplot: An online program for the versatile plotting of organelle genomes. Frontiers in Genetics 11:576124.

Table 1.

Size comparison of cp genome features of 32 species in the mulberry family

| No. | Accession No. | Scientific name | Genome size (bp) | Protein-coding genes | rRNA genes | tRNA genes |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | NC_057287.1 | Artocarpus hypargyreus | 160,952 | 85 | 8 | 37 |
| 2 | NC_059002.1 | Artocarpus altilis | 160,184 | 88 | 8 | 37 |
| 3 | NC_054247.1 | Artocarpus camansi | 160,096 | 88 | 8 | 37 |
| 4 | NC_056286.1 | Artocarpus petelotii | 161,009 | 82 | 8 | 36 |
| 5 | NC_080592.1 | Artocarpus gomezianus | 160,743 | 83 | 8 | 37 |
| 6 | MH430880.1 | Broussonetia papyrifera | 160,290 | 88 | 8 | 37 |
| 7 | MH223641 | Broussonetia kaempferi | 160,592 | 91 | 8 | 37 |
| 8 | MH223642 | Broussonetia kazinoki | 160,841 | 90 | 8 | 37 |
| 9 | NC_047180 | Broussonetia luzonica | 162,594 | 88 | 8 | 37 |
| 10 | NC_047181 | Broussonetia monoica | 160,777 | 86 | 8 | 37 |
| 11 | NC_033979 | Ficus religiosa | 160,627 | 80 | 8 | 37 |
| 12 | NC_035237.1 | Ficus carica | 160,602 | 83 | 8 | 37 |
| 13 | NC_028185 | Ficus racemosa | 159,473 | 88 | 4 | 27 |
| 14 | NC_051532 | Ficus hirta | 160,357 | 84 | 8 | 37 |
| 15 | MZ128521 | Ficus concinna | 160,331 | 87 | 8 | 37 |
| 16 | KM491711 | Morus mongolica | 158,459 | 92 | 7 | 37 |
| 17 | MT577029 | Morus alba | 159,184 | 85 | 8 | 37 |
| 18 | NC_027110 | Morus notabilis | 158,680 | 85 | 8 | 37 |
| 19 | NC_079702 | Morus liboensis | 159,282 | 85 | 8 | 37 |
| 20 | NC_079652 | Morus bombycis | 159,184 | 85 | 8 | 37 |
| 21 | MW732703 | Maclura tricuspidata | 161,355 | 86 | 8 | 37 |
| 22 | NC_066228 | Maclura cochinchinensis | 161,295 | 85 | 8 | 37 |
| 23 | NC_056295 | Maclura tricuspidata | 161,348 | 85 | 8 | 37 |
| 24 | MZ274134.1 | Afromorus mesozygia | 160,014 | 86 | 8 | 37 |
| 25 | NC_042884 | Antiaris toxicaria | 161,412 | 86 | 8 | 37 |
| 26 | NC_047182 | Malaisia scandens | 161,313 | 86 | 8 | 37 |
| 27 | MZ274133.1 | Milicia regia | 160,232 | 86 | 8 | 37 |
| 28 | MZ274132 | Milicia excelsa | 160,136 | 86 | 8 | 37 |
| 29 | MN065161 | Pseudostreblus indicus | 159,853 | 85 | 8 | 37 |
| 30 | NC_053933 | Streblus indicus | 159,853 | 85 | 8 | 37 |
| 31 | NC_057293 | Trophis caucana | 161,445 | 85 | 8 | 37 |
| 32 | MH189568 | Trophis scandens | 161,313 | 86 | 8 | 37 |