Research Article | Volume 11, Issue 3, May, 2023

The de novo genome assembly (nuclear, chloroplast, and mitochondria) of ornamental plant pygmy date palm Phoenix roebelenii

Navajeet Chakravartty, Nageswara Rao Reddy Neelapu

Open Access   

Published:  Apr 04, 2023

DOI: 10.7324/JABB.2023.38646
Abstract

Over the past 10 years, ornamental plant genomics has driven whole-genome sequencing of a growing number of ornamental plants. Phoenix roebelenii (pygmy date palm) is a popular ornamental plant grown indoors and outdoors. It is a tropical and subtropical plant belonging to the family Arecaceae, and it is resistant to pests, tolerant to soil variation, and tolerant to drought. These traits make the nuclear and organelle genome sequences of P. roebelenii worth reporting. The raw genome data were retrieved from NCBI and cleaned with AdapterRemoval version 2.3.2 to obtain high-quality clean data. The genome size was estimated using KmerGenie version 1.7051, and the nuclear genome assembly was generated using MaSuRCA version 3.3.2. The completeness and quality of the genome assembly were assessed using BUSCO version 4.1.2. This analysis resulted in a draft nuclear genome sequence of 462,152,837 bp in 7019 scaffolds. Repeats, genes, tRNA genes, and transcription factors were identified and predicted using RepeatModeler version 2.0.1, AUGUSTUS version 3.3.2, tRNAscan-SE version 2.0.6, and PlantTFDB version 4.0, respectively. In total, 35.11% of repeats, 42,388 genes, 480 tRNA genes, and transcription factors for 480 genes were predicted. The functional annotation was based on the UniProt protein database, OrthoFinder version 2.2.7, InterProScan, Plant Metabolic Network (PMN) analysis, and gene ontology (GO) categorization. The chloroplast and mitochondrial genome sequences are also reported in the study. The mitochondrial and chloroplast assemblies were generated using GetOrganelle version 1.6.4. The chloroplast genome consists of 125,222 bp, and the mitochondrial genome consists of 482,735 bp. The chloroplast and mitochondrial genome annotations were performed using CPGAVAS2 version 1 and AGORA version 1, respectively. The chloroplast genome has 108 genes, 30 tRNA genes, and 136 repeats, whereas the mitochondrial genome has 65 genes, 12 tRNA genes, and 91 repeats. Thus, this study reports the draft nuclear and complete organelle genome sequences of P. roebelenii.


Keywords: Chloroplast genome, Genome assembly, Mitochondrial genome, Nuclear genome, Phoenix roebelenii, Pygmy date palm


Citation:

Chakravartty N, Neelapu NRR. The de novo genome assembly (nuclear, chloroplast and mitochondria) of ornamental plant pygmy date palm Phoenix roebelenii. J App Biol Biotech. 2023;11(3):113-122. https://doi.org/10.7324/JABB.2023.38646

Copyright: Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike license.


1. INTRODUCTION

Ornamental plant genomics is the field that investigates the ornamental value of plants by sequencing and resequencing their genomes [1]. Continuous advances in sequencing and bioinformatics technologies have driven the development of the field [2-4]. In 2012, Zhang et al. [4] pioneered genome sequencing of an ornamental plant, Prunus mume. In the past 10 years, the whole genomes of nearly 65 ornamental plants representing the families Asteraceae, Orchidaceae, and Rosaceae have been sequenced [1]. Whole-genome analysis of these species has helped to characterize ornamental traits such as abiotic stress resistance, disease resistance, dormancy, flower color formation, floral development, floral scent, plant architecture, and self-incompatibility [1].

Phoenix roebelenii (pygmy date palm) is a popular ornamental plant belonging to the family Arecaceae [5]. It is a miniature palm tree with a slim trunk, a crown of feathery leaves, small yellow flowers, and small edible black fruits [5]. Pygmy date palm is a date palm indigenous to southwestern China and northern Vietnam, and it grows well in warm, tropical, and subtropical gardens [5]. The Royal Horticultural Society gave P. roebelenii the Award of Garden Merit for its performance under UK growing conditions [6,7]. Pygmy date palm is resistant to pests, tolerant to soil variation, and moderately tolerant to drought. The NASA Clean Air Study established that this plant effectively removes the common domestic air pollutants benzene and formaldehyde [8]. Therefore, this study aims to report the draft nuclear and complete organelle genome sequences of P. roebelenii along with important ornamental traits.


2. MATERIALS AND METHODS

The raw genome sequence data (reads) of P. roebelenii were downloaded from NCBI (BioProject ID: PRJNA629103), evaluated, and checked. AdapterRemoval version 2.3.2 was employed to remove contaminating adapter sequences and low-quality bases (Q20 threshold) from the reads to provide high-quality clean data [9]. The de novo assembly of the nuclear genome was generated from the high-quality clean data. KmerGenie version 1.7051 [10] was employed to estimate the genome size, and MaSuRCA version 3.3.2 [11] generated the de novo assembly. BUSCO version 4.1.2 [12], with the plant dataset embryophyta_odb10 as the model, was used to check the completeness and quality of the de novo assembly before it was considered for downstream analysis. RepeatModeler version 2.0.1 [13] and RepeatMasker version 4.0.9 [14] were employed to identify repeats and mask the genome, respectively. AUGUSTUS version 3.3.2 [15] was employed to predict genes with Arabidopsis as the model. tRNAscan-SE version 2.0.6 [16] was utilized to identify tRNAs. The UniProt protein database [17] was used for homology-based functional annotation of the predicted genes, with the top hit of each homology search used to assign a function to the gene. Transcription factor analysis was performed using PlantTFDB version 4.0 [18]. Orthologous analysis of the predicted protein sequences of P. roebelenii was performed with OrthoFinder version 2.2.7 [19] against the protein sequences of six model species: Arabidopsis thaliana, Oryza sativa, Phoenix dactylifera, Sorghum bicolor, Triticum aestivum, and Zea mays. Simple sequence repeat (SSR) markers were identified using MISA version 2.1 [20], and primers were designed using Primer3 version 2.5.0 [21]. The mitochondrial and chloroplast assemblies were generated using GetOrganelle version 1.6.4 [22]. The chloroplast genome was annotated using CPGAVAS2 version 1 [23], and the mitochondrial genome was annotated using AGORA version 1 [24], with Y08501.2 as the model.


3. RESULTS AND DISCUSSION

3.1. Genome Assembly (Nuclear and Organelle) of Pygmy Date Palm

Whole-genome sequence data of 32.2 Gb were retrieved from NCBI BioProject PRJNA629103. There were 214,907,072 reads of 2 × 150 bp with a GC content of 42.43% and 93.885% of the data at ≥Q30. Adapter removal and quality trimming resulted in 214,767,420 reads with a GC content of 41.88%, of which 94.265% of bases were ≥Q30. The estimated genome size was 584,473,888 bp [Figure 1]. The de novo assembly generated 7019 scaffolds with an assembly size of 462,152,837 bp. The longest scaffold is 6,464,272 bp, and the shortest scaffold is 411 bp [Table 1]. The GC content and scaffold length distribution of the assembled genome were calculated and are shown in Figures 2 and 3, respectively. The GC content of the assembled genome is ~41.88%. BUSCO version 4.1.2 evaluated the assembled genome, and the assembly was 84.2% complete [Table 2].
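
The read and genome-size figures above can be related by simple arithmetic. The following is a minimal sketch, not part of the authors' pipeline, that derives the total sequenced bases, a nominal sequencing depth, and the fraction of the estimated genome captured by the assembly. It assumes that the reported read count refers to individual reads and that every raw read is the full 150 bp; the function and parameter names are illustrative.

```python
def sequencing_summary(read_count, read_length_bp, estimated_genome_bp, assembly_bp):
    """Return total sequenced bases, nominal depth, and assembled fraction."""
    # Assumption: all reads have the same nominal length (before trimming).
    total_bases = read_count * read_length_bp
    depth = total_bases / estimated_genome_bp
    assembled_fraction = assembly_bp / estimated_genome_bp
    return total_bases, depth, assembled_fraction


# Values taken from the text: raw read count, read length,
# KmerGenie genome-size estimate, and total scaffold length.
total_bases, depth, fraction = sequencing_summary(
    read_count=214_907_072,
    read_length_bp=150,
    estimated_genome_bp=584_473_888,
    assembly_bp=462_152_837,
)
print(f"{total_bases / 1e9:.1f} Gb sequenced, depth {depth:.1f}x, "
      f"assembly covers {fraction:.1%} of the estimated genome")
```

Under these assumptions, the total base count agrees with the 32.2 Gb of data reported for the BioProject.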

Figure 1: The K-mer histogram for estimation of genome size in pygmy date palm. K-mer sizes were evaluated using KmerGenie version 1.7051. The graph plots K-mer size on the X-axis against the number of genomic K-mers on the Y-axis; a K-mer size of 84 was selected for the genome assembly. The predicted genome size of pygmy date palm is 584,473,888 bp.




Table 1: The summary of genome assembly on pygmy date palm.

S. No. | Assembly statistic | Value
1. | Number of scaffolds | 7019
2. | Total size of scaffolds (bp) | 462152837
3. | Longest scaffold (bp) | 6464272
4. | Shortest scaffold (bp) | 411
5. | Number of scaffolds >1K nt | 7015
6. | Percentage of scaffolds >1K nt | 99.9
7. | Number of scaffolds >10K nt | 2706
8. | Percentage of scaffolds >10K nt | 38.6
9. | Number of scaffolds >100K nt | 803
10. | Percentage of scaffolds >100K nt | 11.4
11. | Number of scaffolds >1M nt | 83
12. | Percentage of scaffolds >1M nt | 1.2
13. | Mean scaffold size (bp) | 65843
14. | Median scaffold size (bp) | 5546
15. | N50 scaffold length (bp) | 569782
16. | L50 scaffold count | 181
17. | Scaffold %A | 27.87
18. | Scaffold %C | 18.09
19. | Scaffold %G | 18.07
20. | Scaffold %T | 27.83
21. | Scaffold %N | 8.14

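
The N50 and L50 values in Table 1 follow the standard definitions: with scaffolds sorted from longest to shortest, N50 is the length of the scaffold at which the running total first reaches half of the assembly size, and L50 is the number of scaffolds needed to get there. A minimal sketch of that calculation is given below; the function name and the toy scaffold lengths are illustrative and are not taken from the study.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first scaffold that pushes the running total to at least
        # half of the assembly defines N50; its rank is L50.
        if running >= half_total:
            return length, count
    raise ValueError("no scaffold lengths given")


# Toy example with hypothetical scaffold lengths (total = 300).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))
```

Applied to the real scaffold lengths of an assembly, the same function would reproduce the kind of N50 and L50 figures listed in Table 1.
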
Figure 2: The distribution of GC percentage in the assembled genome of pygmy date palm. The graph plots GC percentage range (GC%) on the X-axis against the number of scaffolds on the Y-axis. The GC percentage of the assembled genome is ~41.88%.



Figure 3: The scaffold length distribution of the assembled genome in pygmy date palm. The graph plots scaffold length range on the X-axis against the number of scaffolds on the Y-axis.




Table 2: The summary of BUSCO score parameters to evaluate the completeness of pygmy date palm

S. No. | BUSCO statistic | Count | Percentage
1. | Complete BUSCOs (C) | 1360 | 84.20%
2. | Complete and single-copy BUSCOs (S) | 1290 | 79.90%
3. | Complete and duplicated BUSCOs (D) | 70 | 4.30%
4. | Fragmented BUSCOs (F) | 91 | 5.60%
5. | Missing BUSCOs (M) | 163 | 10.20%
6. | Total BUSCO groups searched | 1614 | 100.00%
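
The BUSCO categories in Table 2 are related by two identities: complete BUSCOs are the sum of single-copy and duplicated BUSCOs, and complete, fragmented, and missing BUSCOs together make up all groups searched. The short sketch below, written for illustration only, recomputes the totals and percentages from the four counts in the table. Percentages recomputed this way may differ from the tabulated ones in the last digit because of rounding.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Derive BUSCO totals and percentages from the four category counts."""
    complete = single + duplicated           # C = S + D
    total = complete + fragmented + missing  # total = C + F + M
    percent = {
        name: 100.0 * count / total
        for name, count in [
            ("C", complete), ("S", single), ("D", duplicated),
            ("F", fragmented), ("M", missing),
        ]
    }
    return complete, total, percent


# Counts taken from Table 2.
complete, total, percent = busco_summary(
    single=1290, duplicated=70, fragmented=91, missing=163
)
print(complete, total)
for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
```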

The repeat analysis masked 162,247,645 bp, nearly 35.11% of the assembly. The repeat classification revealed 2.44% LINEs, 10.19% LTR elements, 1.66% DNA elements, and 19.70% unclassified repeats [Table 3]. The SSR analysis revealed 89,860 mononucleotide, 32,272 dinucleotide, 11,399 trinucleotide, 2545 tetranucleotide, 458 pentanucleotide, 193 hexanucleotide, and 18,504 complex-type repeats [Table 4]. Out of the simple repeats predicted, primers were designed successfully for 27,364 mononucleotide, 9901 dinucleotide, 3599 trinucleotide, 801 tetranucleotide, 146 pentanucleotide, 58 hexanucleotide, and 5742 complex-type repeats [Table 4]. A total of 480 tRNAs were identified in the assembly of pygmy date palm.

Table 3: The summary of repeats predicted in the genome of pygmy date palm

S. No. | Repeats | Subtype of repeats | Number of elements | Length occupied | Percentage of sequence
1. | LINEs | - | 28905 | 11272985 bp | 2.44%
- | - | LINE1 | 24362 | 9932191 bp | 2.15%
2. | LTR elements | - | 98074 | 47094466 bp | 10.19%
3. | ERV_classII | - | 2016 | 592501 bp | 0.13%
4. | DNA elements | - | 30020 | 7649566 bp | 1.66%
5. | Unclassified | - | 402134 | 91031581 bp | 19.70%
6. | Total interspersed repeats | - | - | 157048598 bp | 33.98%
7. | Small RNA | - | 738 | 103556 bp | 0.02%
8. | Satellites | - | 7455 | 978757 bp | 0.21%
9. | Simple repeats | - | 97657 | 3707134 bp | 0.80%
10. | Low complexity | - | 18778 | 960747 bp | 0.21%

Table 4: The summary of primers designed successfully for SSRs repeats in pygmy date palm.

S. No. | Type of simple repeat | Count of simple repeats | Count of primers designed
1. | p1 | 89860 | 21405
2. | p2 | 32272 | 6895
3. | p3 | 11399 | 3029
4. | p4 | 2545 | 605
5. | p5 | 458 | 107
6. | p6 | 193 | 45
7. | c | 17613 | 4151
8. | c* | 891 | 202
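
The p1 to p6 classes in Table 4 correspond to perfect repeats of motifs one to six nucleotides long. The sketch below illustrates how such perfect microsatellites can be located with regular expressions. It is not MISA itself, it does not detect the compound classes (c and c*), and the minimum repeat numbers are an assumption chosen for illustration because the thresholds used in the study are not stated.

```python
import re

# Assumed minimum number of repeats per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    for d in range(1, n):
        if n % d == 0 and motif == motif[:d] * (n // d):
            return False
    return True


def find_ssrs(seq):
    """Return (unit_length, motif, repeat_count, start) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # A motif of `unit` bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as "AA" that are really a shorter repeat.
            if is_primitive(motif):
                repeats = len(match.group(0)) // unit
                hits.append((unit, motif, repeats, match.start()))
    return sorted(hits, key=lambda hit: hit[3])


# Hypothetical test sequence containing one mono-, di-, and trinucleotide SSR.
example = "GG" + "A" * 12 + "CCT" + "AG" * 7 + "TTC" + "ACG" * 5 + "G"
for hit in find_ssrs(example):
    print(hit)
```

Counting the hits per unit length over all scaffolds would give per-class totals of the kind shown in the table, although the exact numbers depend on the thresholds chosen.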

The gene prediction revealed 42,388 genes; of these, 30,140 genes were annotated based on the best hit against the UniProt protein database, and 24,629 genes were annotated specifically with P. dactylifera genes available at NCBI [Table 5]. InterProScan annotated 33,645 genes. Transcription factors were identified for 1850 genes in the study; the largest group, 150 genes, belongs to the bHLH transcription factors, whereas the smallest group, a single gene, belongs to the HRT-like transcription factors. PMN analysis identified 8357 genes associated with metabolic pathways. KEGG analysis also identified the genes taking part in different pathways. The GO categorization identified 1397 genes related to biological processes [Figure 4], 393 genes associated with cellular components [Figure 5], and 1268 genes linked to molecular function [Figure 6]. The MapMan analysis identified 46.33% of genes as having a significant role in metabolic pathways [Figure 7]. The orthologous study considered 485,739 genes, of which 364,435 (75%) were present in orthogroups. The orthogroups in the model plants were compared to identify the orthogroups shared between the model species and P. roebelenii, as shown in Figure 8. A linear dendrogram was generated using the maximum likelihood method [19,25] and viewed in FigTree [26] to understand the phylogenetic relationship between the model species and P. roebelenii, as shown in Figure 9. The number of orthogroups identified in this study is 22,689 [Table 6].

Table 5: The summary of gene prediction and annotation on pygmy date palm

S. No. | Annotation | Count
1. | Number of CDS predicted | 42388
2. | Number of CDS annotated with UniProt protein database | 30140
3. | Number of CDS annotated with Phoenix dactylifera genes | 24629
4. | Number of CDS annotated with InterPro | 33645
5. | Number of CDS annotated with Plant Metabolic Network | 8358
6. | Number of CDS annotated with transcription factors | 1849

Figure 4: The gene ontologies related to biological process observed in pygmy date palm. The graph plots gene ontologies of biological process on the X-axis against the number of genes on the Y-axis.



Figure 5: The gene ontologies related to cellular component observed in pygmy date palm. The graph plots gene ontologies of cellular components on the X-axis against the number of genes on the Y-axis.



Figure 6: The gene ontologies related to molecular function observed in pygmy date palm. The graph plots gene ontologies of molecular functions on the X-axis against the number of genes on the Y-axis.



Figure 7: The pathway summary predicted with MapMan. The figure shows thirty-four metabolic categories along with the percentage of genes participating in each. MapMan analysis identified 46.33% of genes as having a significant role in the metabolic pathways of pygmy date palm.



Figure 8: The summary of genes in orthogroups between the model species and Phoenix roebelenii. The figure summarizes genes in orthogroups as revealed by orthologous analysis of Arabidopsis thaliana, Oryza sativa, Phoenix dactylifera, Sorghum bicolor, Triticum aestivum, Zea mays, and Phoenix roebelenii.



Figure 9: The linear tree generated for the model species and Phoenix roebelenii. The figure shows the linear tree based on orthology between Arabidopsis thaliana, Oryza sativa, Phoenix dactylifera, Sorghum bicolor, Triticum aestivum, Zea mays, and Phoenix roebelenii. The orthology data were generated using OrthoFinder version 2.3.11 for the above species, used to construct a linear tree by the maximum likelihood method, and viewed in FigTree version 1.4.4.




Table 6: The summary of orthogroups and genes in pygmy date palm as revealed by the orthologous analysis

S. No. | Summary of orthogroups and genes | Count
1. | Number of genes | 485739
2. | Number of genes in orthogroups | 364435
3. | Number of unassigned genes | 121304
4. | Percentage of genes in orthogroups | 75
5. | Percentage of unassigned genes | 25
6. | Number of orthogroups | 22689
7. | Number of species-specific orthogroups | 749
8. | Number of genes in species-specific orthogroups | 4630
9. | Percentage of genes in species-specific orthogroups | 1
10. | Mean orthogroup size | 16.1
11. | Median orthogroup size | 12
12. | G50 (assigned genes) | 25
13. | G50 (all genes) | 18
14. | O50 (assigned genes) | 4284
15. | O50 (all genes) | 7169
16. | Number of orthogroups with all species present | 9446
17. | Number of single-copy orthogroups | 8
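
Several rows of Table 6 are derived from three primary counts: the total number of genes, the number of genes placed in orthogroups, and the number of orthogroups. The following sketch, given for illustration with hypothetical function and key names, shows how the unassigned gene count, the assigned and unassigned percentages, and the mean orthogroup size follow from those three values.

```python
def orthogroup_summary(total_genes, genes_in_orthogroups, orthogroup_count):
    """Derive summary statistics from the three primary orthogroup counts."""
    unassigned = total_genes - genes_in_orthogroups
    return {
        "unassigned_genes": unassigned,
        "percent_assigned": 100.0 * genes_in_orthogroups / total_genes,
        "percent_unassigned": 100.0 * unassigned / total_genes,
        # Mean size is taken over genes that were assigned to an orthogroup.
        "mean_orthogroup_size": genes_in_orthogroups / orthogroup_count,
    }


# Primary counts taken from Table 6.
summary = orthogroup_summary(
    total_genes=485_739,
    genes_in_orthogroups=364_435,
    orthogroup_count=22_689,
)
for key, value in summary.items():
    print(key, round(value, 1))
```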

A complete circular chloroplast genome of 125,222 bp was generated and annotated without any gaps [Figure 10]. The repeat analysis revealed 91 tandem repeats and 45 simple repeats. Out of the 45 simple repeats, 36 were mononucleotide repeats, three were dinucleotide repeats, and six were complex-type repeats. A total of 108 genes with 79 coding sequences, 26 tRNA genes, and four unique rRNA sequences were detected in the chloroplast genome of pygmy date palm.

Figure 10: The chloroplast genome and annotation of pygmy date palm. The figure shows the circular chloroplast genome of 125,222 bp generated without any gaps using GetOrganelle version 1.6.4, together with the genome annotation predicted using CPGAVAS2 version 1.




A complete circular mitochondrial genome of 482,735 bp was generated and annotated without any gaps [Figure 11]. The complete mitochondrial genome of A. thaliana ecotype Col-0 (NC_037304.1) was used as the reference for mitochondrial genome annotation. The repeat analysis revealed 29 tandem repeats and 61 simple repeats. Out of the 61 simple repeats, 44 were mononucleotide repeats, eight were dinucleotide repeats, seven were trinucleotide repeats, and two were complex-type repeats. A total of 65 genes, six tRNA genes, and six unique rRNA sequences were recognized in the mitochondrial genome of pygmy date palm.

Figure 11: The mitochondrial genome and annotation of pygmy date palm. The figure shows the circular mitochondrial genome of 482,735 bp generated without any gaps using GetOrganelle version 1.6.4, together with the genome annotation predicted using AGORA version 1.




3.2. The Genes Associated with Important Ornamental Traits in Pygmy Date Palm

Genes associated with important ornamental traits, namely fruit development and ripening, floral development, anthocyanin synthesis, floral scent biosynthesis, plant architecture, dormancy release, self-incompatibility, disease resistance, and drought, were identified. Carbohydrate metabolism is the most highly represented pathway in fruit development and ripening, followed by energy metabolism and then the metabolism of other amino acids. The numbers of genes identified in sugar, energy, and pyruvate metabolism are 322, 9, and 11, respectively. The numbers of genes identified in association with floral development, anthocyanin synthesis, floral scent biosynthesis, plant architecture, dormancy release, self-incompatibility, disease resistance, and drought are 2101, 9, 6, 58, 207, 49, 1948, and 1425, respectively. The criteria used to identify genes related to ornamental traits were a homology identity cutoff of 80% and a query coverage cutoff of 80%. The summary of gene counts and the genes associated with ornamental traits are presented in Table 7 and Supplementary Tables 1-11, respectively.

Table 7: The summary of gene count associated with ornamental traits in pygmy date palm

S. No. | Trait | Gene count
1. | Sugar metabolism | 322
2. | Energy metabolism | 9
3. | Pyruvate metabolism | 11
4. | Floral development | 2101
5. | Anthocyanin synthesis | 9
6. | Floral scent biosynthesis | 6
7. | Plant architecture | 58
8. | Dormancy release | 207
9. | Self-incompatibility | 49
10. | Disease resistance | 1948
11. | Drought | 1425
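
The trait-associated genes counted in Table 7 were selected with an 80% identity cutoff and an 80% query coverage cutoff. The sketch below shows one way such a filter could be expressed over homology search hits. The `Hit` record, its field names, and the inclusive treatment of the cutoffs are assumptions made for illustration; the study does not describe the search tool or the exact filtering code.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One homology search hit (hypothetical record layout)."""
    query_id: str
    subject_id: str
    percent_identity: float
    query_start: int   # 1-based, inclusive
    query_end: int     # 1-based, inclusive
    query_length: int


def query_coverage(hit):
    """Percentage of the query sequence spanned by the alignment."""
    aligned = abs(hit.query_end - hit.query_start) + 1
    return 100.0 * aligned / hit.query_length


def passes_cutoffs(hit, min_identity=80.0, min_coverage=80.0):
    """Assumption: both cutoffs are inclusive (>=)."""
    return (hit.percent_identity >= min_identity
            and query_coverage(hit) >= min_coverage)


# Hypothetical hits: only the first meets both cutoffs.
hits = [
    Hit("gene1", "ref_a", 85.0, 1, 90, 100),   # identity 85%, coverage 90%
    Hit("gene2", "ref_b", 92.0, 1, 50, 100),   # coverage too low
    Hit("gene3", "ref_c", 75.0, 1, 100, 100),  # identity too low
]
kept = [hit.query_id for hit in hits if passes_cutoffs(hit)]
print(kept)
```

Requiring both cutoffs together keeps short, high-identity fragments and long, weakly similar alignments out of the trait gene lists.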

4. CONCLUSION

This study presents the first draft genome assembly of P. roebelenii. The de novo assembly was constructed from the publicly available sequence data, so a scaffold-level assembly with gaps was generated. Nevertheless, we offer this draft assembly and its annotation as a valuable resource for the scientific community to use in future research.


5. ACKNOWLEDGMENTS

NC and NNNR are grateful to the management of GITAM (Deemed to be University) for providing the necessary facilities to carry out the research work and extending constant support. The authors are grateful to Dr. Stacy Pirro, Iridian Genomes Inc, USA, for allowing us to use their public data sets for whole-genome or organelle assembly and/or annotation; and submit genome assemblies to NCBI GenBank as TPA.


6. AUTHORS’ CONTRIBUTIONS

All authors made substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; took part in drafting the article or revising it critically for important intellectual content; agreed to submit to the current journal; gave final approval of the version to be published; and agreed to be accountable for all aspects of the work. All the authors are eligible to be an author as per the International Committee of Medical Journal Editors (ICMJE) requirements/guidelines.


7. FUNDING

There is no funding to report.


8. CONFLICTS OF INTEREST

The authors report no financial or any other conflicts of interest in this work.


9. ETHICAL APPROVALS

This study does not involve experiments on animals or human subjects.


10. DATA AVAILABILITY

The assembled nuclear genome, mitochondrial genome, and chloroplast genome of pygmy date palm were deposited in NCBI GenBank as TPA (Third Party Annotation) submissions with the accession numbers DXKA00000000, BK059358, and BK059355, respectively. The supplementary files generated in this study and the annotations of both the nuclear and organelle genomes were deposited in Harvard Dataverse (https://dataverse.harvard.edu/privateurl.xhtml?token=0d4a65ab-a3e1-4e0c-9cd0-b89210e14657).


11. PUBLISHER’S NOTE

This journal remains neutral with regard to jurisdictional claims in published institutional affiliation.

SUPPLEMENTARY

Supplementary Tables 1-11

https://drive.google.com/file/d/1sIsk1o_Rd4q2Lhxvi3AgTuZBri8cHyDm/view?usp=share_link

REFERENCES

1. Zheng T, Li P, Li L, Zhang Q. Research advances in and prospects of ornamental plant genomics. Hortic Res 2021;8:e65. https://doi.org/10.1038/s41438-021-00499-x

2. Neelapu NR, Surekha C. Next-generation sequencing and metagenomics. In: Wong KC, editor. Computational Biology and Bioinformatics: Gene Regulation. Boca Raton: CRC Press; 2016. p. 331-51.

3. Yadav V, Lekkala MM, Surekha C, Neelapu NR. Global scenario of advance fungal research in crop protection. In: Yadav A, Mishra S, Kour D, Yadav N, Kumar A, editors. Agriculturally Important Fungi for Sustainable Agriculture. Cham: Springer; 2020. p. 313-46. https://doi.org/10.1007/978-3-030-48474-3_11

4. Zhang Q, Chen W, Sun L, Zhao F, Huang B, Yang W, et al. The genome of Prunus mume. Nat Commun 2012;3:1318. https://doi.org/10.1038/ncomms2290

5. United States Department of Agriculture (USDA), Agricultural Research Service, Maryland: National Plant Germplasm System, Germplasm Resources Information Network (GRIN Taxonomy), National Germplasm Resources Laboratory; 2022. Available from: https://www.npgsweb.ars-grin.gov/gringlobal/taxon/taxonomydetail?id=28056 [Last accessed on 2022 Feb 04].

6. The Royal Horticultural Society. Phoenix roebelenii. London: The Royal Horticultural Society; 2022. Available from: https://www.rhs.org.uk/plants/12771/phoenix-roebelenii/details [Last accessed on 2022 Feb 04].

7. Royal Horticultural Society's (RHS) Ornamental. Award of Garden Merit (AGM) Plants 2021. London: Royal Horticultural Society's (RHS) Ornamental; 2022. Available from: https://www.rhs.org.uk/plants/pdfs/agm-lists/agm-ornamentals.pdf [Last accessed on 2022 Feb 04].

8. Wolverton BC, Wolverton JD. Plants and soil microorganisms: Removal of formaldehyde, xylene, and ammonia from the indoor environment. J Miss Acad Sci 1993;38:11-5.

9. Schubert M, Lindgreen S, Orlando L. AdapterRemoval v2: Rapid adapter trimming, identification, and read merging. BMC Res Notes 2016;9:88. https://doi.org/10.1186/s13104-016-1900-2

10. Chikhi R, Medvedev P. Informed and automated k-mer size selection for genome assembly. Bioinformatics 2014;30:31-7. https://doi.org/10.1093/bioinformatics/btt310

11. Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics 2013;29:2669-77. https://doi.org/10.1093/bioinformatics/btt476

12. Seppey M, Manni M, Zdobnov EM. BUSCO: Assessing genome assembly and annotation completeness. Methods Mol Biol 2019;1962:227-45. https://doi.org/10.1007/978-1-4939-9173-0_14

13. Institute for Systems Biology. RepeatModeler. Seattle: Institute for Systems Biology; 2021. Available from: https://www.repeatmasker.org.repeatmodeler [Last accessed on 2022 Feb 04].

14. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics 2009; Chapter 4:Unit 4.10. https://doi.org/10.1002/0471250953.bi0410s25

15. Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 2008;24:637-44. https://doi.org/10.1093/bioinformatics/btn013

16. Chan PP, Lin BY, Mak AJ, Lowe TM. tRNAscan-SE 2.0: Improved detection and functional classification of transfer RNA genes. Nucleic Acids Res 2021;49:9077-96. https://doi.org/10.1093/nar/gkab688

17. UniProt Consortium. UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Res 2021;49(D1):D480-9.

18. Jin J, Tian F, Yang DC, Meng YQ, Kong L, Luo J, et al. PlantTFDB 4.0: Toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res 2017;45(D1):D1040-5. https://doi.org/10.1093/nar/gkw982

19. Emms DM, Kelly S. OrthoFinder: Phylogenetic orthology inference for comparative genomics. Genome Biol 2019;20:238. https://doi.org/10.1186/s13059-019-1832-y

20. Beier S, Thiel T, Münch T, Scholz U, Mascher M. MISA-web: A web server for microsatellite prediction. Bioinformatics 2017;33:2583-5. https://doi.org/10.1093/bioinformatics/btx198

21. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, et al. Primer3--new capabilities and interfaces. Nucleic Acids Res 2012;40:e115. https://doi.org/10.1093/nar/gks596

22. Jin JJ, Yu WB, Yang JB, Song Y, dePamphilis CW, Yi TS, et al. GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol 2020;21:241. https://doi.org/10.1186/s13059-020-02154-5

23. Shi L, Chen H, Jiang M, Wang L, Wu X, Huang L, et al. CPGAVAS2, an integrated plastome sequence annotator and analyzer. Nucleic Acids Res 2019;47(W1):W65-73. https://doi.org/10.1093/nar/gkz345

24. Jung J, Kim JI, Jeong YS, Yi G. AGORA: Organellar genome annotation from the amino acid and nucleotide references. Bioinformatics 2018;34:2661-3. https://doi.org/10.1093/bioinformatics/bty196

25. Challa S, Neelapu NR. Phylogenetic trees: Applications, construction, and assessment. In: Hakeem K, Shaik N, Banaganapalli B, Elango R, editors. Essentials of Bioinformatics. Vol. 03. Cham: Springer; 2019. p. 167-92. https://doi.org/10.1007/978-3-030-19318-8_10

26. FigTree version 1.4.4. Edinburgh: Produce High-quality Figures of Phylogenetic Trees; 2022. Available from: https://www.tree.bio.ed.ac.uk/software/figtree [Last accessed on 2022 Aug 27].
