• Delivers high-accuracy, high-contiguity, and high-completeness telomere-to-telomere genome assemblies.
• Overcomes assembly challenges in centromeric and highly repetitive regions.
• Analyzes structural variations in complex regions such as centromeres and telomeres.
• Explores chromosome origin and domestication, and identifies key sex-determining genes.
• Dedicated ultra-long sequencing team covering every step from DNA extraction to sequencing, with successful experience across multiple species.
• Access to both PacBio and Nanopore long-read platforms with high throughput and flexible sequencing strategies.
• Experienced team in genome assembly and customized bioinformatics analysis, proficient in T2T genome projects.
• More than 200 successful genome projects, with resulting publications totaling a cumulative impact factor of over 2,000.
• Integrated experimental and bioinformatic solutions supported by copyrights and patents.
| Genome survey | Genome assembly | Chromosome-level | Gap filling | Genome annotation |
|---|---|---|---|---|
| 50X Illumina NovaSeq PE150 | 30X PacBio CCS HiFi reads | 100X Hi-C | 40-100X ONT Ultra long reads | RNA-seq Illumina PE150 10 Gb + (optional) Full length RNA-seq PacBio 40 Gb or Nanopore 12 Gb |
For genome survey, PacBio CCS, Hi-C, and transcriptome (annotation) sequencing samples, please refer to the “chromosome-level genome assembly sample requirements”.
For ONT ultra-long sequencing, tissue samples are recommended and must meet higher quality standards to support the extraction of ultra-high-molecular-weight (ultra-HMW) DNA.
For detailed sample preparation instructions and requirements, please contact our sales team for a customized solution based on the species.
Main analyses include:
1) T2T Genome Assembly
● A T2T genome is a gap-free (“0 gaps”) genome assembly in which at least one chromosome is assembled completely from telomere to telomere.
● Using high-accuracy CCS reads and ONT ultra-long reads:
* Generate contig v1 genome via hybrid assembly using hifiasm (v0.25.0).
* Remove plastid and contaminant sequences by BLAST search against the NT database.
* Scaffold contigs into chromosome-scale assembly using Hi-C data with 3D-DNA.
* Fill missing telomeres through local assembly with ONT reads to obtain the final T2T genome.
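As a rough illustration of the first step above, the sketch below builds a hifiasm command line for a hybrid HiFi + ultra-long assembly and converts GFA contig records to FASTA. The file names, thread count, and the helper names `hifiasm_command` and `gfa_to_fasta` are hypothetical. The `-o`, `-t`, and `--ul` options follow the hifiasm documentation, but the parameters actually used in this pipeline are not stated here, so treat this as an assumption rather than the production command.

```python
from typing import Iterable, Iterator, List


def hifiasm_command(hifi_reads: str, ul_reads: str, prefix: str, threads: int = 32) -> List[str]:
    """Build (but do not run) a hifiasm command for HiFi + ONT ultra-long assembly."""
    return ["hifiasm", "-o", prefix, "-t", str(threads), "--ul", ul_reads, hifi_reads]


def gfa_to_fasta(gfa_lines: Iterable[str]) -> Iterator[str]:
    """Yield one FASTA record for every segment ('S') line of a GFA file."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[0] == "S":
            name, sequence = fields[1], fields[2]
            yield f">{name}\n{sequence}\n"


# Toy example: made-up file names and a two-contig GFA.
cmd = hifiasm_command("hifi.fq.gz", "ont_ul.fq.gz", "asm")
print(" ".join(cmd))

toy_gfa = [
    "S\tptg000001l\tACGTACGT\tLN:i:8\n",
    "A\tptg000001l\t0\t+\tread1\t0\t8\n",  # non-segment lines are skipped
    "S\tptg000002l\tTTGCA\tLN:i:5\n",
]
fasta = "".join(gfa_to_fasta(toy_gfa))
print(fasta, end="")
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` once the inputs exist.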
2) Assembly Evaluation
● BUSCO Evaluation
BUSCO v5.2.1 (Benchmarking Universal Single-Copy Orthologs) provides single-copy ortholog gene sets for major evolutionary lineages, built from the OrthoDB 10 database. The assembled genome is searched against the gene set for the relevant lineage, and completeness is assessed from the proportion of genes recovered and how completely each one is matched.
A higher proportion of “Complete BUSCOs” indicates higher genome assembly completeness.
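To make the “Complete BUSCOs” figure concrete, here is a minimal sketch that parses the one-line score string BUSCO writes in its short summary. The function name `parse_busco_summary` and the example percentages are made up for illustration and are not results from any project described here.

```python
import re
from typing import Dict, Union

# Matches the BUSCO one-line score, e.g. C:98.6%[S:97.2%,D:1.4%],F:0.6%,M:0.8%,n:1614
BUSCO_LINE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line: str) -> Dict[str, Union[float, int]]:
    """Return complete (C), single (S), duplicated (D), fragmented (F),
    and missing (M) percentages plus the number of BUSCO groups (n)."""
    match = BUSCO_LINE.search(line)
    if match is None:
        raise ValueError("no BUSCO score string found")
    scores: Dict[str, Union[float, int]] = {
        key: float(match.group(key)) for key in ("C", "S", "D", "F", "M")
    }
    scores["n"] = int(match.group("n"))
    return scores


# Illustrative values only.
example = "C:98.6%[S:97.2%,D:1.4%],F:0.6%,M:0.8%,n:1614"
scores = parse_busco_summary(example)
print(f"Complete BUSCOs: {scores['C']}% of {scores['n']} groups")
```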
● Reads Mapping
Align short reads from next-generation sequencing (e.g., Illumina) to the assembled genome using bwa. Align third-generation long reads to the assembled genome using Minimap2.
The completeness of the assembled genome and uniformity of sequencing coverage are evaluated based on mapping rate, genome coverage ratio, and depth distribution.
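The three metrics named above reduce to simple ratios. The sketch below assumes read counts and per-base depth values have already been extracted from the alignments; how they are extracted is outside this example. The function names and toy numbers are hypothetical.

```python
from typing import Iterable, Tuple


def mapping_rate(mapped_reads: int, total_reads: int) -> float:
    """Percentage of reads that aligned to the assembly."""
    return 100.0 * mapped_reads / total_reads


def coverage_summary(depths: Iterable[int], genome_length: int) -> Tuple[float, float]:
    """Return (genome coverage ratio in %, mean depth).

    `depths` holds one depth value per reported position; positions that
    are not reported are treated as depth 0.
    """
    covered = 0
    total_depth = 0
    for depth in depths:
        if depth > 0:
            covered += 1
        total_depth += depth
    return 100.0 * covered / genome_length, total_depth / genome_length


# Toy data: 1,000 reads and a 5 bp "genome".
rate = mapping_rate(mapped_reads=995, total_reads=1000)
coverage_pct, mean_depth = coverage_summary([10, 12, 0, 8, 10], genome_length=5)
print(rate, coverage_pct, mean_depth)
```

A depth distribution that is tight around the mean suggests uniform coverage, while long tails can point to collapsed repeats or uncovered regions.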
● Genome QC Evaluation
Evaluate the assembly with Merqury, which compares k-mers from high-accuracy sequencing reads with k-mers in the genome assembly to estimate the consensus quality value (QV).
Higher quality values indicate higher accuracy of the assembled genome.
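For intuition about how a k-mer comparison becomes a QV, the sketch below follows the definition given in the Merqury publication: the fraction of assembly k-mers also found in the reads is converted to a per-base accuracy, and the resulting error rate is expressed on a Phred scale. The function name `kmer_qv` and the k-mer counts are hypothetical, and this is not Merqury's own code.

```python
import math
from typing import Tuple


def kmer_qv(asm_only_kmers: int, total_asm_kmers: int, k: int) -> Tuple[float, float]:
    """Return (QV, per-base error rate) from k-mer counts.

    asm_only_kmers: assembly k-mers absent from the read set (likely errors)
    total_asm_kmers: all k-mers in the assembly
    """
    if asm_only_kmers == 0:
        return math.inf, 0.0  # no unsupported k-mers observed
    shared_fraction = 1.0 - asm_only_kmers / total_asm_kmers
    per_base_accuracy = shared_fraction ** (1.0 / k)
    error_rate = 1.0 - per_base_accuracy
    return -10.0 * math.log10(error_rate), error_rate


# Made-up counts for a ~350 Mb assembly with k = 21.
qv, error_rate = kmer_qv(asm_only_kmers=2_100, total_asm_kmers=350_000_000, k=21)
print(f"QV = {qv:.2f}, error rate = {error_rate:.2e}")
```

Because the scale is logarithmic, each 10-point gain in QV corresponds to a tenfold lower estimated error rate.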
● Genome LAI Evaluation
LAI (LTR Assembly Index) assesses genome assembly integrity as the ratio of intact LTR retrotransposon sequences to total LTR sequences. Candidate LTR-RT sequences are identified using LTR_FINDER (v1.0.7) and LTRharvest (v1.5.9), then filtered and integrated using LTR_retriever (v2.8) to obtain high-confidence LTR retrotransposons and calculate LAI.
According to the LAI developers’ publication, LAI values are classified into three levels:
Draft (0 ≤ LAI < 10), Reference (10 ≤ LAI < 20), and Gold (LAI ≥ 20).
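The three levels map directly onto a small lookup. The sketch below is a minimal illustration using the thresholds listed above; the function name `classify_lai` is hypothetical.

```python
def classify_lai(lai: float) -> str:
    """Map an LAI value to the Draft / Reference / Gold level."""
    if lai < 0:
        raise ValueError("LAI must be non-negative")
    if lai < 10:
        return "Draft"
    if lai < 20:
        return "Reference"
    return "Gold"


# The whole-genome LAI of 15.18 in the demo results falls in the Reference level.
level = classify_lai(15.18)
print(level)
```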
● Identification of Telomeres and Centromeres
Identify potential telomere repeat units in the genome using TIDK. Detect telomere sequences and obtain positional information using FindTelomeres based on repeat motifs.
Identify potential centromeric repeats using Centromics with third‑generation long reads, then remap to the genome to obtain centromere positions and sequences.
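The idea behind motif-based telomere detection can be sketched with a simple exact-match search near each chromosome end. This is a simplified stand-in, not a reimplementation of TIDK or FindTelomeres. The default motif `TTTAGGG`, the window size, the minimum copy number, and the function names are all assumptions chosen for illustration.

```python
import re
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_terminal_telomere(
    sequence: str,
    motif: str = "TTTAGGG",   # assumed repeat unit; species-specific in practice
    end: str = "upstream",    # "upstream" or "downstream"
    window: int = 5000,       # how far from the end to search
    min_copies: int = 3,      # minimum tandem copies to call a telomere
) -> Optional[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) of the longest tandem motif run
    within `window` bp of the chosen chromosome end, or None if absent."""
    seq = sequence.upper()
    unit = f"(?:{motif}|{reverse_complement(motif)})"
    pattern = re.compile(f"{unit}{{{min_copies},}}")
    offset = 0 if end == "upstream" else max(0, len(seq) - window)
    region = seq[offset:offset + window]
    runs = [(m.start(), m.end()) for m in pattern.finditer(region)]
    if not runs:
        return None
    start, stop = max(runs, key=lambda run: run[1] - run[0])
    return offset + start + 1, offset + stop


# Toy chromosome: 2 bp, 4 reverse-strand repeats, 50 bp filler, 5 forward repeats.
toy = "AC" + "CCCTAAA" * 4 + "G" * 50 + "TTTAGGG" * 5
upstream = find_terminal_telomere(toy, end="upstream", window=60)
downstream = find_terminal_telomere(toy, end="downstream", window=60)
print(upstream, downstream)
```

Real telomeric arrays contain sequencing errors and variant repeats, so dedicated tools tolerate mismatches rather than requiring exact copies as this sketch does.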
1) Genome Chromosome Map
2) Telomere Positions in the Genome
| Chr | Chr_Length (bp) | Upstream_Start (bp) | Upstream_End (bp) | Upstream_Length (bp) | Downstream_Start (bp) | Downstream_End (bp) | Downstream_Length (bp) |
|---|---|---|---|---|---|---|---|
| Chr01 | 55,340,768 | 53 | 2,036 | 1,984 | 55,338,794 | 55,340,768 | 1,975 |
| Chr02 | 56,588,289 | 1 | 2,760 | 2,760 | 56,584,191 | 56,588,289 | 4,099 |
| Chr03 | 46,886,733 | 20 | 3,001 | 2,982 | 46,881,994 | 46,886,733 | 4,740 |
| Chr04 | 49,401,798 | 1 | 2,143 | 2,143 | 49,399,160 | 49,401,798 | 2,639 |
| Chr05 | 45,855,317 | 10 | 3,043 | 3,034 | 45,852,809 | 45,855,317 | 2,509 |
| Chr06 | 45,285,625 | 1 | 3,268 | 3,268 | 45,283,427 | 45,285,625 | 2,199 |
| Chr07 | 48,122,726 | 1 | 2,317 | 2,317 | 48,120,519 | 48,122,726 | 2,208 |
Note:
Chr: Chromosome ID
Chr_Length (bp): Chromosome length
Upstream_Start (bp): Start position of the upstream telomere on the chromosome
Upstream_End (bp): End position of the upstream telomere on the chromosome
Upstream_Length (bp): Length of the upstream telomere on the chromosome
Downstream_Start (bp): Start position of the downstream telomere on the chromosome
Downstream_End (bp): End position of the downstream telomere on the chromosome
Downstream_Length (bp): Length of the downstream telomere on the chromosome
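The length columns are consistent with 1-based inclusive coordinates, so that length = end − start + 1. The short check below recomputes the lengths for the first three chromosomes in the table; the variable and function names are hypothetical.

```python
# (chr_length, upstream_start, upstream_end, downstream_start, downstream_end)
telomeres = {
    "Chr01": (55_340_768, 53, 2_036, 55_338_794, 55_340_768),
    "Chr02": (56_588_289, 1, 2_760, 56_584_191, 56_588_289),
    "Chr03": (46_886_733, 20, 3_001, 46_881_994, 46_886_733),
}


def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


lengths = {}
for chrom, (chr_len, up_s, up_e, down_s, down_e) in telomeres.items():
    lengths[chrom] = (interval_length(up_s, up_e), interval_length(down_s, down_e))
    # A downstream telomere that reaches the chromosome end stops at chr_len.
    reaches_end = down_e == chr_len
    print(chrom, lengths[chrom], "reaches chromosome end:", reaches_end)
```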
3) Centromere Positions in the Genome
| Chr | Chr_Length (bp) | Centromics_Start (bp) | Centromics_End (bp) |
|---|---|---|---|
| Chr01 | 55,340,768 | 18,943,204 | 23,005,555 |
| Chr02 | 56,588,289 | 28,114,720 | 30,677,916 |
| Chr03 | 46,886,733 | 24,487,558 | 24,929,326 |
| Chr04 | 49,401,798 | 20,976,875 | 22,563,388 |
| Chr05 | 45,855,317 | 18,578,095 | 19,715,924 |
| Chr06 | 45,285,625 | 19,398,436 | 19,950,173 |
| Chr07 | 48,122,726 | 26,390,720 | 27,913,284 |
Note:
Chr: Chromosome ID
Chr_Length (bp): Chromosome length
Centromics_Start (bp): Start position of the centromere on the chromosome, as identified by Centromics
Centromics_End (bp): End position of the centromere on the chromosome, as identified by Centromics
4) Gap Statistics of Assembly Results
| Group | Gap_Number | Len (bp) |
|---|---|---|
| Chr01 | 0 | 55,340,768 |
| Chr02 | 0 | 56,588,289 |
| Chr03 | 0 | 46,886,733 |
| Chr04 | 0 | 49,401,798 |
| Chr05 | 0 | 45,855,317 |
| Chr06 | 0 | 45,285,625 |
| Chr07 | 0 | 48,122,726 |
| Total (Ratio %) | 0 | 347,481,256 (100.00) |
Note:
Group: Chromosome ID
Gap_Number: Number of gaps on the chromosome
Len (bp): Chromosome length
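A gap count like the one above can be reproduced by counting runs of `N` in each chromosome sequence. The sketch below assumes gaps are represented as runs of `N` (upper or lower case), which is a common FASTA convention but is not stated explicitly here. The function name `gap_stats` and the toy sequences are hypothetical.

```python
import re
from typing import Tuple


def gap_stats(sequence: str) -> Tuple[int, int]:
    """Return (number of gaps, total gap length in bp), where each
    uninterrupted run of N/n counts as one gap."""
    runs = re.findall(r"[Nn]+", sequence)
    return len(runs), sum(len(run) for run in runs)


# Toy sequences: one gapped scaffold and one gap-free chromosome.
gapped = gap_stats("ACGTNNNNACGTnnACGT")
gap_free = gap_stats("ACGTACGTACGT")
print(gapped, gap_free)
```

A T2T chromosome should report zero gaps under this definition.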
5) Genome LAI Evaluation
| Chr | Chr Length (bp) | Intact | Total | raw_LAI | LAI |
|---|---|---|---|---|---|
| whole_genome | 347,481,256 | 0.046 | 0.36 | 12.94 | 15.18 |
Note: According to the publication by the LAI developers, LAI values are classified into three categories: Draft (0 ≤ LAI < 10), Reference (10 ≤ LAI < 20), and Gold (LAI ≥ 20).
Chr Length (bp): Chromosome length
Intact: Proportion of intact LTR-RTs in the genome
Total: Proportion of total LTRs in the genome
raw_LAI = Intact / Total × 100
LAI: Corrected LAI value
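The raw_LAI formula can be checked with a few lines of arithmetic. Using the rounded proportions shown in the table gives roughly 12.78 rather than the reported 12.94. The difference is presumably because the table displays rounded Intact and Total values while raw_LAI was computed from unrounded ones. The correction that turns raw_LAI into the final LAI is not described here and is not reproduced. The function name `raw_lai` is hypothetical.

```python
def raw_lai(intact_fraction: float, total_fraction: float) -> float:
    """raw_LAI = Intact / Total x 100."""
    if total_fraction <= 0:
        raise ValueError("total LTR fraction must be positive")
    return intact_fraction / total_fraction * 100


# Rounded whole-genome values from the table.
value = raw_lai(intact_fraction=0.046, total_fraction=0.36)
print(f"raw_LAI from rounded inputs: {value:.2f}")
```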
Explore the advancements facilitated by BMKGene’s de novo genome assembly services through a curated collection of publications:
T2T Genome
Liu, Shoucheng et al. “A telomere-to-telomere genome assembly coupled with multi-omic data provides insights into the evolution of hexaploid bread wheat.” Nature genetics vol. 57,4 (2025): 1008-1020. doi:10.1038/s41588-025-02137-x
Yao, Xue-Feng et al. “Complete genome assembly of japonica rice variety Zhonghua 11.” Plant communications vol. 6,10 (2025): 101463. doi:10.1016/j.xplc.2025.101463
Lv, Zhiyuan et al. “Near telomere-to-telomere genome assembly of Camellia pitardii.” Scientific data vol. 12,1 1422. 14 Aug. 2025, doi:10.1038/s41597-025-05764-5
Du, Haiyuan et al. “A near-complete genome assembly of Fragaria iinumae.” BMC genomics vol. 26,1 253. 14 Mar. 2025, doi:10.1186/s12864-025-11440-0
Chen, Weikai et al. “The complete genome assembly of Nicotiana benthamiana reveals the genetic and epigenetic landscape of centromeres.” Nature plants vol. 10,12 (2024): 1928-1943. doi:10.1038/s41477-024-01849-y
Haplotype-resolved T2T Genome
Khan, Falak Sher et al. “Haplotype-resolved T2T gap-free genomes of the winegrape cultivar Cabernet Sauvignon.” Scientific data, 26 Feb. 2026, doi:10.1038/s41597-026-06910-3
T2T Genome + Comparative Genome
Hong, Lin et al. “Construction and analysis of telomere-to-telomere genomes for 2 sweet oranges: Longhuihong and Newhall (Citrus sinensis).” GigaScience vol. 13 (2024): giae084. doi:10.1093/gigascience/giae084
Li, Xiao-Jie et al. “Analysis of telomere-to-telomere genome of red carrot TXH4 elucidates the role of DcLCYE and DcLCYB1 in lycopene accumulation in carrot.” Horticulture research vol. 12,11 uhaf192. 29 Jul. 2025, doi:10.1093/hr/uhaf192
T2T Genome + Pangenome
Wang, Xiaojing et al. “T2T genome, pan-genome analysis, and heat stress response genes in Rhododendron species.” iMeta vol. 4,2 e70010. 5 Mar. 2025, doi:10.1002/imt2.70010