toGC: a new tool for correcting errors in gene annotations using RNA-seq data
Published 18 March, 2026
The accuracy of genomic annotation is crucial for subsequent functional investigations; however, the computational protocols used in high-throughput annotation of open reading frames (ORFs) can introduce inconsistencies. These inconsistencies, which lead to non-uniform extension or truncation of sequence ends, pose challenges for downstream analyses. Existing strategies for rectifying them are time-consuming and labor-intensive, and dedicated approaches are lacking.
To address this gap, a team of researchers from China developed toGC, a tool that integrates genomic annotation with RNA-seq datasets to rectify annotation inconsistencies. They reported their results in the Journal of Integrative Agriculture.
"Using toGC, we achieved an accuracy of nearly 100% accuracy in correcting inconsistencies in published Phytophthora sojae ORFs," shares corresponding author Yuanchao Wang, a professor at Nanjing Agricultural University. "We applied this innovative pipeline to the GPCR-bigrams gene family, which was predicted to have 42 members in the P. sojae genome but lacked experimental validation."
By employing toGC, the researchers identified 32 GPCR-bigram ORFs with inconsistencies between previous annotations and toGC-corrected sequences. Notably, five of these genes (GPCR-TKL9, GPCR-TKL15, GPCR-PDE3, GPCR-AC3, and GPCR-AC4) showed substantial inconsistencies.
"Experimental gene annotation confirmed the effectiveness of toGC, as sequences obtained through cloning matched those annotated by toGC," adds Wang. "We discovered two novel GPCRs (GPCR-AC3 and GPCR-AC4), which were previously mispredicted as a single gene."
CRISPR/Cas9-mediated knockout experiments revealed the involvement of GPCR-AC4, but not GPCR-AC3, in oospore production, further confirming their status as two separate genes.
"In addition to P. sojae, the reliability of the toGC pipeline in Phytophthora capsici and Pythium ultimum further emphasizes the robustness of this pipeline," shares co-corresponding author Ming Wang, a professor at Nanjing Agricultural University. "Our findings highlight the utility of toGC for reliable gene model correction, facilitating investigations into biological functions and offering potential applications in diverse species analyses."
Contact Authors:
Min Qiu, E-mail: minqiu@njau.edu.cn;
Chun Yan, E-mail: 2022202060@stu.njau.edu.cn;
Correspondence: Yuanchao Wang, E-mail: wangyc@njau.edu.cn;
Ming Wang, E-mail: mwang@njau.edu.cn
Funder:
This work was supported by the grants to Min Qiu and Ming Wang from the National Natural Science Foundation of China (32100160 and 32100044), the grants to Ming Wang from the Jiangsu "Innovative and Entrepreneurial Talent" Program, China (JSSCRC2021510), and the grants to Yuanchao Wang from the Chinese Modern Agricultural Industry Technology System (CARS-004-PS14).
Conflict of Interest:
The authors declare that they have no conflict of interest.
See the Article:
Qiu M, et al. 2026. toGC: A pipeline to correct gene model for functional excavation of dark GPCRs in Phytophthora sojae. Journal of Integrative Agriculture, 25(1): 150-156.