Beyond “Junk DNA”: Re-exploring Pseudogene Annotation and Functional Analysis with Artificial Intelligence and Machine Learning

Salwa Jaber Al-Awadi¹, Abdulameer M. Ghareeb², Zainib Hatif Abbas³

Abstract

Pseudogenes, once regarded as nonfunctional genomic relics, are now recognized as important contributors to gene regulation, chromatin remodeling, transcriptional modulation, and disease-associated pathways. Traditional annotation pipelines, built primarily on heuristic mutation-based criteria and sequence similarity, frequently misclassify pseudogenes and overlook subtle yet significant biological functions. Recent advances in artificial intelligence (AI) and machine learning (ML) have enabled the integration of multi-omics datasets, allowing deeper and more accurate functional inference for pseudogenes. Deep learning architectures such as convolutional neural networks (CNNs), long short-term memory networks (LSTMs), transformers, and graph neural networks (GNNs) have proven capable of modeling genomic sequences, identifying hidden open reading frames (ORFs), reconstructing regulatory networks, and predicting pseudogene-mediated effects on virulence and immunity. This review synthesizes recent developments in AI-driven pseudogene annotation, describes emerging multi-omics approaches, highlights the link between pseudogenes and virulence, and outlines future directions toward comprehensive, mechanistic pseudogene catalogs.

Published
2026-04-09