New genes can emerge through gene duplication, new combinations of preexisting protein domains, or de novo gene emergence (i.e., the emergence of a gene from non-coding DNA). However, detecting novel genes is not trivial, particularly for taxonomically restricted genes (i.e., genes that are only present in certain groups of organisms). An important methodological caveat is that a gene can appear novel as an artifact of its limited detectability in other lineages rather than because of true evolutionary novelty. Accurate detection of novel genes is necessary to make robust inferences about how they emerge, how they evolve, and what role they play in shaping lineage-specific traits.
I'm currently developing bioinformatic methods to detect taxonomically restricted genes throughout the tree of life and to assess their novelty through homology detection failure tests. We recently developed a pipeline called GenEra to do exactly that: it estimates the relative age of every gene in any genome of interest and tests for homology detection failure to separate truly novel genes from methodological artifacts. Click here to give it a try!
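To illustrate the general idea of assigning relative gene ages, here is a minimal Python sketch. It is a toy example and not GenEra's actual implementation: the function name `assign_gene_age`, the input format (lineages as root-to-species lists of taxon names), and the example lineages are all assumptions made for illustration. Each gene is assigned to the most inclusive taxonomic level that the query species shares with any species in which a homolog was detected.

```python
from typing import List


def assign_gene_age(query_lineage: List[str], hit_lineages: List[List[str]]) -> str:
    """Return the oldest taxonomic level at which a homolog is detectable.

    query_lineage: taxon names ordered from the root down to the query species.
    hit_lineages: one root-to-species lineage per species with a detected homolog.
    Both formats are hypothetical and chosen only for this illustration.
    """
    # With no hits in other species, the gene is restricted to the query species.
    oldest = len(query_lineage) - 1
    for lineage in hit_lineages:
        # Count how many levels, starting at the root, the two lineages share.
        shared = 0
        for query_taxon, hit_taxon in zip(query_lineage, lineage):
            if query_taxon != hit_taxon:
                break
            shared += 1
        if shared == 0:
            continue  # no common root in this toy taxonomy; ignore the hit
        # The last shared level is the common ancestor of query and hit;
        # a smaller index means a more inclusive (older) level.
        oldest = min(oldest, shared - 1)
    return query_lineage[oldest]


human = ["cellular organisms", "Eukaryota", "Metazoa", "Chordata", "Homo sapiens"]
fly = ["cellular organisms", "Eukaryota", "Metazoa", "Arthropoda", "Drosophila melanogaster"]
yeast = ["cellular organisms", "Eukaryota", "Fungi", "Ascomycota", "Saccharomyces cerevisiae"]

print(assign_gene_age(human, [fly]))         # homolog in fly only
print(assign_gene_age(human, [fly, yeast]))  # homologs in fly and yeast
print(assign_gene_age(human, []))            # no homologs outside the query species
```

A gene with no detectable homologs outside the query species is labeled as species-specific here, which is exactly the situation where a homology detection failure test matters: the sketch cannot tell whether the gene is truly novel or whether its homologs have simply diverged beyond detection.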