Enzymes play indispensable roles in catalyzing biochemical reactions in living organisms. Intensive efforts have been devoted to the computational prediction of catalytic residues in enzymes using either feature-based or template-based strategies, but no study has systematically compared the strengths and limitations of the two strategies or examined whether combining them can improve prediction performance. Here we present the first integrative algorithm, called CRHunter, which exploits both the complementarity between feature- and template-based strategies and that between structural and sequence information. Delaunay triangulation and Laplacian transformation were first used to characterize enzyme structures, yielding several novel structural features. Combining these with traditional descriptors, we developed two support vector machine feature predictors, one based on structural information and one on sequence information. In parallel, we developed two template predictors, based on structure alignment and profile alignment respectively. When evaluated on datasets with different levels of homology, our feature predictors achieve relatively stable performance, whereas our template predictors perform poorly as the homology constraint becomes stricter. Even so, the hybrid algorithm CRHunter consistently achieves the highest prediction accuracy among all our proposed predictors, indicating the importance of integrating different strategies. We further demonstrate that the methodology is also applicable to simulated structures of enzymes, which is useful for query proteins that have only sequence information. Compared with state-of-the-art methods, our algorithm shows clear advantages on various datasets, suggesting that CRHunter is an effective and efficient web tool for predicting catalytic residues.
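
The structural characterization step described above can be illustrated with a toy example. The sketch below is not the CRHunter implementation. It replaces the Delaunay triangulation with a simple distance-cutoff contact graph (an assumption, as is the 8 Å cutoff) and then applies the graph Laplacian L = D - A to a per-residue property, so that each residue's value is contrasted with those of its spatial neighbours. The function names, coordinates and conservation scores are hypothetical.

```python
from math import dist

def contact_graph(coords, cutoff=8.0):
    """Build adjacency lists for residues.

    Two residues are neighbours if their representative atoms lie within
    `cutoff` angstroms. This is a stand-in for Delaunay triangulation;
    the cutoff value is an assumption for illustration only.
    """
    n = len(coords)
    neighbours = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                neighbours[i].append(j)
                neighbours[j].append(i)
    return neighbours

def laplacian_transform(values, neighbours):
    """Apply the graph Laplacian L = D - A to a per-residue property.

    (L x)_i = deg(i) * x_i - sum of x_j over the neighbours j of i.
    """
    return [
        len(nbrs) * values[i] - sum(values[j] for j in nbrs)
        for i, nbrs in enumerate(neighbours)
    ]

# Hypothetical toy data: four residues on a line, with made-up conservation scores.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
conservation = [0.9, 0.2, 0.4, 0.7]

graph = contact_graph(coords)
transformed = laplacian_transform(conservation, graph)
print(graph)
print(transformed)
```

A residue with no neighbours gets a transformed value of zero, while a residue whose property differs strongly from its surroundings gets a value of large magnitude. Transformed values of this kind could then be supplied to a classifier alongside the raw descriptors.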

Flowchart of our CRHunter algorithm
Jun Sun, Jia Wang, Dan Xiong, Jian Hu, Rong Liu. CRHunter: integrating multifaceted information to predict catalytic residues in enzymes. (In submission)