Fold Recognition

Fold recognition and threading for protein structure prediction

At Profacgen, our Fold Recognition services deliver accurate structural fold identification and remote homology detection for proteins with low sequence identity to known structures, supporting structure prediction, functional annotation, and target discovery through advanced threading algorithms.

Protein tertiary structure is essential for understanding function and guiding drug discovery. Computational prediction serves as a powerful complement to experimental methods. Fold recognition (or threading) identifies structurally similar proteins to serve as templates for modeling, particularly when sequence identity falls below 25%.
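To make the 25% threshold concrete, the following minimal sketch computes percent identity from a pairwise alignment. The function name `percent_identity` and the choice of denominator (columns where neither sequence has a gap) are assumptions for illustration; other conventions exist and give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, so they have
    equal length. The denominator convention is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0
```

Under this convention, a query-template pair scoring below 25 falls in the range where fold recognition is typically preferred over a sequence-only template search.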

Profacgen combines sequence profile-profile alignment with structural information to recognize correct folds. Our workflow involves building a template database, aligning the query sequence against each template with optimized scoring functions, and repeating this across all templates in the database to find the best match. The statistically best-supported alignment is then used to construct a structural model by assigning the backbone coordinates of the aligned template positions to the query residues. Unlike sequence-only approaches, our method leverages 3D structural data for enhanced accuracy. All models are quality-validated and suitable for downstream applications such as protein engineering and drug design.
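The general shape of this workflow can be sketched as a loop that aligns the query against every template and ranks the results. The example below is a toy illustration, not Profacgen's scoring function: the `COMPAT` table, the `GAP` value, the three-letter environment alphabet (H = helix, E = strand, C = coil) and the template names are all hypothetical, and a plain Needleman-Wunsch global alignment stands in for a real threading algorithm.

```python
from typing import Dict, List, Tuple

# Hypothetical residue-vs-environment compatibility scores (illustrative only).
COMPAT: Dict[str, Dict[str, float]] = {
    "A": {"H": 1.0, "E": -0.5, "C": -0.2},
    "L": {"H": 0.8, "E": 0.2, "C": -0.6},
    "V": {"H": -0.3, "E": 1.0, "C": -0.5},
    "I": {"H": 0.0, "E": 0.9, "C": -0.6},
    "G": {"H": -0.8, "E": -0.6, "C": 1.0},
    "P": {"H": -1.0, "E": -0.8, "C": 0.9},
}
GAP = -1.0  # assumed linear gap penalty


def thread_score(query: str, template_env: str) -> float:
    """Best global alignment score of a query sequence onto a template's
    per-position structural environments (Needleman-Wunsch recursion)."""
    m = len(template_env)
    prev = [j * GAP for j in range(m + 1)]
    for i in range(1, len(query) + 1):
        cur = [i * GAP] + [0.0] * m
        for j in range(1, m + 1):
            # Unknown residues or environments score 0 in this toy table.
            fit = COMPAT.get(query[i - 1], {}).get(template_env[j - 1], 0.0)
            cur[j] = max(prev[j - 1] + fit, prev[j] + GAP, cur[j - 1] + GAP)
        prev = cur
    return prev[m]


def rank_templates(query: str, templates: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score the query against every template and sort best-first."""
    scored = [(name, thread_score(query, env)) for name, env in templates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For instance, `rank_templates("AALLGPVVII", {"helix_then_sheet": "HHHHCCEEEE", "all_sheet": "EEEEEEEEEE"})` places `helix_then_sheet` first, because the query's residues fit that template's environments better.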

Overview of Fold Recognition

Fold recognition (threading) addresses the challenge of predicting protein structures when sequence homology is insufficient for conventional comparative modeling:

Remote homology detection: Identification of evolutionary relationships between proteins with divergent sequences (<25% identity) but conserved tertiary folds, using profile-profile alignment and structural feature comparison
Structural fold identification: Assignment of query proteins to known structural classes (α, β, α/β, α+β) and specific folds from the SCOP/CATH databases, enabling structural annotation without experimental data
Function prediction: Inference of molecular function from structural similarity, as proteins with similar folds often share functional mechanisms, active site architectures, and interaction partners
Template discovery: Identification of optimal structural templates for downstream homology modeling, even when sequence-based searches fail to detect meaningful relationships

Figure 1. Fold recognition workflow: from template database construction and profile-profile alignment through scoring function optimization to structural model generation.

The output includes information about the template proteins, the query-template alignments and, most importantly, full-length models of the target protein. We can also customize the service to your specific requirements and integrate our computational procedures into your workflow.

Our Capabilities

Our fold recognition platform encompasses four specialized service modules, each addressing critical aspects of remote homology detection and structure prediction:

Fold Classification

Systematic assignment of query proteins to known structural fold classes.

Template database construction from curated PDB structures and fold classifications
1D-3D profile comparison for structural similarity assessment
Assignment to SCOP/CATH fold classes and structural superfamilies
Confidence scoring and statistical significance evaluation for fold assignments
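One common way to express the significance of a fold assignment is a Z-score: how far a template's raw score lies from the mean of all template scores, in units of standard deviation. The sketch below assumes raw scores where higher is better; the function name `z_scores` and the use of the whole template library as the background distribution are assumptions, not a description of the scoring used in this service.

```python
from statistics import mean, pstdev
from typing import Dict


def z_scores(raw_scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw threading scores into Z-scores against the score
    distribution of the whole template library (population std. dev.)."""
    values = list(raw_scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All templates scored identically: no template stands out.
        return {name: 0.0 for name in raw_scores}
    return {name: (score - mu) / sigma for name, score in raw_scores.items()}
```

A template whose Z-score clearly exceeds the rest is a stronger candidate than one whose raw score is merely the largest; any cutoff applied on top of this would be method-specific.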

Remote Homology Analysis

Detection of evolutionary relationships beyond sequence similarity thresholds.

Accurate sequence and structural alignment algorithms for remote homology identification
Profile-profile alignment using position-specific scoring matrices and hidden Markov models
Use of functional domain information and sequence-derived evolutionary information
Multiple threading methods to identify converged solutions and increase confidence
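The idea of converged solutions across several threading methods can be illustrated with a simple vote over each method's top-ranked fold. This is a minimal sketch under stated assumptions: the function name `consensus_fold`, the method labels and the `min_agreement` threshold are hypothetical, and real pipelines may weight methods or compare alignments rather than only fold labels.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def consensus_fold(
    top_hits: Dict[str, str], min_agreement: float = 0.5
) -> Tuple[Optional[str], float]:
    """Return the fold named by most methods and the fraction of methods
    agreeing on it. The fold is None if agreement is not above the threshold."""
    if not top_hits:
        raise ValueError("at least one method result is required")
    fold, votes = Counter(top_hits.values()).most_common(1)[0]
    fraction = votes / len(top_hits)
    return (fold if fraction > min_agreement else None, fraction)
```

With three methods of which two report the same fold, the function returns that fold with an agreement of 2/3; if all three disagree, it returns `None` for the fold, signalling that no converged solution was found.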

Template Identification

Optimal template selection for downstream structure prediction and modeling.

Knowledge-based scoring function containing mutation potential, environment fitness potential, pairwise potential, secondary structure compatibilities, and gap penalties
Hybrid methods combining threading with other structure prediction methods
Ranking and validation of candidate templates by energy calculation
Template coverage optimization for maximum structural information transfer
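A knowledge-based scoring function of the kind listed above is often written as a weighted sum of its terms. The sketch below shows that structure only: the class `ThreadingTerms`, the `WEIGHTS` values and the sign convention (energy-like, lower is better, gap penalty added as a positive cost) are assumptions for illustration and are not the weights or potentials used in this service.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ThreadingTerms:
    """Per-alignment energy terms; lower values indicate a better fit."""
    mutation: float
    environment: float
    pairwise: float
    secondary_structure: float
    gap_penalty: float  # positive cost for gaps in the alignment


# Hypothetical weights; real methods fit these on benchmark alignments.
WEIGHTS = ThreadingTerms(1.0, 1.0, 0.5, 0.8, 1.0)


def total_energy(terms: ThreadingTerms, weights: ThreadingTerms = WEIGHTS) -> float:
    """Weighted sum of the five terms."""
    return (
        weights.mutation * terms.mutation
        + weights.environment * terms.environment
        + weights.pairwise * terms.pairwise
        + weights.secondary_structure * terms.secondary_structure
        + weights.gap_penalty * terms.gap_penalty
    )


def rank_by_energy(candidates: Dict[str, ThreadingTerms]) -> List[Tuple[str, float]]:
    """Rank candidate templates from lowest (best) to highest total energy."""
    scored = [(name, total_energy(t)) for name, t in candidates.items()]
    return sorted(scored, key=lambda item: item[1])
```

Keeping the terms separate in a small data class makes it easy to inspect which contribution drives a template's rank before the weighted total is used for selection.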

Functional Annotation Support

Structure-based inference of molecular function from predicted folds.

Active site and binding pocket identification from template structures
Functional site mapping and catalytic residue prediction
Gene ontology and pathway annotation based on structural classification
Integration of fold recognition results with functional genomics databases

Applications

Our Fold Recognition services support a broad spectrum of applications across structural biology and genomics:

Novel Protein Characterization: Structural and functional annotation of proteins with no detectable sequence homology to characterized families, enabling mechanistic hypothesis generation and experimental design prioritization
Structure Prediction: Template identification and alignment generation for proteins with remote homology, providing starting models for refinement and validation when conventional homology modeling fails
Genome Annotation: Large-scale structural annotation of proteomes and metagenomic datasets, assigning folds and predicted functions to uncharacterized open reading frames
Target Discovery: Identification of novel therapeutic targets through structural fold analysis, including druggability assessment based on predicted pocket geometries and fold-class specific properties

Deliverables

Profacgen provides structured, analysis-ready documentation aligned with your fold recognition and structure prediction requirements:

Deliverable	Description
Fold Recognition Reports	Comprehensive documentation of fold assignments, confidence scores, statistical significance, structural class annotations, and comparison to known fold families with evolutionary analysis
Candidate Structural Templates	Ranked list of optimal template structures with alignment details, coverage maps, scoring function values, and energy validation results for downstream modeling
Functional Annotation Results	Predicted molecular functions, active site architectures, binding pocket characteristics, and pathway associations derived from structural fold similarity and template functional annotations

Why Choose Our Fold Recognition Services?

Advanced Profile-Profile Methods: We combine sequence profile-profile alignment with multiple structural information sources, enabling detection of remote homologies invisible to standard sequence comparison.
Knowledge-Based Scoring Functions: Our optimized scoring functions incorporate mutation potential, environment fitness, pairwise interactions, secondary structure compatibility, and gap penalties for accurate template ranking.
Multiple Convergent Methods: We employ multiple threading algorithms and hybrid approaches combining threading with ab initio prediction, ensuring robust identification of the correct fold through method convergence.
Integrated Workflow Compatibility: Our output includes full-length models, alignments, and template information ready for direct integration into your structure prediction, protein engineering, or drug design pipelines.

Representative Program Scenarios

Scenario 1: Structural Annotation of a Metagenomic Protein Family

Program Context:

A microbiology research group identified a family of 200 uncharacterized proteins from a deep-sea metagenomic dataset. Standard BLAST searches returned no significant hits, preventing functional and structural annotation.

Objective:

To assign structural folds and predict functions for the protein family using fold recognition, enabling experimental prioritization and mechanistic hypothesis generation.

Approach:

Profacgen constructed sequence profiles for the protein family and performed threading against a comprehensive template database of 50,000 template structures. Profile-profile alignment and 1D-3D scoring identified a conserved α/β hydrolase fold with significant confidence. Multiple threading methods converged on the same fold assignment. Functional annotation based on the template active site architecture predicted esterase activity.

Outcome:

Fold recognition assigned the α/β hydrolase fold to 85% of the family members with high confidence. Experimental testing of 10 representative proteins confirmed esterase activity in 8, validating the structure-based functional prediction and enabling focused biochemical characterization of the family.

Scenario 2: Template Discovery for a Therapeutic Target with Low Sequence Identity

Program Context:

A pharmaceutical company identified a kinase-like protein as a potential oncology target but could not find suitable templates for homology modeling using standard sequence-based searches, as the closest homologues shared only 18% sequence identity.

Objective:

To identify structurally suitable templates for homology modeling and assess the feasibility of structure-based drug design despite low sequence homology.

Approach:

Profacgen performed fold recognition using profile-profile alignment and structural feature comparison. Threading identified a protein kinase fold with significant statistical confidence despite the low sequence identity. The optimal template was selected based on scoring function optimization and energy validation. A structural model was constructed and refined for ATP-binding pocket analysis.

Outcome:

The fold recognition approach identified a suitable kinase template that sequence methods missed. The resulting model revealed a druggable ATP-binding pocket with conserved hinge region geometry, enabling structure-based virtual screening. Two hit compounds from the screen showed micromolar activity in biochemical assays, validating the template and demonstrating the value of threading for challenging targets.

Frequently Asked Questions (FAQs)

Q: What sequence identity is needed for fold recognition?

A: There is no minimum. Fold recognition is designed for cases where the query shares low sequence identity (below roughly 25%) with known structures, and it detects structural similarity even when sequences have diverged significantly.

Q: How does fold recognition differ from standard homology modeling?

A: Standard homology modeling relies on detectable sequence similarity. Fold recognition uses structural information, 1D-3D profiles, and threading to identify templates when sequence homology is insufficient.

Q: Can you predict novel folds not in the database?

A: No. Fold recognition assigns query proteins to known folds. For truly novel folds, ab initio or deep learning methods are required. We offer hybrid approaches combining threading with other methods.

Q: What is included in the output?

A: Output includes template protein information, query-template alignments, confidence scores, and full-length structural models ready for refinement or downstream analysis.

Q: Can fold recognition predict protein function?

A: Indirectly. Similar folds often imply similar functions. We provide functional annotation based on template functions, active site prediction, and pathway associations.

Q: What input do you need to start a project?

A: We need the amino acid sequence of the query protein. Additional information such as domain boundaries, known functions, or experimental constraints can improve results but are not required.