CMAT Help

  1. Overview
    1. Correlated mutation analysis
    2. The CMAT server
    3. Contacts
  2. Input parameters
    1. Sequence or alignments
    2. Analysis options
    3. Display options
  3. Analysis results
    1. Query information
    2. Summary
    3. Details

Overview

Correlated mutation analysis

Correlated mutation analysis identifies co-evolutionary relationships between residue positions in a multiple sequence alignment. Important residues tend both to co-evolve with their associated residues and to be conserved independently, so correlated mutation analysis is an effective way to use a protein’s evolutionary information to understand its function. It has been applied successfully to inter-residue contact prediction and functional site detection.
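
As a rough illustration of the kind of score involved, the mutual information between two alignment columns can be computed as sketched below. This is a minimal sketch and not CMAT's implementation: the function name, the gap handling, the uniform sequence weighting, and the use of log base 2 are all assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j are equal-length strings with one character per
    sequence. Rows with a gap ('-') in either column are skipped
    (an assumption of this sketch).
    """
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # counts of residue pairs
    left = Counter(a for a, _ in pairs)    # counts in the first column
    right = Counter(b for _, b in pairs)   # counts in the second column
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi


# Perfectly coupled columns carry one bit; independent columns carry none.
print(mutual_information("AAVV", "LLII"))
print(mutual_information("AAVV", "LILI"))
```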

The CMAT server

The CMAT server was developed to provide biologists with a fully automated and reliable pipeline for correlated mutation analysis. CMAT carries out every step of the analysis: homology search, multiple sequence alignment construction, sequence redundancy treatment, and calculation of various correlated mutation scores.

By incorporating a sequence profile, CMAT reliably estimates the joint probabilities from which most correlated mutation measures are derived. This adjustment consistently improves various correlated mutation measures regardless of the quality of the input alignments.

Contacts

The CMAT server is developed and maintained at the Bioinformatics and Computational Biology Laboratory, Korea Advanced Institute of Science and Technology.

To help us keep improving this service, please send us any problem reports or comments.

Input parameters

Sequence or alignments

CMAT accepts either a single sequence or a multiple sequence alignment as input. If an alignment is submitted, the early stages related to alignment construction are skipped. The allowed formats are listed below with examples.

Bare sequence

FSAVVSVGDWLQAIKMDRYKDNFTAAGYTTLEAVVHMSQDDLARIGITAITHQNKILSSV
QAMR

FASTA sequence

>QUERY
FSAVVSVGDWLQAIKMDRYKDNFTAAGYTTLEAVVHMSQDDLARIGITAITHQNKILSSV
QAMR

FASTA alignments

>QUERY
FSAVVSVGDWLQAIKM--------DRYKDNFTAAGYTTLEAVV---HMSQDDLA-RIGIT
A---ITHQNKILSSVQAMR
>gi|13096661|gi|13096661| 
YHADPSLVSFLTGLGC--------PNCIEYFTSQGLQSIYHLQ---NLTIEDLG-ALKIP
E----QYRMTIWRGLQDLK
>gi|13636617|gi|13636617| 
AWTEDDVYCWVQQLVRKGDSSAEMSVYASLFKENNI-TGKRLL---LLEEEDLK-DMGIV
S---KGHIIHFKSAIEKLT
>gi|14133209|gi|14133209| 
VWSNERVMGWVSGLGL--------KEFATNLTESGV-HGALLALDETFDYSDLALLLQIP
TQ-NAQARQLLEKEFSNLI
>gi|1708621|gi|1708621| 
DWSLNSVLQFLKLYKFN-------KEWEDVFIKSRI-EMDLFIN--LADQSKAE-EFAFK
NKLSKESAIQLSSCIRKTL
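
The three input formats above can be told apart with a few lines of code. The following is a minimal sketch, assuming that a bare sequence has no ">" header line, that a FASTA input with one record is a single sequence, and that all records in an alignment must have the same length. The function name `parse_input` and the placeholder name "QUERY" for bare sequences are hypothetical and are not taken from the server.

```python
def parse_input(text):
    """Return (input_type, records) for bare, FASTA-sequence, or
    FASTA-alignment input. records is a list of (name, sequence) tuples.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input")
    if not lines[0].startswith(">"):
        # Bare sequence: join the wrapped lines into one string.
        return "sequence", [("QUERY", "".join(lines))]
    records = []
    for ln in lines:
        if ln.startswith(">"):
            records.append((ln[1:].strip(), []))   # new record header
        else:
            records[-1][1].append(ln)              # sequence continuation
    records = [(name, "".join(chunks)) for name, chunks in records]
    if len(records) == 1:
        return "sequence", records
    if len({len(seq) for _, seq in records}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return "alignments", records


kind, records = parse_input(">QUERY\nFSAV--VG\nDW\n>hit1\nYHADPSLV\nSF\n")
print(kind, records)
```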

Analysis options

Sequence weight: Sequence weighting scheme used for estimating amino acid frequencies

Pseudocount: Pseudocount method used to correct for a low number of observations

Simple pseudocount: Constant value added to each joint frequency in the simple pseudocount method

Adjustment: Joint probability adjustment based on the sequence profile. This option subsequently corrects the marginal probabilities to be consistent with the sequence profile.
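
To make the sequence weight and simple pseudocount options more concrete, the sketch below estimates joint probabilities for one pair of positions from weighted counts plus a constant pseudocount. It is only an illustration under stated assumptions: the function name, the 20-letter amino acid alphabet, the default values, and the skipping of gaps are choices made for the example, and the profile-based adjustment performed by CMAT is not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet


def joint_probabilities(col_i, col_j, weights=None, pseudocount=1.0):
    """Joint residue-pair probabilities for two alignment columns.

    weights holds one weight per sequence (uniform if omitted).
    pseudocount is the constant added to every joint frequency.
    """
    if weights is None:
        weights = [1.0] * len(col_i)
    # Start every one of the 400 residue pairs at the pseudocount value.
    counts = {pair: pseudocount for pair in product(AMINO_ACIDS, repeat=2)}
    for a, b, w in zip(col_i, col_j, weights):
        if (a, b) in counts:  # skips gaps and non-standard residues
            counts[(a, b)] += w
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations and no pseudocount")
    return {pair: c / total for pair, c in counts.items()}


probs = joint_probabilities("AAVV", "LLII", pseudocount=1.0)
print(probs[("A", "L")], probs[("A", "I")])
```

With only four sequences, the pseudocount keeps unobserved pairs such as (A, I) at a small non-zero probability instead of zero.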

Display options

Min. MI: Only the position pairs above this cut-off mutual information will be displayed.

Min. Zp: Only the position pairs above this cut-off Zp score will be displayed.

Min. Zc: Only the position pairs above this cut-off Zc score will be displayed.

Min. Neff: Only the position pairs whose corresponding alignments have an effective number of sequences above this cut-off number will be displayed.

Min. position distance: Only the position pairs separated from each other by at least this cut-off position distance will be displayed.

Max. # of hits in summary: Maximum number of position pairs displayed in the summary section.

Max. # of hits in detail: Maximum number of position pairs displayed in the details section.
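
The way these display options combine can be sketched as a simple filter over scored position pairs. This is a hedged illustration, not the server's code: the `PairScore` record, the `select_pairs` function, and the parameter names are hypothetical, score cut-offs are treated as strict ("above"), the distance cut-off as inclusive ("at least"), and hits are sorted by descending Zc score on the assumption that higher is better.

```python
from dataclasses import dataclass

NO_LIMIT = float("-inf")


@dataclass
class PairScore:
    i: int        # first position index
    j: int        # second position index
    neff: float   # effective number of sequences
    mi: float     # mutual information
    zp: float     # Zp score
    zc: float     # Zc score


def select_pairs(pairs, min_mi=NO_LIMIT, min_zp=NO_LIMIT, min_zc=NO_LIMIT,
                 min_neff=NO_LIMIT, min_distance=1, max_hits=10):
    """Apply the cut-offs, sort by Zc (highest first), keep max_hits pairs."""
    kept = [
        p for p in pairs
        if p.mi > min_mi and p.zp > min_zp and p.zc > min_zc
        and p.neff > min_neff and abs(p.i - p.j) >= min_distance
    ]
    kept.sort(key=lambda p: p.zc, reverse=True)
    return kept[:max_hits]


pairs = [
    PairScore(10, 12, 50, 0.5, 3.0, 4.0),
    PairScore(5, 40, 80, 0.8, 5.0, 9.0),
    PairScore(7, 30, 20, 0.9, 6.0, 7.0),
    PairScore(1, 25, 90, 0.1, 1.0, 8.0),
]
for hit in select_pairs(pairs, min_mi=0.2, min_neff=30, min_distance=5):
    print(hit)
```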

Analysis results

Query information

This section shows the sequence information (one-letter sequence and length), the input type (sequence or alignments), the request ID, and the submission and completion dates and times. The request ID can be used to revisit the result page.

Summary

This section shows a summary of co-evolving position pairs. For each pair, it lists the position indices, the effective number of sequences, the mutual information, and the mutual information variants that correct for noise. Only those position pairs that satisfy the given cut-off parameters are displayed. The entries are ordered by Zc score.

Details

This section provides a detailed display of the correlated mutation information for a pair of residue positions. In addition to the correlated mutation scores, the commonly observed residue types are listed in order of frequency. For each residue type, the observed and expected probabilities are shown. The pointwise mutual information (an association measure between residue types) is also calculated and shown, as is the cumulative density.
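
The relationship between the observed probability, the expected probability, and the pointwise mutual information can be sketched as follows. This is an illustration only: it assumes that the expected probability of a residue pair is the product of the two marginal probabilities, that sequences are weighted uniformly, and that logarithms are taken in base 2. The function name is hypothetical, and the cumulative density column is not reproduced.

```python
import math
from collections import Counter


def residue_pair_table(col_i, col_j):
    """Rows of (residue pair, observed, expected, PMI), most frequent first."""
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    rows = []
    for (a, b), count in joint.most_common():
        observed = count / n
        expected = (left[a] / n) * (right[b] / n)  # independence assumption
        pmi = math.log2(observed / expected)
        rows.append((a + b, observed, expected, pmi))
    return rows


for row in residue_pair_table("AAAV", "LLLI"):
    print(row)
```

A positive pointwise mutual information means the residue pair is seen more often than independence would predict, and a negative value means it is seen less often.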