The displayed data set consists of 139 proteins, each characterized by the percentage of each of the 20 amino acids. Protein structures were obtained from the Protein Data Bank (PDB). We selected non-redundant (maximum sequence identity of 70%) eukaryotic membrane proteins resolved by X-ray crystallography. Each protein has one chain of at least 100 residues.
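The amino acid percentages that characterize each protein can be illustrated with a minimal sketch. It assumes the composition is computed from a one-letter sequence string and that characters outside the 20 standard letters are skipped; the function name `composition` and both of these choices are assumptions for illustration, not the procedure used to build the data set.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the percentage of each standard amino acid in a sequence.

    Hypothetical helper: characters outside AMINO_ACIDS are ignored
    (an assumption, not something the page specifies).
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for residues that never occur, so every
    # protein gets a vector with exactly 20 entries.
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}
```

Each protein then corresponds to one row of 20 percentages, which is the kind of input a heatmap row would display.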

The data were clustered using Euclidean distance and Ward's linkage. Protein 3D structures are rendered by the GLmol molecular viewer. The PDB file of each structure is downloaded directly from the PDB server, which can take up to 20 seconds.
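The clustering step can be sketched as follows. This is a naive pure-Python illustration of agglomerative clustering with Ward's criterion on Euclidean distances. The page does not say which software performed the clustering, so the function names and the cubic-time approach here are assumptions chosen for readability rather than a description of the actual pipeline.

```python
from itertools import combinations


def ward_merge_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares if the two clusters merge."""
    members_a, centroid_a = cluster_a
    members_b, centroid_b = cluster_b
    n_a, n_b = len(members_a), len(members_b)
    # Squared Euclidean distance between the two centroids.
    sq_dist = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return n_a * n_b / (n_a + n_b) * sq_dist


def ward_cluster(rows, n_clusters):
    """Merge rows bottom-up until n_clusters remain; return member indices."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Every row starts as its own cluster: (member indices, centroid).
    clusters = [([i], [float(v) for v in row]) for i, row in enumerate(rows)]
    while len(clusters) > n_clusters:
        # Pick the pair whose merge increases the variance the least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: ward_merge_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        members_i, centroid_i = clusters[i]
        members_j, centroid_j = clusters[j]
        n_i, n_j = len(members_i), len(members_j)
        # Size-weighted mean of the two centroids.
        merged_centroid = [
            (n_i * x + n_j * y) / (n_i + n_j)
            for x, y in zip(centroid_i, centroid_j)
        ]
        merged = (members_i + members_j, merged_centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(members) for members, _ in clusters]
```

For a data set of this size (139 rows of 20 values) the naive search is fast enough; a dedicated library routine would be the usual choice for larger inputs.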

Data set information

When a heatmap row is clicked, the corresponding PDB file is dynamically fetched from the PDB database and the protein's 3D model is displayed. Information linked to external databases is summarized below the protein visualization.
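The fetch triggered by a row click can be approximated outside the browser with a short sketch. It assumes the RCSB download endpoint `https://files.rcsb.org/download/<ID>.pdb` and a four-character PDB identifier. The page does not state which URL it actually requests, and the helper names `pdb_url` and `fetch_pdb` are hypothetical.

```python
from urllib.request import urlopen

# Assumption: the RCSB file download endpoint; the page may use a different server.
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the download URL for a four-character PDB identifier."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a four-character alphanumeric PDB ID")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id, timeout=20):
    """Download the PDB file as text; the timeout mirrors the 20 s noted above."""
    with urlopen(pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

The returned text is what a molecular viewer would parse to render the 3D model.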