A Bioinformatics Approach to Predict Predisposition to Amyloidosis
Devastating neurodegenerative diseases such as Alzheimer’s disease, Parkinson’s disease, and Huntington’s disease are linked to the formation of protein aggregates called amyloid fibrils. Recent experimental techniques have shed light on the structure of amyloids. Using this information, we have developed a program to predict amyloidosis from protein sequence information. As whole-genome sequencing becomes cheaper and such programs improve, it may become possible to create individual risk profiles for these diseases.
A broad range of human diseases is linked to the formation of insoluble, fibrous protein aggregates called amyloid fibrils. They include, but are not limited to, type II diabetes, rheumatoid arthritis, and, perhaps most importantly, debilitating neurodegenerative diseases such as Alzheimer’s disease, Parkinson’s disease, and Huntington’s disease. The mechanism of amyloid fibril formation, the root cause of their toxicity, and even the basic structure of amyloids are poorly understood. There is currently no cure and no means of early diagnosis for any of these diseases.
Over the past decade, revolutionary progress has been made in understanding the 3D structural arrangement of amyloid fibrils, owing to the application of new experimental techniques in combination with molecular modelling [1, 2, 3, 4].
It was recently shown that the majority of structural models of disease-related amyloid fibrils can be reduced to stacks of a repeating unit called a β-arcade. This fold is a columnar structure produced by stacking β-strand-loop-β-strand motifs called “β-arches”.
Using this structural insight, we are developing a bioinformatics-based approach to predict amyloids from protein sequence information. Data from the analysis of β-arcades observed in known 3D structures were used to identify the most probable conformations of β-arches. Hundreds of β-arcades with different amino acid sequences were then modelled and ranked according to the results of energy minimization programs. The results of this analysis were incorporated into a rule-based algorithm, implemented in the Java programming language, to create the Arch-Candy program.
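To make the idea of a rule-based scan for β-strand-loop-β-strand motifs more concrete, here is a minimal sketch in Python. It is not the Arch-Candy algorithm, which is written in Java and derives its rules from modelled β-arcades; the residue scores, the window lengths, the threshold, and the names `STRAND_SCORE`, `ARC_BONUS`, `ArchCandidate`, and `find_arches` are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy scores, NOT the Arch-Candy rules: hydrophobic residues are
# rewarded in strands, charged residues and proline are penalized.
STRAND_SCORE = {
    **{aa: 1.0 for aa in "VILFYWM"},
    **{aa: -1.0 for aa in "DEKR"},
    "P": -3.0,
}
# Assumed bonus for residues placed in the loop ("arc") between the strands.
ARC_BONUS = {"G": 1.0, "N": 0.5}


@dataclass(frozen=True)
class ArchCandidate:
    start: int        # 0-based index of the first residue of the first strand
    strand_len: int   # length of each of the two strands
    arc_len: int      # length of the loop between them
    score: float

    @property
    def end(self):
        """Index one past the last residue of the second strand."""
        return self.start + 2 * self.strand_len + self.arc_len


def score_window(seq, start, strand_len, arc_len):
    """Score one strand-loop-strand window with the toy tables above."""
    arc_start = start + strand_len
    arc_end = arc_start + arc_len
    strands = seq[start:arc_start] + seq[arc_end:arc_end + strand_len]
    arc = seq[arc_start:arc_end]
    strand_score = sum(STRAND_SCORE.get(aa, 0.0) for aa in strands)
    arc_score = sum(ARC_BONUS.get(aa, 0.0) for aa in arc)
    return strand_score + arc_score


def find_arches(seq, strand_len=5, arc_lens=(2, 3, 4), threshold=6.0):
    """Slide a window over seq and return candidates, best score first."""
    seq = seq.upper()
    hits = []
    for arc_len in arc_lens:
        width = 2 * strand_len + arc_len
        for start in range(len(seq) - width + 1):
            score = score_window(seq, start, strand_len, arc_len)
            if score >= threshold:
                hits.append(ArchCandidate(start, strand_len, arc_len, score))
    return sorted(hits, key=lambda h: h.score, reverse=True)
```

Calling `find_arches` on a sequence returns the windows whose toy score reaches the threshold, ordered from best to worst, which mirrors in a very simplified way the ranking of candidate arches described above. A real predictor would replace the score tables with rules derived from the energy-ranked structural models.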
The work planned for the internship has four parts. First, we will test the Arch-Candy program against a set of protein and peptide sequences known to be related to disease, in order to estimate the correctness of its predictions. Second, we will extend the program so that it predicts not only individual β-arches but also β-serpentines, which consist of several β-arches. Third, we will add a module that generates a 3D structural model of the amyloid fibril for each prediction. Fourth, we will adapt the Arch-Candy program for large-scale analysis of proteomes.
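One way the first of these tasks could be quantified is with sensitivity and specificity computed over a labelled set of sequences. The sketch below is a generic Python illustration under stated assumptions: the function name `evaluate`, the `predict` callable, and the input format (pairs of sequence and known amyloid status) are hypothetical and do not describe how Arch-Candy is actually invoked or validated.

```python
def evaluate(predict, labelled):
    """Compare a predictor with known labels.

    predict  -- hypothetical callable: sequence (str) -> True if called amyloidogenic
    labelled -- iterable of (sequence, is_amyloid) pairs with known status
    """
    tp = fp = tn = fn = 0
    for seq, is_amyloid in labelled:
        called = bool(predict(seq))
        if called and is_amyloid:
            tp += 1          # correctly predicted amyloid
        elif called:
            fp += 1          # predicted amyloid, but known not to be
        elif is_amyloid:
            fn += 1          # missed a known amyloid
        else:
            tn += 1          # correctly predicted non-amyloid
    # Guard against empty classes so the ratios are never a division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }
```

Sensitivity is the fraction of known amyloid-forming sequences that the predictor flags, and specificity is the fraction of known non-amyloid sequences that it leaves unflagged. Reporting both guards against a predictor that looks accurate simply because it flags almost everything or almost nothing.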
[1] Kajava, A. V., Baxa, U., Wickner, R. B., and Steven, A. C. (2004) Proc. Natl. Acad. Sci. U. S. A. 101,
[2] Kajava, A. V., Aebi, U., and Steven, A. C. (2005) J. Mol. Biol. 348, 247–252
[3] Nelson, R., and Eisenberg, D. (2006) Adv. Protein Chem. 73, 235
[4] Luca, S., Yau, W. M., Leapman, R., and Tycko, R. (2007) Biochemistry 46, 13505–13522
[5] Kajava, A. V., Baxa, U., and Steven, A. C. (2010) FASEB J. 24(5), 1311–1319
[6] Hennetin, J., Jullian, B., Steven, A. C., and Kajava, A. V. (2006) J. Mol. Biol. 358, 1094–1105