Welcome to the AAstretch portal. AAstretch is a computer program designed to scan entire proteomes for imperfect homopeptidic stretches and their corresponding coding sequences. We expanded the classical concept of a "single amino acid repeat", usually defined as a consecutive stretch of a single amino acid residue, to a more biologically meaningful definition that takes insertions into account. Together with the protein and coding sequences of the repeats, the program retrieves a number of parameters that you can analyze graphically and automatically with a dedicated software tool. You will also find ready-to-use proteomes for a number of organisms in the Organisms section.

The project started with a precise biological aim in mind: to discover the features of, and deepen our knowledge about, the polyglutamine stretches typical of triplet expansion diseases. Diseases such as Huntington's disease, Spinocerebellar Ataxia and many others are characterized by an elongation of glutamine tracts in the protein, caused by weaknesses in the DNA replication process. This elongation decreases the solubility of the protein and leads to the formation of fibrils that impair the cellular machinery and cause organ dysfunction.

To accomplish this task, we conceived a computer program, AAstretch, that scans a properly formatted set of protein sequences and their corresponding coding sequences and extracts from them imperfect homopeptidic repeats of a given amino acid. The user can define the residue to be searched for and the maximal length and proportion of insertions tolerated, together with other parameters that are described in the Manual. The program produces a tabular output in which a set of features is reported for each repeat, such as its position in the sequence, its flanking regions, the annotations and GO terms of the containing protein, and many others. We also designed an interactive graphic tool, AAexplore, that automatically binds to AAstretch output files and presents the results in a readily interpretable form supported by chi-squared statistics and plots.
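To make the notion of an imperfect repeat more concrete, here is a minimal Python sketch of one possible way to detect such stretches. It is not the AAstretch implementation: the function names, the parameter names (max_gap, max_insert_fraction, min_length) and the greedy extension strategy are all assumptions made for illustration. The actual parameters and their meaning are described in the Manual.

```python
def find_stretches(seq, residue="Q", max_gap=1,
                   max_insert_fraction=0.2, min_length=4):
    """Return (start, end, subsequence) tuples for imperfect repeats.

    Hypothetical sketch, not the AAstretch algorithm. A stretch starts and
    ends with `residue`; each run of other residues inside it is at most
    `max_gap` long, and other residues make up at most
    `max_insert_fraction` of the stretch. Coordinates are 0-based and
    `end` is exclusive.
    """
    stretches = []
    n = len(seq)
    i = 0
    while i < n:
        if seq[i] != residue:
            i += 1
            continue
        best_end = i   # last accepted position (inclusive)
        inserted = 0   # non-target residues seen since position i
        gap = 0        # length of the current run of non-target residues
        j = i + 1
        while j < n:
            if seq[j] == residue:
                gap = 0
                length = j - i + 1
                # Accept the extension only if the insertion proportion holds.
                if inserted / length <= max_insert_fraction:
                    best_end = j
            else:
                gap += 1
                inserted += 1
                if gap > max_gap:
                    break
            j += 1
        if best_end - i + 1 >= min_length:
            stretches.append((i, best_end + 1, seq[i:best_end + 1]))
        i = best_end + 1
    return stretches


def coding_slice(cds, start, end):
    """Codons encoding protein positions [start, end).

    Assumes the coding sequence is in frame and begins at the first codon.
    """
    return cds[3 * start:3 * end]


if __name__ == "__main__":
    protein = "MKQQQQAQQQLLLQQ"
    for start, end, sub in find_stretches(protein):
        print(start, end, sub)
```

With the toy sequence above, the single alanine inside the glutamine tract is tolerated as an insertion, while the run of three leucines ends the stretch and the final two glutamines are too short to be reported on their own.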

Any sequence or sequence set can be scanned with AAstretch. We developed an automatic builder, AAprepare, that allows working on whole genomic sets. Linked to the EnsEMBL genomic database and taking advantage of the BioMart services, AAprepare prepares organism-specific annotated gene sets in a format suitable for AAstretch. You can find these ready-to-use files for a number of organisms spanning the different kingdoms of life in the Organisms section. Since this section is intended to be updated yearly (possibly following EMBL genome releases), all you need to do is download AAstretch and its manual, download your preferred organism and start analysing. A full genome scan takes a minute or two on a modern computer. AAstretch is very simple to run: edit a configuration file to change a few rather intuitive parameters, then run the analysis through the interactive menu.
