Help finding Bioinformatics tools?

Flair: technical question

Hello!

I'm carrying out a bioinformatics project as part of the computational section of my master's degree project and need some help finding the right tools for the job.

So far I've been carrying out a conservation analysis on my protein of interest using the ConSurf platform. The protein is conserved across eukaryotes and has two paralogues, 1 and 2, supposedly in "higher eukaryotes". I'm trying to find out when exactly this divergence occurred by sifting through the WASABI phylogenetic trees I've generated in ConSurf, which is.... fun. I think I'm OK with this part; it's just tedious and I don't really know how to use WASABI very well.

My supervisor has advised me to map the conservation pattern of each paralogue between organisms: for example, take paralogue 1 from all the mammals (which share around 80-100% sequence ID), or from all the chordates, and map the conservation scores onto a structure in PyMOL. What's interesting, however, is that paralogues 1 and 2 from the same organism seem to share only about 50% ID, though I've only tested this for a very small number of organisms. I'd like to see whether this pattern is maintained across organisms.
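One common way to get per-residue scores onto a structure is to write them into the B-factor column of a PDB file and then colour by B-factor in PyMOL (for example with `spectrum b`). The sketch below shows that idea under a few assumptions: the function name `map_scores_to_bfactor` is made up for illustration, the scores are assumed to be available as a dictionary keyed by residue number, and the residue numbering of the scores is assumed to match the numbering in the PDB file.

```python
def map_scores_to_bfactor(pdb_lines, scores, chain="A", default=0.0):
    """Return PDB lines with the B-factor field replaced by per-residue scores.

    pdb_lines: iterable of PDB-format lines.
    scores:    dict mapping residue number (int) to a score (hypothetical input).
    chain:     chain ID to modify; other chains are left untouched.
    default:   value written for residues that have no score.
    """
    out = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and len(line) >= 66 and line[21] == chain:
            resnum = int(line[22:26])           # PDB columns 23-26: residue number
            value = scores.get(resnum, default)
            # PDB columns 61-66 hold the B-factor, formatted as 6.2f
            line = f"{line[:60]}{value:6.2f}{line[66:]}"
        out.append(line)
    return out
```

The same function can be run once per paralogue (or per clade), giving one coloured structure per score set to compare side by side.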

What I'm looking for is a sort of high-throughput pairwise alignment tool where I can align the two paralogues within each organism and then compare their shared sequence ID across a range of organisms. I'm not sure if I've explained that very clearly, but the output should look something like my diagram below (which uses made-up numbers). Of course I can just do a bunch of pairwise alignments manually, but I'm wondering whether there's a tool I can upload a bunch of sequences to and pairwise align them all in one go, without needing 500 tabs of EMBOSS Needle open. Maybe I'd be able to spot some patterns where in, idk, bony fish for example the paralogues are more closely related. I don't know, but I think it would be cool.

Organism	%ID	Alignment
Orangutan 1/2	51	GCNAL......
Dolphin 1/2	48	etc
Chimpanzee 1/2	44	etc
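
A table like the one above could be produced by a short script rather than by hand. The following is only a minimal sketch of the idea, not a replacement for EMBOSS Needle: it does a plain Needleman-Wunsch global alignment with a flat match/mismatch score and a linear gap penalty, whereas Needle uses a substitution matrix and affine gap penalties, so the percentages will not match Needle's exactly. The function names, the scoring values, and the input layout (a dictionary of organism name to the two paralogue sequences) are all assumptions made for illustration, and %ID is taken here as identical columns divided by alignment length. For real data, a library aligner such as Biopython's `Bio.Align.PairwiseAligner` with a protein substitution matrix would probably be the more realistic choice.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty (toy scoring)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(top, bottom):
    """Identical columns as a percentage of alignment length (gaps included)."""
    if not top:
        return 0.0
    same = sum(1 for x, y in zip(top, bottom) if x == y and x != "-")
    return 100.0 * same / len(top)


def paralogue_table(pairs):
    """pairs: dict of organism -> (paralogue 1 sequence, paralogue 2 sequence)."""
    rows = []
    for organism, (seq1, seq2) in pairs.items():
        top, bottom = global_align(seq1, seq2)
        rows.append((organism, percent_identity(top, bottom), top, bottom))
    return rows


if __name__ == "__main__":
    # Placeholder sequences only; these are not real paralogues.
    toy = {"Organism X": ("GCNALKT", "GCNVLT"), "Organism Y": ("MKTAYI", "MKSAYI")}
    for organism, pid, top, bottom in paralogue_table(toy):
        print(f"{organism} 1/2\t{pid:.0f}\t{top}\n\t\t{bottom}")
```

In practice the dictionary would be filled from a FASTA file holding both paralogues for each species, and the rows could then be sorted or grouped by clade to look for patterns.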

The main focus of my project is not evolution; it's supposed to be more like structural biology, but cos of the 'rona we've had to make do with some computational stuff, often in areas we're less familiar with, so any help is welcome. If anyone has any advice on conservation analysis with protein paralogues or identifying gene duplication events, or even some cool literature to read, that would be great!

Thanks everyone!

Author

_HelicalTwist_

Subreddit

r/bioinformatics

Posted: 4 years ago