TreeFam: Discussion, 2015-06-08T13:54:22Z
Number of Proteins in TreeFam (2013-07-14T15:51:13Z)
<div><p>Hi,<br>
the TreeFam help page states that TreeFam v9 is based on 2243919
protein sequences. In the alignments and trees I downloaded,
however, I can only find 1088507 unique proteins. I realize that
orphans or proteins which have no homologs in at least two other
species will not be included in these datasets, but 1155412
proteins being orphans or having an ortholog in only one other
species seems like quite a lot. My questions are: are the numbers
correct, and are there other potential causes that could lead to
proteins not being included in the alignments/trees?<br>
Best wishes,<br>
Andy</p></div>
Andreas Schüler
Number of Proteins in TreeFam (2013-07-22T15:03:18Z)
<div><p>Dear Andy,<br>
Thanks for your message, and sorry for the late reply.<br>
You are right, the number of orphan genes is quite high.</p>
<p>The answer is that some of those genes should be in a family
but currently are not.
We are looking into ways of building many new families in an
automated way.</p>
<p>But once again, currently this number is high and we expect to
reduce it soon.</p>
<p>Hope that answers your question.</p>
<p>Cheers,<br>
Fabian</p></div>
Fabian
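<div><p>The comparison in the question can be reproduced with a short script. The following is a minimal Python sketch, assuming the downloaded alignments are FASTA-formatted and that the first token of each header line is the protein identifier; neither is stated in the thread. The helper names <code>fasta_ids</code> and <code>count_unique_ids</code> are hypothetical, and the two totals are simply the figures quoted in the question.</p></div>

```python
from typing import Iterable, Set

# Figures quoted in the question above (not independently verified).
TOTAL_PROTEINS = 2243919   # TreeFam v9 total according to the help page
FOUND_IN_FILES = 1088507   # unique proteins found in alignments/trees


def fasta_ids(lines: Iterable[str]) -> Set[str]:
    """Return the set of sequence IDs in FASTA-formatted lines.

    Assumption: the ID is the first whitespace-delimited token
    after the leading '>' of each header line.
    """
    ids: Set[str] = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def count_unique_ids(alignments: Iterable[Iterable[str]]) -> int:
    """Count distinct sequence IDs across several alignments."""
    seen: Set[str] = set()
    for lines in alignments:
        seen |= fasta_ids(lines)
    return len(seen)


# Proteins in the database that appear in no alignment or tree.
missing = TOTAL_PROTEINS - FOUND_IN_FILES
missing_fraction = missing / TOTAL_PROTEINS

if __name__ == "__main__":
    print(f"missing: {missing} ({missing_fraction:.1%} of all proteins)")
```

<div><p>To apply this to real files, each alignment can be passed as an open file handle, since iterating over a file yields its lines. If the same protein appears under different identifiers in different files, the count would be inflated, so the ID convention is worth checking first.</p></div>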