| |
| 1. The database deals purely
with proteins found in plasma/serum. Why was it necessary
to create this resource? |
Researchers believe that the majority
of proteins in the body end up in the plasma at some point
in time. Plasma is also the most commonly tested clinical
sample: clinicians are able to detect diseases just by
testing for a few specific proteins. A vast amount of data
was generated before and during the Plasma Proteome Project.
Collecting, organising and presenting that data in a
user-friendly and queryable manner will bring
some sense and direction to future research. This is why
the Plasma Proteome Database was created.
|
| 2. What exactly do you mean by protein annotation? |
Our definition of protein annotation
involves gathering detailed information on each and every
protein and referencing it to a corresponding PubMed identifier
or to a reputable prediction program. Wherever possible,
sequences are shown with their corresponding accession
numbers; where this is not possible, they are referenced to
the literature that describes them. We have also cross-linked
our annotations to other databases, so that researchers can
view the annotations of others alongside ours and
gain a more comprehensive picture of the role
of each particular protein found in the plasma.
|
| 3. What are the various fields that have
been annotated? |
We have included information regarding
protein sequence and isoforms, post-translational modifications
(with special focus on proteolytic cleavage), sites of expression,
cellular component, and SNPs that fall within the coding
sequence of a gene.
|
| 4. I have noticed that the site/residue of
a modification mentioned in the referenced literature does
not always correspond to the site/residue mentioned in
the database. Why is this? |
If the modification site reported in
the literature does not correspond to the sequences
that we have entered in our annotation, we "map" the modification
site onto our sequences. This means that we look at the
motif in which the modification site lies and then find
the same motif in our sequence. If the paper mentions which
isoform was used, we map the modification only to that
isoform; if the paper makes no mention of this, we map
the modification to all the isoforms of that gene.
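
The mapping step described above can be sketched roughly as follows. This is a minimal illustration rather than the database's actual pipeline: the function names, the exact-match search and the toy sequences are all assumptions made for the example.

```python
def map_site(motif, site_offset, sequence):
    """Return 1-based positions of the modified residue in `sequence`.

    motif       -- residues surrounding the site, as reported in the paper
    site_offset -- 0-based index of the modified residue within `motif`
    Exact string matching is an assumption; a real pipeline might
    tolerate mismatches.
    """
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + site_offset + 1)
        start = sequence.find(motif, start + 1)
    return positions


def map_to_isoforms(motif, site_offset, isoforms, reported_isoform=None):
    """Map onto the named isoform if the paper gives one, otherwise onto all."""
    if reported_isoform is not None:
        targets = {reported_isoform: isoforms[reported_isoform]}
    else:
        targets = isoforms
    return {name: map_site(motif, site_offset, seq)
            for name, seq in targets.items()}


# Toy sequences, invented purely for illustration.
isoforms = {
    "isoform_1": "MKTAYSPEKLLG",
    "isoform_2": "MAAKTAYSPEKLLG",
    "isoform_3": "MKTAYLLG",  # lacks the motif, so nothing is mapped
}

# The paper reports a modified serine inside the motif "YSPEK" (offset 1)
# without naming an isoform, so every isoform is searched.
all_hits = map_to_isoforms("YSPEK", 1, isoforms)

# The paper names isoform_2, so only that isoform is searched.
one_hit = map_to_isoforms("YSPEK", 1, isoforms, reported_isoform="isoform_2")
```

Because the same motif can sit at a different position in each isoform, the residue number in the database may differ from the one given in the paper, even though both refer to the same site.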
|
| 5. Why has proteolytic cleavage been given
such importance? |
Proteolytic cleavage is particularly
important for several reasons. Each extracellular protein
must have some sort of signal sequence which directs
the protein to be secreted; these signals are usually cleaved
off, and only the "mature" protein is secreted. Alternatively,
a protein may be part of some larger structure, such
as the plasma membrane, and be released into the plasma
only on cleavage. Many cascades are activated by proteolytic
cleavage, the main ones being the blood coagulation cascade
and complement pathway activation. Given its predominance
as a signalling mechanism in the extracellular environment,
we have decided to highlight this particular modification.
|
| 6. What are cSNPs and why have you focussed
on them? |
Single nucleotide polymorphisms (SNPs)
can affect any part of the genome. SNPs which are found
in the coding region of genes are called cSNPs. We are
interested only in these because they affect the coding sequence
of the gene and can therefore change protein
structure and function.
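
As a rough illustration of the distinction, the sketch below checks whether a SNP position falls inside a coding interval and whether a codon change alters the encoded amino acid. The coordinates, function names and the three-entry codon table are assumptions made for the example; they do not describe how the database itself selects cSNPs.

```python
# Partial codon table: only the codons used in this example.
CODON_TO_AA = {"GAG": "E", "GAA": "E", "GTG": "V"}


def is_csnp(snp_pos, coding_intervals):
    """True if a 1-based position lies inside any coding interval (inclusive)."""
    return any(start <= snp_pos <= end for start, end in coding_intervals)


def classify_codon_change(ref_codon, alt_codon):
    """Label a codon change by whether it alters the encoded amino acid."""
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


# Hypothetical coding intervals for one gene (start, end).
coding_intervals = [(100, 250), (400, 520)]

in_coding = is_csnp(120, coding_intervals)       # inside the first interval
in_intron = is_csnp(300, coding_intervals)       # between the two intervals

changed = classify_codon_change("GAG", "GTG")    # Glu -> Val
unchanged = classify_codon_change("GAG", "GAA")  # Glu -> Glu
```

A cSNP does not always change the protein: a synonymous change leaves the amino acid as it was, which is why cSNPs "can", rather than "must", affect structure and function.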
|
| 7. Does the database follow standardisations
that have been set by Gene Ontology? |
Yes, all information related to molecular
function, biological process and cellular component is
GO-compatible.
|
| 8. I am from a commercial entity - what
do I need to do to download the database for commercial
purposes? |
Please send an e-mail.
|