UniProt Knowledgebase
Swiss-Prot Protein Knowledgebase
TrEMBL Protein Database

What's new in XML?
Release 2013_12 of 11-Dec-2013

Also read about forthcoming changes, and recent and forthcoming changes for the flat file version of the UniProt Knowledgebase.

Questions regarding UniProtKB XML should be directed to our Help Desk.

UniProt release 2013_08 of 24-Jul-2013

New element 'disease'

As of release 2013_05, the descriptions of human genetic diseases that have links to the Online Mendelian Inheritance in Man knowledgebase (OMIM) are structured to facilitate the retrieval of disease information from UniProtKB. We adapted the comment type 'disease' accordingly by adding a new element 'disease' to the XSD, as highlighted below:

    <xs:complexType name="commentType">
    ...
        <xs:sequence>
            <xs:choice minOccurs="0" maxOccurs="1">
            ...
                <xs:element name="disease">
                    <xs:annotation>
                        <xs:documentation>Used in 'disease' annotations.</xs:documentation>
                    </xs:annotation>
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="name" type="xs:string"/>
                            <xs:element name="acronym" type="xs:string"/>
                            <xs:element name="description" type="xs:string"/>
                            <xs:element name="dbReference" type="dbReferenceType"/>
                        </xs:sequence>
                        <xs:attribute name="id" type="xs:string" use="required"/>
                    </xs:complexType>
                </xs:element>

            </xs:choice>
            ...
            <xs:element name="text" type="evidencedStringType" minOccurs="0">
                <xs:annotation>
                    <xs:documentation>Used to store non-structured types of annotations, as well as optional free-text notes of structured types of annotations.</xs:documentation>
                </xs:annotation>
            </xs:element>

        </xs:sequence>

Please note that only the diseases that are linked to OMIM are described with the new element 'disease'. Other diseases will continue to be described in free text comments.
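A minimal sketch of how a consumer might read both forms of 'disease' comment with Python's standard xml.etree.ElementTree. The function name parse_disease_comment and the layout of the returned dictionary are illustrative assumptions, and the fragment is parsed without the namespace that complete UniProtKB XML files declare.

```python
import xml.etree.ElementTree as ET

def parse_disease_comment(comment):
    """Return a dict for one <comment type="disease"> element.

    Structured fields are filled only when a <disease> child is present;
    the free-text note is read from the optional <text> child.
    """
    result = {"id": None, "name": None, "acronym": None, "mim": None,
              "text": comment.findtext("text")}
    disease = comment.find("disease")
    if disease is not None:
        result["id"] = disease.get("id")
        result["name"] = disease.findtext("name")
        result["acronym"] = disease.findtext("acronym")
        ref = disease.find("dbReference")
        if ref is not None and ref.get("type") == "MIM":
            result["mim"] = ref.get("id")
    return result

# Hypothetical fragment modelled on the two comment forms described above.
SAMPLE = """<entry>
  <comment type="disease">
    <disease id="DI-01359">
      <name>Colorectal cancer</name>
      <acronym>CRC</acronym>
      <description>...</description>
      <dbReference type="MIM" id="114500"/>
    </disease>
    <text>The gene represented in this entry is involved in disease pathogenesis.</text>
  </comment>
  <comment type="disease">
    <text>Free-text disease description.</text>
  </comment>
</entry>"""

root = ET.fromstring(SAMPLE)
diseases = [parse_disease_comment(c)
            for c in root.findall("comment[@type='disease']")]
for d in diseases:
    print(d["name"] or "(free text only)", d["mim"])
```

Code that previously read only the text child keeps working, but it sees only the optional note for OMIM-linked diseases, not the disease description itself.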

Example:

<!-- Disease described in OMIM -->
<comment type="disease" evidence="1 2">
   <disease id="DI-01359">
      <name>Colorectal cancer</name>
      <acronym>CRC</acronym>
      <description>A complex disease characterized by ... family history.</description>
      <dbReference type="MIM" id="114500"/>
   </disease>
   <text>The gene represented in this entry is involved in disease pathogenesis.</text>
</comment>
<!-- Disease not described in OMIM -->
<comment type="disease" evidence="2">
   <text>A polymorphism in PDGFRL has ... and the gastrointestinal tract.</text>
</comment>

Change of the cross-reference GlycoSuiteDB to UniCarbKB

GlycoSuiteDB, an annotated and curated relational database of glycan structures, has been integrated into UniCarbKB, with a new user interface and added functionalities.

We therefore changed the corresponding resource abbreviation from GlycoSuiteDB to UniCarbKB.

This change did not affect the XSD, but may nevertheless require code changes.

Example:

Previous format:

<dbReference type="GlycoSuiteDB" id="P02763"/>

New format:

<dbReference type="UniCarbKB" id="P02763"/>
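One possible way to keep code working across both abbreviations is to translate retired names on read. This is only a sketch: the RENAMED_RESOURCES table and the resource_name helper are hypothetical names, not part of any UniProt tool.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from retired resource abbreviations to current ones;
# only the rename described in this section is listed.
RENAMED_RESOURCES = {"GlycoSuiteDB": "UniCarbKB"}

def resource_name(db_reference):
    """Return the dbReference type, translated to its current abbreviation."""
    name = db_reference.get("type")
    return RENAMED_RESOURCES.get(name, name)

old = ET.fromstring('<dbReference type="GlycoSuiteDB" id="P02763"/>')
new = ET.fromstring('<dbReference type="UniCarbKB" id="P02763"/>')
print(resource_name(old), resource_name(new))
```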

UniProtKB/Swiss-Prot is currently linked to this resource via cross-references (dbReference elements), and some relevant UniProtKB/Swiss-Prot entries also carry site-specific links via the id attribute of feature elements. We plan to increase the number of cross-linked entries, including more literature-based glycan data from UniCarbKB.

UniProt release 2013_03 of 06-Mar-2013

Change of the cross-references to the Gene3D database

The Gene3D database no longer provides names for its signatures. The property of type entry name that was previously shown in the cross-references has therefore been removed.

This change did not affect the XSD, but may nevertheless require code changes.

Examples:

Previous format:

<dbReference type="Gene3D" id="2.60.210.10">
  <property type="entry name" value="TRAF-type"/>
  <property type="match status" value="1"/>
</dbReference>
<dbReference type="Gene3D" id="3.30.40.10">
  <property type="entry name" value="Znf_RING/FYVE/PHD"/>
  <property type="match status" value="1"/>
</dbReference>

New format:

<dbReference type="Gene3D" id="2.60.210.10">
  <property type="match status" value="1"/>
</dbReference>
<dbReference type="Gene3D" id="3.30.40.10">
  <property type="match status" value="1"/>
</dbReference>

Introduction of 'geneType'

We have changed the XSD as highlighted below:

                <!-- REMOVED:
                <xs:element name="gene" minOccurs="0" maxOccurs="unbounded">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="name" type="geneNameType" maxOccurs="unbounded"/>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
                -->
                <xs:element name="gene" type="geneType" minOccurs="0" maxOccurs="unbounded"/>
    ...
    <!-- Gene definition begins -->
    <xs:complexType name="geneType">
        <xs:annotation>
            <xs:documentation>Describes a gene.
            Equivalent to the flat file GN-line.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="name" type="geneNameType" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="geneNameType">
        <xs:annotation>
            <xs:documentation>Describes different types of gene designations.
            Equivalent to the flat file GN-line.</xs:documentation>
        </xs:annotation>
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="evidence" type="intListType" use="optional"/>
                <xs:attribute name="type" use="required">
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:enumeration value="primary"/>
                            <xs:enumeration value="synonym"/>
                            <xs:enumeration value="ordered locus"/>
                            <xs:enumeration value="ORF"/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:attribute>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>
    <!-- Gene definition ends -->

This change has no effect on the XML representation, but may require code changes if code is auto-generated from the XSD.
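Hand-written parsers that address elements by name should be unaffected, as the following sketch suggests. The gene_names helper and the gene names in the sample are placeholders, and the fragment omits the namespace that complete UniProtKB XML files declare.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def gene_names(gene):
    """Group the <name> children of one <gene> element by their 'type'."""
    names = defaultdict(list)
    for name in gene.findall("name"):
        names[name.get("type")].append(name.text)
    return dict(names)

# Hypothetical gene element; the names are placeholders.
gene = ET.fromstring(
    '<gene>'
    '<name type="primary">abcA</name>'
    '<name type="synonym">abc1</name>'
    '<name type="synonym">abc2</name>'
    '<name type="ORF">ORF123</name>'
    '</gene>'
)
names = gene_names(gene)
print(names)
```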

UniProtKB release 2012_08 of 05-Sep-2012

New cross-references to UniPathway database

We have added cross-references to the UniPathway database. To represent these cross-references we use a new property of type reaction ID to store the optional UniPathway enzymatic reaction identifier.

This change did not affect the XSD, but may nevertheless require code changes.
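Because the reaction ID property is optional, reading code has to cope with its absence. A minimal sketch, assuming namespace-free fragments and a hypothetical helper named unipathway_refs:

```python
import xml.etree.ElementTree as ET

def unipathway_refs(entry):
    """Yield (pathway_id, reaction_id) pairs; reaction_id is None if absent."""
    for ref in entry.findall("dbReference[@type='UniPathway']"):
        prop = ref.find("property[@type='reaction ID']")
        yield ref.get("id"), (prop.get("value") if prop is not None else None)

SAMPLE = """<entry>
  <dbReference type="UniPathway" id="UPA00842"/>
  <dbReference type="UniPathway" id="UPA00842">
    <property type="reaction ID" value="UER00808"/>
  </dbReference>
</entry>"""

pairs = list(unipathway_refs(ET.fromstring(SAMPLE)))
print(pairs)
```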

Examples:

<dbReference type="UniPathway" id="UPA00842"/>

<dbReference type="UniPathway" id="UPA00842">
 <property type="reaction ID" value="UER00808"/>
</dbReference>

UniProtKB release 2012_06 of 13-Jun-2012

Changes to cross-references to PhosSite

The resource identifiers of the cross-references to the Phosphorylation Site Database for Archaea and Bacteria (PhosSite) have changed from a UniProtKB primary accession number to a Phosphorylation Site Database unique identifier for a phosphoprotein.

This change did not affect the XSD, but may nevertheless require code changes.

Example:

Previous format:

<dbReference type="PhosSite" id="P08839"/>

(Link: P08839)

New format:

<dbReference type="PhosSite" id="P0810428"/>

(Link: P0810428)

UniProtKB release 2012_02 of 22-Feb-2012

Change of the cross-reference GeneDB_Spombe to PomBase

The Schizosaccharomyces pombe GeneDB was replaced by PomBase, the new model organism database for the fission yeast Schizosaccharomyces pombe. We have therefore changed the corresponding dbReference type from GeneDB_Spombe to PomBase, and the identifiers are now prefixed with PomBase:.

This change did not affect the XSD, but may nevertheless require code changes.
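Code that needs the bare systematic identifier from either format could normalize it as sketched here. The helper name pombe_gene_id is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

PREFIX = "PomBase:"

def pombe_gene_id(db_reference):
    """Return the bare systematic identifier for either format, else None."""
    kind, ident = db_reference.get("type"), db_reference.get("id")
    if kind == "GeneDB_Spombe":
        return ident
    if kind == "PomBase":
        # Strip the prefix introduced with the new format, if present.
        return ident[len(PREFIX):] if ident.startswith(PREFIX) else ident
    return None

old = ET.fromstring('<dbReference type="GeneDB_Spombe" id="SPCC1223.10c"/>')
new = ET.fromstring('<dbReference type="PomBase" id="PomBase:SPCC1223.10c"/>')
print(pombe_gene_id(old), pombe_gene_id(new))
```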

Example:

Previous format:

<dbReference type="GeneDB_Spombe" id="SPCC1223.10c"/>

New format:

<dbReference type="PomBase" id="PomBase:SPCC1223.10c"/>

UniProtKB release 2012_01 of 25-Jan-2012

Change of representation of EC numbers

Previously, EC numbers were linked to protein names via the ref attribute of the protein name elements (recommendedName, alternativeName, submittedName) and the key attribute of the dbReference elements, as shown in this example:

<protein>
    <recommendedName ref="1">
        <fullName evidence="1">Diphthine synthase</fullName>
    </recommendedName>
</protein>
...
<dbReference type="EC" id="2.1.1.98" key="1" evidence="1"/>

To simplify the data processing for users who want to use EC numbers as synonyms of protein names, we have changed the representation of EC numbers by introducing a new ecNumber child element for protein names to display an EC number as a literal value. The dbReference element was kept, but its key attribute is no longer required. The example above has become:

<protein>
    <recommendedName>
        <fullName evidence="1">Diphthine synthase</fullName>
        <ecNumber evidence="1">2.1.1.98</ecNumber>
    </recommendedName>
</protein>
...
<dbReference type="EC" id="2.1.1.98"/>
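With the new representation, EC numbers can be read directly from each protein name without resolving ref and key links. A minimal sketch, in which names_with_ec is a hypothetical helper and only name elements that are direct children of the protein element are considered:

```python
import xml.etree.ElementTree as ET

NAME_TAGS = ("recommendedName", "alternativeName", "submittedName")

def names_with_ec(protein):
    """Return (full name, [EC numbers]) for each protein name element."""
    result = []
    for tag in NAME_TAGS:
        for name in protein.findall(tag):
            ecs = [ec.text for ec in name.findall("ecNumber")]
            result.append((name.findtext("fullName"), ecs))
    return result

protein = ET.fromstring("""<protein>
    <recommendedName>
        <fullName evidence="1">Diphthine synthase</fullName>
        <ecNumber evidence="1">2.1.1.98</ecNumber>
    </recommendedName>
</protein>""")
pairs = names_with_ec(protein)
print(pairs)
```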

The XSD was changed as highlighted below:

    <xs:group name="proteinNameGroup">
        <xs:sequence>
            <xs:element name="recommendedName" minOccurs="0">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="fullName" type="evidencedStringType"/>
                        <xs:element name="shortName" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                        <xs:element name="ecNumber" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                    </xs:sequence>
                    <!-- REMOVED: <xs:attribute name="ref" type="xs:string" use="optional"/> -->
                </xs:complexType>
            </xs:element>
            <xs:element name="alternativeName" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="fullName" type="evidencedStringType" minOccurs="0"/>
                        <xs:element name="shortName" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                        <xs:element name="ecNumber" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                    </xs:sequence>
                    <!-- REMOVED: <xs:attribute name="ref" type="xs:string" use="optional"/> -->
                </xs:complexType>
            </xs:element>
            <xs:element name="submittedName" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="fullName" type="evidencedStringType"/>
                        <xs:element name="ecNumber" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                    </xs:sequence>
                    <!-- REMOVED: <xs:attribute name="ref" type="xs:string" use="optional"/> -->
                </xs:complexType>
            </xs:element>
...
    <xs:complexType name="dbReferenceType">
        <xs:annotation>
            <xs:documentation>Describes a database cross-reference.
            Equivalent to the flat file DR-line.
            </xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="property" type="propertyType" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
        <xs:attribute name="type" type="xs:string" use="required">
            <xs:annotation>
                <xs:documentation>Describes the name of the database.</xs:documentation>
            </xs:annotation>
        </xs:attribute>
        <xs:attribute name="id" type="xs:string" use="required">
            <xs:annotation>
                <xs:documentation>Describes a unique database identifier.</xs:documentation>
            </xs:annotation>
        </xs:attribute>
        <xs:attribute name="evidence" type="intListType" use="optional"/>
        <!-- REMOVED: <xs:attribute name="key" type="xs:string" use="optional"/> -->
    </xs:complexType>

UniProtKB release 2011_08 of 27-Jul-2011

Change of 'integer' to 'int'

We have replaced all occurrences of the http://www.w3.org/2001/XMLSchema.xsd type integer with int, so that JAXB users can map the values to a Java primitive data type.

Change of 'evidenceType'

Each UniProtKB entry combines information from a wide range of sources, including data imported from DDBJ/ENA/GenBank nucleotide records, data imported from other databases, automatic annotation predictions, and manually curated information from the scientific literature and sequence analysis programs. Because of this variety of data sources, it is vital that users can trace the origin of each piece of information in an entry and evaluate it. The UniProt Consortium has begun to address this challenge with an evidence attribution system that attaches an evidence to most data items in a UniProtKB entry, identifying the source(s) and/or method(s) used to generate the data. In the XML version of UniProtKB, evidences were described by an evidenceType with the following XSD:

    <xs:complexType name="evidenceType">
        <xs:annotation>
            <xs:documentation>Describes the evidence for an annotation.
            No flat file equivalent.</xs:documentation>
        </xs:annotation>
        <xs:attribute name="category" use="required">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="curator"/>
                    <xs:enumeration value="import"/>
                    <xs:enumeration value="program"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="type" type="xs:string" use="required"/>
        <xs:attribute name="attribute" type="xs:string" use="optional"/>
        <xs:attribute name="date" type="xs:date" use="required"/>
        <xs:attribute name="key" type="xs:string" use="required"/>
    </xs:complexType>

Where:

Going forward, we have standardized these evidences using the widely known Evidence Code Ontology (ECO). In the future, we will also import evidences from the databases from which we import data, where they provide such information. To achieve this, we have modified the evidenceType in the following way:

    <xs:complexType name="evidenceType">
        <xs:annotation>
            <xs:documentation>Describes the evidence for an annotation.
            No flat file equivalent.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="source" type="sourceType" minOccurs="0"/>
            <xs:element name="importedFrom" type="importedFromType" minOccurs="0"/>
        </xs:sequence>
        <xs:attribute name="type" type="xs:string" use="required">
            <xs:annotation>
                <xs:documentation>Describes the type of an evidence using the Evidence Code Ontology (http://www.obofoundry.org/cgi-bin/detail.cgi?id=evidence_code).</xs:documentation>
            </xs:annotation>
        </xs:attribute>
        <xs:attribute name="key" type="xs:int" use="required">
            <xs:annotation>
                <xs:documentation>A unique key to link annotations (via 'evidence' attributes) to evidences.</xs:documentation>
            </xs:annotation>
        </xs:attribute>
        <!-- REMOVED:
        <xs:attribute name="category" use="required">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="curator"/>
                    <xs:enumeration value="import"/>
                    <xs:enumeration value="program"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="attribute" type="xs:string" use="optional"/>
        <xs:attribute name="date" type="xs:date" use="required"/>
        -->
    </xs:complexType>
    <xs:complexType name="sourceType">
        <xs:annotation>
            <xs:documentation>Describes the source of the data using a database cross-reference (or a 'ref' attribute when the source cannot be found in a public data source, such as PubMed, and is cited only within the UniProtKB entry).</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="dbReference" type="dbReferenceType" minOccurs="0"/>
        </xs:sequence>
        <xs:attribute name="ref" type="xs:int" use="optional"/>
    </xs:complexType>
    <xs:complexType name="importedFromType">
        <xs:annotation>
            <xs:documentation>Describes the source of the evidence, when it is not assigned by UniProt, but imported from an external database.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="dbReference" type="dbReferenceType"/>
        </xs:sequence>
    </xs:complexType>

As a consequence of this change:

Examples:

Previous format:

<evidence key="EC1" category="curator" type="Literature" attribute="PubMed=11707463" date="2010-07-01"/>

<evidence key="EC2" category="curator" type="Similarity" attribute="P30803" date="2005-12-13"/>

<evidence key="EC3" category="curator" type="Curator" date="2005-12-13"/>

<evidence key="EI4" category="import" type="EMBL" attribute="AAY22750.1" date="2005-08-19"/>

<evidence key="EA5" category="program" type="Rulebase" attribute="RU000381V4.170S0063164" date="2011-02-01"/>

New format:

<evidence key="1" type="ECO:0000006">
    <source>
        <dbReference type="PubMed" id="11707463"/>
    </source>
</evidence>

<evidence key="2" type="ECO:0000044">
    <source>
        <dbReference type="UniProtKB" id="P30803"/>
    </source>
</evidence>

<evidence key="3" type="ECO:0000001"/>

<evidence key="4" type="ECO:0000313">
    <source>
        <dbReference type="EMBL" id="AAY22750.1"/>
    </source>
</evidence>

<evidence key="5" type="ECO:0000203">
    <source>
        <dbReference type="Rulebase" id="RU000381V4.170S0063164"/>
    </source>
</evidence>

Evidence imported from another database record:

<evidence key="6" type="ECO:0000006">
    <source>
        <dbReference type="PubMed" id="18212739"/>
    </source>
    <importedFrom>
        <dbReference type="IntAct" id="EBI-359343,EBI-79792"/>
    </importedFrom>
</evidence>

Please note that some of the ECO codes used in the examples above may change in the future. The ECO is going to define new codes to help the UniProt Consortium to describe different kinds of "automatic assertions" (currently most are ECO:0000203). These will be documented as soon as they become available.
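The new integer keys make it straightforward to resolve the evidence attribute of an annotation, which holds a space-separated list of keys. The following is a sketch under stated assumptions: evidence_index and evidences_for are hypothetical helpers, the annotation text is invented, and the fragment omits the namespace of complete UniProtKB XML files.

```python
import xml.etree.ElementTree as ET

def evidence_index(entry):
    """Map each evidence key to (ECO code, source ref, imported-from ref)."""
    index = {}
    for ev in entry.findall("evidence"):
        src = ev.find("source/dbReference")
        imp = ev.find("importedFrom/dbReference")
        index[int(ev.get("key"))] = (
            ev.get("type"),
            (src.get("type"), src.get("id")) if src is not None else None,
            (imp.get("type"), imp.get("id")) if imp is not None else None,
        )
    return index

def evidences_for(element, index):
    """Resolve the space-separated 'evidence' attribute of an annotation."""
    keys = element.get("evidence", "").split()
    return [index[int(k)] for k in keys]

SAMPLE = """<entry>
  <comment type="function" evidence="1 3">
    <text>Hypothetical annotation text.</text>
  </comment>
  <evidence key="1" type="ECO:0000006">
    <source><dbReference type="PubMed" id="11707463"/></source>
  </evidence>
  <evidence key="3" type="ECO:0000001"/>
</entry>"""

entry = ET.fromstring(SAMPLE)
index = evidence_index(entry)
resolved = evidences_for(entry.find("comment"), index)
print(resolved)
```

Since the ECO codes may change, it seems safer to treat the type value as an opaque identifier than to hard-code a fixed set of codes.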

UniProtKB release 2011_01 of 11-Jan-2011

New cross-references to Allergome database

We have added cross-references to the Allergome database. To represent these cross-references we have introduced a new property of type allergen name to store the names of the allergens.

This change did not affect the XSD, but may nevertheless require code changes.

Example:

<dbReference type="Allergome" id="2" key="175">
  <property type="allergen name" value="Aca s 13" />
</dbReference>
<dbReference type="Allergome" id="3051" key="174">
  <property type="allergen name" value="Aca s 13.0101" />
</dbReference>

UniProtKB release 2010_12 of 30-Nov-2010

Changes to cross-references to RefSeq

We have introduced a new property of type nucleotide sequence ID to the cross-reference to the NCBI Reference Sequences database to show the RefSeq nucleotide accession number.

This change did not affect the XSD, but may nevertheless require code changes.

Example:

Previous format:

<dbReference type="RefSeq" id="AP_000992.1" key="33" />
<dbReference type="RefSeq" id="NP_414874.1" key="34" />

New format:

<dbReference type="RefSeq" id="AP_000992.1" key="33">
  <property type="nucleotide sequence ID" value="AC_000091.1" />
</dbReference>
<dbReference type="RefSeq" id="NP_414874.1" key="34">
  <property type="nucleotide sequence ID" value="NC_000913.2" />
</dbReference>

UniProtKB release 2010_10 of 05-Oct-2010

Changes to cross-references to Ensembl and EnsemblGenomes databases

The property of type organism name was removed from cross-references to the Ensembl database, because it is no longer necessary to build a valid URL.

The property of type gene designation was replaced by a property of type gene ID to indicate that this element describes a unique gene identifier without biological meaning in the Ensembl and EnsemblGenomes databases.

These changes did not affect the XSD, but may nevertheless require code changes.
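Code that must read files from before and after this release could look up the gene identifier under both property names, as in this sketch. The helper ensembl_gene_id is hypothetical and is meant only for Ensembl cross-references, since other resources use gene designation for a biological gene name.

```python
import xml.etree.ElementTree as ET

def ensembl_gene_id(db_reference):
    """Return the Ensembl gene identifier from either property layout."""
    for prop_type in ("gene ID", "gene designation"):  # new name first
        for prop in db_reference.findall("property"):
            if prop.get("type") == prop_type:
                return prop.get("value")
    return None

old = ET.fromstring("""<dbReference type="Ensembl" id="ENST00000220809" key="174">
  <property type="protein sequence ID" value="ENSP00000220809" />
  <property type="gene designation" value="ENSG00000104368" />
  <property type="organism name" value="Homo sapiens" />
</dbReference>""")
new = ET.fromstring("""<dbReference type="Ensembl" id="ENST00000220809" key="174">
  <property type="protein sequence ID" value="ENSP00000220809" />
  <property type="gene ID" value="ENSG00000104368" />
</dbReference>""")
print(ensembl_gene_id(old), ensembl_gene_id(new))
```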

Example:

Previous format:

<dbReference type="Ensembl" id="ENST00000220809" key="174">
  <property type="protein sequence ID" value="ENSP00000220809" />
  <property type="gene designation" value="ENSG00000104368" />
  <property type="organism name" value="Homo sapiens" />
</dbReference>

New format:

<dbReference type="Ensembl" id="ENST00000220809" key="174">
  <property type="protein sequence ID" value="ENSP00000220809" />
  <property type="gene ID" value="ENSG00000104368" />
</dbReference>

UniProtKB release 2010_08 of 13-Jul-2010

Removal of WormPep cross-references and changes to cross-references to WormBase

We have removed the cross-references to WormPep and changed the format of the WormBase cross-references. For details of this change, please read the UniProt document What's new?. This change did not affect the XSD, but may nevertheless require code changes.

Example:

Previous format:

<dbReference type="WormBase" id="WBGene00012019" key="52">
  <property type="gene designation" value="dkf-2" />
</dbReference>
<dbReference type="WormPep" id="T25E12.4a" key="53">
  <property type="accession" value="CE18967" />
</dbReference>
<dbReference type="WormPep" id="T25E12.4b" key="54">
  <property type="accession" value="CE18283" />
</dbReference>
<dbReference type="WormPep" id="T25E12.4c" key="55">
  <property type="accession" value="CE42507" />
</dbReference>

New format:

<dbReference type="WormBase" id="T25E12.4a" key="52">
  <property type="protein sequence ID" value="CE18967" />
  <property type="gene ID" value="WBGene00012019" />
  <property type="gene designation" value="dkf-2" />
</dbReference>
<dbReference type="WormBase" id="T25E12.4b" key="53">
  <property type="protein sequence ID" value="CE18283" />
  <property type="gene ID" value="WBGene00012019" />
  <property type="gene designation" value="dkf-2" />
</dbReference>
<dbReference type="WormBase" id="T25E12.4c" key="54">
  <property type="protein sequence ID" value="CE42507" />
  <property type="gene ID" value="WBGene00012019" />
  <property type="gene designation" value="dkf-2" />
</dbReference>

UniProtKB release 2010_07 of 15-Jun-2010

New feature type 'intramembrane region'

A new feature key, INTRAMEM, has been introduced in the flat file format of UniProtKB entries. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we modified the XSD type featureType in the following way:

    <xs:complexType name="featureType">
    ...
        <xs:attribute name="type" use="required">
        ...
                    <xs:enumeration value="intramembrane region"/>
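A sketch of how the new feature type might be selected from an entry. It assumes that a feature carries a location child with begin and end elements holding a position attribute, which is not shown in this section; the helper name and the sample coordinates are invented.

```python
import xml.etree.ElementTree as ET

def intramembrane_regions(entry):
    """Return (begin, end) positions of 'intramembrane region' features."""
    regions = []
    for feature in entry.findall("feature[@type='intramembrane region']"):
        begin = feature.find("location/begin")
        end = feature.find("location/end")
        if begin is None or end is None:
            continue  # skip features without a begin/end range
        b, e = begin.get("position"), end.get("position")
        if b is not None and e is not None:
            regions.append((int(b), int(e)))
    return regions

# Hypothetical entry fragment; coordinates are placeholders.
SAMPLE = """<entry>
  <feature type="transmembrane region" description="Helical">
    <location><begin position="10"/><end position="30"/></location>
  </feature>
  <feature type="intramembrane region">
    <location><begin position="45"/><end position="60"/></location>
  </feature>
</entry>"""

regions = intramembrane_regions(ET.fromstring(SAMPLE))
print(regions)
```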

UniProtKB release 15.15 of 02-Mar-2010

Change of 'positionType'

To allow the representation of evidences on individual sequence coordinates of RNA editing comments, we have added an optional evidence attribute to the XSD type positionType:

    <xs:complexType name="positionType">
        ...
        <xs:attribute name="evidence" type="xs:string" use="optional"/>
    </xs:complexType>

Change of 'bpcCommentGroup'

To allow the representation of evidences on biophysicochemical properties comments, we have modified the XSD type bpcCommentGroup by replacing all type="xs:string" by type="evidencedStringType":

    <xs:group name="bpcCommentGroup">
        <xs:sequence>
            <xs:element name="absorption" minOccurs="0" maxOccurs="1">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="max" type="evidencedStringType" minOccurs="0" maxOccurs="1"/>
                        <xs:element name="text" type="evidencedStringType" minOccurs="0" maxOccurs="1"/>
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
            <xs:element name="kinetics" minOccurs="0" maxOccurs="1">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="KM" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                        <xs:element name="Vmax" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                        <xs:element name="text" type="evidencedStringType" minOccurs="0" maxOccurs="1"/>
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
            <xs:element name="phDependence" type="evidencedStringType" minOccurs="0" maxOccurs="1"/>
            <xs:element name="redoxPotential" type="evidencedStringType" minOccurs="0" maxOccurs="1"/>
            <xs:element name="temperatureDependence" type="evidencedStringType" minOccurs="0" maxOccurs="1"/>
        </xs:sequence>
    </xs:group>

Change of 'organism', 'organismType' and 'sourceDataType'

UniProtKB entries no longer describe proteins of several organisms. We have adapted the XSD in the following way:

    <xs:element name="entry">
    ...
    <xs:complexType name="sourceDataType">
        <xs:choice maxOccurs="unbounded">
            <!-- 
            <xs:element name="species">
                <xs:complexType>
                    <xs:simpleContent>
                        <xs:extension base="xs:string">
                            <xs:attribute name="ref" type="xs:string" use="optional"/>
                        </xs:extension>
                    </xs:simpleContent>
                </xs:complexType>
            </xs:element>
            -->
            <xs:element name="strain">
            ...
    ...
    <xs:complexType name="organismType">
        ...
   <!-- <xs:attribute name="key" type="xs:string" use="required"/> -->
        <xs:attribute name="evidence" type="xs:string" use="optional"/>
    </xs:complexType>
    ...
           <!-- <xs:element name="organism" type="organismType" maxOccurs="unbounded"/> -->
                <xs:element name="organism" type="organismType"/>

Changes for consistency

For reasons of consistency, we have made the following changes:

UniProtKB release 15.13 of 19-Jan-2010

Changes to cross-references to HAMAP

We have modified the cross-references to the HAMAP database. For details of this change, please read the UniProt document What's new?. This change did not affect the XSD, but may nevertheless require code changes.
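In the examples for this change, a new entry name property appears and the flag property moves behind match status, so code that reads properties by position may break. Looking them up by type avoids that. A minimal sketch, where properties is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

def properties(db_reference):
    """Return the <property> children as a dict keyed by property type."""
    return {p.get("type"): p.get("value")
            for p in db_reference.findall("property")}

old = ET.fromstring("""<dbReference type="HAMAP" id="MF_00006" key="18">
  <property type="flag" value="fused" />
  <property type="match status" value="1" />
</dbReference>""")
new = ET.fromstring("""<dbReference type="HAMAP" id="MF_00006" key="18">
  <property type="entry name" value="Arg_succ_lyase" />
  <property type="match status" value="1" />
  <property type="flag" value="fused" />
</dbReference>""")
old_props, new_props = properties(old), properties(new)
print(new_props.get("entry name"), new_props["flag"], old_props.get("entry name"))
```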

Previous format:

<dbReference type="HAMAP" id="MF_00326" key="182">
  <property type="match status" value="1" />
</dbReference>

<dbReference type="HAMAP" id="MF_00006" key="18">
  <property type="flag" value="fused" />
  <property type="match status" value="1" />
</dbReference>

<dbReference type="HAMAP" id="MF_01105" key="19">
  <property type="flag" value="atypical/fused" />
  <property type="match status" value="1" />
</dbReference>

New format:

<dbReference type="HAMAP" id="MF_00326" key="182">
  <property type="entry name" value="Ribosomal_L7Ae" />
  <property type="match status" value="1" />
</dbReference>

<dbReference type="HAMAP" id="MF_00006" key="18">
  <property type="entry name" value="Arg_succ_lyase" />
  <property type="match status" value="1" />
  <property type="flag" value="fused" />
</dbReference>

<dbReference type="HAMAP" id="MF_01105" key="19">
  <property type="entry name" value="N-acetyl_glu_synth" />
  <property type="match status" value="1" />
  <property type="flag" value="atypical/fused" />
</dbReference>

Changes to cross-references to HOGENOM

We have previously cross-referenced the HOGENOM database via UniProtKB accession numbers. These have been replaced by HOGENOM identifiers. This change did not affect the XSD, but may nevertheless require code changes.

Previous format:

<dbReference type="HOGENOM" id="Q9D8H7" key="30" />

New format:

<dbReference type="HOGENOM" id="HBG025762" key="30" />

UniProtKB release 15.10 of 03-Nov-2009

Changes to cross-references to OMA

We have previously cross-referenced the OMA database via UniProtKB accession numbers. These have been replaced by OMA group fingerprints. This change did not affect the XSD, but may nevertheless require code changes.

Previous format:

<dbReference type="OMA" id="P39899" key="31">
  <property type="fingerprint" value="NEELMRR" />
</dbReference>

New format:

<dbReference type="OMA" id="NEELMRR" key="31" />

UniProtKB release 15.6 of 28-Jul-2009

Changes to cross-references to Ensembl

We have previously cross-referenced the Ensembl database at the level of the gene via Ensembl's gene identifiers. To provide more detailed cross-referencing, we now link to Ensembl at the level of gene transcripts and corresponding peptides using Ensembl's transcript and peptide identifiers. This change did not affect the XSD, but may nevertheless require code changes because the gene identifier was moved from the dbReference's id attribute to a property element.

Previous format:

<dbReference type="Ensembl" id="ENSG00000104368" key="174">
  <property type="organism name" value="Homo sapiens" />
</dbReference>

New format:

<dbReference type="Ensembl" id="ENST00000220809" key="174">
  <property type="protein sequence ID" value="ENSP00000220809" />
  <property type="gene designation" value="ENSG00000104368" />
  <property type="organism name" value="Homo sapiens" />
</dbReference>
<dbReference type="Ensembl" id="ENST00000270187" key="175">
  <property type="protein sequence ID" value="ENSP00000270187" />
  <property type="gene designation" value="ENSG00000104368" />
  <property type="organism name" value="Homo sapiens" />
</dbReference>
<dbReference type="Ensembl" id="ENST00000270189" key="176">
  <property type="protein sequence ID" value="ENSP00000270189" />
  <property type="gene designation" value="ENSG00000104368" />
  <property type="organism name" value="Homo sapiens" />
</dbReference>
<dbReference type="Ensembl" id="ENST00000352041" key="177">
  <property type="protein sequence ID" value="ENSP00000270188" />
  <property type="gene designation" value="ENSG00000104368" />
  <property type="organism name" value="Homo sapiens" />
</dbReference>

UniProtKB release 15.0 of 24-Mar-2009

Change of 'geneLocationType'

The controlled vocabulary for organelles has changed in the flat file format of UniProtKB entries. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we changed the geneLocationType enumeration values in the XSD as shown below (the removed value is commented out):

    
    <xs:complexType name="geneLocationType">
        ...
        <xs:attribute name="type" use="required">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="apicoplast"/>
                    <xs:enumeration value="chloroplast"/>
                    <!-- <xs:enumeration value="chromatophore"/>  -->
                    <xs:enumeration value="cyanelle"/>
                    <xs:enumeration value="hydrogenosome"/>
                    <xs:enumeration value="mitochondrion"/>
                    <xs:enumeration value="non-photosynthetic plastid"/>
                    <xs:enumeration value="nucleomorph"/>
                    <xs:enumeration value="organellar chromatophore"/>
                    <xs:enumeration value="plasmid"/>
                    <xs:enumeration value="plastid"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>

Example:

  <geneLocation type="organellar chromatophore"/>
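
A consumer that wants to catch entries still using the retired value could check the type attribute against the enumeration. The following is a minimal Python sketch, not part of the UniProt distribution; the function name check_gene_location is hypothetical, and the set of values is simply copied from the excerpt above.

```python
import xml.etree.ElementTree as ET

# Enumeration values copied from the geneLocationType excerpt above
# (release 15.0: 'chromatophore' is no longer listed).
GENE_LOCATION_TYPES = {
    "apicoplast", "chloroplast", "cyanelle", "hydrogenosome",
    "mitochondrion", "non-photosynthetic plastid", "nucleomorph",
    "organellar chromatophore", "plasmid", "plastid",
}

def check_gene_location(xml_fragment):
    """Return the type attribute if it is an allowed value, else raise ValueError."""
    element = ET.fromstring(xml_fragment)
    value = element.get("type")
    if value not in GENE_LOCATION_TYPES:
        raise ValueError(f"unexpected geneLocation type: {value!r}")
    return value

new_style = check_gene_location('<geneLocation type="organellar chromatophore"/>')
```

Passing a fragment with type="chromatophore" raises ValueError in this sketch, which is one way to spot files produced before this release.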
UniProtKB release 14.7 of 20-Jan-2009

New comment type 'disruption phenotype'

A new comment topic, DISRUPTION PHENOTYPE, has been introduced in the flat file format of UniProtKB entries. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we modified the XSD type commentType in the following way:

     <xs:complexType name="commentType">
     ...
         <xs:attribute name="type" use="required">
         ...
                     <xs:enumeration value="disruption phenotype"/>
UniProtKB release 14.0 of 22-Jul-2008

Changes of 'commentType'

To increase the consistency of the different comment types, we changed the XSD type <commentType> in the following way (changes are highlighted in red):

    <xs:complexType name="commentType">
        ...
        <xs:sequence>
       <!-- <xs:element name="text" type="xs:string" minOccurs="0" maxOccurs="1">
                <xs:annotation>
                    <xs:documentation>If a CC line type does not have a defined structure,
                    the text of this comment is stored in the element.
                    </xs:documentation>
                </xs:annotation>
            </xs:element> -->
       <!-- <xs:group ref="bpcCommentGroup"/> -->
            <xs:choice minOccurs="0" maxOccurs="1">
                <xs:group ref="bpcCommentGroup"/>
            ...
            </xs:choice>
            <xs:element name="location" type="locationType" minOccurs="0" maxOccurs="unbounded">
                <xs:annotation>
                    <xs:documentation>Used in 'mass spectrometry' and 'sequence caution' comments.</xs:documentation>
                </xs:annotation>
            </xs:element>          
       <!-- <xs:element name="note" type="xs:string" minOccurs="0" maxOccurs="1">
                <xs:annotation>
                    <xs:documentation>If a CC line type contains a 'Note=',
                    the text of that note is stored in this element.
                    </xs:documentation>
                </xs:annotation>
            </xs:element> -->
            <xs:element name="text" type="evidencedStringType" minOccurs="0">
                <xs:annotation>
                    <xs:documentation>Used to store the contents of non-structured comment types,
                    as well as the contents of the flat file 'Note=' field of structured comment types.
                    </xs:documentation>
                </xs:annotation>
            </xs:element>
        </xs:sequence>
        ...
   <!-- <xs:attribute name="status" type="xs:string" use="optional">
            <xs:annotation>
                <xs:documentation>Some comments have a status reflecting their reliability (By similarity, Potential and Probable).
                </xs:documentation>
            </xs:annotation>
        </xs:attribute> -->
        ...
   <!-- <xs:attribute name="evidence" type="xs:string" use="optional"/> -->
    </xs:complexType>
The XSD type evidencedStringType is defined as follows:
    <xs:complexType name="evidencedStringType">
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="evidence" type="xs:string" use="optional"/>
                <xs:attribute name="status" use="optional">
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:enumeration value="By similarity"/>
                            <xs:enumeration value="Probable"/>
                            <xs:enumeration value="Potential"/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:attribute>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>

Examples:

From

  <comment type="sequence caution">
    <conflict type="erroneous gene model prediction">
      <sequence version="1" resource="EMBL-CDS" id="BAA97015"/>
    </conflict>
    <note>The predicted gene At5g49940 has been split into 2 genes: At5g49940 and At5g49945.</note>
  </comment>

To

  <comment type="sequence caution">
    <conflict type="erroneous gene model prediction">
      <sequence version="1" resource="EMBL-CDS" id="BAA97015"/>
    </conflict>
    <text>The predicted gene At5g49940 has been split into 2 genes: At5g49940 and At5g49945.</text>
  </comment>

From

  <comment type="function" status="By similarity" evidence="EA3">
    <text>Cytochrome c oxidase is the component of the respiratory chain.</text>
  </comment>

To

  <comment type="function">
    <text status="By similarity" evidence="EA3">Cytochrome c oxidase is the component of the respiratory chain.</text>
  </comment>
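
Code that has to read files from both before and after this change could look for the status and evidence attributes on the text element first and fall back to the comment element. The sketch below is only an illustration under that assumption; the function name is hypothetical, and it works on namespace-free fragments like the examples above (real files typically declare a default namespace, which is ignored here).

```python
import xml.etree.ElementTree as ET

OLD = ('<comment type="function" status="By similarity" evidence="EA3">'
       '<text>Cytochrome c oxidase is the component of the respiratory chain.</text>'
       '</comment>')
NEW = ('<comment type="function">'
       '<text status="By similarity" evidence="EA3">'
       'Cytochrome c oxidase is the component of the respiratory chain.</text>'
       '</comment>')

def comment_text_with_evidence(comment_xml):
    """Return text, status and evidence of a comment in either layout."""
    comment = ET.fromstring(comment_xml)
    text = comment.find("text")
    if text is None:
        # Before release 14.0, structured comments kept free text in <note>.
        text = comment.find("note")
    if text is None:
        return None
    return {
        "type": comment.get("type"),
        "text": text.text,
        # New layout: attributes on the text element; old layout: on the comment.
        "status": text.get("status") or comment.get("status"),
        "evidence": text.get("evidence") or comment.get("evidence"),
    }

old_view = comment_text_with_evidence(OLD)
new_view = comment_text_with_evidence(NEW)
```

Both calls yield the same dictionary, so downstream code does not need to know which schema version produced the file.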
Structuring of comment type 'subcellular location'

A new controlled vocabulary has been introduced in order to structure subcellular location comments. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we modified the XSD type commentType as shown in red:

    <xs:complexType name="commentType">
    ...
        <xs:sequence>
        ...
           <xs:choice minOccurs="0" maxOccurs="1">
           ...
               <xs:sequence>
                   <xs:annotation>
                       <xs:documentation>Used in 'subcellular location' comments.</xs:documentation>
                   </xs:annotation>
                   <xs:element name="molecule" type="xs:string" minOccurs="0" maxOccurs="1"/>
                   <xs:element name="subcellularLocation" type="subcellularLocationType" minOccurs="1" maxOccurs="unbounded"/>
               </xs:sequence>
           ...
The XSD type subcellularLocationType is defined as follows:
    <xs:complexType name="subcellularLocationType">
        <xs:sequence>
            <xs:element name="location" type="evidencedStringType" minOccurs="1" maxOccurs="unbounded"/>
            <xs:element name="topology" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="orientation" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>

Examples:

  <comment type="subcellular location">
    <subcellularLocation>
      <location evidence="EA3">Mitochondrion inner membrane</location>
      <topology evidence="EA3" status="By similarity">Multi-pass membrane protein</topology>
    </subcellularLocation>
  </comment>
  <comment type="subcellular location">
    <subcellularLocation>
      <location>Cytoplasm</location>
    </subcellularLocation>
    <subcellularLocation>
      <location>Endoplasmic reticulum membrane</location>
      <topology>Peripheral membrane protein</topology>
    </subcellularLocation>
    <subcellularLocation>
      <location>Golgi apparatus membrane</location>
      <topology>Peripheral membrane protein</topology>
    </subcellularLocation>
  </comment>
  <comment type="subcellular location">
    <subcellularLocation>
      <location>Cell membrane</location>
      <topology status="By similarity">Peripheral membrane protein</topology>
    </subcellularLocation>
    <subcellularLocation>
      <location status="By similarity">Secreted</location>
    </subcellularLocation>
    <text>The last 22 C-terminal amino acids may participate in cell membrane attachment.</text>
  </comment>
  <comment type="subcellular location">
    <molecule>Isoform 2</molecule>
    <subcellularLocation>
      <location status="Probable">Cytoplasm</location>
    </subcellularLocation>
  </comment>
  <comment type="subcellular location">
    <subcellularLocation>
      <location>Golgi apparatus</location>
      <location>trans-Golgi network membrane</location>
      <topology status="By similarity">Multi-pass membrane protein</topology>
    </subcellularLocation>
    <text>Predominantly found in the trans-Golgi network (TGN). Not redistributed to the plasma membrane in response to elevated copper levels.</text>
  </comment>
  <comment type="subcellular location">
    <molecule>Isoform 2</molecule>
    <subcellularLocation>
      <location>Cytoplasm</location>
    </subcellularLocation>
  </comment>
  <comment type="subcellular location">
    <molecule>WND/140 kDa</molecule>
    <subcellularLocation>
      <location>Mitochondrion</location>
    </subcellularLocation>
  </comment>
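
As a rough illustration of how the structured comment might be consumed, the Python sketch below flattens one 'subcellular location' comment into plain dictionaries. The function name and the output layout are hypothetical choices, and the fragment is parsed without a namespace, as in the examples above.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<comment type="subcellular location">
  <subcellularLocation>
    <location>Cell membrane</location>
    <topology status="By similarity">Peripheral membrane protein</topology>
  </subcellularLocation>
  <subcellularLocation>
    <location status="By similarity">Secreted</location>
  </subcellularLocation>
  <text>The last 22 C-terminal amino acids may participate in cell membrane attachment.</text>
</comment>"""

def parse_subcellular_comment(comment_xml):
    """Collect molecule, locations and free-text note of one comment."""
    comment = ET.fromstring(comment_xml)

    def values(parent, tag):
        # Each child is an evidencedStringType: text plus optional status.
        return [(child.text, child.get("status")) for child in parent.findall(tag)]

    return {
        "molecule": comment.findtext("molecule"),  # None when absent
        "locations": [
            {
                "location": values(block, "location"),
                "topology": values(block, "topology"),
                "orientation": values(block, "orientation"),
            }
            for block in comment.findall("subcellularLocation")
        ],
        "note": comment.findtext("text"),
    }

parsed = parse_subcellular_comment(EXAMPLE)
```

Lists are used for location, topology and orientation because the schema allows each of them to occur more than once within a subcellularLocation element.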
Structuring of protein names

The names which are stored in the <protein> element have been categorized to distinguish recommended, alternative and submitted names, etc. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we modified the XSD type proteinType as shown in red:

    <xs:complexType name="proteinType">
        <xs:annotation>
            <xs:documentation>Stores protein names.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
       <!-- <xs:element name="name" type="proteinNameType" maxOccurs="unbounded"/> -->
            <xs:group ref="proteinNameGroup"/>            
            <xs:element name="domain" minOccurs="0" maxOccurs="unbounded">
                <xs:annotation>
                    <xs:documentation>The domain list is equivalent to the INCLUDES section of the DE line.</xs:documentation>
                </xs:annotation>
                <xs:complexType>
               <!-- <xs:sequence>
                        <xs:element name="name" type="proteinNameType" maxOccurs="unbounded"/>
                    </xs:sequence> -->
                    <xs:group ref="proteinNameGroup"/>
                </xs:complexType>
            </xs:element>
            <xs:element name="component" minOccurs="0" maxOccurs="unbounded">
                <xs:annotation>
                    <xs:documentation>The component list is equivalent to the CONTAINS section of the DE line.</xs:documentation>
                </xs:annotation>
                <xs:complexType>
               <!-- <xs:sequence>
                        <xs:element name="name" type="proteinNameType" maxOccurs="unbounded"/>
                    </xs:sequence> -->
                    <xs:group ref="proteinNameGroup"/>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
   <!-- <xs:attribute name="type">
            <xs:simpleType>
                <xs:restriction base="xs:NMTOKEN">
                    <xs:enumeration value="fragment"/>
                    <xs:enumeration value="fragments"/>
                    <xs:enumeration value="version1"/>
                    <xs:enumeration value="version2"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="evidence" type="xs:string" use="optional">
            <xs:annotation>
                <xs:documentation>This contains all evidences that are connected to the complete DE line.</xs:documentation>
            </xs:annotation>
        </xs:attribute> -->
    </xs:complexType>

The proteinNameType definition was deleted. The proteinNameGroup is defined as follows:

    <xs:group name="proteinNameGroup">
        <xs:sequence>
            <xs:element name="recommendedName" minOccurs="0">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="fullName" type="evidencedStringType"/>
                        <xs:element name="shortName" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                    </xs:sequence>
                    <xs:attribute name="ref" type="xs:string" use="optional"/>
                </xs:complexType>
            </xs:element>
            <xs:element name="alternativeName" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="fullName" type="evidencedStringType" minOccurs="0"/>
                        <xs:element name="shortName" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
                    </xs:sequence>
                    <xs:attribute name="ref" type="xs:string" use="optional"/>
                </xs:complexType>
            </xs:element>
            <xs:element name="submittedName" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="fullName" type="evidencedStringType"/>
                    </xs:sequence>
                    <xs:attribute name="ref" type="xs:string" use="optional"/>
                </xs:complexType>
            </xs:element>
            <xs:element name="allergenName" type="evidencedStringType" minOccurs="0"/>
            <xs:element name="biotechName" type="evidencedStringType" minOccurs="0"/>
            <xs:element name="CdAntigenName" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="innName" type="evidencedStringType" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:group>

Two new attributes, precursor and fragment, were added to the sequenceType:

    <xs:complexType name="sequenceType">
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="length" type="xs:integer" use="required"/>
                <xs:attribute name="mass" type="xs:integer" use="required"/>
                <xs:attribute name="checksum" type="xs:string" use="required"/>
                <xs:attribute name="modified" type="xs:date" use="required"/>
                <xs:attribute name="version" type="xs:integer" use="required"/>
                <xs:attribute name="precursor" type="xs:boolean" use="optional"/>
                <xs:attribute name="fragment" use="optional">
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:enumeration value="single"/>
                            <xs:enumeration value="multiple"/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:attribute>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>

Examples:

   <protein>
      <recommendedName>
        <fullName>Interleukin-2</fullName>
        <shortName>IL-2</shortName>
      </recommendedName>
      <alternativeName>
        <fullName>T-cell growth factor</fullName>
        <shortName>TCGF</shortName>
      </alternativeName>
      <innName>Aldesleukin</innName>
   </protein>
   <sequence precursor="true" ...>
   <protein>
      <recommendedName ref="1">
        <fullName>A disintegrin and metalloproteinase domain 10</fullName>
        <shortName>ADAM 10</shortName>
      </recommendedName>
      <alternativeName>
        <fullName>Mammalian disintegrin-metalloprotease</fullName>
      </alternativeName>
      <alternativeName>
        <fullName>Kuzbanian protein homolog</fullName>
      </alternativeName>
      <CdAntigenName>CD156c</CdAntigenName>
   </protein>
   <dbReference type="EC" key="1" id="EC 3.4.24.81"/>
   <sequence fragment="single" precursor="true" ...>
   <protein>
      <recommendedName>
        <fullName>Arginine biosynthesis bifunctional protein argJ</fullName>
      </recommendedName>
     <domain>
       <recommendedName ref="1">
         <fullName>Glutamate N-acetyltransferase</fullName>
       </recommendedName>
       <alternativeName>
         <fullName>Ornithine acetyltransferase</fullName>
         <shortName>OATase</shortName>
       </alternativeName>
       <alternativeName>
         <fullName>Ornithine transacetylase</fullName>
       </alternativeName>
     </domain>
     <domain>
       <recommendedName ref="2">
         <fullName>Amino-acid acetyltransferase</fullName>
       </recommendedName>
       <alternativeName>
         <fullName>N-acetylglutamate synthase</fullName>
         <shortName>AGS</shortName>
       </alternativeName>
     </domain>
     <component>
       <recommendedName>
         <fullName>Arginine biosynthesis bifunctional protein argJ alpha chain</fullName>
       </recommendedName>
     </component>
     <component>
       <recommendedName>
         <fullName>Arginine biosynthesis bifunctional protein argJ beta chain</fullName>
       </recommendedName>
     </component>
   </protein>
   <dbReference type="EC" key="1" id="EC 2.3.1.35"/>
   <dbReference type="EC" key="2" id="EC 2.3.1.1"/>
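
The examples suggest that the ref attribute of a name points to the key of an EC dbReference in the same entry; the Python sketch below resolves it on that assumption, which the text above does not state explicitly. The entry wrapper element, the shortened example and the function name are hypothetical, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Shortened from the ADAM 10 example above and wrapped in a hypothetical
# <entry> root so that the fragment is well-formed.
ENTRY = """<entry>
  <protein>
    <recommendedName ref="1">
      <fullName>A disintegrin and metalloproteinase domain 10</fullName>
      <shortName>ADAM 10</shortName>
    </recommendedName>
    <alternativeName>
      <fullName>Kuzbanian protein homolog</fullName>
    </alternativeName>
    <CdAntigenName>CD156c</CdAntigenName>
  </protein>
  <dbReference type="EC" key="1" id="EC 3.4.24.81"/>
</entry>"""

def recommended_name(entry):
    """Return full name, short names and (if referenced) EC id."""
    rec = entry.find("protein/recommendedName")
    if rec is None:  # recommendedName is optional in proteinNameGroup
        return None
    ec_id = None
    ref = rec.get("ref")
    if ref is not None:
        # Assumption: ref matches the key of an EC cross-reference.
        for db_ref in entry.findall("dbReference"):
            if db_ref.get("type") == "EC" and db_ref.get("key") == ref:
                ec_id = db_ref.get("id")
                break
    return {
        "full": rec.findtext("fullName"),
        "short": [s.text for s in rec.findall("shortName")],
        "ec": ec_id,
    }

name = recommended_name(ET.fromstring(ENTRY))
```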
Change of 'GeneLocationType'

We have added a new value to the controlled vocabulary of organelle names: Chromatophore. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we added a new enumeration value to the type attribute of the GeneLocationType in the XSD as shown in red:

    
    <xs:complexType name="geneLocationType">
        <xs:annotation>
            <xs:documentation>Defines the locations/origins of the shown sequence (OG line).</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="name" type="statusType" minOccurs="0"/>
        </xs:sequence>
        <xs:attribute name="type" use="required">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="apicoplast"/>
                    <xs:enumeration value="chloroplast"/>
                    <xs:enumeration value="chromatophore"/>
                    <xs:enumeration value="cyanelle"/>
                    <xs:enumeration value="hydrogenosome"/>
                    <xs:enumeration value="mitochondrion"/>
                    <xs:enumeration value="non-photosynthetic plastid"/>
                    <xs:enumeration value="nucleomorph"/>
                    <xs:enumeration value="plasmid"/>
                    <xs:enumeration value="plastid"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="evidence" type="xs:string" use="optional"/>
    </xs:complexType>

Example:

  <geneLocation type="chromatophore"/>
UniProtKB release 13.6 of 01-Jul-2008

New 'dbReference' type

A new type of cross-reference, AGRICOLA, was added to the RX (Reference cross-reference) line in the flat file format of UniProtKB entries. For details of this change, please read the UniProt document What's new?.

The type of a cross-reference is stored in the type attribute of the dbReference element. The modification requires no change of the schema.

Example:

  <dbReference type="AGRICOLA" id="IND20450567" key="31"/>
UniProtKB release 13.0 of 26-Feb-2008

A new feature key, NON_STD, was introduced in the flat file format of UniProtKB entries to replace the key SE_CYS. At the same time, we changed the sequence to use the IUPAC/IUBMB recommended one-letter codes 'U' for selenocysteine and 'O' for pyrrolysine. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we modified the XSD type featureType in the following way:

    <xs:complexType name="featureType">
    ...
        <xs:attribute name="type" use="required">
        ...
                    <!--  <xs:enumeration value="selenocysteine"/> -->
                    <xs:enumeration value="non-standard amino acid"/>
UniProtKB release 12.5 of 13-Nov-2007

A new field, RESOLUTION, was added to the cross-references to the PDB database in the flat file format of UniProtKB entries. For details of this change, please read the UniProt document What's new?.

This optional field is represented in the XML format as an additional <property> element. The modification requires no change of the schema.

Example:

From

  <dbReference type="PDB" id="1AUW" key="31">
    <property type="method" value="X-ray"/>
    <property type="chains" value="A/B/C/D=1-468"/>
  </dbReference>

To

  <dbReference type="PDB" id="1AUW" key="31">
    <property type="method" value="X-ray"/>
    <property type="resolution" value="1.80 A"/>
    <property type="chains" value="A/B/C/D=1-468"/>
  </dbReference>
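
Because the resolution property is optional, readers should not assume it is present. A minimal Python sketch of a tolerant lookup follows; the function name is hypothetical, and the parsing of the value assumes the "number, space, unit" layout seen in the example.

```python
import xml.etree.ElementTree as ET

PDB = """<dbReference type="PDB" id="1AUW" key="31">
  <property type="method" value="X-ray"/>
  <property type="resolution" value="1.80 A"/>
  <property type="chains" value="A/B/C/D=1-468"/>
</dbReference>"""

def pdb_resolution(db_reference_xml):
    """Return the resolution as a float, or None if the property is absent."""
    ref = ET.fromstring(db_reference_xml)
    props = {p.get("type"): p.get("value") for p in ref.findall("property")}
    raw = props.get("resolution")
    if raw is None:
        return None
    # Assumed layout: '<number> A', as in the example above.
    return float(raw.split()[0])

resolution = pdb_resolution(PDB)
```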
UniProtKB release 12.4 of 23-Oct-2007

Modification of ftp file names and locations

We now provide all XSD files in uncompressed form and have changed the names and locations of the XSD and XML files for keywords, so that the same names are used for all distribution formats:

From:
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/uniprot.xsd.gz
To:
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/uniprot.xsd

From:
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/docs/keyword.xsd.gz
To:
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/docs/keywlist.xsd

From:
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/keydef.xml.gz and
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/docs/keydef.xml.gz
To:
ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/docs/keywlist.xml.gz

Modification of the EC (Enzyme Commission) number format

The format of partial EC numbers has been modified. For details of this change, please read the UniProt document What's new?.

EC numbers are stored in the id attribute of the dbReference element. The modification requires no change of the schema.

Example:

From

  <dbReference type="EC" key="1" id="EC 3.4.24.-"/>
  <dbReference type="EC" key="1" id="EC 3.1.3.-"/>

To

  <dbReference type="EC" key="1" id="EC 3.4.24.-"/>
  <dbReference type="EC" key="1" id="EC 3.1.3.n1"/>
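
Code that validates the id attribute with a strict pattern may need to accept the new form. The regular expression below is only a sketch inferred from the two examples: it assumes that the 'n' prefix occurs only in the last field and that dashes stand for unknown fields. The labels it returns are hypothetical.

```python
import re

# Pattern inferred from the examples above; not an official grammar.
EC_PATTERN = re.compile(r"^EC (\d+)\.(\d+|-)\.(\d+|-)\.(n\d+|\d+|-)$")

def classify_ec(ec_id):
    """Classify an EC id string by the shape of its fields."""
    match = EC_PATTERN.match(ec_id)
    if match is None:
        return "invalid"
    fields = match.groups()
    if fields[3].startswith("n"):
        return "partial (n-number)"
    if "-" in fields:
        return "partial (dash)"
    return "complete"

labels = [classify_ec(x) for x in ("EC 3.4.24.-", "EC 3.1.3.n1", "EC 3.4.24.81")]
```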
UniProtKB release 12.0 of 24-Jul-2007

New element 'proteinExistence'

A new line type, PE (Protein Existence), has been introduced in the flat file format of UniProtKB entries. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we have added a new child element, proteinExistence, to the entry element in the XSD:

	<xs:element name="entry">
	...
	   <xs:element name="proteinExistence" type="proteinExistenceType"/>
 	...
	</xs:element>
The proteinExistenceType is defined as follows:
   <xs:complexType name="proteinExistenceType">
      <xs:annotation>
         <xs:documentation>Protein Existence (flat file: PE line).</xs:documentation>
      </xs:annotation>
      <xs:attribute name="type" use="required">
         <xs:simpleType>
            <xs:restriction base="xs:string">
               <xs:enumeration value="evidence at protein level"/>
               <xs:enumeration value="evidence at transcript level"/>
               <xs:enumeration value="inferred from homology"/>
               <xs:enumeration value="predicted"/>
               <xs:enumeration value="uncertain"/>
            </xs:restriction>
         </xs:simpleType>
      </xs:attribute>
   </xs:complexType>
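
As an illustration, the type attribute can be read and mapped to its position in the enumeration. The Python sketch below does this for a namespace-free fragment; the function name is hypothetical, and using the list position as a rank is an assumption of this example, not something defined by the schema.

```python
import xml.etree.ElementTree as ET

# Values in the order in which they are listed in proteinExistenceType.
PE_TYPES = [
    "evidence at protein level",
    "evidence at transcript level",
    "inferred from homology",
    "predicted",
    "uncertain",
]

def existence_rank(protein_existence_xml):
    """Return the 1-based position of the type value in the enumeration."""
    element = ET.fromstring(protein_existence_xml)
    # list.index raises ValueError for values outside the enumeration.
    return PE_TYPES.index(element.get("type")) + 1

rank = existence_rank('<proteinExistence type="inferred from homology"/>')
```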
Modification of submission citations

The controlled vocabulary that is used for database names in submission citations was modified. For details of this change, please read the UniProt document What's new?.

This information is stored in the db attribute of the citation element. The modification requires no change of the schema.

Example:

From

    <citation type="submission" db="Swiss-Prot" date="2007-03">

To

    <citation type="submission" db="UniProtKB" date="2007-03">
UniProtKB release 11.2 of 26-Jun-2007

Evidence tags in UniProtKB/Swiss-Prot

The evidence attribute and the evidence element are used in UniProtKB/TrEMBL to indicate the source of an annotation. We have begun to introduce such evidence in UniProtKB/Swiss-Prot as well. In the initial phase, automatic procedures are used to infer the evidence from the existing data (mainly the contents of the scope element). Evidence will also gradually become part of the manual curation process. Retrofitting existing UniProtKB/Swiss-Prot entries with evidence information will be an ongoing process.

New comment type 'sequence caution'

A new comment topic, SEQUENCE CAUTION, has been introduced in the flat file format of UniProtKB entries. For details of this change, please read the UniProt document What's new?.

To represent this data in the XML format, we have modified the XSD type commentType in the following way:

     <xs:complexType name="commentType">
     ...
         <xs:sequence>
         ...
             <xs:choice minOccurs="0" maxOccurs="1">
             ...
                <xs:element name="conflict">
                    <xs:annotation>
                        <xs:documentation>Used in the 'sequence caution' comment (flat file format: CC SEQUENCE CAUTION).</xs:documentation>
                    </xs:annotation>
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="sequence" minOccurs="0" maxOccurs="1">
                                <xs:complexType>
                                    <xs:attribute name="resource" use="required">
                                        <xs:simpleType>
                                            <xs:restriction base="xs:string">
                                                <xs:enumeration value="EMBL-CDS"/>
                                                <xs:enumeration value="EMBL"/>
                                            </xs:restriction>
                                        </xs:simpleType>
                                    </xs:attribute>
                                    <xs:attribute name="id" type="xs:string" use="required"/>
                                    <xs:attribute name="version" type="xs:integer" use="optional"/>
                               </xs:complexType>
                            </xs:element>
                        </xs:sequence>
                        <xs:attribute name="type" use="required">
                            <xs:simpleType>
                                <xs:restriction base="xs:string">
                                    <xs:enumeration value="frameshift"/>
                                    <xs:enumeration value="erroneous initiation"/>
                                    <xs:enumeration value="erroneous termination"/>
                                    <xs:enumeration value="erroneous gene model prediction"/>
                                    <xs:enumeration value="erroneous translation"/>
                                    <xs:enumeration value="miscellaneous discrepancy"/>
                                 </xs:restriction>
                            </xs:simpleType>
                        </xs:attribute>
                        <xs:attribute name="ref" type="xs:string" use="optional">
                            <xs:annotation>
                                <xs:documentation>Refers to the 'key' attribute of a 'reference' element.</xs:documentation>
                            </xs:annotation>
                        </xs:attribute>
                    </xs:complexType>
                </xs:element>
            ...
            </xs:choice>
            <xs:element name="location" type="locationType" minOccurs="0" maxOccurs="unbounded">
                <xs:annotation>
                    <xs:documentation>Used in 'mass spectrometry' and 'sequence caution' comments.</xs:documentation>
                </xs:annotation>
            </xs:element>
            ...
         </xs:sequence>
         ...
         <xs:attribute name="type" use="required">
         ...
                     <xs:enumeration value="sequence caution"/>

Note that the location element has been moved out of the xs:choice.

UniProtKB release 10.1 of 06-Mar-2007

Changes concerning the Cross-Reference section

As previously agreed, we no longer include cross-reference properties with value="-". Examples of cross-references that are affected:

PDB:

<dbReference type="PDB" id="2PGK" key="11">
  <property type="method" value="X-ray"/>
  <property type="chains" value="-"/>
</dbReference>

Becomes:

<dbReference type="PDB" id="2PGK" key="11">
  <property type="method" value="X-ray"/>
</dbReference>

EMBL:

<dbReference type="EMBL" id="BC001051" key="17">
  <property type="protein sequence ID" value="-"/>
  <property type="status" value="NOT_ANNOTATED_CDS"/>
  <property type="molecule type" value="mRNA"/>
</dbReference>

Becomes:

<dbReference type="EMBL" id="BC001051" key="17">
  <property type="status" value="NOT_ANNOTATED_CDS"/>
  <property type="molecule type" value="mRNA"/>
</dbReference>

GeneFarm:

<dbReference type="GeneFarm" id="2241" key="14">
  <property type="family number" value="-"/>
</dbReference>

Becomes:

<dbReference type="GeneFarm" id="2241" key="14"/>
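
To compare records written before and after this change, one option is to strip the placeholder properties from old-style records first. The Python sketch below shows that idea; the function name is hypothetical and the fragments are parsed without a namespace.

```python
import xml.etree.ElementTree as ET

OLD_PDB = """<dbReference type="PDB" id="2PGK" key="11">
  <property type="method" value="X-ray"/>
  <property type="chains" value="-"/>
</dbReference>"""

def drop_placeholder_properties(db_reference_xml):
    """Parse a dbReference and remove property children whose value is '-'."""
    ref = ET.fromstring(db_reference_xml)
    # Iterate over a copy, because children are removed during the loop.
    for prop in list(ref.findall("property")):
        if prop.get("value") == "-":
            ref.remove(prop)
    return ref

cleaned = drop_placeholder_properties(OLD_PDB)
remaining = [(p.get("type"), p.get("value")) for p in cleaned.findall("property")]
```

Readers of new-style files should likewise treat a missing property as "no value" rather than expect an explicit "-".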
UniProtKB release 8.0 of 30-May-2006

Changes concerning Comment "Alternative Products"

The format of the ALTERNATIVE PRODUCTS Comment (CC) line in UniProt has changed. For details of this change, please see the UniProt flat file news. In order to accommodate this change, a new value, "ribosomal frameshifting", has been added to the "type" attribute of the event element.

	<xs:complexType name="eventType">
	...
	<xs:attribute name="type" use="required">
	<xs:simpleType>
	<xs:restriction base="xs:string">
	<xs:enumeration value="alternative splicing"/>
	<xs:enumeration value="alternative initiation"/>
	<xs:enumeration value="alternative promoter"/>
	<xs:enumeration value="ribosomal frameshifting"/>
	</xs:restriction>
	</xs:simpleType>
	</xs:attribute>

Additionally, the ALTERNATIVE PRODUCTS comment is now allowed to have a note subelement.

	<xs:complexType name="commentType">
	...
	<xs:sequence>
	<xs:element name="event" type="eventType" minOccurs="1" maxOccurs="4"/>
	<xs:element name="isoform" type="isoformType" minOccurs="0" maxOccurs="unbounded"/>
	<xs:element name="note" type="xs:string" minOccurs="0" maxOccurs="1"/>
	</xs:sequence>
Introduction of the new element for Organism Host (line type OH)

A new line type, OH (Organism Host), was introduced for viral UniProtKB entries. For details of this change, please see the UniProt flat file news. To represent this data in the XML format, we introduced a new subelement of entry: organismHost.

The following has been added to the entry element in the XSD:

   <xs:element name="organismHost" type="organismType" minOccurs="0" maxOccurs="unbounded"/>
UniProtKB release 7.0 of 07-Feb-2006

New version attribute in entry and sequence element

The format of the Date (DT) lines in UniProt has changed. In order to accommodate this change, a version attribute has been added to both <entry> and <sequence>.

The following is added to the entry and sequence elements in the XSD:

<xs:attribute name="version" type="xs:integer"/>

Example <entry> and <sequence> elements using the old schema:

<entry dataset="Swiss-Prot" created="2004-10-25" modified="2005-09-13">
<sequence length="868" mass="95979" checksum="5EAF32DBB48A184C" modified="2004-10-25">

Example <entry> and <sequence> elements under the new schema:

<entry dataset="Swiss-Prot" created="2004-10-25" modified="2005-09-13" version="1">
<sequence length="868" mass="95979" checksum="5EAF32DBB48A184C" modified="2004-10-25" version="2">
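
A reader that must handle files from both schemas could treat the version attribute as optional. The following Python sketch assumes exactly that; the function name is hypothetical, and the example tags are written as self-closing elements so that they parse on their own.

```python
import xml.etree.ElementTree as ET

OLD_SEQUENCE = ('<sequence length="868" mass="95979" '
                'checksum="5EAF32DBB48A184C" modified="2004-10-25"/>')
NEW_SEQUENCE = ('<sequence length="868" mass="95979" '
                'checksum="5EAF32DBB48A184C" modified="2004-10-25" version="2"/>')

def version_of(element_xml):
    """Return the version attribute as an int, or None for old-schema files."""
    element = ET.fromstring(element_xml)
    raw = element.get("version")
    return int(raw) if raw is not None else None

old_version = version_of(OLD_SEQUENCE)
new_version = version_of(NEW_SEQUENCE)
```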
UniProtKB release 6.1 of 27-Sep-2005

Addition to Root Schema Elements

As part of our continuing effort to make it easier to work with the UniProtKB schema using tools such as JAXB, we have created separate types for organism, keyword, and sequence (organismType, keywordType, and sequenceType) outside of the entry element. If these types are specified inside the entry type, the generated Java classes become inner classes of the entry type class (e.g. EntryType.KeywordType), when logically these three should be independent of the entry type. THIS DOES NOT CHANGE THE WAY THE XML FILES LOOK. IT IS A CONVENIENCE MODIFICATION ONLY. All XML documents that are valid against the old schema will also be valid against the new schema. In the near future we plan to release a UniProt parser and writer based on JAXB, and this is one step in the preparation of the schema. In the new schema, each of the following three types has been moved to root level:

    
    <!--  Organism definition begins  -->
    <xs:complexType name="organismType">
        <xs:sequence>
            <xs:element name="name" type="organismNameType" maxOccurs="unbounded"/>
            <xs:element name="dbReference" type="dbReferenceType" maxOccurs="unbounded"/>
            <xs:element name="lineage" minOccurs="0">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="taxon" type="xs:string" maxOccurs="unbounded"/>
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
        <xs:attribute name="key" type="xs:string" use="required"/>
    </xs:complexType>
    <!--  Organism definition ends  -->
    <!--  Keyword definition begins  -->
    <xs:complexType name="keywordType">
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="evidence" type="xs:string" use="optional"/>
                <xs:attribute name="id" type="xs:string" use="required"/>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>
    <!--  Keyword definition ends  -->
    <!--  sequence definition begins  -->
    <xs:complexType name="sequenceType">
        <xs:simpleContent>
            <xs:extension base="xs:string">
                <xs:attribute name="length" type="xs:integer" use="required"/>
                <xs:attribute name="mass" type="xs:integer" use="required"/>
                <xs:attribute name="checksum" type="xs:string" use="required"/>
                <xs:attribute name="modified" type="xs:date" use="required"/>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>
    <!--  sequence definition ends  -->
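Since the change affects only where the types are declared, code that reads instance documents needs no adjustment. As an illustration, the following minimal Python sketch reads keyword and sequence elements using the attributes listed in keywordType and sequenceType above. It is a sketch under stated assumptions: the fragment is namespace-free (real files declare a namespace), the helper name summarize is hypothetical, and every attribute value and the sequence text are placeholders rather than real data.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; ids, checksum, mass and residues are placeholders.
FRAGMENT = """
<entry>
  <keyword id="KW-0000">Hypothetical keyword A</keyword>
  <keyword id="KW-0001" evidence="EC1">Hypothetical keyword B</keyword>
  <sequence length="4" mass="480" checksum="0000000000000000" modified="2004-10-25">MSEQ</sequence>
</entry>
"""


def summarize(fragment_xml):
    """Collect keyword and sequence data as plain Python values."""
    root = ET.fromstring(fragment_xml)

    # keywordType: 'id' is required, 'evidence' is optional (None if absent).
    keywords = [
        {"id": kw.get("id"), "evidence": kw.get("evidence"), "text": kw.text}
        for kw in root.findall("keyword")
    ]

    # sequenceType: all four attributes are required by the schema.
    seq = root.find("sequence")
    sequence = {
        "length": int(seq.get("length")),
        "mass": int(seq.get("mass")),
        "checksum": seq.get("checksum"),
        "modified": seq.get("modified"),
        "residues": seq.text,
    }
    return keywords, sequence


keywords, sequence = summarize(FRAGMENT)
print(keywords)
print(sequence)
```

The same function works on documents written before and after this release, because the element and attribute names in the XML files are unchanged.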

UniProtKB release 6.0 of 13-Sep-2005

Changes concerning the gene location section
Change to the UniProtKB XML schema unrelated to flat-file format changes

UniProtKB release 5.5 of 19-Jul-2005

Change to the UniProtKB XML schema
Changes concerning the Cross-Reference section

UniProtKB release 5.3 of 21-Jun-2005

Changes concerning the Organelle (OG) section

UniProtKB release 5.2 of 07-Jun-2005

Changes concerning the document type definition (DTD)
Changes concerning the Features (FT line)

UniProtKB release 5.0 of 10-May-2005

Changes concerning the Features (FT line)
Changes concerning the Database Cross-References (DR line)
XML Mailing List Now Available

UniProtKB release 4.6 of 26-Apr-2005

Changes concerning the Database Cross-References (DR line)

UniProtKB release 4.5 of 12-Apr-2005

Additional Availability of UniProt XML
Changes concerning the Database Cross-References (DR line)

UniProtKB release 4.4 of 29-Mar-2005

Changes concerning the Comments (CC line)
Changes concerning the Citations (RC line)
Changes concerning the Citations (RC line)

UniProtKB release 4.2 of 01-Mar-2005

Changes concerning the UniProt XML Schema
Changes concerning the Comments (CC line)
Changes concerning the Sequence Features (FT line)

UniProtKB release 4.1 of 15-Feb-2005

Changes concerning the Database Cross-References (DR line)

UniProtKB release 4.0 of 1-Feb-2005

Changes concerning the comment type (CC line)

UniProtKB release 3.4 of 21-Dec-2004

Changes concerning the comment type (CC line)
Changes concerning the last updated date (DT line)
Changes concerning the citations (RA line)

UniProtKB release 3.3 of 07-Dec-2004

Changes concerning the geneLocation element (OG line)

UniProtKB release 3.0 of 25-Oct-2004

Changes concerning the geneLocation element (OG line)