Domain
Domains represent annotations associated with subregions along the sequence. Domains are added to proteins using the Protein.add_domain() function, or using functions in the shephard.interfaces.si_domains module.
Domains must have a domain name and a domain type, as well as a start and end position. Each domain should generally have a unique domain name, although this is not strictly enforced. In contrast, many domains can and will share a common domain type. Domains also know the position in the sequence they come from and the underlying residues, and can extract Site and Track information associated with the domain.
Domains for a given protein can be retrieved using the protein.domain(domain_name) function. However, in general it’s more useful to either request all domains using protein.domains (which returns a list of all domains in the protein) or to request specific domains based on their position, location, type, or some combination of these. Explicit functions for these types of requests are included in the Protein object. Finally, all domains (or all domains of a specific type) can be requested from an entire proteome using Proteome object functions.
Domains can be removed from proteins using the Protein.remove_domain() function.
Domain Properties
- start()
[Property]: Returns the start position that defines this domain
- Getter
Returns the start of the domain (indexed from 1)
- Setter
None
- Type
int
- end()
[Property]: Returns the end position that defines this domain
- protein()
[Property]: Returns the Protein that this Domain is associated with
- sequence()
[Property]: Returns the amino acid sequence associated with this domain
- domain_type()
Returns the domain type as a string
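As a rough illustration of how 1-indexed start and end positions relate to the domain sequence, the sketch below uses plain Python rather than SHEPHARD itself. It assumes the end position is inclusive. The helper name domain_sequence is hypothetical and not part of the library.

```python
def domain_sequence(protein_sequence, start, end):
    """Return the subsequence for a domain with 1-indexed boundaries.

    Hypothetical helper for illustration only; assumes `end` is inclusive.
    """
    if start < 1 or end > len(protein_sequence) or start > end:
        raise ValueError("domain boundaries fall outside the sequence")
    # Shift the 1-indexed start to Python's 0-indexed slice start; the
    # inclusive end needs no shift because slice ends are exclusive.
    return protein_sequence[start - 1:end]


seq = "MKTAYIAKQR"
print(domain_sequence(seq, 3, 6))
```

With these assumptions, a domain spanning positions 3 to 6 covers four residues.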
Domain Functions
- inside_domain(self, position)
Function that returns True/False depending on if the provided position lies inside the domain.
- Parameters
position (int) – Position in the sequence
- Returns
Returns True if position is inside the domain region, else False
- Return type
bool
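The check performed by inside_domain() can be pictured with a small stand-in class. This is a sketch under the assumption that both the start and end positions count as inside the domain. DomainSketch is a hypothetical class written for this example and is not SHEPHARD's implementation.

```python
class DomainSketch:
    """Hypothetical stand-in for a Domain; not SHEPHARD's implementation."""

    def __init__(self, start, end):
        self.start = start  # 1-indexed first position
        self.end = end      # last position, assumed inclusive

    def inside_domain(self, position):
        # True when position lies within [start, end], both ends included.
        return self.start <= position <= self.end


d = DomainSketch(10, 20)
print(d.inside_domain(10), d.inside_domain(20), d.inside_domain(21))
```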
Domain Attribute Functions
- attributes()
Provides a list of the keys for every attribute associated with this domain.
- Returns
Returns a list of the attribute keys associated with the domain.
- Return type
list
- attribute(self, name, safe=True)
Function that returns a specific attribute as defined by the name.
Recall that attributes are name : value pairs, where the ‘value’ can be anything and is user defined. This function will return the value associated with a given name.
- Parameters
name (str) – The attribute name. A list of valid names can be found by calling <domain>.attributes().
safe (bool (default = True)) – Flag which, if True, will throw an exception if no attribute with the given name exists. If False, a missing attribute returns None.
- Returns
Will either return whatever was associated with that attribute (which could be anything) or None if that attribute is missing.
- Return type
Unknown
- add_attribute(self, name, val, safe=True)
Function that adds an attribute. Note that if safe is True, this function will raise an exception if the attribute is already present. If safe=False, then an existing value will be overwritten.
- Parameters
name (str) – Name that will be used to identify the attribute
val (<anything>) – An object or primitive we wish to associate with this attribute
safe (bool (default = True)) – Flag which, if True, will throw an exception if an attribute with the same name already exists; otherwise the newly introduced attribute will overwrite the previous one.
- Returns
None, but adds an attribute to the calling object
- Return type
None
- remove_attribute(self, name, safe=True)
Function that removes a given attribute from the Domain based on the passed attribute name. If the passed attribute does not exist or is not associated with the Domain, this will trigger an exception unless safe=False.
- Parameters
name (str) – The attribute name that will be used to identify it
safe (bool (default = True)) – Flag which, if True, will throw an exception if an attribute with this name does not exist. If set to False, an attribute that is not found is simply ignored.
- Returns
No return value, but will remove an attribute from the Domain if present.
- Return type
None
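The way the safe flag shapes the attribute functions can be sketched with a dictionary-backed stand-in. This is not SHEPHARD code: AttributeHolder is a hypothetical class, KeyError stands in for whatever exception the library raises, and the behaviour of attribute() with safe=True (raising on a missing name) is an assumption based on the return description above.

```python
class AttributeHolder:
    """Hypothetical stand-in mimicking the documented safe-flag behaviour."""

    def __init__(self):
        self._attributes = {}

    def attributes(self):
        # List of attribute names currently stored.
        return list(self._attributes.keys())

    def attribute(self, name, safe=True):
        # Assumption: safe=True raises on a missing name; safe=False returns None.
        if name not in self._attributes:
            if safe:
                raise KeyError(f"no attribute named {name!r}")
            return None
        return self._attributes[name]

    def add_attribute(self, name, val, safe=True):
        # safe=True refuses to overwrite; safe=False replaces the old value.
        if safe and name in self._attributes:
            raise KeyError(f"attribute {name!r} already exists")
        self._attributes[name] = val

    def remove_attribute(self, name, safe=True):
        # safe=True raises when the name is absent; safe=False ignores it.
        if name not in self._attributes:
            if safe:
                raise KeyError(f"no attribute named {name!r}")
            return
        del self._attributes[name]


holder = AttributeHolder()
holder.add_attribute("source", "pfam")
holder.add_attribute("source", "interpro", safe=False)  # overwrite allowed
print(holder.attribute("source"))
print(holder.attribute("missing", safe=False))
holder.remove_attribute("missing", safe=False)  # silently ignored
```

Using safe=False is convenient in bulk annotation loops, while the default safe=True surfaces accidental name collisions early.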
Domain Site Functions
- sites()
Get list of all sites inside the domain.
- Returns
Returns a list of all the sites
- Return type
list
- site(self, position)
Returns the list of sites that are found at a given position. Note that, in general, site() should be used to retrieve sites you know exist, while get_sites_by_position() offers a safer way to get sites at a position. site() will throw an exception if the position passed does not exist (while get_sites_by_position() will not).
- Parameters
position (int) – Defines the position in the sequence we want to interrogate
- Returns
Returns a list with between 1 and n sites. Will raise an exception if the passed position cannot be found.
- Return type
list
- get_sites_by_type(self, site_type, return_list=False)
Get a dictionary (or, optionally, a list) of the sites of a given type inside the domain.
- Parameters
site_type (string) – The site type identifier for which the function will search for matching sites
return_list (bool) – By default, the function returns a dictionary, which is convenient as it makes it easy to index into one or more sites at a specific position in the sequence. However, you may instead want a list of sites, in which case setting return_list=True will have the function simply return a list of sites. As of right now we do not guarantee the order of these returned sites.
- Returns
By default, returns a dictionary, where each key-value pair is:
key - site position (integer); value - list of one or more Site objects. If return_list is True, returns a list of sites instead.
- Return type
dict or list
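The two return shapes of get_sites_by_type() can be illustrated with a small stand-alone sketch. SiteSketch and sites_by_type are hypothetical names written for this example and do not reproduce SHEPHARD's Site class or its internals.

```python
from collections import namedtuple

# Hypothetical minimal site record: a position and a type label.
SiteSketch = namedtuple("SiteSketch", ["position", "site_type"])


def sites_by_type(sites, site_type, return_list=False):
    """Filter sites by type, returning a position-keyed dict or a flat list."""
    selected = [s for s in sites if s.site_type == site_type]
    if return_list:
        return selected
    # Group by position, since several sites can share one position.
    by_position = {}
    for s in selected:
        by_position.setdefault(s.position, []).append(s)
    return by_position


sites = [
    SiteSketch(12, "phospho"),
    SiteSketch(12, "phospho"),
    SiteSketch(15, "acetyl"),
    SiteSketch(18, "phospho"),
]
print(sites_by_type(sites, "phospho"))
print(sites_by_type(sites, "phospho", return_list=True))
```

The dictionary form makes it easy to ask what sits at a given position, while the list form suits simple iteration or counting.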
Domain Track Functions
- get_track_values(self, name, safe=True)
Function that returns the region of a protein’s values track associated with this domain.
If the track name is not found in this protein and safe is True, this will throw an exception. If safe=False, a missing track causes the function to return None.
- Parameters
name (str) – Track name
safe (bool (default = True)) – If set to True, missing tracks trigger an exception, else they just return None
- Returns
Returns a list of floats that corresponds to the set of residues associated with the domain of interest, or None if the track does not exist and safe=False.
- Return type
list
- get_track_symbols(self, name, safe=True)
Function that returns the region of a protein’s symbols track associated with this domain.
If the track name is missing and safe is True, this will throw an exception. If safe=False, a missing track causes the function to return None.
- Parameters
name (str) – Track name
safe (bool (default = True)) – If set to True, missing tracks trigger an exception, else they just return None
- Returns
Returns a list of strings that corresponds to the set of residues associated with the domain of interest, or None if the track does not exist and safe=False.
- Return type
list
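Conceptually, both track functions return the slice of a protein-length track that overlaps the domain. The sketch below shows that idea with plain Python lists. It assumes 1-indexed, inclusive domain boundaries and uses KeyError in place of the library's own exception; get_track_region and the tracks dictionary are hypothetical and not part of SHEPHARD.

```python
def get_track_region(tracks, name, start, end, safe=True):
    """Return the part of a named track covered by a domain.

    Hypothetical helper: `tracks` maps track names to per-residue lists,
    and `start`/`end` are 1-indexed with `end` assumed inclusive.
    """
    if name not in tracks:
        if safe:
            raise KeyError(f"no track named {name!r}")
        return None
    # Convert 1-indexed inclusive boundaries to a Python slice.
    return tracks[name][start - 1:end]


tracks = {
    "disorder": [0.1, 0.2, 0.3, 0.4, 0.5],      # values track
    "secondary": ["C", "H", "H", "E", "C"],     # symbols track
}
print(get_track_region(tracks, "disorder", 2, 4))
print(get_track_region(tracks, "secondary", 2, 4))
print(get_track_region(tracks, "missing", 2, 4, safe=False))
```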