Domain

Domains represent annotations associated with subregions along the sequence. Domains are added to proteins using the Protein.add_domain() function, or using functions in the shephard.interfaces.si_domains module.

Domains must have a domain type as well as a start and end position. Each domain should generally have a unique domain name, although this is not strictly enforced. In contrast, many domains can and will share a common domain type. Domains also know the position in the sequence they come from and the underlying residues, and can extract Site and Track information associated with the domain.

Domains for a given protein can be retrieved using the protein.domain(domain_name) function. However, in general it’s more useful to either request all domains using protein.domains (which returns a list of all domains in the protein) or to request specific domains based on their position, type, or some combination of the two. Explicit functions for these types of requests are included in the Protein object. Finally, all domains (or all domains of a specific type) can be requested from an entire proteome using Proteome object functions.

Domains can be removed from proteins using the Protein.remove_domain() function.

class Domain(start, end, protein, domain_type, domain_name, attributes=None)

Domain Properties

start()

[Property]: Returns the start position that defines this domain

Getter:

Returns the start of the domain (indexed from 1)

Setter:

None

Type:

int

end()

[Property]: Returns the end position that defines this domain

protein()

[Property]: Returns the Protein that this Domain is associated with

sequence()

[Property]: Returns the amino acid sequence associated with this domain

domain_type()

[Property]: Returns the domain type as a string

Domain Functions

inside_domain(self, position)

Function that returns True/False depending on if the provided position lies inside the domain.

Parameters:

position (int) – Position in the sequence

Returns:

Returns True if position is inside the domain region, else False

Return type:

bool
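The check described above can be sketched as follows. This is a hypothetical stand-alone helper, not SHEPHARD's own code, and it assumes that positions are 1-indexed and that both the start and end positions count as inside the domain; the inclusive end is an assumption rather than something stated here.

```python
def inside_region(start, end, position):
    """Hypothetical helper: True if position lies within [start, end].

    Assumes 1-indexed positions with both boundaries inclusive.
    """
    return start <= position <= end


# A region spanning residues 10 to 20 (values chosen for illustration)
print(inside_region(10, 20, 10))
print(inside_region(10, 20, 20))
print(inside_region(10, 20, 21))
```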

Domain Attribute Functions

attributes()

Provides a list of the keys for every attribute associated with this domain.

Returns:

Returns a list of the attribute keys associated with the domain.

Return type:

list

attribute(self, name, safe=True)

Function that returns a specific attribute as defined by the name.

Recall that attributes are name : value pairs, where the ‘value’ can be anything and is user defined. This function will return the value associated with a given name.

Parameters:
  • name (str) – The attribute name. A list of valid names can be found by calling <domain>.attributes().

  • safe (bool (default = True)) – Flag which, if True, will throw an exception if no attribute with the passed name exists. If False, a missing attribute returns None.

Returns:

Will either return whatever was associated with that attribute (which could be anything) or None if that attribute is missing.

Return type:

Unknown

add_attribute(self, name, val, safe=True)

Function that adds an attribute. Note that if safe is True, this function will raise an exception if the attribute is already present. If safe=False, then an existing value will be overwritten.

Parameters:
  • name (str) – Name that will be used to identify the attribute

  • val (<anything>) – An object or primitive we wish to associate with this attribute

  • safe (bool (default = True)) – Flag which, if True, will throw an exception if an attribute with the same name already exists; otherwise the newly introduced attribute will overwrite the previous one.

Return type:

None - but adds an attribute to the calling object

remove_attribute(self, name, safe=True)

Function that removes a given attribute from the Domain based on the passed attribute name. If the passed attribute does not exist or is not associated with the Domain, this will trigger an exception unless safe=False.

Parameters:
  • name (str) – The attribute name that will be used to identify it

  • safe (bool (default = True)) – Flag which, if True, will throw an exception if an attribute with this name does not exist. If set to False, a missing attribute is simply ignored.

Returns:

No return value, but will remove the attribute from the Domain if present.

Return type:

None
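To make the effect of the safe flag across the three attribute functions concrete, here is a minimal sketch using a hypothetical stand-in class (AttributeStore is not part of SHEPHARD). It mirrors the behaviour described above and raises KeyError where the library would raise its own exception. The behaviour of attribute() for a missing name with safe=True is an assumption inferred from the descriptions above.

```python
class AttributeStore:
    """Hypothetical stand-in that mirrors the documented safe-flag behaviour."""

    def __init__(self):
        self._attributes = {}

    def attributes(self):
        # List of attribute names (keys)
        return list(self._attributes.keys())

    def attribute(self, name, safe=True):
        # Assumption: a missing name raises if safe, else returns None
        if name not in self._attributes:
            if safe:
                raise KeyError(f"No attribute named '{name}'")
            return None
        return self._attributes[name]

    def add_attribute(self, name, val, safe=True):
        # safe=True refuses to overwrite; safe=False overwrites silently
        if safe and name in self._attributes:
            raise KeyError(f"Attribute '{name}' already exists")
        self._attributes[name] = val

    def remove_attribute(self, name, safe=True):
        # safe=True raises on a missing name; safe=False ignores it
        if name not in self._attributes:
            if safe:
                raise KeyError(f"No attribute named '{name}'")
            return
        del self._attributes[name]


store = AttributeStore()
store.add_attribute("source", "predictor_a")
store.add_attribute("source", "predictor_b", safe=False)  # overwrite
print(store.attribute("source"))
print(store.attribute("missing", safe=False))
store.remove_attribute("missing", safe=False)  # silently ignored
```

Leaving safe at its default is the more defensive choice when a missing or duplicated attribute would indicate a mistake earlier in a pipeline.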

Domain Site Functions

sites()

Get list of all sites inside the domain.

Returns:

Returns a list of all the sites

Return type:

list

site(self, position)

Returns the list of sites found at a given position. Note that, in general, site() should be used to retrieve sites you know exist, while get_sites_by_position() offers a safer way to get sites at a position. site() will throw an exception if the passed position does not exist (while get_sites_by_position() will not).

Parameters:

position (int) – Defines the position in the sequence we want to interrogate

Returns:

Returns a list with between 1 and n sites. Will raise an exception if the passed position cannot be found.

Return type:

list

get_sites_by_type(self, site_type, return_list=False)

Get a dictionary (or list) of the sites of a given type inside the domain.

Parameters:
  • site_type (string) – The site type identifier for which the function will search for matching sites

  • return_list (bool) – By default, the function returns a dictionary, which is convenient as it makes it easy to index into one or more sites at a specific position in the sequence. However, you may instead want a list of sites, in which case setting return_list to True will have the function simply return a list of sites. As of right now we do not guarantee the order of these returned sites.

Returns:

Returns a dictionary, where each key-value pair is:

key - site position (integer)

value - list of one or more site objects

If return_list is True, a list of sites is returned instead.

Return type:

dict or list
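The two return shapes can be illustrated with a short sketch. Both the Site namedtuple and the sites_by_type() helper are hypothetical stand-ins written for this example, not SHEPHARD objects, and the site types and positions are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a site: just a position and a type
Site = namedtuple("Site", ["position", "site_type"])


def sites_by_type(sites, site_type, return_list=False):
    """Group matching sites by position, or flatten them into a list."""
    by_position = {}
    for site in sites:
        if site.site_type == site_type:
            by_position.setdefault(site.position, []).append(site)
    if return_list:
        # Flattened list; no ordering guarantee is implied
        return [s for group in by_position.values() for s in group]
    return by_position


sites = [
    Site(12, "phosphosite"),
    Site(12, "phosphosite"),
    Site(15, "acetylation"),
    Site(18, "phosphosite"),
]

as_dict = sites_by_type(sites, "phosphosite")
as_list = sites_by_type(sites, "phosphosite", return_list=True)
print(sorted(as_dict))   # positions that carry at least one matching site
print(len(as_dict[12]))  # number of matching sites at position 12
print(len(as_list))      # total number of matching sites
```

The dictionary form is useful when the position matters; the list form is simpler when only the sites themselves are needed.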

Domain Track Functions

get_track_values(self, name, safe=True)

Function that returns the region of a protein’s values track associated with this domain.

If the track name is not found in this protein and safe is True, this will throw an exception. If safe=False and the track is missing, the function returns None.

Parameters:
  • name (str) – Track name

  • safe (bool (default = True)) – If set to True, missing tracks trigger an exception, else they just return None

Returns:

Returns a list of floats that corresponds to the set of residues associated with the domain of interest, or None if the track does not exist and safe=False.

Return type:

list
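Conceptually, extracting the domain's portion of a track amounts to slicing a per-residue list by the domain boundaries. The sketch below uses a hypothetical helper (region_values() is not a SHEPHARD function) and assumes that the track holds one value per residue, that positions are 1-indexed, and that both start and end are inclusive.

```python
def region_values(track_values, start, end):
    """Hypothetical helper: values for residues start..end.

    Assumes track_values[0] belongs to residue 1 and that both
    boundaries are inclusive.
    """
    return track_values[start - 1:end]


# One made-up value per residue of a six-residue protein
values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
print(region_values(values, 2, 4))
```

Under the same assumptions, the identical slice applied to a list of per-residue symbols would give the symbols-track equivalent.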

get_track_symbols(self, name, safe=True)

Function that returns the region of a protein’s symbols track associated with this domain.

If the track name is missing and safe is True, this will throw an exception. If safe=False and the track is missing, the function returns None.

Parameters:
  • name (str) – Track name

  • safe (bool (default = True)) – If set to True, missing tracks trigger an exception, else they just return None

Returns:

Returns a list of strings that corresponds to the set of residues associated with the domain of interest.

Return type:

list