skbio.sequence.Protein.distance

Protein.distance(other, metric=None)

Compute the distance to another sequence.

State: Experimental as of 0.4.0.

Parameters:

other : str, Sequence, or 1D np.ndarray (np.uint8 or '|S1')

Sequence to compute the distance to.

metric : function, optional

Function used to compute the distance between this sequence and other. If None (the default), scipy.spatial.distance.hamming is used. The function must accept two skbio.Sequence objects and return a float.
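As an illustration of the metric contract (two sequences in, one float out), here is a minimal sketch of a custom metric that does not require equal-length input. The name kmer_jaccard_distance and its k parameter are hypothetical and not part of scikit-bio. The sketch assumes that str() of a sequence object yields its characters, so it also works on plain strings.

```python
def kmer_jaccard_distance(seq1, seq2, k=2):
    """Return 1 minus the Jaccard similarity of the two k-mer sets.

    Hypothetical helper for illustration; not part of scikit-bio.
    """
    # Assumption: str() yields the raw characters of each sequence.
    s1, s2 = str(seq1), str(seq2)
    # Collect the distinct overlapping k-mers of each sequence.
    kmers1 = {s1[i:i + k] for i in range(len(s1) - k + 1)}
    kmers2 = {s2[i:i + k] for i in range(len(s2) - k + 1)}
    union = kmers1 | kmers2
    if not union:
        # Both sequences are shorter than k; this sketch treats them as identical.
        return 0.0
    return 1.0 - len(kmers1 & kmers2) / len(union)


print(kmer_jaccard_distance('GGUC', 'AGUC'))
```

Because k has a default value, the function can be called with just two arguments, so it should be usable as s.distance(t, metric=kmer_jaccard_distance) when scikit-bio is installed.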

Returns:

float

Distance between this sequence and other.

Raises:

ValueError

If metric is None (so scipy.spatial.distance.hamming is used) and the sequences differ in length. The check applies only to this default metric, because not every sequence distance metric requires equal lengths. Ideally a metric validates its own input and raises an informative error, but the message from scipy.spatial.distance.hamming is somewhat cryptic (as of this writing), so this method performs the check explicitly for the default. This metric-specific check will be removed from this method when the skbio.sequence.stats module is created (track progress on issue #913).

TypeError

If other is a Sequence object with a different type than this sequence.
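The equal-length requirement described under ValueError can be sketched in plain Python. This is not the scikit-bio implementation. It is an approximation of the default behavior, assuming the Hamming distance is reported as the fraction of positions that differ. The name hamming_fraction is hypothetical, and returning 0.0 for two empty sequences is a choice made for this sketch only.

```python
def hamming_fraction(seq1, seq2):
    """Return the fraction of positions at which two sequences differ.

    Hypothetical helper for illustration; not part of scikit-bio.
    """
    # Assumption: str() yields the raw characters of each sequence.
    s1, s2 = str(seq1), str(seq2)
    if len(s1) != len(s2):
        # Fail early with an explicit message instead of a cryptic one.
        raise ValueError(
            "Sequences must be the same length: %d != %d" % (len(s1), len(s2)))
    if not s1:
        # Sketch-only choice: two empty sequences are at distance 0.0.
        return 0.0
    mismatches = sum(a != b for a, b in zip(s1, s2))
    return mismatches / len(s1)


print(hamming_fraction('GGUC', 'AGUC'))
```

A custom metric that needs equal-length input could perform the same kind of check itself, since the method only checks lengths for the default metric.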

See also

fraction_diff, fraction_same, scipy.spatial.distance.hamming

Examples

>>> from skbio import Sequence
>>> s = Sequence('GGUC')
>>> t = Sequence('AGUC')
>>> s.distance(t)
0.25
>>> def custom_dist(s1, s2): return 0.42
>>> s.distance(t, custom_dist)
0.42