cma.utilities.utils

module documentation

various utilities not related to optimization

Class	`BlancClass`	blank container class to have a collection of attributes.
Class	`DataDict`	a dictionary of lists (of data)
Class	`DefaultSettings`	resembling somewhat `types.SimpleNamespace` from Python >=3.3 but with instantiation and resembling even more the `dataclass` decorator from Python >=3.7.
Class	`DerivedDictBase`	for conveniently adding methods/functionality to a dictionary.
Class	`DictClass`	A class wrapped over `dict` to use class .-notation.
Class	`DictFromTagsInString`	read from a string or file all key-value pairs within all `<python>...</python>` tags and return a `dict`.
Class	`ElapsedWCTime`	measure elapsed cumulative time while not paused and elapsed time since last tic.
Class	`ExclusionListOfVectors`	For delayed selective mirrored sampling
Class	`ListOfCallables`	A `list` of callables that can be called like a single `callable`.
Class	`MoreToWrite`	make sure that this list does not grow unbounded
Class	`ShowInFolder`	callable instance to save and show figures from `matplotlib`.
Class	`SolutionDict`	dictionary with computation of a hash key.
Class	`TimingWrapper`	wrap a timer around a callable.
Function	`argsort`	return index list to get `a` in order, i.e. `a[argsort(a)[i]] == sorted(a)[i]`, which leads to unexpected results with `np.nan` entries, because any comparison with `np.nan` is `False`.
Function	`as_vector_list`	a tool to handle a vector or a list of vectors in the same way, return a list of vectors and a function to revert the "list making".
Function	`download_file`	Undocumented
Function	`extract_targz`	filename must be a valid path in the tar
Function	`format_message`	put line breaks and trailing white spaces
Function	`format_warning`	Poor man's maxwarns: msg must match exactly.
Function	`is_`	intuitive handling of variable truth value also for `numpy` arrays.
Function	`is_all`	return `all(is_(v) for v in var_list)`
Function	`is_any`	return `any(is_(v) for v in var_list)`
Function	`is_nan`	return `np.isnan(var)`, or `False` if `var` is not numeric
Function	`is_not`	see `is_`
Function	`is_one`	return True if `var == 1` or `var` is a vector of ones
Function	`is_str`	return True if `var` is a string; `bytes` (in Python 3) also fit the bill.
Function	`is_vector_list`	make an educated guess whether `x` is a list of vectors.
Function	`num2str`	returns the shortest string representation.
Function	`pprint`	nicely formatted print
Function	`print_message`	Undocumented
Function	`print_warning`	Poor man's maxwarns: msg must match exactly
Function	`ranks`	return ranks of entries starting with zero based on Python's `sorted`.
Function	`recycled`	return `vec` with the last element recycled to `dim` if `len(vec)` doesn't fail, else `vec`.
Function	`rglen`	return generator `range(len(.))` with shortcut `rglen(.)`
Function	`round_indices`	modify `a[i]` to `round(a[i])` for i in `indices` and return `a`, never used
Function	`set_attributes_from_dict`	assign, for example, all arguments given to an `__init__` method to attributes in `self` or `self.params` or `self.args`.
Function	`tolist`	return `a.tolist()` if applicable else `list(a)`, never used
Function	`version_diff`	return -1 if v1 < v2 else +1 if v1 > v2 else 0
Function	`zero_values_indices`	generate increasing index pairs `(i, j)` with `all(diffs[i:j] == 0)`
Variable	`global_verbosity`	Undocumented
Variable	`warnings_counter`	Undocumented

def argsort(a, reverse=False):

return index list to get a in order, i.e. a[argsort(a)[i]] == sorted(a)[i], which leads to unexpected results with np.nan entries, because any comparison with np.nan is False.
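The stated identity and the NaN caveat can be checked with a builtins-only stand-in. `argsort_sketch` below is a hypothetical re-implementation written for illustration under that assumption; it is not the library's code.

```python
def argsort_sketch(a, reverse=False):
    """Hypothetical stand-in: indices that put `a` in sorted order."""
    return sorted(range(len(a)), key=a.__getitem__, reverse=reverse)


a = [3.0, 1.0, 2.0]
idx = argsort_sketch(a)
# The documented identity holds for ordinary floats.
assert [a[i] for i in idx] == sorted(a)

# With a NaN entry every comparison involving it is False, so the
# returned order depends on the input order and is not a true sort.
b = [2.0, float("nan"), 1.0]
idx_nan = argsort_sketch(b)
```

The result for `b` is still a permutation of the indices, but it should not be relied upon to order the finite entries.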

def as_vector_list(X):

a tool to handle a vector or a list of vectors in the same way, return a list of vectors and a function to revert the "list making".

Useful when we might either have a single solution vector or a set/list/population of vectors to deal with.

Namely, this function allows replacing the slightly more verbose:

was_list = utils.is_vector_list(X)
X = X if was_list else [X]
# work work work on X, e.g.
res = [x[0] + 1 for x in X]
res = res if was_list else res[0]

with:

X, revert = utils.as_vector_list(X)
# work work work on X, e.g.
res = [x[0] + 1 for x in X]
res, ... = revert(res, ...)  # also allows to revert X, if desired

Testing:

>>> from cma.utilities import utils
>>> X = [3]  # a single vector
>>> X, revert_vlist = utils.as_vector_list(X)  # BEGIN
>>> assert X == [[3]]  # a list with one element
>>> # work work work on X as a list of vectors, e.g.
>>> res = [x[0] + 1 for x in X]
>>> X, res = revert_vlist(X, res)  # END
>>> assert res == 4
>>> assert X[0] == 3

def download_file(url, target_dir='.', target_name=None):

Undocumented

def extract_targz(tarname, filename=None, target_dir='.'):

filename must be a valid path in the tar

def format_message(msg, es=None, spaces=6):

put line breaks and trailing white spaces

def format_warning(msg, method_name=None, class_name=None, iteration=None, maxwarns=None):

Poor man's maxwarns: msg must match exactly.

A copy of print_warning that only formats the message string. Calling warnings.warn here would make the reported warning location meaningless, so the caller issues the warning and thereby gets better location information than with print_warning. Usage could look like:

m = utils.format_warning(some_message)
if m:
    warnings.warn(m)
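The "poor man's maxwarns" idea, counting messages by their exact text and going silent past a limit, can be sketched as follows. The names `seen` and `format_limited` are hypothetical, and returning an empty string once the limit is exceeded is an assumption suggested by the truthiness check in the usage pattern above, not confirmed library behavior.

```python
from collections import Counter

seen = Counter()  # hypothetical stand-in for a module-level message counter


def format_limited(msg, maxwarns=None):
    """Return `msg`, or '' once this exact `msg` was seen more than `maxwarns` times."""
    if maxwarns is not None:
        seen[msg] += 1
        if seen[msg] > maxwarns:
            return ""  # assumed falsy result, so the caller skips warnings.warn
    return msg


first = format_limited("step size too small", maxwarns=1)
second = format_limited("step size too small", maxwarns=1)
# A message differing by even one character is counted separately.
other = format_limited("step size too small!", maxwarns=1)
```

Because the key is the full message text, messages that embed changing values (an iteration number, say) would never hit the limit in this sketch.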

def is_(var):

intuitive handling of variable truth value also for numpy arrays.

Return True for any non-empty container, otherwise the truth value of the scalar var.

Caveat of the most unintuitive case: [0] evaluates to True, like [0, 0].

>>> import numpy as np
>>> from cma.utilities.utils import is_
>>> is_({}) or is_(()) or is_(0) or is_(None) or is_(np.array(0))
False
>>> is_({0:0}) and is_((0,)) and is_(np.array([0]))
True

def is_all(var_list):

return all(is_(v) for v in var_list)

def is_any(var_list):

return any(is_(v) for v in var_list)

def is_nan(var):

return np.isnan(var), or False if var is not numeric
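The "False for non-numeric input" behavior can be illustrated with a scalar-only sketch. It deliberately swaps in the standard library's `math.isnan` for `np.isnan`, so it does not cover arrays, and `is_nan_sketch` is a hypothetical name rather than the library's implementation.

```python
import math


def is_nan_sketch(var):
    """Scalar-only stand-in: NaN test that tolerates non-numeric input."""
    try:
        return math.isnan(var)
    except TypeError:
        # Strings, None, lists, etc. are not numeric, hence not NaN.
        return False


checks = [is_nan_sketch(v) for v in (float("nan"), 1.0, "abc", None)]
# checks == [True, False, False, False]
```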

def is_not(var):

see is_

def is_one(var):

return True if var == 1 or var is a vector of ones

def is_str(var):

return True if var is a string; bytes (in Python 3) also fit the bill.

>>> from cma.utilities.utils import is_str
>>> assert is_str(b'a') * is_str('a') * is_str(u'a') * is_str(r'b')
>>> assert not is_str([1]) and not is_str(1)

def is_vector_list(x):

make an educated guess whether x is a list of vectors.

>>> from cma.utilities.utils import is_vector_list as ivl
>>> assert ivl([[0], [0]]) and not ivl([1,2,3])

def num2str(val, significant_digits=2, force_rounding=False, max_predecimal_digits=5, max_postdecimal_leading_zeros=1, remove_trailing_zeros=True, desired_length=None):

returns the shortest string representation.

Generally, display either significant_digits digits or its true value, whichever is shorter.

force_rounding shows no more than the desired number of significant digits, which means, e.g., 12345 becomes 12000.

remove_trailing_zeros removes zeros, if and only if the value is exactly.

desired_length adds digits up to the desired length.

>>> from cma.utilities import utils
>>> print([utils.num2str(val) for val in [12345, 1234.5, 123.45,
...       12.345, 1.2345, .12345, .012345, .0012345]])
['12345', '1234', '123', '12', '1.2', '0.12', '0.012', '1.2e-3']

def pprint(to_be_printed):

nicely formatted print

def print_message(msg, method_name=None, class_name=None, iteration=None, verbose=None):

Undocumented

def print_warning(msg, method_name=None, class_name=None, iteration=None, verbose=None, maxwarns=None, **kwargs_for_warn):

Poor man's maxwarns: msg must match exactly

def ranks(a, reverse=False):

return ranks of entries starting with zero based on Python's sorted.

This leads to unreasonable results with np.nan values.
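A minimal sketch of zero-based ranks derived from `sorted`, for illustration only. `ranks_sketch` is a hypothetical helper; in particular, giving tied entries distinct ranks in order of appearance is an assumption of this sketch, not documented behavior.

```python
def ranks_sketch(a, reverse=False):
    """Hypothetical stand-in: rank of each entry of `a`, starting at zero."""
    order = sorted(range(len(a)), key=a.__getitem__, reverse=reverse)
    result = [0] * len(a)
    for rank, index in enumerate(order):
        # The entry at position `index` is the `rank`-th in sorted order.
        result[index] = rank
    return result


r = ranks_sketch([10.0, 30.0, 20.0])
# r == [0, 2, 1]
r_rev = ranks_sketch([10.0, 30.0, 20.0], reverse=True)
# r_rev == [2, 0, 1]
```

Since the ranks come from comparisons made by `sorted`, a `nan` entry disturbs the ordering of its neighbors, which is where the unreasonable results originate.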

def recycled(vec, dim=None, as_=None):

return vec with the last element recycled to dim if len(vec) doesn't fail, else vec.

If dim is not given, len(as_) is used if available, else a scalar is returned.

def rglen(ar):

return generator range(len(.)) with shortcut rglen(.)

def round_indices(a, indices):

modify a[i] to round(a[i]) for i in indices and return a, never used

def set_attributes_from_dict(self, dict_, initial_params_dict_name=None):

assign, for example, all arguments given to an __init__ method to attributes in self or self.params or self.args.

If initial_params_dict_name is given, dict_ is also copied into an attribute of self with name initial_params_dict_name:

setattr(self, initial_params_dict_name, dict_.copy())

and the self key is removed from the copied dict if present.

>>> from cma.utilities.utils import set_attributes_from_dict
>>> class C(object):
...     def __init__(self, arg1, arg2, arg3=None):
...         assert len(locals()) == 4  # arguments are locally visible
...         set_attributes_from_dict(self, locals())
>>> c = C(1, 22)
>>> assert c.arg1 == 1 and c.arg2 == 22 and c.arg3 is None
>>> assert len(c.__dict__) == 3 and not hasattr(c, 'self')

Details:

The entry dict_['self'] is always ignored.

Alternatively:

self.args = locals().copy()
self.args.pop('self', None)  # not strictly necessary

puts all arguments into self.args: dict.

def tolist(a):

return a.tolist() if applicable else list(a), never used

def version_diff(v1, v2):

return -1 if v1 < v2 else +1 if v1 > v2 else 0
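The accepted argument types are not stated here. Assuming dotted version strings such as `"3.10"` (an assumption for this sketch only), a three-way comparison has to compare the numeric components rather than the raw strings, since `"3.10" < "3.9"` holds lexicographically. `version_diff_sketch` is a hypothetical illustration, not the library's code.

```python
def version_diff_sketch(v1, v2):
    """Three-way comparison of dotted version strings (assumed input format)."""
    t1 = tuple(int(part) for part in v1.split("."))
    t2 = tuple(int(part) for part in v2.split("."))
    # Tuples of ints compare component by component.
    return -1 if t1 < t2 else +1 if t1 > t2 else 0


older = version_diff_sketch("3.9", "3.10")     # -1, although "3.9" > "3.10" as strings
newer = version_diff_sketch("4.0", "3.10")     # +1
same = version_diff_sketch("3.2.1", "3.2.1")   # 0
```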

def zero_values_indices(diffs):

generate increasing index pairs (i, j) with all(diffs[i:j] == 0)

and diffs[j] != 0 or j == len(diffs), thereby identifying "flat spots/areas" in diffs.

Returns the respective generator type.

No longer used to smooth ECDFs.

Example:

>>> from cma.utilities.utils import zero_values_indices
>>> for i, j in zero_values_indices([0, 0.1, 0, 0, 3.2, 0, 2.1]):
...     print((i, j))
(0, 1)
(2, 4)
(5, 6)
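The run-finding logic described above can be sketched as a plain generator. `zero_runs_sketch` is a hypothetical helper written to reproduce the documented output, not the library's implementation.

```python
def zero_runs_sketch(diffs):
    """Yield (i, j) for each maximal run of zeros diffs[i:j] (hypothetical helper)."""
    i, n = 0, len(diffs)
    while i < n:
        if diffs[i] == 0:
            j = i
            # Extend the run until a nonzero entry or the end is reached.
            while j < n and diffs[j] == 0:
                j += 1
            yield i, j
            i = j
        else:
            i += 1


pairs = list(zero_runs_sketch([0, 0.1, 0, 0, 3.2, 0, 2.1]))
# pairs == [(0, 1), (2, 4), (5, 6)], matching the example above
flat_tail = list(zero_runs_sketch([1, 0, 0]))
# flat_tail == [(1, 3)]: the run ends at j == len(diffs)
```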

global_verbosity: int =

Undocumented

warnings_counter =

Undocumented