WIP ad hoc code for writing (changing) results in a pandas.DataFrame and to disk,
where "results" refers to end-results of experiment repetitions rather than the traces of single runs.
Caveat: this is an ad hoc implementation; some interfaces may be incomplete, interface details are still in flux, and some recent changes may be faulty.
Main features, similar to Results, are
- Intermediate saving of results, which consequently can be loaded from a different shell while the experiment is running.
- Backup under the current timestamp into a 'backups-name' folder before each save (optional but default).
- Similar float values can be "equalized" for correct data aggregation, see the check_close_values and equalize_close_values methods. This uses np.isclose, which considers 1e-8 to be close to 1e-9 and 1+1e-5 close to 1+1e-6.
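As an illustration of the tolerances involved, here is a minimal sketch using np.isclose directly (plain NumPy, not part of this class's interface):

```python
import numpy as np

# np.isclose tests abs(a - b) <= atol + rtol * abs(b),
# with default rtol=1e-5 and atol=1e-8
print(np.isclose(1e-8, 1e-9))          # True: absolute tolerance dominates near zero
print(np.isclose(1 + 1e-5, 1 + 1e-6))  # True: relative tolerance dominates away from zero
print(np.isclose(1e-5, 1e-6))          # False: both tolerances are too small here
```

Hence values that differ by an order of magnitude can still be "close" near zero, which is exactly the behavior to keep in mind when equalizing result values for aggregation.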
Compared to Results, this class is useful to store more information for
each run, like the termination condition or a final condition number or
constraint violations or meta parameter information. (In contrast, with
Results the workaround for catching the final condition would be to write a
nonfinite entry when the target was not reached.)
A guiding code example:
    import cma.experimentation

    res = cma.experimentation.ResultsPandas('some-name')  # reloads data to append/continue
    for dim in dimensions:
        es = cma.CMAEvolutionStrategy(dim * [2], 1, {'verbose': -9})
        # a tracker in case we want to track, say, a minimum over the trace as result
        es.optimize(cma.ff.rosen, callback=my_result_tracker)
        if es.opts.get('verbose') == -9:  # do not save data while testing the setup
            res.append([es.N, es.popsize,
                        es.result.evaluations,
                        1 if 'ftarget' in es.stop() else 0,
                        repr(es.stop()),
                        es.condition_number,
                        my_result_tracker.my_val_of_interest],
                       columns=['dimension', 'popsize',
                                'evaluations',
                                'targethit',
                                'stopcondition',
                                'conditionnumber',
                                'myvalue'])
        res.save()  # intermediate save, data can be loaded from another shell

    # load the data:
    res = cma.experimentation.ResultsPandas('some-name')
    print(res.summary)  # summary statistics of columns in a dict
    res.df  # the pandas data frame
Notes: self.df.drop(...) can be used to return a data frame with some entries dropped.
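For illustration, dropping rows by index label with plain pandas (a generic sketch, not using this class; the column names are made up):

```python
import pandas as pd

df = pd.DataFrame({'dimension': [2, 3, 5], 'evaluations': [120, 250, 700]})
df2 = df.drop([1])        # returns a new frame without the row with index label 1
print(len(df), len(df2))  # prints "3 2": the original frame is unchanged
```

Because drop returns a copy by default, the result must be reassigned (or passed back, as the drop method of this class does) to take effect.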
| Method | __getattr__ | access the DataFrame Results.df directly from Results |
| Method | __init__ | load data when filepathname exists. |
| Method | append | append a single data row |
| Method | backup | backup saved data by making a file copy |
| Method | check | Undocumented |
| Method | column | a sorted list of values in the column of name column. |
| Method | drop | call self.df.drop and reassign result |
| Method | equalize | Undocumented |
| Method | extend | extend frame by data which is a sequence of rows |
| Method | load | Undocumented |
| Method | print | do not use iterrows but itertuples |
| Method | reset | caveat: the original index is lost which may be undesirable |
| Method | save | save data, warn when saving takes more than time_s seconds. |
| Instance Variable | backup | Undocumented |
| Instance Variable | df | Undocumented |
| Instance Variable | last | Undocumented |
| Instance Variable | name | Undocumented |
| Property | failure | TODO: revise such that we have a boolean failed_setting column |
| Property | summary | return number of finite entries, number of different values, and |
| Method | _polish | a quick hack |
| Instance Variable | _extension | Undocumented |
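The statistics named for the summary property can be sketched with plain pandas (hypothetical column names and dict layout; the exact content returned by summary may differ):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'evaluations': [120.0, 250.0, np.nan],
                   'targethit': [1, 1, 0]})
# per column: count of finite entries and count of distinct values
summary = {c: {'finite': int(np.isfinite(df[c]).sum()),
               'different': int(df[c].nunique())}
           for c in df.columns}
print(summary)
```

nunique ignores NaN by default, so a column with a nonfinite entry reports fewer finite entries than rows.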
load data when filepathname exists.
format='.csv' is human readable; however, '.feather' is much more performant and '.parquet' should work too.
do not use iterrows but itertuples
https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas
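The reason is speed and attribute access: itertuples yields lightweight namedtuples instead of constructing a Series per row as iterrows does. A generic pandas sketch (made-up column names, not this class):

```python
import pandas as pd

df = pd.DataFrame({'dimension': [2, 3], 'evaluations': [120, 250]})
# each row arrives as a namedtuple with the index in the Index field
for row in df.itertuples():
    print(row.Index, row.dimension, row.evaluations)
```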
save data, warn when saving takes more than time_s seconds.
kwargs are passed to the saving method of the pandas data frame.
Details: save is in essence just a shortcut for:
    self.backup()
    self.df.to_feather(self.name + self.extension)  # assuming .extension == '.feather'
TODO: generalize by passing the DataFrame method name for saving, like
.save('to_feather')? Annoyingly, pandas does not add a proper
extension by default.
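The backup-before-save behavior could be sketched roughly as follows (a hedged sketch: the function name, the '.csv' default, the timestamp format, and the 'backups-name' folder layout are assumptions, not the class's verified internals):

```python
import os
import shutil
import time

def backup_then_save(df, name, extension='.csv'):
    """copy the previously saved file into a timestamped backup, then save df"""
    path = name + extension
    if os.path.exists(path):  # nothing to back up on the very first save
        folder = 'backups-' + name
        os.makedirs(folder, exist_ok=True)
        stamp = time.strftime('%Y-%m-%d_%Hh%Mm%Ss')
        shutil.copy2(path, os.path.join(folder, stamp + extension))
    df.to_csv(path, index=False)  # to_feather/to_parquet would work analogously
```

Copying before overwriting means an interrupted or faulty save can never destroy the last successfully written state.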