Statistical comparison
- lisa.stats.series_mean_stats(series, kind, confidence_level=0.95)[source]
Compute the mean along with a confidence interval based on the T-score.
- Returns:
A tuple with:
- the mean
- the standard deviation, or its equivalent
- the standard error of the mean, or its equivalent (harmonic standard error, geometric standard error)
- the confidence interval, as a 2-tuple of +/- values
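To make the return shape concrete, here is a minimal sketch of the arithmetic case: the mean, standard deviation, SEM and a T-score confidence interval. This is an illustration, not lisa's implementation, and the helper name mean_with_ci is hypothetical:

```python
import math
from statistics import mean, stdev

from scipy.stats import t


def mean_with_ci(values, confidence_level=0.95):
    # Hypothetical helper mirroring series_mean_stats() for the
    # arithmetic mean; illustrative sketch only, not lisa's code.
    n = len(values)
    m = mean(values)
    sd = stdev(values)          # sample standard deviation
    sem = sd / math.sqrt(n)     # standard error of the mean
    # Two-sided critical value of the t distribution with n - 1
    # degrees of freedom.
    t_crit = t.ppf((1 + confidence_level) / 2, n - 1)
    ci = t_crit * sem
    # Same shape as the documented return value:
    # (mean, std, sem, (minus, plus))
    return m, sd, sem, (ci, ci)
```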
- lisa.stats.guess_mean_kind(unit, control_var)[source]
Guess which kind of mean should be used to summarize results in the given unit.
- Returns:
'arithmetic' if an arithmetic mean should be used, or 'harmonic'. Geometric mean use cannot be inferred by this function.
- Parameters:
unit (str) – Unit of the values, e.g. 'km/h'.
control_var (str) – Control variable, i.e. the variable that is fixed during the experiment. For example, in a car speed experiment, the control variable could be the distance (fixed distance) or the time. In that case, we would have unit='km/h' and control_var='h' if the time was fixed, or control_var='km' if the distance was fixed.
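To illustrate the idea, here is a simplified, self-contained sketch of such a heuristic. The parsing and the function name are assumptions for illustration, not lisa's actual implementation: for a rate unit such as 'km/h', a fixed denominator (time) calls for the arithmetic mean, while a fixed numerator (distance) calls for the harmonic mean.

```python
def guess_mean_kind_sketch(unit, control_var):
    # Simplified illustration of the heuristic described above;
    # lisa's real guess_mean_kind() handles more unit shapes.
    num, sep, denom = unit.partition('/')
    if not sep:
        # Not a rate unit: plain values average arithmetically.
        return 'arithmetic'
    if control_var == denom:
        # Denominator fixed (e.g. fixed time for 'km/h'): the values
        # add up linearly, so the arithmetic mean applies.
        return 'arithmetic'
    if control_var == num:
        # Numerator fixed (e.g. fixed distance for 'km/h'): the
        # harmonic mean is the right summary.
        return 'harmonic'
    raise ValueError(f'Cannot guess mean kind for unit={unit!r}')
```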
- class lisa.stats.Stats(df, value_col='value', ref_group=None, filter_rows=None, compare=True, agg_cols=None, mean_ci_confidence=None, stats=None, stat_col='stat', unit_col='unit', ci_cols=('ci_minus', 'ci_plus'), control_var_col='fixed', mean_kind_col='mean_kind', non_normalizable_units={'pval'})[source]
Bases:
Loggable
Compute the statistics on an input pandas.DataFrame in “database” format.
- Parameters:
df (pandas.DataFrame) –
Dataframe in database format, i.e. meaningless index, and values in a given column with the other columns used as tags.
Note
Redundant tag columns (i.e. columns that are all equal) will be removed from the dataframe.
value_col (str) – Name of the column containing the values.
ref_group (dict(str, object)) –
Reference group used to compare the other groups against. Its format is dict(tag_column_name, tag_value). The comparison will be made on subgroups built out of all the other tag columns, with the reference subgroups being the ones matching that dictionary. If the tag value is None, the key will only be used for grouping in graphs. Comparison will add the following statistics:
- A 2-sample Kolmogorov-Smirnov test ('ks2samp_test' column). This test is non-parametric and checks for a difference in distributions. The only assumption is that the distribution is continuous, which should suit almost all use cases.
- Most statistics will be normalized against the reference group as a difference percentage, except for a few non-normalizable values.
Note
The referenced group must exist, otherwise unexpected behaviours might occur.
filter_rows (dict(object, object) or None) – Filter the given pandas.DataFrame with a dict of {column: value} pairs that rows have to match to be selected.
compare (bool) – If True, normalize most statistics as a percentage of change compared to ref_group.
agg_cols – Columns to aggregate on. In a sense, the given columns will be treated like a compound iteration number. Defaults to:
- the iteration column if available, otherwise
- all the tag columns that are neither the value nor part of the ref_group.
mean_ci_confidence (float) – Confidence level used to establish the mean confidence interval, between 0 and 1.
stats (dict(str, str or collections.abc.Callable)) –
Dictionary of statistical functions to summarize each value group formed by tag columns along the aggregation columns. If None is given as value, the name will be passed to pandas.core.groupby.SeriesGroupBy.agg(). Otherwise, the provided function will be run.
Note
One set of keys is special: 'mean', 'std' and 'sem'. When the value None is used, a custom function is used instead of the one from pandas, which will compute other related statistics and provide a confidence interval. An attempt will be made to guess the most appropriate kind of mean to use, using the mean_kind_col, unit_col and control_var_col:
- The mean itself, as: 'mean' (arithmetic), 'hmean' (harmonic) or 'gmean' (geometric)
- The Standard Error of the Mean (SEM), as: 'sem' (arithmetic), 'hse' (harmonic) or 'gse' (geometric)
- The standard deviation, as: 'std' (arithmetic), 'hsd' (harmonic) or 'gsd' (geometric)
stat_col (str) – Name of the column used to hold the name of the statistics that are computed.
unit_col (str) – Name of the column holding the unit of each value (as a string).
ci_cols (tuple(str, str)) – Name of the two columns holding the confidence interval for each computed statistic.
control_var_col (str) – Name of the column holding the name of the control variable in the experiment leading to the given value.
See also: guess_mean_kind()
mean_kind_col (str) –
Type of mean to be used to summarize this value.
Note
Unless the geometric mean is used, unit_col and control_var_col should be used to make things more obvious and reduce risks of confusion.
non_normalizable_units (list(str)) – List of units that cannot be normalized against the reference group.
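To see why the mean kind matters when summarizing rates, consider a toy example with two 60 km legs driven at different speeds: when the distance is fixed (control_var='km'), only the harmonic mean recovers the true average speed.

```python
from statistics import harmonic_mean, mean

speeds = [30, 60]                       # km/h, one value per 60 km leg
total_distance = 60 + 60                # km
total_time = 60 / 30 + 60 / 60          # 2 h + 1 h = 3 h
true_avg = total_distance / total_time  # 40 km/h

# With the distance fixed, the harmonic mean gives the true average
# speed, while the arithmetic mean overestimates it.
assert abs(harmonic_mean(speeds) - true_avg) < 1e-9
assert mean(speeds) > true_avg
```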
Examples:
```python
import pandas as pd

# The index is meaningless, all that matters is to uniquely identify
# each row using a set of tag columns, such as 'board', 'kernel',
# 'iteration', ...
df = pd.DataFrame.from_records(
    [
        ('juno',  'kernel1', 'bench1', 'score1', 1, 42,   'frame/s', 's'),
        ('juno',  'kernel1', 'bench1', 'score1', 2, 43,   'frame/s', 's'),
        ('juno',  'kernel1', 'bench1', 'score2', 1, 420,  'frame/s', 's'),
        ('juno',  'kernel1', 'bench1', 'score2', 2, 421,  'frame/s', 's'),
        ('juno',  'kernel1', 'bench2', 'score',  1, 54,   'foobar',  ''),
        ('juno',  'kernel2', 'bench1', 'score1', 1, 420,  'frame/s', 's'),
        ('juno',  'kernel2', 'bench1', 'score1', 2, 421,  'frame/s', 's'),
        ('juno',  'kernel2', 'bench1', 'score2', 1, 4200, 'frame/s', 's'),
        ('juno',  'kernel2', 'bench1', 'score2', 2, 4201, 'frame/s', 's'),
        ('juno',  'kernel2', 'bench2', 'score',  1, 540,  'foobar',  ''),
        ('hikey', 'kernel1', 'bench1', 'score1', 1, 42,   'frame/s', 's'),
        ('hikey', 'kernel1', 'bench1', 'score2', 1, 420,  'frame/s', 's'),
        ('hikey', 'kernel1', 'bench2', 'score',  1, 54,   'foobar',  ''),
        ('hikey', 'kernel2', 'bench1', 'score1', 1, 420,  'frame/s', 's'),
        ('hikey', 'kernel2', 'bench1', 'score2', 1, 4200, 'frame/s', 's'),
        ('hikey', 'kernel2', 'bench2', 'score',  1, 540,  'foobar',  ''),
    ],
    columns=['board', 'kernel', 'benchmark', 'metric',
             'iteration', 'value', 'unit', 'fixed'],
)

# Get a DataFrame with all the default statistics.
Stats(df).df

# Using a ref_group will also compare the other groups against it.
Stats(df, ref_group={'board': 'juno', 'kernel': 'kernel1'}).df
```
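The 2-sample Kolmogorov-Smirnov test that ref_group comparisons add can also be illustrated with scipy directly; the exact call lisa makes internally is an assumption here, and the sample values are made up.

```python
from scipy.stats import ks_2samp

# Reference group values vs. another group's values (made-up numbers).
ref = [42, 43, 41, 42, 44]
other = [420, 421, 419, 422, 418]

stat, pval = ks_2samp(ref, other)
# The samples do not overlap at all, so the KS statistic is maximal
# and the p-value is small: the distributions very likely differ.
```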
- property df
pandas.DataFrame
containing the statistics.
See also
get_df() for more control.
- get_df(remove_ref=None, compare=None)[source]
Returns a pandas.DataFrame containing the statistics.
- Parameters:
compare (bool or None) – See the Stats compare parameter. If None, it will default to the value provided to Stats.
remove_ref (bool or None) – If True, the rows of the reference group described by ref_group for this object will be removed from the returned dataframe. If None, it will default to compare.
- plot_stats(filename=None, remove_ref=None, backend=None, groups_as_row=False, kind=None, **kwargs)[source]
Returns a matplotlib.figure.Figure containing the statistics for the class input pandas.DataFrame.
- Parameters:
filename (str or None) – Path to the image file to write to.
remove_ref (bool or None) – If True, do not plot the reference group. See get_df().
backend (str or None) – Holoviews backend to use: bokeh or matplotlib. If None, the current holoviews backend selected with hv.extension() will be used.
groups_as_row (bool) – By default, subgroups are used as rows in the subplot matrix, so that the values shown on a given graph can be expected to be in the same order of magnitude. However, when there are many subgroups, this can lead to a very large and somewhat hard to navigate plot matrix. In that case, using the groups for the rows might help a great deal.
kind (str or None) – Type of plot. Can be any of: horizontal_bar, vertical_bar, None.
- Variable keyword arguments:
Forwarded to get_df().
- plot_histogram(cumulative=False, bins=50, nbins=None, density=False, **kwargs)[source]
Returns a matplotlib.figure.Figure with a histogram of the values in the input pandas.DataFrame.
- plot_values(**kwargs)[source]
Returns a holoviews element with the values in the input pandas.DataFrame.
- Parameters:
filename (str or None) – Path to the image file to write to.
Workload Automation
- class lisa.wa.StatsProp[source]
Bases:
object
Provides a stats property.
- get_stats(ensure_default_groups=True, ref_group=None, agg_cols=None, **kwargs)[source]
Returns a lisa.stats.Stats loaded with the result pandas.DataFrame.
- Parameters:
ensure_default_groups (bool) – If True, ensure ref_group will contain appropriate keys for the usual Workload Automation result display.
ref_group – Forwarded to lisa.stats.Stats
- Variable keyword arguments:
Forwarded to lisa.stats.Stats
- property stats
Shorthand property equivalent to self.get_stats()
- class lisa.wa.WAOutput(path, kernel_path=None)[source]
Bases:
StatsProp
,Mapping
,Loggable
Recursively parse a Workload Automation output, using registered collectors (leaf subclasses of WACollectorBase). The data collected are accessible through a pandas.DataFrame in “database” format:
- meaningless index
- all values are tagged using tag columns
- Parameters:
path (str) – Path containing a Workload Automation output.
kernel_path (str) – Kernel source path. Used to resolve the name of the kernel which ran the workload.
Example:
```python
wa_output = WAOutput('wa/output/path')

# Pick a specific collector. See also WAOutput.get_collector()
stats = wa_output['results'].stats
stats.plot_stats(filename='stats.html')
```
- get_collector(name, **kwargs)[source]
Returns a new collector with custom parameters passed to it.
- Parameters:
name (str) – Name of the collector.
- Variable keyword arguments:
Forwarded to the collector’s constructor.
Example:
```python
WAOutput('wa/output/path').get_collector('energy', postprocess=func)
```
- property jobs
List containing all the jobs present in the output of ‘wa run’.
- class lisa.wa.WACollectorBase(wa_output, df_postprocess=None)[source]
Bases:
StatsProp
,Loggable
,ABC
Base class for all Workload Automation dataframe collectors.
- Parameters:
df_postprocess (collections.abc.Callable) – Function called to postprocess the collected pandas.DataFrame.
See also
Instances of this class are typically built using WAOutput.get_collector() rather than directly.
- property df
pandas.DataFrame containing the data collected.
- class lisa.wa.WAResultsCollector(wa_output, df_postprocess=None)[source]
Bases:
WACollectorBase
Collector for the Workload Automation test results.
- NAME = 'results'
- class lisa.wa.WAArtifactCollectorBase(wa_output, df_postprocess=None)[source]
Bases:
WACollectorBase
Workload Automation artifact collector base class.
- class lisa.wa.WAEnergyCollector(wa_output, df_postprocess=None)[source]
Bases:
WAArtifactCollectorBase
WA collector for the energy_measurement augmentation.
Example:
```python
def postprocess(df):
    df = df.pivot_table(
        values='value',
        columns='metric',
        index=['sample', 'iteration', 'workload'],
    )
    df = pd.DataFrame({
        'CPU_power': (
            df['A55_power'] +
            df['A76_1_power'] +
            df['A76_2_power']
        ),
    })
    df['unit'] = 'Watt'
    df = df.reset_index()
    df = df.melt(
        id_vars=['sample', 'iteration', 'workload', 'unit'],
        var_name='metric',
    )
    return df

WAOutput('wa/output/path').get_collector(
    'energy',
    df_postprocess=postprocess,
).df
```
- NAME = 'energy'
- get_stats(**kwargs)[source]
Returns a lisa.stats.Stats loaded with the result pandas.DataFrame.
- Parameters:
ensure_default_groups (bool) – If True, ensure ref_group will contain appropriate keys for the usual Workload Automation result display.
ref_group – Forwarded to lisa.stats.Stats
- Variable keyword arguments:
Forwarded to lisa.stats.Stats
- class lisa.wa.WATraceCollector(wa_output, trace_to_df=<function _stub_trace_to_df>, **kwargs)[source]
Bases:
WAArtifactCollectorBase
WA collector for the trace augmentation.
- Parameters:
trace_to_df (collections.abc.Callable) – Function used by the collector to convert the lisa.trace.Trace to a pandas.DataFrame.
- Variable keyword arguments:
Forwarded to lisa.trace.Trace.
Example:
```python
def trace_idle_analysis(trace):
    cpu = 0
    df = trace.ana.idle.df_cluster_idle_state_residency([cpu])
    df = df.reset_index()
    df['cpu'] = cpu

    # Melt the column 'time' into lines, so that the dataframe is in
    # "database" format: each value is uniquely identified by "tag"
    # columns
    return df.melt(
        var_name='metric',
        value_vars=['time'],
        id_vars=['idle_state'],
    )

WAOutput('wa/output/path').get_collector(
    'trace',
    trace_to_df=trace_idle_analysis,
).df
```
- NAME = 'trace'
- property traces
lisa.utils.LazyMapping that maps job names and iteration numbers to their corresponding lisa.trace.Trace.
- class lisa.wa.WAJankbenchCollector(wa_output, df_postprocess=None)[source]
Bases:
WAArtifactCollectorBase
WA collector for the jankbench frame timings.
The collector framework will return a single pandas.DataFrame with the results from every jankbench job in lisa.stats.Stats format (i.e. the returned dataframe is arranged such that each reported metric gets a separate row). The metrics reported are:
- total_duration: time in milliseconds to complete the frame.
- jank_frame: boolean indicator of a missed frame deadline. 1 is a jank frame, 0 is not.
- name: subtest name, provided by the Jankbench app.
- frame_id: monotonically increasing frame number, starting from 1 for each subtest iteration.
An example plotter matching the old-style output can be found in the jupyter notebook working directory at ipynb/wltests/WAOutput-JankbenchDemo.ipynb
If you have existing code expecting a more direct translation of the original sqlite database format, you can massage the collected dataframe back into a closer resemblance of the original source database with this sequence of pandas operations:
```python
wa_output = WAOutput('wa/output/path')
df = wa_output['jankbench'].df
db_df = df.pivot(
    index=['iteration', 'id', 'kernel', 'frame_id'],
    columns=['variable'],
)
db_df = db_df['value'].reset_index()
db_df.columns.name = None
# db_df now looks more like the original format
```
- NAME = 'jankbench'
- get_stats(**kwargs)[source]
Returns a lisa.stats.Stats loaded with the result pandas.DataFrame.
- Parameters:
ensure_default_groups (bool) – If True, ensure ref_group will contain appropriate keys for the usual Workload Automation result display.
ref_group – Forwarded to lisa.stats.Stats
- Variable keyword arguments:
Forwarded to lisa.stats.Stats
- class lisa.wa.WASysfsExtractorCollector(wa_output, path, type='diff', **kwargs)[source]
Bases:
WAArtifactCollectorBase
WA collector for the sysfs-extractor augmentation.
Example:
```python
def pixel6_energy_meter(df):
    # Keep only the CPU's meters
    df = df[df.value.str.contains('S4M_VDD_CPUCL0|S3M_VDD_CPUCL1|S2M_VDD_CPUCL2')]
    df[['variable', 'value']] = df.value.str.split(', ', expand=True)

    def _clean_variable(variable):
        if 'S4M_VDD_CPUCL0' in variable:
            return 'little-energy'
        if 'S3M_VDD_CPUCL1' in variable:
            return 'mid-energy'
        if 'S2M_VDD_CPUCL2' in variable:
            return 'big-energy'
        return ''

    df['variable'] = df['variable'].apply(_clean_variable)
    df['value'] = df['value'].astype(int)
    df['unit'] = "bogo-ujoules"

    # Add a total energy variable
    df = pd.concat([
        df,
        pd.DataFrame(data={
            'variable': 'total-energy',
            'value': [df['value'].sum()]
        })
    ])
    df.ffill(inplace=True)

    return df

df = WAOutput('.').get_collector(
    'sysfs-extractor',
    path='/sys/bus/iio/devices/iio:device0/energy_value',
    df_postprocess=pixel6_energy_meter
).df
```
- NAME = 'sysfs-extractor'