Module Stats
Compute various statistical indicators for a collection of values
Stats for a collection of type 'a, either int or float.
If 'kind is exhaustive, the collection stores all values and can compute the median, q1 and q3. Computing these requires sorting, which is done only once; however, adding new values afterwards requires re-sorting.
Otherwise, when 'kind is compact, this is a constant-size aggregate. It does not store all the values, only the minimum needed to compute some stats (min, max, sum, sum of squares). It cannot compute the median or quartiles.
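To illustrate why a compact collection can stay constant-size, here is a minimal sketch of such an aggregate over floats. The type compact_float and the functions singleton, add, merge and average are hypothetical names chosen for this example; they are not the module's actual representation or API.

```ocaml
(* Hypothetical constant-size aggregate; not the module's actual type. *)
type compact_float = {
  count : int;
  min : float;
  max : float;
  sum : float;
  sum_squares : float;
}

(* Aggregate holding a single value. *)
let singleton x =
  { count = 1; min = x; max = x; sum = x; sum_squares = x *. x }

(* Adding a value updates a fixed number of fields: constant time. *)
let add agg x =
  {
    count = agg.count + 1;
    min = Float.min agg.min x;
    max = Float.max agg.max x;
    sum = agg.sum +. x;
    sum_squares = agg.sum_squares +. (x *. x);
  }

(* Merging two aggregates is constant time as well. *)
let merge a b =
  {
    count = a.count + b.count;
    min = Float.min a.min b.min;
    max = Float.max a.max b.max;
    sum = a.sum +. b.sum;
    sum_squares = a.sum_squares +. b.sum_squares;
  }

(* The mean only needs the sum and the count. *)
let average agg = agg.sum /. float_of_int agg.count
```

Because the individual values are discarded, nothing in this record allows recovering the median or the quartiles, which matches the restriction described above.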
Constructors
val make_compact : ('a, exhaustive) t -> ('a, compact) t
Convert an exhaustive collection into a compact representation.
Compact values
While empty constructors are provided, accessing the stats of an empty collection raises CollectionTooShort. List and array constructors are linear in the size of the given list/array.
Exhaustive values
val exhaustive_int_empty : (int, exhaustive) t
val exhaustive_float_empty : (float, exhaustive) t
val exhaustive_int_singleton : int -> (int, exhaustive) t
val exhaustive_float_singleton : float -> (float, exhaustive) t
val exhaustive_of_int_list : int list -> (int, exhaustive) t
val exhaustive_of_float_list : float list -> (float, exhaustive) t
val exhaustive_of_int_array : int array -> (int, exhaustive) t
val exhaustive_of_float_array : float array -> (float, exhaustive) t
Adding values
Add a new value to the collection. Constant time operation.
Concatenate both collections. Constant time operation.
Add all values in the list, equivalent to List.fold_left add_value. Linear in the size of the list for compact values; O(n log n) in the combined size of the list and the collection for exhaustive values.
Add all values in the array, equivalent to Array.fold_left add_value. Linear in the size of the array for compact values; O(n log n) in the combined size of the array and the collection for exhaustive values.
Accessors
All accessors are constant time operations, except median, q1 and q3, which need to sort the collection. Sorting is only done once and then saved, so getting the q1 after computing the median is constant time.
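The "sort once, then reuse" behaviour can be sketched as follows. The type exhaustive_float, its sorted flag, and the median convention (mean of the two middle elements for an even size) are assumptions made for this example, not the module's actual implementation.

```ocaml
(* Hypothetical exhaustive collection: the values plus a flag recording
   whether the array is currently sorted. *)
type exhaustive_float = {
  mutable values : float array;
  mutable sorted : bool;
}

let of_list l = { values = Array.of_list l; sorted = false }

(* Sort at most once; later calls return immediately. *)
let ensure_sorted c =
  if not c.sorted then begin
    Array.sort Float.compare c.values;
    c.sorted <- true
  end

(* Median under the assumed convention: the middle element, or the mean
   of the two middle elements when the size is even. *)
let median c =
  let n = Array.length c.values in
  if n = 0 then invalid_arg "median: empty collection";
  ensure_sorted c;
  if n mod 2 = 1 then c.values.(n / 2)
  else (c.values.((n / 2) - 1) +. c.values.(n / 2)) /. 2.0

(* Adding a value invalidates the cached order, so the next median
   sorts again. Array.append is linear; it is used here only for brevity. *)
let add_value c x =
  c.values <- Array.append c.values [| x |];
  c.sorted <- false
```

In this sketch, a second call to median (or any other order-based accessor built on ensure_sorted) skips the sort as long as no value has been added in between.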
val size : ('a, 'kind) t -> int
The size of the collection, i.e. the number of elements.
val sum : ('a, 'kind) t -> 'a
Sum of all items in the collection: \sum_i x_i
val sum_squares : ('a, 'kind) t -> 'a
The sum of the squares of the collection: \sum_i x_i^2. May raise Z.Overflow.
val min : ('a, 'kind) t -> 'a
The minimal element.
val max : ('a, 'kind) t -> 'a
The maximal element.
val range : ('a, 'kind) t -> 'a
The range, i.e. max - min.
val average : ('a, 'kind) t -> float
The average/mean value, i.e. sum collection / size collection.
val variance : ('a, 'kind) t -> float
The variance, i.e. the sum of the squares of the difference with the average: \sum_i (x_i - \mu)^2
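The sum of squared differences can be derived from the quantities a compact collection keeps (size n, sum, and sum of squares), which is presumably why variance accepts any 'kind. With \mu denoting the average, the standard identity is:

```latex
\sum_i (x_i - \mu)^2
  = \sum_i x_i^2 - 2\mu \sum_i x_i + n\mu^2
  = \sum_i x_i^2 - n\mu^2,
\qquad \mu = \frac{1}{n} \sum_i x_i
```

Whether the module evaluates the variance through this identity is not stated here; the formula only shows that storing the individual values is not required.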
val median : ('a, exhaustive) t -> float
The median, or 2nd quartile.
val q1 : ('a, exhaustive) t -> float
The first quartile, requires size >= 4.
val q3 : ('a, exhaustive) t -> float
The third quartile, requires size >= 4.
Export values
Export the list/array of values, sorted in increasing order. If unsorted, these will sort the collection (O(n log n)), else they will copy it O(n).
val to_list : ('a, exhaustive) t -> 'a list
val to_array : ('a, exhaustive) t -> 'a array
Pretty printers
Both of these take an extra unit parameter to mark the end of the optional arguments.
val pp_percent :
  ?justify:bool ->
  ?precision:int ->
  unit ->
  Stdlib.Format.formatter ->
  (int * int) ->
  unit
pp_percent () fmt (num, denom) prints the ratio num / denom as a percentage, including a final "%" symbol. Rounds fractions, so "20.99%" is printed as "21.0%" when precision is 1.
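The rounding behaviour described for pp_percent can be reproduced with a small stand-alone helper. percent_to_string is a hypothetical function written for this example (it returns a string instead of writing to a formatter, and ignores justification); the default precision of 1 is an assumption.

```ocaml
(* Hypothetical helper, not the module's pp_percent: formats num / denom
   as a percentage with [precision] digits after the decimal point. *)
let percent_to_string ?(precision = 1) (num, denom) =
  let pct = 100.0 *. float_of_int num /. float_of_int denom in
  (* "%.*f" takes the precision as an argument; "%%" prints a literal '%'. *)
  Printf.sprintf "%.*f%%" precision pct
```

For instance, percent_to_string (2099, 10000) yields "21.0%", the same rounding as in the description above.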
Standard SI unit prefix list: ""; "k"; "M"; "G"; "T"; "P"; "E"; "Z"; "Y"; "R"; "Q".
val pp_with_unit :
  ?justify:bool ->
  ?unit_prefixes:string list ->
  ?separator:string ->
  ?base:int ->
  unit ->
  Stdlib.Format.formatter ->
  int ->
  unit
pp_with_unit () fmt nb prints the number nb with at most three digits using the specified unit prefixes. For example:
pp_with_unit () fmt 123 -> "123"
pp_with_unit () fmt 12345 -> "12.3k"
pp_with_unit () fmt 123456789 -> "123M"
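One way to obtain the three-digit output shown in these examples is to divide by the base until the value is small enough, then pick the number of decimals from its magnitude. The helper with_unit_to_string below is a hypothetical sketch, not the module's implementation: it returns a string, ignores justify and separator, assumes a non-negative input, and assumes default values of 1000 for the base and the SI list for the prefixes.

```ocaml
(* SI prefixes as listed in the documentation. *)
let si_prefixes = [ ""; "k"; "M"; "G"; "T"; "P"; "E"; "Z"; "Y"; "R"; "Q" ]

(* Hypothetical helper, not the module's pp_with_unit. Assumes nb >= 0. *)
let with_unit_to_string ?(unit_prefixes = si_prefixes) ?(base = 1000) nb =
  if nb < base then string_of_int nb
  else begin
    let fbase = float_of_int base in
    (* Divide by the base while the value is too large and a larger
       prefix is still available. *)
    let rec scale v = function
      | _ :: (_ :: _ as rest) when v >= fbase -> scale (v /. fbase) rest
      | prefix :: _ -> (v, prefix)
      | [] -> (v, "")
    in
    let v, prefix = scale (float_of_int nb) unit_prefixes in
    (* Keep three significant digits. Caveat: a value that rounds up to
       the base (e.g. 999999) prints four digits in this sketch. *)
    if v < 10.0 then Printf.sprintf "%.2f%s" v prefix
    else if v < 100.0 then Printf.sprintf "%.1f%s" v prefix
    else Printf.sprintf "%.0f%s" v prefix
  end
```

With these assumptions, the sketch reproduces the three documented examples: 123, 12345 and 123456789 give "123", "12.3k" and "123M".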
Multi-session loggers
module StatLogger (S : sig ... end) () : sig ... end
Save stats between multiple codex runs. Each logger saves a mapping string -> stat between various runs.