physt package

Modules

physt.bin_utils module

Methods for investigation and manipulation of bin arrays.

physt.bin_utils.is_bin_subset(sub, sup)

Check whether all bins in one binning are also present in another.

Parameters:
  • sub (array_like) – Candidate for the bin subset
  • sup (array_like) – Candidate for the bin superset
Returns:

Return type:

bool

physt.bin_utils.is_bin_superset(sup, sub)

Inverse of is_bin_subset

physt.bin_utils.is_consecutive(bins, rtol=1e-05, atol=1e-08)

Check whether the bins are consecutive (edges match).

Does not check if the bins are in rising order.

Returns:
Return type:bool
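
The consecutiveness check can be sketched in pure Python as below. The helper name bins_are_consecutive is hypothetical and not part of physt; the tolerance test is assumed to follow the numpy.isclose convention suggested by the rtol and atol parameters.

```python
def bins_are_consecutive(pairs, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: True if every bin starts where the previous one ends."""
    pairs = list(pairs)
    for (_, prev_right), (next_left, _) in zip(pairs[:-1], pairs[1:]):
        # Same form as numpy.isclose: |a - b| <= atol + rtol * |b|
        if abs(next_left - prev_right) > atol + rtol * abs(prev_right):
            return False
    return True


print(bins_are_consecutive([[0, 1], [1, 2]]))  # True
print(bins_are_consecutive([[0, 1], [2, 3]]))  # False
```
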
physt.bin_utils.is_rising(bins)

Check whether the bins are in rising order.

Does not check if the bins are consecutive.

Parameters:bins (array_like) –
Returns:
Return type:bool
physt.bin_utils.make_bin_array(bins)

Turn bin data into array understood by HistogramXX classes.

Parameters:bins (array_like) – Array of edges or array of edge tuples
Returns:
Return type:numpy.ndarray

Examples

>>> make_bin_array([0, 1, 2])
array([[0, 1],
       [1, 2]])
>>> make_bin_array([[0, 1], [2, 3]])
array([[0, 1],
       [2, 3]])
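
The two accepted input shapes can be illustrated with a minimal pure-Python sketch. The helper name edges_to_pairs is hypothetical; the real function returns a numpy.ndarray, while this sketch returns nested lists.

```python
def edges_to_pairs(bins):
    """Hypothetical helper: turn edges or edge pairs into [left, right] pairs."""
    bins = list(bins)
    # Already in the wide format: every item is a (left, right) pair.
    if bins and all(hasattr(b, "__len__") and len(b) == 2 for b in bins):
        return [[left, right] for left, right in bins]
    # Otherwise treat the input as n + 1 consecutive edges of n bins.
    return [[left, right] for left, right in zip(bins[:-1], bins[1:])]


print(edges_to_pairs([0, 1, 2]))         # [[0, 1], [1, 2]]
print(edges_to_pairs([[0, 1], [2, 3]]))  # [[0, 1], [2, 3]]
```
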
physt.bin_utils.to_numpy_bins(bins)

Convert physt bin format to numpy edges.

Parameters:bins (array_like) – 1-D (n) or 2-D (n, 2) array of edges
Returns:edges – all edges
Return type:np.ndarray
physt.bin_utils.to_numpy_bins_with_mask(bins)

Numpy binning edges including gaps.

Parameters:bins (array_like) – 1-D (n) or 2-D (n, 2) array of edges
Returns:
  • edges (np.ndarray) – all edges
  • mask (np.ndarray) – List of indices that correspond to bins that have to be included

Examples

>>> to_numpy_bins_with_mask([0, 1, 2])
(array([0., 1., 2.]), array([0, 1]))
>>> to_numpy_bins_with_mask([[0, 1], [2, 3]])
(array([0, 1, 2, 3]), array([0, 2]))
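
The relation between edges and mask in the examples above can be reproduced with a short pure-Python sketch. The helper name edges_with_mask is hypothetical, and the sketch assumes the input is already a list of (left, right) pairs in rising order.

```python
def edges_with_mask(pairs):
    """Hypothetical helper: flatten (left, right) pairs into edges plus a mask."""
    edges = []
    mask = []
    for left, right in pairs:
        if not edges or left != edges[-1]:
            # First bin, or a gap: the left edge has to be stored explicitly.
            # After a gap, the interval before it becomes an unmasked numpy bin.
            edges.append(left)
        # The real bin starts at the last edge stored so far.
        mask.append(len(edges) - 1)
        edges.append(right)
    return edges, mask


print(edges_with_mask([[0, 1], [1, 2]]))  # ([0, 1, 2], [0, 1])
print(edges_with_mask([[0, 1], [2, 3]]))  # ([0, 1, 2, 3], [0, 2])
```
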

physt.binnings module

Different binning algorithms/schemas for the histograms.

class physt.binnings.BinningBase(bins=None, numpy_bins=None, includes_right_edge=False, adaptive=False)

Bases: object

Abstract base class for binning schemas.

  • define at least one of the following properties: bins, numpy_bins (cached conversion exists)
  • if you modify bins, put _bins and _numpy_bins into proper state (None may be sufficient)
  • checking of proper bins should be done in __init__
  • if you want to support adaptive histogram, override _force_bin_existence
  • implement _update_dict to contain the binning representation
  • the constructor (and facade methods) must accept any kwargs (and ignore those that are not used).
adaptive_allowed

bool – Whether it is possible to update the bins dynamically

inconsecutive_allowed

bool – Whether it is possible to have bins with gaps

TODO

Check the last point (does it make sense?)

adapt(other)

Adapt this binning so that it contains all bins of another binning.

Parameters:other (BinningBase) –
adaptive_allowed = False
apply_bin_map(bin_map)
Parameters:bin_map (Iterator(tuple)) – The bins must be in ascending order
Returns:
Return type:BinningBase
as_fixed_width(copy=True)

Convert binning to a recipe with fixed width (if possible).

Parameters:copy (bool) – Ensure that we receive another object
Returns:
Return type:FixedWidthBinning
as_static(copy=True)

Convert binning to a static form.

Parameters:copy (bool) – Ensure that we receive another object
Returns:A new static binning with a copy of bins.
Return type:StaticBinning
bin_count

The total number of bins.

Returns:
Return type:int
bins

Bins in the wider format (as edge pairs)

Returns:bins – shape=(bin_count, 2)
Return type:np.ndarray
copy()

An identical, independent copy.

Returns:
Return type:BinningBase
first_edge

The left edge of the first bin.

Returns:
Return type:float
force_bin_existence(values)

Change schema so that there is a bin for value.

It is necessary to implement the _force_bin_existence template method.

Parameters:values (np.ndarray) – All values we want bins for.
Returns:bin_map –
  • None => There was no change in bins
  • int => The bins are only shifted (allows mass assignment)
  • Otherwise => the iterable contains tuples (old bin index, new bin index); a new bin index can occur multiple times, which corresponds to bin merging
Return type:Iterable[tuple] or None or int
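
How a caller might consume the three kinds of bin_map can be sketched as follows. The helper remap_frequencies is hypothetical, and the interpretation of the int case as an index offset is an assumption, not something confirmed by this documentation.

```python
def remap_frequencies(frequencies, bin_map, new_bin_count):
    """Hypothetical helper: move bin contents according to a bin_map."""
    if bin_map is None:
        # No change in bins.
        return list(frequencies)
    new = [0] * new_bin_count
    if isinstance(bin_map, int):
        # Assumption: the int is an offset, old bin i becomes new bin i + offset.
        for old, value in enumerate(frequencies):
            new[old + bin_map] = value
        return new
    for old, target in bin_map:
        # A repeated target index merges several old bins into one.
        new[target] += frequencies[old]
    return new


print(remap_frequencies([1, 2, 3], [(0, 0), (1, 0), (2, 1)], 2))  # [3, 3]
```
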
static from_dict(a_dict)
includes_right_edge
inconsecutive_allowed = False
is_adaptive()

Whether the binning can be adapted to include values not currently spanned.

Returns:
Return type:bool
is_consecutive(rtol=1e-05, atol=1e-08)

Whether all bins are consecutive (each bin starts where the previous one ends).

Parameters:rtol, atol – numpy tolerance parameters
Returns:
Return type:bool
is_regular(rtol=1e-05, atol=1e-08)

Whether all bins have the same width.

Parameters:rtol, atol – numpy tolerance parameters
Returns:
Return type:bool
last_edge

The right edge of the last bin.

Returns:
Return type:float
numpy_bins

Bins in the numpy format

This might not be available for inconsecutive binnings.

Returns:edges – shape=(bin_count+1,)
Return type:np.ndarray
numpy_bins_with_mask

Bins in the numpy format, including the gaps in inconsecutive binnings.

Returns:edges, mask
Return type:np.ndarray

See also

bin_utils.to_numpy_bins_with_mask

set_adaptive(value=True)

Set/unset the adaptive property of the binning.

This is available only for some of the binning types.

to_dict()

Dictionary representation of the binning schema.

This serves as a template method; please implement _update_dict.

class physt.binnings.ExponentialBinning(log_min, log_width, bin_count, includes_right_edge=True, adaptive=False, **kwargs)

Bases: physt.binnings.BinningBase

Binning schema with exponentially distributed bins.

adaptive_allowed = False
copy()

An identical, independent copy.

Returns:
Return type:BinningBase
is_regular(*args, **kwargs)

Whether all bins have the same width.

Parameters:rtol, atol – numpy tolerance parameters
Returns:
Return type:bool
numpy_bins

Bins in the numpy format

This might not be available for inconsecutive binnings.

Returns:edges – shape=(bin_count+1,)
Return type:np.ndarray
class physt.binnings.FixedWidthBinning(bin_width, bin_count=0, bin_times_min=None, min=None, includes_right_edge=False, adaptive=False, bin_shift=None, align=True, **kwargs)

Bases: physt.binnings.BinningBase

Binning schema with predefined bin width.

adaptive_allowed = True
as_fixed_width(copy=True)

Convert binning to a recipe with fixed width (if possible).

Parameters:copy (bool) – Ensure that we receive another object
Returns:
Return type:FixedWidthBinning
bin_count

The total number of bins.

Returns:
Return type:int
bin_width
copy()

An identical, independent copy.

Returns:
Return type:BinningBase
first_edge

The left edge of the first bin.

Returns:
Return type:float
is_regular(*args, **kwargs)

Whether all bins have the same width.

Parameters:rtol, atol – numpy tolerance parameters
Returns:
Return type:bool
last_edge

The right edge of the last bin.

Returns:
Return type:float
numpy_bins

Bins in the numpy format

This might not be available for inconsecutive binnings.

Returns:edges – shape=(bin_count+1,)
Return type:np.ndarray
class physt.binnings.NumpyBinning(numpy_bins, includes_right_edge=True, **kwargs)

Bases: physt.binnings.BinningBase

Binning schema working as numpy.histogram.

copy()

An identical, independent copy.

Returns:
Return type:BinningBase
numpy_bins

Bins in the numpy format

This might not be available for inconsecutive binnings.

Returns:edges – shape=(bin_count+1,)
Return type:np.ndarray
class physt.binnings.StaticBinning(bins, includes_right_edge=True, **kwargs)

Bases: physt.binnings.BinningBase

as_static(copy=True)

Convert binning to a static form.

Returns:A new static binning with a copy of bins.
Return type:StaticBinning
Parameters:copy (bool) – If True (default), a new object is returned; if False, returns itself (already satisfying conditions).
copy()

An identical, independent copy.

Returns:
Return type:BinningBase
inconsecutive_allowed = True
physt.binnings.as_binning(obj, copy=False)

Ensure that an object is a binning

Parameters:
  • obj (BinningBase or array_like) – Can be a binning, numpy-like bins or full physt bins
  • copy (bool) – If true, ensure that the returned object is independent
physt.binnings.calculate_bins(array, _=None, *args, **kwargs)

Find optimal binning from arguments.

Parameters:
  • array (array_like) – Data from which the bins should be decided (sometimes used, sometimes not)
  • _ (int or str or Callable or array_like or Iterable or BinningBase) – To-be-guessed parameter that specifies what kind of binning should be done
  • check_nan (bool) – Check for the presence of nan’s in array? Default: True
  • range (tuple) – Limit values to a range. Some of the binning methods also (subsequently) use this parameter for the bin shape.
Returns:

A two-dimensional array with pairs of bin edges (not necessarily consecutive).

Return type:

BinningBase

physt.binnings.calculate_bins_nd(array, bins=None, *args, **kwargs)

Find optimal binning from arguments (n-dimensional variant)

Usage similar to calculate_bins.

Returns:
Return type:List[BinningBase]
physt.binnings.exponential_binning(data=None, bin_count=None, range=None, **kwargs)

Construct exponential binning schema.

Parameters:
  • bin_count (Optional[int]) – Number of bins
  • range (Optional[tuple]) – (min, max)
Returns:

Return type:

ExponentialBinning

See also

numpy.logspace()

physt.binnings.fixed_width_binning(data=None, bin_width=1, range=None, includes_right_edge=False, **kwargs)

Construct fixed-width binning schema.

Parameters:
  • bin_width (float) –
  • range (Optional[tuple]) – (min, max)
  • align (Optional[float]) – Must be multiple of bin_width
Returns:

Return type:

FixedWidthBinning

physt.binnings.human_binning(data=None, bin_count=None, range=None, **kwargs)

Construct fixed-width binning schema with bins automatically optimized to human-friendly widths.

Typical widths are: 1.0, 25.0, 0.02, 500, 2.5e-7, …

Parameters:
  • bin_count (Optional[int]) – Number of bins
  • range (Optional[tuple]) – (min, max)
Returns:

Return type:

FixedWidthBinning

physt.binnings.ideal_bin_count(data, method='default')

A theoretically ideal bin count.

Parameters:
  • data (array_like or None) – Data to work on. Most methods don’t use this.
  • method (str) –
    Name of the method to apply, available values:
    • default (~sturges)
    • sqrt
    • sturges
    • doane
    • rice

    See https://en.wikipedia.org/wiki/Histogram for the description

Returns:

Number of bins, always >= 1

Return type:

int
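
For orientation, the textbook forms of three of the listed rules are sketched below. The helper textbook_bin_count is hypothetical; physt's own implementation may round or clamp differently, and the doane rule is omitted because it needs the sample skewness.

```python
import math


def textbook_bin_count(n, method="sturges"):
    """Hypothetical helper: bin count for n >= 1 data points by a textbook rule."""
    if method == "sqrt":
        count = math.ceil(math.sqrt(n))
    elif method == "sturges":
        count = math.ceil(math.log2(n)) + 1
    elif method == "rice":
        count = math.ceil(2 * n ** (1 / 3))
    else:
        raise ValueError(f"Unknown method: {method}")
    # Mirror the documented guarantee that the result is always >= 1.
    return max(count, 1)


for name in ("sqrt", "sturges", "rice"):
    print(name, textbook_bin_count(100, name))
```
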

physt.binnings.integer_binning(data=None, **kwargs)

Construct fixed-width binning schema with bins centered around integers.

Parameters:
  • range (Optional[Tuple[int]]) – min (included) and max integer (excluded) bin
  • bin_width (Optional[int]) – group “bin_width” integers into one bin (not recommended)
Returns:

Return type:

StaticBinning
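
Bins centered around integers have their edges at half-integers. A minimal sketch, assuming the range convention described above (minimum included, maximum excluded); the helper integer_bin_edges is hypothetical and ignores any data-driven range detection.

```python
def integer_bin_edges(first, last_excluded, bin_width=1):
    """Hypothetical helper: edges of bins centered on first .. last_excluded - 1."""
    bin_count = (last_excluded - first) // bin_width
    # Each edge lies half a unit below the first integer of its bin.
    return [first - 0.5 + i * bin_width for i in range(bin_count + 1)]


print(integer_bin_edges(0, 3))  # [-0.5, 0.5, 1.5, 2.5]
```
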

physt.binnings.numpy_binning(data, bins=10, range=None, *args, **kwargs)

Construct binning schema compatible with numpy.histogram

Parameters:
  • data (array_like, optional) – This is optional if both bins and range are set
  • bins (int or array_like) –
  • range (Optional[tuple]) – (min, max)
  • includes_right_edge (Optional[bool]) – default: True
Returns:

Return type:

NumpyBinning

See also

numpy.histogram()

physt.binnings.quantile_binning(data=None, bins=10, qrange=(0.0, 1.0), **kwargs)

Binning schema based on quantile ranges.

This binning finds equally spaced quantiles. This should lead to all bins having roughly the same frequencies.

Note: weights are not (yet) taken into account for calculating quantiles.

Parameters:
  • bins (sequence or Optional[int]) – Number of bins
  • qrange (Optional[tuple]) – Two floats as minimum and maximum quantile (default: 0.0, 1.0)
Returns:

Return type:

StaticBinning
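
The idea of equally spaced quantiles can be sketched without numpy as below. The helper quantile_edges is hypothetical and uses linear interpolation between sorted values, which is an assumption about the interpolation scheme rather than a statement about physt's implementation.

```python
import math


def quantile_edges(data, bins=10, qrange=(0.0, 1.0)):
    """Hypothetical helper: bin edges at equally spaced quantiles of data."""
    ordered = sorted(data)
    q_min, q_max = qrange
    edges = []
    for i in range(bins + 1):
        q = q_min + (q_max - q_min) * i / bins
        # Fractional position of the quantile in the sorted sample.
        position = q * (len(ordered) - 1)
        lower = math.floor(position)
        upper = min(lower + 1, len(ordered) - 1)
        fraction = position - lower
        edges.append(ordered[lower] + (ordered[upper] - ordered[lower]) * fraction)
    return edges


print(quantile_edges(range(101), bins=4))
```

With uniformly spread data such as range(101), the edges fall at equal distances, so each bin receives roughly the same number of entries.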

physt.binnings.static_binning(data=None, bins=None, **kwargs)

Construct static binning with whatever bins.

physt.histogram1d module

One-dimensional histograms.

class physt.histogram1d.Histogram1D(binning, frequencies=None, errors2=None, **kwargs)

Bases: physt.histogram_base.HistogramBase

One-dimensional histogram data.

The bins can be of different widths.

The bins need not be consecutive. However, some functionality may not be available for non-consecutive bins (like keeping information about underflow and overflow).

_stats

dict

These are the basic attributes that can be used in the constructor (see there). Other attributes are dynamic.

axis_name
bin_centers

Centers of all bins.

Returns:
Return type:numpy.ndarray
bin_left_edges

Left edges of all bins.

Returns:
Return type:numpy.ndarray
bin_right_edges

Right edges of all bins.

Returns:
Return type:numpy.ndarray
bin_sizes
bin_widths

Widths of all bins.

Returns:
Return type:numpy.ndarray
binning

The binning.

Note: Please, do not try to update the object itself.

Returns:
Return type:physt.binnings.BinningBase
bins

Array of all bin edges.

Returns:Wide-format [[leftedge1, rightedge1], … [leftedgeN, rightedgeN]]
Return type:numpy.ndarray
cumulative_frequencies

Cumulative frequencies.

Note: underflow values are not considered

Returns:
Return type:numpy.ndarray
fill(value, weight=1)

Update histogram with a new value.

Parameters:
  • value (float) – Value to be added.
  • weight (float, optional) – Weight assigned to the value.
Returns:

  • int – index of bin which was incremented (-1=underflow, N=overflow, None=not found)

Note: If a gap in inconsecutive bins is matched, underflow and overflow are not valid anymore.

Note: The name was selected because of the eponymous method in ROOT.

fill_n(values, weights=None, dropna=True)

Update histograms with a set of values.

Parameters:
  • values (array_like) –
  • weights (Optional[array_like]) –
  • dropna (Optional[bool]) – If true (default), all nan’s are skipped.
find_bin(value)

Index of bin corresponding to a value.

Parameters:value (float) – Value to be searched for.
Returns:index of bin to which value belongs (-1=underflow, N=overflow, None=not found - inconsecutive)
Return type:int
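
The meaning of the return values can be illustrated with a small sketch. The helper find_bin_index is hypothetical; it assumes left-closed, right-open bins and ignores the includes_right_edge setting of the binning.

```python
def find_bin_index(pairs, value):
    """Hypothetical helper: index of the [left, right) bin containing value."""
    if value < pairs[0][0]:
        return -1            # underflow
    if value >= pairs[-1][1]:
        return len(pairs)    # overflow
    for index, (left, right) in enumerate(pairs):
        if left <= value < right:
            return index
    return None              # value fell into a gap between bins


bins = [[0, 1], [2, 3]]
print(find_bin_index(bins, 2.5))  # 1
print(find_bin_index(bins, 1.5))  # None
```
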
classmethod from_xarray(arr)

Convert from xarray.Dataset

Parameters:arr (xarray.Dataset) – The data in xarray representation
inner_missed
max_edge

Right edge of the last bin.

Returns:
Return type:float
mean()

Statistical mean of all values entered into histogram.

This number is precise, because we keep the necessary data separate from bin contents.

Returns:
Return type:float
min_edge

Left edge of the first bin.

Returns:
Return type:float
numpy_bins

Bins in the format of numpy.

Returns:
Return type:numpy.ndarray
numpy_like()
overflow
select(axis, index, force_copy=False)

Alias for [] to be compatible with HistogramND.

std(ddof=0)

Standard deviation of all values entered into histogram.

This number is precise, because we keep the necessary data separate from bin contents.

Parameters:ddof (int) – Not yet used.
Returns:
Return type:float
to_dataframe()

Convert to pandas DataFrame.

This is not a lossless conversion - (under/over)flow info is lost.

Returns:
Return type:pandas.DataFrame
to_xarray()

Convert to xarray.Dataset

Returns:
Return type:xarray.Dataset
total_width

Total width of all bins.

In inconsecutive histograms, the missing intervals are not counted in.

Returns:
Return type:float
underflow
variance(ddof=0)

Statistical variance of all values entered into histogram.

This number is precise, because we keep the necessary data separate from bin contents.

Parameters:ddof (int) – Not yet used.
Returns:
Return type:float
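
Precise moments are possible because running sums are kept next to the bin contents. A minimal sketch, assuming the stored quantities are the total weight, the weighted sum of values and the weighted sum of squared values; the helper moments_from_sums is hypothetical.

```python
def moments_from_sums(total, sum_, sum2):
    """Hypothetical helper: mean and variance from running sums.

    total: sum of weights, sum_: sum of w * x, sum2: sum of w * x ** 2
    """
    mean = sum_ / total
    # Population variance (no ddof correction): E[x^2] - E[x]^2
    variance = sum2 / total - mean ** 2
    return mean, variance


# Values 1, 2, 3, 4 with unit weights: total=4, sum=10, sum2=30
print(moments_from_sums(4, 10, 30))  # (2.5, 1.25)
```

The standard deviation is the square root of the variance obtained this way.
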
physt.histogram1d.calculate_frequencies(data, binning, weights=None, validate_bins=True, already_sorted=False, dtype=None)

Get frequencies and bin errors from the data.

Parameters:
  • data (array_like) – Data items to work on.
  • binning (physt.binnings.BinningBase) – A set of bins.
  • weights (array_like, optional) – Weights of the items.
  • validate_bins (bool, optional) – If True (default), bins are validated to be in ascending order.
  • already_sorted (bool, optional) – If True, the data being entered are already sorted, no need to sort them once more.
  • dtype (Optional[type]) – Underlying type for the histogram. (If weights are specified, default is float. Otherwise long.)
Returns:

  • frequencies (numpy.ndarray) – Bin contents
  • errors2 (numpy.ndarray) – Error squares of the bins
  • underflow (float) – Weight of items smaller than the first bin
  • overflow (float) – Weight of items larger than the last bin
  • stats (dict) – { sum: …, sum2: …}

Note

Checks that the bins are in a correct order (not necessarily consecutive). Does not check for numerical overflows in bins.

physt.histogram_base module

HistogramBase - base for all histogram classes.

class physt.histogram_base.HistogramBase(binnings, frequencies=None, errors2=None, **kwargs)

Bases: object

Histogram base class.

Behaviour shared by all histogram classes.

The most important daughter classes are:

  • Histogram1D
  • HistogramND

There are also special histogram types that are modifications of these classes.

The methods you should override:

  • fill
  • fill_n (optional)
  • copy
  • _update_dict (optional)

Underlying data type is int64 / float or an explicitly specified other type (dtype).

_binnings

Iterable[BinningBase] – Schema for binning(s)

_frequencies

array_like – Bin contents

_errors2

array_like – Square errors associated with the bin contents

_meta_data

dict – All meta-data (names, user-custom values, …). Anything can be put in. When exported, all information is kept.

_dtype

np.dtype – Type of the frequencies and also errors (int64, float64 or user-overridden)

_missed

array_like – Various storage for missed values in different histogram types (1 value for multi-dimensional, 3 values for one-dimensional)

Invariants

  • Frequencies in the histogram should always be non-negative. Many operations rely on that, but it is not always enforced. (TODO: Fix this?)

See also

histogram1d, histogram_nd, special

adaptive
axis_names

Names of axes (stored in meta-data).

Returns:
Return type:tuple[str]
bin_count

Total number of bins.

Returns:
Return type:int
copy(include_frequencies=True)

Copy the histogram.

Parameters:include_frequencies (Optional[bool]) – If false, all frequencies are set to zero.
Returns:copy
Return type:HistogramBase
densities

Frequencies normalized by bin sizes.

Useful when bins are not of the same size.

Returns:
Return type:np.ndarray
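
For a one-dimensional histogram, the normalization by bin size amounts to dividing each frequency by the bin width. A minimal sketch with a hypothetical helper bin_densities:

```python
def bin_densities(frequencies, pairs):
    """Hypothetical helper: frequency divided by the width of each bin."""
    return [freq / (right - left) for freq, (left, right) in zip(frequencies, pairs)]


# The second bin is three times wider, so equal densities need 3x the content.
print(bin_densities([2, 6], [[0, 1], [1, 4]]))  # [2.0, 2.0]
```
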
dtype

Data type of the bin contents.

Returns:
Return type:np.dtype
errors

Bin errors.

Returns:
Return type:np.ndarray
errors2

Squares of the bin errors.

Returns:
Return type:np.ndarray
fill(value, weight=1, **kwargs)

Add a value.

Abstract method - to be implemented in daughter classes.

Parameters:
  • value – Value to be added. Can be scalar or array depending on the histogram type.
  • weight (Optional) – Weight of the value

Note

May change the dtype if weight is set

fill_n(values, weights=None, **kwargs)

Add more values at once.

This (default) implementation uses a simple loop to add values using the fill method. Actually, it is used in neither Histogram1D nor HistogramND.

Parameters:
  • values (Iterable) – Values to add
  • weights (Optional[Iterable]) – Optional weights to assign to each value

Note

This method should be overloaded with a more efficient one.

May change the dtype if weight is set.

frequencies

Frequencies (values, contents) of the histogram bins.

Returns:Array of bin frequencies
Return type:np.ndarray
classmethod from_dict(a_dict)

Create an instance from a dictionary.

If customization is necessary, override the _from_dict_kwargs template method, not this one.

Parameters:a_dict (dict) –
Returns:
Return type:HistogramBase
has_same_bins(other)

Whether two histograms share the same binning.

Returns:
Return type:bool
is_adaptive()

Whether the binning can be changed with operations.

Returns:
Return type:bool
merge_bins(amount=None, min_frequency=None, axis=None, inplace=False)

Reduce the number of bins and add their content.

Parameters:
  • amount (int) – How many adjacent bins to join together.
  • min_frequency (float) – Try to have at least this value in each bin (this is not enforced, e.g. for minima between high bins)
  • axis (int or None) – On which axis to do this (None => all)
  • inplace – Whether to modify this histogram or return a new one
Returns:

if inplace, return

Return type:

HistogramBase or None

meta_data

A dictionary of non-numerical information about the histogram.

It contains several pre-defined ones, but you can add any other. These are preserved when saving and also in operations.

Returns:
Return type:dict
missed

Total number (weight) of entries that missed the bins.

Returns:
Return type:float
name

Name of the histogram (stored in meta-data).

Returns:
Return type:str
ndim

Dimensionality of histogram’s data.

i.e. the number of axes along which we bin the values.

Returns:
Return type:int
normalize(inplace=False, percent=False)

Normalize the histogram, so that the total weight is equal to 1.

Parameters:
  • inplace (bool) – If True, updates itself. If False (default), returns copy
  • percent (bool) – If True, normalizes to percent instead of 1. Default: False
Returns:

either modified copy or self

Return type:

HistogramBase

See also

densities(), HistogramND.partial_normalize()
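
The effect of the percent flag on the bin contents can be sketched as follows. The helper normalized is hypothetical and works on a plain list of frequencies only; it does not touch errors or meta-data.

```python
def normalized(frequencies, percent=False):
    """Hypothetical helper: scale frequencies so that they sum to 1 (or 100)."""
    total = sum(frequencies)
    scale = 100.0 if percent else 1.0
    return [scale * freq / total for freq in frequencies]


print(normalized([1, 3]))                # [0.25, 0.75]
print(normalized([1, 3], percent=True))  # [25.0, 75.0]
```
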

plot

Proxy to plotting.

This attribute is a special proxy to plotting. In the most simple cases, it can be used as a method. For more sophisticated use, see the documentation for physt.plotting package.

Returns:
Return type:physt.plotting.PlottingProxy
set_adaptive(value=True)

Change the histogram binning to (non)adaptive.

This requires binning in all dimensions to allow this.

Parameters:value (bool) –
set_dtype(value, check=True)

Change data type of the bin contents.

Allowed conversions:

  • from integral to float types
  • between the same category of type (float/integer)
  • from float types to integer if weights are trivial

Parameters:
  • value (np.dtype or something convertible to it.) –
  • check (bool) – If True (default), all values are checked against the limits
shape

Shape of histogram’s data.

Returns:One-element tuple with the number of bins along each axis.
Return type:tuple[int]
title

Title of the histogram to be displayed when plotted (stored in meta-data).

If not specified, defaults to name.

Returns:
Return type:str
to_dict()

Dictionary with all data in the histogram.

This is used for export into various formats (e.g. JSON). If a descendant class needs to update the dictionary in some way (put some more information), override the _update_dict method.

Returns:
Return type:collections.OrderedDict
to_json(path=None, **kwargs)

Convert to JSON representation.

Parameters:path (Optional[str]) – Where to write the JSON.
Returns:The JSON representation.
Return type:str
total

Total number (sum of weights) of entries excluding underflow and overflow.

Returns:
Return type:float

physt.histogram_nd module

Multi-dimensional histograms.

class physt.histogram_nd.Histogram2D(binnings, frequencies=None, **kwargs)

Bases: physt.histogram_nd.HistogramND

Specialized 2D variant of the general HistogramND class.

In contrast to general HistogramND, it is plottable.

T

Histogram with swapped axes.

Returns:
Return type:Histogram2D - a copy with swapped axes
numpy_like()
partial_normalize(axis=0, inplace=False)

Normalize in rows or columns.

Parameters:
  • axis (int or str) – Along which axis to sum (numpy-sense)
  • inplace (bool) – Update the object itself
Returns:

hist

Return type:

Histogram2D

class physt.histogram_nd.HistogramND(dimension, binnings, frequencies=None, **kwargs)

Bases: physt.histogram_base.HistogramBase

Multi-dimensional histogram data.

accumulate(axis)

Calculate cumulative frequencies along a certain axis.

Parameters:axis (int or str) –
Returns:new_hist – Histogram of the same type & size
Return type:HistogramND or Histogram2D
bin_sizes
binnings

The binnings.

Note: Please, do not try to update the objects themselves.

Returns:
Return type:list[physt.binnings.BinningBase]
bins

Matrix of bins.

Returns:Two sets of array bins.
Return type:list[np.ndarray]
fill(value, weight=1, **kwargs)

Add a value.

Abstract method - to be implemented in daughter classes.

Parameters:
  • value – Value to be added. Can be scalar or array depending on the histogram type.
  • weight (Optional) – Weight of the value

Note

May change the dtype if weight is set

fill_n(values, weights=None, dropna=True, columns=False)

Add more values at once.

Parameters:
  • values (array_like) – Values to add. Can be array of shape (count, ndim) or array of shape (ndim, count) [use columns=True] or something convertible to it
  • weights (array_like) – Weights for values (optional)
  • dropna (bool) – Whether to remove NaN values. If False and such value is met, exception is thrown.
  • columns (bool) – Signal that the data are transposed (in columns, instead of rows). This allows passing a list of arrays in values.
find_bin(value, axis=None)

Index(indices) of bin corresponding to a value.

Parameters:
  • value (array_like) – Value with dimensionality equal to histogram
  • axis (Optional[int]) – If set, find the bin along this axis. Otherwise, find bins along all axes. None = outside the bins
Returns:

If axis is specified, a number. Otherwise, a tuple. If not available, None.

Return type:

int or tuple or None

get_bin_centers(axis=None)
get_bin_edges(axis=None)
get_bin_left_edges(axis=None)
get_bin_right_edges(axis=None)
get_bin_widths(axis=None)
numpy_bins

Numpy-like bins (if available)

Returns:
Return type:list[np.ndarray]
projection(*axes, **kwargs)

Reduce dimensionality by summing along axis/axes.

Parameters:
  • axes (Iterable[int or str]) – List of axes for the new histogram. Could be either numbers or names. Must contain at least one axis.
  • name (Optional[str] # TODO: Check) – Name for the projected histogram (default: same)
  • type (Optional[type] # TODO: Check) – If set, predefined class for the projection
Returns:

Return type:

HistogramND or Histogram2D or Histogram1D (or others in special cases)
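
Summing along the dropped axes can be illustrated for the 2D case with nested lists. The helper project_2d is hypothetical; it takes the index of the axis to keep, in line with the axes argument listing the axes of the new histogram.

```python
def project_2d(frequencies, keep_axis):
    """Hypothetical helper: sum a 2D table of frequencies down to one axis."""
    if keep_axis == 0:
        # Keep the first axis: sum over the second one (row sums).
        return [sum(row) for row in frequencies]
    if keep_axis == 1:
        # Keep the second axis: sum over the first one (column sums).
        return [sum(column) for column in zip(*frequencies)]
    raise ValueError("keep_axis must be 0 or 1")


table = [[1, 2], [3, 4]]
print(project_2d(table, 0))  # [3, 7]
print(project_2d(table, 1))  # [4, 6]
```
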

select(axis, index, force_copy=False)

Select in an axis.

Parameters:
  • axis (int or str) – Axis, in which we select.
  • index (int or slice) – Index of bin (as in numpy).
  • force_copy (bool) – If True, an identity slice forces a copy to be made.
Returns:

Return type:

HistogramND or Histogram2D or Histogram1D (or others in special cases)

total_size

The total size of the bin space.

Returns:
Return type:float

Note

Perhaps not optimized, but should work also with transformed axes

physt.histogram_nd.calculate_frequencies(data, ndim, binnings, weights=None, dtype=None)

Get frequencies and bin errors from the data (n-dimensional variant).

Parameters:
  • data (array_like) – 2D array with ndim columns and row for each entry.
  • ndim (int) – Dimensionality of the data.
  • binnings – Binnings to apply in all axes.
  • weights (Optional[array_like]) – 1D array of weights to assign to values. (If present, must have same length as the number of rows.)
  • dtype (Optional[type]) – Underlying type for the histogram. (If weights are specified, default is float. Otherwise int64.)
Returns:

  • frequencies (array_like)
  • errors2 (array_like)
  • missing (scalar[dtype])

physt.io module

Input and output for histograms.

JSON format is included by default. Other formats are/will be available as modules.

Note: When implementing, try to work with a JSON-like
tree and reuse create_from_dict and HistogramBase.to_dict.
exception physt.io.VersionError

Bases: Exception

physt.io.create_from_dict(data, format_name)

Once a dict is created from the source data, turn it into a histogram.

Parameters:data (dict) – Parsed JSON-like tree.
Returns:histogram – A histogram (of any dimensionality)
Return type:HistogramBase
physt.io.require_compatible_version(compatible_version, word='File')

Check that compatible version of input data is not too new.

physt.special module

Transformed histograms.

These histograms use a transformation from input values to bins in a different coordinate system.

There are three basic classes:

  • PolarHistogram
  • CylindricalHistogram
  • SphericalHistogram

Apart from these, there are their projections into lower dimensions.

And of course, it is possible to re-use the general transforming functionality by adding TransformedHistogramMixin among the custom histogram class superclasses.

class physt.special.AzimuthalHistogram(binning, frequencies=None, errors2=None, **kwargs)

Bases: physt.histogram1d.Histogram1D

Projection of polar histogram to 1D with respect to phi.

This is a special case of a 1D histogram with transformed coordinates.

fill(value, weight=1)

Update histogram with a new value.

Parameters:
  • value (float) – Value to be added.
  • weight (float, optional) – Weight assigned to the value.
Returns:

  • int – index of bin which was incremented (-1=underflow, N=overflow, None=not found)

Note: If a gap in inconsecutive bins is matched, underflow and overflow are not valid anymore.

Note: The name was selected because of the eponymous method in ROOT.

fill_n(values, weights=None, dropna=True)

Update histograms with a set of values.

Parameters:
  • values (array_like) –
  • weights (Optional[array_like]) –
  • dropna (Optional[bool]) – If true (default), all nan’s are skipped.
class physt.special.CylinderSurfaceHistogram(binnings, frequencies=None, radius=1, **kwargs)

Bases: physt.special.TransformedHistogramMixin, physt.histogram_nd.HistogramND

2D histogram in coordinates on cylinder surface.

This is a special case of a 2D histogram with transformed coordinates:

  • phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range
  • z as the last direction without modification, in (-inf, +inf) range

radius

float – The radius of the surface. Useful for plotting

radius

Radius of the cylindrical surface.

Useful for calculating densities.

Returns:
Return type:float
class physt.special.CylindricalHistogram(binnings, frequencies=None, **kwargs)

Bases: physt.special.TransformedHistogramMixin, physt.histogram_nd.HistogramND

3D histogram in cylindrical coordinates.

This is a special case of a 3D histogram with transformed coordinates:

  • r as radius projection to xy plane in the (0, +inf) range
  • phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range
  • z as the last direction without modification, in (-inf, +inf) range

bin_sizes
projection(*args, **kwargs)

Projection to lower-dimensional histogram.

The inheriting class should implement the _projection_class_map class attribute to suggest class for the projection. If the arguments don’t match any of the map keys, HistogramND is used.

classmethod transform(value)

Convert cartesian (general) coordinates into internal ones.

Parameters:value (array_like) – This method should accept both scalars and numpy arrays. If multiple values are to be transformed, it should be of (nvalues, ndim) shape.
Returns:
Return type:float or array_like
class physt.special.DirectionalHistogram(binnings, frequencies=None, radius=1, **kwargs)

Bases: physt.special.TransformedHistogramMixin, physt.histogram_nd.HistogramND

2D histogram in spherical coordinates.

This is a special case of a 2D histogram with transformed coordinates:

  • theta as angle between z axis and the vector, in the (0, 2*pi) range
  • phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range

bin_sizes
radius

Radius of the surface.

Useful for calculating densities.

class physt.special.PolarHistogram(binnings, frequencies=None, **kwargs)

Bases: physt.special.TransformedHistogramMixin, physt.histogram_nd.HistogramND

2D histogram in polar coordinates.

This is a special case of a 2D histogram with transformed coordinates:

  • r as radius in the (0, +inf) range
  • phi as azimuthal angle in the (0, 2*pi) range

bin_sizes
classmethod transform(value)

Convert cartesian (general) coordinates into internal ones.

Parameters:value (array_like) – This method should accept both scalars and numpy arrays. If multiple values are to be transformed, it should be of (nvalues, ndim) shape.
Returns:
Return type:float or array_like
class physt.special.RadialHistogram(binning, frequencies=None, errors2=None, **kwargs)

Bases: physt.histogram1d.Histogram1D

Projection of polar histogram to 1D with respect to radius.

This is a special case of a 1D histogram with transformed coordinates.

bin_sizes
fill(value, weight=1)

Update histogram with a new value.

Parameters:
  • value (float) – Value to be added.
  • weight (float, optional) – Weight assigned to the value.
Returns:int – index of bin which was incremented (-1=underflow, N=overflow, None=not found)

Note: If a gap in non-consecutive bins is matched, underflow and overflow are no longer valid.

Note: The name was selected because of the eponymous method in ROOT.
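
The return convention (-1 for underflow, N for overflow) can be sketched with the standard library for the simple case of consecutive bins. The helper names are hypothetical, and the half-open [left, right) bins are an assumption; physt's edge handling may differ.

```python
from bisect import bisect_right


def find_bin_index(edges, value):
    """Return the bin index for value, given sorted consecutive bin edges.

    -1 means underflow; N = len(edges) - 1 means overflow.
    Assumes half-open bins [left, right).
    """
    n_bins = len(edges) - 1
    if value < edges[0]:
        return -1
    if value >= edges[-1]:
        return n_bins
    return bisect_right(edges, value) - 1


edges = [0.0, 1.0, 2.0, 4.0]
frequencies = [0.0] * (len(edges) - 1)


def fill(value, weight=1):
    """Add weight to the matching bin and return its index."""
    index = find_bin_index(edges, value)
    if 0 <= index < len(frequencies):
        frequencies[index] += weight
    return index


indices = [fill(v) for v in (0.5, 3.0, -1.0, 4.0)]
```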

fill_n(values, weights=None, dropna=True)

Update histograms with a set of values.

Parameters:
  • values (array_like) –
  • weights (Optional[array_like]) –
  • dropna (Optional[bool]) – If true (default), all NaN values are skipped.
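
As a rough illustration of what skipping NaN values involves, the sketch below removes each NaN together with its weight, so that values and weights stay aligned. The helper drop_nan is hypothetical and is not part of physt.

```python
import math


def drop_nan(values, weights=None):
    """Remove NaN values (and their weights) before filling."""
    if weights is None:
        weights = [1] * len(values)
    kept = [(v, w) for v, w in zip(values, weights) if not math.isnan(v)]
    return [v for v, _ in kept], [w for _, w in kept]


values, weights = drop_nan([0.5, float("nan"), 2.5], weights=[1.0, 3.0, 2.0])
```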
class physt.special.SphericalHistogram(binnings, frequencies=None, **kwargs)

Bases: physt.special.TransformedHistogramMixin, physt.histogram_nd.HistogramND

3D histogram in spherical coordinates.

This is a special case of a 3D histogram with transformed coordinates:

  • r as radius in the (0, +inf) range
  • theta as angle between z axis and the vector, in the (0, pi) range
  • phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range

bin_sizes
classmethod transform(value)

Convert cartesian (general) coordinates into internal ones.

Parameters:value (array_like) – This method should accept both scalars and numpy arrays. If multiple values are to be transformed, it should be of (nvalues, ndim) shape.
Returns:
Return type:float or array_like
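
A minimal NumPy sketch of a cartesian-to-spherical conversion following the convention above. The function name to_spherical is hypothetical and this is not necessarily how physt computes it; theta is taken as the angle from the z axis.

```python
import numpy as np


def to_spherical(value):
    """Convert cartesian (x, y, z) to spherical (r, theta, phi).

    Hypothetical helper. Accepts one point of shape (3,) or an array
    of shape (nvalues, 3).
    """
    value = np.asarray(value, dtype=float)
    x, y, z = value[..., 0], value[..., 1], value[..., 2]
    xy = np.hypot(x, y)                               # distance from the z axis
    result = np.empty_like(value)
    result[..., 0] = np.hypot(xy, z)                  # r
    result[..., 1] = np.arctan2(xy, z)                # theta: angle from the z axis
    result[..., 2] = np.arctan2(y, x) % (2 * np.pi)   # phi wrapped to [0, 2*pi)
    return result


points = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
sph = to_spherical(points)
```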
class physt.special.TransformedHistogramMixin

Bases: object

Histogram with non-cartesian (or otherwise transformed) axes.

This is a mixin, providing transform-aware find_bin, fill and fill_n.

When implementing, you are required to provide the following:

  • a transform method to convert rectangular (cartesian) coordinates into internal ones (preferably a classmethod)
  • a bin_sizes property

In certain cases, you may want to have default axis names + projections. Look at PolarHistogram / SphericalHistogram / CylindricalHistogram as an example.

bin_sizes
fill(value, weight=1, transformed=False)
fill_n(values, weights=None, dropna=True, transformed=False)
find_bin(value, axis=None, transformed=False)
Parameters:
  • value (array_like) – Value with dimensionality equal to histogram.
  • transformed (bool) – If true, the value is already transformed and has same axes as the bins.
projection(*axes, **kwargs)

Projection to lower-dimensional histogram.

The inheriting class should implement the _projection_class_map class attribute to suggest the class for the projection. If the arguments don’t match any of the map keys, HistogramND is used.

classmethod transform(value)

Convert cartesian (general) coordinates into internal ones.

Parameters:value (array_like) – This method should accept both scalars and numpy arrays. If multiple values are to be transformed, it should be of (nvalues, ndim) shape.
Returns:
Return type:float or array_like
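
The mixin pattern described above can be sketched in plain Python. All class names here are hypothetical stand-ins rather than physt classes; the sketch only shows how a transformed flag can decide whether transform is applied before delegating to the base class.

```python
import math


class TransformAwareMixin:
    """Hypothetical mixin: transforms values before delegating to the base class."""

    @classmethod
    def transform(cls, value):
        raise NotImplementedError("Subclasses must provide transform().")

    def fill(self, value, weight=1, transformed=False):
        if not transformed:
            value = self.transform(value)
        # The next class in the MRO performs the actual fill.
        return super().fill(value, weight=weight)


class RecordingHistogram:
    """Hypothetical stand-in for a histogram base class; it only records fills."""

    def __init__(self):
        self.filled = []

    def fill(self, value, weight=1):
        self.filled.append((value, weight))
        return len(self.filled) - 1


class PolarRecorder(TransformAwareMixin, RecordingHistogram):
    @classmethod
    def transform(cls, value):
        x, y = value
        return (math.hypot(x, y), math.atan2(y, x) % (2 * math.pi))


h = PolarRecorder()
h.fill((3.0, 4.0))                     # cartesian input, transformed first
h.fill((5.0, 0.0), transformed=True)   # already (r, phi), stored as given
```

Listing the mixin first in the bases mirrors the order shown for the classes in this module, so the mixin's fill runs before the base class's.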
physt.special.cylindrical_histogram(data=None, rho_bins='numpy', phi_bins=16, z_bins='numpy', transformed=False, *args, **kwargs)

Facade construction function for the CylindricalHistogram.

physt.special.polar_histogram(xdata, ydata, radial_bins='numpy', phi_bins=16, transformed=False, *args, **kwargs)

Facade construction function for the PolarHistogram.

Parameters:
  • transformed (bool) –
  • phi_range (Optional[tuple]) –
  • range
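
Conceptually, a polar histogram of (xdata, ydata) amounts to converting the points to (r, phi) and binning them in two dimensions. The sketch below does this with plain NumPy under stated assumptions: 10 radial bins from 0 to the largest radius and 16 phi bins over (0, 2*pi). It does not reproduce how physt chooses its 'numpy' radial bins.

```python
import numpy as np

rng = np.random.default_rng(0)
xdata = rng.normal(size=1000)
ydata = rng.normal(size=1000)

# Cartesian -> polar
r = np.hypot(xdata, ydata)
phi = np.arctan2(ydata, xdata) % (2 * np.pi)

# 2D binning in (r, phi); bin counts and ranges are assumptions for illustration
frequencies, r_edges, phi_edges = np.histogram2d(
    r, phi, bins=(10, 16), range=((0.0, r.max()), (0.0, 2 * np.pi))
)
```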
physt.special.spherical_histogram(data=None, radial_bins='numpy', theta_bins=16, phi_bins=16, transformed=False, *args, **kwargs)

Facade construction function for the SphericalHistogram.

Module contents

physt

P(i/y)thon h(i/y)stograms. Inspired by (and based on) numpy.histogram, but designed for humans(TM) on steroids(TM).

(C) Jan Pipek, 2016-8, MIT licence. See https://github.com/janpipek/physt

physt.h(data, bins=10, *args, **kwargs)

Facade function to create n-dimensional histograms.

3D variant of this function is also aliased as “h3”.

Parameters:
  • data (array_like) – Container of all the values
  • bins (Any) –
  • weights (array_like, optional) – (as numpy.histogram)
  • dropna (bool) – whether to remove NaN values from the data before histogramming
  • name (str) – name of the histogram
  • axis_names (Iterable[str]) – names of the variables on the individual axes
  • adaptive – whether the bins should be updated when new non-fitting values are filled
  • dtype (Optional[type]) – Underlying type for the histogram. If weights are specified, default is float. Otherwise int64
  • dim (int) – Dimension - necessary if you are creating an empty adaptive histogram
Returns:
Return type:physt.histogram_nd.HistogramND

See also

numpy.histogramdd()
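
For orientation, this is what the referenced numpy.histogramdd returns for three-dimensional data. The example uses NumPy only; the random data and the bin counts are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(size=(500, 3))   # 500 points, 3 dimensions

# One bin count per axis; edges is a list with one edge array per axis
frequencies, edges = np.histogramdd(data, bins=(4, 5, 6))
```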

physt.h1(data, bins=None, *args, **kwargs)

Facade function to create 1D histograms.

This proceeds in three steps:

  1) Based on magical parameter bins, construct bins for the histogram
  2) Calculate frequencies for the bins
  3) Construct the histogram object itself

Guiding principle: parameters understood by numpy.histogram should also be understood by physt.histogram and should result in a Histogram1D object with (h.numpy_bins, h.frequencies) same as the numpy.histogram output. Additional functionality is a bonus.

This function is also aliased as “h1”.

Parameters:
  • data (array_like, optional) – Container of all the values (tuple, list, np.ndarray, pd.Series)
  • bins (int or sequence of scalars or callable or str, optional) – If iterable => the bins themselves; if int => number of bins for default binning; if callable => use binning method (+ args, kwargs); if string => use named binning method (+ args, kwargs)
  • weights (array_like, optional) – (as numpy.histogram)
  • keep_missed (Optional[bool]) – store statistics about how many values were lower than limits and how many higher than limits (default: True)
  • dropna (bool) – whether to remove NaN values from the data before histogramming
  • name (str) – name of the histogram
  • axis_name (str) – name of the variable on x axis
  • adaptive (bool) – whether we want the bins to be modifiable (useful for continuous filling of a priori unknown data)
  • dtype (type) – customize underlying data type: default int64 (without weight) or float (with weights)
  • Other numpy.histogram parameters are excluded; see the methods of the Histogram1D class itself.
Returns:
Return type:physt.histogram1d.Histogram1D

See also

numpy.histogram()
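
The three steps can be sketched in plain Python. This is an illustration under simplifying assumptions, not physt's code: only the int and iterable forms of bins are handled (callable and string forms are omitted), and SimpleHistogram, make_edges, count and simple_h1 are hypothetical names.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SimpleHistogram:
    """Hypothetical stand-in for a histogram object."""
    edges: list
    frequencies: list


def make_edges(data, bins):
    """Step 1: turn the bins argument into edges (int => equal-width bins)."""
    if isinstance(bins, int):
        low, high = min(data), max(data)
        width = (high - low) / bins
        return [low + i * width for i in range(bins)] + [high]
    return list(bins)   # iterable => the edges themselves


def count(data, edges):
    """Step 2: count values per bin; the last bin includes its right edge."""
    frequencies = [0] * (len(edges) - 1)
    for value in data:
        if edges[0] <= value < edges[-1]:
            frequencies[bisect_right(edges, value) - 1] += 1
        elif value == edges[-1]:
            frequencies[-1] += 1
    return frequencies


def simple_h1(data, bins=10):
    """Step 3: assemble the histogram object."""
    edges = make_edges(data, bins)
    return SimpleHistogram(edges, count(data, edges))


hist = simple_h1([1, 2, 2, 3, 5], bins=[0, 2, 4, 6])
```

Including the right edge in the last bin follows numpy.histogram's convention.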

physt.h2(data1, data2, bins=10, *args, **kwargs)

Facade function to create 2D histograms.

For implementation and parameters, see histogramdd.

This function is also aliased as “h2”.

Returns:
Return type:physt.histogram_nd.Histogram2D

See also

numpy.histogram2d(), histogramdd()

physt.h3(data, *args, **kwargs)

Facade function to create 3D histograms.

Parameters:data (array_like or list[array_like] or tuple[array_like]) – Can be a single array (with three columns) or three different arrays (for each component)
Returns:
Return type:physt.histogram_nd.HistogramND
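
The two accepted input forms can be normalized to a single (nvalues, 3) array before binning. The helper below is a hypothetical sketch, not physt's logic; in particular, it treats any list or tuple of three 1-D sequences as three component arrays, which is an assumption.

```python
import numpy as np


def as_three_columns(data):
    """Return an (nvalues, 3) array from either accepted input form."""
    if isinstance(data, (list, tuple)) and len(data) == 3:
        columns = [np.asarray(component, dtype=float) for component in data]
        # Assumption: three 1-D sequences are the x, y and z components.
        if all(column.ndim == 1 for column in columns):
            return np.column_stack(columns)
    array = np.asarray(data, dtype=float)
    if array.ndim != 2 or array.shape[1] != 3:
        raise ValueError("Expected an (nvalues, 3) array or three 1-D arrays.")
    return array


from_components = as_three_columns(([1, 2], [3, 4], [5, 6]))
from_array = as_three_columns(np.zeros((4, 3)))
```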
physt.histogram(data, bins=None, *args, **kwargs)

Facade function to create 1D histograms.

This proceeds in three steps:

  1) Based on magical parameter bins, construct bins for the histogram
  2) Calculate frequencies for the bins
  3) Construct the histogram object itself

Guiding principle: parameters understood by numpy.histogram should also be understood by physt.histogram and should result in a Histogram1D object with (h.numpy_bins, h.frequencies) same as the numpy.histogram output. Additional functionality is a bonus.

This function is also aliased as “h1”.

Parameters:
  • data (array_like, optional) – Container of all the values (tuple, list, np.ndarray, pd.Series)
  • bins (int or sequence of scalars or callable or str, optional) – If iterable => the bins themselves; if int => number of bins for default binning; if callable => use binning method (+ args, kwargs); if string => use named binning method (+ args, kwargs)
  • weights (array_like, optional) – (as numpy.histogram)
  • keep_missed (Optional[bool]) – store statistics about how many values were lower than limits and how many higher than limits (default: True)
  • dropna (bool) – whether to remove NaN values from the data before histogramming
  • name (str) – name of the histogram
  • axis_name (str) – name of the variable on x axis
  • adaptive (bool) – whether we want the bins to be modifiable (useful for continuous filling of a priori unknown data)
  • dtype (type) – customize underlying data type: default int64 (without weight) or float (with weights)
  • Other numpy.histogram parameters are excluded; see the methods of the Histogram1D class itself.
Returns:
Return type:physt.histogram1d.Histogram1D

See also

numpy.histogram()
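
As a reference point for the guiding principle, this is the numpy.histogram output that a matching Histogram1D would be expected to reproduce in h.frequencies and h.numpy_bins. The example uses NumPy only; the data and range are arbitrary.

```python
import numpy as np

data = [1.0, 2.0, 2.5, 3.0, 7.0]

# numpy.histogram returns the counts first, then the bin edges
frequencies, edges = np.histogram(data, bins=3, range=(0.0, 9.0))
```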

physt.histogram2d(data1, data2, bins=10, *args, **kwargs)

Facade function to create 2D histograms.

For implementation and parameters, see histogramdd.

This function is also aliased as “h2”.

Returns:
Return type:physt.histogram_nd.Histogram2D

See also

numpy.histogram2d(), histogramdd()

physt.histogramdd(data, bins=10, *args, **kwargs)

Facade function to create n-dimensional histograms.

3D variant of this function is also aliased as “h3”.

Parameters:
  • data (array_like) – Container of all the values
  • bins (Any) –
  • weights (array_like, optional) – (as numpy.histogram)
  • dropna (bool) – whether to remove NaN values from the data before histogramming
  • name (str) – name of the histogram
  • axis_names (Iterable[str]) – names of the variables on the individual axes
  • adaptive – whether the bins should be updated when new non-fitting values are filled
  • dtype (Optional[type]) – Underlying type for the histogram. If weights are specified, default is float. Otherwise int64
  • dim (int) – Dimension - necessary if you are creating an empty adaptive histogram
Returns:
Return type:physt.histogram_nd.HistogramND

See also

numpy.histogramdd()
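
The weights and dropna parameters can be pictured as preprocessing before a call to numpy.histogramdd. In this NumPy-only sketch, dropping every row that contains a NaN (along with its weight) is an assumption about what clearing NaN values means for n-dimensional data.

```python
import numpy as np

data = np.array([[0.1, 0.2], [0.8, np.nan], [0.9, 0.7], [0.4, 0.6]])
weights = np.array([1.0, 5.0, 2.0, 0.5])

# Keep only rows without any NaN, and the weights that belong to them
keep = ~np.isnan(data).any(axis=1)

frequencies, edges = np.histogramdd(
    data[keep], bins=(2, 2), range=((0.0, 1.0), (0.0, 1.0)), weights=weights[keep]
)
```

With weights, each bin holds the sum of the weights of its points, so the result is a float array.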