flippers.summary
- flippers.summary(L, polarities, digits=3, normalize=False)
Calculate summary statistics for the given weak label matrix and polarities.
- Parameters:
L (pd.DataFrame) – Weak label DataFrame of shape (n_samples, n_weak).
polarities (ndarray | List) – 1D array or list of size n_weak containing the polarities of each weak label.
digits (int) – Number of digits to round the output statistics to. Default 3.
normalize (bool) – When True, reports overlaps, matches, and conflicts as a ratio of each weak label's coverage instead of as a ratio of all samples. Default False.
- Returns:
DataFrame of shape (n_weak, n_summaries) containing the following columns:
"polarity": The polarity of each weak label.
"coverage": The ratio of samples to which each weak label is assigned.
"confidence": The average confidence level of the assigned weak labels.
"overlaps": The ratio of samples where the weak label is assigned alongside at least one other weak label.
"matches": The ratio of samples where the weak label is assigned alongside at least one other weak label of the same polarity.
"conflicts": The ratio of samples where the weak label is assigned alongside at least one other weak label of a different polarity.
- Return type:
pd.DataFrame
Example
>>> import pandas as pd
>>> import flippers
>>> L = pd.DataFrame([[0, 1, 0], [1, 0, 1], [0, 0, 0], [1, 1, 1]])
>>> polarities = [0, 1, 1]
>>> flippers.summary(L, polarities)
   polarity  coverage  confidence  overlaps  matches  conflicts
0         0       0.5         1.0      0.50     0.00       0.50
1         1       0.5         1.0      0.25     0.25       0.25
2         1       0.5         1.0      0.50     0.25       0.50
>>> flippers.summary(L, polarities, normalize=True)
   polarity  coverage  confidence  overlaps  matches  conflicts
0         0       0.5         1.0       1.0      0.0        1.0
1         1       0.5         1.0       0.5      0.5        0.5
2         1       0.5         1.0       1.0      0.5        1.0
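To make the column definitions concrete, the following is a minimal sketch of how these statistics could be computed with NumPy and pandas. It is not the flippers implementation: the function name summary_sketch is hypothetical, and it assumes that a nonzero entry in L means the weak label was assigned to that sample and that "confidence" is the mean of those nonzero entries. Under those assumptions it reproduces the numbers in the example above.

```python
import numpy as np
import pandas as pd


def summary_sketch(L, polarities, digits=3, normalize=False):
    """Hypothetical re-derivation of the summary columns (not the flippers source)."""
    values = L.to_numpy(dtype=float)
    polarities = np.asarray(polarities)
    # Assumption: a nonzero entry means the weak label was assigned to the sample.
    assigned = values != 0

    coverage = assigned.mean(axis=0)
    # Mean of the assigned entries per weak label (NaN if it is never assigned).
    with np.errstate(invalid="ignore", divide="ignore"):
        confidence = values.sum(axis=0) / assigned.sum(axis=0)

    # Number of weak labels assigned to each sample, shape (n_samples, 1).
    n_assigned = assigned.sum(axis=1, keepdims=True)
    # n_same[i, j]: labels assigned to sample i whose polarity equals that of
    # weak label j (this count includes label j itself when it is assigned).
    same_polarity = polarities[:, None] == polarities[None, :]
    n_same = assigned.astype(int) @ same_polarity.astype(int)

    stats = pd.DataFrame(
        {
            "polarity": polarities,
            "coverage": coverage,
            "confidence": confidence,
            "overlaps": (assigned & (n_assigned > 1)).mean(axis=0),
            "matches": (assigned & (n_same > 1)).mean(axis=0),
            "conflicts": (assigned & (n_assigned - n_same > 0)).mean(axis=0),
        },
        index=L.columns,
    )
    if normalize:
        # Express the three ratios relative to each label's own coverage.
        cols = ["overlaps", "matches", "conflicts"]
        stats[cols] = stats[cols].div(stats["coverage"].replace(0, np.nan), axis=0)
    return stats.round(digits)
```

Calling summary_sketch(L, polarities) on the example matrix gives overlaps of 0.5, 0.25, and 0.5, and passing normalize=True divides those by the coverage of 0.5 to give 1.0, 0.5, and 1.0.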