flippers.models.Voter

class flippers.models.Voter(polarities, cardinality=0)

Bases: _Model

Basic model that predicts from the sum of votes for each class, optionally weighted.

Parameters:
  • polarities (ndarray | List) – List that maps weak labels to polarities, size n_weak.

  • cardinality (int) –

    Number of possible label values.

    If unspecified, it will be inferred from the maximum value in polarities.

Example

>>> from flippers.models import Voter
>>> polarities = [1, 0, 1, 1]
>>> cardinality = 2
>>> model = Voter(polarities, cardinality)
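To make the vote sum concrete, here is a minimal NumPy sketch that does not call flippers itself. It assumes that L is a binary matrix (1 where a weak label fires, 0 where it abstains), that polarities[j] is the class column j votes for, and that an unspecified cardinality is inferred as max(polarities) + 1. The helper name count_votes is hypothetical.

```python
import numpy as np


def count_votes(L, polarities, cardinality=0):
    """Hypothetical helper: unweighted vote counts per class.

    Assumes L is binary (1 = weak label fires, 0 = abstain) and that
    polarities[j] is the class voted for by column j.
    """
    L = np.asarray(L)
    polarities = np.asarray(polarities)
    if cardinality == 0:
        # Assumption: infer the number of classes from the largest polarity.
        cardinality = int(polarities.max()) + 1
    # One-hot encode polarities: shape (n_weak, cardinality).
    polarity_matrix = np.eye(cardinality, dtype=int)[polarities]
    # Each row of the result holds the number of votes per class.
    return L @ polarity_matrix


L = np.array([[1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
votes = count_votes(L, polarities=[1, 0, 1, 1])
print(votes)
```

The result has shape (n_samples, cardinality); a row of zeros corresponds to a sample on which every weak label abstained.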
fit(L, class_balances=[])

Fit the Voter model. This computes the weights for each class.

Reweighting the votes helps especially when specific classes have a high overlap in their weak labels.

The weights are computed so the weighted sum of votes over training matches the given class balance.

This guarantees that, on the training set, the mean predicted probability equals the class balance: mean(y_pred_proba_train) = class_balances.

To use plain (unweighted) majority voting, do not call fit.

Parameters:
  • L (pd.DataFrame) – Weak label dataframe.

  • class_balances (ListLike) –

Array-like of shape (cardinality,) giving the expected proportion of each class.

    When unspecified, assumes all classes are equally likely.

Return type:

None

Example

>>> L = [[1, 0, 1, 2], [0, 1, 2, 1], [1, 2, 1, 0], [0, 1, 0, 2]]
>>> class_balances = [0.6, 0.4]
>>> model.fit(L, class_balances)
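As an illustration of reweighting, the sketch below computes one simple set of per-class weights in plain NumPy. It is an assumption, not necessarily the algorithm flippers uses: it only makes the weighted vote totals over the training set equal the requested class balances, and by itself it does not reproduce the mean-probability guarantee stated above. The name balance_weights and the per-class vote matrix are hypothetical.

```python
import numpy as np


def balance_weights(votes, class_balances=None):
    """Hypothetical helper: weights that rescale vote totals to class_balances."""
    votes = np.asarray(votes, dtype=float)
    n_classes = votes.shape[1]
    if class_balances is None or len(class_balances) == 0:
        # Unspecified balances: treat all classes as equally likely.
        class_balances = np.full(n_classes, 1.0 / n_classes)
    balances = np.asarray(class_balances, dtype=float)
    totals = votes.sum(axis=0)  # total votes received by each class
    weights = np.zeros(n_classes)
    # Classes that never receive a vote keep a weight of zero.
    np.divide(balances, totals, out=weights, where=totals > 0)
    return weights


# Per-class vote counts for four training samples (illustrative values).
votes = np.array([[2, 0], [1, 1], [0, 1], [1, 0]])
weights = balance_weights(votes, [0.6, 0.4])
weighted_totals = (votes * weights).sum(axis=0)
print(weights, weighted_totals)
```

In this toy case class 0 collects twice as many votes as class 1, so it receives the smaller weight.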
predict_proba(L)

Predict probabilities using weighted voting.

Parameters:

L (pd.DataFrame) – Weak label dataframe.

Return type:

Array of predicted probabilities of shape (len(L), cardinality)

Example

>>> L = [[1, 0, 1, 2], [0, 1, 0, 0]]
>>> proba = model.predict_proba(L)
>>> # proba.shape = (len(L), cardinality)
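A rough sketch of how weighted votes could be turned into probabilities: scale each class column by its weight, then normalize each row to sum to 1. This is an illustration in plain NumPy, not the flippers implementation. In particular, giving samples without any votes a uniform distribution is an assumption, and votes_to_proba is a hypothetical name.

```python
import numpy as np


def votes_to_proba(votes, weights=None):
    """Hypothetical helper: normalize (weighted) vote counts row by row."""
    votes = np.asarray(votes, dtype=float)
    if weights is not None:
        votes = votes * np.asarray(weights, dtype=float)
    row_sums = votes.sum(axis=1, keepdims=True)
    n_classes = votes.shape[1]
    # Assumption: rows without any votes fall back to a uniform distribution.
    proba = np.full_like(votes, 1.0 / n_classes)
    np.divide(votes, row_sums, out=proba, where=row_sums > 0)
    return proba


votes = np.array([[0, 2], [1, 1], [0, 0]])
proba = votes_to_proba(votes)
print(proba.shape)  # (n_samples, cardinality)
```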
classmethod load(filepath)

Load a saved model from a file.

Parameters:

filepath (str) – Path to the file containing the saved model.

Return type:

The loaded model object.

Example

>>> model = Voter.load("label_model.pkl")
predict(L, strategy='majority')

Predict labels for the given weak label matrix using the specified strategy.

Parameters:
  • L (ndarray | DataFrame) –

    Weak label dataframe.

    Shape: (n_samples, n_weak)

  • strategy (str) –

    Prediction strategy to use. Supported values: majority, probability.

    Controls how labels are predicted from the predicted probabilities.

    • majority: Predict the label with the highest number of votes.

    • probability: Predict label j with probability proba[i, j].

      This can be useful to enforce specific class_balances in the predictions.

    Default is “majority”.

    If a sample has no votes, the predicted label is -1.

Return type:

1-D array of predicted labels of size n_samples

Example

>>> L = [[1, 0, 1, 2], [0, 1, 0, 0]]
>>> predictions = model.predict(L)
>>> # predictions.shape = (len(L),)
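The two strategies can be illustrated with a small NumPy sketch that works on a probability matrix and the matching vote counts. It is a hedged approximation rather than the library's code: the helper name predict_from_proba, the seed argument, breaking ties in favor of the lowest class index, and applying the -1 rule to both strategies are all assumptions.

```python
import numpy as np


def predict_from_proba(proba, votes, strategy="majority", seed=0):
    """Hypothetical helper: turn probabilities into labels."""
    proba = np.asarray(proba, dtype=float)
    votes = np.asarray(votes)
    if strategy == "majority":
        # Most likely class per row; ties go to the lowest class index.
        labels = proba.argmax(axis=1)
    elif strategy == "probability":
        # Sample label j for row i with probability proba[i, j].
        rng = np.random.default_rng(seed)
        labels = np.array([rng.choice(proba.shape[1], p=row) for row in proba])
    else:
        raise ValueError(f"Unsupported strategy: {strategy!r}")
    # Samples that received no votes at all are labeled -1.
    labels[votes.sum(axis=1) == 0] = -1
    return labels


votes = np.array([[0, 2], [1, 1], [0, 0]])
proba = np.array([[0.0, 1.0], [0.5, 0.5], [0.5, 0.5]])
majority = predict_from_proba(proba, votes, strategy="majority")
sampled = predict_from_proba(proba, votes, strategy="probability")
print(majority, sampled)
```

With sampling, the share of each predicted label follows the predicted probabilities on average, which is why this strategy can help predictions respect a target class balance.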
save(filepath)

Save the model to a file.

Parameters:

filepath (str) – Path to the file where the model will be saved.

Example

>>> model.save("label_model.pkl")
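For readers curious about what a save and load round trip might look like, here is a self-contained sketch using a stand-in class. The .pkl extension in the examples suggests pickle, but that is an assumption; TinyModel and its attributes are hypothetical and are not the flippers Voter class.

```python
import os
import pickle
import tempfile


class TinyModel:
    """Hypothetical stand-in used only to illustrate a save/load round trip."""

    def __init__(self, polarities, cardinality=0):
        self.polarities = list(polarities)
        self.cardinality = cardinality

    def save(self, filepath):
        # Assumption: the model state is serialized with pickle.
        state = {"polarities": self.polarities, "cardinality": self.cardinality}
        with open(filepath, "wb") as f:
            pickle.dump(state, f)

    @classmethod
    def load(cls, filepath):
        # Rebuild a model from the saved state.
        with open(filepath, "rb") as f:
            state = pickle.load(f)
        return cls(state["polarities"], state["cardinality"])


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "label_model.pkl")
    TinyModel([1, 0, 1, 1], cardinality=2).save(path)
    restored = TinyModel.load(path)

print(restored.polarities, restored.cardinality)
```

Because load is a classmethod, it is called on the class rather than on an existing instance. Pickle files should only be loaded from trusted sources.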