fasttext
Text classification with fastText models that are compatible with cleanlab. This module helps you find label issues in your text datasets.
You must first install fastText: pip install fasttext
Classes:
FastTextClassifier
Functions:
Returns a generator, yielding two lists containing [labels], [text].
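To make the "two lists containing [labels], [text]" idea concrete, here is a minimal sketch of a batching generator over a fastText-style text file. It assumes one example per line, with a single label first that carries the `__label__` prefix (the default `label` value in the class signature). The helper name `iter_batches` and the sample sentences are hypothetical; this is not the module's own loader.

```python
import os
import tempfile

LABEL_PREFIX = "__label__"  # same string as the default `label` argument


def iter_batches(path, batch_size=2, label_prefix=LABEL_PREFIX):
    """Hypothetical helper: yield (labels, texts) list pairs, one per batch."""
    labels, texts = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Assumption: exactly one label per line, placed before the text.
            label, _, text = line.partition(" ")
            if not label.startswith(label_prefix):
                raise ValueError(f"missing {label_prefix!r} prefix: {line!r}")
            labels.append(label)
            texts.append(text)
            if len(labels) == batch_size:
                yield labels, texts
                labels, texts = [], []
    if labels:  # emit the final, possibly smaller, batch
        yield labels, texts


# Write a tiny illustrative dataset, read it back in batches, then clean up.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("__label__pos great movie\n")
    tmp.write("__label__neg dull plot\n")
    tmp.write("__label__pos loved it\n")
    example_path = tmp.name

batches = list(iter_batches(example_path, batch_size=2))
os.remove(example_path)
```

Yielding batches rather than the whole file keeps memory use bounded, which is presumably why the class also exposes a `batch_size` argument.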
- class cleanlab.experimental.fasttext.FastTextClassifier(train_data_fn, test_data_fn=None, labels=None, tmp_dir='', label='__label__', del_intermediate_data=True, kwargs_train_supervised={}, p_at_k=1, batch_size=1000)
Bases: sklearn.base.BaseEstimator
Methods:
fit([X, y, sample_weight]): Trains the fastText classifier.
get_params([deep]): Get parameters for this estimator.
predict([X, train_data, return_labels]): Predict labels of X.
predict_proba([X, train_data, return_labels]): Produces a probability matrix with examples on rows and classes on columns, where each row sums to 1 and captures the probability of the example belonging to each class.
score([X, y, sample_weight, k]): Compute the average precision @ k (single label) of the labels predicted from X and the true labels given by y.
set_params(**params): Set the parameters of this estimator.
- fit(X=None, y=None, sample_weight=None)
Trains the fastText classifier. Typical usage requires no arguments: clf.fit()
- Parameters
X (iterable, e.g. list or numpy array; default None) – The indices of the data to use. When in doubt, leave as None, which defaults to range(len(data)).
y (None) – Leave this as None. It is a placeholder that satisfies scikit-learn's estimator interface.
sample_weight (None) – Leave this as None. It is a placeholder that satisfies scikit-learn's estimator interface.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
- predict_proba(X=None, train_data=True, return_labels=False)
Produces a probability matrix with examples on rows and classes on columns, where each row sums to 1 and captures the probability of the example belonging to each class.
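The shape of that output can be illustrated with a small pure-Python sketch. It only shows what "examples on rows, classes on columns, each row sums to 1" means; the scores are made up, the helper `normalize_rows` is hypothetical, and nothing here reflects how the classifier itself computes probabilities.

```python
def normalize_rows(scores):
    """Scale each row of non-negative scores so that the row sums to 1."""
    normalized = []
    for row in scores:
        total = sum(row)
        if total <= 0:
            raise ValueError("each row needs a positive total")
        normalized.append([value / total for value in row])
    return normalized


# Two examples (rows) and three classes (columns); values are illustrative.
raw_scores = [
    [2.0, 1.0, 1.0],
    [0.5, 0.5, 3.0],
]
pred_probs = normalize_rows(raw_scores)

# The predicted class of each example is the column with the largest probability.
predicted = [row.index(max(row)) for row in pred_probs]
```

A matrix of this form (one row per example, rows summing to 1) is the kind of input that label-issue detection typically consumes alongside the given labels.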
- score(X=None, y=None, sample_weight=None, k=None)
Compute the average precision @ k (single label) of the labels predicted from X and the true labels given by y. score expects a y argument; in this setting, y holds the noisy (given) labels.
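As a rough sketch of the metric, single-label precision @ k can be read as the fraction of examples whose given label appears among the k highest-probability classes. The function below assumes that reading; `precision_at_k` and the sample values are hypothetical and not taken from the library.

```python
def precision_at_k(pred_probs, y, k=1):
    """Fraction of examples whose label is among the top-k predicted classes."""
    if len(pred_probs) != len(y):
        raise ValueError("pred_probs and y must have the same length")
    hits = 0
    for row, given_label in zip(pred_probs, y):
        # Class indices sorted by descending probability, truncated to k.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += given_label in top_k
    return hits / len(y)


# Three examples, three classes; y holds the (possibly noisy) given labels.
pred_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
]
y = [0, 1, 0]

p_at_1 = precision_at_k(pred_probs, y, k=1)  # only the first example is a hit
p_at_2 = precision_at_k(pred_probs, y, k=2)  # every label is in the top two
```

Because y contains noisy labels, a low score may reflect label errors as well as model errors.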
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance