crandas.crlearn

class crandas.crlearn.logistic_regression.LogisticRegression(penalty='l2', *, dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='lbfgs', max_iter=10, multi_class='auto', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None, classes=[], n_classes=2)

Bases: object

Logistic regression classifier object with the same parameters as the scikit-learn LogisticRegression class.

See: https://github.com/scikit-learn/scikit-learn/blob/98cf537f5/sklearn/linear_model/_logistic.py#L783 for its parameters.

fit(X, y, sample_weight=None, max_iter=None, warm_start=None, **query_args)

Fit a Logistic Regression model on the data

NOTE: Unlike scikit-learn, fit accepts the parameters max_iter and warm_start. Scikit-learn treats max_iter and warm_start as object configuration that is set at construction rather than per call. Here, the user is free to deviate from the global setting in successive calls to fit.

When either parameter is omitted, the corresponding class attribute is used as the default value for that call to fit.

Parameters:
  • X (CDataFrame) – predictor variables

  • y (CDataFrame) – response variable; should have exactly one column, and that column should contain integers

  • sample_weight – array of weights assigned to individual samples (not yet supported)

  • max_iter (int) – maximum number of iterations for this call; deviation from scikit-learn (see note above)

  • warm_start (bool) – deviation from scikit-learn (see note above); if True, each successive fit continues the approximation from where the previous one stopped; if False, each successive fit starts from scratch

  • query_args – See queryargs

Returns:

self

Return type:

LogisticRegression
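
The fallback described in the note above (a per-call argument of None means "use the value set at construction") can be sketched in plain Python. ToyModel and its total_iterations counter are hypothetical stand-ins invented for this illustration; they are not part of crandas and do not perform any actual fitting.

```python
class ToyModel:
    """Hypothetical stand-in (not the crandas class): shows per-call defaults only."""

    def __init__(self, max_iter=10, warm_start=False):
        # Constructor values act as defaults for later fit calls.
        self.max_iter = max_iter
        self.warm_start = warm_start
        self.total_iterations = 0  # illustrative bookkeeping, an assumption

    def fit(self, max_iter=None, warm_start=None):
        # None means "fall back to the value set at construction".
        effective_max_iter = self.max_iter if max_iter is None else max_iter
        effective_warm_start = self.warm_start if warm_start is None else warm_start
        if not effective_warm_start:
            # A cold start discards the progress of earlier fits.
            self.total_iterations = 0
        self.total_iterations += effective_max_iter
        return self


model = ToyModel(max_iter=10)

model.fit()                             # constructor defaults: 10 iterations, cold start
after_default = model.total_iterations

model.fit(max_iter=5, warm_start=True)  # per-call override: continue for 5 more
after_warm = model.total_iterations

model.fit(max_iter=3)                   # warm_start falls back to False: restart
after_cold = model.total_iterations
```

Comparing against None rather than testing truthiness matters in this sketch, because an explicit warm_start=False must override a constructor value of True.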

get_beta(**kwargs)

Get the fitted parameters, i.e. intercept_ and coef_ combined in one table named beta.

predict(X, decision_boundary=0.5, **query_args)

Make (binary) predictions on a dataset using a logistic regression model

Note: this returns binary predictions, not probabilities!

Parameters:
  • X (CDataFrame) – predictor variables

  • decision_boundary (float) – number between 0 and 1; records with a predicted probability below this value are classified as 0, and records with a probability greater than or equal to it are classified as 1

  • query_args – See queryargs

Returns:

column consisting of the predicted binary classes (0 or 1)

Return type:

CDataFrame
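
The decision_boundary rule can be illustrated on plain Python floats. This is only a sketch of the thresholding logic as documented above; the classify helper is hypothetical, and the real method operates on a CDataFrame rather than a list.

```python
def classify(probabilities, decision_boundary=0.5):
    """Hypothetical helper: map probabilities to 0/1 using the documented rule."""
    if not 0.0 <= decision_boundary <= 1.0:
        raise ValueError("decision_boundary must lie between 0 and 1")
    # Below the boundary -> 0; greater than or equal to the boundary -> 1.
    return [1 if p >= decision_boundary else 0 for p in probabilities]


probabilities = [0.10, 0.50, 0.49, 0.90]

default_labels = classify(probabilities)                        # boundary 0.5
strict_labels = classify(probabilities, decision_boundary=0.8)  # fewer positives
```

A probability exactly equal to the boundary is classified as 1, so raising the boundary only ever turns predictions from 1 into 0.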

predict_proba(X, **query_args)

Make (probability) predictions on a dataset using a logistic regression model

Note: this returns probabilities, not binary predictions

Parameters:
  • X (CDataFrame) – predictor variables

  • query_args – See queryargs

Returns:

column consisting of the predicted probabilities

Return type:

CDataFrame
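
For intuition, the probability that a binary logistic regression model assigns to a record is the logistic function applied to the intercept plus the weighted sum of the predictors. The sketch below uses plain Python and assumes, purely for illustration, that the fitted parameters are held in a list with the intercept first; the layout of the actual beta table and the way crandas evaluates the model are not specified here, and predict_proba_row is a hypothetical name.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba_row(beta, row):
    """Hypothetical helper: probability of class 1 for a single record.

    Assumes beta = [intercept, coef_1, ..., coef_k] (an illustrative layout).
    """
    intercept, coefficients = beta[0], beta[1:]
    z = intercept + sum(c * x for c, x in zip(coefficients, row))
    return sigmoid(z)


beta = [0.0, 1.0, -1.0]                        # made-up parameters

p_balanced = predict_proba_row(beta, [2.0, 2.0])  # linear term is 0
p_high = predict_proba_row(beta, [3.0, 0.0])      # linear term is 3
```

Applying a decision boundary to these probabilities gives the binary output that predict returns.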

class crandas.crlearn.logistic_regression.LogisticRegressionStateObject(reg_type, **kwargs)

Bases: StateObject