crandas.crlearn¶

crandas.crlearn.linear_model.CLinearRegression¶: Alias for LinearRegression (use of this alias is deprecated)

class crandas.crlearn.linear_model.LinearRegression(alpha=0.0, *, fit_intercept=True, copy_X=True, n_jobs=None, positive=False, **query_args)¶

Bases: Model

Linear ridge regression model corresponding to the scikit-learn Ridge class (see here).

Parameters:

alpha: regularization strength (see scikit-learn documentation); defaults to 0.0

Other constructor parameters are for compatibility with scikit-learn and cannot be overridden.
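To illustrate what the regularization strength does, here is a minimal plain-Python sketch of ridge regression with a single feature. It assumes the scikit-learn Ridge convention, in which the intercept is not penalized. The function name ridge_fit_1d is hypothetical, and the code works on ordinary Python lists, not on crandas DataFrames.

```python
def ridge_fit_1d(x, y, alpha=0.0):
    """Closed-form ridge fit for one feature with an unpenalized intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Centered sum of squares and sum of cross-products.
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # alpha is added to the denominator, which shrinks the slope toward zero.
    slope = s_xy / (s_xx + alpha)
    intercept = y_mean - slope * x_mean
    return intercept, slope


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
ols = ridge_fit_1d(x, y, alpha=0.0)    # ordinary least squares
ridge = ridge_fit_1d(x, y, alpha=5.0)  # regularized: smaller slope
```

With alpha=0.0 the fit reduces to ordinary least squares; larger values of alpha pull the slope toward zero.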

Attributes:

n_features_in_: number of input features
feature_names_in_: input feature names
beta_: (encrypted) fitted parameters (intercept and respective feature coefficients)
standard_error_: (encrypted) standard error of each fitted parameter
singular_: boolean indicating whether singularity of the model was detected

fit(X, y, **query_args)¶

Fit a Linear Regression model on the data

Parameters:

X (DataFrame) – Training data
y (DataFrame) – Target data (should have only 1 column)
query_args – See queryargs

Return type:

self

get_beta(**query_args)¶

Get the fitted parameters (i.e. intercept and feature coefficients) as a table

This function is deprecated; instead, use Model.open() to open the model, and use the returned beta_ attribute.

predict(X, **query_args)¶

Make predictions on a dataset using a linear regression model

Note: this returns predictions on the target, not probabilities!

Parameters:

X (DataFrame) – predictor variables
query_args – See queryargs

Returns:

table containing the column consisting of the predicted target values

Return type:

DataFrame
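As a plain-Python illustration of what a linear prediction computes, the sketch below evaluates the intercept plus the weighted sum of the features for each record. It assumes the parameter vector lists the intercept first, followed by one coefficient per feature. The name predict_linear is hypothetical, and the code works on ordinary lists, not on crandas DataFrames.

```python
def predict_linear(beta, rows):
    """Predict targets as intercept + sum(coefficient * feature value)."""
    # Assumption: beta[0] is the intercept, beta[1:] are the feature coefficients.
    intercept, coefficients = beta[0], beta[1:]
    return [
        intercept + sum(c * v for c, v in zip(coefficients, row))
        for row in rows
    ]


beta = [1.0, 2.0, -0.5]          # intercept, coefficient 1, coefficient 2
rows = [[1.0, 2.0], [0.0, 4.0]]  # two records with two features each
predictions = predict_linear(beta, rows)
```

The result is a predicted target value per record, not a probability.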

score(X, y, **query_args)¶

Scores the linear regression model using the R^2 metric

Parameters:

X (DataFrame) – Test data
y (DataFrame) – Target test data (should have only 1 column)
query_args – See queryargs

Return type:

self
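The methods above might be combined as in the following sketch. It assumes that a crandas session is already connected and that X_train, y_train, X_test and y_test are crandas DataFrames that have already been uploaded. The helper name fit_and_evaluate is hypothetical. Only calls documented in this section are used, and the imports sit inside the function so that defining the helper does not itself require crandas.

```python
def fit_and_evaluate(X_train, y_train, X_test, y_test, alpha=0.0):
    """Fit a ridge model, predict on test data and compute R^2 (hypothetical helper)."""
    # Lazy imports: crandas is only needed when the helper is actually called.
    from crandas.crlearn.linear_model import LinearRegression
    from crandas.crlearn.metrics import score_r2

    model = LinearRegression(alpha=alpha)  # alpha=0.0 means no regularization
    model.fit(X_train, y_train)            # y_train should have exactly one column
    y_pred = model.predict(X_test)         # DataFrame with predicted target values
    r2 = score_r2(y_test, y_pred)          # R^2 of the predictions
    return model, y_pred, r2
```

The sketch scores the predictions with crandas.crlearn.metrics.score_r2 so that the R^2 value is returned explicitly.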

crandas.crlearn.linear_model.Ridge(alpha=1.0, *, fit_intercept=True, copy_X=True, max_iter=None, tol=None, solver='cholesky', positive=False, random_state=None, **params_and_query_args)¶

Create a new ridge regression model (LinearRegression) with given alpha (1.0 by default)

Other parameters are for compatibility with scikit-learn and cannot be overridden.

crandas.crlearn.logistic_regression.CLogisticRegression¶: Alias for LogisticRegression (use of this alias is deprecated)

class crandas.crlearn.logistic_regression.LogisticRegression(penalty='l2', *, optimizer='lbfgs', type='binomial', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver=None, max_iter=10, verbose=0, warm_start=False, n_jobs=None, l1_ratio=None, **query_args)¶

Bases: Model

Logistic Regression Classifier Object with the same parameters as the Scikit-learn Logistic Regression class.

See the scikit-learn LogisticRegression documentation for its parameters.

Parameters:

type (string) – (binomial/multinomial/ordinal)
optimizer (Optimizer) – optimizer used to fit the model (see crandas.crlearn.optimizer.OptimizerParams)
max_iter (int) – number of iterations to perform
warm_start (bool) – whether to continue fitting from the previous optimizer state

Other constructor parameters have the same meaning as in scikit-learn but cannot be changed from their defaults.

feature_names_in_¶: input feature names

n_classes_¶: number of output classes (type: int)

feature_name_out_¶: output feature name

optimizer_¶: attributes of the optimizer used to fit the model (see crandas.crlearn.optimizer.OptimizerAttributes)

beta_¶: (encrypted) fitted parameters (intercept(s) and coefficients)

fit(X, y, *, n_classes=None, sample_weight=None, **query_args)¶

Fit a Logistic Regression model on the data

Parameters:

X (DataFrame) – Training data
y (DataFrame) – Target data (should have only 1 column)
n_classes (int or None) – Number of output classes (categories). For binomial models, if not given, n_classes is assumed to be equal to two. For other models, if not given, the number of classes is derived from the metadata of y.
sample_weight (None) – Not supported
query_args – See queryargs

Returns:

self

Return type:

LogisticRegression

from_beta(*, type='binomial', n_classes=2, feature_names_in, feature_name_out='out')¶

Upload a pre-fitted logistic regression model

Parameters:

beta (list[float]) – Fitted parameters
type (str, default "binomial") – Type of model ("binomial"/"multinomial"/"ordinal")
n_classes (int, default 2) – Number of classes
feature_names_in (list[str]) – Input feature names
feature_name_out (str, default "out") – Output feature name

Returns:

Logistic regression model with given parameters

Return type:

LogisticRegression

predict(X, decision_boundary=0.5, **query_args)¶

Make (binary) predictions on a dataset using a logistic regression model

Note: this returns binary predictions, not probabilities!

Parameters:

X (DataFrame) – predictor variables
decision_boundary (float) – number between 0 and 1; records with a predicted probability below this value are classified as 0, and records with a probability greater than or equal to it are classified as 1
query_args – See queryargs

Returns:

table containing the column consisting of the predicted target values

Return type:

DataFrame
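The effect of decision_boundary can be sketched in plain Python as follows. The function name classify is hypothetical, and the code works on an ordinary list of probabilities, not on a crandas DataFrame.

```python
def classify(probabilities, decision_boundary=0.5):
    """Map probabilities to 0/1: below the boundary gives 0, otherwise 1."""
    return [1 if p >= decision_boundary else 0 for p in probabilities]


probabilities = [0.2, 0.5, 0.8]
labels_default = classify(probabilities)                        # boundary 0.5
labels_strict = classify(probabilities, decision_boundary=0.7)  # stricter boundary
```

A probability exactly equal to the boundary is classified as 1, matching the "greater than or equal to" rule described for decision_boundary.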

predict_proba(X, **query_args)¶

Make (probability) predictions on a dataset using a logistic regression model

Note: this returns probabilities, not binary predictions

Parameters:

X (DataFrame) – predictor variables
query_args – See queryargs

Returns:

table with columns representing predicted class probabilities per input record

Return type:

DataFrame
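For a binomial model, the class probabilities follow from the fitted parameters through the logistic function. The plain-Python sketch below illustrates this. It assumes the parameter vector lists the intercept first, followed by one coefficient per feature, and it returns one pair (probability of class 0, probability of class 1) per record. The name predict_proba_binomial and the output layout are assumptions for illustration only.

```python
import math


def predict_proba_binomial(beta, rows):
    """Return (P(class 0), P(class 1)) per row for a binomial logistic model."""
    # Assumption: beta[0] is the intercept, beta[1:] are the feature coefficients.
    intercept, coefficients = beta[0], beta[1:]
    result = []
    for row in rows:
        # Linear predictor followed by the logistic (sigmoid) function.
        z = intercept + sum(c * v for c, v in zip(coefficients, row))
        p1 = 1.0 / (1.0 + math.exp(-z))
        result.append((1.0 - p1, p1))
    return result


probabilities = predict_proba_binomial([0.0, 1.0], [[0.0], [2.0]])
```

Applying a decision boundary to the second value of each pair gives the binary predictions.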

crandas.crlearn.metrics.classification_accuracy(y, y_pred, n_classes=2, **query_args)¶

Compute the classification accuracy on class predictions

Parameters:

y (DataFrame) – column with the actual values in range
y_pred (DataFrame) – column with the predictions in range
n_classes (int) – number of classes (default = 2)
query_args – See queryargs

Returns:

fixed point number between 0 and 1

Return type:

DataFrame
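As a plain-Python reference for the quantity being computed, classification accuracy is the fraction of records whose predicted class equals the actual class. The name classification_accuracy_plain is hypothetical, and the sketch works on ordinary lists of class labels.

```python
def classification_accuracy_plain(y, y_pred):
    """Fraction of predictions that match the actual class labels."""
    if len(y) != len(y_pred):
        raise ValueError("y and y_pred must have the same length")
    correct = sum(1 for actual, predicted in zip(y, y_pred) if actual == predicted)
    return correct / len(y)


y = [0, 1, 2, 1]
y_pred = [0, 2, 2, 1]
accuracy = classification_accuracy_plain(y, y_pred)  # 3 of 4 predictions match
```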

crandas.crlearn.metrics.confusion_matrix(y, y_pred, n_classes=2, **query_args)¶

Compute the confusion matrix on class predictions

The y-axis of the result represents the true class; the x-axis represents the predicted class.

Parameters:

y (DataFrame) – column with the actual values in range
y_pred (DataFrame) – column with the predictions in range
n_classes (int) – number of classes (default = 2)
query_args – See queryargs

Returns:

matrix of size n_classes * n_classes

Return type:

DataFrame
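The layout of the confusion matrix can be illustrated in plain Python. In this sketch, rows are indexed by the true class and columns by the predicted class, following the axis description above. It assumes classes are encoded as integers from 0 up to n_classes - 1; the name confusion_matrix_plain is hypothetical.

```python
def confusion_matrix_plain(y, y_pred, n_classes=2):
    """Count (true class, predicted class) pairs in an n_classes x n_classes grid."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y, y_pred):
        # Row = true class, column = predicted class.
        matrix[actual][predicted] += 1
    return matrix


y = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
matrix = confusion_matrix_plain(y, y_pred)
```

The diagonal entries count correct predictions; off-diagonal entries count misclassifications.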

crandas.crlearn.metrics.mcfadden_r2(model, X, y, **query_args)¶

Compute the McFadden R^2 metric

Parameters:

model (LogisticRegression) – logistic regression model
X (DataFrame) – predictor variables
y (DataFrame) – binary response variable (should have only 1 column)
query_args – See queryargs

Returns:

fixed point number between 0 and 1

Return type:

DataFrame
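For reference, the standard definition of McFadden's R^2 is one minus the ratio of the model log-likelihood to the log-likelihood of a null model that predicts the overall fraction of positives for every record. The plain-Python sketch below follows that definition. Unlike the function documented above, it takes predicted probabilities directly instead of a model and X, and the names are hypothetical.

```python
import math


def log_likelihood(y, probabilities):
    """Bernoulli log-likelihood of binary labels under predicted P(y = 1)."""
    return sum(
        math.log(p) if label == 1 else math.log(1.0 - p)
        for label, p in zip(y, probabilities)
    )


def mcfadden_r2_plain(y, probabilities):
    """McFadden pseudo-R^2: 1 - LL(model) / LL(null)."""
    # The null model predicts the fraction of positives for every record.
    base_rate = sum(y) / len(y)
    ll_model = log_likelihood(y, probabilities)
    ll_null = log_likelihood(y, [base_rate] * len(y))
    return 1.0 - ll_model / ll_null


y = [1, 0, 1, 0]
probabilities = [0.8, 0.2, 0.8, 0.2]
r2 = mcfadden_r2_plain(y, probabilities)
```

Both classes must occur in y; otherwise the null log-likelihood involves the logarithm of 0.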

crandas.crlearn.metrics.model_deviance(model, X, y, **query_args)¶

Compute the model deviance

Parameters:

model (LogisticRegression) – logistic regression model
X (DataFrame) – predictor variables
y (DataFrame) – binary response variable (should have only 1 column)
query_args – See queryargs

Returns:

non-negative fixed point number

Return type:

DataFrame
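For a binary response, the standard definition of the model deviance is minus two times the log-likelihood of the observed labels under the predicted probabilities. The plain-Python sketch below follows that definition. It takes predicted probabilities directly instead of a model and X, the name model_deviance_plain is hypothetical, and whether crandas applies any additional scaling is not stated in this section.

```python
import math


def model_deviance_plain(y, probabilities):
    """Deviance of a binary model: -2 * Bernoulli log-likelihood."""
    log_lik = sum(
        math.log(p) if label == 1 else math.log(1.0 - p)
        for label, p in zip(y, probabilities)
    )
    return -2.0 * log_lik


y = [1, 0]
probabilities = [0.8, 0.2]  # predicted P(y = 1) for each record
deviance = model_deviance_plain(y, probabilities)
```

A lower deviance means the predicted probabilities agree better with the observed labels.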

crandas.crlearn.metrics.null_deviance(y, **query_args)¶

Compute the null deviance

Parameters:

y (DataFrame) – binary response variable (should have only 1 column)
query_args – See queryargs
Note: both classes must be present in y; otherwise the computation is undefined internally (logarithm of 0).

Returns:

non-negative fixed point number

Return type:

DataFrame
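The standard definition of the null deviance uses an intercept-only model that predicts the overall fraction of positives for every record. The plain-Python sketch below follows that definition and shows why both classes must be present: with a single class the formula would take the logarithm of 0. The name null_deviance_plain is hypothetical, and whether crandas applies any additional scaling is not stated in this section.

```python
import math


def null_deviance_plain(y):
    """Null deviance of a binary response: -2 * LL of the intercept-only model."""
    n = len(y)
    n_pos = sum(y)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        # With one class only, the base rate is 0 or 1 and log(0) would occur.
        raise ValueError("both classes must be present in y")
    base_rate = n_pos / n
    return -2.0 * (n_pos * math.log(base_rate) + n_neg * math.log(1.0 - base_rate))


null_dev = null_deviance_plain([1, 0, 1, 1])

try:
    null_deviance_plain([1, 1, 1])
except ValueError as exc:
    single_class_error = str(exc)
```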

crandas.crlearn.metrics.precision_recall(y, y_pred, **query_args)¶

Compute the precision and recall on predictions

Parameters:

y (DataFrame) – column with the actual values (binary)
y_pred (DataFrame) – column with the predictions (binary)
query_args – See queryargs

Returns:

two fixed point numbers between 0 and 1

Return type:

DataFrame
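As a plain-Python reference, precision is the fraction of predicted positives that are actually positive, and recall is the fraction of actual positives that are predicted positive. The name precision_recall_plain is hypothetical. The sketch assumes at least one predicted positive and at least one actual positive, since otherwise a division by zero occurs.

```python
def precision_recall_plain(y, y_pred):
    """Return (precision, recall) for binary labels and binary predictions."""
    pairs = list(zip(y, y_pred))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = true_pos / (true_pos + false_pos)  # correct among predicted positives
    recall = true_pos / (true_pos + false_neg)     # found among actual positives
    return precision, recall


y = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall = precision_recall_plain(y, y_pred)
```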

crandas.crlearn.metrics.score_r2(y, y_pred, **query_args)¶

Compute the R^2 metric on predictions

Parameters:

y (DataFrame) – column with the actual values
y_pred (DataFrame) – column with the predictions
query_args – See queryargs

Returns:

fixed point number less than or equal to 1

Return type:

DataFrame
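For reference, R^2 is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the actual values. The plain-Python sketch below follows that definition; the name score_r2_plain is hypothetical. It also shows that the value is at most 1 but can be negative when predictions are worse than always predicting the mean.

```python
def score_r2_plain(y, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_mean = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_pred))  # residual sum of squares
    ss_tot = sum((a - y_mean) ** 2 for a in y)             # total sum of squares
    return 1.0 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0]
good = score_r2_plain(y, [1.5, 2.0, 3.0, 3.5])  # close predictions
bad = score_r2_plain(y, [4.0, 3.0, 2.0, 1.0])   # reversed predictions
```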

crandas.crlearn.metrics.tjur_r2(y, y_pred, **query_args)¶

Compute the Tjur R^2 metric on predictions

Parameters:

y (DataFrame) – column with the actual values (binary)
y_pred (DataFrame) – column with the predictions (probabilities!)
query_args – See queryargs

Returns:

fixed point number between -1 and 1

Return type:

DataFrame
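Tjur's R^2 (the coefficient of discrimination) is the mean predicted probability among the actual positives minus the mean predicted probability among the actual negatives. The plain-Python sketch below follows that definition; the name tjur_r2_plain is hypothetical. It assumes both classes occur in y.

```python
def tjur_r2_plain(y, probabilities):
    """Mean P(y = 1) over actual positives minus mean P(y = 1) over actual negatives."""
    pos = [p for label, p in zip(y, probabilities) if label == 1]
    neg = [p for label, p in zip(y, probabilities) if label == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


y = [1, 1, 0, 0]
probabilities = [0.9, 0.7, 0.2, 0.4]  # predicted P(y = 1), not binary predictions
tjur = tjur_r2_plain(y, probabilities)
```

Because y_pred holds probabilities, passing the output of predict_proba is appropriate here, not the binary output of predict.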