crandas.ctypes

exception crandas.ctypes.ColumnBoundDerivedWarning(colname, coltype)

Bases: Warning

class crandas.ctypes.Ctype

Bases: object

Ctypes, or “crandas types”, are an extensible client-side type system that allows the user to provide additional type information beyond pandas/numpy dtypes.

Ctypes are represented as class instances, e.g. NullableInteger(). Some classes take arguments in their initialization, like Varchar(max_length=12). Each Ctype also has a string representation, like “varchar[12]”. Either form can be passed to the ctype kwarg of cd.DataFrame, e.g.:

>>> cd.DataFrame({"ints": [1, 2, 3], "strings": ["a", "bb", "ccc"]},
...              ctype={"ints": NullableInteger(), "strings": "varchar[5]"})

If a manual ctype is not specified, the appropriate ctype is automatically deduced using the pandas dtype. For details of how this is implemented, see the Ctype.for_series() classmethod.

The JSON representation of a ctype has the following fields:

type (str)

column type (b: bytes | fp: fixed point | f: fractional | i: integer | s: string | d: date)

elements_per_value (int)

number of elements in the column

nullable (bool)

boolean determining if the column is nullable

Additionally it may contain a constraints field in the form of a Validation.
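As an illustration of the field list above, here is a minimal sketch that assembles and serializes such a representation with the standard library. The helper ctype_json is hypothetical and not part of crandas. The shape of the constraints value is not specified here, so the sketch passes it through as an opaque object.

```python
import json

# Type codes taken from the field description above.
ALLOWED_TYPES = {"b", "fp", "f", "i", "s", "d"}


def ctype_json(type_code, elements_per_value=1, nullable=False, constraints=None):
    """Hypothetical helper: build the documented fields and serialize them."""
    if type_code not in ALLOWED_TYPES:
        raise ValueError(f"unknown column type: {type_code!r}")
    rep = {
        "type": type_code,
        "elements_per_value": elements_per_value,
        "nullable": nullable,
    }
    # Assumption: constraints is optional and only emitted when provided.
    if constraints is not None:
        rep["constraints"] = constraints
    return json.dumps(rep)


encoded = ctype_json("i", nullable=True)
decoded = json.loads(encoded)
```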

Internal workings

Each class (e.g. Integer) has CtypeBase as a base class and is decorated with @Ctype.register. The decorator registers the class’s .dtype and .ctype properties so that Ctype can perform automatic ctype inference on pandas.Series objects.
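The registration pattern described above can be sketched with a plain class decorator. This is not the crandas implementation: TypeRegistry, SketchBase, and the dtype and ctype values used here ("int64", "int", "object", "varchar") are assumptions chosen for illustration.

```python
class TypeRegistry:
    """Hypothetical stand-in for the registering role played by Ctype."""

    by_dtype = {}  # pandas-style dtype name -> registered class
    by_name = {}   # ctype string name -> registered class

    @classmethod
    def register(cls, type_cls):
        # Record both lookup keys, then return the class unchanged so the
        # decorator does not alter the decorated class.
        cls.by_dtype[type_cls.dtype] = type_cls
        cls.by_name[type_cls.ctype] = type_cls
        return type_cls


class SketchBase:
    dtype = None
    ctype = None


@TypeRegistry.register
class Integer(SketchBase):
    dtype = "int64"
    ctype = "int"


@TypeRegistry.register
class Varchar(SketchBase):
    dtype = "object"
    ctype = "varchar"

    def __init__(self, max_length=None):
        self.max_length = max_length
```

With both lookup tables filled at class-definition time, inference code can map a dtype or a string name to a class without a hard-coded list of types.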

classmethod for_series(series, ctype_spec=None)

Determine the Ctype for a pandas.Series object. The following are consulted in order: the specified ctype_spec, the series.dtype, the ctype_cls.from_series() function, and the value_type (i.e. the type of next(iter(series))).
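The fallback order can be sketched as a chain of lookups that returns at the first match. Everything below is hypothetical: plain lists stand in for a pandas.Series, strings stand in for Ctype instances, and the tables BY_DTYPE and BY_VALUE_TYPE and the detect_dates hook are illustrative assumptions, not crandas internals.

```python
import datetime

BY_DTYPE = {"int64": "int"}                       # step 2: lookup by dtype
BY_VALUE_TYPE = {str: "varchar", bytes: "bytes"}  # step 4: lookup by value type


def detect_dates(values):
    """Step 3 hook: claim the column if every value is a date."""
    if values and all(isinstance(v, datetime.date) for v in values):
        return "date"
    return None


DETECTORS = [detect_dates]


def resolve_ctype(values, dtype_name, spec=None):
    # 1. An explicit specification always wins.
    if spec is not None:
        return spec
    # 2. Otherwise use the dtype if a type is registered for it.
    if dtype_name in BY_DTYPE:
        return BY_DTYPE[dtype_name]
    # 3. Otherwise let each per-type hook inspect the values.
    for detect in DETECTORS:
        found = detect(values)
        if found is not None:
            return found
    # 4. Finally fall back to the Python type of the first value.
    first = next(iter(values))
    return BY_VALUE_TYPE[type(first)]
```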

classmethod from_spec(ctype_spec)

Determine the Ctype based on a specification, that is, a ctype object (i.e. an instance of a subclass of CtypeBase), a string, or a Python type.
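A minimal sketch of dispatching on the three accepted kinds of specification follows. The classes, the lookup tables, and the function sketch_from_spec are hypothetical. The string grammar is assumed from the “varchar[12]” example and may not cover every string crandas accepts.

```python
import re


class SketchType:
    """Hypothetical base class; param holds an optional bracketed argument."""

    def __init__(self, param=None):
        self.param = param


class Integer(SketchType):
    pass


class Varchar(SketchType):
    pass  # param is interpreted as the maximum length


BY_NAME = {"int": Integer, "varchar": Varchar}
BY_PYTHON_TYPE = {int: Integer, str: Varchar}


def sketch_from_spec(spec):
    # Case 1: already a type instance, return it unchanged.
    if isinstance(spec, SketchType):
        return spec
    # Case 2: a string such as "varchar" or "varchar[5]".
    if isinstance(spec, str):
        match = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", spec)
        if match is None or match.group(1) not in BY_NAME:
            raise ValueError(f"unrecognized specification: {spec!r}")
        cls = BY_NAME[match.group(1)]
        arg = match.group(2)
        return cls() if arg is None else cls(int(arg))
    # Case 3: a Python type such as int or str.
    if isinstance(spec, type) and spec in BY_PYTHON_TYPE:
        return BY_PYTHON_TYPE[spec]()
    raise TypeError(f"unsupported specification: {spec!r}")
```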

crandas.ctypes.derive_int_bounds(series, spec_min_value, spec_max_value, max_byte_size=4)

Derive integer bounds from a series and an optional min/max specification.

If a minimum and/or maximum is specified, that range is used, and the values in the series are verified to lie within it.
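The range check can be sketched as follows, using a plain list instead of a pandas.Series. The function sketch_int_bounds is hypothetical. Two behaviours are assumptions rather than documented facts: a bound that is not specified is taken from the observed values, and max_byte_size is interpreted as the width of a signed integer.

```python
def sketch_int_bounds(values, spec_min=None, spec_max=None, max_byte_size=4):
    """Hypothetical sketch: return (lower, upper) bounds for integer values."""
    # Assumption: an unspecified bound is derived from the data itself.
    lower = min(values) if spec_min is None else spec_min
    upper = max(values) if spec_max is None else spec_max

    # Verify that every value complies with the chosen range.
    for v in values:
        if v < lower or v > upper:
            raise ValueError(f"value {v} lies outside [{lower}, {upper}]")

    # Assumption: bounds must fit in a signed integer of max_byte_size bytes.
    limit = 2 ** (8 * max_byte_size - 1)
    if lower < -limit or upper > limit - 1:
        raise ValueError(f"bounds do not fit in {max_byte_size} bytes")

    return lower, upper
```

For example, sketch_int_bounds([1, 5, 3], spec_min=0, spec_max=10) keeps the specified range, while the same call with spec_max=4 raises ValueError because 5 falls outside it.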