crandas.stateobject

class crandas.stateobject.ObjectProperty(nm)

Bases: object

Property of a StateObject

If the property is not set, an UnknownSchemaError is raised.

The value of the property is stored in and read from the underscore-prefixed attribute self._nm, so that, for example, CDataFrame.columns maps to CDataFrame._columns. Accordingly, if the columns of a CDataFrame are not set (as happens inside a transaction or a dry run), .columns raises an UnknownSchemaError whereas ._columns returns None (and similarly for .nrows).

The .columns property should therefore be used by users, and by functions that really need to know the columns (for example, merge needs the column names to formulate its query and cannot do anything without them). Functions that can still do something sensible when the columns are unknown can use ._columns instead, which lets them work in transactions and in dry-run mode.
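The difference between the public property and the underscore-prefixed attribute can be illustrated with a small, self-contained descriptor sketch. This is not the crandas implementation: the classes below are simplified stand-ins that only mirror the behaviour described above, and FakeFrame and column_count_or_none are hypothetical names introduced for illustration.

```python
class UnknownSchemaError(RuntimeError):
    """Stand-in for crandas.stateobject.UnknownSchemaError."""


class ObjectProperty:
    """Simplified stand-in: exposes `_<nm>` as a checked public property."""

    def __init__(self, nm):
        self.nm = nm
        self.private = "_" + nm  # e.g. "columns" -> "_columns"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class itself
        value = getattr(obj, self.private, None)
        if value is None:
            # Property not set: e.g. inside a transaction or a dry run.
            raise UnknownSchemaError(f"{self.nm} is not known")
        return value

    def __set__(self, obj, value):
        setattr(obj, self.private, value)


class FakeFrame:
    """Hypothetical object whose schema may be unknown."""

    columns = ObjectProperty("columns")

    def __init__(self, columns=None):
        self._columns = columns


def column_count_or_none(frame):
    # Internal-style access via the underscore attribute: tolerates an
    # unknown schema instead of raising.
    return None if frame._columns is None else len(frame._columns)
```

In this sketch, code that cannot proceed without the schema reads frame.columns and lets the error propagate, while code that can degrade gracefully checks frame._columns for None.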

enum crandas.stateobject.SaveMode(value)

Bases: Enum

Only save if the target name does not exist

Valid values are as follows:

IF_NOT_EXISTS = <SaveMode.IF_NOT_EXISTS: 1>
IF_EXISTS = <SaveMode.IF_EXISTS: 2>
ALWAYS = <SaveMode.ALWAYS: 3>

class crandas.stateobject.StateObject(**kwargs)

Bases: ResponseHandler, DeferredBase

Base class for VDL state objects.

A VDL state object is an object that is stored remotely in the VDL and on which operations can be performed, such as a CDataFrame.

clone()

Creates a clone of the StateObject. Only needed for objects that can be opened.

The clone is used to open the object: when the object is opened, essentially, self.clone().json_to_opened(…) is called.

get_deferred(json_query, *, session)

Called when query is added to a transaction

Parameters:
  • json_query (JSON struct) – Query to be performed (as passed to vdl_query; in particular, with placeholders in place and without the signature used for authorization)

  • session (crandas.base.Session) – Session in which query is executed

Returns:

Return value to be provided to caller of vdl_query

Return type:

Deferred

get_dry_run_result(json_query, *, session)

Called when executing query in dry-run mode

Parameters:
  • json_query (JSON struct) – Query to be performed (as passed to vdl_query; in particular, with placeholders in place and without the signature used for authorization)

  • session (crandas.base.Session) – Session in which query is executed

Returns:

Return value to be provided to caller of vdl_query

Return type:

object

open_dry_run_result()

Return opened version of the object when run in dry-run mode

Should return an object of the same type as json_to_opened.

parse_response(json_query, json_answer, binary_data, prss_nonce, ix, *, session)

Called upon receiving a response to the query from the server

Parameters:
  • json_query (JSON struct) – Query to be performed (as passed to vdl_query; in particular, with placeholders in place and without the signature used for authorization)

  • json_answer (JSON struct) – Answer received from server

  • binary_data (binary data stream, see crandas.queries.Query.getdata()) – Stream of binary data for answer

  • prss_nonce (str) – Server-supplied nonce for streaming uploads/downloads

  • ix (int) – Transaction index for masking (0 if not in transaction; otherwise: 1, 2, …)

Returns:

Return value to be provided to caller of vdl_query

Return type:

object

property reference

Obtain a JSON-serializable reference to the object.

remove(**query_args)

Remove object from server

Parameters:

query_args – See Query Arguments

property result

Returns object

Can be used to retrieve the result of an operation inside a transaction regardless of its mode: if the mode is “open”, then .result needs to be used. Otherwise .result is not needed but this property makes sure it can be used.

save(name=None, *, save_mode=SaveMode.ALWAYS, **query_args)

Save object

Saves the object, e.g., a computation result. Once saved, the object is treated in the same way as an upload; for example, depending on the server configuration, it may remain available after server restarts. See “Computed objects might be removed from cache”.

Parameters:
  • name (str, default: None) – If set: store the object under the given symbolic name. The object can then be retrieved e.g. by cd.get_table('name') or cd.get(name='name')

  • save_mode (SaveMode, default: SaveMode.ALWAYS) – The default behavior is to always save. Specify this parameter to save only if the target name already exists (SaveMode.IF_EXISTS) or only if it does not yet exist (SaveMode.IF_NOT_EXISTS). Values other than SaveMode.ALWAYS are only allowed if a name is also given. An error is raised if the object was not saved due to the save mode.

  • query_args – See Query Arguments

Raises:

cd.errors.EngineError – The object could not be persisted (see details above)
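The interaction between name and save_mode can be summarised in a short, self-contained sketch. It models only the rules stated above and does not call crandas: save_allowed and existing_names are hypothetical, and the use of ValueError for a missing name is an assumption, since the text does not say which exception is raised in that case.

```python
from enum import Enum


class SaveMode(Enum):
    # Same member names and values as listed for crandas.stateobject.SaveMode.
    IF_NOT_EXISTS = 1
    IF_EXISTS = 2
    ALWAYS = 3


def save_allowed(save_mode, name, existing_names):
    """Return True if a save would proceed under the documented rules.

    Hypothetical helper: `existing_names` stands in for the names already
    known to the server.
    """
    if name is None:
        # Modes other than ALWAYS are only allowed together with a name.
        # (ValueError is an assumption made for this sketch.)
        if save_mode is not SaveMode.ALWAYS:
            raise ValueError("save_mode other than ALWAYS requires a name")
        return True
    exists = name in existing_names
    if save_mode is SaveMode.IF_NOT_EXISTS:
        return not exists
    if save_mode is SaveMode.IF_EXISTS:
        return exists
    return True  # SaveMode.ALWAYS
```

Where this sketch returns False, the documented behaviour of save is to raise an error, because the object was not saved due to the save mode.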

exception crandas.stateobject.UnknownSchemaError

Bases: RuntimeError

crandas.stateobject.get(handle=None, *, name=None, map_dummy_handles=None, **query_args)

Access a previously uploaded object by its handle or name.

The previously uploaded object is specified by its handle using cd.get(handle) or by its name using cd.get(name=name). Either the handle or the name needs to be specified, but not both.

When called from a script recording, the retrieved object is checked against the object that was used when recording. This check includes the object type and, in the case of tables, also the column names/types.

Parameters:
  • handle (str) – Handle (hex-encoded string)

  • name (str) – Symbolic name of the object.

  • map_dummy_handles (bool, optional) –

    Whenever a script is being recorded (see crandas.script), the default behavior is to interpret all calls to cd.get(handle) as dummy_for:<handle> table names. This allows the user to use the same handle in both script recording and execution, even though the script recording takes place in a different environment where the real table handle does not exist.

    This behavior can be overridden in two levels: for the entire script or for a single call to get. For the entire script, mapping dummy handles can be disabled by supplying map_dummy_handles as False in the call to crandas.script.record(). For the call to get, by specifying this argument as either True or False, the mapping behavior is forced to be either enabled or disabled, regardless of the current script mode.

  • query_args – See Query Arguments. Note that name is not interpreted as a query argument (assigning a new name to the object) but as an already existing name by which to retrieve the object. To save an existing object under a new name, use obj.save(name="name").

Returns:

The object with the specified handle or name

Return type:

StateObject

Raises:

ValueError – Schema validation failed
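How the handle, name, and map_dummy_handles arguments combine can be sketched as a small pure function. This is an illustrative model of the rules described above, not crandas code: resolve_get_arguments, recording, and script_default are hypothetical names, and raising ValueError when both or neither of handle and name are given is an assumption.

```python
def resolve_get_arguments(handle=None, *, name=None, map_dummy_handles=None,
                          recording=False, script_default=True):
    """Model how the arguments of get are interpreted (hypothetical helper).

    `recording` stands in for "a script is currently being recorded" and
    `script_default` for the map_dummy_handles setting of that script.
    Returns a ("handle" | "name", value) pair describing the lookup.
    """
    # Exactly one of handle / name must be given (exception type assumed).
    if (handle is None) == (name is None):
        raise ValueError("specify either handle or name, but not both")
    if name is not None:
        return ("name", name)
    # A per-call True/False wins regardless of the script mode; otherwise
    # the script-level setting applies, and only while recording.
    if map_dummy_handles is None:
        map_dummy_handles = recording and script_default
    if map_dummy_handles:
        return ("name", "dummy_for:" + handle)
    return ("handle", handle)
```

For example, under these assumptions a call with a handle during recording resolves to the dummy_for:<handle> table name, whereas the same call with map_dummy_handles=False looks up the real handle.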

crandas.stateobject.get_upload_handles(get_from=None)

Get list of handles of all objects uploaded to the VDL.

See crandas.stateobject.list_uploads() for an overview of which objects are considered to be “uploaded”.

Parameters:

get_from (str, optional) – If set to “mem”, only return objects that are in the working memory of the VDL server. If set to “disk”, only return objects that are not in the working memory. (The latter causes the objects to be loaded into memory at the server.)

crandas.stateobject.list_uploads(*, also_on_disk=False)

Get list of all objects uploaded to the VDL.

“Uploaded” objects are the results of functions that upload a local dataframe to the VDL, e.g., crandas.crandas.upload_pandas_dataframe(), crandas.crandas.DataFrame(), and crandas.crandas.read_csv(). It does not include tables that are computed from other tables or created via crandas.crandas.demo_table().

By default, for objects that are stored on-disk at the server, the metadata and creation date are not given. See the also_on_disk parameter below.

Parameters:

also_on_disk (bool, default: False) – If set to False, objects that need to be retrieved from disk at the server are listed, but their type and creation date are not returned. If set to True, these objects are loaded into memory and their metadata is returned.

Returns:

Dataframe containing handles, creation date, and type information

Return type:

pd.DataFrame