crandas.stateobject

ObjectProperty(nm)

Property of a StateObject

If the property is not set, an UnknownSchemaError is raised.

The value of the property is set to/gotten from the attribute self._nm, so, e.g., DataFrame.columns maps to DataFrame._columns. For example, if the columns of a DataFrame are not set (so inside a transaction or a dry run), then .columns raises an UnknownSchemaError whereas ._columns returns None (and similarly for .nrows). So .columns should be used by users, and by functions that really need to know the columns (for example, merge needs to know the names of the columns to be able to formulate its query and cannot do anything otherwise). Functions that can still do something sensible when the columns are unknown can use ._columns instead, which lets them work in transactions and in dry-run mode.
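The mapping described above can be illustrated with a plain Python descriptor. This is a minimal sketch and not the crandas implementation: the classes ObjectProperty, UnknownSchemaError and Table below are local stand-ins defined only for this example, and describe is a hypothetical helper.

```python
class UnknownSchemaError(Exception):
    """Local stand-in for the error raised when a property is not set."""


class ObjectProperty:
    """Descriptor exposing the attribute `_<nm>` as `<nm>`; raises if unset."""

    def __init__(self, nm):
        self.nm = nm
        self.private = "_" + nm

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = getattr(obj, self.private, None)
        if value is None:
            raise UnknownSchemaError(f"{self.nm} is not known")
        return value

    def __set__(self, obj, value):
        setattr(obj, self.private, value)


class Table:
    """Hypothetical state object with two schema properties."""

    columns = ObjectProperty("columns")
    nrows = ObjectProperty("nrows")

    def __init__(self, columns=None, nrows=None):
        self._columns = columns
        self._nrows = nrows


def describe(table):
    """Tolerates a missing schema by reading the private attribute."""
    if table._columns is None:
        return "table with unknown columns"
    return "table with columns " + ", ".join(table._columns)
```

In this sketch, code that cannot proceed without the schema reads table.columns and lets the error propagate, while describe reads table._columns and degrades gracefully.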

SaveMode

Bases: Enum

ALWAYS class-attribute instance-attribute

Always save, regardless of whether the target name already exists

IF_EXISTS = 2 class-attribute instance-attribute

Only save if the target name already exists

IF_NOT_EXISTS = 1 class-attribute instance-attribute

Only save if the target name does not exist

StateObject()

Bases: ResponseHandler, DeferredBase

Base class for Engine state objects.

An Engine state object is an object that is stored remotely in the Engine and on which operations can be performed, such as a DataFrame.

result property

Returns object

Can be used to retrieve the result of an operation inside a transaction regardless of its mode: if the mode is "open", then .result needs to be used. Otherwise .result is not needed, but the property is still available so that the same code works in either mode.

clone()

Creates a clone of the StateObject. Only needed for objects that can be opened.

The clone is used to open the object: when the object is opened, essentially, self.clone().json_to_opened(...) is called.

open(*, _page=None, **query_args)

Queries the engine to open a StateObject

open_dry_run_result()

Return opened version of the object when run in dry-run mode

Should return an object of the same type as json_to_opened.

opened_to_json_output(value)

Convert an opened pandas value to its JSON representation used for dummy outputs. See here for the specification of outputs. This function is called on an object on which json_to_opened or open_dry_run_result has been called first.

remove(**query_args)

Remove object from server

PARAMETER DESCRIPTION
query_args

TYPE: (optional, dict) DEFAULT: {}

save(name=None, *, dummy_for=None, save_mode=SaveMode.ALWAYS, **query_args)

Save object

Saves the object, e.g., a computation result. As a result of saving, the object is stored permanently (until deleted), and is no longer subject to cache purging (see here).

If a symbolic name name is given as an argument, this name is mapped to the current object. If the name was previously mapped to another object, the mapping is updated but the original object remains available under its handle or under any other symbolic names that it may have. If after saving the previously mapped object no longer has any symbolic names, it will again be subject to purging.

The dummy_for argument can be used to mark an object as dummy data for script recordings; see here.

PARAMETER DESCRIPTION
name

If set: store the object under the given symbolic name. The object can then be retrieved, e.g., by cd.get_table('name') or cd.get(name='name')

DEFAULT: None

save_mode

The default behavior is to always save. Specify this parameter to save only if the target name already exists (SaveMode.IF_EXISTS) or only if it does not yet exist (SaveMode.IF_NOT_EXISTS). Values other than SaveMode.ALWAYS are only allowed if a name is also given. An error is raised if the object was not saved due to the save mode.

DEFAULT: ALWAYS

dummy_for

Handle, as a string of 64 hexadecimal characters. This indicates that the table resulting from this computation should be interpreted as dummy data for the specified handle. May not be specified if name is also specified.

TYPE: str DEFAULT: None

query_args

TYPE: (optional, dict) DEFAULT: {}

RETURNS DESCRIPTION
StateObject

The object, i.e., self

RAISES DESCRIPTION
EngineError

The object could not be persisted (see details above)
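The name-mapping and save-mode rules above can be made concrete with a small in-memory model. This is only a sketch under stated assumptions: ToyStore and Mode are hypothetical names invented for this example, the model covers named saves only, and RuntimeError stands in for the EngineError that the real call raises.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical mirror of SaveMode; the values used here are arbitrary."""

    ALWAYS = "always"
    IF_NOT_EXISTS = "if_not_exists"
    IF_EXISTS = "if_exists"


class ToyStore:
    """In-memory stand-in for the engine's symbolic-name bookkeeping."""

    def __init__(self):
        self.names = {}  # symbolic name -> handle

    def save(self, handle, name, mode=Mode.ALWAYS):
        exists = name in self.names
        if mode is Mode.IF_NOT_EXISTS and exists:
            raise RuntimeError(f"name {name!r} already exists")
        if mode is Mode.IF_EXISTS and not exists:
            raise RuntimeError(f"name {name!r} does not exist")
        # Remaps the name if it was already in use; the old object keeps
        # its handle and any other names it has.
        self.names[name] = handle
        return handle

    def purgeable(self, handle):
        """True if no symbolic name refers to the handle any more."""
        return handle not in self.names.values()
```

Saving a second handle under the same name leaves the first handle without a symbolic name, so in this model it becomes purgeable again, which matches the behavior described for the real engine.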

get(handle=None, *, name=None, dummy_handle=None, prod_handle=None, map_dummy_handles=None, **query_args)

Access a previously uploaded object by its handle or name.

The previously uploaded object is specified by its handle using cd.get(handle) or by its name using cd.get(name=name). Either the handle or the name needs to be specified, but not both.

When called from a script recording, the retrieved object is checked against the object that was used when recording. This check includes the object type and, in the case of tables, also the column names/types.

PARAMETER DESCRIPTION
handle

Handle (hex-encoded string)

TYPE: str DEFAULT: None

name

Symbolic name by which to retrieve the object.

DEFAULT: None

map_dummy_handles

Whenever a script is being recorded (see crandas.script), the default behavior is to interpret all calls to cd.get(handle) as dummy_for:<handle> table names. This allows the user to use the same handle in both script recording and execution, even though the script recording takes place in a different environment where the real table handle does not exist.

This behavior can be overridden in two levels: for the entire script or for a single call to get. For the entire script, mapping dummy handles can be disabled by supplying map_dummy_handles as False in the call to crandas.script.record. For the call to get, by specifying this argument as either True or False, the mapping behavior is forced to be either enabled or disabled, regardless of the current script mode.

TYPE: bool DEFAULT: None

query_args

TYPE: (optional, dict) DEFAULT: {}

RETURNS DESCRIPTION
StateObject

The object with the specified handle or name

RAISES DESCRIPTION
ValueError
  • Schema validation failed
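The rules for choosing between handle, name, and the dummy-handle mapping can be sketched as a small decision function. This is an illustration and not the crandas code: resolve_target is a hypothetical helper, the parameters recording and script_maps_dummies are stand-ins for the script state, and raising ValueError for an ambiguous call is an assumption.

```python
def resolve_target(handle=None, *, name=None, map_dummy_handles=None,
                   recording=False, script_maps_dummies=True):
    """Return the (kind, value) pair that a lookup would use.

    `recording` and `script_maps_dummies` are hypothetical stand-ins for
    the script state; they are not parameters of the real function.
    """
    if (handle is None) == (name is None):
        raise ValueError("specify either a handle or a name, but not both")
    if name is not None:
        return ("name", name)
    if map_dummy_handles is None:
        # No per-call override: follow the script-wide setting while recording.
        map_dummy_handles = recording and script_maps_dummies
    if map_dummy_handles:
        return ("name", "dummy_for:" + handle)
    return ("handle", handle)
```

The three-valued map_dummy_handles argument is what allows a single call to override the script-wide setting in either direction, while None defers to it.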

get_upload_handles(get_from=None)

Get list of handles of all objects uploaded to the engine.

See list_uploads() for an overview of which objects are considered to be "uploaded"

PARAMETER DESCRIPTION
get_from

If set to "mem", only return objects that are in the working memory of the engine server. If set to "disk", only return objects that are not in the working memory. (The latter causes the objects to be loaded into memory at the server.)

TYPE: str DEFAULT: None

list_uploads(*, also_on_disk=True)

Get list of all objects uploaded to the engine.

"Uploaded" objects are the results of functions that upload a local dataframe to the engine, e.g., upload_pandas_dataframe(), DataFrame, and read_csv(). This does not include tables that are computed from other tables or created via demo_table().

Whether the metadata and creation date are given for objects that are stored on disk at the server depends on the also_on_disk parameter below.

PARAMETER DESCRIPTION
also_on_disk

If set to True, objects that need to be retrieved from disk at the server are loaded into memory and their metadata is returned. If set to False, objects that need to be retrieved from disk at the server are listed, but their type and creation date are not returned.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
DataFrame

Dataframe containing handles, creation date, and type information