crandas.check_recording
Functionalities to check script recordings.
This module provides functionality to detect issues that may cause a recorded script to fail when used in production.
Note that the checks described on this page are only performed during script recording (and not, e.g., during script use).
Conditional call detection
Conditional call detection checks that crandas calls are not performed conditionally during script recording; see tips_conditional.
For example, in the code example below, the call cd.DataFrame(...) is
performed only if some_condition() holds. If this condition
is true during script recording but false when the script is applied to real data,
using the script will cause an error:
The same applies, for example, when calling crandas functions from an if or
while statement. Because such a crandas call can be problematic,
crandas issues a ConditionalCallDetected warning in such cases.
Note that, if the crandas call is not actually performed during script recording,
e.g., if some_condition() is False, then no warning is given.
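The pattern described above can be illustrated with a small static check built on Python's standard ast module. This is only a sketch and not crandas's actual detection mechanism: the names SOURCE, CONDITIONAL_NODES and find_conditional_calls are hypothetical, and the sample script is analyzed as a string so that nothing is executed. Unlike crandas, which warns only when the call is actually performed during recording, this static sketch flags the call regardless of whether the condition holds.

```python
import ast

# Sample recording script, kept as a string so it is parsed but never run.
SOURCE = '''
import crandas as cd

cd.script.record()
if some_condition():
    cd.DataFrame({"a": [1, 2, 3]})
table = cd.DataFrame({"b": [4, 5, 6]})
'''

# Statements whose bodies may be skipped or run a varying number of times.
CONDITIONAL_NODES = (ast.If, ast.While, ast.For)


def find_conditional_calls(source, module_alias="cd"):
    """Return (line, enclosing statement type) for calls such as cd.X(...)
    that are nested inside a conditional statement."""
    found = []

    def visit(node, enclosing):
        if isinstance(node, CONDITIONAL_NODES):
            enclosing = node
        if (
            enclosing is not None
            and isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == module_alias
        ):
            found.append((node.lineno, type(enclosing).__name__))
        for child in ast.iter_child_nodes(node):
            visit(child, enclosing)

    visit(ast.parse(source), None)
    return found


flagged = find_conditional_calls(SOURCE)
print(flagged)
```

Only the cd.DataFrame call inside the if statement is reported; the unconditional call on the last line of the sample script is not.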
Suppressing warnings
In many cases, use of crandas from an if, for, etc., is legitimate
(for example, a for loop that always runs the same
number of iterations). In such cases, the warning can be suppressed by adding a
comment that contains the text crandas-dontwarn to the conditional
statement, e.g.:
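A minimal sketch of where the comment goes, assuming it sits on the line that opens the conditional statement. To keep the snippet runnable without a crandas connection, upload_chunk is a hypothetical stand-in for a crandas call such as cd.DataFrame(...).

```python
def upload_chunk(index):
    # Hypothetical stand-in for a crandas call such as cd.DataFrame(...).
    return {"chunk": index}


uploaded = []
# The loop always runs exactly three times, so recording and production
# perform the same sequence of calls; the label marks the loop as intentional.
for index in range(3):  # crandas-dontwarn
    uploaded.append(upload_chunk(index))
```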
Performing joins during script recording
When a join is performed during script recording, crandas warns that the join can cause errors in production if the values of the join columns are not unique. For example, the following causes a warning:
import crandas as cd
cd.script.record()
cdf = cd.demo_table(10, 10)
cd.merge(cdf, cdf, left_on="col1", right_on="col2")
cd.merge(cdf, cdf, left_on=cdf.groupby("col1"), right_on="col2")
cd.merge(cdf, cdf, left_on="col1", right_on=cdf.groupby("col2"))
To suppress this warning, pass the appropriate validate argument to the
merge function: 1:1 for a one-to-one join; 1:m for a one-to-many
join; or m:1 for a many-to-one join, e.g.:
(...)
cd.merge(cdf, cdf, on="col1", validate="1:1")
cd.merge(cdf, cdf, left_on=cdf.groupby("col1"), right_on="col2", validate="m:1")
cd.merge(cdf, cdf, left_on="col1", right_on=cdf.groupby("col2"), validate="1:m")
This warning can be disabled; see below.
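The three validate values each assert which side of the join has unique key values. The plain-Python sketch below illustrates that meaning on ordinary lists, following the same convention as the validate argument of pandas's merge (1:m requires unique left keys, m:1 requires unique right keys). The function join_relationship is hypothetical and says nothing about how crandas itself performs the check.

```python
from collections import Counter


def join_relationship(left_keys, right_keys):
    """Classify a join by key uniqueness: '1:1', '1:m', 'm:1' or 'm:m'."""
    left_unique = all(count == 1 for count in Counter(left_keys).values())
    right_unique = all(count == 1 for count in Counter(right_keys).values())
    if left_unique and right_unique:
        return "1:1"
    if left_unique:
        return "1:m"  # unique on the left, repeated values on the right
    if right_unique:
        return "m:1"  # repeated values on the left, unique on the right
    return "m:m"      # neither side unique: none of the three values applies


print(join_relationship([1, 2, 3], [1, 2, 3]))
print(join_relationship([1, 1, 2], [1, 2]))
```

A join recorded with validate="1:1" on data that later turns out to be many-to-one is exactly the kind of production mismatch that the warning is meant to flag.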
Configuring the checking functionality
The checking functionality can be configured using the following configuration
options (see crandas.config):
check_recording (bool, default: True): enable/disable recording check
check_recording_label (str, default: crandas-dontwarn): label to suppress warnings (see above)
Important
Be careful when changing this setting from outside the script: doing so lowers the reusability of scripts when sharing them with people who use other settings.
check_recording_throw (bool, default: False): throw exception instead of warning
check_recording_conditional (bool, default: True): if recording check is enabled, check for conditional calls
check_recording_join (bool, default: True): if recording check is enabled, check for joins
For example, the snippet below changes the label to suppress warnings and then uses the updated label:
ConditionalCallDetected(filename, conditional_node)
Bases: UserWarning
Represents a warning that a conditional crandas call was detected
JoinDetected(*, left_on, right_on, **kwargs)
Bases: UserWarning
Warning that the user performs a join without the validate argument in script recording
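Because both classes derive from UserWarning, Python's standard warnings module can presumably be used to capture them or turn them into exceptions, assuming crandas emits them through the usual warnings machinery. The sketch below uses a locally defined stand-in class and a hypothetical recorded_step function so that it runs without crandas. The check_recording_throw option described above remains the documented way to get exceptions.

```python
import warnings


class ConditionalCallDetected(UserWarning):
    """Stand-in for crandas.check_recording.ConditionalCallDetected (assumption)."""


def recorded_step():
    # Hypothetical function that emits the warning the way a library would.
    warnings.warn("conditional crandas call detected", ConditionalCallDetected)
    return "done"


# Capture the warning for inspection instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = recorded_step()
categories = [w.category for w in caught]

# Escalate only this warning category to an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", category=ConditionalCallDetected)
    try:
        recorded_step()
        escalated = False
    except ConditionalCallDetected:
        escalated = True
```

Both filters are scoped to their with blocks, so the global warning configuration is left unchanged afterwards.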