crandas.check_recording¶
Functionalities to check script recordings.
This module detects potential issues that may cause a recorded script to fail or misbehave when used in production.
Note that the checks described on this page are only performed during script recording (and not, e.g., during script use).
Conditional call detection¶
Conditional call detection checks that crandas calls are not performed conditionally during script recording; see "Crandas commands should be outside conditional branches".
For example, in the code below, the call cd.DataFrame(...) is performed only if some_condition() holds. If this condition is true during script recording but false when the script is applied to real data, using the script will cause an error:
import crandas as cd
cd.script.record()
if some_condition():
    # This call is only recorded if the condition holds during recording
    cd.DataFrame(...)
The same is true, for example, when calling crandas functions from an if or while statement. Because such a crandas call is potentially problematic, crandas issues a ConditionalCallDetected warning in that case.
Note that if the crandas call is not actually performed during script recording, e.g., if some_condition() is False, then no warning is given.
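This page does not describe how crandas locates the conditional statement around a call. As a rough illustration only, the sketch below uses the standard ast module to match a call site on a given line to an enclosing if, for, or while statement. The helper name find_enclosing_conditional, the set of statement types, and the sample script are all assumptions made for this example.

```python
import ast

# Statement types treated as "conditional" in this sketch (an assumption;
# the real check may cover a different set of constructs).
CONDITIONAL_NODES = (ast.If, ast.For, ast.While)


def find_enclosing_conditional(source, lineno):
    """Return the innermost conditional statement whose body spans `lineno`,
    or None if the line is not inside any conditional."""
    innermost = None
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, CONDITIONAL_NODES):
            continue
        # The header line itself (node.lineno) is not counted as body.
        if node.lineno < lineno <= node.end_lineno:
            if innermost is None or node.lineno > innermost.lineno:
                innermost = node
    return innermost


# The script is only parsed, never executed, so crandas is not needed here.
SAMPLE_SCRIPT = """\
import crandas as cd
cd.script.record()
if some_condition():
    cd.DataFrame(...)
cd.demo_table(1, 1)
"""

inside = find_enclosing_conditional(SAMPLE_SCRIPT, 4)   # the cd.DataFrame(...) line
outside = find_enclosing_conditional(SAMPLE_SCRIPT, 5)  # the cd.demo_table(...) line
```

In this sketch, one-line conditionals such as `if x: call()` are not matched, because the call shares a line with the statement header.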
Suppressing warnings¶
In many cases, using crandas from within an if, for, etc. can be legitimate (for example, in the case of a for loop that always runs the same number of times). In such cases, the warning can be suppressed by adding a comment that contains the text crandas-dontwarn to the conditional statement, e.g.:
import crandas as cd
cd.script.record()
if some_condition():  # crandas-dontwarn
    cd.DataFrame(...)
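To make the suppression rule concrete, the sketch below checks whether the line that opens a conditional statement carries a comment containing the label. It relies only on the standard tokenize module. The helper name has_suppress_label is hypothetical, and crandas may perform this lookup differently.

```python
import io
import tokenize


def has_suppress_label(source, lineno, label="crandas-dontwarn"):
    """Return True if a comment on line `lineno` of `source` contains `label`."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return any(
        tok.type == tokenize.COMMENT
        and tok.start[0] == lineno
        and label in tok.string
        for tok in tokens
    )


# The script is only tokenized, never executed.
SAMPLE_SCRIPT = """\
if some_condition():  # crandas-dontwarn
    cd.DataFrame(...)
if other_condition():
    cd.DataFrame(...)
"""

suppressed = has_suppress_label(SAMPLE_SCRIPT, 1)      # comment with label present
not_suppressed = has_suppress_label(SAMPLE_SCRIPT, 3)  # no comment on this line
```

Using the tokenizer rather than a plain substring search means that the label is only recognized inside a real comment, not inside a string literal on the same line.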
Performing joins during script recording¶
When a join is performed during script recording, crandas issues a warning that the join can cause errors in production if the values of the join columns are not unique. For example, the following will cause a warning:
import crandas as cd
cd.script.record()
cdf = cd.demo_table(10, 10)
cd.merge(cdf, cdf, left_on="col1", right_on="col2")
cd.merge(cdf, cdf, left_on=cdf.groupby("col1"), right_on="col2")
cd.merge(cdf, cdf, left_on="col1", right_on=cdf.groupby("col2"))
To suppress this warning, pass the appropriate validate argument to the merge function: "1:1" for a one-to-one join; "1:m" for a one-to-many join; or "m:1" for a many-to-one join, e.g.:
(...)
cd.merge(cdf, cdf, on="col1", validate="1:1")
cd.merge(cdf, cdf, left_on=cdf.groupby("col1"), right_on="col2", validate="m:1")
cd.merge(cdf, cdf, left_on="col1", right_on=cdf.groupby("col2"), validate="1:m")
This warning can be disabled; see below.
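The validate values express which side of a join must have unique key values. The plain-Python sketch below illustrates that requirement on ordinary lists, assuming the pandas convention in which "1:m" requires unique left keys and "m:1" requires unique right keys. It is not crandas code, and the helper names has_unique_values and check_join_keys are hypothetical.

```python
def has_unique_values(values):
    """Return True if no value occurs more than once."""
    return len(set(values)) == len(values)


def check_join_keys(left_keys, right_keys, validate):
    """Return True if the key lists satisfy the given join relationship."""
    if validate == "1:1":
        return has_unique_values(left_keys) and has_unique_values(right_keys)
    if validate == "1:m":
        return has_unique_values(left_keys)
    if validate == "m:1":
        return has_unique_values(right_keys)
    raise ValueError(f"unsupported validate value: {validate!r}")


left = [1, 2, 3]   # unique keys
right = [1, 1, 2]  # the key 1 is repeated

one_to_one = check_join_keys(left, right, "1:1")   # fails: right keys repeat
one_to_many = check_join_keys(left, right, "1:m")  # holds: left keys are unique
many_to_one = check_join_keys(left, right, "m:1")  # fails: right keys repeat
```

A script recorded on data where the keys happen to be unique can still meet repeated keys in production, which is why stating the intended relationship explicitly is useful.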
Configuring the checking functionality¶
The checking functionality can be configured using the following configuration options (see crandas.config):
- check_recording (bool, default: True): enable/disable the recording check
- check_recording_label (str, default: crandas-dontwarn): label to suppress warnings (see above). NOTE: be careful when changing this setting from outside of the script, because doing so lowers the reusability of scripts when sharing them with people who use other settings
- check_recording_throw (bool, default: False): throw an exception instead of a warning
- check_recording_conditional (bool, default: True): if the recording check is enabled, check for conditional calls
- check_recording_join (bool, default: True): if the recording check is enabled, check for joins
For example, the snippet below changes the label to suppress warnings and then uses the updated label:
import crandas as cd
from crandas.config import settings
settings.check_recording_label = "ignore"
cd.script.record()
if True:  # ignore
    cd.demo_table(1, 1)
- exception crandas.check_recording.ConditionalCallDetected(filename, conditional_node)¶
Bases: UserWarning
Represents a warning that a conditional crandas call was detected
- exception crandas.check_recording.JoinDetected(*, left_on, right_on, **kwargs)¶
Bases: UserWarning
Warning that the user performs a join without the validate argument in script recording
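Because both classes derive from UserWarning, Python's standard warnings filters should apply to them, assuming crandas emits them through the warnings module (this page does not state that explicitly). The sketch below uses a hypothetical stand-in class, DemoJoinDetected, so that it runs without crandas. It shows how such a warning can be captured, or escalated to an exception, within a limited scope.

```python
import warnings


class DemoJoinDetected(UserWarning):
    """Hypothetical stand-in for a crandas recording warning."""


def risky_call():
    # Stand-in for a crandas call that triggers a recording check.
    warnings.warn("join without validate argument", DemoJoinDetected)
    return "done"


# Capture the warning in a list instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = risky_call()
captured_categories = [w.category for w in caught]

# Escalate this warning category to an exception, only inside the block.
with warnings.catch_warnings():
    warnings.simplefilter("error", category=DemoJoinDetected)
    try:
        risky_call()
        escalated = False
    except DemoJoinDetected:
        escalated = True
```

Both filters are scoped to their with block, so the interpreter's global warning configuration is restored afterwards.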