Installation (local Python)

Warning

The entire installation section only applies if you are working in a production environment or running crandas locally. In our demo environments, crandas is pre-installed – so skip ahead to the next section.

To complete this manual you will need to download and install Python 3.9 or higher.

If you are installing on Windows, don’t forget to tick the box that says Add python.exe to PATH to ensure you can run python from any directory.
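To confirm that the interpreter on your PATH meets the Python 3.9 requirement, a small check such as the one below can be run. This is only a sketch; the helper name python_is_supported is hypothetical and not part of crandas.

```python
import sys

# Minimum version stated in this manual.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum.

    Hypothetical helper for illustration; not part of crandas.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the version of the interpreter that is running this script.
print("Python", sys.version.split()[0], "supported:", python_is_supported())
```

If this reports False, or if running python opens a different version than you expect, check which installation comes first on your PATH.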

Warning

For step 1, do either 1a or 1b, not both. If you are generating your own keys, follow 1a; otherwise, follow 1b.

1a. Generate key pairs using the 2023keygen.py script

Key pairs are needed to securely encrypt any communication with the Virtual Data Lake (such as requesting and approving transactions).

In order to generate the key pairs, follow these steps:

  1. Save the 2023keygen script as a file named 2023keygen_analystapprover.py in your home directory. To open the home directory, type %HOMEPATH% into the address bar of a File Explorer window, or type C:\Users\ and select the folder with your username.

Note

You can also drag and drop the python file into the home directory once you have navigated to it. You can double check the path by pressing Alt + D.

  2. Open a command prompt (press Windows key + R, then type cmd, or search for it) and navigate to the home directory where you saved the 2023keygen_analystapprover.py file: cd %HOMEPATH% on Windows or cd ~ on Linux. If the prompt already shows C:\Users\{your username}, you are already there.

  3. Run the following command to install the required nacl library: pip install pynacl

  4. Now, run the script by executing: python 2023keygen_analystapprover.py

  5. The script will generate a folder called vdl_certs (in the same folder where you saved the keygen file) containing the key pairs. There will be a number of files with the “.sk” and “.pk” extensions.

The .sk files are secret keys that should be kept private. Save them in a local folder on your system.

The .pk files are public keys. Your admin will request these files to set up and operate the VDL.

After completing these steps, you will have successfully generated the key pairs and can proceed to use crandas in your Python scripts.
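To see which files are secret and which are public after the script has run, the folder can be listed by extension. The sketch below assumes the keys are in vdl_certs in your home directory; the helper name list_key_files is hypothetical and is not part of the keygen script or crandas.

```python
from pathlib import Path


def list_key_files(cert_dir):
    """Return (secret_keys, public_keys) as sorted lists of file names.

    Hypothetical helper for illustration: it only looks at file
    extensions and does not inspect the key material.
    """
    cert_dir = Path(cert_dir)
    secret = sorted(p.name for p in cert_dir.glob("*.sk"))
    public = sorted(p.name for p in cert_dir.glob("*.pk"))
    return secret, public


if __name__ == "__main__":
    # Assumed location: the vdl_certs folder in the home directory.
    cert_dir = Path.home() / "vdl_certs"
    if cert_dir.is_dir():
        secret, public = list_key_files(cert_dir)
        print("Secret keys (keep private):", secret)
        print("Public keys (share with your admin):", public)
    else:
        print("No vdl_certs folder found at", cert_dir)
```

Only the files in the second list should ever be sent to your admin.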

1b. Download the certificates and store them in your home folder

As the certificates are the means of authentication to the production environment, they will be provided out of band. You should have received an e-mail that contains a link to our secure filesharing system. The first step is storing these certificates in a sub-folder in your home directory.

Windows

  1. Download the ZIP from the RL fileshare, and store it in your Downloads folder.

  2. Extract the ZIP file

  3. Open your home folder:
    1. Press WIN+R, the Windows Run window should open

    2. Enter %HOMEDRIVE%%HOMEPATH% and press enter.

    3. A new Windows Explorer window should open that shows your Home folder.

  4. In this home folder create a new folder called vdl_certs

  5. Move the contents of the ZIP file you unpacked to the vdl_certs folder you just created.

Linux

  1. Download the ZIP from the RL fileshare, and store it in your Downloads folder.

  2. Extract the ZIP file

  3. Open your home folder in a Terminal window (cd ~)

  4. In this home folder create a new folder called vdl_certs (mkdir vdl_certs)

  5. Move the contents of the ZIP file you unpacked to the vdl_certs folder you just created.
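On either platform, the extract-and-move steps above can also be done from Python with the standard zipfile module. This is a minimal sketch: the helper name extract_certs and the ZIP file name in the usage comment are hypothetical, so substitute the name of the file you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_certs(zip_path, dest_dir):
    """Extract every entry of the ZIP into dest_dir and return the entry names.

    Hypothetical helper for illustration. dest_dir is created if it
    does not exist yet.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


# Example usage (the ZIP file name is an assumption):
# extract_certs(Path.home() / "Downloads" / "certificates.zip",
#               Path.home() / "vdl_certs")
```

If the ZIP contains a top-level folder, the files will end up in a sub-folder of vdl_certs and need to be moved up one level by hand.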

2. Installing crandas in a virtual environment

To be able to use crandas in your Python scripts we are going to install it in a virtual environment using venv.

Windows

  1. Open a Command Prompt window:
    1. Press WIN+R, the Windows Run window should open

    2. Enter cmd and press enter.

    3. Navigate to the Home directory: cd %HOMEPATH% (or same as when saving the script above)

  2. Create a virtual environment by executing: python -m venv .crandas

  3. Activate the virtual environment by executing: .\.crandas\Scripts\activate.bat (once it is active, the prompt shows (.crandas))

  4. Install crandas: pip install crandas==1.3.1 --index-url=https://pypi.rosemancloud.com

Linux

  1. Open a Terminal window and navigate to the home directory folder: cd ~

  2. Create a virtual environment: python3 -m venv .crandas

  3. Activate the virtual environment: source .crandas/bin/activate

  4. Install crandas: pip install crandas=={input version} --index-url=https://pypi.rosemancloud.com (replace {input version} with 1.3.1 for example)
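To verify the result of these steps, the following sketch reports whether the interpreter is running inside a virtual environment and which crandas version, if any, is installed in it. It uses only the standard library; the helper names are hypothetical.

```python
import sys
from importlib import metadata


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment.

    Inside a virtual environment, sys.prefix points at the environment
    while sys.base_prefix points at the underlying Python installation.
    """
    return sys.prefix != sys.base_prefix


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Virtual environment active:", in_virtual_environment())
print("crandas version:", installed_version("crandas"))
```

If the second line prints None, the install command was most likely run outside the activated environment.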

3. Setting up the session variables in your script

We should still be in the virtual environment we created, as shown by (.crandas) in the prompt. Next, we install a development environment to make working with crandas easier. For example, pip install notebook installs Jupyter Notebook in the virtual environment (the installation has finished when the output says successfully installed...).

Note

Once you have installed crandas in your virtual environment, you can use it with any Python editor of your choice.

For example, we can start creating our analysis by executing jupyter notebook and clicking New to start a new notebook.

Finally we need to tell crandas which VDL endpoint and which certificates to use when running your analysis. An example is included below.

#import the crandas package we have installed
import crandas as cd

#import the session class, to set variables
from crandas.base import session

#import Path from pathlib
from pathlib import Path

#Update to https://**NODE_IP**:**NODE_PORT**/api/v1
#(e.g. https://vdl-1c-cr-node2.rosemancloud.com:32601/api/v1) -
#this will be provided by the Roseman Labs admin
session.endpoint = '_____'

#Set the base path to the folder where we have stored the certificates
session.base_path = Path.home() / 'vdl_certs'

#Set the path to analystsign.sk
#(check the folder vdl_certs in your home directory to see the id added to the end of the file)
session.query_signing_key = Path.home() / 'vdl_certs' / 'analystsign0.sk'

#Set the path to the http cert. used to communicate with VDL node
#(this will be provided by the Roseman Labs Admin)
session.certificate_path = Path("httpd0.crt")

#(for on-premise only) Set the assert hostname to the correct
#DNS host name (this will be provided by the Roseman Labs admin)
session.assert_hostname = '    '

#Set path to json which contains the signed transactions -
#this can be downloaded from the Web Portal (script approval platform) once the cluster is running.
session.authorization_file = 'signed-transactions.jsonl'

#Set to True if all JSONs that are sent to VDL node need to be printed
session.print_json = None

After this, we can check which VDL server we will connect to (to double-check we have set it correctly).

#Show which VDL server we will connect to
print("Virtual Data Lake URL: " + cd.base.session.endpoint)
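Beyond printing the endpoint, its shape can be checked against the https://NODE_IP:NODE_PORT/api/v1 pattern mentioned in the example. The sketch below assumes that the endpoint always uses https, always includes an explicit port, and always ends in /api/v1; the helper name endpoint_looks_valid is hypothetical and not part of crandas.

```python
from urllib.parse import urlsplit


def endpoint_looks_valid(endpoint):
    """Check for the https://HOST:PORT/api/v1 shape.

    Hypothetical helper for illustration. It inspects only the text of
    the URL and does not contact the server.
    """
    try:
        parts = urlsplit(endpoint)
        port = parts.port  # raises ValueError for an invalid port
    except ValueError:
        return False
    return (
        parts.scheme == "https"
        and bool(parts.hostname)
        and port is not None
        and parts.path.rstrip("/") == "/api/v1"
    )


# The unfilled placeholder from the example does not pass the check.
print(endpoint_looks_valid("_____"))
```

A check like this catches a forgotten placeholder or a missing port before any request is sent.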

Users may confirm that a connection has been established by checking the value of base.session.connected.

print(cd.base.session.connected)

If the value is set to True, this indicates that the connection has been successfully made.
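If the connection cannot be made, a common cause is a key or certificate path that does not point to an existing file. The sketch below checks this using pathlib only. The helper name missing_files is hypothetical, the file names are taken from the example above (the id at the end of your file names may differ), and it assumes that httpd0.crt is stored in the vdl_certs folder.

```python
from pathlib import Path


def missing_files(base_path, relative_names):
    """Return the names that do not exist as files under base_path.

    Hypothetical helper for illustration; not part of crandas.
    """
    base_path = Path(base_path)
    return [name for name in relative_names if not (base_path / name).is_file()]


if __name__ == "__main__":
    # File names from the example configuration; adjust the id suffix
    # to match the files in your own vdl_certs folder.
    expected = ["analystsign0.sk", "httpd0.crt"]
    print("Missing files:", missing_files(Path.home() / "vdl_certs", expected))
```

An empty list means every expected file was found.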

Warning

To activate your virtual environment again:

  - Press Windows key + R, then type cmd.

  - Run .\.crandas\Scripts\activate.bat