Installation (local Python)

This page explains how to install crandas locally. Note that crandas can run in either a design or a production environment. More information about these environments and how they relate to each other can be found on the help center.

Warning

The entire installation section only applies if you are working in a production environment or running crandas locally. In demo environments or Jupyter environments provided by Roseman Labs, crandas is pre-installed, so you can skip ahead to the next section. To follow this guide you will need to download and install Python 3.9 or higher. If you are installing on Windows, don’t forget to tick the box that says Add python.exe to PATH, so that you can run python from any directory.
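As a quick sanity check of the Python requirement above, a minimal sketch using only the standard library is shown here. The helper names `python_is_supported` and `python_on_path` are hypothetical and chosen for illustration; they are not part of crandas.

```python
import shutil
import sys

MIN_VERSION = (3, 9)  # minimum Python version stated in this guide


def python_is_supported(version_info=None):
    """Return True if the given (or the running) Python is 3.9 or higher."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor), e.g. (3, 10) >= (3, 9).
    return tuple(version_info[:2]) >= MIN_VERSION


def python_on_path():
    """Return the first `python` executable found on PATH, or None."""
    return shutil.which("python")


print("Python version OK:", python_is_supported())
print("python on PATH:", python_on_path())
```

If the second line prints None on Windows, the Add python.exe to PATH box was probably not ticked during installation.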

1. Connecting to the virtual data lake

There are two ways to connect to the virtual data lake (VDL): using the actual keys and certificates, or using a connection file that bundles all of this material. From version 1.9, the connection file is the recommended way to connect to the VDL.

Overview of certificates and files involved

Before you begin this process, it is useful to understand what each of the files does.

  • Server public keys (.pk): There is a public key for each of the 3 servers. These keys are used to encrypt the input data that is sent to each server. The crandas client connects to one of the servers and sends the encrypted input data. The connected server, which can decrypt only 1 of the 3 encrypted streams, then forwards the encrypted data to the other two servers.

  • Certificates: When the crandas client connects to the Virtual Data Lake server, the certificate is used to authenticate the server to the client. This ensures that the client is connecting to the correct server.

  • Analyst key (.sk): When an analyst uploads an analysis to the portal, it is signed with their analyst key. When executing in production, this key is needed to verify (against the matching public key) that the analyst who uploaded the analysis is the one executing it. This key should remain private.

  • Connection file: A file that contains the URL and certificate for one of the servers, plus the public keys of the 3 servers used to encrypt uploads.

Start a connection

Log in to the portal and go to Settings >> Account. Click on ‘Download connection file’; the file <your-environment>.vdlconn will be stored automatically in your Downloads folder. Then move it into place as follows:

  1. Open your home folder:
    1. Press WIN+R, the Windows Run window should open

    2. Enter %HOMEDRIVE%%HOMEPATH% and press enter.

    3. A new Windows Explorer window should open that shows your Home folder.

  2. In this home folder create a new folder called .config (unless it already exists)

  3. Go inside your .config folder and create a new folder named crandas (unless it already exists)

  4. Move the file <your-environment>.vdlconn to the crandas folder you just created.
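The folder steps above can also be scripted. The following is a minimal sketch using only the standard library; the function name `install_connection_file` is hypothetical, and the file name in the usage comment is a placeholder for your own <your-environment>.vdlconn file.

```python
import shutil
from pathlib import Path


def install_connection_file(source, home=None):
    """Move a .vdlconn file into <home>/.config/crandas and return its new path."""
    source = Path(source)
    if source.suffix != ".vdlconn":
        raise ValueError(f"expected a .vdlconn file, got {source.name!r}")
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".config" / "crandas"
    # Create .config and crandas if they do not exist yet (steps 2 and 3).
    target_dir.mkdir(parents=True, exist_ok=True)
    # Move the downloaded file into place (step 4).
    target = target_dir / source.name
    shutil.move(str(source), str(target))
    return target


# Example call (placeholder file name, not executed here):
# install_connection_file(Path.home() / "Downloads" / "my-environment.vdlconn")
```

`Path.home()` resolves to the same folder as %HOMEDRIVE%%HOMEPATH% on a typical Windows setup, so the result should match the manual steps.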

In case you want to generate the keys and certificates yourself, you can follow these steps:

Generate own key material
  1. You will receive a 2023keygen.py script. Save this script as a file named 2023keygen_analystapprover.py in your home directory. To open your home directory, type %HOMEPATH% into the address bar of a File Explorer window, or type C:\Users\ and open the folder with your username.

  2. Open a command prompt (press Windows key + R, then type cmd, or search for it) and navigate to your home directory, where you saved the 2023keygen_analystapprover.py file: cd %HOMEPATH% on Windows or cd ~ on Linux. If the prompt already shows C:\Users\{your username}, you are already there.

  3. Run the following command to install the required PyNaCl library: pip install pynacl

  4. Now, run the script by executing: python 2023keygen_analystapprover.py

  5. The script will generate a folder called vdl_certs (in the same folder where you saved the keygen file) and the key pairs will be inside.

After completing these steps, you will have successfully generated the key pairs and can proceed to use crandas in your Python scripts.
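To check that the key generation produced something, the contents of the vdl_certs folder can be listed. The sketch below assumes that the generated files use the .sk and .pk suffixes mentioned in the overview; the exact file names written by the keygen script are not specified here, and the helper name `list_key_material` is hypothetical.

```python
from pathlib import Path


def list_key_material(folder="vdl_certs"):
    """Group the files in `folder` by suffix: .sk (private) and .pk (public)."""
    folder = Path(folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"{folder} not found; has the keygen script been run?")
    return {
        # Assumption: private keys end in .sk and public keys in .pk.
        "private": sorted(p.name for p in folder.glob("*.sk")),
        "public": sorted(p.name for p in folder.glob("*.pk")),
    }


# Example call, run from the directory that contains vdl_certs (not executed here):
# print(list_key_material())
```

Any file reported under "private" should stay on your machine and should not be shared.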

2. Installing crandas

To use crandas in your Python scripts, install it with: pip install crandas --index-url=https://pypi.rosemancloud.com

crandas is hosted on a private Roseman Labs server rather than on pypi.org, so the server URL must be given explicitly.

Install a specific version of crandas

To install a specific version of crandas, run: pip install crandas==<version> --index-url=https://pypi.rosemancloud.com (replace <version> with the version of crandas you wish to install, e.g. v1.9.0).
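When installation is driven from a script, the pip command can be assembled as an argument list. This is a minimal sketch: `pip_install_command` is a hypothetical helper, and only the package name and index URL come from this guide. Using `sys.executable -m pip` makes pip install into the interpreter that runs the script.

```python
import sys

INDEX_URL = "https://pypi.rosemancloud.com"  # private index named in this guide


def pip_install_command(version=None):
    """Build the argument list for installing crandas from the private index."""
    requirement = "crandas"
    if version:
        # Pin an exact version, e.g. crandas==1.9.0
        requirement += "==" + version
    return [
        sys.executable, "-m", "pip", "install",
        requirement,
        f"--index-url={INDEX_URL}",
    ]


# Example (not executed here):
# import subprocess
# subprocess.run(pip_install_command("1.9.0"), check=True)
```

Passing the arguments as a list avoids shell quoting issues on both Windows and Linux.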

We recommend installing crandas in a virtual environment. To do so using venv on Windows:

  1. Open a Command Prompt window:
    1. Press WIN+R, the Windows Run window should open

    2. Enter cmd and press enter.

    3. Navigate to the Home directory: cd %HOMEPATH% (the same directory in which you saved the script above)

  2. Create a virtual environment by executing: python -m venv .crandas

  3. Activate the virtual environment by executing: .\.crandas\Scripts\activate.bat (once it is activated, the prompt is prefixed with (.crandas))

  4. Install crandas: pip install crandas --index-url=https://pypi.rosemancloud.com
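Besides looking at the (.crandas) prompt prefix, activation can be confirmed from within Python. The sketch below relies on the standard behaviour that `sys.prefix` differs from `sys.base_prefix` inside an environment created with venv; the helper names are hypothetical, and the Linux/macOS activation path is the usual venv layout rather than something stated in this guide.

```python
import sys
from pathlib import Path


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def activation_script(env_dir=".crandas", platform=None):
    """Return the activation script path for an environment created with venv."""
    platform = sys.platform if platform is None else platform
    env_dir = Path(env_dir)
    if platform.startswith("win"):
        return env_dir / "Scripts" / "activate.bat"
    # Usual layout on Linux and macOS (used with `source`).
    return env_dir / "bin" / "activate"


print("Running inside a virtual environment:", in_virtual_environment())
```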

Note

When you exit your virtual environment, you can activate it again as follows:
  • Press Windows key + R, then type cmd.

  • From your home directory, run .\.crandas\Scripts\activate.bat

Now that crandas has been installed, we can start creating a script in a Python editor of our choice. This guide uses Jupyter.

Note

On Debian/Ubuntu systems, you first need to install the python3-venv package (or the variant that matches your Python version), for example:
apt install python3.10-venv

You might need super user (sudo) privileges to execute this command.
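Whether the venv tooling is available can be checked before creating an environment. This sketch assumes that, on Debian/Ubuntu systems without the venv package, the piece that is typically missing is the standard `ensurepip` module; `venv_support_available` is a hypothetical helper name.

```python
import importlib.util


def venv_support_available():
    """Return True if both the `venv` and `ensurepip` modules can be found."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("venv", "ensurepip")
    )


print("venv support available:", venv_support_available())
```

If this prints False, installing the package from the note above should make both modules available.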

For use with Jupyter notebooks

For use with Jupyter, install crandas with the notebook extra. This also installs the dependencies that let crandas work well with Jupyter, in particular the progress bar for long-running operations: pip install crandas[notebook] --index-url=https://pypi.rosemancloud.com

Note

When using Jupyter notebooks in Visual Studio Code, make sure to have the latest versions of packages, as earlier versions of VS Code did not correctly display the progress bar. More information about which package versions are needed can be found here.

  • pip install --force-reinstall -v "ipywidgets == 7.7.2"

  • pip install --force-reinstall -v "jupyterlab_widgets == 1.1.1"
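To see which versions of these packages are currently installed, the standard `importlib.metadata` module can be queried. This is a minimal sketch; the helper names are hypothetical and the pinned versions are copied from the two commands above.

```python
from importlib import metadata

# Versions pinned by the two pip commands in the note above.
PINNED = {"ipywidgets": "7.7.2", "jupyterlab_widgets": "1.1.1"}


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pinned=PINNED):
    """Map each package name to (installed, wanted, matches)."""
    report = {}
    for package, wanted in pinned.items():
        installed = installed_version(package)
        report[package] = (installed, wanted, installed == wanted)
    return report


for name, (installed, wanted, ok) in check_pins().items():
    print(f"{name}: installed={installed} wanted={wanted} ok={ok}")
```

Run this inside the activated virtual environment, since it reports on the packages of the interpreter that executes it.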

3. Start an analysis

We should still be in the virtual environment we created, as shown by the (.crandas) prefix. Next, install a development environment to make working with crandas easier. For example, pip install notebook installs Jupyter Notebook in the virtual environment (the installation has finished when it prints Successfully installed ...).

We can start creating our analysis by executing python -m jupyter notebook (or opening any other Python editor) and clicking New to start a new notebook.

Finally, we need to tell crandas which VDL endpoint and which certificates to use when running the analysis. This depends on whether you used the connection file or the actual certificates and key material. With the connection file:

import crandas as cd
from crandas.base import session

# connect your session to the VDL
session.connect("<your-environment>")

After this, we can check which VDL server we will connect to (to double-check we have set it correctly).

# Show which VDL server we will connect to
print("Virtual Data Lake URL: " + cd.base.session.endpoint)

To confirm that the setup is correct, send any query to the server, for example cd.demo_table().
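The connection and the test query can be wrapped in one small diagnostic. This is a sketch only: `check_setup` is a hypothetical helper, and it uses nothing beyond the calls shown in this section (`session.connect`, `cd.demo_table` and `session.endpoint`). It reports a failure instead of raising, so it also runs where crandas is not installed.

```python
def check_setup(environment):
    """Try to connect and run one query; return (ok, message)."""
    try:
        import crandas as cd
        from crandas.base import session
    except ImportError as exc:
        return False, f"crandas is not importable: {exc}"
    try:
        # `environment` is the name of your connection file, as used above.
        session.connect(environment)
        # Any query will do; this is the one suggested in this guide.
        cd.demo_table()
    except Exception as exc:  # broad on purpose: this is only a diagnostic
        return False, f"connection or query failed: {exc}"
    return True, f"connected to {session.endpoint}"


# Example call (placeholder name, not executed here):
# print(check_setup("<your-environment>"))
```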

After this, crandas is ready to use for secure computations.