.. _installing:

Installation (local python)
###########################

This page explains how crandas can be installed locally. Note that crandas can run in either a design or a production environment. More information about these environments and how they relate to each other can be found on the `help center `_.

.. warning::

    The entire installation section only applies if you are working in a production environment or running crandas locally. In demo environments or Jupyter environments provided by Roseman Labs, crandas is pre-installed, so skip ahead to :ref:`the next section `.

To complete this manual you will need to download and install `Python 3.9 or higher `_. If you are installing on Windows, don't forget to tick the box that says **Add python.exe to PATH** to ensure you can run python from any directory.

Overview of certificates and files involved
============================================

Before you begin this process, it is useful to understand what each of the files does.

- **Server public keys (.pk)**: We have a public key for each of the 3 servers. These keys are used to encrypt input data towards each server. The crandas client connects to one of the servers and sends the encrypted input data. The connected server, which can decrypt only 1 of the 3 encrypted streams, then forwards the encrypted data to the other two servers.
- **Certificates:** When the crandas client connects to the Virtual Data Lake server, we use the certificate to authenticate the server to the client. This ensures that the client is connecting to the correct server.
- **Analyst key:** When an analysis is uploaded to the portal, it is signed with the analyst's secret key (.sk). When executing in production, this key is needed to ensure that the analyst who uploaded the analysis is the one who actually executes it (verified with the public key).
- **Connection file:** It contains the URL and certificate for one of the servers, and the public keys of the 3 servers, which are used to encrypt the uploads.

.. note::

    There are two different ways to connect to the Virtual Data Lake (VDL): using the connection file, or using the certificate and the server public keys. From version 1.9, the connection file is the recommended way to connect to the VDL.

1. Downloading the files and storing them in your home folder
===============================================================

1a. Connection file (v1.9.0 or higher)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Log in to the portal and go to Settings >> Account. Click on 'Download connection file', and the file :code:`.vdlconn` will be automatically stored in your :code:`Downloads` folder.

Windows
---------

1. Open your home folder:

   a. Press WIN+R, the Windows Run window should open
   b. Enter :code:`%HOMEDRIVE%%HOMEPATH%` and press enter.
   c. A new Windows Explorer window should open that shows your Home folder.

2. In this home folder create a new folder called :code:`.config` (unless it already exists)
3. Go inside your :code:`.config` folder and create a new folder named :code:`crandas` (unless it already exists)
4. Move the file :code:`.vdlconn` to the :code:`crandas` folder you just created.

Linux
-------

1. Open your home folder in a Terminal window (:code:`cd ~`)
2. Create a new :code:`crandas` folder inside the :code:`.config` folder (:code:`mkdir -p ~/.config/crandas`). If the :code:`.config` folder doesn't exist, it will be created.
3. Move the file :code:`.vdlconn` to the :code:`crandas` folder you just created (:code:`mv ~/Downloads/.vdlconn ~/.config/crandas`)

1b. Certificate and public keys
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As the certificates are the means of authentication to the production environment, they will be provided out of band. You should have received an e-mail that contains a link to `our secure filesharing system `_.
The first step is storing these certificates in a sub-folder in your home directory.

Windows
---------

1. Download the ZIP or TAR file from the RL fileshare, and store it in your :code:`Downloads` folder.
2. Extract the ZIP or TAR file. You can do this with a program such as 7-Zip.
3. Open your home folder:

   a. Press WIN+R, the Windows Run window should open
   b. Enter :code:`%HOMEDRIVE%%HOMEPATH%` and press enter.
   c. A new Windows Explorer window should open that shows your Home folder.

4. In this home folder create a new folder called :code:`vdl_certs`
5. Move the contents of the ZIP/TAR file you unpacked to the :code:`vdl_certs` folder you just created.

Linux
-------

1. Download the ZIP or TAR file from the RL fileshare, and store it in your :code:`Downloads` folder.
2. Extract the ZIP or TAR file (go to the folder where the file is located, right-click on the file and then choose "Extract here" or "Extract to...").
3. Open your home folder in a Terminal window (:code:`cd ~`)
4. In this home folder create a new folder called :code:`vdl_certs` (:code:`mkdir vdl_certs`)
5. Move the contents of the ZIP/TAR file you unpacked to the :code:`vdl_certs` folder you just created.

2. Installing crandas
======================

To be able to use crandas in your Python scripts, we install it using: ``pip install crandas --index-url=https://pypi.rosemancloud.com``. crandas is hosted on a private Roseman Labs server rather than `pypi.org `_, so it is necessary to explicitly add the server URL.

Install a specific version of crandas
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To install a specific version of crandas we can run: ``pip install crandas==<version> --index-url=https://pypi.rosemancloud.com``. Replace ``<version>`` with the version of crandas you wish to install, e.g. ``v1.8.0``.

.. hint::

    We recommend the use of virtual environments to install crandas and its dependencies, especially for beginner users.

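After running one of the ``pip install`` commands above, it can be useful to confirm which crandas version ended up in the active environment. The following is a minimal sketch that uses only the Python standard library (``importlib.metadata``); the helper name ``installed_version`` is made up for illustration and is not part of crandas.

.. code:: python

    from importlib.metadata import PackageNotFoundError, version
    from typing import Optional


    def installed_version(package: str) -> Optional[str]:
        """Return the installed version of ``package``, or None if it is missing."""
        try:
            return version(package)
        except PackageNotFoundError:
            return None


    if __name__ == "__main__":
        found = installed_version("crandas")
        if found is None:
            print("crandas is not installed in this environment")
        else:
            print(f"crandas {found} is installed")

Run it with the same Python interpreter (or virtual environment) in which you installed crandas; otherwise it reports on a different environment.
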
Install crandas in a virtual environment on Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To install crandas in a virtual environment using ``venv`` on Windows:

1. Open a Command Prompt window:

   a. Press WIN+R, the Windows Run window should open
   b. Enter ``cmd`` and press enter.
   c. Navigate to the Home directory: ``cd %HOMEPATH%`` (or the folder in which you save your scripts)

2. Create a virtual environment by executing: ``python -m venv .crandas``
3. Activate the virtual environment by executing: ``.\.crandas\Scripts\activate.bat`` (you will know it has been activated because the prompt will show ``(.crandas)``)
4. Install crandas: ``pip install crandas --index-url=https://pypi.rosemancloud.com``

.. note::

    While installing crandas, we might encounter missing dependencies; for example, Visual C++ is needed to build pandas. In that case, install the missing dependencies and reboot before attempting to install crandas again.

Install crandas in a virtual environment on Linux
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To install crandas in a virtual environment using ``venv`` on Linux:

1. Open a Terminal window and navigate to the home directory: ``cd ~``
2. Create a virtual environment: ``python3 -m venv .crandas``
3. Activate the virtual environment: ``source .crandas/bin/activate``
4. Install crandas: ``pip install crandas --index-url=https://pypi.rosemancloud.com``

.. note::

    On Debian/Ubuntu systems, you need to install the ``venv`` package that matches your Python version (``python3-venv``, or a version-specific package such as ``python3.10-venv``) using the following command.

    .. code:: bash

        apt install python3.10-venv

    You might need superuser (``sudo``) privileges to execute this command.

For use with Jupyter notebooks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To install crandas for use with Jupyter, use ``pip install crandas[notebook] --index-url=https://pypi.rosemancloud.com``. This also installs the dependencies that crandas needs to work well with Jupyter, in particular to show the progress bar for long-running operations.

.. note::

    When using Jupyter notebooks in Visual Studio Code, make sure to have the latest versions of packages, as earlier versions of VS Code did not correctly display the progress bar. More information about which package versions are needed can be found `here. `_

3. Setting up the session variables in your script
====================================================

We should still be in the virtual environment we created, as shown by ``(.crandas)``. Now we need to install a development environment so that we can work with crandas more easily. For example, ``pip install notebook`` installs Jupyter Notebook in the virtual environment (you will know it has finished when it says ``successfully installed...``).

.. note::

    Once you have installed crandas in your virtual environment, you can use it with any Python editor of your choice.

We can start creating our analysis by executing ``jupyter notebook`` (this is an example) and clicking **new** to start a new notebook.

3a. Setting up the connection file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: python

    import crandas as cd
    from crandas.base import session

    # connect your session to the VDL
    session.connect("")

3b. Setting up the path to the certificate and the server public keys
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Finally, we need to tell crandas which VDL endpoint and which certificates to use when running your analysis. An example is included below.

.. code:: python

    import crandas as cd
    from crandas.base import session
    from pathlib import Path

    # Update to provided endpoint https://**NODE_IP**:**NODE_PORT**/api/v1
    # (e.g. https://vdl-1c-cr-node2.rosemancloud.com:32601/api/v1)
    session.endpoint = '_____'

    # Set the base path to the folder where we have stored the certificates
    # (this relative path assumes the script runs from your home folder)
    session.base_path = Path('./vdl_certs')

    # connect your session to the VDL
    session.connect()

..
    # Set the path to the http certificate
    # (provided by Roseman Labs)
    session.certificate_path = Path("httpd0.crt")

    # Set the path to the analyst key
    # Note: the key file might have a different name; confirm by checking the vdl_certs directory
    session.query_signing_key = Path.home() / 'vdl_certs/analystsign0.sk'

    # (Production only) Set path to json which contains the authorized transactions
    # The authorization is generated in the Web Portal
    session.authorization_file = 'signed-transactions.jsonl'

After this, we can check which VDL server we will connect to (to double-check we have set it correctly).

.. code:: python

    # Show which VDL server we will connect to
    print("Virtual Data Lake URL: " + cd.base.session.endpoint)

To confirm that the setup has been done correctly, just send any query to the server, for example ``cd.demo_table()``. After this, crandas is ready to be used for secure computations.

.. [1] If you have not received those, please contact Roseman Labs.

.. note::

    To activate your virtual environment again:

    - For Windows users:

      - ``Windows key`` + ``R``, then type ``cmd``.
      - ``.\.crandas\Scripts\activate.bat``

    - For Linux users:

      - ``Ctrl`` + ``Alt`` + ``T``.
      - ``source .crandas/bin/activate``

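If the test query fails, it is worth first checking that the files from step 1 are in the folders this page describes (``.config/crandas`` for the connection file, ``vdl_certs`` for the certificates and keys). The sketch below only inspects the file system with ``pathlib`` and does not call crandas at all; the function name ``find_setup_files`` is hypothetical and chosen for this example.

.. code:: python

    from pathlib import Path
    from typing import Dict, List


    def find_setup_files(home: Path) -> Dict[str, List[str]]:
        """List connection files and certificate files found under ``home``."""
        conn_dir = home / ".config" / "crandas"
        cert_dir = home / "vdl_certs"
        found: Dict[str, List[str]] = {
            "connection_files": [],
            "certificate_files": [],
        }
        if conn_dir.is_dir():
            # Any file whose name ends in .vdlconn counts as a connection file
            found["connection_files"] = sorted(
                p.name for p in conn_dir.iterdir() if p.name.endswith(".vdlconn")
            )
        if cert_dir.is_dir():
            # Everything extracted from the ZIP/TAR archive is listed as-is
            found["certificate_files"] = sorted(
                p.name for p in cert_dir.iterdir() if p.is_file()
            )
        return found


    if __name__ == "__main__":
        report = find_setup_files(Path.home())
        for kind, names in report.items():
            print(f"{kind}: {names if names else 'none found'}")

An empty list for the connection method you intend to use suggests the file was not moved out of the ``Downloads`` folder, or was placed in a different directory.
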