.. _installing:

Installation (on-premise)
=========================

.. warning::
    The entire installation section only applies if you are working in a production environment or running crandas locally. In our demo environments, crandas is pre-installed, so skip ahead to :ref:`the next section `.

To complete this manual you will need to download and install `Python 3.9 or higher `_. If you are installing on Windows, don't forget to tick the box that says **Add python.exe to PATH** so that you can run Python from any directory.

.. warning::
    For step 1, do only 1a or 1b, **not both**. If you are generating your own keys, follow 1a; otherwise, follow 1b.

1a. Generate key pairs using the ``2023keygen.py`` script
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Key pairs are needed to securely encrypt any communication with the Virtual Data Lake (such as requesting and approving transactions). To generate the key pairs, follow these steps:

1. Save the 2023keygen script as a file named ``2023keygen_analystapprover.py`` in your home directory. To get there on Windows, type ``%HOMEPATH%`` into the address bar of a File Explorer window, or type ``C:\Users\`` into the address bar and open the folder with your username.

   .. note::
       You can also drag and drop the Python file into the home directory once you have navigated to it. You can double-check the path by pressing ``Alt + D``.

2. Open a command prompt (press ``Windows key`` + ``R``, then type ``cmd``, or search for it) and navigate to your home directory (``cd %HOMEPATH%`` on Windows or ``cd ~`` on Linux), where you saved the ``2023keygen_analystapprover.py`` file. If the prompt already shows ``C:\Users\{your username}``, you are already there.

3. Install the required PyNaCl library: ``pip install pynacl``

4. Run the script: ``python 2023keygen_analystapprover.py``

5. The script generates a folder called ``vdl_certs`` (in the same folder where you saved the keygen file) containing the key pairs. There will be a number of files with the ``.sk`` and ``.pk`` extensions. The ``.sk`` files are secret keys that must be kept private; save them in a local folder on your system. The ``.pk`` files are public keys; your admin will ask you to share them in order to set up and operate the VDL.

After completing these steps, you will have generated the key pairs and can proceed to use crandas in your Python scripts.

1b. Download the certificates and store them in your home folder
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As the certificates are the means of authentication to the production environment, they are provided out of band. You should have received an e-mail that contains a link to `our secure filesharing system `_. The first step is storing these certificates in a sub-folder of your home directory.

**Windows**

1. Download the ZIP from the RL fileshare and store it in your ``Downloads`` folder.
2. Extract the ZIP file.
3. Open your home folder:

   a. Press WIN+R; the Windows Run window should open.
   b. Enter :code:`%HOMEDRIVE%%HOMEPATH%` and press Enter.
   c. A new Windows Explorer window should open that shows your home folder.

4. In this home folder, create a new folder called ``vdl_certs``.
5. Move the contents of the ZIP file you unpacked to the ``vdl_certs`` folder you just created.

**Linux**

1. Download the ZIP from the RL fileshare and store it in your ``Downloads`` folder.
2. Extract the ZIP file.
3. Open your home folder in a Terminal window (:code:`cd ~`).
4. In this home folder, create a new folder called ``vdl_certs`` (:code:`mkdir vdl_certs`).
5. Move the contents of the ZIP file you unpacked to the ``vdl_certs`` folder you just created.

2. Installing crandas in a virtual environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To be able to use crandas in your Python scripts, we are going to install it in a virtual environment using ``venv``.

**Windows**

1. Open a Command Prompt window:

   a. Press WIN+R; the Windows Run window should open.
   b. Enter ``cmd`` and press Enter.
   c. Navigate to the home directory: ``cd %HOMEPATH%`` (the same location used when saving the script above).

2. Create the virtual environment: ``python -m venv .crandas``
3. Activate the virtual environment: ``.\.crandas\Scripts\activate.bat`` (once it is active, the prompt shows ``(.crandas)``).
4. Install crandas: ``pip install crandas==1.3.1 --index-url=https://pypi.rosemancloud.com``
5. When prompted, enter the user name and password you were given (see the warning below). If you have not received these, request the crandas installation user credentials from your Roseman Labs admin. [1]_

.. warning::
    When entering the password, you will not see any text appear. This is a common security feature that stops people from reading the password over your shoulder. The only way to get it right is to type the password carefully and press Enter.

**Linux**

1. Open a Terminal window and navigate to the home directory: ``cd ~``
2. Create the virtual environment: ``python3 -m venv .crandas``
3. Activate the virtual environment: ``source .crandas/bin/activate``
4. Install crandas: ``pip install crandas=={input version} --index-url=https://pypi.rosemancloud.com`` (replace ``{input version}`` with, for example, ``1.3.1``).
5. When prompted, enter the user name and password you were given (see the warning below). If you have not received these, request the crandas installation user credentials from your Roseman Labs admin. [1]_

.. warning::
    When entering the password, you will not see any text appear. This is a common security feature that stops people from reading the password over your shoulder. The only way to get it right is to type the password carefully and press Enter.

3. Setting up the session variables in your script
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You should still be in the virtual environment created in step 2, as indicated by ``(.crandas)`` in the prompt. Next, install a development environment to make working with crandas easier. For example, ``pip install notebook`` installs Jupyter Notebook in the virtual environment; the installation is finished when the output says ``successfully installed...``.

.. note::
    Once you have installed crandas in your virtual environment, you can use it with any Python editor of your choice.

To start creating an analysis with Jupyter (used here as an example), execute ``jupyter notebook`` and click **new** to start a new notebook.

Finally, we need to tell crandas which VDL endpoint and which certificates to use when running your analysis. An example is included below.

.. code:: python

    # Import the crandas package we have installed
    import crandas as cd
    # Import the session class, to set variables
    from crandas.base import session
    # Import Path to build file paths
    from pathlib import Path

    # Update to https://**NODE_IP**:**NODE_PORT**/api/v1
    # (e.g. https://vdl-1c-cr-node2.rosemancloud.com:32601/api/v1);
    # this will be provided by the Roseman Labs admin
    session.endpoint = '_____'

    # Set the base path to the folder where we have stored the certificates
    session.base_path = Path.home() / 'vdl_certs'

    # Set the path to analystsign.sk
    # (check the folder vdl_certs in your home directory to see the id added to the end of the file)
    session.query_signing_key = Path.home() / 'vdl_certs/analystsign0.sk'

    # Set the path to the HTTP certificate used to communicate with the VDL node
    # (this will be provided by the Roseman Labs admin)
    session.certificate_path = Path("httpd0.crt")

    # (For on-premise only) Set the assert hostname to the correct
    # DNS host name (this will be provided by the Roseman Labs admin)
    session.assert_hostname = ' '

    # Set the path to the JSON file that contains the signed transactions;
    # this can be downloaded from the Web Portal (script approval platform) once the cluster is running
    session.authorization_file = 'signed-transactions.jsonl'

    # Set to True if all JSONs that are sent to the VDL node need to be printed
    session.print_json = None

After this, we can check which VDL server we will connect to, to confirm that the endpoint is set correctly.

.. code:: python

    # Show which VDL server we will connect to
    print("Virtual Data Lake URL: " + cd.base.session.endpoint)

You can confirm that a connection has been established by checking the value of ``base.session.connected``.

.. code:: python

    print(cd.base.session.connected)

A value of ``True`` indicates that the connection has been made successfully.

.. [1] If you have not received those, please contact Roseman Labs.

.. warning::
    To activate your virtual environment again on Windows:

    - Press ``Windows key`` + ``R``, then type ``cmd``.
    - Run ``.\.crandas\Scripts\activate.bat``.
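
Before filling in the session variables from step 3, it can help to confirm that the ``vdl_certs`` folder from step 1 contains the files you expect. The snippet below is a minimal sketch that uses only the Python standard library. The helper name ``list_key_files`` is hypothetical and not part of crandas, and the script assumes the folder is located at ``~/vdl_certs`` as described in step 1.

.. code:: python

    from pathlib import Path


    def list_key_files(cert_dir):
        """Group the files in cert_dir by extension.

        Hypothetical helper for illustration; it is not part of crandas.
        """
        cert_dir = Path(cert_dir)
        if not cert_dir.is_dir():
            raise FileNotFoundError(f"Certificate folder not found: {cert_dir}")
        # Sort so that the output order is stable across platforms
        files = sorted(p for p in cert_dir.iterdir() if p.is_file())
        return {
            "secret": [p.name for p in files if p.suffix == ".sk"],
            "public": [p.name for p in files if p.suffix == ".pk"],
            "other": [p.name for p in files if p.suffix not in (".sk", ".pk")],
        }


    if __name__ == "__main__":
        # Assumes the certificates were stored in ~/vdl_certs, as in step 1
        try:
            for kind, names in list_key_files(Path.home() / "vdl_certs").items():
                print(f"{kind}: {names}")
        except FileNotFoundError as err:
            print(err)

If the ``secret`` list is empty, the file that ``session.query_signing_key`` points to is probably missing, so the folder contents are worth rechecking before connecting.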