Tips and tricks

Integers are just more efficient

crandas ensures data privacy by processing data through Multi-Party Computation (MPC). MPC is a family of mathematical techniques and protocols that are mostly based on discrete mathematics. This means that, of all the data types used in crandas, integers are the closest to the building blocks needed in the backend. The main consequence is that integers are considerably more efficient than any other data type. For example, it takes only one share to represent any integer, but one share per character to represent a string.

Therefore, if you are working with a lot of data, you might want to convert some string columns into integers before uploading them to the VDL. This is especially useful if some of your string columns contain categorical data or just a few values that repeat often. To do this, take your string column and assign a number, starting from zero, to each distinct value. Then replace the column in your table with a column of the corresponding integers.
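As a minimal sketch of this encoding step, done locally before any upload, the snippet below assigns integer codes in order of first appearance. It uses plain Python rather than crandas; the column contents and the helper name `encode_column` are hypothetical, and the upload to the VDL is not shown.

```python
def encode_column(values):
    """Map each distinct string to an integer code, starting from zero.

    Codes are assigned in order of first appearance. Returns the encoded
    column and the string-to-integer mapping.
    """
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            # First time this string is seen: give it the next free code.
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


# Hypothetical categorical column with a few repeating entries.
colors = ["red", "blue", "red", "green", "blue", "red"]
codes, mapping = encode_column(colors)

print(codes)    # [0, 1, 0, 2, 1, 0]
print(mapping)  # {'red': 0, 'blue': 1, 'green': 2}
```

The `codes` list would replace the original string column, while `mapping` is the assignment you need to keep in order to recover the strings later.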

Of course, this leaves the question of how to recover the strings afterward. The first option is to keep a local copy of the string-to-integer assignment. If you want to share the data with other parties, you can give them that assignment directly. This works for categorical data, but what if you also need to keep those strings private? Easy. Just upload the integer-string assignment table to the VDL! Now you can do all the analysis on the table with integers, making it more efficient. Once you are done and want to retrieve the data, just do a left join of the two tables and you will have your strings again!
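The recovery step can be sketched locally as follows. This is plain Python standing in for the join, not crandas code: in practice both tables would live in the VDL and the join would be performed there. The table contents, the column names, and the helper name `left_join_labels` are all hypothetical.

```python
def left_join_labels(rows, lookup, key="color_id", label="color"):
    """Left join: keep every row, attaching the string for its integer code.

    Rows whose code is missing from the lookup table get None as the label.
    """
    by_code = {entry[key]: entry[label] for entry in lookup}
    return [{**row, label: by_code.get(row[key])} for row in rows]


# Hypothetical analysis table that stores integer codes instead of strings.
table = [
    {"color_id": 0, "amount": 10},
    {"color_id": 1, "amount": 7},
    {"color_id": 0, "amount": 3},
]

# Hypothetical integer-string assignment table.
lookup = [
    {"color_id": 0, "color": "red"},
    {"color_id": 1, "color": "blue"},
]

restored = left_join_labels(table, lookup)
print([row["color"] for row in restored])  # ['red', 'blue', 'red']
```

A left join is the natural choice here because every row of the analysis table is kept, even if a code happens to be missing from the assignment table.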