SCADL: A new side-channel attack tool using deep learning

In 2019, the Ledger Donjon (Ledger's product security team) released lascar, a side-channel attack (SCA) tool. Since then, research on Deep Learning SCAs (DL-SCAs) has made significant progress, with several interesting publications.

As part of our research activities, we have developed a new tool that implements the most recent DL-SCA methods for side-channel evaluations. We are pleased to release scadl, an in-house tool designed to perform SCAs using deep learning. In line with our open-source approach, this project will help students, security researchers, and security experts in evaluation labs.

Introduction

In recent years, DL-SCAs have emerged, promising better performance than classical SCA techniques. Many research papers have also shown that these techniques can break cryptographic implementations protected by common side-channel countermeasures such as masking, jitter, and random delay insertion. To keep up with this research trend, we integrated the following techniques into scadl:

  • Normal profiling: A straightforward profiling technique in which the attacker trains a DL model on a known-key dataset and then uses it to attack an unknown-key dataset. It was presented in [1] and [2], where the authors also showed the strength of such attacks against designs protected with jitter and masking.
  • Non-profiling: Similar in spirit to Differential Power Analysis (DPA), this technique trains a DL model for every key guess and uses the training metrics to distinguish the correct one [1]. It offers several advantages over DPA against protected designs (e.g., masking and desynchronization), because the accuracy of the DL model replaces statistical distinguishers that require extra trace pre-processing.
  • Multi-label: A technique to recover multiple key bytes using a single DL model [2].
  • Multi-tasking: Another technique for attacking multiple key bytes with a single model [4].
  • Data augmentation: A technique to enlarge the training dataset in order to boost DL efficiency. Scadl includes mixup [5] and random-crop; a minimal mixup sketch is shown after this list.
  • Attribution methods: Techniques that analyze a trained DL model to understand which input samples drive its predictions [3]. They help to improve the DL model's performance and can also serve as a leakage detection tool; see the saliency sketch after this list.
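
To illustrate the data augmentation idea, here is a minimal mixup sketch written with plain NumPy. It is not scadl's API; the function name and parameters are purely illustrative, and it simply blends random pairs of traces and their one-hot labels.

import numpy as np


def mixup(x: np.ndarray, y: np.ndarray, alpha: float = 0.2, n_new: int = 1000):
    """Creates synthetic traces by linearly combining random pairs of traces
    and their one-hot labels (mixup data augmentation)."""
    idx_a = np.random.randint(0, len(x), n_new)
    idx_b = np.random.randint(0, len(x), n_new)
    lam = np.random.beta(alpha, alpha, size=(n_new, 1))  # mixing coefficients
    x_new = lam * x[idx_a] + (1 - lam) * x[idx_b]
    y_new = lam * y[idx_a] + (1 - lam) * y[idx_b]
    return x_new, y_new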
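
Similarly, here is a rough sketch of a gradient-based attribution (saliency) computation, again independent of scadl's API and assuming an already trained Keras model:

import numpy as np
import tensorflow as tf


def saliency(model: tf.keras.Model, traces: np.ndarray, class_idx: int) -> np.ndarray:
    """Averages the absolute input gradient over a batch of traces, highlighting
    the time samples that most influence the prediction of the given class."""
    x = tf.convert_to_tensor(traces, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        scores = model(x)[:, class_idx]
    grads = tape.gradient(scores, x)
    return np.mean(np.abs(grads.numpy()), axis=0)
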
Tutorials

The repository provides several tutorials as examples for each technique.

Dataset

Scadl uses two different datasets in its tutorials. The first was collected by running an unprotected AES implementation on a ChipWhisperer-Lite. The second is ASCAD, which is widely used in SCA research.

Power consumption trace using a ChipWhisperer
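
For the ASCAD dataset, the leakages and metadata variables used in the example below can be loaded from the public HDF5 file. The following is a minimal sketch assuming the standard ASCAD.h5 file name and layout; it is not part of scadl.

import h5py
import numpy as np

# Load attack traces and their metadata from the ASCAD HDF5 file
# (file name and group layout assumed from the public ASCAD release).
with h5py.File("ASCAD.h5", "r") as f:
    leakages = np.array(f["Attack_traces/traces"], dtype=np.float32)
    metadata = np.array(f["Attack_traces/metadata"])  # fields include 'plaintext', 'key', 'masks'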

Example

As we mentioned before, scadl implements different types of DL-based attacks. Here is an example using scadl for non-profiling DL on the ASCAD dataset.

  • First, we construct a DL model; as an example, a small MLP with two hidden dense layers followed by a two-class output layer.
from tensorflow import keras
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential


def mlp_ascad(len_samples: int) -> keras.Model:
    """Returns a small MLP with two hidden layers and a 2-class output."""
    model = Sequential()
    model.add(Input(shape=(len_samples,)))
    model.add(Dense(20, activation="relu"))
    model.add(Dense(10, activation="relu"))
    model.add(Dense(2, activation="softmax"))  # one class per value of the targeted bit
    model.compile(loss="mean_squared_error", optimizer="adam", metrics=["accuracy"])
    return model
  • Then, we define a leakage model. ASCAD targets a masked AES implementation, and the provided time samples correspond to the manipulation of the masked S-box output. The following leakage model, based on the least significant bit of the S-box output, is therefore used.
def leakage_model(data: np.ndarray, guess: int) -> int:
    """Returns the LSB of the S-box output; sbox and TARGET_BYTE are defined in the tutorial script."""
    return 1 & sbox[data["plaintext"][TARGET_BYTE] ^ guess]
  • After that, we pre-process the leakage traces by removing their average and normalizing them, which reduces the complexity required of the DL model.
x_train = normalization(remove_avg(leakages), feature_range=(-1, 1))
  • The final step brute-forces the unknown key byte and computes the model accuracy/loss for each guessed key. The correct key should give the highest accuracy (or the lowest loss). The number of training epochs can be tuned depending on the efficiency of the DL model used.
import numpy as np
from tqdm import tqdm
from scadl import NonProfile  # import path assumed; see the scadl tutorials

EPOCHS = 10
guess_range = range(0, 256)
acc = np.zeros((len(guess_range), EPOCHS))
profile_engine = NonProfile(leakage_model=leakage_model)
# Train one model per key guess and keep its accuracy after every epoch
for index, guess in enumerate(tqdm(guess_range)):
    acc[index] = profile_engine.train(
        model=mlp_ascad(x_train.shape[1]),
        x_train=x_train,
        metadata=metadata,
        hist_acc="accuracy",
        guess=guess,
        num_classes=2,
        epochs=EPOCHS,
        batch_size=1000,
        verbose=0,
    )
# The correct key is the guess whose training accuracy peaks highest
guessed_key = np.argmax(np.max(acc, axis=1))
print(f"guessed key = {guessed_key}")
  • The following figure shows the accuracy of all brute-forced key guesses; the black curve corresponds to the correctly guessed key.
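As a rough sketch (not part of scadl), a plot of this kind can be reproduced from the acc array computed above with matplotlib:

import matplotlib.pyplot as plt

# Plot the per-epoch training accuracy of every key guess; the correct
# guess is expected to stand out, so it is drawn in black on top.
for index, guess in enumerate(guess_range):
    is_correct = guess == guessed_key
    plt.plot(acc[index], color="black" if is_correct else "lightgrey", zorder=2 if is_correct else 1)
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.show()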

References
  1. B. Timon, Non-Profiled Deep Learning-based Side-Channel Attacks with Sensitivity Analysis, CHES, 2019.
  2. H. Maghrebi, Deep Learning based Side-Channel Attack: a New Profiling Methodology based on Multi-Label Classification, Cryptology ePrint Archive, 2020.
  3. B. Hettwer et al., Deep Neural Network Attribution Methods for Leakage Analysis and Symmetric Key Recovery, Cryptology ePrint Archive, 2019.
  4. T. Marquet et al., Exploring Multi-Task Learning in the Context of Masked AES Implementations, COSADE, 2024.
  5. K. Abdellatif, Mixup Data Augmentation for Deep Learning Side-Channel Attacks, Cryptology ePrint Archive, 2021.

Karim M. Abdellatif, PhD (Twitter, LinkedIn)
Senior Staff Hardware Security Engineer at Ledger Donjon

Leo Benito (LinkedIn)
Hardware Security Engineer at Ledger Donjon
