The Oribatida v1.3 Family of Lightweight Authenticated Encryption Schemes

Arghya Bhattacharjee; Cuauhtemoc Mancillas López; Eik List; Mridul Nandi

doi:10.1515/jmc-2020-0018

Open Access Published by De Gruyter January 29, 2021

The Oribatida v1.3 Family of Lightweight Authenticated Encryption Schemes

Arghya Bhattacharjee , Cuauhtemoc Mancillas López , Eik List and Mridul Nandi

From the journal Journal of Mathematical Cryptology

https://doi.org/10.1515/jmc-2020-0018

Abstract

Permutation-based modes have been established for lightweight authenticated encryption, as can be seen from the high interest in the ongoing NIST lightweight competition. However, their security is upper bounded by O(σ²/2^c) bits, where σ are the number of calls and c is the hidden capacity of the state. The development of more schemes that provide higher security bounds led to the CHES’18 proposal Beetle that raised the bound to O(rσ/2^c), where r is the public rate of the state.

While authenticated encryption can be performed in an on-line manner, authenticated decryption assumes that the resulting plaintext is buffered and never released if the corresponding tag is incorrect. Since lightweight devices may lack the resources for buffering, additional robustness guarantees, such as integrity under release of unverified plaintexts (Int-RUP), are desirable. In this stronger setting, the security of the established schemes, including Beetle, is limited by O(q_pq_d/2^c), where q_d is the maximal number of decryption queries, and q_p that of off-line primitive queries, which motivates novel approaches.

This work proposes Oribatida, a permutation-based AE scheme that derives s-bit masks from previous permutation outputs to mask ciphertext blocks. Oribatida can provide a security bound of O(rσ²/^c+s), which allows smaller permutations for the same level of security. It provides a security level dominated by O(σd2/2c) under Int-RUP adversaries, which eliminates the dependency on primitive queries. We prove its security under nonce-respecting and Int-RUP adversaries. We show that our Int-RUP bound is tight and show general attacks on previous constructions.

Keywords: Authenticated encryption; permutation; provable security

MSC 2010: 11T06; 11T71; 11Y16; 94A60

1 Introduction

1.1 Permutation-based Modes

Permutation-based modes have been established for various applications of symmetric-key cryptography during the previous decade. Keyless modes have been standardized as the hash function SHA-3 and its derivative SHAKE for the extendable-output functions [36]. Keyed modes are used for authentication [51] or encryption [13]. Moreover, permutation-based schemes have found widespread adoption for authenticated encryption, as the CAESAR co-selection Ascon [34], or many more candidates have shown, e.g., PRIMATES [6], NORX [9], Ketje [19], or Keyak [20].

The sponge [17] and duplex [15] modes transform an internal n-bit state iteratively with a public permutation. Both modes absorb an input stream block-wise to generate a pseudo-random output stream. While sponges separate the input (absorption) and output (squeezing) phases, the duplex mode generates the i-th output block directly after the i-th input block has been absorbed. In both modes, an n-bit permutation absorbs the data in r-bit chunks, where the outer part of the state r < n is called the rate. The hidden inner part of the state of c = n − r bits is called the capacity, where r and c are a trade-off between performance and security.

Keyed Sponge Variants were introduced by Bertoni et al. [18] and can be categorized into inner-keyed, outer-keyed, and full-keyed variants (cf. [48]); recently, Dobraunig and Mennink added the suffix-keyed sponge [35]. The inner-keyed sponge [28] initializes the inner part with the key, (0 ‖ K), whereas the outer-keyed sponge [18] (so-dubbed by [8]) concatenates key and message K ‖ M for the output. The full-keyed sponge [16] employs the full state in the absorption phase; the suffix-keyed sponge uses a keyed function only at the end.

Permutation-based modes are analyzed mostly in the ideal-permutation model. For authenticated encryption, an adversary A shall distinguish between two worlds consisting of two oracles: each world has (1) a construction oracle that A can ask encryption and verification queries to and (2) a primitive oracle that provides access to the internal permutation. The former oracle represents on-line queries, whereas the latter represents off-line queries. A can ask q_e encryption queries, q_v verification queries, and q_p construction queries; σ usually denotes the number of blocks over all construction queries.

Sponge Modes for Authenticated Encryption started with the Duplex construction and the AE scheme SpongeWrap [15] and MonkeyDuplex [16], and led to a considerable corpus of analysis, e.g., [8, 32, 37, 49, 52]. Early, Bertoni et al. [14] showed that the sponge is indifferentiable from a random oracle [47] for up to O(2^c/2) calls to the permutation. Their follow-up work [18] improved the bounds for the unkeyed sponge to O(qpσ2c+qc2k) if σ ≪ 2^c/2. For SpongeWrap, Bertoni et al. [15] had shown a privacy bound of O(q2k+σ22c) and an authenticity bound of O(q2k+σ22c+q2τ) .

Jovanovic et al. [41] improved the asymptotic authenticity bound, although under the limitation of at most σ ≪ 2^c/2 decryption queries. Summarizing many previous results, Mennink [48] showed that keyed sponges achieve PRF security of around O(qc2+qcqp2c)+AdvΠKP . He coined the final term the key-prediction security, e.g., O(qp2k) for full-keyed sponges and k < n. The recent duplex-based AE scheme Beetle [25] added a transform to the output so that the plaintext input that is added to the inner part and to the ciphertext output block differ. As a result, Beetle offered a bound of O(rqp+rσ2c+qv+qp2r+σ2+qp22n) . Table 1 summarizes some of the most noteworthy results in the past. Improvements to those general bounds appear hard, which motivates the search for novel constructions.

Table 1

Existing PRF bounds for keyed sponges. FKS/IKS/OKS = full-/inner-/outer-keyed sponge, IP = ideal permutation, indiff. = indifferentiability.

Scheme
FKS	IKS	OKS	Model	Bound	Work

–	–	•	Indiff.	O(σ2+σ2c)	[14]
–	•	•	IP	O(σ22c)	[28]
–	–	•	IP	O(σ2+σqp2c)	[8]
–	•	–	IP	O(σ2+σqp2c+qp2n)	[8]
NORX-like			IP	O(min(σ22n,σ2c,q2k))	[41]
–	–	•	IP	O(qc2+σqc+qcqp2c)	[37]
•	–	–	IP	O(σ22n+qc2m2c+σcqp2k)	[49]
–	•	•	IP	O(qc2+qcqp2c)	[52]
•	–	–	IP	O(qcqp+qc22c+qp2k)	[32]
•	–	–	IP	O(qc2+qcqp2c+qp2k)	[48]
–	–	•	IP	O(qc2+qcqp2c+ck/rqp2k)	[48]

Correct Authenticated Decryption requires the entire plaintext to be buffered until the tag has been verified. On certain architectures, this requirement can exceed the available storage and induce unacceptable latency. Andreeva et al. [7] introduced notions for privacy and integrity under the release of unverified plaintext material. Security guarantees such as Int-RUP represent valuable additional levels of robustness.

1.2 Research Gap

While Beetle improved the bound for nonce-based authenticated encryption, this bound does not hold in the Int-RUP setting: as we will outline, there exists an attack with an advantage of O(qpqv2c) . The naturally arising question is whether one can achieve higher Int-RUP security. This is motivated by a practical demand: while the primary advantage of sponges is simplicity, they are also useful for lightweight applications in resource-constrained environments. In such contexts, buffering multi-block decryption outputs is likely infeasible, which renders Int-RUP security advantageous. For example, the ongoing NIST lightweight competition [53] requests 112-bit integrity security for AE schemes and support of at least 2⁵⁰ − 1 encrypted bytes of data. Inserting them into Mennink’s bound [48] of qc2+qcqp2c , the NIST requirements would imply permutation sizes of at least 172 bits plus a plausible rate. A higher Int-RUP security could lead to smaller permutations, reducing area, and energy consumption simultaneously.

1.3 Contribution

This work contains three contributions: First, it answers the question above in the affirmative by showing a sponge with dynamic s-bit masks that achieves nAE security of O(qp2k+qd2τ+qpσ2c+s+σ22n) . Replacing the terms by the NIST requirements of k ≥ 128, τ ≥ 64, q_p ≤ 2¹¹² and σ ≥ 2⁵⁰, a trade-off could use c + s ≥ 162, or better c + s ≥ 192 bits to be secure for up to σ ∈ O(2⁶⁴) blocks and q_p ≤ 2¹²⁸ primitive queries. Thus, the rate could be reduced considerably compared to the bound from [48]. We prove that the Int-RUP advantage is at most O(qd2/2c) , i.e., depends solely on the number of on-line queries, contrasting the bound of O(q_dq_p/2^c) for the generic duplex and previous constructions. The difference may seem minor; though, eliminating the dependency of off-line (i.e., primitive) queries is a valuable gain. We propose a novel permutation-based AE scheme Oribatida that applies dynamic masks to the outputs and provides the NIST security bounds with permutation sizes of only 192 bits. As our second contribution, we show that the bound provided by our proposal is tight with an attack that matches the proved bound. Moreover, we show that it applies also to other duplex-based designs in general if masks would be added.

Remark 1

Finally, we acknowledge an observation by Rohit and Sarkar on the NIST lightweight mailing list [59]. We note that our proposal here represents a slightly updated variant Oribatida compared to the NIST submission [21] that addresses their observation by masking the authentication tag. We call it also Oribatida v1.3, but use Oribatida hereafter. We will discuss the effect of the slight update later.

1.4 Outline

After a brief recall of the necessary preliminaries, Section 3 motivates our proposal by showing Int-RUP attacks on the duplex mode and other existing schemes. Section 4 describes Oribatida in general. We close the parenthesis of Int-RUP attacks on Oribatida and other duplex-based modes when in Section 5. We analyze the security on Oribatida for the standard nonce-based AE setting in Section 6 and in the Int-RUP setting in Section 7. Next, Section 8 compares it with those second-round NIST lightweight candidates that claim Int-RUP security. Section 9 discusses the slight update from [21] and the associated improvement. Subsequently, Section 10 specifies an instance with a Simon-based permutation, whose security is discussed in Section 11 from previous works. Section 12 reports on the result of a hardware implementation of Oribatida before Section 13 concludes this work.

2 Preliminaries

2.1 General Notations

We use uppercase letters (e.g., X, Y) for functions and variables, lowercase letters (e.g., x, y) for indices and lengths, as well as calligraphic uppercase letters (e.g., 𝒳, 𝒴) for sets and spaces. We write 𝔽₂ for the field of characteristic 2 and 𝔽2n={0,1}n for the set of vectors over 𝔽₂, i.e., strings of n bits. |X| denotes the number of bits of X. Given X∈𝔽2n , we write X[i] for the i-th bit of X, and define the bit order by X = (X[n − 1] ‖ . . . ‖ X[1] ‖ X[0]). For t ≤ n, we use msb_t(X) = (X[n − 1]. . . X[n − t]) to return the t rightmost (or most significant when interpreting X as integer) and lsb_t(X) = (X[t − 1] . . . X[1]X[0]) to return the t leftmost (or least significant when interpreting X as integer) bits of X. We write ∅ for the empty set and ε for the empty string.

We denote by X[x..y] the range of X[x], . . . , X[y] for non-zero integers x and y. Given binary strings X and Y, we denote their concatenation by X ‖ Y and their bitwise XOR by X ⊕ Y when |X| = |Y|. For positive integers x and y and bit strings of different lengths X∈𝔽2x and Y∈𝔽2y with x ≥ y, we define X ⊕_y Y =^def X ⊕ (0^x−y ‖ Y).

We write X ↞ 𝒳 to indicate that X is chosen uniformly at random and independent from other variables from a set 𝒳. We consider Func(𝒳, 𝒴) to be the set of all mappings F : 𝒳 → 𝒴, and Perm(𝒳) to be the set of all permutations over 𝒳. Given an event E, we denote the probability of E by Pr[E]. We denote the invalid symbol by ⊥. Moreover, we denote by (n)k=def∏i=0k-1(n-i) the falling factorial.

For X∈𝔽2* , we denote by (X1,X2,…,Xx)←nX the splitting of X into n-bit strings X₁, ..., X_x−1, and |X_x| ≤ n, in form of X₁ ‖ . . . ‖ X_x = X. For Y∈𝔽2x , we write (X1,X2,…,Xm)←x1,x2,…,xmY for the splitting of Y into X_l = Y[x−1..x−x_l], X₂ = Y[x−x₁−1..x−x₁−x₂],…, X_m = Y[x_m−1..0], where x = x₁ + x₂ + …+x_m holds. For a given set 𝒳 and non-negative integer x, we write 𝒳^≤x for the union set ∪i=0x𝒳i . For a natural x < 2ⁿ, we write 〈x〉_n for its conversion to an n-bit binary string with the most significant bit left, e.g., 〈135〉₈ = (10000111). We omit n if clear from the context.

2.2 Nonce-based Authenticated Encryption

Let 𝒦 be a set of keys, 𝒩 be a set of nonces, 𝒜 a set of associated data, ℳ a set of messages, 𝒞 a set of ciphertexts, and 𝒯 a set of authentication tags. A nonce N ∈ 𝒩 is an input that must be unique for each authenticated encryption query.

A nonce-based AE scheme (with associated data) Π = (ℰ, 𝒟) is a tuple of deterministic encryption algorithm ℰ : 𝒦 × 𝒩 × 𝒜 × ℳ → 𝒞 × 𝒯 and deterministic decryption algorithm 𝒟 : 𝒦 × 𝒩 × 𝒜 × 𝒞 × 𝒯 → ℳ ∪ {⊥} with associated key space 𝒦. The encryption algorithm ℰ takes a tuple (K, N, A, M) and outputs (C, T), where C is a ciphertext and T an authentication tag. We assume that |C| = |M| holds for all inputs (K, N, A, M) and their corresponding ciphertexts. The associated data is authenticated, but not encrypted. The decryption function 𝒟 takes a tuple (K, N, A, C, T) and outputs either the unique plaintext M for which ℰ_K (N, A, M) = (C, T) holds, or outputs ⊥ if the input is invalid. We introduce ℰKN,A(M) as short form of ℰ_K (N, A, M) and 𝒟KN,A(C,T) for 𝒟_K (N, A, C, T), respectively. We assume that AE schemes are correct, i.e., for all (K, N, A, M) ∈ 𝒦 × 𝒩 × 𝒜 × ℳ, it holds that 𝒟KN,A(ℰKN,A(M))=M .

The ideal AE scheme provides two oracles $ : 𝒩 × 𝒜 × ℳ → 𝒞 × 𝒯 and ⊥ : 𝒩 × 𝒜 × 𝒞 × 𝒯 → ℳ ∪ {⊥} that offer access to encryption and verification. We overload the ⊥ notation to mean the oracle and the symbol of invalid decryption. Given (N, A, M), the ideal encryption oracle outputs ciphertext-tag tuples (C, T) that are random bits of the expected length. The ideal decryption oracle ensures correctness, i.e., given an input (N, A, C, T) where (C, T) had been the output to a previous encryption query (N, A, M), the decryption oracle outputs the corresponding message M. Otherwise, the decryption always returns the invalid symbol ⊥ for every new decryption query that had not been the answer to an earlier encryption query.

Since this work studies schemes based on public permutations, we employ the usual security notions in the ideal-permutation model. So, the adversary always has an additional oracle π^± that provides access to the public permutation π in forward and backward direction. We write Π[π] and ℰ[π], 𝒟[π], etc. to indicate that an authenticated-encryption scheme Π and its algorithms are based on a primitive π ↞ Perm(ℬ), where ℬ = {0, 1}ⁿ is some block space. Note that in the analysis, the permutation is chosen uniformly at random from the set of all permutations over ℬ. Though, the model is an adaption of the ideal-cipher model [60], and not of the random-oracle model [12]. We write Δ_A(𝒪¹; 𝒪²) for the advantage of A to distinguish between 𝒪¹ and 𝒪².

Definition 1

(nAE Security). Let K ↞ 𝒦, π ↞ Perm(ℬ), and let Π[π] = (ℰ[π]_K, 𝒟[π]_K) be a nonce-based authenticated scheme. Let A be a nonce respecting adversary. Then,

AdvΠ[π]nAE(A)=defΔA(ℰ[π]K,𝒟[π]K,π±;$,⊥,π±) .

In the RUP model, the understanding of nonce-based AE differs slightly from the previous definition. Here, we use the notion that the adversary can always see the full resulting plaintext from a decryption query. To formulate the forgery goal, the oracles are adapted. A verification oracle outputs 1 iff the input is valid, and 0 otherwise. A nonce-based RUP authenticated encryption scheme Π˜=(ℰ˜K,𝒟˜K,𝒱˜K) is a three tuple of encryption algorithm ℰ˜:𝒦×𝒩×𝒜×ℳ→𝒞×𝒯 , decryption algorithm 𝒟˜:𝒦×𝒩×𝒜×𝒞×𝒯→ℳ , and verification algorithm 𝒱˜K:𝒦×𝒩×𝒜×𝒞×𝒯→{0,1} . The signature of the encryption and decryption algorithms are unchanged.

Definition 2

(Int-RUP Security). Let K ↞ 𝒦, π ↞ Perm(ℬ), and let Π˜[π]=(ℰ˜[π]K,𝒟˜[π]K,𝒱˜[π]K) be a nonce-based RUP authenticated scheme. Let A be a nonce-respecting adversary. Then,

AdvΠ[π]Int-RUP(A)=defΔA(ℰ˜[π]K,𝒟˜[π]K,𝒱˜[π]K,π±;ℰ˜[π]K,𝒟˜[π]K,⊥,π±) .

2.3 H-coefficient Technique

We will need a proof method in the later parts, where we opt for Patarin’s well-suited H-coefficient technique in the variant by Chen and Steinberger [30, 54]. The results of the interaction of an adversary A with its oracles are collected in a transcript τ. The oracles can sample randomness before the interaction (often a key or an ideal primitive that is sampled beforehand), and are then deterministic throughout the experiment [30]. The challenge of A is to distinguish the real world 𝒪_real from the ideal world 𝒪_ideal. We denote by Θ_real and Θ_ideal random variables that represent the distribution of transcripts in the real and the ideal world, respectively. In general, a transcript τ is called attainable if the probability to obtain τ in the ideal world – i.e. over Θ_ideal – is non-zero. The Fundamental Lemma of the H-coefficients technique, the proof to which is given in [30, 54], states that we can partition the set of all attainable transcripts into two disjoint sets GoodT and BadT.

Lemma 1

(Fundamental Lemma of H-coefficient Technique [54]). Assume that there exist ε₁, ε₂ ≥ 0 such that for any transcript τ ∈ GoodT, it holds that Pr [Θ_real = τ] / Pr [Θ_ideal = τ] ≥ 1 − ε₁. Moreover, assume that Pr [Θ_ideal ∈ BadT] ≤ ε₂. Then under the aforementioned assumptions, for all adversaries A, it holds that Δ_A (𝒪_real; 𝒪_ideal) ≤ ε₁ + ε₂.

Note that in all our analyses, we consider information-theoretic distinguishers A, where we assume idealized primitives. Thus, their resources are bounded only in terms of their maximal numbers of queries and blocks that they can ask to their available oracles. One can derive the computation-theoretic counterparts easily by adding a parameter of the distinguishers’ maximal computational efforts.

3 Int-RUP Attacks on Existing AE Schemes

This section shows attacks under Int-RUP adversaries on the duplex mode, as well as on recent more secure AE schemes Beetle or SPoC. For each construction, we briefly recall the necessary parts of their definition. As summarized in Table 2, the proposed attacks possess an advantage of O(qpqd2c) on the previous constructions. Thus, the improved bounds of Beetle or SPoC do not carry over to the Int-RUP setting. For all attacks, we consider a random permutation π ↞ Perm(ℬ) and a randomly chosen secret key K ↞ 𝒦. We denote by A a nonce-respecting Int-RUP adversary against the individual schemes.

Table 2

Comparison of the security bounds for Int-RUP attacks on previous permutation-based AE schemes and our construction.

Scheme	Bound
Scheme	Unmasked	Masked
Generic Duplex [15]	O(q_dq_p/2^c)	O(qd2/2c)
Beetle [25]	O(q_dq_p/2^c)	O(qd2/2c)
SPoC [5]	O(q_dq_p/2^c)	O(qd2/2r)
Oribatida [This work]	–	O(qd2/2c)

The main idea of all the attacks in this section is as follows: A asks q_d decryption queries s. t. any predetermined r bits (e.g., the first r bits) of the input to one of the permutations of the construction are fixed and known (say X). The remaining n − r = c bits may vary. Next, A asks q_p primitive queries Q¹, Q²,. . . , Q^q_p with the first r bits fixed to X, but with pairwise distinct c bits, and receives R¹, R²,. . . , R^q_p. When q_d · q_p ≈ O(2^c), A can expect a state collision between an on-line input to the permutation and an (off-line) permutation query. This collision can be detected from the first r bits of the outputs of the corresponding construction queries, which will be equal for the colliding inputs. Once A knows the full state at the input to the permutation of the construction, it can revert the permutation calls in the construction and finally recover the key. We adapt this strategy for the duplex mode and Beetle before we consider the differences in SPoC and hybrids.

3.1 Int-RUP Attack on The Duplex Mode

Let us consider the Sponge-Wrap mode [15].

The adversary A asks q_d decryption queries (N, A¹, C), (N, A², C), . . . , (N, A^q_d, C), and receives M¹, M², . . ., M^q_d. The associated data Aⁱ consist of a single block, the ciphertexts C = (C₁, C₂) are fixed to the same two blocks for each query.
Now A can follow the generic idea to complete the attack.

The attack complexity is q_dq_p ≈ O(2^c).

3.2 Int-RUP Attack on Beetle

Beetle [25] is a recent permutation-based light-weight AE scheme. From now onward, we consider the updated variant [26] that fixed the proof and details (such as no double call to the permutation). An overview of the encryption process is provided in Figure 1. The map ρ : {0, 1}^r × {0, 1}^r → {0, 1}^r × {0, 1}^r computes ρ(I1,I2)=def(shuffle(I1)⊕I2,I1⊕I2) for all inputs I₁, I₂ ∈ {0, 1}^r, where shuffle(x) =^def (lsb_r/2 (x) || lsb_r/2 (x) ⊕ msb_r/2 (x)). The results of ρ are ordered as (X_a+i, C_i) ← ρ(Y_a+i, M_i) and (Y_a+i, M_i) ← ρ⁻¹(X_a+i, C_i), respectively.

The adversary A asks q_d encryption queries (N¹, A¹, M), (N², A², M), . . . , (N^q_d, A^q_d, M) to the encryption oracle, and receives C¹, C², . . . , C^q_d. Size of M and Aⁱ is one block for each i.
Then, A asks q_d decryption queries (N¹, A¹, C^1′), (N², A², C^2′), …,(N^q_d, A^q_d, C^q′_d) to the decryption oracle where Ci'←Y2i⊕shuffle(Y2i) . This ensures that the first r bits of the input to the third permutation always equal zero.
Now A can follow the generic idea to complete the attack.

The attack complexity is again q_dq_p ≈ O(2^c).

Figure 1

The Beetle authenticated encryption scheme.

3.3 Int-RUP Attack on SPoC

SPoC (see Figure 2a) is a permutation-based NIST Lightweight candidate [5]. that uses the capacity to derive ciphertext outputs from, while it still absorbs the message in the rate. In SPoC, the adversary cannot fix any part of the state in contrast to SpongeWrap and Beetle– though, there is a similar attack:

A asks q_d queries (N, A, C¹), (N, A, C²), . . ., (N, A, C^q_d) to the decryption oracle and receives M¹, M² , . . ., M^q_d. The associated data A and ciphertext Cⁱ consist of a single block for every i. This ensures that the first r bits of the input to the third permutation always equal M¹ ⊕ C¹.
Now A can follow the generic idea to complete the attack.

So, the attack needs q_dq_p ∈ O(2^c) to work, as before.

Figure 2

3.4 Int-RUP Attack on A Hybrid of Beetle and SPoC

We can generalize our attacks to hybrid modes of Beetle and SPoC as well. Such a hybrid would use both modes Beetle and SPoC in parallel to process the queries. We illustrate it in Figure 2b. Each message block (say M) is parsed into two sections (say M¹ and M²), where |M¹| = r₁ and |M²| = r₂. M¹ is processed with Beetle to a ciphertext block C¹; M² with SPoC to a ciphertext block C²; The final ciphertext block becomes C ← C¹ ‖ C², and the associated-data blocks and the ciphertext blocks for decryption are treated in a similar manner. Note that the hybrid mode is parameterized by r₁, r₂ and c with the condition c ≥ r₂. The size of rate and capacity of the Beetle part are r₁ and c − r₂; the size of both rate and capacity of the SPoC part is r₂. As a result, the size of rate and capacity of the hybrid mode is r = r₁ + r₂ and c. When r₂ = 0, the hybrid mode translates to the Beetle mode. Similarly, when r₁ = 0, the hybrid mode is equivalent to the SPoC mode.

An Int-RUP attack on such modes could be defined as follows:

A asks q_d decryption queries (N, A¹, C), (N, A², C), . . ., (N, A^q_d, C) to the decryption oracle and receives M¹, M², . . . , M^q_d. The ciphertext C and associated data Aⁱ consist of a single block for every i.
There exists at least one value of the last r₂ bits of the input to the third permutation which remains same for at least q=qd2r2 queries. Suppose q such queries are (N, A^{1^′}, C), (N, A^{2^′}, C), . . ., (N, A^{q^′}, C).
A can detect the previous step as it knows the value of the last r₂ bits of the input to the third permutation because that will be equal to the last r₂ bits of C ⊕ Mⁱ.
A retains those q queries and discards the rest.
For each of the above queries, A updates the value of the first r₁ bits of the ciphertext to Y₂ ⊕ shuffle(Y₂) and varies the remaining r₂ bits. This ensures that the first r₁ bits of the input to the third permutation always equal zero.
In the way mentioned above, A can ask q_d more decryption queries to the decryption oracle. This time, a total of r bits (first r₁ bits and last r₂ bits) of the input to the third permutation are fixed and known to A.
Then, A can follow the generic idea to complete the attack.

The attack needs again q_dq_p ∈ O(2^c) as before.

3.5 Discussion

As a takeaway from this section, the unmasked sponge-based AE schemes allow Int-RUP attacks whose advantage can depend linearly on the number of off-line primitive queries. We think that any AE construction which uses linear feedback is vulnerable to such an attack (q_dq_p ∈ O(2^c)) unless it uses more state. The next section defines Oribatida that masks its ciphertexts for higher Int-RUP resistance. After its definition, we will get back to attacks on it and on strengthened versions of the schemes sketched here to show that they can be similarly extended by a ciphertext masking.

4 Specification of Oribatida

At its core, Oribatida is a variant of the monkey-wrap design [16], but adds a ciphertext masking. This section considers a slightly updated version of Oribatida. Section 9 discusses the update from Oribatida (v1.2) from [21] to Oribatida v1.3 in this work.

In the following, let P ∈ Perm(ℬ) be a permutation. We denote by (X_i, Y_i) the inputs and by (U_i, V_i) the outputs of the primitive(s). As in the classical sponge, Oribatida considers the state S_i = (U_i ‖ V_i) as a rate U_i of r bits, where inputs are XORed to, and a capacity V_i of c = n − r bits. Unlike the usual sponge, an s-bit part of the capacity is used to mask the subsequent ciphertext block. The definition is given in Algorithm 1. We assume that the key size is at most the capacity, k ≤ c, and the tag size is at most τ ≤ r bits.

Algorithm 1

Specification of Oribatida v1.3. The domain-encoding functions GetDomainForN, GetDomainForA, and GetDomainFore are instantiation-specific. They are defined in Algorithm 2.

101:	function ℰKN,AM
102:	ℓ_A ← \|A\|
103:	ℓ_E ← \|M\|
104:	d_N ← GetDomainForN (ℓ_A, ℓ_E)
105:	d_A ← GetDomainForA(ℓ_A, ℓ_E)
106:	d_E ← GetDomainForE(ℓ_E)
107:	if \|A\| = 0 then A ← 1 ‖ 0^r−1
108:	A ← pad_r(A)
109:	M ← pad_r(M)
110:	(S₁, V₁) ← Init(K, N, d_N, ℓ_A)
111:	S_a+1 ← ProcessAD(S₁, A, d_A)
112:	(C, T) ← Encrypt(S_a+1, M, V₁, d_E, ℓ_E)
113:	return (C, T)

121:	function Encrypt(S_a+1, M, V₁, d_E, ℓ_E)
122:	x ← ℓ_E mod r
123:	(M1,⋯,Mm)←rM
124:	V ← V₁
125:	for i = 1..m do
126:	(Ua+i,Va+i)←r,cSa+i
127:	X_a+i ← M_i ⊕ U_a+i
128:	C_i ← X_a+i ⊕_s lsbs (V)
129:	Y_{a + i} ← V_{a + i}
130:	if i = m then
131:	Y_{a + i} ← Y_{a +i} ⊕_d d_E
132:	C_m ← msb_x(C_m)
133:	V ← V_{a + i}
134:	S_{a + i + 1} ← P(X_{a + i} ‖ Y_{a + i})
135:	C ← (C₁ ‖ C₂ ‖ . . . ‖ C_m)
136:	T ← msb_τ(S_{a + m + 1}) ⊕_s lsb_s(V)
137:	return (C, T)

141:	function Init(K, N, d_N, ℓ_A)
142:	V₀ ← lsb_s(N ‖ K)
143:	S₁ ← P((N ‖ K) ⊕_d d_N)
144:	V₁ ← lsb_s(S₁)
145:	return (S₁, V₁)

151:	function ProcessAD(S₁, A, d_A)
152:	(A1,⋯,Aa)←rA
153:	for i = 1..a − 1 do
154:	S_{i + 1} ← P(S_i ⊕ (A_i ‖ 0^c))
155:	S_{a + 1} ← P(S_a ⊕ (A_a ‖ 0^c) ⊕_d d_A)
156:	return S_{a + 1}

201:	function 𝒟KN,AC,T
202:	ℓ_A ← \|A\|
203:	ℓ_E ← \|C\|
204:	d_N ← GetDomainForN(ℓ_A, ℓ_E)
205:	d_A ← GetDomainForA(ℓ_A, ℓ_E)
206:	d_E ← GetDomainForE(ℓ_E)
207:	if \|A\| = 0 then A ← 1 ‖ 0^r−1
208:	A ← pad_r(A)
209:	C ← pad_r(C)
210:	(S₁, V₁) ← Init(K, N, d_N, ℓ_A)
211:	S_{a + 1} ← ProcessAD(S₁, A, d_A)
212:	(M, T′) ← Decrypt(S_{a + 1}, C, V₁, d_E, ℓ_E)
213:	if T = T′ then return M
214:	else return ⊥

221:	function Decrypt(S_{a + 1}, C, V₁, d_E, ℓ_E)
222:	x ← ℓ_E mod r
223:	V ← V₁
224:	if ℓ_E = 0 then
225:	T′ ← msb_τ(S_{a + 1}) ⊕_s lsb_s(V)
226:	return (ɛ, T′)
227:	(C1,⋯,Cm)←rC
228:	for i = 1..m do
229:	(Ua+i,Va+i)←r,cSa+i
230:	X_{a + i} ← C_i ⊕_s lsb_s(V)
231:	Y_{a + i} ← V_{a +i}
232:	M_i ← U_{a + i} ⊕ X_{a + i}
233:	if i = m then
234:	Y_{a + i} ← Y_{a + i} ⊕_d d_E
235:	M_m ← msb_x(M_m)
236:	V ← V_{a + i}
237:	S_{a + i +1} ← P(X_{a + i} ‖ Y_{a + i})
238:	M ← (M₁ ‖ M₂ ‖ . . . ‖ M_m)
239:	T ← msb_τ(S_{a + m + 1}) ⊕_s lsb_s(V)
240:	return (M, T′)

241:	function PAD_x(X)
242:	if \|X\| mod x = 0 then return X
243:	return X ‖ 1 ‖ 0^{x−(\|X\| mod x)−1}

251:	function LSB_x(X)
252:	if \|X\| ≤ x then return X
253:	return X[(\|X\| − x − 1)..0]

261:	function MSB_x(X)
262:	if \|X\| ≤ x then return X
263:	return X[(\|X\| − 1)..(\|X\| − x)]

4.1 Initialization

Each variant of Oribatida uses a fixed-size nonce N, whose length ν is such that k + ν = n bits. N is concatenated with the key K to initialize the state: N ‖ K: (X₀, Y₀) ← (N ‖ K) ⊕_d 〈d_N〉_d. The domain d_N is XORed to the d least significant bits of the initial state. The first value S₁ results from S₁ ← P(U₀ ‖ V₀); V₁ is stored for masking the first block of ciphertext later.

Figure 3

Authenticated encryption of a-block associated data A and m-block message M with Oribatida.

4.2 Processing Associated Data

After the initialization, the associated data A is split into r-bit blocks and is absorbed in the rate. If its length is not a multiple of r bits, A is padded with a 10^*-padding if |A| mod r ≢ 0 such that its length becomes the next highest multiple of r bits. If the associated data is empty, it is padded to one full block 10^r−1. In this case, we denote the length of the padded associated data A in blocks also as a = 1. The padded A is split into r-bit blocks (A₁, . . ., A_a). Given (Ui,Vi)←r,cSi is XORed to the rate of the state: X_i ← U_i ⊕ A_i, for 1 ≤ i < a. For all non-final blocks of A, the capacity of the permutation output, V_i, is simply forwarded to that of the subsequent input to the permutation P: Y_i ← V_i. The state is updated with P afterwards, for all 1 < i < a (that is, except the final a-th block of A): S_{i + 1} ← P(X_i ‖ Y_i). When the final block A_a is processed, a domain d_A is XORed to the least significant byte of the capacity.

4.3 Encryption

After A has been processed, the message M is encrypted. Similarly as for the associated data, if its length is not a multiple of r bits, M is padded with a 10^*-padding such that its length after padding becomes the next highest multiple of r bits. An empty message M = ε will not be padded.

After M is split into r-bit blocks (M₁, . . ., M_m) (after padding if necessary), the blocks M_i are processed one after the other. Given the state value (Ua+i,Va+i)←r,cSa+i , the current block M_i is XORed to the rate U_a+i: X_a+i ← M_i ⊕ U_{a + i}. The capacity is simply forwarded: Y_{a + i} ← V_{a + i}. Then, (X_{a + i} ‖ Y_{a + i}) is used as input to a call to P to derive the next state value S_{a + i +1} ← P(X_{a + i} ‖ Y_{a + i}).

The ciphertext blocks C_i are computed from a sum of the current rate, the current plaintext block, and a (partial) earlier mask value from the capacity. The first ciphertext block is computed from C₁ ← X_{a + i}⊕_s lsb_s(V₁). If C₁ is the final ciphertext block, it is computed as C₁ ← msb_{ℓ_E}(X_{a + i} ⊕_s lsb_s(V₁)), where ℓ_E denotes the length of M before padding. Non-final ciphertext blocks C_i, 1 < i < m are computed from C_i ← X_{a + i} ⊕s lsb_s(V_{a + i−1}), for 1 < i < m. If m > 1, the final ciphertext block results from C_m ← msb_{ℓ_E}mod r(X_{a + m}⊕_s lsb_s(V_{a + m−1})). For the final message block, a domain d_E is XORed to the least significant byte of the capacity: Y_{a + m} ← V_{a + m}⊕_d 〈d_E〉_d. Similar as for A, d_E uses three pairwise distinct domains depending on whether M was empty, non-empty and required no padding, or non-empty and has been padded. P is called another time to derive S_{a + m +1} ← P(X_{a + m} ‖ Y_{a + m}). Its rate is XORed with the most significant τ bits of the key V_{a + m}, and – truncated to τ bits if necessary – is released as the authentication tag: T ← msb_τ(S_{a + m +1}) ⊕_s lsb_s(V_{a + m}). Note that, for s = τ as for our instantiations, the tag is masked as the ciphertext output blocks, which unifies this process.

4.4 Decryption

The decryption takes a tuple (K, N, A, C, T). The initialization with K and N as well as the processing of the associated data A is performed in the same manner as for encryption. If |C| mod r ≠ 0, the decryption pads C with a 10^*-padding to the next multiple of r bits. In all cases, it splits C into r-bit blocks (C₁, . . ., C_m−1) plus a final block C_m. If m > 1, the plaintext block is computed as X_{a + i} ← C_i ⊕_s lsb_s(V) and M_i ← (U_{a + i} ⊕ X_{a +i}), where V = V₁ for i = 1, and V = V_a+i−1 otherwise. The capacity is simply forwarded to the next call of the permutation: Y_{a + i} ← V_{a + i}. The subsequent state is then (U_a+i+1 ‖ V_a+i+1) ← S_a+i+1 ← P(X_a+i ‖ Y_a+i).

The final plaintext block is computed from the padded ciphertext block C_m as X_a+m ← C_m ⊕s lsb_s(V) and M_m ← lsb_x(U_a+m ⊕ X_a+m), where x ← ℓ_E mod r. For the final block, the domain d_E is XORed to the least significant byte of the capacity: Y_a+m ← V_a+m⊕_d 〈d_E〉_d. The would-be tag T′ is derived by computing (T′ ‖ Z) ← P(X_a+m ‖ Y_a+m) ⊕_s lsb_s(V_a+m), and using only its most significant τ bits: T′ ← msb_τ(T′ ‖ Z) as for the encryption If T = T′, the ciphertext is considered valid, and M = (M₁ ‖ . . . ‖ M_m) is released as plaintext. Otherwise, the ciphertext is deemed invalid, and ⊥ is returned.

4.5 Domain Separation

For domain separation, Oribatida defines constants d_N, d_A and d_E. The domains are XORed with the least significant byte of the state at three stages. Domains are encoded as d-bit strings, where d = 4 bits suffice in practice. The value depends on the presence of A and M and whether their final blocks are absent, partial, or integral to prevent trivial collisions of inputs to P among blocks of A and M. The constants are determined by four bits (t₃, t₂, t₁, t₀) that reflect inputs in the hardware API, similar to, e.g., [25]:

EOI: t₃ is the end-of-input control bit. This bit is set to 1 if the current data block is the final block of the input. Note that if the associated data is empty, then the created 10^r−1-block is never treated as the final block of the input.
EOT: t₂ is the end-of-type control bit. This bit is set to 1 if the current data block is the final block of the same type, i.e., it is the last block of the nonce/associated data/message. Note that if both the associated data and the message are empty, then neither the nonce nor the created 10^r−1-block of the associated data is considered as the final block of its type.
Partial: t₁ is the partial-control bit. This bit is set to 1 if the size of the current block is less than the block size. Note that if the associated data is empty and the message is non-empty, then the created 10^r−1-block is treated as a partial block.
Type: t₀ is the type-control bit, identifying the type of the current block. For the nonce and the final message block, t₀ = 1. If the associated data is empty and the message is non-empty, then t₀ = 1 for the created 10^r−1-block of the associated data. For all other cases, t₀ = 0.

While processing a data block, the domains are set as the integer representation of t₃ ‖ t₂ ‖ t₁ ‖ t₀. For example, processing the nonce (which is always a complete r-bit block) with empty associated data and non-empty message yields d_N = (t₃t₂t₁t₀) = (0101)₂ = 5. Details are provided in Algorithm 2; ℓ_A denotes the length of A and ℓ_E that of M in bits before padding. An overview is given in Table 3.

Algorithm 2

Instantiated Domains.

11:	function GetDomainForN(ℓ_A, ℓ_E)
12:	if ℓ_A = 0 ∧ ℓ_E = 0 then return 〈9〉_d
13:	return 〈5〉_d

21:	function GetDomainForA(ℓ_A, ℓ_E)
22:	if ℓ_A = 0 ∧ ℓ_E = 0 then return 〈0〉_d
23:	if ℓ_A = 0 ∧ ℓ_E > 0 then return 〈7〉_d
24:	if ℓ_E > 0 ∧ ℓ_A mod r ≡ 0 then return 〈4〉_d
25:	if ℓ_E > 0 ∧ ℓ_A mod r ≢ 0 then return 〈6〉_d
26:	if ℓ_E = 0 ∧ ℓ_A mod r ≡ 0 then return 〈12〉_d
27:	if ℓ_E = 0 ∧ ℓ_A mod r ≢ 0 then return 〈14〉_d

31:	function GetDomainForE(ℓ_E)
32:	if ℓ_E mod r ≡ 0 then return 〈13〉_d
33:	if ℓ_E mod r ≢ 0 then return 〈15〉_d

Table 3

Instantiated domains in Oribatida. • = yes, – = no.

\|A\|		\|M\|		Domains
> 0	modr ≡ 0	> 0	modr ≡ 0	d_N	d_A	d_E
–	–	–	–	〈9〉_d	〈0〉_d	–
–	–	•	–	〈5〉_d	〈7〉_d	〈15〉_d
–	–	•	•	〈5〉_d	〈7〉_d	〈13〉_d

•	–	–	–	〈5〉_d	〈14〉_d	–
•	–	•	–	〈5〉_d	〈6〉_d	〈15〉_d
•	–	•	•	〈5〉_d	〈6〉_d	〈13〉_d

•	•	–	–	〈5〉_d	〈12〉_d	–
•	•	•	–	〈5〉_d	〈4〉_d	〈15〉_d
•	•	•	•	〈5〉_d	〈4〉_d	〈13〉_d

5 Int-RUP Attacks on Schemes with Masked Ciphertexts

The approach of Oribatida is to employ (a portion of) the capacity of the previous permutation output to mask the ciphertext outputs. This strategy is generic enough to also apply it to other modes, such as Beetle or SPoC. We can informally define a masked variant of Beetle and SPoC. The masked beetle uses Z_i−1 and XORs lsb_s(Z_i−1) to the s rightmost bits of the ciphertext block C_i, for i > 1 and s ≤ c. If there is no associated data present, we define Z₀ = K₂. A masked variant of SPoC would employ the rate (since it is the hidden part). Thus, for the masked SPoC, we define s ≤ r and define that lsb_s(U_i−1) is XORed to C_i for i > 0. If no associated data is present, we define U₀ for the rate of the initial input to the permutation P.

For Oribatida, the masked Beetle or the masked SPoC, the attacks in Section 3 do not apply directly. Though, there exist attacks on each of them with complexity O(qd2/2c) .

5.1 The Generic Int-RUP Attack on Oribatida (Masked Duplex)

Here, we consider an attack on Oribatida that shows that our Int-RUP bound of O(qd2/2c) will be tight, so the attack will be successful for qd2≈O(2c) :

A asks q_d decryption queries (Nⁱ, Aⁱ, Cⁱ, Tⁱ), where Nⁱ and Cⁱ is static for all queries. We assume that the associated data Aⁱ are pairwise distinct and consist of a single block for all queries. A obtains Mⁱ from the encryption oracle, for 1 ≤ i ≤ q_d. The rate X2i←C1i⊕slsbs(V1i) is constant for all queries.
For q_d ≈ O(2^c/2), A can expect a collision in the capacity of the input: V2i←V2j for some distinct i ≠ j, i, j ∈ [1.. q_d]. Then, this collision leads to a full-state collision that can be detected when M1i=M1j .
A asks encryption queries (N, Aⁱ, Mⁱ) and obtains (Cⁱ, T) for some tag T.
(N, A^j, Cⁱ, T) is a valid forgery and yields M^j.

5.2 Int-RUP Attack on The Masked Beetle

The adversary A asks q_d encryption queries (N¹, A¹, M), (N², A², M), . . . , (N^q_d, A^q_d, M) to the encryption oracle, and receives C¹, C², . . ., C^q_d. The associated data Aⁱ consist of a single block for each i; the message M contains ⌈cr⌉ blocks.
A asks q_d − 1 decryption queries, one for each encryption query except the first encryption query, to the decryption oracle. The decryption query of the i-th encryption query is (Nⁱ, Aⁱ, C^{i^′}), where Ci'=Y21⊕Y2i⊕shuffle(Y21)⊕shuffle(Y2i) . A can calculate the r.h.s. as shuffle(Y21)⊕shuffle(Y2i)=C1⊕Ci , and Y21⊕Y2i directly from shuffle(Y21)⊕shuffle(Y2i) with the definition of shuffle. So, the first r bits of the input to the third permutation call always equal those of the first encryption query.
Afterwards, A repeats Step 2 to 6 from Section 5.1 to complete the attack.

The attack is successful for qd2≈O(2c) . However, the attack strategy differs for SPoC.

5.3 Int-RUP Attack on The Masked SPoC

Here, A has to perform the attack in two stages.

First, A asks q_d decryption queries (N, A¹, C), (N, A², C), . . . , (N, A^q_d, C) to the decryption oracle, and receives M¹, M², . . ., M^q_d. The associated data Aⁱ consists of a single block for each i; the ciphertext C consists of two blocks.
When q_d ≈ 𝒪(2r), A expects a collision in the first r bits of the input to the third permutation call. A can detect this collision by looking at the first message block because it will be equal only for the two colliding queries.
Suppose the associated data of the two colliding queries are Aⁱ and A^j.
A makes q₁ queries (N, Aⁱ, C¹), (N, Aⁱ, C²), . . . , (N, Aⁱ, C^q₁), and q₂ queries (N, A^j, C¹), (N, A^j, C²), . . ., (N, A^j, C^q₂) to the decryption oracle.
When q₁ · q₂ ≈ 𝒪(2^r), A expects a full state collision at the input to the third permutation, between one query with associated data Aⁱ and another query with associated data A^j.
Suppose the two ciphertexts corresponding to the two colliding queries are C^p and C^q, and the corresponding messages are M^p and M^q.
A identifies those pairs of queries (N, Aⁱ, C^x), (N, A^j, C^y), 1 ≤ x ≤ q₁ and 1 ≤ y ≤ q₂, for which the sum of the second message blocks equals that of the first message blocks. For each such pair, A updates C^x and C^y by appending ⌈cr⌉-1 ciphertext blocks to each s.t., C2x=C2y,…,C⌈cr⌉x=C⌈cr⌉y , and makes decryption queries with the updated ciphertexts. For k < 2, Mkp will equal Mkq .
Next, A asks (N, Aⁱ, M^p) to the encryption oracle; suppose, the tag is T.
Then, A successfully forges with the query (N, A^j, C^q, T).

Again, the probability for forgeries becomes non-negligible when qd2≈O(2c) .

6 nAE Security Analysis

This section analyzes the nAE security of Oribatida. In the following, let K ↞ 𝒦 and π ↞ Perm(ℬ). We use Π[π, π]_K = Π[π]_K as short form of Oribatida, instantiated with π for P and for P′, and keyed by K. Let A be a nonce-respecting nAE adversary w.r.t. Π[π]_K. We denote by q_p, q_f, q_b, q_c, q_e, q_d, σ_c, σ_e, σ_d the number of primitive queries, forward primitive queries, backward primitive queries, construction queries, encryption queries, decryption queries, blocks summed over all construction queries, blocks summed over only all encryption queries, and blocks summed over all decryption queries, respectively. It holds that q_p = q_f + q_b, q_c = q_e + q_d, and σ_c = σ_e + σ_d. For simplicity, we define a function ρ as

ρ(i,j)=def{1if j=1ai+j-1otherwise .

So, Vρ(i,j)i denotes the used block for masking the ciphertext block Cji .

Recall the notion of a longest common prefix from [11]. Let Q = (N, A, M, C, T) be a query of A with the response. Let Q denote a set of queries without Q, i.e., Q ∉ Q. We define the length of the longest common prefix of M and another message M′ as LCP(M,M′)=defmaxi{1≤j≤i:Mj=Mj′} . Given Q and M, we overload the notation by considering the longest common prefix of M with the queries in Q′ = (N′, A′, M′, C′, T′) ∈ Q: LCP(M,𝒬)=defmaxM′∈𝒬{LCP(M,M′)} . We also define

LCPN,A(M,𝒬)=defmax(N′,A′,M′,C′,T′)∈𝒬N′,N∧A′-A{LCP(M,M′)} .

Collisions of chaining values are trivial if in the longest common prefix and non-trivial otherwise.

Theorem 1

(nAE Security of Oribatida). Let A be a nonce-respecting adversary w.r.t. Π[π]_K. Then, AdvΠ[π]KnAE(A) is upper bounded by

(σr)+2(qpr)2r(r-1)+σ22n+3qp2k+r(qd+σd)+2σeqp+qpqc+qd(σe+qp)+2rqp2c+s+qd2τ .

Proof

We follow the strategy of the nAE proof of Beetle [25]. The queries by A and their corresponding answers are collected in a transcript τ = (τ_e, τ_d, τ_p). In that transcript, the encryption construction queries are stored as tuples τ_e = {(Nⁱ, Aⁱ, Mⁱ, Cⁱ, Tⁱ)}, for 1 ≤ i ≤ q_e, the decryption construction queries are stored as tuples τ_d = {(Nⁱ, Aⁱ, Mⁱ, Cⁱ, Tⁱ)}, for 1 ≤ i ≤ q_d, and primitive queries are stored as tuples τ_p = {(Qⁱ, Rⁱ)}, where π(Qⁱ) = Rⁱ, for 1 ≤ i ≤ q_p.

Sampling

We define the ideal oracle to consist of an on-line and an off-line phase. In the on-line phase, the ideal oracle samples the responses (Cⁱ, Tⁱ) uniformly at random from the bit strings of expected lengths for encryption queries. For decryption queries, it always outputs ⊥. For forward primitive queries Qⁱ, it forwards the result of π(Qⁱ) to A; for backward primitive queries Rⁱ, it forwards the result of π⁻¹(Rⁱ).

In the off-line phase, the ideal oracle samples the internal chaining values Vji↞{0,1}c and Uji↞{0,1}r uniformly at random for all construction queries in encryption direction. It derives the analogous internal chaining values Vji , for 1 ≤ j ≤ k for all construction queries in decryption direction (Nⁱ, Aⁱ, Cⁱ, Tⁱ) that share Nⁱ, Aⁱ = N^{i^′}, A^{i^′}, and for which C1i=C1i′,…,Cki=Cki′ holds for some i^′-th construction query, where i ≠ i^′. Moreover, for construction queries whose plaintext or ciphertext length is not a multiple of r bits, the oracle samples exactly the missing bits Cmi uniformly independently at random that are not fixed from previous queries, at most at most r-|Cmii| bits at a time. The so-sampled values for the final blocks Cmii are stored also in the transcript. Moreover, the random key K is revealed to A after the off-line phase.

Bad Events

We define the following bad events. If any of them occurs, the adversary aborts, and we define that it wins in this case.

bad₁: Multi-collision on the rate X in encryption construction queries. For some w ≥ r, ∃ indices (i₁, j₁), (i₂, j₂), . . . , (i_w, j_w) with i₁, i₂, . . ., i_w ∈ [1.. q_e], and j₁ ∈ [1.. a^i₁], j₂ ∈ [1..a^i₂], etc., s. t. Xj1i1=Xj2i2=…=Xjwiw .
bad₂: Collision of permutation inputs in encryption construction queries: ∃ indices (i, j) ≠ (i^′, j^′) with i, i^′ ∈ [1..q_e], j ∈ [1..mⁱ], and j^′ ∈ [1..m^{i^′}] s. t. (Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′) .
bad₃: Collision of permutation outputs in encryption construction queries: ∃ indices (i, j) ≠ (i^′, j^′) with i, i^′ ∈ [1..q_e], j ∈ [1..mⁱ], and j^′ ∈ [1..m^{i^′}] s. t. (Uji ∥ Vji)=(Uj′i′ ∥ Vj′i′) .
bad₄: Collision of permutation inputs between a construction and a primitive query: ∃ indices (i, j, i^′) with i ∈ [1..q_e], j ∈ [1..mⁱ], and i^′ ∈ [1..q_p] s. t. (Xji ∥ Yji)=Qi′ .
bad₅: Collision of permutation outputs between a construction and a primitive query: ∃ indices (i, j, i^′) with i ∈ [1..q_e], j ∈ [1..mⁱ], and i^′ ∈ [1..q_p] s. t. (Uji ∥ Vji)=Ri′ .
bad₆: Initial-state collision with a primitive query: ∃ indices (i, i^′) with i ∈ [1..q_e] and i^′ ∈ [1..q_p] s. t. (X0i ∥ Y0i)=Qi′ .
bad₇: Multi-collision in the rate of w outputs of forward primitive queries: for some w ≥ r, ∃ i₁, i₂, . . ., i_w ∈ [1.. q_p] s. t. msb_r(R^i₁) = msb_r(R^i₂) = . . . = msb_r(R^i_m).
bad₈: Multi-collision in the rate of w outputs of backward primitive queries: for some w ≥ r, ∃ i₁, i₂, . . ., i_w ∈ [1.. q_p] s. t. msb_r(Q^i₁) = msb_r(Q^i₂) = . . . = msb_r(Q^i_w).

We define that the adversary is provided with all internal chaining values Vji and Uji after its interaction, but before it outputs its decision bit, which only strengthens the adversary. We define the set of bad transcripts BadT, to contain exactly those attainable transcripts τ for which at least one of the bad events occurred. It holds that Pr[Θideal∈BadT]≤∑i=18Pr[badi] . The probability of bad transcripts in the ideal world is treated in the proof of Lemma 2. The ratio of good transcripts is bounded in Lemma 3. Our bound in Theorem 1 follows from them and the fundamental Lemma of the H-coefficient Technique [54], using w = r in the bounds.

Lemma 2

Let w ≥ r be a positive integer. It holds that

Pr[Θideal∈BadT]≤(σw)+2(qpw)2r(w-1)+σ22n+3qp2k+2σeqp+qpqc+2w⋅qp2c+s .

Proof

In the following, we upper bound the probabilities of the individual bad events.

bad₁: Multi-collision on X in encryption construction queries.

In the ideal world, the ciphertext blocks are sampled independently and uniformly at random from the strings of expected length. The internal values Xji can be computed by A once it is given the transcript including the internal chaining values Vji . It must hold that Xji←Cj-aii⊕slsbs(Vρ(i,j-ai)i) . The random sampling of C implies that the probability of the values Xji to take any specific r-bit value is 1/2^r. Note that in the case of a padded ciphertext block, each padded bit of Cmii is also sampled once randomly and given in the transcript. Hence, the probability for Xai+mii to take any r-bit value is also 2^−r in the ideal world. For fixed indices (i₁, j₁), (i₂, j₂),. . . , (i_w, j_w), it holds that

Pr[Xj1i1=Xj2i2=…=Xjwiw]≤2-r(w-1) .

Over all queries and blocks of τ_e, it follows that

Pr[bad1]≤(σw)2r(w-1) .

bad₂: Collision of two permutation inputs in encryption construction queries.

Here, we consider

Pr[(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′)] .

All ciphertext blocks and the internal chaining values Vρ(i,j-ai)i , j > aⁱ are sampled independently and uniformly at random. Moreover, padded bits of ciphertexts are sampled also independently and uniformly at random. Though, we have to consider two cases;

j = 0 ∧ j^′ = 0: since X0i and X0i′ contain nonces, and since we assume A to be nonce-respecting, the probability for a collision is zero in this case.
j > 0: In this case, Yji=Vji⊕const , where const ∈ {d_N, d_A, d_E} is a public constant. Moreover, Xji is derived from Cji ; so, both Xji and Yji are chosen independently and uniformly at random, and the probability for a collision in this case is at most 2⁻ⁿ.

Therefore, for fixed indices (i, j) ≠ (i^′, j^′), the probability is

Pr[(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′)]≤2-n .

Over all combinations of indices, it follows that

Pr[bad2]≤(σ2)2n .

bad₃: Collision of two permutation outputs in encryption construction queries.

This case is analogous to bad₂. The permutation outputs Vji are sampled randomly. In all cases, it holds that

Pr[(Uji ∥ Vji)=(Uj′i′ ∥ Vj′i′)]≤2-n .

Over all combinations of indices, it follows that

Pr[bad3]≤(σ2)2n .

bad₄: Collision of permutation inputs between a construction and a primitive query.

Again, we consider Xji←Cj-aii⊕slsbs(Vρ(i,j-ai)i) and Yji←Vji⊕const , where const is a public constant. The values Cj-aii and Vρ(i,j-ai)i are sampled randomly, the values (Xji ∥ Yji) take any value with probability at most 2⁻ⁿ.

Assume, the primitive query was asked before the construction query. If the construction query was in encryption direction, the collision probability for fixed queries is at most 2⁻ⁿ, for q_p · q_c combinations.
If the primitive query was asked after an encryption query, then, the latter one produced a tag. If the primitive query starts at any other block, A can see r − s bits. Hence, the probability is at most 2^−(c+s) for q_p · σ_e combinations. If the primitive query starts from the tag, the adversary sees c + s unmasked bits. Assuming bad¯1 , there are at most w equal tags over all encryption queries. So, the probability for a collision is 2^−(n−τ), for at most w · q_p combinations.

Over all combinations of indices, it follows that

Pr[bad4|bad¯1]≤max(σe⋅qp2n,σe⋅qp2c+s)+w⋅qp2c+s≤σe⋅qp2c+s+w⋅qp2c+s .

bad₅: Collision of permutation outputs between an encryption construction query and a primitive query.

Again, Uj+aii can be derived from Mji⊕Cji⊕slsbs(Vρ(i,j)i) and the values Cji and Vρ(i,j)i are sampled randomly. So, the values Uj+aii ∥ Vj+aii take any value with probability at most 2⁻ⁿ. If the primitive query starts at any other block, A can see r − s bits. Hence, the probability is at most 2^−(c+s) for q_p · σ_e combinations. Following a similar argument as for bounding bad₄ and excluding bad₁, we obtain over all combinations of indices that

Pr[bad5|bad¯1]≤σ⋅qp2c+s+w⋅qp2c+s .

bad₆: Initial-state collision with a primitive query.

Here, we know that the key is chosen uniformly at random. We distinguish between collisions depending on whether the primitive query was a forward query or a backward query.

If the primitive query was a forward query, it must hit the correct value of K ⊕_d d_N. So, the probability is at most q_p/2^k to collide with encryption construction queries. Considering also decryption queries, a nonce can repeat but change d_N. Since there exist at most three distinct values for d_N, the probability is at most 3q_p/2^k to collide.
If the primitive query was in backward direction, its response must hit any initial state of a construction query. If the construction query was asked before the primitive, A sees at best r − s bits of C₁. Then, the probability is at most q_c · q_p/2^c+s.
If the primitive query was asked before the construction query, A can use the nonce part of the primitive query’s result as a nonce. Though, a collision needs the key part to be correct, which holds with probability at most 3q_p/2^k.

Over all possible options, we obtain

Pr[bad6]≤3qp2k+qc⋅qp2c+s .

bad₇: Multi-collision in the rate of w outputs of forward primitive queries.

Since π is chosen randomly from the set of all permutations, the outputs are chosen randomly from a set of size 2ⁿ − (i − 1) for the i-th primitive query. So, the probability for w distinct queries to collide in their rate is at most 1/2^r(w−1) as for bad₇ in the nAE proof. Over all queries, the probability is upper bounded by

Pr[bad7]≤(qpw)2r(w-1) .

bad₈: Multi-collision in the rate of w outputs of backward primitive queries.

Following a similar argumentation as for bad₇, we obtain

Pr[bad8]≤(qpw)2r(w-1) .

Our bound in Lemma 2 follows from summing up all probabilities.

Lemma 3

Let τ ∈ GoodT. Then

Pr[Θreal=τ]Pr[Θideal=τ]≤1-(qd2τ+qd(qp+σe)2c+s+(σd+qd)⋅w2c+s) .

Proof

It remains to lower bound the ratio of real and ideal probability of obtaining a good transcript τ. Let τ = (τ_e, τ_d, τ_p) be an attainable transcript, where τ_d = ⊥_all contains only ⊥ for all responses. Since all ciphertext-block outputs and all internal chaining values in encryption queries are sampled independently and uniformly at random, their probability is 1/2 per bit. We define σ_distinct for the number of distinct calls to the permutation over all encryption and decryption queries. In the ideal world, it holds that

Pr[Θideal=τ]=Pr[K]⋅Pr[τe∧τp∧τd]=Pr[K]⋅Pr[τe]⋅Pr[τp]⋅Pr[τd]=12k⋅1(2n)σdistinct⋅1(2n)qp⋅1

since the outputs from encryption queries are sampled uniformly at random; so, the encryption and decryption transcripts τ_e and τ_d are independent from τ_p.

In the real world, the probabilities for choosing K as key and π as permutation are equal to those of the ideal world. We can separate the probability into

Pr[Θreal=τ]=Pr[K]⋅Pr[τe∧τp∧τd]=12k⋅Pr[τe,τd|τp]⋅Pr[τp]

since the encryption and the decryption transcript depend on the choice of the permutation π. Let ⊤_i denote that the i-th decryption query was a valid forgery. We can upper bound

where we define

ɛ=def∑i=1qdɛi and ɛi=defPr[⊤i|τe|τp]⋅Pr[τp] .

The probability of primitive queries is given by the fraction of all permutations π that would produce τ_p, which is

Pr[τp]=1(2n)qp ,

as in the ideal world. The ciphertext blocks Cji from encryption queries as well as the chaining values Vji are results from the permutation π and hence, depend on the permutation. Since τ is a good transcript, there are no undesired collisions, e.g., between primitive and construction queries. Hence, all internally computed values (Uji ∥ Vji) – note that Uji can be derived from Cj-aii⊕slsbs(Vρ(i,j-ai)i)⊕Mj-aii by the adversary – are results of fresh values or predefined in decryption queries from the result of previous encryption queries. Then, the probabilities for the outputs of π in construction queries are given by 1/(2ⁿ)_{σ_distinct}. It is not difficult to see that for positive σ_distinct, the ratio of the interpolation probabilities from Equation (1) can be bounded by

(Pr[τe|τp|Θreal]⋅Pr[τp|Θreal])(Pr[τe|Θideal])=1(2n)σdistinct1(2n)σdistinct=(2n)σdistinct(2n)σdistinct≥1 .

It remains to upper bound ε. For this purpose, we upper bound the values ε_i for transcripts that contain forgeries. Since τ is a good transcript, we assume that bad events do not hold. Hence, either ⊤_i does not hold, which yields ε_i = 0; in the opposite case, we have to consider a few mutually exclusive cases in the following. We assume that there exists a decryption query (Nⁱ, Aⁱ, Cⁱ, Tⁱ) s. t. Tⁱ is valid. In all cases, the tag can simply be guessed correctly if the block (Xai+mii ∥ Yai+mii) is fresh. Then, the probability for the tag to be correct is 2^−τ. So, we can concentrate on the cases where it is non-fresh in the following. Prior, we define (X_i₁, X_i₂, . . . , X_{i_w+1}) as a w-chain if there exist (Y_i₁, Y_i₂, . . . , Y_{i_w+1}) s. t. the following chain has been obtained from primitive queries:

π(Xi1 ∥ Yi1)=(Ui2 ∥ Vi2)=(Ui2 ∥ Yi2) ,π(Xi2 ∥ Yi2)=(Ui3 ∥ Vi3)=(Ui3 ∥ Yi3) , ⋮π(Xiw ∥ Yiw)=(Uiw+1 ∥ Viw+1)=(Uiw+1 ∥ Yiw+1) .

The cases are:

Case (A): Nⁱ is fresh; so, there is no earlier construction query i^′ ≠ i s. t. Nⁱ = N^{i^′}.
Case (B): Nⁱ is old, but (Nⁱ, Aⁱ) is fresh, i.e., there exists no earlier construction query i^′ ≠ i with (Nⁱ, Aⁱ) = (N^{i^′}, A^{i^′}).
Case (C): (Nⁱ, Aⁱ) is old, but (Nⁱ, Aⁱ, Cⁱ) is fresh, i.e., there exists no earlier construction query i^′ ≠ i with (Nⁱ, Aⁱ, Cⁱ) = (N^{i^′}, A^{i^′}, C^{i^′}), and no w-chain of primitive queries is hit.
Case (D): (Nⁱ, Aⁱ, Cⁱ) is old; (Nⁱ, Aⁱ, Cⁱ) a prefix of another construction query.
Case (E): (Nⁱ, Aⁱ) is old and there exists a w-chain of primitive queries that is hit.

Clearly, the cases cover all possible options. We assume that no previous bad events occur, in particular, no w-multi-collisions or collisions with the primitive queries occurred.

Case (A).

We excluded bad₆ in this case. The probability that (Nⁱ ‖ K) ⊕_d d_N hits any block (Xji′ ∥ Yji′) from another construction query so that the final block is old is at most

Pr[((Ni ∥ K)⊕ddN)=(Xji′ ∥ Yji′)]≤σe2c+s .

Case (B).

Let p ≤ aⁱ + mⁱ denote the length of the longest common prefix of the i-th query with all other queries. In Case (B), the probability that any block (Xji ∥ Yji) with j ≥ p + 1 matches the permutation input of any other encryption-query block or primitive query can be upper bounded by

Pr[(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′)]≤σe2c+s+qp2c+s .

Case (C).

A similar argument as for Case (B) can be applied in Case (C). The probability that there exists i^′ ≠ i, s. t. for some block indices, it holds that (j,j′):(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′) is at most 1/2^c+s. So, it holds that

Pr[(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′)]≤σe2c+s+qp2c+s .

Case (D).

This case needs that (Xai+mi+1i ∥ Yai+mi+1i) matches the permutation input of any other encryption-query block or primitive query. The probability can be upper bounded by

Pr[(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′)]≤σe2c+s+qp2c+s .

Case (E).

Assume that (Xp+1i ∥ Yp+1i) hits a w-chain of primitive queries. Under the assumption that no other bad events occurred, the probability is at most

Pr[(Xp+1i ∥ Yp+1i)|∧i=18bad¯i]≤(mi+1)⋅w2c+s .

Over all decryption queries, we obtain

ɛ≤∑i=1qd(12τ+σe2c+s+qp2c+s+(mi+1)⋅w2c+s)≤qd2τ+qd(qp+σe)2c+s+(σd+qd)⋅w2c+s .

Our claim in Lemma 3 follows.

7 Int-RUP Analysis

We use the same notations as in Section 6 but add some. Let q_d and σ_d be the number of decryption queries and blocks over decryption queries, respectively, and q_v and σ_v the analogs for verification queries. We replace π ↞ Perm(ℬ), assume K ↞ 𝒦, and denote Π[π]_K for Oribatida with π and K.

Theorem 2

(Int-RUP Security of Oribatida). Let A be a nonce-respecting adversary w.r.t. Π[π]_K. Then

AdvΠ[π]KInt-RUP(A)≤σe22n+4σeσd+4σqp+qcqp+qp+r(σd+qd)+3rqp2c+s +qd2+(qd+qv2)2c+(qer)2τ(r-1)+3qp2k+2(qpr)2r(r-1)+2qv2τ .

Proof

The Int-RUP analysis of Oribatida follows a similar strategy as our nAE analysis. However, this time, the adversary has access to three oracles for encryption, decryption and verification. Moreover, the encryption and decryption oracles are the same in both the real and the ideal world. Both worlds differ only in the verification oracle. To alleviate the task, we replace the oracles for encryption and decryption ℰ˜[π]K , 𝒟˜[π]K with a pair of consistent pseudo-random oracles $ℰ˜[π] and $𝒟˜[π] (we define our intent of consistency for encryption and decryption in a moment). The advantage between both settings can be upper bounded by ΔA1(ℰ˜[π]K,𝒟˜[π]K,𝒱˜[π]K,π±;$ℰ˜,$𝒟˜,⊥,π±) . Note that the oracles $ℰ˜ and $𝒟˜ differ from the independent random oracles in the stronger RUPAE notion. In the RUPAE notion, they sample independently from each other without considering common prefixes between queries, which would be impossible to achieve for an on-line AE scheme. Again, we consider the H-coefficient approach. So, we define several bad events and bad as well good transcripts. If any of the bad events occurs, the adversary aborts and is defined to win. Next, we consider the probability of forgeries under those idealized oracles. So, we can exclude the previous bad events and study the probability of forgeries. Finally, we study the ratio of interpolation probabilities for good transcripts.

Sampling Consistently in the On-line Phase

This on-line phase contains much from the off-line phase of the nAE analysis. We define the ideal encryption oracle as in the nAE proof: it samples the responses (Cⁱ, Tⁱ) uniformly at random from all bit strings of expected lengths for encryption queries. The ideal decryption oracle, however, must sample plaintext outputs consistently. For this purpose, the ideal encryption oracle has to sample also the internal chaining values Vji↞{0,1}c and Uji↞{0,1}r uniformly at random for all construction queries already in the on-line phase. It stores the values of Cji , Vji , and Uji also internally, but does not release Uji and Vji in this phase.

On each input (Nⁱ, Aⁱ, Cⁱ, Tⁱ), the ideal decryption oracle looks up the length of the longest common prefix of the query p ← LCP^{Nⁱ, Aⁱ} (Cⁱ, 𝒬) with all previous queries 𝒬. For all blocks in the common prefix 1 ≤ j ≤ p, it uses the same outputs Mji that have been fixed from previous queries. Since the oracle has sampled Vp+1i for the (p + 1)-th block, it can deduce all bits not fixed from previous query outputs. Assume, i ≠ i^′, (Nⁱ, Aⁱ) = (N^{i^′}, A^{i^′}), and p = LCP^{Nⁱ, Aⁱ} (Cⁱ, C^{i^′}) where p < mⁱ, m^{i^′}. Then, Cp+1i=Cp+1i⊕Δ is the block directly after the common prefix. Sampling Vji and deriving Uji ensures consistent sampling, i.e., Mp+1i=Mp+1i′⊕Δ for all such queries with non-empty common prefix.

Starting from the (p + 2)-th block, the ideal decryption oracle samples the responses Mji↞{0,1}r , Vji↞{0,1}c and Uji↞{0,1}r uniformly and independently at random from the bit strings of expected lengths, for p + 2 ≤ j ≤ aⁱ + mⁱ. Note that queries whose ciphertext lengths are not multiples of r bits are answered consistently since the oracle samples Vji , and all bits fixed from previous queries are used. For verification queries, the ideal verification oracle always outputs ⊥. For forward primitive queries Qⁱ, the ideal oracle forwards π(Qⁱ); for backward primitive queries Rⁱ, it returns π⁻¹(Rⁱ).

Off-line phase

Here, the ideal oracle releases the internal chaining values (Uji,Vji) , after the considered adversary made all queries, but before outputting the decision bit. The ideal oracle also reveals a random key K ↞ 𝒦 then.

Bad Events

Whenever we consider a non-trivial collision between blocks or chaining values at block indices j, j^′ of two messages, we assume that at least one of them exceeds the longest common prefix.

bad₁: Non-trivial collision of permutation inputs in construction queries: ∃ (i, j) ≠ (i^′, j^′) with i, i^′ ∈ [1..q_c], j ∈ [1..mⁱ], and j^′ ∈ [1..m^{i^′}] s.t. (Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′) .
bad₂: Non-trivial collision of permutation outputs in construction queries: ∃ (i, j) ≠ (i^′, j^′) with i, i^′ ∈ [1..q_c], j ∈ [1..mⁱ], and j^′ ∈ [1..m^{i^′}] s.t. (Uji ∥ Vji)=(Uj′i′ ∥ Vj′i′) .
bad₃: Multi-collision between w tags. For some w ≥ r, there exist i₁, i₂, . . . , i_w with i₁, i₂, . . ., i_w ∈ [1..q_e], s.t. T^i₁ = T^i₂ = . . . = T^i_w.
bad₄: Non-trivial collision of permutation inputs between construction and primitive query: ∃ (i, j, i^′) with i ∈ [1..q_c], j ∈ [1..mⁱ], and i^′ ∈ [1..q_p] s.t. (Xji ∥ Yji)=Qi′ .
bad₅: Non-trivial collision of permutation outputs between construction and primitive query: ∃ i ∈ [1..q_c], j ∈ [1..aⁱ + mⁱ] and i^′ ∈ [1..q_p] s.t. (Uji ∥ Vji)=Ri′ .
bad₆: Initial-state collision with a primitive query: ∃ i ∈ [1..q_c] and i^′ ∈ [1..q_p] s.t. (X0i ∥ Y0i)=Qi′ .
bad₇: Multi-collision in the rate of w outputs of forward primitive queries: for some w ≥ r, ∃ i₁, i₂, . . ., i_w ∈ [1..q_p] s.t. msb_r(R^i₁) = . . . = msb_r(R^i_w).
bad₈: Multi-collision in the rate of w outputs of backward primitive queries: for some w ≥ r, ∃ i₁, i₂, . . ., i_w ∈ [1..q_p] s.t. msb_r(Q^i₁) = . . . = msb_r(Q^i_w).
bad₉: Forgery in decryption queries if all blocks are old: There exists some i ∈ [1..q_d] s.t. for all blocks 0 ≤ j ≤ aⁱ +mⁱ, there exist indices i^′, j^′ with i^′ ∈ [1..q_c], j^′ ∈ [1..m^{i^′}] or i^′ ∈ [1..q_p] s.t. (Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′) or (Xji ∥ Yji)=Qi′ and the tag is valid: msbτ(π(Xai+mii ∥ Yai+mii))=Ti .

We define BADT to contain exactly the attainable transcripts τ for which at least one bad events occurred. All other attainable transcripts are in GoodT. Then Pr[Θideal∈BadT]≤∑i=19Pr[badi] . The probability of bad transcripts in the ideal world is treated in the proof of Lemma 4. The ratio of obtaining a good transcript is bounded in Lemma 5. Our bound in Theorem 1 follows from those and the fundamental Lemma of the H-coefficient Technique [54]. We apply w = r in the bound of Lemma 4.

Lemma 4

Let w ≥ r be a positive integer. It holds that

Pr[Θideal∈BadT]≤σe22n+3σeσd+3σqp+qcqp+qp+w(σd+qd)+3wqp2c+s +qd2+(qd+qv2)2c+(qew)2τ(w-1)+3qp2k+2(qpw)2r(w-1)+qv2τ .

Proof

In the following, we upper bound the probabilities of the individual bad events. For most of them, we differentiate between encryption and decryption queries.

bad₁: Collision of two permutation inputs in construction queries.

Among encryption queries only: Here, it holds that
Pr[(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′)]≤2-n .
Since there exist (σe2) block combinations, we obtain (σe2)/2n .
Dec-then-Enc: If we consider an encryption query block to collide with a block from a previous decryption query, the probability is at most 2^−(c+s) since A can see r − s bits that it can use as the nonce. We have q_eσ_d combinations of such blocks. For the remaining σ_eσ_d blocks, the probability is 2⁻ⁿ.
Among decryption queries only: w.l.o.g., we consider the first such collision. If A modifies the nonce in the later following query, the bound is the same as for encryption-only queries. So, we assume in the remainder of that the later query is a decryption query. Let j − 1 be the first modified block and assume it is in the message-processing part. If the block indices differ j ≠ j^′, the probability is 2^−(c+s). Otherwise, assume j = j^′ and Aⁱ = A^{i^′}. Then, the permutation output (Uji′ ∥ Vji′) is sampled randomly in the ideal world. If A leaves Cj-aii=Cj-aii′ , it automatically holds that
Xji=Cj-aii⊕lsbs(Vρ(i,j-ai)i)=Xji′=Cj-aii⊕lsbs(Vρ(i,j-ai)i) .
So, Xji=Xji′ . With probability 2^−c, it also holds for the capacity Vj+1i=Vj+1i′ and thus Yj+1i=Yj+1i′ . Note that this approach holds only for the first differing block, for which which yields a term of (qd2)/2c . If the collision does not hold, the masks beginning for the (j + 2)-th block will differ and the probability decreases to 2^−(c+s), which produces a term of (σd2)/2c+s .
Enc-then-Dec: It remains to consider collisions between an encryption query, followed by a decryption query. If the block indices j ≠ j^′ differ, the probability is again 2^−(c+s), for at most σ_e · σ_d combinations. Otherwise, if j = j^′, A can apply the strategy above for a collision. Then, the probability is 2^−c; though, the q_d queries can collide at most with one encryption query each since we consider the first collision, producing a term of q_d/2^c.

Over all cases, we obtain

Pr[bad1]≤(σe2)2n+max(qeσd2c+s+σeσd2c+s+qd2c)+(σd2)2c+s+(qd2)2c≤(σe2)2n+σeσd2c+s+(qd2)2c .

bad₂: Collision of two permutation outputs in encryption construction queries.

This case is analogous to bad₁. Over all combinations of indices, it follows that

Pr[bad2]≤(σe2)2n+σeσd2c+s+(qd2)2c .

bad₃: Multi-collision on w tags from encryption queries.

Since the tags are sampled uniformly and independently at random in the ideal world, it holds that

Pr[Tj1i1=Tj2i2=…=Tjwiw]≤(qew)2τ(w-1) .

bad₄: Collision of permutation inputs between a construction and a primitive query.

Assume, the primitive query was asked before the construction query. If the construction query was in encryption direction, the collision probability for fixed queries is at most 2⁻ⁿ, for q_p · q_c combinations.
Otherwise, if the construction query was a decryption query, A can see r − s bits. Hence, the probability is at most 2^−(c+s), for q_p · q_c combinations.
The same argument can be applied in the case when the primitive query was asked after a decryption query. Then, the adversary can see r − s unmasked bits of the rate from Cji . Again, the probability is at most 2^−(c+s) and we have q_p · q_c combinations.
If the primitive query was asked after an encryption query, then, the latter produced a tag. If the primitive query targets any other block, the argument is the same as in Case c). If the primitive query starts from the tag, the adversary sees τ − s unmasked bits. Assuming bad¯3 , there are at most w equal tags over all encryption queries. So, the probability for a collision is 2^−(c+s), for w · q_p combinations.

Over all combinations of indices, it follows that

Pr[bad4|bad¯3]≤max(σe⋅qp2n,σe⋅qp2c+s)+σd⋅qp2c+s+w⋅qp2c+s≤σ⋅qp2c+s+w⋅qp2c+s .

bad₅: Collision of permutation outputs between a construction and a primitive query.

Again, Uj+aii can be derived from Mji⊕Cji⊕slsbs(Vρ(i,j)i) ; the values Cji and Vρ(i,j)i are sampled randomly. This case is similar as bad₄. Over all index combinations

Pr[bad5|bad¯3]≤max(σe⋅qp2n,σe⋅qp2c+s)+σd⋅qp2c+s+w⋅qp2c+s≤σ⋅qp2c+s+w⋅qp2c+s .

bad₆: Initial-state collision with a primitive query.

Here, we distinguish between the cases whether the construction query was asked before or after the primitive query and whether the primitive query was in forward or backward direction.

Assume, the primitive query was asked after the construction query. If the primitive query was a forward query, it must hit the correct value of K ⊕_d d_N. This probability is at most q_p/2^k to collide when considering encryption construction queries. Considering also decryption queries, a nonce can repeat often; though, the initial state can take three different values for the same nonce, namely if the decryption query changes the length of associated data and message, affecting d_N. Since there exist at most three distinct values for d_N, the probability to collide is at most 3q_p/2^k.
If the primitive query was in backward direction, its response must hit any initial state of a construction query. If the construction query was asked before the primitive, A sees at best r − s bits of C₁. Then, the probability is at most q_c · q_p/2^c+s.
If the primitive query was asked before the construction query, A can use the nonce part of the primitive query’s result as the nonce. However, a collision must hit the key part, which holds with probability at most 3q_p/2^k.
If the primitive query was in backward direction, A sees at best r − s bits of C₁. Then, there is at most one starting state, assuming bad¯1 , which yields q_p/2^c+s.

Over all possible options, we obtain

Pr[bad6|bad¯1]≤3qp2k+max(qc⋅qp2c+s,qp2c+s)≤3qp2k+qc⋅qp2c+s+qp2c+s .

bad₇: Multi-collision in the rate of w outputs of forward primitive queries.

Since π is chosen randomly from the set of all permutations, the outputs are sampled uniformly at random from a set of size at least 2ⁿ − (i − 1) for the i-th query. So, the probability for w distinct queries to collide in their rate is upper bounded by

2c-12n-1⋅2c-22n-2⋅⋯⋅2c-(w-1)2n-(w-1)=∏i=1w-12c-i2n-i≤(2c2n)w-1=2-r(w-1) .

Over all primitive query indices, it holds that

Pr[bad7]≤(qpw)2r(w-1) .

bad₈: Multi-collision in the rate of w outputs of backward primitive queries.

Using a similar argumentation as for bad₇, we obtain

Pr[bad8]≤(qpw)2r(w-1) .

bad₉: Forgeries if all blocks are old.

It remains to bound the probability of a successful forgery of a verification query (Nⁱ, Aⁱ, Cⁱ, Tⁱ) s. t. Tⁱ is valid and where each block is old.

We consider the same five mutually exclusive cases as in the nAE proof. In all cases, the tag can simply be guessed correctly if the block (X^{aⁱ +mⁱ} ‖ Y^{aⁱ +mⁱ}) is fresh. Then, the probability for the tag to be correct is upper bounded by 2^−τ. We adopt the cases and the notions from the nAE proof and assume that no previous bad events occur, in particular no w-multi-collisions described earlier or collisions with primitive queries.

Case (A): Nⁱ is fresh; so, there is no earlier construction query i^′ ≠ i s. t. Nⁱ = N^{i^′}.
Case (B): Nⁱ is old, but (Nⁱ, Aⁱ) is fresh, i.e., there exists no earlier construction query i^′ ≠ i with (Nⁱ, Aⁱ) = (N^{i^′}, A^{i^′}).
Case (C): (Nⁱ, Aⁱ) is old, but (Nⁱ, Aⁱ, Cⁱ) is fresh, i.e., there exists no earlier construction query i^′ ≠ i with (Nⁱ, Aⁱ, Cⁱ) = (N^{i^′}, A^{i^′}, C^{i^′}), and no w-chain of primitive queries is hit.
Case (D): (Nⁱ, Aⁱ, Cⁱ) is old; (Nⁱ, Aⁱ, Cⁱ) is a prefix of another construction query.
Case (E): (Nⁱ, Aⁱ) is old and there exists a w-chain of primitive queries that is hit.

Clearly, the cases cover all possible options. We assume that no previous bad events occur, in particular no w-multi-collisions described earlier or collisions with primitive queries.

Case (A)

We excluded bad₄, i.e., collisions of permutation inputs between construction and primitive queries in this case. The probability that (Nⁱ ‖ K) ⊕_d d_N hits any block (Xji′ ∥ Yji′) from another construction query so that the final block is old is at most

Pr[((Ni ∥ K)⊕ddN)=(Xji′ ∥ Yji′)]≤σ+qp2c+s .

Cases (B)–(D)

Let p ≤ aⁱ + mⁱ denote the length of the longest common prefix of the i-th query with all other queries. The probability that any block (Xji ∥ Yji) with j ≥ p + 1 matches the permutation input of any other query block or primitive query can be upper bounded analogously as bad₁ and bad₄:

Pr[(Xji ∥ Yji)=(Xj′i′ ∥ Yj′i′)]≤σe+qd2c+s+σe⋅σd2c+s+(qd2)2c+σ⋅qp2c+s+w⋅qp2c+s .

Over all verification queries, we obtain

Pr[bad9|∧i=18bad¯i]≤qv2τ+σe⋅σd2c+s+(qd+qv2)2c+σ⋅qp2c+s+w⋅qp2c+s+w⋅(σd+qd)2c+s .

Case (E)

Assume that (Xp+1i ∥ Yp+1i) hits a w-chain of primitive queries. Under the assumption that no other bad events occurred, the probability is at most

Pr[(Xp+1i ∥ Yp+1i)]≤(mi+1)⋅w2c+s .

Over all verification queries, we obtain

Pr[bad9|∧i=18bad¯i]≤qv2τ+σe⋅σd2c+s+(qd+qv2)2c+σ⋅qp2c+s+w⋅qp2n-τ+w⋅(σd+qd)2c+s .

Our bound in Lemma 4 follows from summing up all probabilities.

Lemma 5

Let τ ∈ GoodT. Then

Pr[Θreal=τ]Pr[Θideal=τ]≤1-(qd2τ+σd(σe+qp)2c+s) .

Proof

It remains to bound the ratio of the probabilities for obtaining a good transcript τ in the real and the ideal world, respectively. The bound is similar to that of Lemma 3. The difference to the nAE proof is that the ideal decryption oracle also generates pseudorandom output blocks Mji beyond the longest common prefix. The nAE transcript also contained the sampled internal values, as does the transcript τ here. Since we assume that no bad events have occurred, we revisit the following cases for forgeries:

Case (A): The final input to π, (Xai+mii ∥ Yai+mii) is fresh, i.e., has not occurred before. Then, the probability that the authentication tag τⁱ is valid is at most 1/2^τ.
Case (B): The final input to π, (Xai+mii ∥ Yai+mii) is old, but there exists some block index j ∈ [1..aⁱ + mⁱ] s. t. (Xji ∥ Yji) is fresh. Since the input is old, the probability that the result of any of the next blocks is old is at most (σ+qp)2c+s .
Case (C): There exists no j ∈ [1..aⁱ + mⁱ] s. t. (Xji ∥ Yji) is fresh. The probability that all of those blocks are old is at most mi(σe+qp)2c+s .

It follows that

ɛi≤12τ+σe+qp2c+s+mi(σe+qp)2c+s .

Over all indices i ∈ [1..q_d], it follows that

ɛ≤∑i=1qdɛi≤qd2τ+σd(σe+qp)2c+s .

Our claim in Lemma 5 follows.

8 Comparison with Lightweight Int-RUP-secure Schemes

Among the submissions to the NIST lightweight competition [53], ESTATE [27], LAEM [61], LOTUS-AEAD and LOCUS-AEAD [24] claimed security in the Int-RUP model. Among these modes, ESTATE, LOTUS-AEAD, and LOCUS-AEAD were elected into the second round. This section compares our proposal to those; Table 4 gives a summary.

Table 4

Comparison of Oribatida with further Int-RUP-security claiming submissions to the NIST lightweight competition. n/t = block/tweak length of the primitive, m = #message segments, sec. = security, IF = inverse-free, •/– = feature is present/absent.

Construction	Sizes (bits)							Security		Features
Construction	\|N\|	\|K\|	\|T\|	n	t	State	Rate	nAE	Int-RUP	1-pass	IF
Oribatida-192 (v1.2) [21]	64	128	96	192	0	288	96	89	48	•	•
Oribatida-256 (v1.2) [21]	128	128	128	256	0	320	128	121	64	•	•
Oribatida v1.3-192 [This work]	64	128	96	192	0	288	96	121	48	•	•
Oribatida v1.3-256 [This work]	128	128	128	256	0	320	128	121	64	•	•

ESTATE [27]	128	128	128	128	4	260	64	64	64	–	•
LOCUS-AEAD [24]	128	128	64	64	4	324	32	64	64	•	–
LOTUS-AEAD [24]	128	128	64	64	4	384	32	64	64	•	•

8.1 Brief Description

ESTATE follows SIV [57]: the associated data and message are authenticated using a variant of CBC-MAC with a tweakable block cipher before the tag is used as an initial vector of CBC-like encryption. The intermediate values are used as keystream and added to the message blocks. LOCUS-AEAD and LOTUS-AEAD employ a variant of PMAC [23] to process the associated data with the tweakable block cipher. For encryption, LOTUS-AEAD uses a variant of OTR [50], a two-round, two-branch Feistel structure to process the message in double blocks. LOCUS-AEAD employs an encryption similar to OCB [56] and EME/EME* [38]. Both LOCUS-AEAD and LOTUS-AEAD employ a single pass over the message for encryption, but two calls to the primitive per message block. The intermediate values are summed to the associated-data hash and the final message block; the encrypted sum yields the tag.

8.2 Eflciency

Oribatida processes 96- or 128-bit message blocks per primitive call, whereas the size of the message processed in one primitive call is 64 bits for ESTATE and 32 for LOTUS-AEAD and LOCUS-AEAD. Thus, Oribatida offers higher throughput; moreover, the state size of Oribatida (288 and 320 bits, respectively) is smaller than those of LOTUS-AEAD (388 bits) and LOCUS-AEAD (324 bits). ESTATE has a state size of 260 bits; all three must process the message with two calls to the primitive. LOCUS-AEAD requires the inverse operation of the underlying block cipher to be available for the decryption. In sum, Oribatida possesses a smaller state size than LOCUS-AEAD and LOTUS-AEAD, and higher nAE security, as well as a higher rate, compared to its Int-RUP-secure competitors.

8.3 Security

All three competitors are based on tweakable block ciphers, with Int-RUP claims limited by the birthday bound of the internal primitive. ESTATE inherits Int-RUP security until the birthday bound from SIV, which has been considered in [7, Sect. 6.2]. While LOCUS-AEAD and LOTUS-AEAD share similarities to OCB and OTR, they use intermediate checksums as in EME designs in the tag-generation process. Informally, modifying any message block will result in new pseudorandom internal values and therefore a pseudorandom input to the tag computation.

9 Discussion of the Updated Variant Oribatida v1.3

This section discusses the update from Oribatida (v1.2) from [21] to Oribatida v1.3 in this work, that addresses the observation by Rohit and Sarkar [59] in a straight-forward manner. Here, we briefly discuss only the differences:

Oribatida v1.2 released the tag without masking. As a consequence, the adversary has seen the full rate and had to guess only the n − τ-bit hidden part to be able to invert the encryption process. To succumb this attack, Oribatida v1.3 masks the tag such that the adversary sees τ − s bits if s ≤ τ, which restores the complexity from q/2^n−τ to q/2^c+s. Figure 4 illustrates both tag-generation processes for comparison. The masking of the authentication tag is performed exactly as for ciphertext blocks, which streamlines this process.
Oribatida v1.2 employed two permutations P and P′, where the latter was intended to be a more efficient variant of the former. In practice, P′ was instantiated with a round-reduced version of P, which was only used for processing intermediate blocks of associated data. This was fine since an upper bound on the probability of differentials was sufficient for security and not pseudo-randomness. Oribatida v1.3 unified the process and uses P at every location.
Oribatida v1.2 used a different starting value V₀ for masking the ciphertexts when the associated data was empty and V₁ otherwise. The reason was simply efficiency since empty associated data did not yield a value V₁. In contrast, Oribatida v1.3 always pads the associated data such that there always exists an intermediate value V₁ that is not used as capacity in the message-processing step. This decision implies a slightly lower throughput for empty associated data but adds unification.
As a result of Aspect (3), Oribatida v1.3 uses slightly different and more domains to properly address also the additional case when the associated data was empty.

Figure 4

Tag generation of Oribatida.

Figure 5

Security of Oribatida using q_c = 2⁵⁰.

Among all changes, only Aspect (1) is crucial. All further aspects helped unify the design. The security effect of the additional tag masking is illustrated in Figure 5 for the maximum number of q_c = 2⁵⁰ construction queries as in the NIST guidelines. One can observe that it salvages the nAE security of the 192-bit version of Oribatida v1.3. Note that the figure cannot illustrate that many primitive (offiine) queries to the permutation are in practice much easier to obtain than construction queries.

10 Instantiation of Oribatida

This section specifies the permutation SimP. From a high-level view, SimP is a variant of the domain extender Ψ_r by Coron et al. [31]. We define SimP to use a round-reduced variant of the Simon [10] block cipher and its key schedule through four such steps. We briefly recall Ψ_r before we describe the details of Simon, provide an overview of existing cryptanalysis, and close with a discussion of the implications on SimP.

10.1 The Ψ_r Domain Extender

The Ψ_r family is a two-branch Feistel-like network that consists of r calls to (pairwise independent) block ciphers. An illustration of Ψ₄ is given at the top of Figure 6. Let BlockCipher(𝒦, ℬ) denote the set of all block ciphers with key space 𝒦 and block space ℬ. For Ψ_r, π1,π2,…,πr∈BlockCipher(𝔽2n×𝔽2n,𝔽2n) are independent block ciphers which use one branch Rⁱ as state input, and the other one, Lⁱ, as secret key. Coron et al. provide statements on the indifferentiability of their constructions.

$Figure 6 Top: The construction Ψ4 [31]. The blocks πi denote block ciphers over 𝔽2n {\rm{\mathbb F}}_2^n with key space 𝔽2n {\rm{\mathbb F}}_2^n . Bottom: High-level view of the construction Φ4 as a variant of Ψ4. The blocks φi represent the key schedules that produce the subkeys and which are externalized from the block ciphers πi in Φ4. φi feeds the subkeys to πi and outputs the final subkey Krs to become the next value Rirs.$

Figure 6

Top: The construction Ψ₄ [31]. The blocks π_i denote block ciphers over 𝔽2n with key space 𝔽2n . Bottom: High-level view of the construction Φ₄ as a variant of Ψ₄. The blocks φ_i represent the key schedules that produce the subkeys and which are externalized from the block ciphers π_i in Φ₄. φ_i feeds the subkeys to π_i and outputs the final subkey K^r_s to become the next value R^ir_s.

Theorem 3 ([31])

Let π1,π2,π3↞BlockCipher(𝔽2n) be pairwise independent permutations over 𝔽2n . The three-step construction Ψ₃ with an ideal block cipher is (t_D, t_S, q, ε)-indifferentiable from an ideal cipher with t_S = O(qn) and ε = 5q²/2ⁿ.

Intuitively, it follows that a four-step construction with a fourth independent permutation π₄ ↞ BlockCipher(𝒦, 𝔽2n ) inherits at least the security of the three-step construction.

10.2 Φ_r: A Variant of Ψ_r That Includes The Key Schedule

The Ψ_r construction has to store the state that is transformed through the block cipher π_i’s state transformation, plus the key of the current step. Internally, the block ciphers π_i also have to expand the secret key to subkeys that add to the total memory requirement. We propose a variant that avoids the need to store the current secret key input. For this purpose, we define the key-schedule permutation φi:𝔽2n→𝔽2n that takes an initial key K as input and outputs the subkeys K⁰, . . . , K^r_s for fixed number of rounds r_s of π_i. An illustration is given at the bottom of Figure 6. Hereafter, we call the construction Φ_r when it consists of r steps in total. Note that Φ_r omits the final swap of the halves for simplicity and since it does not affect the security.

10.3 Simon

The Simon family of block ciphers [10] belongs to the lightest block ciphers in terms of hardware area and energy efficiency. Its round function consists of only an XOR, three bit-wise rotations, and a bit-wise AND, which renders it particularly lightweight and flexible. Moreover, Simon has been analyzed intensively since its proposal; among others, e.g., [3, 29, 44, 55, 64] studied the security of Simon-96-96 and Simon-128-128. Considerably more works targeted the smaller-state variants of Simon, which has recently been standardized as part of ISO/IEC 29167-21:2018 [40]. For concreteness, Simon-96-96 uses a word size w = 48 bits and employs 52 rounds, whereas Simon-128-128 uses w = 64 bits and 68 rounds.

10.4 The SimP-n-θ Family of Permutations

SimP is an instantiation of Φ₄ that tries to adhere to the standard as close as possible, SimP-192 employs the round-reduced Simon-96-96 as π and its key schedule as φ. To form a 256-bit permutation, SimP-256 uses Simon-128-128 with its key schedule. One iteration of the round function of Simon-2w-2w and its key-update function side by side, as is used in SimP-n, is illustrated in Figure 7. Internally, the state of SimP-n-θ consists of four w-bit words (X0i,X1i,X2i,X3i) , where the superscript index i indicates the state after Round i. We denote by r_s the number of rounds per step, and index the steps from 1 to θ, and the rounds from 1 to θ · r_s. The plaintext is denoted as (X00,X10,X20,X30) ; the ciphertext is given as (X0θrs,X1θrs,X2θrs,X3θrs) .

After Round r_s, the state halves (X0rs,X1rs) and (X2rs,X3rs) are swapped; similarly, they are swapped also after Round 2r_s, . . . , θr_s. One round of the permutation is illustrated in Figure 7. Thus, SimP-192-θ uses Simon-96-96 and consists of four 48-bit words. SimP-256-θ employs the round function and the key-update function of Simon-128-128 as a 256-bit permutation. For SimP-256-θ, the state consists of four 64-bit words.

Figure 7

One iteration of the round function of SimP, which is equivalent to the key-update function (left) and the state-update function (right) of Simon-2w/2w, where w is the word size.

10.4.1 Round Function

Let w be a positive integer for the word size. for SimP-192, w = 48 bits; for SimP-256, w = 64 bits. Let f : 𝔽_2^w → 𝔽_2^w and g : 𝔽_2^w → 𝔽_2^w be defined as

f(x)=def(x⋘8)∧((x⋘1)⊕(x⋘2))andg(x)=def(x⋙3)⊕(x⋙4) .

10.4.2 Key-update Function

Let φ_j : (𝔽_2^w)² → (𝔽_2^w)², for 1 ≤ j ≤ θ be key-update functions. Let ℓ = (j − 1) · r_s. On input (X0𝓁,X1𝓁) , it derives r_s keys (X0𝓁+i,X1𝓁+i) , for 1 ≤ i ≤ r_s, as

X0𝓁+i←X1𝓁+i-1⊕g(X0𝓁+i-1)⊕c⊕z𝓁+i-1 and X1𝓁+i←X0𝓁+i-1 ,

for 1 ≤ i ≤ r_s. Note that c = 0xff...ffc is a w-bit constant.

10.4.3 State-update Function

We define the state-update function as π : (𝔽_2^w)^r_s × (𝔽_2^w)² → (𝔽_2^w)², where the first input considers the expanded subkeys. Let ℓ = (j − 1) · r_s. It takes r_s round keys (X0𝓁,…,Xrs-1𝓁) as key input, as well as (X2𝓁,X3𝓁) as state input, and computes (X2𝓁+rs,X3𝓁+rs) recursively as:

X2𝓁+i←f(X2𝓁+i-1)⊕X3𝓁+i-1⊕X1𝓁+i-1 and X3𝓁+i←X2𝓁+i-1 ,

for 1 ≤ i ≤ r_s.

10.4.4 Step Function

Let ρj:𝔽2w4→𝔽2w4 denote the step function, for 1 ≤ j ≤ θ. Define Li=(X0i,X1i) and Ri=(X2i,X3i) . The step transforms (Li,Ri)=(X0i,X1i,X2i,X3i) into (X0i+rs,X1i+rs,X2i+rs,X3i+rs) as

(Lrs,Rrs)=(X0i+rs,X1i+rs,X2i+rs,X3i+rs)ρj(X0i,X1i,X2i,X3i)=def(πj(X2i,X3i),φj(X0i,X1i)),

for 1 ≤ j < θ. One exception is the final step ρ_θ, which omits the final swap of the halves:

ρθ(X0i,X1i,X2i,X3i)=def(φθ(X0i,X1i),πθ(X2i,X3i)).

SimP-n-θ takes a plaintext (X00,X10,X20,X30) as input and outputs (Lr,Rr)=(X0r,X1r,X2r,X3r) , with r = θr_s as ciphertext.

10.4.5 Round Constants

The round constants are those of Simon-96-96 and Simon-128-128 [10], respectively. It holds that c = 0xff...ffc, i.e., all w bits except for the least significant two bits are 1. More precisely, for w = 48, it holds that

c=(1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1100)2.

For w = 64, it holds that c =

(1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1100)2.

For both SimP-192 and SimP-256, the constants z = z₀z₁. . . z₆₁ are defined as

z=(10 1011 1101 1100 0000 1101 0010 0110 0010 1000 0100 0111 1110 0101 1011 0011)2.

The sequence has a period of 62, so z_i = z_{i mod 62}, for non-negative integers i. Note that the order of the bits z_i is reversed.

10.4.6 Number of Steps θ

We consider only the choice of θ = 4 everywhere in our proposed construction.

10.4.7 Number of Rounds

SimP-192-4 consists of r_s = 26 rounds for each step, and therefore performs r = 4 · r_s = 104 rounds in total. SimP-256-4 consists of r_s = 34 rounds for each block, and therefore performs r = 4 · r_s = 136 rounds in total. For simplicity, we also denote SimP-n-4 as SimP-n. The algorithm for SimP-n-θ is given in Algorithm 3.

Algorithm 3

Specification of the encryption and decryption algorithms of SimP-n-θ.

101:	function SimP-n-θ(M)
102:	(X00,X10,X20,X30)←wM
103:	for i ← 1..θ do
104:	for j ← 1..r_s do
105:	ℓ ← (i − 1) · r_s + j
106:	X0𝓁←X1𝓁-1⊕g(X0𝓁-1)⊕c⊕z𝓁-1
107:	X1𝓁←X0𝓁-1
108:	X2𝓁←X3𝓁-1⊕f(X2𝓁-1)⊕X1𝓁-1
109:	X3𝓁←X2𝓁-1
110:	if i ≠ θ then
111:	(X0𝓁,X1𝓁,X2𝓁,X3𝓁)←swap(X0𝓁,X1𝓁,X2𝓁,X3𝓁)
112:	C←(X0θrs ∥ X1θrs ∥ X2θrs ∥ X3θrs)
113:	return C
121:	function f (X)
122:	return ((X ⋘ 1) ∧ (X ⋘ 8))
123:	⊕(X ⋘ 2)
131:	function g(X)
132:	return (X ⋙ 3) ⊕ (X ⋙ 4)
141:	function swAP(X₀, X₁, X₂, X₃)
142:	return (X₂, X₃, X₀, X₁)

10.4.8 The Byte Order in Oribatida

For the sake of clarity, Figure 8 visualizes the byte and word order of the inputs. Let SB denote the state S in bytes; for more clarity, we further write this ordering in type-writer font. The rate consists of the first r/8 bytes of the state: SB [0], ..., SB[r/8 - 1]. The capacity represents the last c/8 bytes SB[r/8], ..., SB[n/8 - 1]. Similarly, the rate of the state consists of the first words of the permutation input. If the state is interpreted as an n-bit value, the initial Byte 0 contains the most significant eight bits: SB[0] = (S[n − 1], S[n − 2], . . . , S[n − 8]). On the other side, the least significant eight bits are stored in Byte SB[n/8 - 1]: SB[n/8 - 1] = (S[7], S[6],. . . , S[0]).

So, the rate is used first as input to the key-update function; the capacity is used as input to the state-update function.

Figure 8

Byte and word orientation of inputs into and outputs from SimP as used in Oribatida.

Remark 2

Instantiating a scheme proven in an idealized model such as indifferentiability with a symmetric-key primitive is almost always a heuristic: there simply exist few provably secure instantiations. Using the full Simon-2w-2w for each step would be an option for a more secure, but considerably less performant scheme. Concerning SimP, our approach follows the prove-then-prune strategy from AEZ [39]. However, after replacing each step by at least half of the number of rounds, and always using four steps, our approach is far less aggressive than it, as outlined above, and seems to provide a sufficient security margin.

11 Security of SimP

The number of steps and rounds of SimP was chosen to resist known cryptanalysis techniques. This section provides a rationale for our choices from the existing works.

11.1 Requirements

Oribatida with a random permutation aims at nAE security of O(rσ_d/2^c+s) and Int-RUP security of O(qd2/2c) in the ideal-permutation model. The advantage of those bounds should be much higher than the complexity to recover or predict the key. An instantiation of P must be free of distinguishing properties that allow us to distinguish it from a random permutation with non-negligible advantage and << 2ⁿ queries. This strengthens the adversary compared to the use of P in Oribatida. There, it can inject nonce, associated data, or message blocks only into the rate and can observe ciphertext and tag outputs also only from that part, but masked. Concretely, we require from P the absence of (truncated, higher-order) differential characteristics with probability ≥ 2⁻ⁿ, linear approximations with squared correlation ≥ 2⁻ⁿ, or component functions of degree < n in SimP-4. Moreover, we require the absence of impossible-differential, zero-correlation, or integral distinguishers in SimP-4. However, we disregard rebound or other forms of inside-out attacks that are inapplicable in Oribatida, or splice-and-cut attacks when using SimP as a compression function.

11.2 Existing Cryptanalysis on Simon

Various works analyzed the Simon family of block ciphers since its proposal.

11.2.1 Differential Cryptanalysis

Cryptanalysis that appeared early after the proposal of Simon followed mainly heuristics for differential cryptanalysis: Abed et al. [3] followed a heuristic branch-and-bound approach that yielded differentials for up to 30 rounds of Simon-96. Biryukov et al. [22] studied more efficient heuristics, but considered the small variants with state sizes up to 64 bits. Dinur et al. [33] showed that distinguishers on Simon with k key words can be extended by at least k rounds. Interestingly, boomerangs seemed to be less a threat to Simon-like ciphers than pure differentials.

Kölbl et al. [42] redirected the research focus to the search for optimal characteristics. More recently, Liu et al. [44] employed a variant of Matsui’s algorithm [46] to find optimal differential characteristics. They found that characteristics with probability higher than 2⁻⁹⁶ covered at most 27 rounds. Moreover, they found at best 31-round differentials with an accumulated probability higher than 2⁻⁹⁶, i.e., of probability 2^−95.34. For Simon-128, they showed that optimal differential characteristics covered at most 37 rounds and found 41-round differentials with a cumulative probability of 2^−123.74.

11.2.2 Linear Cryptanalysis

Linear cryptanalysis is similarly effective for Simon-like ciphers as its differential counterpart. Alizadeh et al. [1, 4] reported multi-trail linear distinguishers on all variants of Simon. For Simon-96-96, they proposed a distinguisher on up to 31 rounds that could be extended by two rounds in a key-recovery attack. Similarly, they reported a 37-round distinguisher for Simon-128-128 that could be extendable by two rounds. Chen and Wang [29] proposed improved key-recovery attacks with the help of dynamic key guessing. To the best of our knowledge, their attacks are the most effective ones for our considered variants in terms of the number of covered rounds, with up to 37 rounds of Simon-96-96 and up to 49 rounds of Simon-128-128 in theory.

Similar as for differentials, Liu et al. studied also optimal linear approximations [45]. They found that the optimal linear approximations can reach at most 28 rounds for Simon-96, and at most 37 rounds for Simon-128. Moreover, they determined linear hulls with potential of 2^−93.8 for 31 rounds of Simon-96, and 2^−123.15 for 41 rounds of Simon-128.

11.2.3 Integral, Impossible-differential, and Zero-correlation Distinguishers

Integral attacks cover at most 22 rounds for Simon-96-96 and 26 rounds of Simon-128-128. Initially, Zhang et al. [65] found integral distinguishers on up to 21 and 25 rounds for Simon-96 and Simon-128. Their results were extended by one round each by Xiang et al. [64], and later by Todo and Morii [62]. The latter could show the absence of integrals for 25-round Simon-96, which was confirmed by Kondo et al. [43].

The maximal number of rounds that impossible-differential and zero-correlation distinguishers can cover is given by at most twice the length of the maximal diffusion. From the results by Kölbl et al. [42], full diffusion is achieved by 11 rounds for Simon-96 and 13 rounds for Simon-128-128. So, impossible-differential and zero-correlation distinguishers can cover at most 22 and 26 rounds in the single-key setting.

11.2.4 Related-key Distinguishers

Kondo et al. [43] searched for iterative key differences in Simon. This allowed them to extend previous results by four to 15 rounds. For Simon-96-96, the authors found iterative key differentials for up to 20 rounds. It remains unclear if this yields an impossible differential; in the best case, a key-iterated 20-round distinguisher could be extended by 2 + 2 + 2 wrapping rounds: two more blank rounds where one key word is not used, plus two rounds where the key difference can be canceled by the state differences, plus two outermost rounds since the result of the non-linear function is independent of the key and therefore predictable in Simon. So, an impossible-differential distinguisher could cover up to 26 rounds. Though, such an upper bound has not been formulated to an attack on the here-considered versions by Kondo et al.; therefore, it is not contained in the overview in Table 6.

Table 5

Parameters of SimP.

Variant	Word size (w)	#Steps (θ)	#Rounds/Step (r_s)
SimP-192-4	48	4	26
SimP-256-4	64	4	34

Table 6

Existing results of best distinguishers and best key-recovery attacks on Simon-96 in the single-key setting. – = not given; Pr. = probability; Pot. = linear potential; Deg. = degree

Type	#Rounds	Time	Data	Pr./Pot./Deg.	Ref
Simon-96-96 Distinguishers
Algebraic	14	–	20 CPs	–	[55]
Integral	22	2⁹⁵	2⁹⁵ CP	95	[64]
Differential	30	–	–	92.2	[3]
Differential	31	–	–	95.34	[44]
Linear	31	–	–	93.8	[45]

Simon-96-96 Key-recovery Attacks
Multiple Linear	33	2^94.42	2^94.42 KP	94.42	[2]
Linear Hull	37	2^88.0	2^95.2 KP	95.2	[29]

Simon-128-128 Distinguishers
Algebraic	16	–	20 CP	–	[55]
Integral	26	2¹²⁶	2¹²⁶ CP	126	[64]
Linear	37	–	–	128	[2]
Differential	41	–	–	123.74	[44]
Linear Hull	41	–	–	123.15	[45]

Simon-128-128 Key-recovery Attack
Linear Hull	49	2^127.6	2^127.6 CP	126.6	[29]

11.2.5 Algebraic Cryptanalysis

Algebraic attacks are unlikely to be a threat to Simon-like constructions for sufficiently many rounds. Raddum [55] pointed out that the large number of rounds is necessary. He demonstrated that the equation systems for up to 14 rounds of Simon-96-96 and up to 16 rounds of Simon-128 can be solved efficiently in a few minutes on an off-the-shelf laptop. Extensions to considerably more rounds are still unknown.

11.2.6 Meet-in-the-Middle Attacks

Meet-in-the-middle attacks are successful primarily on primitives that do not use parts of the key in sequences of several rounds. The Simon-2w-2w versions use every key bit in each sequence of two subsequent rounds, which limits the chances of meet-in-the-middle attacks drastically. Considering 3-subset meet-in-the-middle attacks, together with an initial structure and partial matching, the length of an attack is limited to roughly that of twice the full diffusion plus four rounds plus the maximal length of an initial structure plus two rounds for a splice-and-cut part, which yields 30 rounds as a rough upper bound. It is unlikely that such attacks cover 30 or more rounds on Simon-2w-2w.

11.2.7 Correlated Sequences

An interesting recent direction may be correlated sequences introduced by Rohit and Gong in [58]. Their technique requires only very few texts and claims to break 27 rounds of Simon-32 and SIMECK-32; thus, it might outperform all previous attacks by at least three rounds. Though, that approach needs further investigation and has seen application only to Simon-32-64 until now.

11.3 Implications to SimP

Since the key schedule of Simon is fully linear, the two state words that are transformed by the key schedule allow the prediction of differences, linear and algebraic properties through a full step. In any case, SimP transforms each input word through at least 2r_s rounds of Simon.

11.3.1 Related-key Differential Cryptanalysis

SimP needs cryptanalysis of related-key differential and linear characteristics. Existing methods such as the exhaustive search in [44] or SAT solvers [42], render such studies difficult due to the large state size since the known tools cannot scale appropriately. There exist peer-reviewed related-key results on Simon, e.g., by Wang et al. [63]. For the sake of feasibility, they restricted their search to related-key trails for the small variants, i.e., Simon-32, Simon-48, and Simon-64.

We conducted experiments using the SAT-based approach from [42] as well as with the branch-and-bound approach from [44] to search for optimal differential characteristics on SimP. Though, the related-key analysis of Simon-like constructions is computationally difficult because of the large state size. We obtained improved trails for only for up to seven rounds of Simon-96; starting from eight rounds, the best characteristics found possessed a zero key difference for up to 10 rounds, which suggests that differences in the few key words do not improve the best single-key characteristics. It seems that the probabilities of the existing optimal differential characteristics and linear trails for Simon-96-96 and Simon-128-128 also hold for SimP-192-1 and SimP-256-1 beyond that point. Table 7 compares the probabilities of optimal single- and related-key differential characteristics.

Table 7

Probabilities of optimal related-key differential characteristics for round-reduced variants of Simon-96-96 and Simon-128-128. p denotes the probability; SK = single-key model, RK = related-key model

#Rounds	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20
#Rounds	21	22	23	24	25	26	27	28	29	30	31	32	33	34	35	36	37
Simon-96-96
− log₂(p) (SK)	4	6	8	12	14	18	20	26	30	36	38	44	48	54	56	62	64	66
	68	72	74	78	80	86	90	96
− log₂(p) (RK)	0	2	4	10	12	18	20	26

Simon-128-128
− log₂(p) (SK)	4	6	8	12	14	18	20	26	30	36	38	44	48	54	56	62	64	66
	68	72	74	78	80	86	90	96	98	104	108	114	116	122	124	126	128
− log₂(p) (RK)	0	2	4	10	12	18	20	26

11.3.2 Differential Distinguishers

Differential distinguishers are not expected to cover more than two steps. For two steps, we can sketch the high-level idea for a potential distinguisher. Let α0,αrs,β0,βrs∈(𝔽2w)2 . Let (α₀, 0) → (β₀, α_{r_s}) be a differential through one step of SimP. α₀ → α_{r_s} is a probability-one differential through r_s rounds of the key-update function φ₁, and 0 → β₀ the related-key differential through r_s rounds of the state-update function π₁, keyed with K and K ⊕ α₀, respectively. Assume, p =^def Pr[(α₀, 0) → (β₀, α_{r_s})] > 2^−2w.

In the second step, the difference β₀ is transformed to β_{r_s} linearly, i.e., Pr[β₀ → β_{r_s}] = 1. We can assume that β₀ ≠ 0 and (β₀, α_{r_s}) will be transformed to an output difference (β_{r_s}, 0) after the second step with probability q ≈ 2^−2w by approximating π₂ by a random permutation and using the Markov assumption and the random-key hypothesis. In this case, it holds that

Pr[(α0,0)→p(β0,αrs)→2-2w(βrs,0)]=p⋅2-2w>2-4w .

Since π₁ is a round-reduced variant of Simon with 26 or 34 rounds, it is possible to have such trails. This setting is illustrated in Figure 9. However, an adversary would have to find a manyfold of related-key characteristics.

Figure 9

Setting of a differential attack with the step-reduced instance of SimP

11.3.3 Integral and impossible-differential Distinguishers

Integral and impossible-differential distinguishers are possible for up to two steps of SimP in general. We study an integral distinguisher in the following: Consider a structure of texts (Li0,Ri0) that use pairwise distinct values Ri0 and leave Li0 constant. After the first step, the value Rirs is also constant for all texts and all values Lirs are pairwise distinct. This property is preserved through the linear key schedule to Li2rs . However, the values of R^2r_s are, in general, unknown. Note that there is no word swap after the second step. The third step destroys that knowledge.

Impossible-differential distinguishers can use a similar strategy, by setting ΔL⁰ = 0 and testing if ΔL^2r_s = 0, which is impossible. Again, note that there is no word swap after the second step.

11.3.4 Cube-like Distinguishers

Cube (and integral) distinguishers exploit that the degree of some output-bit component functions is lower than the state size. As discussed above, the degree of each bit is at least w after more than 22 rounds for Simon-96 and more than 26 rounds for Simon-128. SimP transforms each input bit through at least two steps of Simon, that consist of 26 rounds for Simon-96 and 34 rounds for Simon-128. While two out of four steps do not increase the degree in the part that is used as the key-update function, each bit is transformed through full-round Simon-96 or Simon-192. Thus, cube-like and integral distinguishers are not expected to threaten the security of SimP.

11.3.5 Number of Steps and Rounds of SimP

SimP benefits from the intensive existing cryptanalysis of Simon. The usage of the key-update function of Simon seems to not promote considerably more effective differential or linear distinguishers compared to the single-key results on Simon. The usage of the 2w-word key appears not exploitable neither by differentials and linear characteristics nor by techniques that try to benefit from a larger state, such as meet-in-the-middle distinguishers. The reason seems to be mainly the diffusion in the key schedule together with the relatively large number of rounds.

The number of steps and the number of rounds in our employed instantiations of SimP have been chosen very conservatively, using the number of rounds per step r_s as half the number of rounds in Simon. This choice guarantees that each bit passes at least once through the full-round cipher, and therefore is expected to possess at least the algebraic degree of the full-round cipher. Moreover, the diffusion properties of Simon render impossible-differential, zero-correlation, or integral distinguishers implausible.

The design of SimP is very close to the original design of Simon. So, any considerable improvement in the cryptanalysis on SimP would most likely also be a higher threat on Simon-2w-2w. While such results are not impossible, the higher number of rounds in SimP provides an additional security margin.

12 FPGA Implementations

This section reports on FPGA implementations of SimP and Oribatida.

12.1 SimP

SimP is lightweight since its transformations are exactly the round function and the key-update function of Simon-96-96 or Simon-128-128, respectively. Both transformations are based on simple operations such as rotations, XORs, and ANDs that consume only routing resources and bit-wise logical operations. The area in GEs is approximately that of Simon-96 plus some overhead, which is caused by the need for additional input to both transformations due to the swapping after r_s rounds.

Unprotected implementations of Simon are vulnerable against differential power analysis attacks using the leakage generated by the transitions in the state register; the Hamming-distance model captures such leakage. Masking – in particular, Boolean masking (XORing a random value to the output of the round function) – is one countermeasure that can be applied to Simon easily. The simple structure of Simon components allows exploring other countermeasures such as unrolling rounds to achieve higher-order side-channel resistance.

SimP can be implemented in different levels of serialization, from fully serial implementations that update only a single bit per cycle up to round-based implementations that update the full state in one clock cycle. Depending on the choice, there is a broad implementation spectrum with a trade-off between throughput and area.

12.2 Oribatida

Hardware implementations of our proposed instance of Oribatida are relatively straight-forward. It can be implemented efficiently with little extra cost compared to the duplex sponge. Additional costs result from the use of a module to generate the constants for the domain separation, which can be held in ROM. In modern FPGAs, this module takes only four look-up tables (LUTs). For domain separation, only a four-bit XOR is necessary at the input to the capacity of the permutation. An additional 64-bit register to store a mask and a 64-bit XOR to add the mask to the ciphertext are required.

The use of SimP as its main building block allows us to directly transfer the same strategy of using different data-path sizes to Oribatida. Thus, the implementer can choose among various trade-offs between throughput, latency, area, and power consumption.

In terms of side-channel resistance, the same aspects that hold for SimP also hold for Oribatida. Thus, Oribatida does not introduce additional weaknesses of side channels. Table 8 lists our implementations results obtained from Xilinx Vivado 2018 optimizing for area. All results represent measurements after the place-and-route process.

Table 8

Implementation results for SimP-256 and Oribatida-256-64 encryption/decryption and only encryption on Virtex 7 FPGA. LUTs = lookup tables; Enc. = encryption; Dec. = decryption.

	LUTs	FF	#Slices	Frequency (MHz)	Cycles	Throughput (Mb/s)
SimP
SimP-256	495	340	148	580.51	137	542.37
SimP-192	383	259	122	581.98	105	532.10

Oribatida-256-64
Enc. and Dec.	940	599	298	554.16	138	514.00
Enc. only	805	595	253	560.71	138	520.08

In Table 8, we list two columns for the number of clock cycles and throughput, the former represents the results for the processing of associated data (with the step-reduced SimP), whereas the latter denotes the results for processing the message (with the non-reduced SimP). Our results leave still room for further improvements in the close future.

13 Conclusion

This work presented Oribatida, a nonce-based permutation-based AE scheme that masks the ciphertexts from preceding permutation outputs. As a result, the adversary cannot deduce the masked part of the internal state. Therefore, Oribatida can achieve O(qd2/2c) security against forgeries under the release of unverified plaintexts. Oribatida improves the best known Int-RUP security bound while the permutation can be kept as small as 64 + 128 bits for 121-bit nAE and 48-bit Int-RUP security.

We showed that even recent previous proposals with high AE security guarantees, such as Beetle or SPoC, succumb to attacks with complexity O(qdqp2c) . In contrast, the security bound of Oribatida does not depend primarily on the number of primitive queries. We showed that our bound is tight with a matching Int-RUP attack, generalized our masking approach by applying it also to Beetle or the NIST submission SPoC, and demonstrated its application with similar attacks on their masked variants.

Acknowledgement

We thank Raghvendra Rohit and Sumanta Sarkar for their observation on the NIST lightweight mailing list [59], as well as the anonymous reviewers of the Journal of Mathematical Cryptology for the fruitful comments. Eik List was supported by DFG Grant LU 608/9-1.

Absence of Conflict of Interest.
We the authors hereby declare that we have no conflict of interest in connection with evaluated manuscripts.

References

[1] M. A. Abdelraheem, J. Alizadeh, H. AlKhzaimi, M. R. Aref, N. Bagheri, P. Gauravaram, and M. M. Lauridsen. Improved Linear Cryptanalysis of Round Reduced SIMON. IACR Cryptology ePrint Archive, 2014:681, 2014.Search in Google Scholar

[2] M. A. Abdelraheem, J. Alizadeh, H. A. AlKhzaimi, M. R. Aref, N. Bagheri, and P. Gauravaram. Improved Linear Cryptanalysis of Reduced-Round SIMON-32 and SIMON-48. In A. Biryukov and V. Goyal, editors, INDOCRYPT, volume 9462 of LNCS, pages 153–179. Springer, 2015.10.1007/978-3-319-26617-6_9Search in Google Scholar

[3] F. Abed, E. List, S. Lucks, and J. Wenzel. Differential Cryptanalysis of Round-Reduced Simon and Speck. In C. Cid and C. Rechberger, editors, FSE, volume 8540 of LNCS, pages 525–545. Springer, 2014.10.1007/978-3-662-46706-0_27Search in Google Scholar

[4] J. Alizadeh, N. Bagheri, P. Gauravaram, A. Kumar, and S. K. Sanadhya. Linear Cryptanalysis of Round Reduced SIMON. IACR Cryptology ePrint Archive, 2013:663, 2013.Search in Google Scholar

[5] R. AlTawy, G. Gong, M. He, A. Jha, K. Mandal, M. Nandi, and R. Rohit. SpoC: An Authenticated Cipher. Technical report, Feb 24 2019. First-round submission to the NIST Lightweight Cryptography Competition. https://csrc.nist.gov/CSRC/media/Projects/Lightweight-Cryptography/documents/round-1/spec-doc/spoc-spec.pdf.Search in Google Scholar

[6] E. Andreeva, B. Bilgin, A. Bogdanov, A. Luykx, B. Mennink, N. Mouha, and K. Yasuda. APE: Authenticated Permutation-Based Encryption for Lightweight Cryptography. In C. Cid and C. Rechberger, editors, FSE, volume 8540 of LNCS, pages 168–186. Springer, 2014.10.1007/978-3-662-46706-0_9Search in Google Scholar

[7] E. Andreeva, A. Bogdanov, A. Luykx, B. Mennink, N. Mouha, and K. Yasuda. How to Securely Release Unverified Plaintext in Authenticated Encryption. In P. Sarkar and T. Iwata, editors, ASIACRYPT I, volume 8873 of LNCS, pages 105–125. Springer, 2014.10.1007/978-3-662-45611-8_6Search in Google Scholar

[8] E. Andreeva, J. Daemen, B. Mennink, and G. V. Assche. Security of Keyed Sponge Constructions Using a Modular Proof Approach. In G. Leander, editor, FSE, volume 9054 of LNCS, pages 364–384. Springer, 2015.10.1007/978-3-662-48116-5_18Search in Google Scholar

[9] J. Aumasson, P. Jovanovic, and S. Neves. NORX: Parallel and Scalable AEAD. In M. Kutylowski and J. Vaidya, editors, ESORICS II, volume 8713 of LNCS, pages 19–36. Springer, 2014.10.1007/978-3-319-11212-1_2Search in Google Scholar

[10] R. Beaulieu, D. Shors, J. Smith, S. Treatman-Clark, B. Weeks, and L. Wingers. The SIMON and SPECK Families of Lightweight Block Ciphers. IACR Cryptology ePrint Archive, 2013:404, 2013.Search in Google Scholar

[11] M. Bellare, A. Boldyreva, L. R. Knudsen, and C. Namprempre. Online Ciphers and the Hash-CBC Construction. In J. Kilian, editor, CRYPTO, volume 2139 of LNCS, pages 292–309. Springer, 2001.10.1007/3-540-44647-8_18Search in Google Scholar

[12] M. Bellare and P. Rogaway. Random Oracles are Practical: A Paradigm for Designing Eflcient Protocols. In D. E. Denning, R. Pyle, R. Ganesan, R. S. Sandhu, and V. Ashby, editors, ACM CCS, pages 62–73. ACM, 1993.10.1145/168588.168596Search in Google Scholar

[13] G. Bertoni, J. Daemen, S. Hoffert, M. Peeters, G. V. Assche, and R. V. Keer. Farfalle: parallel permutation-based cryptography. IACR Trans. Symmetric Cryptol., 2017(4):1–38, 2017.10.46586/tosc.v2017.i4.1-38Search in Google Scholar

[14] G. Bertoni, J. Daemen, M. Peeters, and G. V. Assche. On the Indifferentiability of the Sponge Construction. In N. P. Smart, editor, EUROCRYPT, volume 4965 of LNCS, pages 181–197. Springer, 2008.10.1007/978-3-540-78967-3_11Search in Google Scholar

[15] G. Bertoni, J. Daemen, M. Peeters, and G. V. Assche. Duplexing the Sponge: Single-Pass Authenticated Encryption and Other Applications. In A. Miri and S. Vaudenay, editors, SAC, volume 7118 of LNCS, pages 320–337. Springer, 2011.10.1007/978-3-642-28496-0_19Search in Google Scholar

[16] G. Bertoni, J. Daemen, M. Peeters, and G. V. Assche. Permutation-based encryption, authentication and authenticated encryption. Directions in Authenticated Ciphers, 2012.Search in Google Scholar

[17] G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche. Sponge functions. In ECRYPT hash workshop, volume 2007, 2007.Search in Google Scholar

[18] G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche. On the security of the keyed sponge construction. In SHA-3 competition (round 3), volume 2011, 2011.Search in Google Scholar

[19] G. Bertoni, J. Daemen, M. Peeters, G. van Assche, and R. van Keer. Ketje v2. 2016. Submission to the CAESAR competition http://competitions.cr.yp.to/caesar-submissions.html.Search in Google Scholar

[20] G. Bertoni, J. Daemen, M. Peeters, G. van Assche, and R. van Keer. Keyak v2. 2016. Submission to the CAESAR competition http://competitions.cr.yp.to/caesar-submissions.html.Search in Google Scholar

[21] A. Bhattacharjee, E. List, C. M. López, and M. Nandi. The Oribatida Family of Lightweight Authenticated Encryption Schemes Version v1.2. Technical report, Sep 27 2019. Second-round submission to the NIST Lightweight Cryptography Competition. https://csrc.nist.gov/CSRC/media/Projects/lightweight-cryptography/documents/round-2/spec-doc-rnd2/oribatida-spec-round2.pdf.Search in Google Scholar

[22] A. Biryukov, A. Roy, and V. Velichkov. Differential Analysis of Block Ciphers SIMON and SPECK. In C. Cid and C. Rechberger, editors, FSE, volume 8540 of LNCS, pages 546–570. Springer, 2014.10.1007/978-3-662-46706-0_28Search in Google Scholar

[23] J. Black and P. Rogaway. A Block-Cipher Mode of Operation for Parallelizable Message Authentication. In L. R. Knudsen, editor, EUROCRYPT, volume 2332 of LNCS, pages 384–397. Springer, 2002.10.1007/3-540-46035-7_25Search in Google Scholar

[24] A. Chakraborti, N. Datta, A. Jha, C. M. Lopez, M. Nandi, and Y. Sasaki. LOTUS-AEAD and LOCUS-AEAD. Technical report, Feb 26 2019. First-round submission to the NIST Lightweight Cryptography Competition. https://csrc.nist.gov/CSRC/media/Projects/Lightweight-Cryptography/documents/round-1/spec-doc/lotus-aead-and-locus-aead-spec.pdf.Search in Google Scholar

[25] A. Chakraborti, N. Datta, M. Nandi, and K. Yasuda. Beetle Family of Lightweight and Secure Authenticated Encryption Ciphers. IACR Trans. Cryptogr. Hardw. Embed. Syst., 2018(2):218–241, 2018. Updated version at https://eprint.iacr.org/2018/805.10.46586/tches.v2018.i2.218-241Search in Google Scholar

[26] A. Chakraborti, N. Datta, M. Nandi, and K. Yasuda. Beetle Family of Lightweight and Secure Authenticated Encryption Ciphers. http://eprint.iacr.org/2018/805, 2018.10.46586/tches.v2018.i2.218-241Search in Google Scholar

[27] A. Chakraborti, A. Jha, C. M. Lopez, M. Nandi, and Y. Sasaki. ESTATE. Technical report, Mar 29 2019. First-round submission to the NIST Lightweight Cryptography Competition. https://csrc.nist.gov/CSRC/media/Projects/Lightweight-Cryptography/documents/round-1/spec-doc/spoc-spec.pdf.Search in Google Scholar

[28] D. Chang, M. Dworkin, S. Hong, J. Kelsey, and M. Nandi. A Keyed Sponge Construction with Pseudorandomness in the Standard Model. In The Third SHA-3 Candidate Conference, volume 3, page 7, March 2012.Search in Google Scholar

[29] H. Chen and X. Wang. Improved Linear Hull Attack on Round-Reduced Simon with Dynamic Key-Guessing Techniques. In T. Peyrin, editor, FSE, volume 9783 of LNCS, pages 428–449. Springer, 2016.10.1007/978-3-662-52993-5_22Search in Google Scholar

[30] S. Chen and J. P. Steinberger. Tight Security Bounds for Key-Alternating Ciphers. In P. Q. Nguyen and E. Oswald, editors, EUROCRYPT, volume 8441 of LNCS, pages 327–350. Springer, 2014. Full version at https://eprint.iacr.org/2013/222.10.1007/978-3-642-55220-5_19Search in Google Scholar

[31] J. Coron, Y. Dodis, A. Mandal, and Y. Seurin. A Domain Extender for the Ideal Cipher. In D. Micciancio, editor, TCC, volume 5978 of LNCS, pages 273–289. Springer, 2010. Full version at https://eprint.iacr.org/2009/356.10.1007/978-3-642-11799-2_17Search in Google Scholar

[32] J. Daemen, B. Mennink, and G. V. Assche. Full-State Keyed Duplex with Built-In Multi-user Support. In T. Takagi and T. Peyrin, editors, ASIACRYPT II, volume 10625 of LNCS, pages 606–637. Springer, 2017.10.1007/978-3-319-70697-9_21Search in Google Scholar

[33] I. Dinur, O. Dunkelman, M. Gutman, and A. Shamir. Improved Top-Down Techniques in Differential Cryptanalysis. In K. E. Lauter and F. Rodríguez-Henríquez, editors, LATINCRYPT, volume 9230 of LNCS, pages 139–156. Springer, 2015.10.1007/978-3-319-22174-8_8Search in Google Scholar

[34] C. Dobraunig, M. Eichlseder, F. Mendel, and M. Schläffer. Ascon v1.2 Submission to the CAESAR Competition. September 15 2016. Submission to the CAESAR competition. http://competitions.cr.yp.to/caesar-submissions.html.Search in Google Scholar

[35] C. Dobraunig and B. Mennink. Security of the Suflx Keyed Sponge. IACR Trans. Symmetric Cryptol., 2019(4):223–248, 2019.10.46586/tosc.v2019.i4.223-248Search in Google Scholar

[36] M. Dworkin. FIPS 202: SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions. Technical report, 2015.10.6028/NIST.FIPS.202Search in Google Scholar

[37] P. Gaži, K. Pietrzak, and S. Tessaro. The Exact PRF Security of Truncation: Tight Bounds for Keyed Sponges and Truncated CBC. In R. Gennaro and M. Robshaw, editors, CRYPTO I, volume 9215 of LNCS, pages 368–387. Springer, 2015.10.1007/978-3-662-47989-6_18Search in Google Scholar

[38] S. Halevi. EME*: Extending EME to Handle Arbitrary-Length Messages with Associated Data. In A. Canteaut and K. Viswanathan, editors, INDOCRYPT, volume 3348 of LNCS, pages 315–327. Springer, 2004.10.1007/978-3-540-30556-9_25Search in Google Scholar

[39] V. T. Hoang, T. Krovetz, and P. Rogaway. Robust Authenticated-Encryption AEZ and the Problem That It Solves. In E. Oswald and M. Fischlin, editors, EUROCRYPT (1), volume 9056 of LNCS, pages 15–44. Springer, 2015. Full version at https://eprint.iacr.org/2014/793.10.1007/978-3-662-46800-5_2Search in Google Scholar

[40] ISO/IEC. Information technology – Automatic identification and data capture techniques – Part 21: Crypto Suite SIMON Security Services for Air Interface Communications. https://www.iso.org/standard/70388.html, Oct 2018.Search in Google Scholar

[41] P. Jovanovic, A. Luykx, and B. Mennink. Beyond 2^c/2 Security in Sponge-Based Authenticated Encryption Modes. In P. Sarkar and T. Iwata, editors, ASIACRYPT I, volume 8873 of LNCS, pages 85–104. Springer, 2014.10.1007/978-3-662-45611-8_5Search in Google Scholar

[42] S. Kölbl, G. Leander, and T. Tiessen. Observations on the SIMON Block Cipher Family. In R. Gennaro and M. Robshaw, editors, CRYPTO, volume 9215 of LNCS, pages 161–185. Springer, 2015.10.1007/978-3-662-47989-6_8Search in Google Scholar

[43] K. Kondo, Y. Sasaki, Y. Todo, and T. Iwata. On the Design Rationale of SIMON Block Cipher: Integral Attacks and Impossible Differential Attacks against SIMON Variants. IEICE Transactions, 101-A(1):88–98, 2018.10.1007/978-3-319-39555-5_28Search in Google Scholar

[44] Z. Liu, Y. Li, and M. Wang. Optimal Differential Trails in SIMON-like Ciphers. IACR Trans. Symmetric Cryptol., 2017(1):358–379, 2017.10.46586/tosc.v2017.i1.358-379Search in Google Scholar

[45] Z. Liu, Y. Li, and M. Wang. The Security of SIMON-like Ciphers Against Linear Cryptanalysis. IACR Cryptology ePrint Archive, 2017:576, 2017.Search in Google Scholar

[46] M. Matsui. On Correlation Between the Order of S-boxes and the Strength of DES. In A. D. Santis, editor, EUROCRYPT, volume 950 of LNCS, pages 366–375. Springer, 1994.10.1007/BFb0053451Search in Google Scholar

[47] U. M. Maurer, R. Renner, and C. Holenstein. Indifferentiability, Impossibility Results on Reductions, and Applications to the Random Oracle Methodology. In M. Naor, editor, TCC, volume 2951 of LNCS, pages 21–39. Springer, 2004.10.1007/978-3-540-24638-1_2Search in Google Scholar

[48] B. Mennink. Key Prediction Security of Keyed Sponges. IACR Trans. Symmetric Cryptol., 2018(4):128–149, 2018.10.46586/tosc.v2018.i4.128-149Search in Google Scholar

[49] B. Mennink, R. Reyhanitabar, and D. Vizár. Security of full-state keyed sponge and duplex: Applications to authenticated encryption. In T. Iwata and J. H. Cheon, editors, ASIACRYPT II, volume 9453 of LNCS, pages 465–489. Springer, 2015.10.1007/978-3-662-48800-3_19Search in Google Scholar

[50] K. Minematsu. Parallelizable Rate-1 Authenticated Encryption from Pseudorandom Functions. In P. Q. Nguyen and E. Oswald, editors, EUROCRYPT, volume 8441 of LNCS, pages 275–292. Springer, 2014. Full version at https://eprint.iacr.org/2013/628.pdf.10.1007/978-3-642-55220-5_16Search in Google Scholar

[51] N. Mouha, B. Mennink, A. V. Herrewege, D. Watanabe, B. Preneel, and I. Verbauwhede. Chaskey: An Eflcient MAC Algorithm for 32-bit Microcontrollers. In A. Joux and A. M. Youssef, editors, SAC, volume 8781 of LNCS, pages 306–323. Springer, 2014.10.1007/978-3-319-13051-4_19Search in Google Scholar

[52] Y. Naito and K. Yasuda. New Bounds for Keyed Sponges with Extendable Output: Independence Between Capacity and Message Length. In T. Peyrin, editor, FSE, volume 9783 of LNCS, pages 3–22. Springer, 2016.10.1007/978-3-662-52993-5_1Search in Google Scholar

[53] NIST. Submission Requirements and Evaluation Criteria for the Lightweight Cryptography Standardization Process. https://csrc.nist.gov/CSRC/media/Projects/Lightweight-Cryptography/documents/final-lwc-submission-requirements-august2018.pdf, August 27 2018.Search in Google Scholar

[54] J. Patarin. The “Coeflcients H” Technique. In R. M. Avanzi, L. Keliher, and F. Sica, editors, SAC, volume 5381 of LNCS, pages 328–345. Springer, 2008.Search in Google Scholar

[55] H. Raddum. Algebraic Analysis of the Simon Block Cipher Family. In K. E. Lauter and F. Rodríguez-Henríquez, editors, LATINCRYPT, volume 9230 of LNCS, pages 157–169. Springer, 2015.10.1007/978-3-319-22174-8_9Search in Google Scholar

[56] P. Rogaway, M. Bellare, J. Black, and T. Krovetz. OCB: a block-cipher mode of operation for eflcient authenticated encryption. In M. K. Reiter and P. Samarati, editors, ACM-CCS, pages 196–205. ACM, 2001.10.1145/501983.502011Search in Google Scholar

[57] P. Rogaway and T. Shrimpton. A Provable-Security Treatment of the Key-Wrap Problem. In S. Vaudenay, editor, EUROCRYPT, volume 4004 of LNCS, pages 373–390. Springer, 2006.10.1007/11761679_23Search in Google Scholar

[58] R. Rohit and G. Gong. Correlated Sequence Attack on Reduced-Round Simon-32/64 and Simeck-32/64. IACR Cryptology ePrint Archive, 2018:699, 2018.Search in Google Scholar

[59] R. Rohit and S. Sarkar. [lwc-forum] ROUND 2 OFFICIAL COMMENT: Oribatida. NIST lwc forum mailing list, 17 September 17:09 2019.Search in Google Scholar

[60] C. E. Shannon. Communication theory of secrecy systems. The Bell system technical journal, 28(4):656–715, 1949.10.1002/j.1538-7305.1949.tb00928.xSearch in Google Scholar

[61] H. Sui, W. Wu, L. Zhang, and D. Zhang. LAEM (Lightweight Authentication Encryption Mode). Technical report, Mar 25 2019. First-round submission to the NIST Lightweight Cryptography Competition. https://csrc.nist.gov/CSRC/media/Projects/Lightweight-Cryptography/documents/round-1/spec-doc/LAEM-spec.pdf.Search in Google Scholar

[62] Y. Todo and M. Morii. Bit-Based Division Property and Application to Simon Family. In T. Peyrin, editor, FSE, volume 9783 of LNCS, pages 357–377. Springer, 2016.10.1007/978-3-662-52993-5_18Search in Google Scholar

[63] X. Wang, B. Wu, L. Hou, and D. Lin. Automatic Search for Related-Key Differential Trails in SIMON-like Block Ciphers Based on MILP. In L. Chen, M. Manulis, and S. Schneider, editors, ISC, volume 11060 of LNCS, pages 116–131. Springer, 2018.10.1007/978-3-319-99136-8_7Search in Google Scholar

[64] Z. Xiang, W. Zhang, Z. Bao, and D. Lin. Applying MILP Method to Searching Integral Distinguishers Based on Division Property for 6 Lightweight Block Ciphers. In J. H. Cheon and T. Takagi, editors, ASIACRYPT I, volume 10031 of LNCS, pages 648–678, 2016.10.1007/978-3-662-53887-6_24Search in Google Scholar

[65] H. Zhang, W. Wu, and Y. Wang. Integral Attack Against Bit-Oriented Block Ciphers. In S. Kwon and A. Yun, editors, ICISC, volume 9558 of LNCS, pages 102–118. Springer, 2015.10.1007/978-3-319-30840-1_7Search in Google Scholar

Received: 2020-04-06

Accepted: 2020-10-20

Published Online: 2021-01-29

This work is licensed under the Creative Commons Attribution 4.0 International License.

The Oribatida v1.3 Family of Lightweight Authenticated Encryption Schemes

Abstract

1 Introduction

1.1 Permutation-based Modes

1.2 Research Gap

1.3 Contribution

Remark 1

1.4 Outline

2 Preliminaries

2.1 General Notations

2.2 Nonce-based Authenticated Encryption

Definition 1

Definition 2

2.3 H-coefficient Technique

Lemma 1

3 Int-RUP Attacks on Existing AE Schemes

3.1 Int-RUP Attack on The Duplex Mode

3.2 Int-RUP Attack on Beetle

3.3 Int-RUP Attack on SPoC

3.4 Int-RUP Attack on A Hybrid of Beetle and SPoC

3.5 Discussion

4 Specification of Oribatida

4.1 Initialization

4.2 Processing Associated Data

4.3 Encryption

4.4 Decryption

4.5 Domain Separation

5 Int-RUP Attacks on Schemes with Masked Ciphertexts

5.1 The Generic Int-RUP Attack on Oribatida (Masked Duplex)

5.2 Int-RUP Attack on The Masked Beetle

5.3 Int-RUP Attack on The Masked SPoC

6 nAE Security Analysis

Theorem 1

Proof

Sampling

Bad Events

Lemma 2

Proof

Lemma 3

Proof

7 Int-RUP Analysis

Theorem 2

Proof

Sampling Consistently in the On-line Phase

Off-line phase

Bad Events

Lemma 4

Proof

Case (A)

Cases (B)–(D)

Case (E)

Lemma 5

Proof

8 Comparison with Lightweight Int-RUP-secure Schemes

8.1 Brief Description

8.2 Eflciency

8.3 Security

9 Discussion of the Updated Variant Oribatida v1.3

10 Instantiation of Oribatida

10.1 The Ψr Domain Extender

Theorem 3 ([31])

10.2 Φr: A Variant of Ψr That Includes The Key Schedule

10.3 Simon

10.4 The SimP-n-θ Family of Permutations

10.4.1 Round Function

10.4.2 Key-update Function

10.4.3 State-update Function

10.4.4 Step Function

10.4.5 Round Constants

10.4.6 Number of Steps θ

10.4.7 Number of Rounds

10.4.8 The Byte Order in Oribatida

Remark 2

11 Security of SimP

11.1 Requirements

11.2 Existing Cryptanalysis on Simon

11.2.1 Differential Cryptanalysis

11.2.2 Linear Cryptanalysis

11.2.3 Integral, Impossible-differential, and Zero-correlation Distinguishers

11.2.4 Related-key Distinguishers

10.1 The Ψ_r Domain Extender

10.2 Φ_r: A Variant of Ψ_r That Includes The Key Schedule