1 Introduction

Password-authenticated key exchange (PAKE) is an important cryptographic primitive that allows two entities, usually a client and a server, to authenticate each other and establish a high-entropy key based on a shared low-entropy password. It is a convenient and practical approach to securing personal communications, because neither a complicated public key infrastructure (PKI) nor dedicated hardware for storing high-entropy symmetric keys is required. Since the pioneering work of Bellovin and Merritt [1], much attention has been paid to the design and analysis of PAKE protocols, such as [2,3,4,5,6,7].

Traditionally, in a PAKE protocol, a client sends his identity information explicitly to help the server identify him and decide which password should be used. However, with growing concern about privacy risks on the network, people tend to avoid applications that they believe do not protect their privacy appropriately, and instead turn to schemes with enhanced protection of private information [8,9,10,11,12,13,14]. Therefore, it is desirable to reinforce conventional PAKE with additional privacy protection. For example, there are settings in which a user might want to hide his actual identity from the server, whereas the server still wants to make sure, through authentication, that the party in communication is a legitimate user [15]. To meet these needs, anonymous password-authenticated key exchange (APAKE) was proposed [16], in which a client establishes a session key with the server authentically and anonymously. The server is sure that the client is a legitimate member of some pre-determined group, but it does not learn the client’s actual identity. As a convenient solution to privacy protection, APAKE protocols have attracted wide interest in recent years [16,17,18,19]. Furthermore, the International Organization for Standardization (ISO) has been working on a draft APAKE standard, ISO/IEC 20009-4 [20], which proceeded to the FDIS stage in February 2017 in the ISO/IEC JTC 1/SC 27 working group.

1.1 Related Work

For APAKE protocols that do not rely on a public key infrastructure, two types of schemes are distinguished: password-only APAKE and storage-aided APAKE. In the former, a client needs nothing but a password for authentication; in the latter, a client uses not only the password but also an auxiliary storage device (e.g., a smart phone or a public directory), which stores password-based credentials.

1.1.1 Password-Only APAKE Protocols

In 2005, the first password-only APAKE protocol, together with its threshold extension, was presented by Viet et al. [16], who combined an oblivious transfer (OT) protocol with a traditional PAKE protocol. Later, Shin et al. [21] pointed out that the threshold version of Viet et al.’s protocol is vulnerable to off-line dictionary attacks, and proposed an improved threshold APAKE protocol. However, both Viet et al.’s and Shin et al.’s protocols remain vulnerable to impersonation attacks and off-line dictionary attacks. Yang and Zhang [22] put forward a new APAKE protocol based on the well-known SPEKE protocol [23]. To further improve computational efficiency, Shin et al. [24] proposed an APAKE protocol based on that of Yang and Zhang [22], whose computation cost is reduced if pre-computation is allowed. To obtain universally composable security, Hu et al. formalized and realized a new password-only APAKE protocol [17] within the well-known universal composability (UC) framework.

1.1.2 Storage-Aided APAKE Protocols

For better scalability, Yang et al. [25, 26] presented a different approach to the construction of APAKE, which was further developed by Qian et al. [27], Shin et al. [28] and Zhang et al. [18]. In their protocols, each client registers his identity with the server and gets back a structured value as his authentication credential. The client then uses his password to protect this credential and stores it on some extra (possibly public) storage, such as a smart card, a mobile phone or a public directory. The login phase requires both the client’s password and the corresponding password-protected credential. This approach greatly reduces the storage and computation burden on the server side, and thus scales better. We note, however, that the client needs the auxiliary storage device at every login, and such a device is not always available in real-world applications. Therefore, a storage-aided APAKE is less convenient than a password-only APAKE from the client’s point of view.

1.2 Contribution

Although much attention has been paid to APAKE, almost all existing protocols are designed and analyzed in the random oracle model, especially in the more convenient password-only case. Note that when random oracles are instantiated with concrete hash functions, the resulting protocols might not guarantee the same security level as proven [29, 30]. Indeed, there are uninstantiable cryptographic schemes that are proven secure in the random oracle model [31]. From a theoretical point of view, as well as for stronger security guarantees, researchers have been developing cryptographic protocols with provable security in the standard model, such as [4,5,6, 32].

In this paper, we focus on constructing password-only APAKE protocols in the standard model. Inspired by the well-known PAKE protocols [5, 33] that are provably secure in the standard model, we propose the first password-only APAKE protocol (called APAKE-S) that does not rely on the random oracle heuristic. Our construction makes novel use of smooth projective hash functions to achieve client anonymity, which is essentially similar to the technique adopted by Benhamouda et al. in constructing the oblivious transfer (OT) protocol [6] with proven security in the standard model. The resulting APAKE-S protocol guarantees AKE security, client anonymity and mutual authentication against an outsider adversary as well as the honest-but-curious server. Additionally, it needs only three flows, achieving the optimal communication complexity for APAKE protocols with mutual authentication. Its computation overhead is also acceptable.

1.3 Organization

In Sect. 2, we briefly review the security model for APAKE protocols. In Sect. 3, the cryptographic primitives used in the construction of our protocol are introduced. Section 4 presents the concrete construction of the APAKE-S protocol. The rigorous security proof in the standard model is provided in Sect. 5. Finally, conclusions are given in Sect. 6.

2 Security Model

In this section, we present a formal security model for APAKE protocols. This model was originally proposed by Viet et al. in [16], and later improved in [22, 24].

2.1 Communication Model

Protocol participants The participants of an APAKE protocol form two sets: the set of n clients \(\varvec{\Gamma } =\{{{C}_{1}},\ldots ,{{C}_{n}}\}\) and the set of trusted servers \(\mathbf{S}\). For simplicity, we assume that the set \(\varvec{\Gamma }\) is fixed and the set \(\mathbf{S}\) contains only one trusted server \(\mathbf{S}=\{S\}\).

Long-lived keys Here we consider password-only APAKE protocols, in which the password is the only long-term secret that a client needs. In particular, each client \({{C}_{i}}\in \varvec{\Gamma }\) holds a password \(pwd_i\) that is drawn independently and uniformly from a dictionary D. The server S holds a password file as a list \(\mathbf{pwd}_{S}={{\{pwd_i\}}_{{{C}_{i}}\in \Gamma }}\).

Protocol execution During the protocol execution, a client \({{C}_{i}}\) proves to the trusted server S that it is a legitimate user in group \(\varvec{\Gamma }\), and establishes a shared high-entropy key with the server for subsequent secure communications. Each participant \(U \in \varvec{\Gamma } \cup \mathbf{S}\) is modeled as a probabilistic polynomial time (PPT) Turing machine and may have many instances, called sessions, involved in different and possibly concurrent executions of the protocol. We denote by \(U^{\delta }\) the \(\delta\)-th instance of U.

An adversary \(\mathcal{A}\), which is also modeled as a PPT Turing machine, has full control of the communication network. In the formal model, interactions between the adversary and protocol participants are captured via oracle queries, representing corresponding attacks mounted by the adversary in the real world. More specifically, we provide the adversary with access to the following oracle queries.

  • \(\texttt{Execute}( C_i, {\rho }, S, {\delta } )\): This oracle models a passive attack in which the adversary eavesdrops on an honest execution between the client instance \(C_i^{\rho }\) and the server instance \(S^{\delta }\). The output consists of the complete transcript of messages exchanged by the instances throughout the execution.

  • \(\texttt{Send}(U, {\delta },m)\): This oracle models an active attack against the instance \(U^{\delta }\) for \(U \in \varvec{\Gamma } \cup \mathbf{S}\), in which the adversary sends an arbitrary message m to this instance. The adversary gets back whatever instance \(U^{\delta }\) publicly outputs upon receiving message m according to the protocol.

  • \(\texttt{Reveal}(U, {\delta })\): This oracle models the leakage of a session key to the adversary. It is allowed to be called only when the instance \(U^{\delta }\) actually possesses a session key. In that case, the session key is returned to the adversary \(\mathcal{A}\).

2.2 Security Definitions

Notation. For each participant U, we denote by \(pid_U^{\delta }\) the partner identifier with whom \(U^{\delta }\) believes it is interacting, by \(sid_U^{\delta }\) the session identifier, which is the ordered concatenation of the partial transcript (except possibly for the final message) sent and received by \(U^{\delta }\), and by \(acc_U^{\delta }\) the state indicating whether the instance has accepted this session as completed.

Matching session. A client session \(C_i^{\rho }\) and a server session \(S^{\delta }\) are matching if: (1) \(C_i^{\rho }\) and \(S^{\delta }\) share the same session identifier \(sid_{C_i}^{\rho } = sid_{S}^{\delta } \ne \perp\); (2) \(pid_{C_i}^{\rho } = S\) and \(pid_{S}^{\delta } = \varvec{\Gamma }\) such that \(C_i \in \varvec{\Gamma }\).

AKE security The semantic security of the session key is measured by how much information on the key can be learned by the adversary with PPT complexity. In order to capture this security notion, the adversary has access to the following additional oracle query.

  • \(\texttt{Test}(U, {\delta })\): This oracle captures the adversary’s ability in distinguishing the real session keys from uniformly random ones. A bit \(b \in \{0,1\}\) is selected uniformly at random. The oracle returns the session key held by instance \(U^{\delta }\) if \(b=1\), or a random value of the same size if \(b=0\).

In the challenge game, the adversary is allowed to issue as many Execute, Send and Reveal queries as it wants, but may ask the Test query only once. We require that the instance \(U^{\delta }\) targeted by \(\texttt{Test}(U, {\delta })\) be fresh, which means that no Reveal query has been asked to this session or to its matching session (if the latter exists). At the end of the game, the adversary outputs a bit \(b'\), representing its guess of the random bit b. We define the advantage of the adversary \(\mathcal{A}\) in violating the AKE security of an APAKE protocol P as

$$\begin{aligned} Adv_{P,D}^{AKE}( \mathcal{A} ) = 2 \Pr \{b=b'\}-1, \end{aligned}$$

where the probability is taken over all the randomness chosen by the adversary and the oracles. An APAKE protocol P is said to be AKE-secure if \(Adv_{P,D}^{AKE}( \mathcal{A} )\) is only negligibly larger than \(q_{send}/|D|\), for any PPT adversary calling Send queries at most \(q_{send}\) times.
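To make the bound \(q_{send}/|D|\) concrete, the following Python sketch computes the advantage of a hypothetical online-guessing adversary that tests one candidate password per Send query and outputs a random bit if none of its guesses is correct. The function name and the sample numbers are illustrative assumptions and are not part of the model.

```python
from fractions import Fraction

def guessing_advantage(q_send: int, dict_size: int) -> Fraction:
    """Advantage 2*Pr[b = b'] - 1 of an adversary that tries q_send distinct
    passwords online and guesses b at random if none of them is correct."""
    p_hit = Fraction(min(q_send, dict_size), dict_size)   # some guess is right
    pr_correct = p_hit + (1 - p_hit) * Fraction(1, 2)      # otherwise a coin flip
    return 2 * pr_correct - 1

# Illustrative numbers: 100 Send queries against a dictionary of 10**6 passwords.
print(guessing_advantage(100, 10**6))   # 1/10000
```

Such an adversary always reaches advantage \(q_{send}/|D|\), which is why the definition only asks that no PPT adversary do noticeably better than this.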

Authentication Because participants use only low-entropy passwords as their login credentials, password-only APAKE protocols are usually under the threat of undetectable online dictionary attacks and impersonation attacks. For the sake of resistance to such attacks, we consider mutual authentication for APAKE protocols.

In the game defining mutual authentication, the adversary is allowed to call as many Execute, Send, Reveal oracles as it wants, and its goal is to impersonate a legitimate user or the trusted server. We first deal with the authentication of the server to clients. Denote by \(Succ_{P,D}^{S-C}(\mathcal{A})\) the probability that \(\mathcal{A}\) successfully impersonates the trusted server to a client while the client doesn’t detect it, i.e.,

$$\begin{aligned} Succ_{P,D}^{S-C}(\mathcal{A})= & {} \Pr \left\{ {\text{a client session}}\, C_i^{\rho }\, {\text{accepts with}}\right. \\&\left. {\text{no matching session of}}\, S \right\} . \end{aligned}$$

For authentication of clients to the server, the definition is slightly different in the anonymous scenario. The anonymity requirement implies that the server can only learn that the client is a legitimate member of the group \(\varvec{\Gamma }\). Hence, the adversary is declared successful if it impersonates any legitimate client in the group \(\varvec{\Gamma }\) to the server while the server does not detect it. We denote by \(Succ_{P,D}^{C-S}(\mathcal{A})\) the probability of this event, i.e.,

$$\begin{aligned} Succ_{P,D}^{C-S}(\mathcal{A})= & {} \Pr \left\{ {\text{a server session}}\, S^{\delta }\, {\text{accepts with}}\right. \\&\left. {\text{no matching session of any}}\, C_i \in \Gamma \right\} . \end{aligned}$$

Then, the advantage of the adversary \(\mathcal{A}\) in violating mutual authentication of an APAKE protocol P is

$$\begin{aligned} Adv_{P,D}^{MA}( \mathcal{A} ) =\max \left\{ Succ_{P,D}^{S-C}(\mathcal{A}), Succ_{P,D}^{C-S}(\mathcal{A})\right\} . \end{aligned}$$

An APAKE protocol P is said to achieve mutual authentication if, for any PPT adversary \(\mathcal{A}\), \(Adv_{P,D}^{MA}( \mathcal{A} )\) is only negligibly larger than \(q_{send}/|D|\).

Anonymity The client anonymity property requires that neither an outsider adversary nor the trusted server could distinguish which client is involved in the target session. Recall that the trusted server knows the passwords for all the clients. As in [16, 24], we assume that the trusted server behaves in an honest-but-curious way. That is, the server acts according to the protocol specification honestly, but tries to figure out the client’s actual identity.

Let \(\varPi _i\) denote the transcript of the protocol P executed between a client \(C_i\) and the trusted server S. We say that the protocol P achieves computational client anonymity against the honest-but-curious server if, for any two clients \(C_{i_0}, C_{i_1} \in \varvec{\Gamma }\), the random variables \(\varPi _{i_0}\) and \(\varPi _{i_1}\) are computationally indistinguishable. That is, for any PPT adversary \(\mathcal{A}\), the quantity

$$\begin{aligned} \texttt{Dist}(\varPi _{i_0}, \varPi _{i_1})= \left| \Pr \{\mathcal{A}(\varPi _{i_0}) = 1\} - \Pr \{\mathcal{A}(\varPi _{i_1}) = 1\}\right| \end{aligned}$$

is negligible in the security parameter k.

3 Cryptographic Primitives

In this section, we briefly review the cryptographic primitives used in our construction: CPA-secure and (labeled) CCA2-secure public key encryption schemes, smooth projective hash functions, and one-time signature schemes. Notice that, as illustrated later, all these primitives can be instantiated efficiently in the standard model.

3.1 Public Key Encryption Scheme

We use both CPA-secure and (labeled) CCA2-secure public key encryption schemes [34] in our construction. A public key encryption scheme \(\mathcal{E}\) is defined by a triple of algorithms (Gen, Enc, Dec). The key generation algorithm Gen takes the security parameter \(1^k\) as input and outputs a pair of keys (pk, sk). The encryption algorithm Enc, given the public key pk, a message m and a random string r, outputs a ciphertext \(c = Enc_{pk}( m; r)\). The decryption algorithm Dec takes the secret key sk and a ciphertext c as input, and outputs a plaintext m. We say that an encryption scheme \(\mathcal{E}\) is secure under chosen-plaintext attacks (CPA secure) if, for any PPT adversary \(\mathcal{A}\) who knows the public key pk, its advantage in distinguishing the ciphertexts of two challenge messages \(m_0\) and \(m_1\) is negligible in the security parameter k.

A labeled CCA2-secure public key encryption scheme \(\mathcal{E}^{\prime }=(Gen^{\prime },Enc^{\prime },Dec^{\prime })\) is similar to the standard notion of a CCA2-secure public key encryption scheme, with the additional property that an arbitrary label can be bound to the ciphertext in a non-malleable way. More precisely, for any key pair \((pk',sk')\) generated by the key generation algorithm \(Gen^{\prime }\), the encryption algorithm \(Enc^{\prime }\) takes as input the public key \(pk'\), a message m, a label l and a random string r, and outputs a ciphertext \(c = Enc^{\prime }_{pk'}( m;l; r)\). The decryption algorithm takes as input the secret key \(sk'\), a ciphertext c and a label l, and outputs \(Dec^{\prime }_{sk'}(c;l)\), which is either the message m encrypted into the ciphertext c under the label l, or \(\perp\) otherwise.

The standard CCA2 attack experiment is modified as below, in which the adversary chooses the challenge messages \(m_0,m_1\) as well as a label \(l^*\). The decryption oracle is defined as \(ODec(c,l)=Dec^{\prime }_{sk'}(c;l)\), with the restriction that \((c^*,l^*)\) should not be queried.

$$\begin{aligned}&Exp_{\mathcal{E}^{\prime }, \mathcal{A}}^{ind-cca-b}(1^k)\\&(pk',sk')\leftarrow Gen^{\prime }(1^k)\\&(l^{*},m_0,m_1;state) \leftarrow \mathcal{A}^{ODec(\cdot ,\cdot )}(pk')\\&c^{*} = Enc^{\prime }_{pk'}(m_b;l^{*})\\&b' \leftarrow \mathcal{A}^{ODec(\cdot ,\cdot )}(pk',c^{*};state)\\&\mathrm {return}\ b' \\ \end{aligned}$$

We say that a labeled public key encryption scheme \(\mathcal{E}^{\prime }\) is secure under chosen-ciphertext attacks (CCA2 secure) if, for any PPT adversary \(\mathcal{A}\), its advantage

$$\begin{aligned} Adv_{\mathcal{E}^{\prime }}^{ind-cca}(\mathcal{A}) = \Pr \left[ Exp_{\mathcal{E}^{\prime }, \mathcal{A}}^{ind-cca-1}(1^k) =1 \right] - \Pr \left[ Exp_{\mathcal{E}^{\prime }, \mathcal{A}}^{ind-cca-0}(1^k) =1 \right] \end{aligned}$$

is negligible in the security parameter k.

3.2 Smooth Projective Hash Functions

The smooth projective hash function family was first introduced by Cramer and Shoup [35], and later developed in [6, 32, 36,37,38]. Roughly speaking, it is a family of hash functions that admits two keys. One key can be used to compute the hash values of all words in the hash domain efficiently; the other key can be used to compute the hash values correctly only on a specified subset, and gives almost no information about the hash values of words outside that subset.

For our construction, we recall the notion of a smooth projective hash function family associated with a CPA-secure encryption scheme. Let \(\mathcal{E} = (Gen,Enc,Dec)\) be a CPA-secure public key encryption scheme and \(\mathcal{D}\) be an efficiently recognizable message space. Fix a key pair (pk, sk), and let \(\mathcal{C}\) be the set of valid ciphertexts with respect to the public key pk, which should be efficiently recognizable with only the knowledge of pk. Define \(X = \{(c,m)|c\in \mathcal{C}, m \in \mathcal{D}\}\) and

$$\begin{aligned} L_m = \{(c,m)\in X| Dec_{sk}(c) = m\}, L = \bigcup _{m \in \mathcal{D}}{L_m}. \end{aligned}$$

A family of smooth projective hash functions for the language \(L \subset X\) onto a set \(\{0,1\}^{l_h}\) consists of the following four algorithms \(\mathcal{H} = (\mathrm {HashKG}, \mathrm {ProjKG}, \mathrm {Hash}, \mathrm {ProjH})\). The key generation algorithm \(\mathrm {HashKG}\) produces a hash key hk for language L as \(hk \mathop {\leftarrow }\limits ^{\$} \mathrm{HashKG}(L)\). The key projection algorithm takes a hash key hk and a word \((c,m)\in X\) as input, and generates a projected hash key \(hp = \mathrm{ProjKG}(hk;c,m)\). The hash algorithm takes a hash key hk and a word \((c,m)\in X\) as input, and outputs a hash value \(\mathrm{Hash}(hk;c,m)\). The projected hash algorithm takes the projected hash key hp, a word \((c,m)\in L\) and a witness w of the fact that \((c,m)\in L\) as input, and returns a hash value \(\mathrm{ProjH}(hp;c,m;w)\).

The smooth projective hash function should satisfy both correctness and smoothness properties. The correctness property guarantees that if \((c,m)\in L\) and w is the corresponding witness, then \(\mathrm{Hash}(hk;c,m) = \mathrm{ProjH}(hp;c,m;w)\). The smoothness property, which also defines the security of the family, ensures that for any \((c,m) \in X\backslash L\), \(\mathrm{Hash}(hk;c,m)\) is statistically indistinguishable from a uniformly random element of the range of the hash function, even when the projected key hp is given. That is,

$$\begin{aligned} \{c,m,hp, \mathrm{Hash}(hk;c,m)\}_{hk \mathop {\leftarrow }\limits ^{\$} \mathrm{HashKG}(L)} \mathop {\equiv }\limits ^{s} \{c,m,hp, h\}_{hk \mathop {\leftarrow }\limits ^{\$} \mathrm{HashKG}(L),h \mathop {\leftarrow }\limits ^{\$} \{0,1\}^{l_h}}. \end{aligned}$$
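As an illustration of the four algorithms, the following Python sketch implements the well-known smooth projective hash function for ElGamal ciphertexts, where the witness is the encryption randomness r. It is a minimal sketch under stated assumptions: the group parameters `P`, `Q`, `G` are toy values chosen only so that the code runs, the function names are hypothetical, and in this particular instance the projected key happens not to depend on the word. It is not claimed to be the exact family used later in the paper.

```python
import secrets

# Toy group (assumption, insecure): P = 2Q + 1 is a safe prime and
# G = 4 generates the subgroup of prime order Q.
P, Q, G = 2039, 1019, 4

def elgamal_enc(h, m, r):
    """ElGamal ciphertext (u, v) = (g^r, h^r * m) under public key h."""
    return pow(G, r, P), (pow(h, r, P) * m) % P

def hash_kg():
    """Hash key hk = (a, b), two random exponents."""
    return secrets.randbelow(Q), secrets.randbelow(Q)

def proj_kg(hk, h):
    """Projected key hp = g^a * h^b."""
    a, b = hk
    return (pow(G, a, P) * pow(h, b, P)) % P

def hash_(hk, c, m):
    """Hash(hk; c, m) = u^a * (v / m)^b, computable for any word (c, m)."""
    a, b = hk
    u, v = c
    v_over_m = (v * pow(m, -1, P)) % P
    return (pow(u, a, P) * pow(v_over_m, b, P)) % P

def proj_hash(hp, r):
    """ProjH(hp; c, m; r) = hp^r, computable only with the witness r."""
    return pow(hp, r, P)

# Correctness check on a word in the language.
x = secrets.randbelow(Q - 1) + 1
h = pow(G, x, P)
m = pow(G, 42, P)
r = secrets.randbelow(Q)
c = elgamal_enc(h, m, r)
hk = hash_kg()
assert hash_(hk, c, m) == proj_hash(proj_kg(hk, h), r)
```

If c encrypts m, then \(u^a (v/m)^b = g^{ra}h^{rb} = hp^r\), which is the correctness property. For a word outside the language, the extra factor \((m/m')^b\) is not determined by hp, which is the intuition behind smoothness.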

3.3 One-Time Signature Scheme

A signature scheme \(\varSigma\) consists of a tuple of algorithms (SignKG, Sign, Verify). The key generation algorithm SignKG takes the security parameter \(1^k\) as input and outputs a pair of verification key and signing key (VK, SK). The signing algorithm Sign takes the signing key SK and a message m as input, and outputs a signature \(\sigma \leftarrow Sign_{SK} (m)\). The verification algorithm Verify takes as input the verification key VK, a message m, and a signature \(\sigma\), and outputs a single bit \(b = Verify_{VK}(m; \sigma )\). We say that a one-time signature scheme \(\varSigma\) is secure if, for any PPT adversary \(\mathcal{F}\) which makes only a single query to its signing oracle \(Sign_{SK}(\cdot )\), its advantage in generating a valid signature \(\sigma\) on a message m, where \(\sigma\) was not previously output by \(Sign_{SK}(\cdot )\) on input m, i.e.,

$$\begin{aligned} \Pr \{(VK,SK)\leftarrow SignKG(1^k), (m,\sigma ) \leftarrow&\mathcal{F}^{Sign_{SK}(\cdot )}(VK): \\&Verify_{VK}(m;\sigma )=1 \} \end{aligned}$$

is negligible in the security parameter k.
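To illustrate the (SignKG, Sign, Verify) interface, the following Python sketch implements Lamport's classic one-time signature over SHA-256 digests. This scheme is chosen here only for illustration and is an assumption of this sketch; the text above does not prescribe a specific one-time signature. All function names are hypothetical.

```python
import hashlib
import secrets

N = 256  # number of digest bits that are signed

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _bits(message: bytes):
    """Bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(N)]

def sign_kg():
    """SK holds two random preimages per bit; VK holds their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(N)]
    vk = [(_h(s0), _h(s1)) for s0, s1 in sk]
    return vk, sk

def sign(sk, message: bytes):
    """Reveal, for each digest bit, the preimage selected by that bit."""
    return [sk[i][b] for i, b in enumerate(_bits(message))]

def verify(vk, message: bytes, sigma) -> bool:
    """Check every revealed preimage against the matching entry of VK."""
    return len(sigma) == N and all(
        _h(s) == vk[i][b]
        for i, (s, b) in enumerate(zip(sigma, _bits(message)))
    )

vk, sk = sign_kg()
sigma = sign(sk, b"server response")
assert verify(vk, b"server response", sigma)
```

A key pair must sign at most one message: each signature reveals half of the signing key, so a second signature under the same key would let an adversary mix revealed preimages.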

4 The Proposed Protocol

In this section, we first describe our APAKE-S protocol, which relies on standard cryptographic primitives, and intuitively explain the rationale behind its design. Next, we give a concrete instantiation of the underlying building blocks. Then, the proposed protocol is compared with other related schemes in terms of security, computation and communication.

4.1 The Description of the APAKE-S Protocol

Let k denote the security parameter. Assume that \(\mathcal{E} = (Gen,Enc,Dec)\) is a CPA-secure public key encryption scheme, \(\mathcal{E}^{\prime } = (Gen',Enc',Dec')\) is a labeled CCA2-secure public key encryption scheme, and \(\varSigma = (SignKG, Sign, Verify)\) is a secure one-time signature scheme. We also use a family of smooth projective hash functions \(\mathcal{H} = (\mathrm {HashKG}, \mathrm {ProjKG}, \mathrm {Hash}, \mathrm {ProjH})\) associated with the CPA-secure public key encryption scheme \(\mathcal{E}\), with an appropriate output length, such as \(l_h = 3k\). Like other password-based constructions in the standard model, our construction relies on the common reference string model. The common reference string consists of the public keys \(pk,pk'\) for \(\mathcal{E},\mathcal{E}^{\prime }\) respectively, and the parameters for the smooth projective hash function family \(\mathcal{H}\). We stress that no participant of the protocol needs to know the secret keys corresponding to the public keys in the common reference string.

Suppose that a client \(C_i\) from a group \(\varvec{\Gamma } = \{C_1,C_2,\ldots ,C_n\}\) wants to authenticate itself to the server anonymously and establish a key with the server. We assume that the client and the server do not directly use the password \(pwd_i\) drawn independently and uniformly at random from the dictionary D. Instead, every client \(C_i, 1\le i \le n\) binds his index i in \(\varvec{\Gamma }\) and his password \(pwd_i\) together with a collision-resistant hash function \(\mathcal{G}(\cdot )\) as \(pw_i = \mathcal{G}(i, pwd_i)\). From here on we denote by \(pw_i\) and \(\mathbf{pw}_S = \{pw_i\}\) the hashed passwords held by the client \(C_i\) and the trusted server S, respectively. The APAKE-S protocol is depicted in Fig. 1, and the concrete steps are as follows.

  1. The client \(C_i\) picks a string \(r\mathop {\leftarrow }\limits ^{\$} \{0,1\}^k\) uniformly at random and computes a ciphertext of its password \(pw_i\) as \(c = Enc_{pk}(pw_i; r)\). Then, it sends the message \(\langle S,c \rangle\) to the trusted server S.

  2. Upon receiving the message \(\langle S,c \rangle\), the server first chooses n independent hash keys \(\mathbf{hk} = (h{k_1},h{k_2}, \ldots ,h{k_n})\) according to the key generation algorithm \(\mathrm {HashKG}\). Next, for every \(j=1,2,\ldots ,n\), it computes \(h{p_j}=\mathrm{{ProjKG}}(h{k_j};c,p{w_j})\), \(t{k_j}||t{p_j} = \mathrm{Hash}(h{k_j}; c,p{w_j})\) and \({\delta _j} = t{p_1} \oplus t{p_j}\), where \({tk_j} \in \{0,1\}^k\) and \(t{p_j}\in \{0,1\}^{2k}\). The server sets \(\mathbf{hp} = (h{p_1},\ldots ,h{p_n}, {\delta _2},\ldots ,{\delta _n})\) and \({\tau _S}||s{k_S} = tp_1\). Then, it generates a key pair \((VK,SK)\leftarrow SignKG(1^k)\) for a one-time signature scheme, sets \(label = S||c||\mathbf{hp}||VK\), computes a tuple of ciphertexts \(\mathbf{c}^{\prime } = ({{c}^{\prime }_1},{{c}^{\prime }_2}, \ldots ,{{c}^{\prime }_n})\) such that \({{c}^{\prime }_j} = Enc^{\prime }_{pk'}(p{w_j};label;t{k_j})\), and computes \(\sigma = Sign_{SK}(\mathbf{c}' )\). Finally, the server sends the message \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\) to the client.

  3. When the client \(C_i\) receives the message \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\), it first verifies that \(\sigma\) is a valid signature on \(\mathbf{c}'\) under VK. Then, it selects from \(\mathbf{hp}\) the projected key \(hp_i\) corresponding to \(C_i\)’s index in \(\varvec{\Gamma }\). Using the password \(pw_i\) and the random string r associated with the ciphertext c, it computes the hash value through the projected hash algorithm as \(t{k_i}||t{p_i} = \mathrm{{ProjH}}(h{p_i};c,p{w_i};{r})\). If the client \(C_i\) is the first user in group \(\varvec{\Gamma }\), it sets \(tp = tp_1\); otherwise, it sets \(tp = t{p_i} \oplus {\delta _i}\). After that, the client \(C_i\) sets \({\tau _U}||s{k_U} = tp\) and \(label = S||c||\mathbf{hp}||VK\), and verifies that the i-th ciphertext in \(\mathbf{c}'\) satisfies \({c^{\prime }_i} = Enc^{\prime }_{pk'}(p{w_i};label;t{k_i})\). If the verification fails, the client simply aborts this session; otherwise, it sends the message \(\langle \tau _U\rangle\) to the server, sets the state as accepted and outputs \(sk_U\) as its session key.

  4. Upon receipt of the message \(\langle \tau _U\rangle\) from the client, the server verifies whether \(\tau _U = \tau _S\). If the verification fails, the server simply aborts this session; otherwise, it sets the state as accepted and outputs the session key \(sk_S\).

Fig. 1 The anonymous PAKE protocol APAKE-S

Correctness It is straightforward to see that, by the correctness of the smooth projective hash functions, a client and a server who hold matching passwords compute the same hash value \(tk_i || tp_i\). Since \(tp_i \oplus \delta _i = tp_1\), they also obtain the same authenticator \(\tau _U = \tau _S\) and the same session key \(sk_U = sk_S\).
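The role of the masks \(\delta _j\) in this correctness argument can be sketched in a few lines of Python. The sketch replaces the hash outputs \(tp_j\) by random byte strings and assumes, purely for illustration, that \(tp_1\) is split evenly into \(\tau\) and sk; the function names and the toy length `K` are hypothetical.

```python
import secrets

K = 16  # toy parameter: tau and sk are K bytes each, so tp_j has 2*K bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def server_masks(tp):
    """tp[j-1] stands in for the server's tp_j.
    Returns (delta_2, ..., delta_n) and the pair (tau_S, sk_S) cut from tp_1."""
    deltas = [xor(tp[0], tp_j) for tp_j in tp[1:]]
    return deltas, (tp[0][:K], tp[0][K:])

def client_unmask(i, tp_i, deltas):
    """Client with 1-based index i recovers tp_1 from its own tp_i."""
    tp = tp_i if i == 1 else xor(tp_i, deltas[i - 2])
    return tp[:K], tp[K:]

# Every client index leads to the same (tau, sk) as the server.
n = 5
tp = [secrets.token_bytes(2 * K) for _ in range(n)]
deltas, server_view = server_masks(tp)
for i in range(1, n + 1):
    assert client_unmask(i, tp[i - 1], deltas) == server_view
```

Because every legitimate client ends up with the same \(tp_1\), the final message \(\tau _U\) does not depend on the client's index.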

Remark 1

In the above protocol, every client uses a hashed password \(\mathcal{G}(i,pwd_i)\) which binds the index i to its original password \(pwd_i\). This technique is employed to restrict the adversary’s advantage in impersonating a client when sending the first message. If the original password were used, an active adversary would have an advantage n times larger in impersonating a legitimate client, since the server would accept whenever the adversary correctly guesses any one of the passwords in the list \(\mathbf{pwd}_S = \{pwd_i\}_{1\le i \le n}\).
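A minimal Python sketch of the index binding in Remark 1 is given below. SHA-256 is used as a stand-in for the collision-resistant hash function \(\mathcal{G}\), and the fixed-width encoding of the index is an assumption made here so that the pair (i, pwd) is parsed unambiguously; neither choice is taken from the paper.

```python
import hashlib

def hashed_password(i: int, pwd: str) -> bytes:
    """pw_i = G(i, pwd_i), with a 4-byte big-endian index prefix."""
    data = i.to_bytes(4, "big") + pwd.encode("utf-8")
    return hashlib.sha256(data).digest()

# Two clients who happen to share a password still hold different pw values,
# so one online guess can match at most one entry of the server's list.
assert hashed_password(1, "letmein") != hashed_password(2, "letmein")
```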

4.2 Design Rationale

The idea of our construction is inspired by the 1-out-of-n oblivious transfer (OT) protocol recently introduced by Abdalla et al. [32]. In their protocol, a chooser C who wants to obtain the i-th string from a database \((m_1, \ldots , m_n)\) first sends a ciphertext \(c = Enc_{pk}(i;r)\) to the database owner S. Then S selects n hash keys \(\{hk_j\}\), computes the projected keys \(\{hp_j\}\) as well as the hash values \(\{K_j = Hash(hk_j, c, j)\}\), and masks the database as \(\{M_j = K_j \oplus m_j\}\). Upon receiving \(\{hp_j\}, \{M_j\}\), the chooser recomputes \(K_i\) using the witness r. Therefore, he can recover the i-th string as \(m_i = M_i \oplus K_i\), while all the other strings are kept secret from him by the smoothness property. Unfortunately, since the proof of their protocol depends on the existence of authenticated channels, it cannot be applied directly to obtain an APAKE protocol.
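The message flow of this OT idea can be sketched in Python as follows. The sketch is a toy under several assumptions that are not taken from [32]: ElGamal over a tiny group serves as the encryption scheme, the index j is encoded as the group element \(g^j\), the hash value is used directly as an integer mask, and all function names are hypothetical.

```python
import secrets

# Toy group (assumption, insecure): P = 2Q + 1, G = 4 has prime order Q.
P, Q, G = 2039, 1019, 4

def enc(h, m, r):
    """ElGamal encryption of group element m with randomness r."""
    return pow(G, r, P), (pow(h, r, P) * m) % P

def sphf_hash(hk, c, m):
    """Hash(hk; c, m) = u^a * (v / m)^b for the ElGamal language."""
    a, b = hk
    u, v = c
    return (pow(u, a, P) * pow((v * pow(m, -1, P)) % P, b, P)) % P

def sender(h, c, database):
    """For each entry m_j return (hp_j, M_j) with M_j = K_j XOR m_j."""
    response = []
    for j, m_j in enumerate(database, start=1):
        hk = (secrets.randbelow(Q), secrets.randbelow(Q))
        hp = (pow(G, hk[0], P) * pow(h, hk[1], P)) % P
        k_j = sphf_hash(hk, c, pow(G, j, P))   # index j encoded as g^j
        response.append((hp, k_j ^ m_j))
    return response

def chooser_recover(i, r, response):
    """Recompute K_i = hp_i^r from the witness r and unmask entry i."""
    hp_i, masked_i = response[i - 1]
    return masked_i ^ pow(hp_i, r, P)

h = pow(G, secrets.randbelow(Q - 1) + 1, P)     # public key from the CRS
database = [11, 22, 33, 44]
i, r = 3, secrets.randbelow(Q)
c = enc(h, pow(G, i, P), r)                      # chooser's first message
assert chooser_recover(i, r, sender(h, c, database)) == database[i - 1]
```

APAKE-S follows the same pattern, with the hashed password \(pw_j\) taking the place of the index j and the hash value being used for authentication rather than for masking a database entry.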

Notice that the PAKE protocols proposed in [5, 33] establish a session key between a client and a server in an authenticated way. One might want to combine these two kinds of protocols directly, with the PAKE protocol providing the secure channel and the OT protocol providing client anonymity. However, this is not trivial either. The main difficulty is that, in the above PAKE protocols [5, 33], the server needs the client’s identity explicitly, and this identity plays an essential role in guaranteeing the security of the PAKE protocol, so it cannot simply be removed.

We solve this problem by adopting the following techniques. Firstly, in order to provide server authentication while preserving the secrecy of the passwords, we have the server compute a ciphertext \(c^{\prime }_j\) for each \(pw_j, j=1,2,\ldots , n\). Since a legitimate client can correctly compute only one hash value among those computed by the server, only the ciphertext \(c^{\prime }_i\) can be verified by the client \(C_i\). The remaining ciphertexts leak no information about other clients’ passwords, even under active attacks, due to the CCA2 security of the encryption scheme. Secondly, a one-time signature scheme is applied to bind all the ciphertexts in \(\mathbf{c}^{\prime }\) together, which prevents the adversary from modifying part of the authenticated message sent by the server. To illustrate this point, consider the following attack when the signature scheme is removed. An adversary who wants to find out whether the client in communication is a particular client of interest (say \(C_i\)) can simply modify the response message from the server by changing all the ciphertexts \(\{ c^{\prime }_{j} \}_{1\le j \le n, j \ne i}\) except \(c'_{i}\). If the client \(C_i\) happens to be the user in communication, no inconsistency will be detected and a normal response message will be sent by the client; if not, an error will be reported by the client.

On the client side, a CPA-secure ciphertext is sent as the first message to protect the client’s password and identity. Finally, each hash value is divided into two sub-strings \(tk_j || tp_j\) that serve different purposes. These values are computationally independent if the client does not know the corresponding password, due to the smoothness property of the hash functions, but are related to each other if both parties share the same password. Therefore, the client is authenticated if it sends back a valid authenticator \(\tau _U = \tau _S\).

4.3 Instantiation of the Primitives

The APAKE-S protocol is essentially a generic construction, since it is based on generic building blocks, such as public key encryption scheme and smooth projective hash function family, instead of specific number-theoretic primitives. In Sect. 5, we also reduce the security of the protocol to the cryptographic properties of these generic primitives. Consequently, as these underlying primitives could be efficiently realized based on either the decisional Diffie–Hellman (DDH), Quadratic Residuosity or N-Residuosity assumptions [36], the proposed protocol can also be instantiated under such assumptions, generating a series of new APAKE protocols provably secure in the standard model.

In order to illustrate the feasibility of our construction, in the following we demonstrate a kind of instantiation based on the DDH assumption.

4.3.1 Public Key Encryption Scheme

The CPA secure public key encryption scheme \(\mathcal{E} = (Gen,Enc,Dec)\) can be instantiated with the ElGamal encryption scheme [39]. It consists of the following algorithms. The algorithm Gen takes as input the security parameter \(1^k\) and generates a group \(\mathbb {G}\) of prime order p with a generator g. It also chooses a random value x from \(\mathbb {Z}_p^{*}\) and sets \(h=g^x\). The public key is \(pk=(\mathbb {G},p,g,h)\) and the secret key is \(sk = x\). To encrypt a message \(m \in \mathbb {G}\) with a random value r, we compute \({c}=Enc_{pk}(m;r)=(u = g^r, v=h^r\cdot m)\). The decryption algorithm computes \(m=v/u^x\).
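As an illustration only, the three ElGamal algorithms above can be sketched in plain Python. The sketch assumes a toy safe prime P = 2Q + 1 (matching the form \(p=2q+1\) used in the experiments) and works in the subgroup of prime order Q, so Q plays the role of the group order p in the description; the concrete numbers and function names are hypothetical and far too small for real use.

```python
import secrets

# Toy parameters (assumption): safe prime P = 2*Q + 1, and G = 4 generates
# the subgroup of prime order Q in Z_P^*. Not secure, illustration only.
P, Q, G = 2039, 1019, 4

def gen():
    """Gen: sample x in Z_Q^* and return (h, x) with h = g^x."""
    x = secrets.randbelow(Q - 1) + 1
    return pow(G, x, P), x

def enc(h, m, r=None):
    """Enc_pk(m; r) = (u = g^r, v = h^r * m) for a message m in the subgroup."""
    if r is None:
        r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(h, r, P) * m) % P

def dec(x, c):
    """Dec_sk(c): m = v / u^x, using u^(Q - x) as the inverse of u^x."""
    u, v = c
    return (v * pow(u, Q - x, P)) % P

h, x = gen()
m = pow(G, 123, P)            # encode the message as a subgroup element
assert dec(x, enc(h, m)) == m
```

Passing the randomness r explicitly mirrors the notation \(Enc_{pk}(m;r)\), which matters later because r is the witness used by the projective hash function.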

Next, the CCA secure public key encryption scheme \(\mathcal{E}^{\prime } = (Gen',Enc',\) \(Dec')\) could be instantiated using the Cramer-Shoup encryption scheme [35], with slight variation to obtain a labeled scheme. The key generation algorithm \(Gen^{\prime }\) generates a group \(\mathbb {G}\), selects at random \((g_1,g_2)\in \mathbb {G}^2\), \((x_1,x_2,y_1,y_2,z)\in \mathbb {Z}_{p}^{5}\), sets \(c=g_1^{x_1}g_2^{x_2}, d=g_1^{y_1}g_2^{y_2}, f=g_1^z\). It also chooses randomly a hash function \(H_k(\cdot )\) in a collision-resistant hash family. The public key is \(pk^{\prime } = (\mathbb {G},p,g,g_1,g_2,c,d,f,H_k(\cdot ))\) and the secret key is \(sk^{\prime }=(x_1,x_2,y_1,y_2,z)\). The encryption algorithm computes the ciphertext as \(c = Enc^{\prime }_{pk^{\prime }}(m;l;r) = (u_1=g_1^{r},u_2=g_2^r,e=f^r\cdot m, v=(cd^\xi )^r)\), where \(\xi = H_k(l,u_1,u_2,e)\). The decryption algorithm \(Dec^{\prime }\) takes \(sk', l, c\) and computes \(\xi = H_k(l,u_1,u_2,e)\) at first, then checks whether \(u_1^{x_1+\xi y_1} \cdot u_2^{x_2+\xi y_2} = v\). If the verification is valid, it outputs \(m = e/u_1^z\); otherwise, it outputs \(\perp\).
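The labeled Cramer-Shoup variant can be sketched in the same style. This is a minimal sketch under stated assumptions: the toy group parameters, the fixed generators, and the use of SHA-256 reduced modulo Q as a stand-in for \(H_k\) are all choices made for this illustration, and a real implementation would also check that received ciphertext components lie in the group.

```python
import hashlib
import secrets

P, Q = 2039, 1019             # toy safe-prime group (assumption), subgroup order Q
G1, G2 = 4, 9                 # two subgroup generators (squares modulo P)

def hash_to_zq(label, u1, u2, e):
    """Stand-in for H_k(l, u1, u2, e): SHA-256 reduced modulo Q (assumption)."""
    data = repr((label, u1, u2, e)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def gen():
    """Gen': sample (x1, x2, y1, y2, z) and publish c, d, f."""
    x1, x2, y1, y2, z = (secrets.randbelow(Q) for _ in range(5))
    c = pow(G1, x1, P) * pow(G2, x2, P) % P
    d = pow(G1, y1, P) * pow(G2, y2, P) % P
    f = pow(G1, z, P)
    return (c, d, f), (x1, x2, y1, y2, z)

def enc(pk, m, label, r=None):
    """Enc'_pk'(m; l; r) = (u1, u2, e, v) with v = (c * d^xi)^r."""
    c, d, f = pk
    r = secrets.randbelow(Q) if r is None else r
    u1, u2 = pow(G1, r, P), pow(G2, r, P)
    e = pow(f, r, P) * m % P
    xi = hash_to_zq(label, u1, u2, e)
    v = pow(c * pow(d, xi, P) % P, r, P)
    return u1, u2, e, v

def dec(sk, label, ct):
    """Dec': recompute xi, check v, then output e / u1^z (None stands for bottom)."""
    x1, x2, y1, y2, z = sk
    u1, u2, e, v = ct
    xi = hash_to_zq(label, u1, u2, e)
    check = pow(u1, (x1 + xi * y1) % Q, P) * pow(u2, (x2 + xi * y2) % Q, P) % P
    if check != v:
        return None
    return e * pow(u1, (Q - z) % Q, P) % P

pk, sk = gen()
m = pow(G1, 77, P)
assert dec(sk, "label", enc(pk, m, "label")) == m
```

Because the label enters the hash \(\xi\), a ciphertext is only expected to verify under the label it was created with, which is what allows the protocol to tie each \(c^{\prime }_j\) to the rest of the transcript.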

4.3.2 Smooth Projective Hash Functions

The smooth projective hash function family associated with the ElGamal encryption scheme was first proposed by Gennaro and Lindell [36]. The hash key consists of two random values \(hk = (\eta ,\theta )\in \mathbb {Z}_{p}^{2}\). The corresponding projection key is \(hp=g^{\eta }h^{\theta }\in \mathbb {G}\). For every tuple \((c,m)\in X\), one can compute the hash value via the hash key hk as \(Hash(hk;c,m) = u^\eta \cdot (v/m)^\theta\). For every valid ciphertext c of m with witness r, one can also compute the hash value through the projection key hp as \(ProjH(hp;c,m;r) = hp^r\). One can easily verify that, when \(c = Enc_{pk}(m;r) = (g^r, h^r \cdot m)\), it holds that \(Hash(hk;c,m) = u^\eta \cdot (v/m)^\theta = hp^r= ProjH(hp;c,m;r)\).
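The correctness equation above can be checked numerically. The following sketch assumes the same toy subgroup as before (order Q inside \(\mathbb{Z}_P^{*}\) with P = 2Q + 1); the function names are hypothetical and the parameters are for illustration only.

```python
import secrets

P, Q, G = 2039, 1019, 4       # toy subgroup of prime order Q (assumption)

def hash_kg():
    """HashKG: hk = (eta, theta) sampled from Z_Q^2."""
    return secrets.randbelow(Q), secrets.randbelow(Q)

def proj_kg(hk, h):
    """ProjKG: hp = g^eta * h^theta."""
    eta, theta = hk
    return pow(G, eta, P) * pow(h, theta, P) % P

def inv(a):
    """Inverse modulo the prime P via Fermat's little theorem."""
    return pow(a, P - 2, P)

def hash_(hk, c, m):
    """Hash(hk; c, m) = u^eta * (v/m)^theta, computed with the hash key."""
    eta, theta = hk
    u, v = c
    return pow(u, eta, P) * pow(v * inv(m) % P, theta, P) % P

def proj_hash(hp, r):
    """ProjH(hp; c, m; r) = hp^r, computed with the encryption randomness r."""
    return pow(hp, r, P)

x = secrets.randbelow(Q - 1) + 1
h = pow(G, x, P)                               # ElGamal public key
m = pow(G, 42, P)                              # message as a subgroup element
r = secrets.randbelow(Q)
c = (pow(G, r, P), pow(h, r, P) * m % P)       # c = Enc_pk(m; r)
hk = hash_kg()
assert hash_(hk, c, m) == proj_hash(proj_kg(hk, h), r)
```

The two sides are computed from different secrets: the server side uses hk, while the client side uses only hp and its own randomness r.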

4.3.3 One Time Signature

Recall that one-time signatures are digital signature schemes that only need to satisfy a weak security notion, i.e., unforgeability under a one-time chosen-message attack. They can be constructed from any one-way function [40], as opposed to the trapdoor functions required in the construction of public key signatures like DSA. As a result, the signing and verification algorithms of one-time signatures are usually very efficient in terms of computational complexity. For the instantiation of our protocol, we choose the one-time signature scheme first introduced by Lamport [41]. Lamport signatures can be implemented with any cryptographically secure hash function. Since the security of Lamport signatures depends only on the one-way property of these hash functions, they are cryptographic primitives secure in the standard model.
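For readers unfamiliar with Lamport's construction, a minimal sketch follows. It assumes SHA-256 as the underlying hash function and, as a convenience of this sketch, hashes the message to 256 bits before signing (which additionally relies on collision resistance); the function names are hypothetical and this is not the code used in the experiments.

```python
import hashlib
import secrets

N = 256                       # number of message-digest bits that get signed

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def sign_kg():
    """SignKG: the secret key is N pairs of random strings, the verification key their hashes."""
    sk = [[secrets.token_bytes(32) for _ in range(2)] for _ in range(N)]
    vk = [[H(s) for s in pair] for pair in sk]
    return sk, vk

def bits(msg: bytes):
    """Digest of the message as a list of N bits."""
    digest = int.from_bytes(H(msg), "big")
    return [(digest >> i) & 1 for i in range(N)]

def sign(sk, msg: bytes):
    """Sign: reveal one preimage per digest bit. The key must be used only once."""
    return [sk[i][b] for i, b in enumerate(bits(msg))]

def verify(vk, msg: bytes, sig) -> bool:
    """Verify: each revealed string must hash to the matching half of vk."""
    if len(sig) != N:
        return False
    return all(H(sig[i]) == vk[i][b] for i, b in enumerate(bits(msg)))

sk, vk = sign_kg()
sig = sign(sk, b"hp || c'")
assert verify(vk, b"hp || c'", sig)
```

Signing and verification cost only N hash evaluations each, which is consistent with the efficiency remark above, at the price of large keys and signatures.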

4.3.4 Implementation of the Scheme

We implement the above instantiated APAKE-S protocol in Charm [42], which is a framework developed by Akinyele et al. for rapidly prototyping advanced cryptographic schemes and protocols. Based on the Python language, the Charm framework was designed from the ground up to minimize development time and code complexity while promoting the reuse of components.

The implementation of our APAKE-S protocol is executed on a personal computer equipped with an Intel Core\(^\mathrm{TM}\) i7 CPU [email protected] GHz, 4.0 GB RAM, Ubuntu 14.04, Python 3.6 and Charm-crypto framework v0.43. We run our tests on \(\texttt{integergroup}\)s with different security parameters, i.e., with different sizes of primes of the form \(p=2q+1\), as well as different sizes of legitimate groups \(\varvec{\Gamma }\) of clients. The implementation results are shown in Table 1, where the running times are obtained by executing the test routine 100 times and taking the average value. As shown in the table, the APAKE-S protocol is fairly practical, especially on the client side.
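The averaging procedure described above can be sketched as follows. This is not the test harness used for Table 1; the helper name and the stand-in workload (a single toy modular exponentiation) are assumptions made purely for illustration.

```python
import time

def average_runtime(routine, runs=100):
    """Mean wall-clock time in seconds of routine() over `runs` executions."""
    start = time.perf_counter()
    for _ in range(runs):
        routine()
    return (time.perf_counter() - start) / runs

def toy_routine():
    # Stand-in workload (assumption): one modular exponentiation with toy parameters.
    pow(4, 1019, 2039)

avg = average_runtime(toy_routine)
print(f"average running time: {avg:.9f} s")
```

In an actual measurement, the routine would be the client-side or server-side computation of one protocol run.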

Table 1 Average running times of the APAKE-S protocol (unit: second)

4.4 Performance Comparisons

In this subsection, we compare our APAKE-S protocol with other password-only APAKE schemes, in terms of security, computational and communication efficiency.

Security Comparisons As shown in Table 2, the APAKE-S protocol offers stronger security guarantees than the existing protocols [16, 22, 24, 43]. More specifically, the APAKE-S protocol is proven secure in the standard model, i.e., without the random oracle heuristic, whereas all the existing APAKE protocols are analyzed in the random oracle model. Note that the random oracle is used as an ideal abstraction of the cryptographic hash function in these schemes. It requires such strong randomness assumptions that no function computable by a PPT algorithm can instantiate a true random oracle [29]. We thus believe that protocols with provable security in the standard model provide stronger security guarantees than their counterparts proven secure only in the random oracle model. Additionally, the APAKE-S protocol achieves explicit mutual authentication between the anonymous client and the server, which prevents outside attackers from successfully impersonating either the server or the clients.

Another factor that may affect the security of these protocols is the underlying assumption. We note that the existing protocols [16, 22, 24, 43] depend on the computational Diffie–Hellman (CDH) assumption or its variations, such as the square computational Diffie–Hellman (SCDH) assumption [22], the decisional inverted-additive Diffie–Hellman (DIADH) assumption [22] and the chosen target CDH assumption [24]. In comparison, the APAKE-S protocol can be proven secure under any of the decisional Diffie–Hellman (DDH) assumption, the quadratic residuosity (QR) assumption, or the N-residuosity (NR) assumption. Having more options is preferable from the viewpoint of system management.

Table 2 Comparisons of security among APAKE protocols

Efficiency Comparisons With respect to computational complexity, we only list the most time-consuming operations, such as modular exponentiation and the computation of hash and projective hash values with respect to a smooth projective hash function family. In particular, Exp indicates the time needed for the computation of one modular exponentiation, and Enc, Dec, Sign, Verify respectively denote the time used in the computation of symmetric encryption, decryption, one-time signature and verification operations. The detailed comparisons are illustrated in Table 3.

First note that the APAKE-S protocol is quite efficient in terms of communication: it requires only three rounds while providing (explicit) mutual authentication. Recall that this achieves the optimal bound for mutually authenticated PAKE protocols, even without the anonymity property. The SL-APAKE protocol [43] needs 4 rounds of communication. The VEAP protocol requires only 3 rounds, but a client has to retrieve some information from the server’s public bulletin board before the actual communication.

Table 3 Comparisons of efficiency among APAKE protocols

As a password-based protocol with proven security in the standard model, the APAKE-S protocol provides a stronger security guarantee but is less efficient than protocols secure in the random oracle model. However, we emphasize that it is still fairly practical. On the one hand, compared to existing APAKE protocols, the APAKE-S protocol has a lower expansion rate in terms of computational complexity relative to the associated two-party PAKE protocols (see Table 4). In particular, on the client side, its complexity is almost the same as that of the underlying PAKE protocols presented by Jiang et al. [5] and Groce et al. [33] in the standard model. By contrast, in the previously proposed APAKE protocols [16, 22, 24], which are respectively based on the underlying PAKE protocols AuthA [44] or SPEKE [23], a client has to perform 2 or 3 times more computation. Similar to the previously proposed APAKE protocols [16, 22, 24], the computation cost on the server side is relatively high but is linear in the total number of users n. On the other hand, when all the operations are instantiated as in Sect. 4.3, the computational complexity on the client side is only roughly 2 times greater than that of the solutions [22] in the random oracle model.

Table 4 Comparison of expansion rates of computation complexity

5 Security of the Protocol

In this section, we show that the APAKE-S protocol described in Sect. 4.1 provides AKE security, mutual authentication and client anonymity.

5.1 AKE Security

Informally, AKE security means that a PPT adversary cannot tell apart a real session key and a random key, unless it guesses correctly the password used by the session.

Theorem 1

Let \(\mathcal{E} = (Gen,Enc,Dec)\) be a CPA-secure public key encryption scheme, \(\mathcal{E}^{\prime } = (Gen',Enc',Dec')\) be a labeled CCA2-secure public key encryption scheme, and \(\varSigma = (SignKG, Sign, Verify)\) be a secure one-time signature scheme. Assume that \(\mathcal{H} = (\mathrm {HashKG},\) \(\mathrm {ProjKG}, \mathrm {Hash}, \mathrm {ProjH})\) is a family of smooth projective hash functions associated with the encryption scheme \(\mathcal{E}\). Then the protocol APAKE-S in Fig. 1 achieves AKE-security as defined in Sect. 2.

Proof

Let \(\mathcal{A}\) be an adversary attacking the AKE-security of the protocol, with time complexity at most t and asking \(q_{exe},q_{send},q_{rev}\) queries respectively to the Execute, Send, Reveal oracles. We bound the advantage of \(\mathcal{A}\) through a hybrid argument. In detail, we view the protocol execution as a game played between the adversary and a simulator who simulates all the honest oracle instances, and incrementally define a sequence of games \(G_0, G_1, \ldots , G_{10}\). Denote by \(Adv_{i}(\mathcal{A})\) the advantage of adversary \(\mathcal{A}\) in game \(G_i\). The upper bound of \(Adv_{0}(\mathcal{A})\) is then estimated by bounding the difference between the adversary’s advantages in successive games as well as its advantage in the last game.

Game \(G_0\) This game corresponds to the real attack, in which all the oracle instances as specified in the security model are available to the adversary. By definition,

$$\begin{aligned} Adv_{{\texttt{APAKE-S}},D}^{AKE}(\mathcal{A}) = Adv_{0}(\mathcal{A}). \end{aligned}$$
(1)

Game \(G_1\) In this game, the Execute oracles are modified as follows. If a query of the form \(\texttt{Execute}( C_i, {\rho }, S, {\delta } )\) is called, we compute \(c = Enc_{pk}(pw_0)\) instead, where \(pw_0\) is a password not in the dictionary D but in the plaintext space of the encryption scheme \(\mathcal{E}\) (Such a \(pw_0\) is called a dummy password). In order to keep the consistency, the values \(\tau _{U}||sk_{U}\) on the client side are set to be equal to \(\tau _{S}||sk_{S}\) computed by the server. The remainder of the transcript is computed as in the previous game.

Lemma 1

\(\left| Adv_{1}(\mathcal{A}) - Adv_{0}(\mathcal{A})\right| \le negl(k).\)

Proof

First observe that in the \(\texttt{Execute}\) oracles, the verification procedures can be removed as they always succeed except with negligible probability. Next, we define \(G_0^{(\eta )}\) (\(0 \le \eta \le q_{exe}\)) to be a sequence of hybrid variants of game \(G_0\) such that the first \(\eta\) \(\texttt{Execute}\) queries are answered according to \(G_1\), and the remaining queries are answered according to \(G_0\). It is clear that games \(G_0^{(0)}\) and \(G_0^{(q_{exe})}\) are equivalent to \(G_0\) and \(G_1\) respectively. If the adversary’s advantage gap between games \(G_0\) and \(G_1\) is non-negligible, there would exist an \(\eta\) such that the adversary’s advantage gap between \(G_0^{(\eta -1)}\) and \(G_0^{(\eta )}\) is non-negligible. Then we build an adversary \(\mathcal{M}\) that breaks the encryption scheme \(\mathcal{E}\) with non-negligible advantage as follows.

Upon receiving the public key pk of the encryption scheme \(\mathcal{E}\) from its challenger, adversary \(\mathcal{M}\) initializes the public parameters for the APAKE-S protocol, chooses passwords for all clients and a bit \(b \in \{0,1\}\) for answering the \(\texttt{Test}\) oracle. Then, it simulates \(\texttt{Execute}, \texttt{Send}, \texttt{Reveal}\) and \(\texttt{Test}\) oracles for the adversary \(\mathcal{A}\) exactly as in game \(G_0^{(\eta )}\) except for the \(\eta\)-th \(\texttt{Execute}\) oracle, say \(\texttt{Execute}( C_i, {\rho }, S, {\delta } )\). In response to this query, \(\mathcal{M}\) provides \(pw_i\) and \(pw_0\) to its challenger to obtain a challenging ciphertext \(c^{*}\) that is either \(Enc_{pk}(pw_i)\) or \(Enc_{pk}(pw_0)\), and uses this value as the ciphertext c computed by the client to answer the \(\texttt{Execute}( C_i, {\rho }, S, {\delta } )\) query. At last, \(\mathcal{M}\) checks whether \(\mathcal{A}\) succeeds or not. If \(\mathcal{A}\) succeeds in this hybrid game, then \(\mathcal{M}\) outputs 1; otherwise, it outputs 0.

It is obvious that the distinguishing advantage of \(\mathcal{M}\) is exactly equal to the adversary \(\mathcal{A}\)’s advantage gap between \(G_0^{(\eta -1)}\) and \(G_0^{(\eta )}\). Then, the lemma is a direct consequence of the fact that the encryption scheme \(\mathcal{E}\) is CPA secure.
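For completeness, the hybrid step can be written as a telescoping bound. Here \(Adv_{0}^{(\eta )}(\mathcal{A})\) denotes the adversary’s advantage in the hybrid \(G_0^{(\eta )}\) and \(Adv_{\mathcal{E}}^{cpa}(\mathcal{M}_{\eta })\) denotes the CPA advantage of the distinguisher built for index \(\eta\); both symbols are notation introduced only for this sketch.

$$\begin{aligned} \left| Adv_{1}(\mathcal{A}) - Adv_{0}(\mathcal{A})\right| = \left| Adv_{0}^{(q_{exe})}(\mathcal{A}) - Adv_{0}^{(0)}(\mathcal{A})\right| \le \sum _{\eta =1}^{q_{exe}} \left| Adv_{0}^{(\eta )}(\mathcal{A}) - Adv_{0}^{(\eta -1)}(\mathcal{A})\right| = \sum _{\eta =1}^{q_{exe}} Adv_{\mathcal{E}}^{cpa}(\mathcal{M}_{\eta }) \le q_{exe}\cdot negl(k). \end{aligned}$$

Since \(q_{exe}\) is polynomial in the security parameter k, the right-hand side is still negligible.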

Game \(G_2\) In this game, we continue to modify the response to the Execute oracles. Recall that first messages of the transcripts in these oracles have been changed into ciphertexts of the dummy password \(pw_0\) in the last game. Now, we choose \(\{ tk_j || tp_j \}_{1 \le j \le n}\) as independently random strings, rather than as smooth projective hash values computed from the hash key \(hk_j\) and the word \((c,pw_j)\).

Lemma 2

\(\left| Adv_{2}(\mathcal{A}) - Adv_{1}(\mathcal{A})\right| \le negl(k).\)

Proof

The proof of this lemma relies on the properties of the smooth projective hash function family. Since the ciphertexts c computed on the client side in these Execute oracles have been replaced by ciphertexts of the dummy password \(pw_0\), for every \(j \in \{1,2,\ldots , n\}\) it holds that \((c,pw_j) \notin L_{pw_j}\), and therefore the output of the hash function \(\mathrm{Hash} (hk_j; c, pw_j)\) is negligibly close to uniform even if the projection key \(hp_j\) is given. Since at most \(q_{exe}\) Execute oracles are handled in this game and \(q_{exe}\) is polynomial in the security parameter k, the lemma follows immediately.
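The smoothness property used here can be observed directly in a group small enough to enumerate. The sketch below assumes the ElGamal-based hash function of Sect. 4.3.2 over a tiny subgroup of order 11 (all concrete numbers and names are hypothetical): it fixes a projection key, lists every hash key consistent with it, and compares the hash values obtained on an encryption of a dummy value with those obtained on an encryption of the real value.

```python
from collections import Counter

P, Q, G = 23, 11, 4           # tiny subgroup of prime order 11 (assumption)

x = 3                         # ElGamal secret key
h = pow(G, x, P)              # public key h = g^x
r = 5                         # encryption randomness
m_real = pow(G, 2, P)         # plays the role of pw_j
m_dummy = pow(G, 7, P)        # plays the role of pw_0

def proj_key(eta, theta):
    return pow(G, eta, P) * pow(h, theta, P) % P

def hash_value(eta, theta, c, m):
    u, v = c
    return pow(u, eta, P) * pow(v * pow(m, P - 2, P) % P, theta, P) % P

c_dummy = (pow(G, r, P), pow(h, r, P) * m_dummy % P)   # (c, m_real) is not in the language
c_valid = (pow(G, r, P), pow(h, r, P) * m_real % P)    # (c, m_real) is in the language

hp = proj_key(1, 1)           # the projection key seen by the adversary
# Every hash key (eta, theta) that is consistent with the published hp.
consistent = [(e, t) for e in range(Q) for t in range(Q) if proj_key(e, t) == hp]

values_dummy = Counter(hash_value(e, t, c_dummy, m_real) for e, t in consistent)
values_valid = {hash_value(e, t, c_valid, m_real) for e, t in consistent}

assert len(values_dummy) == Q and set(values_dummy.values()) == {1}
assert values_valid == {pow(hp, r, P)}
```

On the dummy ciphertext, each of the 11 group elements arises from exactly one consistent hash key, so the hash value is uniform given hp; on the valid ciphertext, all consistent keys agree on \(hp^r\).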

Game \(G_3\) We modify game \(G_2\) to \(G_3\) with the following difference in \(\texttt{Execute}\) queries. When the ciphertext \(c^{\prime }_j\) is needed, we compute it as \(c^{\prime }_j = Enc^{\prime }_{pk^{\prime }}(pw_0; label;tk_j)\) instead of as an encryption of the password \(pw_j\) for every \(j \in \{1,2, \ldots , n\}\). Recall that \(tk_j\) has already been replaced by a uniformly random string in the previous game. By a standard hybrid argument similar to that in the proof of Lemma 1, with a reduction to the CPA security of the public key encryption scheme \(\mathcal{E}^{\prime }\) (which follows from its CCA2 security), we obtain the following lemma.

Lemma 3

\(\left| Adv_{3}(\mathcal{A}) - Adv_{2}(\mathcal{A})\right| \le negl(k).\)

Till now, we have finished modifying \(\texttt{Execute}\) oracles, such that all the transcripts generated by these oracles as well as their session keys are independent of the passwords shared between the clients and the server. From the next game on, we begin to deal with those oracles which receive \(\texttt{Send}\) queries. Before we start, let us introduce the following notations. Let \(\texttt{Send}_0(U,\delta ,S)\) denote an initial query that activates the client instance \(U^{\delta }\) to initiate the protocol with the server S. Denote by \(\texttt{Send}_d(U,\delta ,m)\) the \(\texttt{Send}(U,\delta ,m)\) query in which m is sent to instance \(U^{\delta }\) as the d-th round message, where \(d=1,2,3\).

Consider a query \(\texttt{Send}_2(C_i,\rho ,\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle )\) with respect to some client instance \(C_i^{\rho }\). If there exists a server instance \(S^{\delta }\) that has output the message \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\) in response to a query \(\texttt{Send}_1(S,\delta ,\langle S,c \rangle )\), and that \(\langle S,c \rangle\) is exactly the message sent by instance \(C_i^\rho\) previously, we say that the message \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\) in the query is match-generated; otherwise, it is called non-match-generated.

Game \(G_4\) Before we start to deal with Send oracles, we first rule out the possibility of forging a valid one-time signature. In detail, if at any point of the game the adversary outputs a forged one-time signature with respect to any \(\texttt{Send}_1\)-oracle-generated verification key, we abort the game. We claim that this event occurs only with negligible probability due to the security of the one-time signature scheme, so the adversary’s advantages in this game and the previous one differ by at most a negligible amount.

Lemma 4

\(\left| Adv_{4}(\mathcal{A}) - Adv_{3}(\mathcal{A})\right| \le negl(k).\)

Proof

It is straightforward to reduce the result of this lemma to the security of the one-time signature scheme \(\varSigma\). Consider the following forger \(\mathcal{F}\), who is provided with a verification key VK of \(\varSigma\) and the associated signing oracle \(Sign_{SK}(\cdot )\). Denote by \(q_s\) the upper bound on the number of \(\texttt{Send}_1\) oracles queried by the adversary \(\mathcal{A}\). The forger \(\mathcal{F}\) picks uniformly at random \(s \in \{ 1,2, \ldots , q_s \}\). Then, it simulates the protocol execution for \(\mathcal{A}\) exactly as in the game \(G_3\), except that the s-th \(\texttt{Send}_1\) query is modified as follows. The invocation of the SignKG algorithm in this oracle query of \(\mathcal{A}\) is answered with VK. When a signature is needed for this oracle, \(\mathcal{F}\) queries its own signing oracle once to obtain this signature. If the adversary \(\mathcal{A}\) outputs a forged signature with respect to the verification key VK of the s-th \(\texttt{Send}_1\) oracle, the forger \(\mathcal{F}\) outputs it as its own forgery.

It is easy to see that \(\mathcal{F}\) perfectly simulates the protocol execution as in game \(G_3\). Since s is selected uniformly at random at the beginning of the simulation, the probability that the s-th session is the one chosen by \(\mathcal{A}\) for its forgery is at least \(1/q_s\). Therefore, if the adversary \(\mathcal{A}\)’s advantage gap between games \(G_3\) and \(G_4\) is a non-negligible value \(\epsilon\), the advantage of \(\mathcal{F}\) in violating the security of the one-time signature scheme \(\varSigma\) is at least \(\epsilon /q_s\), which is non-negligible. This completes the proof of the lemma.

Game \(G_5\) In this game, we consider the client instances that receive \(\texttt{Send}_2\) queries. From here on we record the secret keys \(sk,sk'\) when the corresponding public keys \(pk,pk'\) in the common reference string are generated. Then, if a query \(\texttt{Send}_2(C_i,\rho ,\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle )\) is asked and \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\) is non-match-generated, we first verify the signature \(\sigma\) in this message. If this signature is invalid, we abort this session as usual. Otherwise, if this signature is valid, we set \(label = S||c||\mathbf{hp}||VK\) and decrypt \(c^{\prime }_{i}\) to get \(pw_i^{\prime } = Dec^{\prime }_{sk'} (c^{\prime }_{i};label)\). If \(pw_i^{\prime }\) equals the password \(pw_i\) held by client \(C_i\), we announce the success of the adversary \(\mathcal{A}\) and halt the game; otherwise, we let the client instance \(C_i^{\rho }\) reject this query as in the previous games.

Lemma 5

\(Adv_{4}(\mathcal{A}) \le Adv_{5}(\mathcal{A}).\)

Proof

The game \(G_{5}\) differs from the game \(G_4\) only in the situation that \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\) is non-match-generated, but the signature is valid and \(c^{\prime }_{i}\) decrypts to the right password. In such a case, the verification procedure in game \(G_4\) might fail if \(tk_i\) is not correctly recovered; however, the adversary is always declared successful in game \(G_{5}\). Therefore, the adversary’s advantage can only increase.

Game \(G_{6}\) In this game, we make a technical change to how the temporary values \(tk_i || tp_i\) in client instances are generated. If a query \(\texttt{Send}_2(C_i,\rho ,\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle )\) is asked to some client instance \(C_i^{\rho }\) and the message \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\) is match-generated by some server instance \(S^{\delta }\), then we set \(tk_i || tp_i\) on the client side \(C_i^{\rho }\) to be equal to the values computed by the server instance \(S^{\delta }\). It is evident that this modification does not change the adversary’s view, so its advantage is exactly equal to that in game \(G_{5}\).

Lemma 6

\(Adv_{6}(\mathcal{A}) = Adv_{5}(\mathcal{A}).\)

Game \(G_{7}\) This game modifies the way \(\texttt{Send}_0\) queries are handled. If a \(\texttt{Send}_0\) query, say \(\texttt{Send}_0(C_i,\rho ,S)\), is asked, we compute c as a ciphertext of the dummy password, i.e., \(c = Enc_{pk}(pw_0)\), rather than as a ciphertext of the actual password \(pw_i\) owned by \(C_i\).

Lemma 7

\(\left| Adv_{7}(\mathcal{A}) - Adv_{6}(\mathcal{A})\right| \le negl(k).\)

Proof

We use a standard hybrid argument similar to the one used in the proof of Lemma 1, where the total number of hybrids is upper bounded by \(q_{send}\) since only \(\texttt{Send}_0\) queries are treated. Define \(G_{6}^{(\eta )}\), where \(0 \le \eta \le q_{send}\), to be the hybrid in which the first \(\eta\) \(\texttt{Send}_0\) queries are answered as in \(G_{7}\) and the remaining queries are answered according to \(G_{6}\). It follows that \(G_{6}^{(0)} = G_{6}\) and \(G_{6}^{(q_{send})} = G_{7}\). Assuming that there exists some \(\eta\) such that \(\mathcal{A}\)’s advantage in game \(G_{6}^{(\eta -1)}\) differs non-negligibly from that in \(G_{6}^{(\eta )}\), we construct an adversary \(\mathcal{M}\) against the CPA security of \(\mathcal{E}\) as follows.

Upon receiving the public key pk for \(\mathcal{E}\), \(\mathcal{M}\) generates the key pair \((pk',sk')\) for the encryption scheme \(\mathcal{E}'\) on its own, records \(sk'\) and uses \((pk,pk')\) as the CRS for the protocol. It then initializes all the other parameters and simulates all the honest oracles exactly as in the hybrid \(G_{6}^{(\eta )}\) except for the \(\eta\)-th \(\texttt{Send}_0\) oracle. In response to it, \(\mathcal{M}\) queries its own challenge oracle with the message pair consisting of the real password \(pw_i\) and the dummy password \(pw_0\). When a ciphertext \(c^{*}\) is returned, \(\mathcal{M}\) uses it in place of c to answer the \(\texttt{Send}_0\) query. Note that all the other oracles can be perfectly simulated since \(\mathcal{M}\) knows \(sk'\) and all the passwords. At the end of the game, \(\mathcal{M}\) outputs 1 if \(\mathcal{A}\) succeeds and 0 otherwise.

The adversary \(\mathcal{A}\)’s advantage gap between \(G_{6}^{(\eta -1)}\) and \(G_{6}^{(\eta )}\) is equivalent to \(\mathcal{M}\)’s advantage. Thus, the CPA security of the encryption scheme \(\mathcal{E}\) yields the lemma.

Game \(G_{8}\) In this game, we modify the way \(\texttt{Send}_1\) queries are answered. When a query \(\texttt{Send}_1(S,\delta ,\langle S,c \rangle )\) is called such that \(\langle S,c \rangle\) is not generated by some client instance as a response to some \(\texttt{Send}_0\) or \(\texttt{Execute}\) query, we compute \(pw^{*} = Dec_{sk}(c)\) and check if \(pw^{*}\) is equal to some password in the list \(\mathbf{pw}_S\). If so, the adversary is directly announced successful and the game is halted. Obviously, the modification made in this game simply increases the probability that an adversary succeeds. Then, we have

Lemma 8

\(Adv_{7}(\mathcal{A}) \le Adv_{8}(\mathcal{A}).\)

Game \(G_{9}\) This game deals with those \(\texttt{Send}_1\) queries that have not been changed in the last game. Precisely, if a query \(\texttt{Send}_1(S,\delta ,\langle S,c \rangle )\) is called and the adversary is not declared successful as in game \(G_{8}\), we replace \(\{ tk_j || tp_j \}_{1 \le j \le n}\) in the server instance \(S^{\delta }\) by independently random strings.

Lemma 9

\(\left| Adv_{9}(\mathcal{A}) - Adv_{8}(\mathcal{A})\right| \le negl(k)\).

Observe that all honestly generated c have been computed as ciphertexts of the dummy password \(pw_0\) since game \(G_{7}\). It follows that \((c,pw_j) \notin L_{pw_j}\) always holds in game \(G_{9}\), and hence the value \(\mathrm{Hash} (hk_j; c, pw_j)\) is close to uniform for every such c even when the projection key \(hp_j\) is known. Combining this with the fact that at most \(q_{send}\) server instances are modified here, we obtain the lemma.

Game \(G_{10}\) In this game, we again modify the server instances that receive a \(\texttt{Send}_1\) query. More specifically, if a query of the form \(\texttt{Send}_1(S,\delta ,\langle S,c \rangle )\) is called, we now compute \(c^{\prime }_j\) in the server instance \(S^{\delta }\), for every \(1 \le j \le n\), as a ciphertext of the dummy password \(c^{\prime }_j = Enc^{\prime }_{pk'}(pw_0;label)\), rather than as a ciphertext of the actual password \(pw_j\).

Lemma 10

\(\left| Adv_{10}(\mathcal{A}) - Adv_{9}(\mathcal{A})\right| \le negl(k).\)

Proof

The proof of this lemma is similar to the proof of Lemmas 1 and 7. However, the main difference is that we have to resort to the CCA2 security of the encryption scheme \(\mathcal{E}'\), since the decryption oracle is needed in response to the \(\texttt{Send}_2\) queries. Details are as follows. Recall that there are at most \(n\cdot q_{send}\) ciphertexts of the scheme \(\mathcal{E}'\) needed in response to the \(\texttt{Send}_1\) queries. Denote by \(G_{9}^{(\eta )}\), where \(0 \le \eta \le n\cdot q_{send}\), the hybrid game in which the first \(\eta\) ciphertexts of \(\mathcal{E}^{\prime }\) in response to \(\texttt{Send}_1\) queries are computed as in game \(G_{10}\) and the remainder are computed as in \(G_{9}\). We now construct an adversary \(\mathcal{M}\) against the CCA2 security of \(\mathcal{E}'\).

When \(\mathcal{M}\) is given the public key \(pk'\) of the scheme \(\mathcal{E}'\), it generates \((pk,sk)\) for \(\mathcal{E}\), records sk, sets the CRS as \((pk,pk')\) and simulates all the honest oracles as specified in hybrid \(G_{9}^{(\eta )}\) except for the \(\eta\)-th ciphertext in response to \(\texttt{Send}_1\) queries. It queries its own challenge oracle with \(label^{*} = S||c||\mathbf{hp}||VK\) and a message pair consisting of the actual password \(pw_j\) and the dummy password \(pw_0\). Once a challenge ciphertext \(c^{*}\) is returned, \(\mathcal{M}\) uses it in place of \(c^{\prime }_j\) in this oracle. When a \(\texttt{Send}_2\) query is asked, the adversary \(\mathcal{M}\) asks its decryption oracle to conduct the modified verification procedure introduced in game \(G_{5}\). Note that \(\mathcal{M}\) never requests to decrypt the pair \((label^{*},c^{*})\) since decryption is only needed when a non-match-generated message \(\langle \mathbf{hp},VK, \mathbf{c}',\sigma \rangle\) is received. At the end of the game, \(\mathcal{M}\) outputs 1 if \(\mathcal{A}\) succeeds and outputs 0 otherwise. The adversary \(\mathcal{A}\)’s advantage gap between \(G_{9}^{(\eta -1)}\) and \(G_{9}^{(\eta )}\) is equivalent to \(\mathcal{M}\)’s advantage. Since the encryption scheme \(\mathcal{E}^{\prime }\) is CCA2 secure, we conclude that the lemma holds.

Now, let us bound the adversary’s advantage in game \(G_{10}\). If the adversary has not guessed the right password, it can only succeed by outputting the right bit b used by the \(\texttt{Test}\) oracle as defined in Sect. 2. Notice that all the session keys as well as all the transcripts generated by the honest oracles have been modified to be independent of the actual passwords owned by the clients. It is obvious that the probability of the adversary’s success in this scenario is exactly 1/2. On the other hand, it is easy to see that the adversary has probability at most \(1/|D|\) of correctly guessing the password per \(\texttt{Send}_1\) or \(\texttt{Send}_2\) query as in games \(G_{5}\) and \(G_{8}\), under the assumption that the hash function \(\mathcal{G}\) is collision-resistant. Consequently, its advantage gained from such cases is bounded by \(q_{send}{/}|D|\). Therefore, \(Adv_{10}(\mathcal{A}) \le q_{send}{/}|D|\). Combining this with the results from Lemmas 1 to 10, we conclude that

$$\begin{aligned} Adv_{{\texttt{APAKE-S}},D}^{AKE}(\mathcal{A}) \le \frac{ q_{send}}{|D|} + negl(k). \end{aligned}$$
(2)

This means that the adversary’s advantage is only negligibly larger than the advantage obtained from on-line guessing attacks. This completes the proof of Theorem 1.
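Spelled out, the combination of Lemmas 1 to 10 with the bound on \(Adv_{10}(\mathcal{A})\) amounts to the following chain, where each \(negl(k)\) absorbs the negligible terms accumulated so far:

$$\begin{aligned} Adv_{0}(\mathcal{A})&\le Adv_{4}(\mathcal{A}) + negl(k) \quad \text{(Lemmas 1 to 4)} \\&\le Adv_{5}(\mathcal{A}) + negl(k) = Adv_{6}(\mathcal{A}) + negl(k) \quad \text{(Lemmas 5 and 6)} \\&\le Adv_{7}(\mathcal{A}) + negl(k) \le Adv_{8}(\mathcal{A}) + negl(k) \quad \text{(Lemmas 7 and 8)} \\&\le Adv_{10}(\mathcal{A}) + negl(k) \quad \text{(Lemmas 9 and 10)} \\&\le \frac{q_{send}}{|D|} + negl(k). \end{aligned}$$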

5.2 Mutual Authentication

The mutual authentication property implies that no one can impersonate a legitimate participant, either an honest client or the trusted server, unless it guesses the correct password. Particularly, by means of authentication, a client is sure that he is communicating with the trusted server; the server is convinced that the party with whom he is interacting belongs to some pre-specified group.

Theorem 2

Under the same assumption as in Theorem 1, the APAKE-S protocol provides explicit mutual authentication between clients and the trusted server.

Proof

Using the same argument as in the proof of Theorem 1, we can easily carry out the proof of this theorem. That is, the same sequence of games, from the real game to a game in which the adversary’s advantage is easily bounded, is employed. The difference lies in how we consider the adversary successful in the last game. More precisely, in cases where the adversary has guessed the correct password and sent fake ciphertexts as in games \(G_5\) and \(G_8\), it is considered successful as before. However, if the adversary has not guessed the right password, it is still considered successful if the following event FakeAuth happens.

  • FakeAuth: A query of the form \(\texttt{Send}_3 (S, \delta , \tau )\) is asked and is accepted, but the authenticator \(\tau\) was not output by any matching session of the instance \(S^\delta\).

When \(\tau\) was not output by any matching session of the instance \(S^\delta\), it must be independent of the adversary’s view, since it has been changed into a uniformly random string in game \(G_9\). Hence the equation \(\tau = \tau _S\) holds, and thus the event FakeAuth happens, only with negligible probability. Consequently, for any PPT adversary \(\mathcal{A}\) against mutual authentication of the APAKE-S protocol, it holds that \(Adv_{10}(\mathcal{A}) \le q_{send}/|D| + negl(k)\) and thus

$$\begin{aligned} Adv_{{\texttt{APAKE-S}},D}^{MA}(\mathcal{A}) \le \frac{ q_{send}}{|D|} + negl(k). \end{aligned}$$
(3)
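The negligible bound on FakeAuth used in this proof can be made explicit under one additional assumption that is not stated in the text: that the authenticator is an \(\ell\)-bit string, with \(\ell\) growing with the security parameter k (for instance \(\ell = k\)). Since the expected authenticator \(\tau _S\) is uniformly random in game \(G_9\), each \(\texttt{Send}_3\) query is accepted with probability \(2^{-\ell }\), and a union bound over at most \(q_{send}\) such queries gives the following sketch:

$$\begin{aligned} \Pr [\textsf{FakeAuth}] \le \sum _{t=1}^{q_{send}} \Pr [\tau ^{(t)} = \tau _S] = \frac{q_{send}}{2^{\ell }} = negl(k). \end{aligned}$$

Here \(\tau ^{(t)}\) is hypothetical notation for the authenticator submitted in the t-th \(\texttt{Send}_3\) query.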

5.3 Client Anonymity

The anonymity property guarantees that, although a high-entropy session key can be established by means of a low-entropy password, no information about the client’s actual identity is revealed to the server, except that he is a legitimate user in the group.

Theorem 3

Under the same assumption as above, the APAKE-S protocol achieves client anonymity against the trusted server.

Proof

The proof of this theorem is similar to, and simpler than, the proof of Theorem 1. According to the definition of client anonymity, we only have to consider an honest-but-curious server who follows the APAKE-S protocol honestly but tries to distinguish two transcripts \(\varPi _{i_0}\) and \(\varPi _{i_1}\). That is to say, we only need to consider the modifications of the \(\texttt{Execute}\) oracles made in games \(G_1\) to \(G_3\). The detailed modifications are as follows. Let us start with the transcript \(\varPi _{i}\), with either \(i = i_0\) or \(i = i_1\). We first modify the simulation of the \(\texttt{Execute}\) oracles by replacing the ciphertext \(c=Enc_{pk}(pw_{i})\) by \(c=Enc_{pk}(pw_0)\), then change \(\{tk_j||tp_j\}_{1\le j \le n}\) into uniformly random values, and finally compute \(c^{\prime }_j = Enc^{\prime }_{pk'}(pw_0; label; tk_j)\) for every \(j \in \{1, 2, \ldots , n\}\). The resulting transcripts no longer depend on i, so their distributions are identical no matter which transcript we start from. Also note that, according to Lemmas 1–3, the resulting distribution is computationally indistinguishable from the distributions of both \(\varPi _{i_0}\) and \(\varPi _{i_1}\). Combining the above facts, we obtain that \(\varPi _{i_0}\) and \(\varPi _{i_1}\) are computationally indistinguishable, which proves the theorem.
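The final step of this argument is a standard hybrid argument and can be written out as follows. The notation is introduced here only for illustration: \(\varPi ^{*}\) denotes the modified transcript built from \(pw_0\) and uniformly random values, and \(\mathcal{B}\) denotes an arbitrary PPT distinguisher playing the role of the honest-but-curious server. By the triangle inequality,

$$\begin{aligned} \big | \Pr [\mathcal{B}(\varPi _{i_0}) = 1] - \Pr [\mathcal{B}(\varPi _{i_1}) = 1] \big | \le \big | \Pr [\mathcal{B}(\varPi _{i_0}) = 1] - \Pr [\mathcal{B}(\varPi ^{*}) = 1] \big | + \big | \Pr [\mathcal{B}(\varPi ^{*}) = 1] - \Pr [\mathcal{B}(\varPi _{i_1}) = 1] \big | \le negl(k). \end{aligned}$$

Each of the two terms on the right-hand side is negligible because \(\varPi ^{*}\) is reached from either starting transcript by the same sequence of computationally indistinguishable modifications.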

6 Conclusions

In this paper, we presented a generic construction for password-only APAKE and proved its security in the standard model. The construction achieves AKE security, preserves client anonymity, and provides mutual authentication.