DID-Based End-to-End Encrypted Communication Protocol

1. Background

End-to-End Encryption (E2EE) is an encryption method that keeps information encrypted for the whole of its transmission between sender and receiver, thereby preventing third parties (including network service providers, man-in-the-middle attackers, and the servers themselves) from gaining unauthorized access to plaintext content.

In the Agent Network Protocol Technical White Paper, we proposed an end-to-end encrypted communication technology based on DID. This article details the implementation of this technology.

2. Solution Overview

This solution draws on TLS, blockchain, and other high-security technologies that have been proven in practice. By combining these technologies, we designed a DID-based end-to-end encrypted communication solution that can be used for secure encrypted communication between users on different platforms.

We designed a set of DID-based message routing mechanisms and short-term key negotiation mechanisms on top of the WebSocket protocol. Parties holding DIDs can use the public keys in their DID documents along with their private keys to negotiate short-term keys using ECDHE (Elliptic Curve Diffie-Hellman Ephemeral), and then use these keys to encrypt messages for secure communication during the key’s validity period. ECDHE ensures that messages cannot be maliciously decrypted even if they are forwarded through third-party message proxies or other intermediaries.

We chose the WebSocket protocol because it is widely used on the internet with many available infrastructure components, which is crucial for early promotion of the solution. At the same time, because we designed end-to-end encryption on top of WebSocket, we don’t need to use the WebSocket Secure protocol, avoiding the problem of redundant encryption and decryption.

Our current solution essentially replaces transport layer encryption with application layer encryption, allowing us to reduce the difficulty of protocol promotion while leveraging existing infrastructure.

The overall process is shown in the following diagram:

[Diagram: end-to-end-encryption]

Note: The third-party message service is optional; users can run their own message services instead.

Currently, we only support the WebSocket protocol because it is a bidirectional protocol. In the future, we will consider supporting the HTTP protocol to expand to more scenarios. At the same time, we will also consider implementing our end-to-end encryption solution at the transport layer, which will allow it to be used in more scenarios.

3. Encrypted Communication Process

Suppose there are two users on different platforms, A (DID-A) and B (DID-B). Both can obtain each other’s DID documents through the DID SERVER, and the DID documents contain their respective public keys.

For A and B to communicate securely, they first need to initiate a short-term key creation process. This process is similar to how TLS generates temporary encryption keys. The key has a validity period, and before it expires, the short-term key creation process needs to be initiated again to generate and update new keys.

Once A and B have the negotiated short-term keys, if A wants to send a message to B, A can encrypt the message using the key and then use the message sending protocol to send it to B through the message server. Upon receipt, B uses the key ID in the message to find the short-term key stored previously, and then decrypts the encrypted message. If the corresponding key is not found, or the key has expired, an error message is sent to notify A to initiate the short-term key update process. After the short-term key is updated, the message is sent again.

Client (A)                                      Client (B)
|                                                 |
| -- Initiate Short-term Key Creation Process --> |
|                                                 |
|      (Create Temporary Encryption Key)          |
|                                                 |
| <---- Temporary Key Created ----                |
|                                                 |
|       (Key has an expiration time)              |
|                                                 |
|      (Monitor Key Validity)                     |
|                                                 |
|   (Before expiration, restart creation process) |
|                                                 |
| (A and B now have a negotiated short-term key)  |
|                                                 |
| ---- Encrypted Message ---->                    |
|                                                 |
|     (Encrypt message using short-term key)      |
|     (Send via message server)                   |
|                                                 |
| <---- Receive Encrypted Message ----            |
|                                                 |
|     (Find stored key using key ID)              |
|     (Decrypt message)                           |
|      (If key not found or expired)              |
|                                                 |
| <---- or Error Message ----                     |
|                                                 |
|      (Notify A to update short-term key)        |
|                                                 |
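The receiver-side lookup described above can be sketched as a small in-memory key store. This is a minimal illustration under stated assumptions, not part of the protocol: the names ShortTermKey, KeyStore, add, and lookup are hypothetical, and a None result stands for the "key not found or expired" case in which B would send an error so that A restarts the short-term key negotiation.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ShortTermKey:
    key: bytes          # negotiated symmetric key
    expires_at: float   # absolute expiry time, seconds since the epoch


class KeyStore:
    """Hypothetical in-memory store of short-term keys, indexed by key ID."""

    def __init__(self) -> None:
        self._keys: Dict[str, ShortTermKey] = {}

    def add(self, key_id: str, key: bytes, ttl_seconds: float,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._keys[key_id] = ShortTermKey(key, now + ttl_seconds)

    def lookup(self, key_id: str, now: Optional[float] = None) -> Optional[bytes]:
        """Return the key, or None if it is unknown or has expired."""
        now = time.time() if now is None else now
        entry = self._keys.get(key_id)
        if entry is None:
            return None
        if now >= entry.expires_at:
            # An expired key is discarded together with its key ID
            del self._keys[key_id]
            return None
        return entry.key


# Fixed clock values keep the example deterministic
store = KeyStore()
store.add("0123456789abcdef", b"\x00" * 16, ttl_seconds=864000, now=0.0)
found = store.lookup("0123456789abcdef", now=10.0)
expired = store.lookup("0123456789abcdef", now=864000.0)
missing = store.lookup("ffffffffffffffff", now=10.0)
```

Passing the clock in as a parameter is only a convenience for testing; a real implementation would also need persistence and concurrency handling, which are out of scope here.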

4. Short-term Key Negotiation Process

The short-term key creation process closely follows the key exchange in TLS 1.3. It uses ECDHE (Elliptic Curve Diffie-Hellman Ephemeral), a variant of the Diffie-Hellman key exchange protocol based on elliptic curves. ECDHE combines Elliptic Curve Cryptography (ECC) with ephemeral keys so that two parties can agree on encryption keys over an insecure network.

Brief description of the ECDHE process:

Key Pair Generation:
- Both the client and server each generate a temporary elliptic curve key pair, including a private key and a public key.
Public Key Exchange:
- The client sends its generated public key to the server.
- The server sends its generated public key to the client.
Shared Key Calculation:
- The client uses its private key and the server’s public key to calculate a shared key.
- The server uses its private key and the client’s public key to calculate a shared key.
- Due to the properties of the elliptic curve Diffie-Hellman algorithm, these two calculations result in the same shared key.
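The three ECDHE steps above can be illustrated with the Python cryptography library. This is a minimal sketch that runs both sides in one process and assumes the secp256r1 curve; the variable names are illustrative only.

```python
from cryptography.hazmat.primitives.asymmetric import ec

# Key pair generation: each side creates a temporary (ephemeral) key pair
source_private = ec.generate_private_key(ec.SECP256R1())
destination_private = ec.generate_private_key(ec.SECP256R1())

# Public key exchange: only the public halves travel over the network
source_public = source_private.public_key()
destination_public = destination_private.public_key()

# Shared key calculation: own private key combined with the peer's public key
source_shared = source_private.exchange(ec.ECDH(), destination_public)
destination_shared = destination_private.exchange(ec.ECDH(), source_public)

# Both sides arrive at the same secret without ever sending a private key
secrets_match = source_shared == destination_shared
```

The raw shared secret is not used directly as an encryption key; it is normally passed through a key derivation function first.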

Differences between our process and the TLS process:

The entire process has only three messages: SourceHello, DestinationHello, and Finished, corresponding to TLS’s ClientHello, ServerHello, and Finished. This is because in our process, there are no clients and servers, only sources and destinations.
Other messages such as EncryptedExtensions, Certificate, and CertificateVerify are not needed. Among them:
- EncryptedExtensions is not needed for now but may be added later to transmit encrypted extensions.
- Certificate and CertificateVerify are not needed. The main purpose of these two messages is to ensure the security of the server’s public key. We verify the correctness of the public key corresponding to the DID through the mapping relationship between the DID address and the public key, meaning that a specific public key has one and only one DID, and a specific DID has one and only one public key.
Finished no longer hashes and encrypts handshake messages, as SourceHello and DestinationHello already include their respective signatures, which can ensure the integrity of the messages.
Source and Destination can initiate multiple short-term key negotiations simultaneously, and multiple keys can exist at the same time, used for encrypting different types of messages.

The overall flow diagram is as follows:

Source (A)                                          Destination (B)
   |                                                    |
   |  ---------------- SourceHello ---------------->    |
   |                                                    |
   |                (includes public key and signature) |
   |                                                    |
   |                                                    |
   |  <------------- DestinationHello ------------      |
   |                                                    |
   |                (includes public key and signature) |
   |                                                    |
   |                                                    |
   |  -------- Finished (includes verify_data) ----->   |
   |                                                    |
   |  <-------- Finished (includes verify_data) ----    |
   |                                                    |
   |                                                    |

5. Protocol Definition

Our protocol is designed based on WebSocket, using JSON format. A DID user’s message receiving address is stored in the DID document, in the “service” field’s endpoint with type “messageService”. (See DID Method Design Specification)
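As an illustration of how a sender might locate the recipient's message receiving address, the sketch below scans a DID document for a service entry of type "messageService". The document fragment, the function name find_message_endpoint, and the URL are hypothetical, and the property name serviceEndpoint is assumed from the W3C DID Core data model; the DID Method Design Specification remains the authority on the actual layout.

```python
from typing import Any, Dict, Optional


def find_message_endpoint(did_document: Dict[str, Any]) -> Optional[str]:
    """Return the endpoint of the first service whose type is "messageService"."""
    for service in did_document.get("service", []):
        if service.get("type") == "messageService":
            return service.get("serviceEndpoint")
    return None


# Hypothetical DID document fragment, for illustration only
example_document = {
    "id": "did:example:123456789abcdefghi",
    "service": [
        {
            "id": "did:example:123456789abcdefghi#message",
            "type": "messageService",
            "serviceEndpoint": "ws://message.example.com/ws",
        }
    ],
}

endpoint = find_message_endpoint(example_document)
```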

5.1 SourceHello Message

The SourceHello message initiates the encrypted communication handshake. It includes the source’s identity information, public key, supported encryption parameters, session ID, version information, and a message signature that provides message integrity and identity verification.

Message Example

{
  "version": "1.0",
  "type": "sourceHello",
  "timestamp": "2024-05-27T12:00:00.123Z",
  "messageId": "randomstring",
  "sessionId": "abc123session",
  "sourceDid": "did:example:123456789abcdefghi",
  "destinationDid": "did:example:987654321abcdefghi",
  "verificationMethod": {
    "id": "did:example:123456789abcdefghi#keys-1",
    "type": "EcdsaSecp256r1VerificationKey2019",
    "publicKeyHex": "04a34b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6"
  },
  "random": "b7e4b4d5f6c4e4f7a6b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6",
  "supportedVersions": ["1.0", "0.9"],
  "cipherSuites": [
    "TLS_AES_128_GCM_SHA256",
    "TLS_AES_256_GCM_SHA384",
    "TLS_CHACHA20_POLY1305_SHA256"
  ],
  "supportedGroups": [
    "secp256r1",
    "secp384r1",
    "secp521r1"
  ],
  "keyShares": [
    {
      "group": "secp256r1",
      "expires": 864000,
      "keyExchange": "0488b21e000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
    },
    {
      "group": "secp384r1",
      "expires": 864000,
      "keyExchange": "0488b21e000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
    }
  ],
  "proof": {
    "type": "EcdsaSecp256r1Signature2019",
    "created": "2024-05-27T10:51:55Z",
    "verificationMethod": "did:example:123456789abcdefghi#keys-1",
    "proofValue": "eyJhbGciOiJFUzI1NksifQ..myEaggpdg0-GflPHibRZWfDEdDOqzZzBcBM5TKvaUzCUSv1_7anUvtgdFXMd12E_qM6RmAAaSWWBGwLY-Srvyg"
  }
}

5.1.1 Field Descriptions

version: String, version number of the current protocol.
type: String, message type, here “sourceHello”.
timestamp: Message sending time, ISO 8601 formatted UTC time string, accurate to milliseconds.
messageId: Unique message id, 16-character random string.
sessionId: String, session id, 16-character random string, valid during one short-term session negotiation.
sourceDid: String, the message source, i.e., the sender’s DID; always fill in the sender’s own DID here.
destinationDid: String, the destination, i.e., the message recipient’s DID; always fill in the recipient’s DID here.
verificationMethod: The public key corresponding to the message sender’s own DID.
- id: String, verification method id.
- type: String, type of public key, refer to DID specification definition.
- publicKeyHex: String, hexadecimal representation of the public key.
random: String, 32-character random string, ensuring uniqueness of the handshake process, participates in key exchange.
supportedVersions: Array, list of protocol versions supported by the sender.
cipherSuites: Array, list of supported cipher suites. Currently supports TLS_AES_128_GCM_SHA256.
supportedGroups: Array, supported elliptic curve groups.
keyShares: Array, contains multiple public key information for key exchange.
- group: String, elliptic curve group used. Currently supports secp256r1.
- keyExchange: String, hexadecimal representation of the public key generated by this end for key exchange.
- expires: Number, validity period of the final encryption key, in seconds. Each side informs the other of its validity period; a difference between the two values does not cause negotiation to fail.
proof:
- type: String, type of signature.
- created: String, time when the signature was created, ISO 8601 formatted UTC time string, accurate to seconds.
- verificationMethod: ID of the verification method used for signing, used to find the verificationMethod at the first level of the JSON.
- proofValue: Uses the private key of sourceDid to sign the message, ensuring message integrity.
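To make the field formats above concrete, the following sketch fills in the parts of a sourceHello message that need no key material. It is an illustration under assumptions: the helper names are hypothetical, and the random strings are drawn from the hexadecimal alphabet, which the field descriptions do not mandate (they only fix the lengths). The verificationMethod, keyShares, and proof fields are left out and would be added afterwards.

```python
import secrets
from datetime import datetime, timezone
from typing import Any, Dict


def utc_timestamp_ms() -> str:
    """ISO 8601 UTC time with millisecond precision, e.g. 2024-05-27T12:00:00.123Z."""
    now = datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"


def random_hex(chars: int) -> str:
    """Random hexadecimal string of the given (even) number of characters."""
    return secrets.token_hex(chars // 2)


def new_source_hello_skeleton(source_did: str, destination_did: str) -> Dict[str, Any]:
    """Build the sourceHello fields that do not depend on any key material."""
    return {
        "version": "1.0",
        "type": "sourceHello",
        "timestamp": utc_timestamp_ms(),
        "messageId": random_hex(16),   # 16-character random string
        "sessionId": random_hex(16),   # 16-character random string
        "sourceDid": source_did,
        "destinationDid": destination_did,
        "random": random_hex(32),      # 32-character random string
        "supportedVersions": ["1.0"],
        "cipherSuites": ["TLS_AES_128_GCM_SHA256"],
        "supportedGroups": ["secp256r1"],
    }


hello = new_source_hello_skeleton(
    "did:example:123456789abcdefghi",
    "did:example:987654321abcdefghi",
)
```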

5.1.2 Process for Generating proofValue

Construct all fields of the sourceHello message, excluding the proofValue field in proof.
Convert the JSON to be signed to a JSON string, using commas and colons as separators, and sorting by keys.
Encode the JSON string as UTF-8 bytes.
Use the Elliptic Curve Digital Signature Algorithm (EcdsaSecp256r1Signature2019) and SHA-256 hash algorithm to sign the byte data.
Add the generated signature value proofValue to the proofValue field in the proof dictionary of the message JSON.

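import base64
import json

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Demo key only (assumption for this example); in practice, use the
# secp256r1 private key that belongs to sourceDid
private_key = ec.generate_private_key(ec.SECP256R1())

# 1. Create all fields of the JSON message, excluding the proofValue field
msg = {
    # Other necessary fields
    "proof": {
        "type": "EcdsaSecp256r1Signature2019",
        "created": "2024-05-27T10:51:55Z",
        "verificationMethod": "did:example:123456789abcdefghi#keys-1"
        # proofValue field excluded
    }
}

# 2. Convert msg to a JSON string, sorted by keys, using commas and colons as separators
msg_str = json.dumps(msg, separators=(',', ':'), sort_keys=True)

# 3. Encode the JSON string as UTF-8 bytes
msg_bytes = msg_str.encode('utf-8')

# 4. Sign the byte data with the private key using ECDSA and SHA-256
#    (the cryptography library returns a DER-encoded signature)
signature = private_key.sign(msg_bytes, ec.ECDSA(hashes.SHA256()))

# 5. Add the URL-safe Base64 signature value to the proof field of the JSON message
msg["proof"]["proofValue"] = base64.urlsafe_b64encode(signature).decode('utf-8')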

5.1.3 Verifying SourceHello Message

Parse the message: The recipient parses the SourceHello message and extracts each field.
Verify DID and public key: Read the sourceDid and the public key in verificationMethod, use the method in the DID method specification to generate a DID from the public key, and confirm whether it matches the sourceDid.
Verify signature: Use the public key corresponding to sourceDid to verify if the signature in the proof field is correct.
Verify other fields: Check the randomness of the random field to prevent replay attacks. Check the created field in proof to ensure the signature time has not expired.
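The signature check in the steps above can be sketched as follows. This is an illustrative round trip under assumptions, not a normative implementation: the helper names are hypothetical, the signature is assumed to be the DER-encoded ECDSA output encoded as URL-safe Base64, and the DID-to-public-key check, the replay check on random, and the expiry check on created are omitted because they depend on the DID method specification and on local policy.

```python
import base64
import json
from typing import Any, Dict

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec


def canonical_bytes(msg: Dict[str, Any]) -> bytes:
    """Serialize msg without proof.proofValue, sorted by keys, compact separators."""
    unsigned = dict(msg)
    unsigned["proof"] = {k: v for k, v in msg["proof"].items() if k != "proofValue"}
    return json.dumps(unsigned, separators=(",", ":"), sort_keys=True).encode("utf-8")


def sign_message(msg: Dict[str, Any], private_key: ec.EllipticCurvePrivateKey) -> None:
    """Sign the canonical form and store the result in proof.proofValue."""
    signature = private_key.sign(canonical_bytes(msg), ec.ECDSA(hashes.SHA256()))
    msg["proof"]["proofValue"] = base64.urlsafe_b64encode(signature).decode("ascii")


def verify_message(msg: Dict[str, Any], public_key: ec.EllipticCurvePublicKey) -> bool:
    """Return True only if proof.proofValue is a valid signature over the message."""
    try:
        signature = base64.urlsafe_b64decode(msg["proof"]["proofValue"])
        public_key.verify(signature, canonical_bytes(msg), ec.ECDSA(hashes.SHA256()))
        return True
    except (KeyError, ValueError, InvalidSignature):
        return False


# Round trip with a throwaway key and a reduced message
key = ec.generate_private_key(ec.SECP256R1())
message = {
    "type": "sourceHello",
    "random": "abc",
    "proof": {"type": "EcdsaSecp256r1Signature2019"},
}
sign_message(message, key)
valid = verify_message(message, key.public_key())

# Any change to a signed field invalidates the signature
message["random"] = "tampered"
tampered_valid = verify_message(message, key.public_key())
```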

5.2 DestinationHello Message

DestinationHello is a handshake message sent by the destination for key exchange. It includes the destination’s identity information, public key, negotiated encryption parameters, session ID, version information, and message signature to ensure message integrity and identity verification.

Message Example

{
  "version": "1.0",
  "type": "destinationHello",
  "timestamp": "2024-05-27T12:00:00.123Z",
  "messageId": "randomstring",
  "sessionId": "abc123session",
  "sourceDid": "did:example:987654321abcdefghi",
  "destinationDid": "did:example:123456789abcdefghi",
  "verificationMethod": {
    "id": "did:example:987654321abcdefghi#keys-1",
    "type": "EcdsaSecp256r1VerificationKey2019",
    "publicKeyHex": "04a34b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6"
  },
  "random": "e4b4d5f6c4e4f7a6b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6b4c8d2e48f37a6c6c6f6d7b7a6e4b4d5f6c4e4f7a6",
  "selectedVersion": "1.0",
  "cipherSuite": "TLS_AES_128_GCM_SHA256",
  "keyShare": {
    "group": "secp256r1",
    "expires": 864000,
    "keyExchange": "0488b21e000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  },
  "proof": {
    "type": "EcdsaSecp256r1Signature2019",
    "created": "2024-05-27T10:51:55Z",
    "verificationMethod": "did:example:987654321abcdefghi#keys-1",
    "proofValue": "eyJhbGciOiJFUzI1NksifQ..myEaggpdg0-GflPHibRZWfDEdDOqzZzBcBM5TKvaUzCUSv1_7anUvtgdFXMd12E_qM6RmAAaSWWBGwLY-Srvyg"
  }
}

5.2.1 Field Descriptions

The fields of the DestinationHello message have essentially the same meaning as in the SourceHello message. The full list follows, with the differences noted on the fields they affect:

version: String, version number of the current protocol.
type: String, message type, here “destinationHello”.
timestamp: Message sending time, ISO 8601 formatted UTC time string, accurate to milliseconds.
messageId: Unique message id, 16-character random string.
sessionId: String, session id, uses the sessionId from the sourceHello message.
sourceDid: String, the message source, i.e., the sender’s DID; always fill in the sender’s own DID here.
destinationDid: String, the destination, i.e., the message recipient’s DID; always fill in the recipient’s DID here.
verificationMethod: The public key corresponding to the message sender’s own DID.
- id: String, verification method id.
- type: String, type of public key, refer to DID specification definition.
- publicKeyHex: String, hexadecimal representation of the public key.
random: String, 32-character random string, ensuring uniqueness of the handshake process, participates in key exchange.
selectedVersion: Selected protocol version number.
cipherSuite: Selected cipher suite, currently supports TLS_AES_128_GCM_SHA256.
keyShare: Information from the destination for key exchange.
- group: String, elliptic curve group used. Currently supports secp256r1.
- keyExchange: String, hexadecimal representation of the public key generated by this end for key exchange.
- expires: Number, validity period set by the destination for the key, in seconds. If it exceeds the validity period in sourceHello, the negotiation initiator may still apply its own validity period: once that period has passed, it refuses messages encrypted with this key, sends an error, and re-initiates negotiation.
proof:
- type: String, type of signature.
- created: String, time when the signature was created, ISO 8601 formatted UTC time string, accurate to seconds.
- verificationMethod: ID of the verification method used for signing, used to find the verificationMethod at the first level of the JSON.
- proofValue: Uses the private key of sourceDid to sign the message, ensuring message integrity.

5.3 Finished Message

In TLS 1.3, the content of the Finished message is the hash value of all previous handshake messages, processed with HMAC (Hash-based Message Authentication Code), to ensure that both parties’ handshake messages have not been tampered with and to prevent replay attacks.

In our process, both sourceHello and destinationHello messages carry signatures, which can ensure that messages cannot be tampered with. The main function of the finished message in our process is to prevent replay attacks. The specific approach is to concatenate the random numbers from the sourceHello and destinationHello messages, hash them to obtain the key ID, and then encrypt it and include it in the finished message. This allows us to prevent replay attacks by decrypting the message and checking if the key IDs match.

Message Example

{
  "version": "1.0",
  "type": "finished",
  "timestamp": "2024-05-27T12:00:00.123Z",
  "messageId": "randomstring",
  "sessionId": "abc123session",
  "sourceDid": "did:example:987654321abcdefghi",
  "destinationDid": "did:example:123456789abcdefghi",
  "verifyData": {
    "iv": "iv_encoded",
    "tag": "tag_encoded",
    "ciphertext": "ciphertext_encoded"
  }
}

5.3.1 Field Descriptions

version: String, version number of the current protocol.
type: String, message type.
timestamp: Message sending time, ISO 8601 formatted UTC time string, accurate to milliseconds.
messageId: Unique message id, 16-character random string.
sessionId: String, session id, uses the sessionId from the sourceHello message.
sourceDid: String, the message source, i.e., the sender’s DID; always fill in the sender’s own DID here.
destinationDid: String, the destination, i.e., the message recipient’s DID; always fill in the recipient’s DID here.
verifyData: Verification data, in AES-GCM mode it carries iv and tag.
- iv: Initialization Vector, a random or pseudorandom byte sequence, typically 12 bytes (96 bits) in length for AES-GCM mode.
- tag: An authentication code generated by AES-GCM mode, used to verify the integrity and authenticity of the data. The tag is typically 16 bytes (128 bits).
- ciphertext: Encrypted data that carries the short-term encryption key ID. See 5.3.2 for the generation method.

5.3.2 verifyData Generation Method

Encrypt the following JSON using the short-term encryption key negotiated in the process to obtain the ciphertext:

{
    "secretKeyId":"0123456789abcdef"
}

Python code example for generating verifyData:

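import base64
import os
from typing import Dict

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

# TLS_AES_128_GCM_SHA256 encryption function
def encrypt_aes_gcm_sha256(data: bytes, key: bytes) -> Dict[str, str]: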
    # Ensure key length is 16 bytes (128 bits)
    if len(key) != 16:
        raise ValueError("Key must be 128 bits (16 bytes).")
    
    # Generate random IV
    iv = os.urandom(12)  # For GCM, recommended IV length is 12 bytes
    
    # Create encryption object
    encryptor = Cipher(
        algorithms.AES(key),
        modes.GCM(iv),
        backend=default_backend()
    ).encryptor()
    
    # Encrypt data
    ciphertext = encryptor.update(data) + encryptor.finalize()
    
    # Get tag
    tag = encryptor.tag
    
    # Encode as Base64
    iv_encoded = base64.b64encode(iv).decode('utf-8')
    tag_encoded = base64.b64encode(tag).decode('utf-8')
    ciphertext_encoded = base64.b64encode(ciphertext).decode('utf-8')
    
    # Create JSON object
    encrypted_data = {
        "iv": iv_encoded,
        "tag": tag_encoded,
        "ciphertext": ciphertext_encoded
    }
        
    return encrypted_data

secretKeyId is the short-term encryption key ID between sourceDid and destinationDid. Later, when transmitting encrypted messages, this key ID will be included to indicate which key was used to encrypt the data. This key ID is only effective during the key’s validity period; when the key exceeds its validity period, the key ID must be discarded.

Key ID (secretKeyId) generation method:

Concatenate the random numbers from sourceHello and destinationHello into a string, with sourceHello first, then destinationHello, with no connector in between.
Convert the string to a byte sequence using UTF-8 encoding.
Initialize HKDF (HMAC-based Extract-and-Expand Key Derivation Function), which is a key derivation function based on HMAC (Hash-based Message Authentication Code), specifying SHA-256 as the hash algorithm; no salt value specified, default is no salt; context information is empty; using default encryption backend. Python example code:

hkdf = HKDF(
    algorithm=hashes.SHA256(),  # Ensure using hash algorithm instance from cryptography library
    length=8,  # Generate 8-byte key
    salt=None,
    info=b'',  # Optional context information to distinguish keys for different purposes
    backend=default_backend()  # Use default encryption backend
)

Use the derive method of HKDF to derive an 8-byte sequence from the input byte sequence.
Encode the derived 8-byte sequence as a 16-character hexadecimal string, which is the secretKeyId.

Python code for secretKeyId generation steps:

from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.backends import default_backend

def generate_16_char_from_random_num(random_num1: str, random_num2: str):
    content = random_num1 + random_num2
    random_bytes = content.encode('utf-8')
    
    # Use HKDF to derive 8 bytes of random number
    hkdf = HKDF(
        algorithm=hashes.SHA256(),  # Ensure using hash algorithm instance from cryptography library
        length=8,  # Generate 8-byte key
        salt=None,
        info=b'',  # Optional context information to distinguish keys for different purposes
        backend=default_backend()  # Use default encryption backend
    )
    
    derived_key = hkdf.derive(random_bytes)
    
    # Encode derived key as hexadecimal string
    derived_key_hex = derived_key.hex()
    
    return derived_key_hex

5.3.3 Finished Message Verification

Decrypt the encrypted data in the message using the negotiated short-term encryption key, extract the secretKeyId from the message, and check if it matches the locally generated secretKeyId.
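The verification step can be sketched with the AESGCM helper from the Python cryptography library. This is a minimal illustration under assumptions: the function names are hypothetical, no associated data is used, and the values are encoded with standard Base64 as in the example in 5.3.2. AESGCM.encrypt returns the ciphertext with the 16-byte tag appended, so the sketch splits and rejoins the two parts to match the verifyData layout.

```python
import base64
import json
import os
from typing import Dict

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def make_verify_data(secret_key_id: str, key: bytes) -> Dict[str, str]:
    """Encrypt {"secretKeyId": ...} and split the result into iv, tag, ciphertext."""
    iv = os.urandom(12)
    plaintext = json.dumps({"secretKeyId": secret_key_id}).encode("utf-8")
    sealed = AESGCM(key).encrypt(iv, plaintext, None)
    ciphertext, tag = sealed[:-16], sealed[-16:]
    return {
        "iv": base64.b64encode(iv).decode("utf-8"),
        "tag": base64.b64encode(tag).decode("utf-8"),
        "ciphertext": base64.b64encode(ciphertext).decode("utf-8"),
    }


def check_verify_data(verify_data: Dict[str, str], key: bytes,
                      expected_key_id: str) -> bool:
    """Decrypt verifyData and compare its secretKeyId with the local one."""
    try:
        iv = base64.b64decode(verify_data["iv"])
        tag = base64.b64decode(verify_data["tag"])
        ciphertext = base64.b64decode(verify_data["ciphertext"])
        plaintext = AESGCM(key).decrypt(iv, ciphertext + tag, None)
        return json.loads(plaintext)["secretKeyId"] == expected_key_id
    except (KeyError, TypeError, ValueError, InvalidTag):
        # Missing fields, bad encoding, wrong key, or tampered data
        return False


short_term_key = os.urandom(16)
verify_data = make_verify_data("0123456789abcdef", short_term_key)
accepted = check_verify_data(verify_data, short_term_key, "0123456789abcdef")
wrong_id = check_verify_data(verify_data, short_term_key, "ffffffffffffffff")
wrong_key = check_verify_data(verify_data, os.urandom(16), "0123456789abcdef")
```

A failed check, whether from a mismatched key ID or a failed decryption, means the handshake should be treated as unsuccessful.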

5.4 Short-term Encryption Key Generation Method

When both the source and destination ends have the sourceHello and destinationHello, they can calculate the short-term encryption key.

Obtain the other party’s public key:
- Extract the other party’s elliptic curve public key from the hexadecimal string (keyExchange).
Generate shared secret:
- Use local private key and the other party’s public key to generate a shared secret through ECDH (Elliptic Curve Diffie-Hellman) algorithm. This step ensures both parties can calculate the same shared secret without directly transmitting private keys.
Determine key length:
- According to the selected cipher suite, determine the length of the encryption key to be generated. For example, TLS_AES_128_GCM_SHA256 corresponds to a key length of 128 bits (16 bytes).
Generate encryption and decryption keys:
- Initialize HKDF extraction phase:
  - First, initialize the HKDF extractor using a specified hash algorithm (such as SHA-256) and an initial salt value (all zero bytes). The HKDF extractor is used to extract a pseudorandom key from the shared secret.
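The steps listed so far (decode the peer's keyExchange value, run ECDH, then feed the shared secret into HKDF with SHA-256 and an all-zero salt) can be sketched as below. Several points are assumptions made only for this illustration: keyExchange is taken to be an uncompressed X9.62 point in hexadecimal, the info label is a hypothetical placeholder, and a single 16-byte key is derived, whereas the protocol may derive separate encryption and decryption keys. In the cryptography library, salt=None means a salt of zero bytes as long as the hash output.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat


def key_exchange_hex(private_key: ec.EllipticCurvePrivateKey) -> str:
    """Hex form of the ephemeral public key (assumed uncompressed X9.62 point)."""
    point = private_key.public_key().public_bytes(
        Encoding.X962, PublicFormat.UncompressedPoint
    )
    return point.hex()


def derive_short_term_key(private_key: ec.EllipticCurvePrivateKey,
                          peer_key_exchange_hex: str,
                          length: int = 16) -> bytes:
    """ECDH followed by HKDF-SHA256; length 16 matches TLS_AES_128_GCM_SHA256."""
    # Obtain the other party's public key from the hexadecimal string
    peer_public = ec.EllipticCurvePublicKey.from_encoded_point(
        ec.SECP256R1(), bytes.fromhex(peer_key_exchange_hex)
    )
    # Generate the shared secret
    shared_secret = private_key.exchange(ec.ECDH(), peer_public)
    # Extract and expand; salt=None is an all-zero salt of hash length
    hkdf = HKDF(
        algorithm=hashes.SHA256(),
        length=length,
        salt=None,
        info=b"hypothetical-short-term-key-label",  # placeholder, not from the spec
    )
    return hkdf.derive(shared_secret)


source_private = ec.generate_private_key(ec.SECP256R1())
destination_private = ec.generate_private_key(ec.SECP256R1())

source_key = derive_short_term_key(source_private, key_exchange_hex(destination_private))
destination_key = derive_short_term_key(destination_private, key_exchange_hex(source_private))
```

Both sides must use the same hash, salt, info, and output length, otherwise the derived keys will differ even though the ECDH shared secret is identical.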

Summary

This solution proposes a DID-based end-to-end encrypted communication technology. By combining TLS, blockchain, and other high-security technologies, we designed a DID-based short-term key negotiation mechanism. This solution secures communication between two end users and prevents third parties from gaining unauthorized access to plaintext content.

Specifically, this solution implements end-to-end encrypted communication on top of the WebSocket protocol, utilizing ECDHE (Elliptic Curve Diffie-Hellman Ephemeral) for short-term key negotiation, ensuring that messages cannot be decrypted even if they are forwarded through intermediaries. We detailed the encrypted communication process, short-term key negotiation process, and protocol definition, including the generation and verification of SourceHello, DestinationHello, and Finished messages.

Although the current version is built on the WebSocket protocol to leverage existing infrastructure, we plan to introduce an end-to-end encryption solution at the TCP or UDP transport layer in the future, further improving transport efficiency and widening the range of applications.

Through this solution, we expect to achieve secure and efficient encrypted communication between users on different platforms, and provide a reliable technical foundation for decentralized identity authentication. Subsequent work will include optimizing the existing protocol, adding more security features, and extending to more application scenarios.

Copyright Notice

Copyright (c) 2024 GaoWei Chang
This file is published under the MIT License, you are free to use and modify it, but you must retain this copyright notice.
