DID-Based Message Service Protocol

1. Background

In the article “DID-Based End-to-End Encrypted Communication Protocol,” we introduced how to perform identity authentication and end-to-end encrypted communication based on DID. When negotiating short-term keys and conducting encrypted communication, two DID users need to be able to find each other and send handshake messages or encrypted messages to each other. This article explains how DID users connect with each other and send messages.

1. Process and Architecture

The overall structure is shown in the figure below:

DID-Based Message Communication Architecture

The above figure shows the overall architecture of DID-based message communication. There are three main participants:

User Server: The user’s backend service, responsible for managing the user’s DID document and helping users send and receive messages.
DID Server: DID document hosting service, responsible for DID creation, query, and other services.
Message Proxy: Message relay service, responsible for sending and receiving messages on behalf of users.

As noted in our DID method design specification, third-party DID hosting services (DID Server) and message services (Message Proxy) are optional: users can run their own services instead, and the process is essentially the same.

2. Connection Between User Server and Message Proxy

The User Server operator selects a message service provider and applies for an API key and a service endpoint on the provider’s website. The API key is used for connection authentication between the User Server and the Message Proxy. The service endpoint can be written into the DID document to indicate that this DID uses the message service at that endpoint.

Message services generally use WSS (WebSocket Secure) long connections, requiring the User Server to actively connect to the Message Proxy and maintain heartbeats.

After connecting to the Message Proxy, the User Server must provide the routing information for the connection (the router in the DID document, which is itself a DID) together with signatures that prove ownership of that router. The Proxy binds the router, the WSS connection, and the DID together; afterward, all messages for that DID are delivered to the Server over this WSS connection. The benefit of this design is that the Server only needs to register a few routers after connecting, rather than sending every DID to the Proxy, which is particularly useful when there are a large number of DIDs.

A single router can be bound to multiple WSS connections, in which case the Proxy load-balances across them when delivering messages.
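The binding described above can be pictured as a small in-memory table on the Proxy side. The following is a minimal sketch, not the actual Proxy implementation: the class name RouterTable, its methods, and the choice of round-robin as the load-balancing strategy are all assumptions made for illustration, since the text only states that load balancing takes place.

```python
from collections import defaultdict


class RouterTable:
    """Hypothetical Proxy-side table: router DID -> bound WSS connections."""

    def __init__(self):
        self._connections = defaultdict(list)  # router DID -> list of connections
        self._cursors = {}                     # router DID -> next round-robin index

    def bind(self, router_did, connection):
        """Bind a connection to a router after ownership has been verified."""
        if connection not in self._connections[router_did]:
            self._connections[router_did].append(connection)

    def unbind(self, connection):
        """Remove a closed connection from every router it was bound to."""
        for conns in self._connections.values():
            if connection in conns:
                conns.remove(connection)

    def pick(self, router_did):
        """Return the next connection for this router (round-robin), or None."""
        conns = self._connections.get(router_did)
        if not conns:
            return None
        index = self._cursors.get(router_did, 0) % len(conns)
        self._cursors[router_did] = index + 1
        return conns[index]
```

With two connections bound to the same router, successive calls to pick alternate between them; a router with no live connection yields None, which a Proxy could translate into an error response.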

The process of router registration, binding, and sending messages based on DID is as follows:

Router Registration, Binding, and DID-Based Message Sending Process

3. Message Proxy Authentication Mechanism

Message Proxy uses API keys to authenticate the User Server. The overall process is similar to the authentication process when the User Server connects to the DID Server (DID Method Design Specification).

3.1 WSS HTTP Request Parameters

Request Headers:

Content-Type: application/json
Authorization: Supports two authentication methods: API Key and token

3.2 WSS HTTP User Authentication

When calling the API, two authentication methods are supported:

Using an API Key for authentication
Using an authentication token for authentication

3.2.1 Obtaining an API Key

Log in to the service provider’s API Keys page to get the latest generated user API Key. The API Key includes both the “user ID” and the “signature key secret”, in the format {id}.{secret}.

3.2.2 Making Requests Using an API Key

Users need to place the API Key in the Authorization header of the HTTP request.

3.2.3 Assembling a Token Using JWT for Requests

The client needs to import the relevant JWT utility classes and assemble the header and payload parts of the JWT as follows.

Header Example

{
  "alg": "HS256",
  "sign_type": "SIGN"
}

alg: Attribute indicating the algorithm used for signing, defaulting to HMAC SHA256 (written as HS256).
sign_type: Attribute indicating the type of token, for JWT tokens it is uniformly written as SIGN.

Payload Example

{
  "api_key": "{ApiKey.id}",
  "exp": 1682503829130,
  "timestamp": 1682503820130
}

api_key: Attribute indicating the user ID, which is the {id} part of the user’s API Key.
exp: Attribute indicating the expiration time of the generated JWT, controlled by the client, in milliseconds.
timestamp: Attribute indicating the current timestamp, in milliseconds.

Example: Token Assembly Process in Python

import time

import jwt  # PyJWT


def generate_token(apikey: str, exp_seconds: int) -> str:
    """Build an HS256-signed token from an API Key of the form {id}.{secret}."""
    try:
        key_id, secret = apikey.split(".")
    except ValueError as e:
        raise ValueError("invalid apikey") from e

    # Read the clock once so that exp and timestamp are consistent
    now_ms = int(round(time.time() * 1000))
    payload = {
        "api_key": key_id,
        "exp": now_ms + exp_seconds * 1000,
        "timestamp": now_ms,
    }

    return jwt.encode(
        payload,
        secret,
        algorithm="HS256",
        headers={"alg": "HS256", "sign_type": "SIGN"},
    )

Placing the Authentication Token in the HTTP Request Header

Users need to place the generated authentication token in the Authorization header of the HTTP request:
- Authorization: Bearer <your_token>
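The source does not describe how the Message Proxy checks an incoming token, so the following is only a sketch of one plausible approach using the Python standard library. The function names sign_token and verify_token and the secrets_by_id lookup table are hypothetical. sign_token is a stdlib stand-in for the PyJWT-based client example so that the block can be run on its own.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _hs256(secret: str, signing_input: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                    hashlib.sha256).digest()


def sign_token(api_key: str, exp_seconds: int) -> str:
    """Client side: assemble header.payload.signature for an {id}.{secret} key."""
    key_id, secret = api_key.split(".", 1)
    now_ms = int(time.time() * 1000)
    header = {"alg": "HS256", "sign_type": "SIGN"}
    payload = {"api_key": key_id, "exp": now_ms + exp_seconds * 1000,
               "timestamp": now_ms}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _b64url_encode(_hs256(secret, signing_input))


def verify_token(token: str, secrets_by_id: dict) -> str:
    """Proxy side (assumed logic): return the API key id or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        payload = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
        if not isinstance(header, dict) or not isinstance(payload, dict):
            raise ValueError("header and payload must be JSON objects")
    except Exception as exc:
        raise ValueError("malformed token") from exc
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    secret = secrets_by_id.get(payload.get("api_key"))
    if secret is None:
        raise ValueError("unknown api key")
    expected = _hs256(secret, header_b64 + "." + payload_b64)
    if not hmac.compare_digest(expected, signature):
        raise ValueError("bad signature")
    exp = payload.get("exp")
    # exp is expressed in milliseconds, as defined in the payload description
    if not isinstance(exp, int) or exp <= int(time.time() * 1000):
        raise ValueError("token expired")
    return payload["api_key"]
```

The signature is compared with hmac.compare_digest to avoid leaking timing information, and exp is checked by hand because it is expressed in milliseconds here.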

4. Connection Between Message Proxies

Message Proxies also need to connect with each other, especially when different platforms use different message services. Proxies connect to each other via WSS and, once connected, notify each other of their respective service endpoints so that they can find each other during routing.

5. Message Sending and Receiving Process

This section describes how a user A can successfully send a message to user B based on user B’s DID. The overall process is shown in the following diagram:

Message Sending and Receiving Process

Process Explanation:

User A obtains user B’s DID through WeChat, SMS, offline channels, etc., and wants to send a message to B. First, A initiates a request to A’s server.
User A’s server receives the request to send a message to B, then sends a request carrying B’s DID to its own Message Proxy (A’s Message Proxy).
A’s Message Proxy receives the message sending request, looks up B’s DID document from the DID Server based on B’s DID to obtain B’s message service endpoint, and caches B’s DID document for faster connection in the future.
A’s Message Proxy initiates a connection to B’s message service endpoint (B’s Message Proxy) and sends the message over.
B’s Message Proxy receives the request, looks up B’s DID document based on B’s DID to obtain B’s message routing information (the router field in the DID document). Based on the message routing, it queries for B Server’s WSS connection and forwards the message.
B’s Server receives the message, verifies and processes it, and sends it to B.
The process for B to subsequently send messages to A is the same as the process for A sending to B.

Note: The first time a DID is accessed, its DID document is usually not yet cached, so the lookup takes longer. Subsequent accesses are served from the cache and are faster.
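The caching behavior mentioned in the note can be sketched as a small wrapper around whatever function resolves a DID document from the DID Server. Everything below is illustrative: the class name DidDocumentCache, the injected resolve callable, and the 300-second time-to-live are assumptions, since the source does not specify a cache lifetime.

```python
import time


class DidDocumentCache:
    """Hypothetical cache of DID documents keyed by DID."""

    def __init__(self, resolve, ttl_seconds=300.0, clock=time.monotonic):
        self._resolve = resolve      # callable: DID -> DID document (network lookup)
        self._ttl = ttl_seconds      # assumed lifetime of a cached document
        self._clock = clock
        self._entries = {}           # DID -> (expiry time, document)

    def get(self, did):
        """Return the cached document, resolving it on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(did)
        if entry is not None and entry[0] > now:
            return entry[1]
        document = self._resolve(did)
        self._entries[did] = (now + self._ttl, document)
        return document

    def invalidate(self, did):
        """Drop a cached document so that the next access re-queries it."""
        self._entries.pop(did, None)
```

The invalidate method is the natural hook for a future DID update notification: on receiving one, a Proxy would drop the cached document and re-query it on next use.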

6. Protocol Definition

As described in the DID-based end-to-end encrypted communication technology, our protocol uses WSS for transport and JSON as the message format.

6.1 Message Service Registration Message

Used by the User Server to register with the Message Proxy. A single registration message can carry multiple routers. Each router entry includes the router DID, a nonce (used to prevent replay attacks), and a proof containing the creation time and a signature computed over that router’s JSON sub-object.

Note: This is a full-replacement interface. Every registration message must carry the complete set of routers; any previously registered router that is missing from the message will be deleted.
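The full-replacement semantics can be illustrated with plain set arithmetic. This is a sketch only: the function name apply_registration is hypothetical, and a real Proxy would verify each proof, nonce, and timestamp before accepting a router.

```python
def apply_registration(bound_routers, message):
    """Return (new_bound, added, removed) for a full-replacement registration.

    bound_routers: set of router DIDs currently bound to the connection.
    message: a parsed "register" message (proof verification is omitted here).
    """
    requested = {entry["router"] for entry in message["routers"]}
    added = requested - bound_routers      # routers to bind
    removed = bound_routers - requested    # routers absent from the message
    return requested, added, removed
```

For instance, if routers r1 and r2 are bound and a new message lists only r2 and r3, then r3 is added and r1 is removed.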

Example:

{
  "version": "1.0",
  "type": "register",
  "timestamp": "2024-05-27T12:00:00.123Z",
  "messageId": "randomstring",
  "routers": [
    {
      "router": "did:all:14qQqsnEPZy2wcpRuLy2xeR737ptkE2Www",
      "nonce": "randomNonceValue1",
      "proof": {
        "type": "EcdsaSecp256r1Signature2019",
        "created": "2024-05-27T10:51:55Z",
        "verificationMethod": "did:example:14qQqsnEPZy2wcpRuLy2xeR737ptkE2Www#keys-1",
        "proofValue": "z58DAdFfa9SkqZMVPxAQpic7ndSayn1PzZs6ZjWp1CktyGesjuTSwRdoWhAfGFCF5bppETSTojQCrfFPP2oumHKtz"
      }
    },
    {
      "router": "did:all:14qQqsnEPZy2wcpRuLy2xeR737ptkE2Www",
      "nonce": "randomNonceValue1",
      "proof": {
        "type": "EcdsaSecp256r1Signature2019",
        "created": "2024-05-27T10:51:55Z",
        "verificationMethod": "did:example:14qQqsnEPZy2wcpRuLy2xeR737ptkE2Www#keys-1",
        "proofValue": "z58DAdFfa9SkqZMVPxAQpic7ndSayn1PzZs6ZjWp1CktyGesjuTSwRdoWhAfGFCF5bppETSTojQCrfFPP2oumHKtz"
      }
    }
  ]
}

6.1.1 Field Descriptions

version: Protocol version
type: Request type, indicating that this is a request to register a router, with the value “register”.
timestamp: Request timestamp, indicating when the request was sent, an ISO 8601 formatted UTC time string, accurate to milliseconds
messageId: Unique message id, 16-character random string
routers: An array containing multiple router objects, each router object includes the following fields:
- router: The DID of the router, used to identify the router.
- nonce: Random value, 32-byte string, used to prevent replay attacks, ensuring the uniqueness of each request.
- proof: Same as the proof field in the SourceHello of the DID-based end-to-end encrypted communication technology. The signature covers only the current router object.
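The envelope fields described above can be produced with the standard library alone. The sketch below is illustrative: the helper names are hypothetical, and drawing messageId and nonce from an alphanumeric alphabet is an assumption, since the source only fixes their lengths (16 characters and 32 bytes). The proof is left out here and would be attached to each router entry separately.

```python
import secrets
import string
from datetime import datetime, timezone

_ALPHABET = string.ascii_letters + string.digits  # assumed character set


def utc_timestamp_ms() -> str:
    """Current UTC time as an ISO 8601 string with millisecond precision."""
    now = datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"


def random_string(length: int) -> str:
    """Cryptographically random ASCII string of the given length."""
    return "".join(secrets.choice(_ALPHABET) for _ in range(length))


def new_router_entry(router_did: str) -> dict:
    """Router entry without its proof; a fresh 32-character nonce per request."""
    return {"router": router_did, "nonce": random_string(32)}


def build_register_message(router_entries: list) -> dict:
    """Assemble the registration envelope around already prepared router entries."""
    return {
        "version": "1.0",
        "type": "register",
        "timestamp": utc_timestamp_ms(),
        "messageId": random_string(16),
        "routers": router_entries,
    }
```

A caller would create one entry per router with new_router_entry, sign each entry, and then pass the list to build_register_message.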

6.1.2 Proof Signature Generation Steps

The process is basically the same as the sourceHello message signing method in the DID-based end-to-end encrypted communication technology. The difference is that the signature here only protects a single router.

Convert the router object to be signed (excluding the proofValue field) into a JSON string, sorted by key and using "," and ":" as separators with no extra whitespace.
Encode the JSON string as UTF-8 bytes.
Sign the bytes with ECDSA over the secp256r1 curve (proof type EcdsaSecp256r1Signature2019), using SHA-256 as the hash algorithm.
Write the resulting signature value into the proofValue field of the router object’s proof dictionary.

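import base64
import json

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Illustration only: in practice, load the private key that corresponds
# to the verificationMethod instead of generating a new one
private_key = ec.generate_private_key(ec.SECP256R1())

# 1. Create all fields of the router object, excluding the proofValue field
router = {
    # Other necessary fields
    "proof": {
        "type": "EcdsaSecp256r1Signature2019",
        "created": "2024-05-27T10:51:55Z",
        "verificationMethod": "did:example:123456789abcdefghi#keys-1",
        # excluding proofValue field
    },
}

# 2. Convert to JSON string, sort by keys, and use commas and colons as separators
router_str = json.dumps(router, separators=(",", ":"), sort_keys=True)

# 3. Encode the JSON string as UTF-8 bytes
router_bytes = router_str.encode("utf-8")

# 4. Use private key and ECDSA algorithm (SHA-256) to sign the byte data
#    (the cryptography library returns a DER-encoded signature)
signature = private_key.sign(router_bytes, ec.ECDSA(hashes.SHA256()))

# 5. Add the signature value to the proof field of the router object
router["proof"]["proofValue"] = base64.urlsafe_b64encode(signature).decode("ascii")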

6.1.3 Proof Signature Verification Process

Obtain the DID document based on the router’s DID, get the corresponding public key from the document based on the verificationMethod, and verify the signature using the public key and the signing process described above.

6.1.4 Nonce Generation and Verification Steps

The purpose of the nonce is to prevent replay attacks.

Generating a Nonce: The client generates a random number or unique identifier (nonce), and a new nonce should be used for each request.
Recording the Nonce: The server records the nonce when processing the request.
Verifying the Nonce: The server verifies whether the nonce has been used within a certain time period. If it has been used, the request is rejected.

6.1.5 Timestamp Verification

Timestamp: Include a timestamp in the request, indicating when the request was generated.
Validity Check: The server checks whether the timestamp is within a reasonable time range (e.g., within 5 minutes). Requests outside this time range will be rejected.
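The nonce and timestamp checks from 6.1.4 and 6.1.5 can be combined into one server-side guard. The following is a sketch under stated assumptions: the class name ReplayGuard is hypothetical, the 5-minute window is taken from the example above, and keeping nonces for twice the window is a design choice of this sketch (a request stamped slightly in the future would otherwise stay valid after its nonce had been forgotten).

```python
import time
from datetime import datetime, timezone


def parse_timestamp(value: str) -> float:
    """Parse an ISO 8601 UTC string such as 2024-05-27T12:00:00.123Z."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assume UTC if unmarked
    return parsed.timestamp()


class ReplayGuard:
    """Hypothetical server-side check combining nonce and timestamp rules."""

    def __init__(self, window_seconds=300.0, clock=time.time):
        self._window = window_seconds
        self._clock = clock
        self._seen = {}  # nonce -> time at which it was first seen

    def check(self, nonce: str, timestamp: str) -> bool:
        """Return True if the request is fresh and its nonce is unused."""
        now = self._clock()
        try:
            sent_at = parse_timestamp(timestamp)
        except (ValueError, AttributeError):
            return False
        if abs(now - sent_at) > self._window:
            return False  # outside the accepted time range
        # Forget nonces that can no longer pass the timestamp check
        retention = 2 * self._window
        self._seen = {n: t for n, t in self._seen.items() if now - t <= retention}
        if nonce in self._seen:
            return False  # replayed request
        self._seen[nonce] = now
        return True
```

A request is accepted only once: a second call with the same nonce returns False, as does any request whose timestamp is more than five minutes away from the server clock.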

6.2 Connection Between Message Proxies

After a Proxy A resolves the message service endpoint B of the target DID, it can initiate a WSS connection to this endpoint B, and then send messages directly. Registration may not be necessary, but heartbeats are needed to keep the connection alive. B can establish a new WSS connection to send messages to A, without using the previous connection from A to B. This way, one connection is only used for the connection initiator to send messages.

6.3 Heartbeat Message

Heartbeat messages are sent by the party that initiated the connection. The server periodically (every 60 seconds) clears connections on which no keepalive message has been received.

Example:

{
  "version": "1.0",
  "type": "heartbeat",
  "timestamp": "2024-06-04T12:34:56.123Z",
  "messageId": "randomstring",
  "message": "ping"
}

6.3.1 Field Descriptions

version: Protocol version
type: Request type, indicating that this is a heartbeat request, with the value “heartbeat”.
timestamp: Request timestamp, indicating when the request was sent, an ISO 8601 formatted UTC time string, accurate to milliseconds
messageId: Unique message id, 16-character random string
message: The heartbeat message sent in the request is “ping”, and the response message is “pong”.
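On the server side, clearing idle connections can be sketched as a table of last-seen times that is swept periodically. The class name HeartbeatMonitor and its methods are hypothetical, and treating a connection as stale after 60 seconds of silence is an assumption; the source only states that the sweep runs every 60 seconds.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per connection and reports stale ones."""

    def __init__(self, timeout_seconds=60.0, clock=time.monotonic):
        self._timeout = timeout_seconds  # assumed idle limit
        self._clock = clock
        self._last_seen = {}             # connection id -> time of last heartbeat

    def touch(self, connection_id):
        """Record a heartbeat ("ping") received on this connection."""
        self._last_seen[connection_id] = self._clock()

    def sweep(self):
        """Return the ids of stale connections and stop tracking them."""
        now = self._clock()
        stale = [cid for cid, seen in self._last_seen.items()
                 if now - seen > self._timeout]
        for cid in stale:
            del self._last_seen[cid]
        return stale
```

The server would call touch whenever a heartbeat arrives and run sweep from a periodic timer, closing every connection whose id is returned.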

6.4 Encrypted Communication Message

After two users have negotiated short-term encryption keys through DID, they can send encrypted messages through the message service.

Example:

{
  "version": "1.0",
  "type": "message",
  "timestamp": "2024-06-04T12:34:56.123Z",
  "messageId": "randomstring",
  "sourceDid": "did:example:987654321abcdefghi",
  "destinationDid": "did:example:123456789abcdefghi",
  "secretKeyId": "abc123session",
  "encryptedData": {
    "iv": "iv_encoded_base64",
    "tag": "tag_encoded_base64",
    "ciphertext": "ciphertext_encoded_base64"
  }
}

6.4.1 Protocol Fields

version: String, version number of the current protocol.
type: String, message type.
timestamp: Message sending time, ISO 8601 formatted UTC time string, accurate to milliseconds.
messageId: Unique message id, 16-character random string
sourceDid: String, the DID of the message source, i.e., the sender; always fill in the sender’s own DID here.
destinationDid: String, the DID of the destination, i.e., the message recipient; always fill in the recipient’s DID here.
secretKeyId: ID of the short-term encryption key, through which the previously negotiated symmetric encryption key algorithm, encryption key, and other information can be found, type: string. See DID-Based End-to-End Encrypted Communication Protocol for details.
encryptedData: Encrypted data. Its contents depend on the encryption algorithm: ciphertext is present for every algorithm, while iv (initialization vector) and tag (authentication tag) are algorithm-specific. The fields required for the TLS_AES_128_GCM_SHA256 cipher suite are:
- iv: Initialization vector, a random or pseudorandom byte sequence, typically 12 bytes (96 bits) long in AES-GCM mode.
- tag: Authentication tag generated by AES-GCM mode, used to verify the integrity and authenticity of the data, typically 16 bytes (128 bits) long.
- ciphertext: The encrypted data, Base64-encoded and carried as a UTF-8 string.
encryptedData is generated in the same way as verifyData in the finished message; see DID-Based End-to-End Encrypted Communication Protocol.
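Once the AES-GCM output is available, wrapping it into the message envelope is mostly a matter of Base64 encoding. The sketch below covers only this packaging step, not the encryption itself. The function names are hypothetical, and the use of standard (not URL-safe) Base64 for all three fields is an assumption based on the placeholder values in the example.

```python
import base64


def _b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode("utf-8")


def build_encrypted_message(source_did, destination_did, secret_key_id,
                            iv, tag, ciphertext, message_id, timestamp):
    """Wrap raw AES-GCM output (bytes) into an encrypted communication message."""
    return {
        "version": "1.0",
        "type": "message",
        "timestamp": timestamp,
        "messageId": message_id,
        "sourceDid": source_did,
        "destinationDid": destination_did,
        "secretKeyId": secret_key_id,
        "encryptedData": {
            "iv": _b64(iv),
            "tag": _b64(tag),
            "ciphertext": _b64(ciphertext),
        },
    }


def decode_encrypted_data(encrypted_data):
    """Return (iv, tag, ciphertext) as bytes, ready to be passed to AES-GCM."""
    return tuple(base64.b64decode(encrypted_data[name])
                 for name in ("iv", "tag", "ciphertext"))
```

The receiver looks up the key by secretKeyId, decodes the three fields with decode_encrypted_data, and passes them to its AES-GCM implementation.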

6.5 Response Message

For all WSS messages, there is a common response message, the main purpose of which is to notify exceptions in the message processing, including short-term key negotiation messages, WSS registration messages, encrypted communication messages, etc. This message is not used for receipt confirmation at the WSS JSON message level. If an application layer sends a message and needs to confirm whether the message is correctly received by the other party, a guarantee mechanism needs to be designed in the application layer protocol.

{
  "version": "1.0",
  "type": "response",
  "timestamp": "2024-06-04T12:34:56.123Z",
  "messageId": "randomstring",
  "sourceDid": "did:example:987654321abcdefghi",
  "destinationDid": "did:example:123456789abcdefghi",
  "originalType": "register",
  "originalMessageId": "randomstring",
  "code": 200,
  "detail": "invalid json"
}

6.5.1 Field Descriptions

originalType: Original message type
originalMessageId: Original message ID
code: Status code, modeled on HTTP status codes. Common values:
- 200: Normal
- 404: DID not found
- 403: Authentication failure
- 4000: ENCRYPTION_ERROR: Error occurred during encryption.
- 4001: DECRYPTION_ERROR: Error occurred during decryption.
- 4002: INVALID_ENCRYPTION_KEY: Encryption key is invalid or does not match.
- 4003: INVALID_DECRYPTION_KEY: Decryption key is invalid or does not match.
- 4004: ENCRYPTION_KEY_EXPIRED: Encryption key has expired.
- 4005: DECRYPTION_KEY_EXPIRED: Decryption key has expired.
detail: Detailed description of the error
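The codes listed above map naturally onto an enumeration. The sketch below is illustrative: the names ResponseCode, build_response, and is_error are hypothetical, and swapping sourceDid and destinationDid relative to the original message (when they are present) is an assumption about how a reply would be addressed.

```python
from enum import IntEnum


class ResponseCode(IntEnum):
    OK = 200
    AUTHENTICATION_FAILURE = 403
    DID_NOT_FOUND = 404
    ENCRYPTION_ERROR = 4000
    DECRYPTION_ERROR = 4001
    INVALID_ENCRYPTION_KEY = 4002
    INVALID_DECRYPTION_KEY = 4003
    ENCRYPTION_KEY_EXPIRED = 4004
    DECRYPTION_KEY_EXPIRED = 4005


def is_error(code: int) -> bool:
    """Every code other than 200 signals an exception during processing."""
    return int(code) != ResponseCode.OK


def build_response(original, code, detail, message_id, timestamp):
    """Build a response message that refers back to the original message."""
    response = {
        "version": "1.0",
        "type": "response",
        "timestamp": timestamp,
        "messageId": message_id,
        "originalType": original["type"],
        "originalMessageId": original["messageId"],
        "code": int(code),
        "detail": detail,
    }
    # Assumption: the reply travels in the opposite direction of the original
    if "sourceDid" in original and "destinationDid" in original:
        response["sourceDid"] = original["destinationDid"]
        response["destinationDid"] = original["sourceDid"]
    return response
```

Messages that carry no DIDs, such as register, simply produce a response without the sourceDid and destinationDid fields in this sketch.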

6.6 DID Update Notification

When internal fields of the DID are modified, or when the private key corresponding to the DID is leaked and the original DID document needs to be abandoned, a DID notification message needs to be sent to notify other associated DIDs to re-query the DID document in a timely manner. To defend against potential malicious attacks, DID update notifications will be deployed together with the multi-signature mechanism of DID documents, planned for support in the next version. This way, even if one key is leaked, we can still securely notify associated parties.

Copyright Notice

Copyright (c) 2024 GaoWei Chang
This file is published under the MIT License, you are free to use and modify it, but you must retain this copyright notice.

DID-Based End-to-End Encrypted Communication Protocol