Internet-Draft The R5N Distributed Hash Table November 2021
Schanzenbach, et al. Expires 3 June 2022 [Page]
Workgroup:
Independent Stream
Internet-Draft:
draft-schanzen-r5n-00
Published:
Intended Status:
Informational
Expires:
Authors:
M. Schanzenbach
GNUnet e.V.
C. Grothoff
Berner Fachhochschule
B. Fix
GNUnet e.V.

The R5N Distributed Hash Table

Abstract

This document contains the R5N DHT technical specification.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 3 June 2022.

Table of Contents

1. Introduction

Distributed Hash Tables (DHTs) are a key data structure for the construction of completely decentralized applications. DHTs are important because they generally provide a robust and efficient means to distribute the storage and retrieval of key-value pairs.

While [RFC6940] already provides a peer-to-peer (P2P) signaling protocol with extensible routing and topology mechanisms, it also relies on strict admission control through the use of either centralized enrollment servers or pre-shared keys. Modern decentralized applications require a more open system that enables ad-hoc participation and other means to prevent common attacks on P2P overlays.

This document contains the technical specification of the R5N DHT [R5N], a secure DHT routing algorithm and data structure for decentralized applications. R5N is an open P2P overlay routing mechanism which supports ad-hoc participation and security properties including support for topologies in restricted-route environments and path signatures.

This document defines the normative wire format of peer-to-peer messages, routing algorithms, cryptographic routines and security considerations for use by implementors.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Architecture

R5N is an overlay network with a pluggable transport layer. The following figure shows the R5N architecture.

             |  +-----------------+  +-------+
Applications |  | GNU Name System |  | CADET |  ...
             |  +-----------------+  +-------+
-------------+------------------------------------ Overlay Interface
             |  ^
             |  |   +---------------+
             |  |   | Block Storage |
             |  |   +---------------+
             |  |    ^
R5N          |  v    v
             | +--------------------+    +---------+
             | | Message Processing |<-->| Routing |
             | +--------------------+    +---------+
             |  ^                          ^
             |  v                          v
-------------+------------------------------------ Underlay Interface
             | +--------+  +--------+
             | |GNUnet  |  |IP      |  ...
Connectivity | |Underlay|  |Underlay|
             | |Link    |  |Link    |
             | +--------+  +--------+
Figure 1
Applications
Applications are components which directly use the DHT overlay interfaces. Possible applications include the GNU Name System [I-D.draft-schanzen-gns] or the CADET transport system [cadet].
Overlay Interface
The Overlay Interface exposes the core operations of the DHT overlay to applications. This includes querying and retrieving data from the DHT.
Block Storage
The Block Storage component is used to persist and manage data by peers. It includes logic for quotas, caching stragegies and data validation.
Message Processing
The Message Processing component processes requests from and responses to applications as well as messages from the underlay network.
Routing
The Routing component includes the routing table as well as routing and peer selection logic. It facilitates the R5N routing algorithm with required data structures and algorithms.
Underlay Interface
The DHT Underlay Interface is an abstraction layer on top of the supported links of a peer. Peers may be linked by a variety of different transports, including "classical" protocols such as TCP, UDP and TLS or advanced protocols such as GNUnet, L2P or Tor.

3. Underlay

In the network underlay, a peer is addressable by traditional means out of scope of this document. For example, the peer may have a TCP/IP address, or a HTTPS endpoint. While the specific addressing options and mechanisms are out of scope for this document, it is necessary to define a universal addressing format in order to facilitate the distribution of connectivity information to other peers in the DHT overlay. This format is the "HELLO" message. A "HELLO" is a human-readable UTF-8 [RFC3629] string consisting of the peer public key and the HELLO URI [RFC3986].

hello-format := <peer-public-key> <hello-uri>
peer-public-key := [A-HJ-NP-Z1-9]+
Figure 2

For the string representation of the peer public key, the base-32 encoding "StringEncode" is used. However, instead of following [RFC4648] the character map is based on the optical character recognition friendly proposal of Crockford [CrockfordB32]. The only difference to Crockford is that the letter "U" decodes to the same base-32 value as the letter "V" (27).

The "scheme" part of the HELLO URI defined the addressing scheme which is used. An example of an addressing scheme used throughout this document is "ip+tcp", which refers to a standard TCP/IP socket connection. The "hier"-part of the URI must provide a suitable address for the given addressing scheme. The following is a non-normative example of a HELLO containing three HELLO URIs:

Y924NSHMMZ1N1SQCE5TXF93ED6S6JY311K0QT86G9WJC68F6XVZ0 \
        ip+tcp://1.2.3.4:6789 \
        gnunet+tcp://12.3.4.5/ \
        i2p+udp://1.2.4.5:424/ \
        tor+onionv3://rasdflkjasdfliasduf.onion/
Figure 3

It is expected that there are basic mechanisms available to manage peer connectivity and addressing. The required functionality are abstracted through the following procedures and events:

PEER_CONNECTED(phash,address)
is a signal that allows the DHT to react to peers which connect. Such an event triggers, for example, updates in the routing table.
PEER_DISCONNECTED(phash,address)
is a signal that allows the DHT to react to peers which disconnect. Such an event triggers, for example, updates in the routing table.
TRY_CONNECT(pid, address)
A function which allows a peer to attempt the establishment of a connection to another peer using an address.
HOLD(pash)
A function which tells the underlay to keep a hold on the connection to another peer.
DROP(pash)
A function which tells the underlay to drop the connection to another peer.
RECEIVE(source, message)
A function or event that allows the peer to receive protocol messages as defined in this document from a connected peer.
SEND(target, message)
A function that allows a peer to send protocol messages as defined in this document to a connected peer. If call to SEND fails, the message has not been sent.
NETWORK_SIZE_ESTIMATE(N)
A function or event that provides estimates on the network size for use in the DHT routing algorithms.
ADDRESS_ADD(pk, address)
The underlay signals us that an address was added. This information is used, for example, to publish connectivity as part of the bootstrapping and overlay creation.
ADDRESS_DELETE(pk, address)
The underlay signals us that an address was removed. This information is used, for example, to publish connectivity as part of the bootstrapping and overlay creation.
VERIFY(blob)
Signature verification by underlay.

4. Overlay

In the DHT overlay, a peer is addressable by its Peer ID. The Peer ID is the 256-bit hash of the peer public key. The peer public key is the public key of the corresponding Ed25519[ed25519] peer private key.

5. Block Storage

6. Routing

6.1. Peer selection

In order to select peers from the routing table which are suitable destinations for sending messages, R5N uses a hybrid approach: Given an estimated network size N, the peer selection for the first N hops is random. After the initial N hops, peer selection follows an XOR-based peer distance calculation.

As the message traverses a random path through the network for the first N hops, it is essential that routing loops are avoided. In R5N, a bloomfilter is used as part of the routing metadata in messages. The bloomfilter is updates at each hop with the hops peer identity. For the next hop selection in both the random and the deterministic case, any peer which is in the bloomfilter for the respective message is not included in the peer selection process.

The procedure to select a peer for a given message key and bloomfilter is defined as follows:

PEER-SELECT(key, bloomfilter)
  peers := <Select all known peers NOT in message bloomfilter>
  IF hops >= N
    dist := MAX_VALUE
    FOR EACH p IN peers
      IF XOR(p, key) < dist
        dist := XOR(p, key)
        target := p
      END
    END
  ELSE
    r := rand()
    target := peers[r]
  END
END
Figure 4

7. Message Processing

7.1. Bloomfilter

In order to prevent circular routes, GET and PUT messages contain a 128-bit Bloom filter (m=128). The Bloom filter is used to detect duplicate peer IDs along the route. A Bloom filter "bf" is initially empty, consisting only of zeroes. There are two functions which can be invoked on the Bloom filter: BF-SET(bf, e) and BF-TEST(bf, e) where "e" is an element which is to added to the Bloom filter or queried against the set. Any bloom filter uses k=16 different hash functions each of which is defined as follows:

BF-TEST(key, bloomfilter)
  H_key := SHA512 (key)
  FOR i IN 0..15
    bit := INT(H_key[i * k]) % 1024
    IF bloomfilter[bit] IS SET
      RETURN TRUE
    END
  END
  RETURN FALSE
END

BF-SET(key, bloomfilter)
  H_key := SHA512 (key)
  FOR i IN 0..15
    bit := INT(H_key[i * k]) % 1024
    bloomfilter[bit] := 1
  END
END
Figure 5

7.2. Processing options

In order to indicate certain processing requirements for messages a number of processing options may be specificied in the respective field of the signalling messages. The options field is 8 octets in length and each options is encoded in a single bit.

Demultiplex everywhere (0)
Each peer along the way should process the request. Otherwise only peers that are locally closest to the key and no longer in the random path mode should process it.
Record route (1)
Indicates to keep track of the route that the message took in the P2P network.
Find peer (2)
Indicates a 'FIND-PEER' request. Implies that approximate results are acceptable.

7.3. Extended query

TODO: What is this for? Not documented anywhere

7.4. PUT message

0     8     16    24    32    40    48    56
+-----+-----+-----+-----+-----+-----+-----+-----+
|  MSIZE    |   MTYPE   |         BTYPE         |
+-----+-----+-----+-----+-----+-----+-----+-----+
|  OPTIONS  | HOPCOUNT  | REPL_LVL  | PATH_LEN  |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                    EXPIRATION                 |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                   BLOOMFILTER                 /
/                 (128 byte)                    |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                      KEY                      /
/                 (64 byte)                     |
+-----+-----+-----+-----+-----+-----+-----+-----+
/              PUTPATH (variable length)        /
+-----+-----+-----+-----+-----+-----+-----+-----+
/              PAYLOAD (variable length)        /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 6

where:

MSIZE
denotes the size of this message in network byte order.
MTYPE
is the 16-bit message type. This type can be one of the DHT message types but for put messages it must be set to the value 146 in network byte order.
BTYPE
is a 32-bit block type field. The block type indicates the content type of the payload. In network byte order.
OPTIONS
is a 16-bit options field (see below).
HOPCOUNT
is a 16-bit number indicating how many hops this message has traversed to far. In network byte order.
REPL_LVL
is a 16-bit number indicating the desired replication level of the data. In network byte order.
PATH_LEN
is a 16-bit number indicating the length of the PUT path recorded in PUTPATH. As PUTPATH is optiona, this value may be zero. In network byte order.
EXPIRATION
denotes the absolute 64-bit expiration date of the content. In microseconds since midnight (0 hour), January 1, 1970 in network byte order.
BLOOMFILTER
A bloomfilter (for peer identities) to stop circular routes.
KEY
The key under which the PUT request wants to store content under.
PUTPATH
the variable-length PUT path. The path consists of a list of PATH_LEN peer IDs.
PAYLOAD
the variable-length resource record data payload. The contents are defined by the respective type of the resource record.

7.5. GET message

0     8     16    24    32    40    48    56
+-----+-----+-----+-----+-----+-----+-----+-----+
|  MSIZE    |   MTYPE   |         BTYPE         |
+-----+-----+-----+-----+-----+-----+-----+-----+
|  OPTIONS  |  HOPCOUNT | REPL_LVL  |  XQ_SIZE  |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                 BLOOMFILTER                   /
/                 (128 byte)                    |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                       KEY                     /
/                 (64 byte)                     |
+-----+-----+-----+-----+-----+-----+-----+-----+
/     BF_MUTATOR        |   XQUERY              /
+-----+-----+-----+-----+                       /
/                 (variable length)             /
+-----+-----+-----+-----+-----+-----+-----+-----+
/              BF_RESULT (variable length)      /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 7

where:

MSIZE
denotes the size of this message in network byte order.
MTYPE
is the 16-bit message type. This type can be one of the DHT message types but for put messages it must be set to the value 147 in network byte order.
BTYPE
is a 32-bit block type field. The block type indicates the content type of the payload. In network byte order.
OPTIONS
is a 16-bit options field (see below).
HOPCOUNT
is a 16-bit number indicating how many hops this message has traversed to far. In network byte order.
REPL_LVL
is a 16-bit number indicating the desired replication level of the data. In network byte order.
XQ_SIZE
is a 32-bit number indicating the length of the optional extended query XQUERY. In network byte order.
BLOOMFILTER
A bloomfilter (for peer identities) to stop circular routes.
KEY
The key under which the PUT request wants to store content under.
XQUERY
the variable-length extended query. Optional.
BF_MUTATOR
The 32-bit bloomfilter mutator for the result bloomfilter.
RESULT_BF
the variable-length result bloomfilter.

7.6. RESULT message

0     8     16    24    32    40    48    56
+-----+-----+-----+-----+-----+-----+-----+-----+
|  MSIZE    |   MTYPE   |        BTYPE          |
+-----+-----+-----+-----+-----+-----+-----+-----+
|   //      | OPTIONS   | PUTPATH_L | GETPATH_L |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                   EXPIRATION                  |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                      KEY                      /
/                 (64 byte)                     |
+-----+-----+-----+-----+-----+-----+-----+-----+
/                    PUTPATH                    /
/                 (variable length)             /
+-----+-----+-----+-----+-----+-----+-----+-----+
/                    GETPATH                    /
/                 (variable length)             /
+-----+-----+-----+-----+-----+-----+-----+-----+
/                   PAYLOAD                     /
/              (variable length)                /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 8

where:

MSIZE
denotes the size of this message in network byte order.
MTYPE
is the 16-bit message type. This type can be one of the DHT message types but for put messages it must be set to the value 148 in network byte order.
OPTIONS
is a 16-bit options field (see below).
BTYPE
is a 32-bit block type field. The block type indicates the content type of the payload. In network byte order.
PUTPATH_L
is a 16-bit number indicating the length of the PUT path recorded in PUTPATH. As PUTPATH is optiona, this value may be zero. In network byte order.
GET_PATH_LEN
is a 16-bit number indicating the length of the GET path recorded in GETPATH. As PUTPATH is optiona, this value may be zero. In network byte order.
EXPIRATION
denotes the absolute 64-bit expiration date of the content. In microseconds since midnight (0 hour), January 1, 1970 in network byte order.
KEY
The key under which the PUT request wants to store content under.
PUTPATH
the variable-length PUT path. The path consists of a list of PATH_LEN peer IDs.
GETPATH
the variable-length PUT path. The path consists of a list of PATH_LEN peer IDs.
PAYLOAD
the variable-length resource record data payload. The contents are defined by the respective type of the resource record.

8. Security Considerations

9. GANA Considerations

GANA [GANA] is requested to create a "DHT Block Types" registry. The registry shall record for each entry:

The registration policy for this sub-registry is "First Come First Served", as described in [RFC8126]. GANA is requested to populate this registry as follows:

Number | Name    | Contact | References | Description
-------+---------+---------+------------+-------------------------

Figure 9

GANA is requested to amend the "GNUnet Signature Purpose" registry as follows:

Purpose | Name            | References | Description
--------+-----------------+------------+--------------------------
Figure 10

10. Test Vectors

11. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/info/rfc2119>.
[RFC3629]
Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, , <https://www.rfc-editor.org/info/rfc3629>.
[RFC3986]
Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, , <https://www.rfc-editor.org/info/rfc3986>.
[RFC4648]
Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, , <https://www.rfc-editor.org/info/rfc4648>.
[RFC6940]
Jennings, C., Lowekamp, B., Ed., Rescorla, E., Baset, S., and H. Schulzrinne, "REsource LOcation And Discovery (RELOAD) Base Protocol", RFC 6940, DOI 10.17487/RFC6940, , <https://www.rfc-editor.org/info/rfc6940>.
[RFC8126]
Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, , <https://www.rfc-editor.org/info/rfc8126>.
[ed25519]
Bernstein, D., Duif, N., Lange, T., Schwabe, P., and B. Yang, "High-Speed High-Security Signatures", , <http://link.springer.com/chapter/10.1007/978-3-642-23951-9_9>.
[CrockfordB32]
Douglas, D., "Base32", , <https://www.crockford.com/base32.html>.
[GANA]
GNUnet e.V., "GNUnet Assigned Numbers Authority (GANA)", , <https://gana.gnunet.org/>.
[R5N]
Evans, N. S. and C. Grothoff, "R5N: Randomized recursive routing for restricted-route networks", , <https://doi.org/10.1109/ICNSS.2011.6060022>.
[cadet]
Polot, B. and C. Grothoff, "CADET: Confidential ad-hoc decentralized end-to-end transport", , <https://doi.org/10.1109/MedHocNet.2014.6849107>.
[I-D.draft-schanzen-gns]
Schanzenbach, M., Grothoff, C., and B. Fix, "The GNU Name System", , <https://datatracker.ietf.org/doc/draft-schanzen-gns/>.

Authors' Addresses

Martin Schanzenbach
GNUnet e.V.
Boltzmannstrasse 3
85748 Garching
Germany
Christian Grothoff
Berner Fachhochschule
Hoeheweg 80
CH-2501 Biel/Bienne
Switzerland
Bernd Fix
GNUnet e.V.
Boltzmannstrasse 3
85748 Garching
Germany