Internet-Draft Verifiable Random Selection July 2023
Eastlake Expires 11 January 2024
Workgroup:
Network Working Group
Internet-Draft:
draft-eastlake-rfc3797bis-04
Obsoletes:
3797 (if approved)
Published:
Intended Status:
Best Current Practice
Expires:
Author:
D. Eastlake
Futurewei Technologies

Publicly Verifiable Nominations Committee (NomCom) Random Selection

Abstract

This document describes a method for making random selections in such a way as to promote public confidence in the unbiased nature of the choice. This method is referred to in this document as "verifiable selection". It focuses on the selection of the voting members of the IETF Nominations Committee (NomCom) from the pool of eligible volunteers; however, similar techniques could be and have been applied to other selections. This document obsoletes RFC 3797.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 11 January 2024.

1. Introduction

This document describes a method for making random selections in such a way as to promote public confidence in the unbiased nature of the choice. This method is referred to in this document as "verifiable selection". It focuses on the selection of the voting members of the IETF Nominations Committee (NomCom) from the pool of eligible volunteers; however, similar methods could be and have been applied to other cases.

This document obsoletes [RFC3797]. The primary changes to that RFC are listed in Appendix C.

Under the IETF rules, each year a set of people is randomly selected from among eligible volunteers, as specified in [RFC9389], to be members of the IETF Nominations Committee (NomCom). The NomCom nominates members of the Internet Engineering Steering Group (IESG), the Internet Architecture Board (IAB), and other bodies as described in [RFC8713]. The number of eligible volunteers in the early years of the use of the NomCom mechanism was around 50 but in recent years has been over 200.

It is highly desirable that the random selection of the voting NomCom be done in an unimpeachable fashion so that no reasonable charges of bias or favoritism can be brought. This is as much for the protection of the selection administrator (currently, the appointed NomCom Chair) from suspicion of bias as it is for the protection of the IETF.

A method meets this criterion if public information will enable any person to reproduce the selection process and have reasonable confidence that it is unbiased. This document specifies such a method.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. General Flow of a Publicly Verifiable Process

A publicly verifiable selection could follow the three steps given in the subsections below: Determination of the Pool from which selection is made, Publication of the Algorithm, and Publication of the Resulting Selection. These steps are further detailed below. Section 3 then goes into greater depth on the required randomness.

The full selection of the IETF NomCom is more complex in that, after the initial selection, one or more subsequent selection extensions may be required. This is covered in Section 6 and touched on in earlier sections, including this one.

2.1. Determination of the Pool

First, determine the pool of items from which the selection is to be made.

For the IETF NomCom, this is as provided in [RFC9389] or its successor. Currently, volunteers are solicited by the selection administrator. Their names are then checked for eligibility. The full list of eligible NomCom volunteers MUST be made public early enough to allow a reasonable amount of time for review, so that any disputes as to who should be in the pool can be received and, hopefully, resolved before a deadline at which the pool is frozen. Although no one can be added after this deadline, the initial selection of someone who should not have been included in the list can be easily handled as described below.

2.2. Publication of the Algorithm

The exact algorithm to be used, including the future public sources of randomness, is made public. For example, the members of the final list of eligible volunteers are ordered by publicly numbering them, some public future sources of randomness such as government run lotteries are specified, and an exact method is specified whereby eligible volunteers are selected based on a hash function [RFC4086] based on these future sources of randomness, such as the method in this document.

2.3. The Selection

When the pre-specified sources of randomness produce their output, those values plus a summary of the execution of the algorithm for selection and its results SHOULD be announced so that anyone can verify that the correct randomness source values were used and the algorithm properly executed.

For the IETF NomCom, the algorithm SHOULD be run to select, in an ordered fashion, a larger number than are actually necessary so that if any of those selected need to be passed over or replaced for any reason, an ordered set of additional alternate selections is available. Under some circumstances, additional rounds of Extended Selection may be useful as specified in Section 6.

A cut off time for any complaint that the algorithm was run with the wrong inputs or not faithfully executed MUST be specified for the initial selection and any extensions under Section 6 to finalize the output and provide a stable selection.

3. Randomness

The crux of the unbiased nature of the selection is that it is based in an exact, predetermined fashion on random information which will be revealed in the future and cannot be known to the person specifying the algorithm. That random information will be used to control the selection. The random information MUST be such that it will be publicly and unambiguously revealed in a timely fashion.

3.1. Sources of Randomness

The random sources MUST NOT include anything that a reasonable person would believe to be under the control or influence of the selection administrator. In the case of the IETF NomCom, that includes anything under the control or influence of the IETF or its components, such as IETF meeting attendance statistics, numbers of documents issued, or the like.

Examples of good information to use are winning lottery numbers for specified runnings of specified public lotteries. Particularly for major government run lotteries, great care is taken to see that they occur on time (or with minimal delay) and produce random quantities. Even in the very unlikely case that one were rigged, it would almost certainly be in connection with winning money in the lottery, not in connection with IETF use. Other possibilities are such things as the daily balance in the US Treasury on a specified day, the volume of trading on the New York Stock Exchange on a specified day, etc. (However, the example code given below will not handle integers that are too large.) Sporting events can also be used. Experience has indicated that individual stock prices and/or volumes are a poor source of unambiguous data due to trading suspensions, company mergers, delistings, splits, multiple markets, etc. In all cases, great care MUST be taken to specify exactly what quantities are being used for randomness and what will be done if their issuance is cancelled, delayed, or advanced.

It is desirable that the last source of randomness, chronologically, produce a substantial amount of the entropy needed. If most of the randomness has come from the earlier of the specified sources, and someone has even limited influence on the final source, they might do an exhaustive analysis and exert such influence so as to bias the selection in the direction they wanted. Thus, it is RECOMMENDED that the last source be an especially strong and unbiased source of a large amount of randomness such as a major government run lottery.

It is best not to use too many different sources. Every additional source increases the probability that one or more sources might be delayed, cancelled, or just plain screwed up somehow, calling into play contingency provisions or, worst of all, creating an unanticipated situation. This would either require arbitrary judgment by the selection administrator, defeating the randomness of the selection, or a re-run with a new set of sources, causing much delay in what, for the IETF NomCom, needs to be a time bounded process. Three or four would be a good number of randomness sources. More than five is too many.

3.2. Skew

Some of the sources of randomness produce data that is not uniformly distributed. This is certainly true of volumes, prices, and horse race results, for example. However, use of a strong mixing function [RFC4086] will extract the available entropy and produce a hash value whose bits, and whose remainder modulo a small divisor, deviate from a uniform distribution by only an insignificant amount.

3.3. Entropy Needed

What we are doing is selecting N items without replacement from a population of P items. The number of different ways to do this is as follows, where "!" represents the factorial function:


     P!
-------------
N! * (P - N)!

To do this in a completely random fashion requires as many random bits as the logarithm base 2 of that quantity. Some example approximate calculated number of random bits for the completely random selection of 10 items, such as IETF NomCom members, or 1 item, from various pool sizes are given below:

Table 1
Completely Random Selection of One or Ten Items From a Pool

Pool size                    60    80   100   125   150   175   200   250
Bits needed to select 1     5.9   6.3   6.6   7.0   7.2   7.4   7.6   8.0
Bits needed to select 10     36    41    44    47    50    52    54    58

Using a smaller number of bits means that not all of the possible selections would be available, for example not all sets of 10 if 10 things are being selected. For a substantially smaller amount of entropy, if multiple things are being selected, there could be a correlation between the selection of two different members of the pool. However, as a practical matter, for pool sizes likely to be encountered in IETF NomCom membership selection, 42 bits of entropy should provide a sufficiently random selection of 10 items as further discussed in Appendix B.

The current USA Power Ball and Mega Millions lottery drawings have about 24 bits of entropy each in the five selected regular numbers and about 6 bits of entropy each in the Power Ball / Mega Ball. A four-digit daily numbers game drawing that selects four decimal digits has a little over 13 bits of entropy.
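
The figures in Table 1 can be reproduced with a short calculation. The following C fragment is a minimal sketch and is not part of the example code in Section 10; the function names selection_bits and log2_approx are illustrative assumptions. It evaluates the base 2 logarithm of P! / (N! * (P - N)!) as a running product so that no large factorial is ever formed, and it is intended only for pool sizes of the order discussed here.

```c
#include <assert.h>

/* Base-2 logarithm for x >= 1, computed by repeated squaring.
 * This keeps the sketch free of the math library; log2() from
 * <math.h> would serve equally well. */
double log2_approx(double x)
{
    double result = 0.0, bit = 0.5;

    assert(x >= 1.0);
    while (x >= 2.0) {              /* integer part of the logarithm */
        x /= 2.0;
        result += 1.0;
    }
    for (int i = 0; i < 40; i++) {  /* 40 fractional bits */
        x *= x;
        if (x >= 2.0) {
            x /= 2.0;
            result += bit;
        }
        bit /= 2.0;
    }
    return result;
}

/* Bits of entropy needed to choose `picks` items without replacement
 * from `pool` items: log2( pool! / (picks! * (pool - picks)!) ). */
double selection_bits(unsigned pool, unsigned picks)
{
    double ways = 1.0;

    assert(picks <= pool);
    for (unsigned i = 1; i <= picks; i++)
        ways *= (double)(pool - picks + i) / (double)i;
    return log2_approx(ways);
}
```

For example, selection_bits(100, 10) evaluates to just under 44 and selection_bits(60, 1) to about 5.9, in line with Table 1. The same function can be used to estimate the entropy of a drawing: a hypothetical drawing of five distinct numbers from 69 gives selection_bits(69, 5), about 23.4 bits.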

The source code in Section 10 uses the HMAC-SHA-256 [RFC6234] hash function, which has 256 bits of output and therefore can preserve no more than that number of bits of entropy. However, this is very much more than what is likely to be needed for IETF NomCom membership selection, and it is a strong mixing function that will defeat skew in the randomness input (see Section 3.2).

4. A Specific Algorithm for Initial Selection

It is important that a precise algorithm be given for canonicalizing and mixing the random sources being used and making the selection based thereon. Sources suggested above produce either a single positive number (e.g., NY Stock Exchange volume in thousands of shares) or a small set of positive numbers (many lotteries provide 6 numbers in the range of 1 through 75 or the like, a sporting event could produce the scores of two teams, etc.). A suggested precise algorithm is as follows:

  1. For each source producing one or more numeric values, each value is canonicalized by representing the value as a decimal number terminated by a period (or with a period separating the whole from the fractional part), without leading zeroes except for a single leading zero if the integer part is zero, and without trailing zeroes on the fractional part after the period. Some examples follow:

    Table 2
    Input     Canonicalized
    0         0.
    0.0       0.
    42        42.
    7.0       7.
    013.      13.
    .420      0.42
    12.34     12.34
    1.2340    1.234
  2. If a source produced multiple values, order those values from smallest to the largest in magnitude. This sorting is necessary because the same lottery results, for example, are sometimes reported in the order numbers were drawn and sometimes in numeric order and such things as the scores of two sports teams that play a game have no inherent order. The selection results would not be reproducible if different persons executing the algorithm could use different orderings.
  3. If a source produced multiple values, concatenate them and suffix the result with a "/". If a source produced a single number, simply represent it as above with an added "/" suffix.
  4. At this point you have a string for each source, say s1/, s2/, ... for source 1, source 2, ... Concatenate these strings in a pre-specified order, the order in which the sources were listed when they were announced if no other order is specified, and represent each character as its ASCII code [RFC0020] producing "s1/s2/.../" as the random key from which selection is derived.
  5. Produce a sequence of random values derived from applying the HMAC-SHA-256 function [RFC6234] using the key specified in step 4 to a 64-byte "message" composed as follows. Treat each of these "random" HMAC-SHA-256 output values as a positive 256-bit multiprecision big-endian integer.

    5.A
    If one or more items are being selected without need for extensions, the message consists of 32 copies of the all-zeros two-byte sequence for the first value, 32 copies of 0x0001 for the second value, etc., treating the replicated two bytes as a big-endian counter.
    5.B
    For selections of the IETF NomCom, the initial 32 bytes (256 bits) of the message are a hash chain value as specified in Section 6.1. The remaining 32 bytes are 16 copies of the two-byte sequence specified in 5.A above.
  6. Finally, do a pseudo-random series of selections from the pool of listed items (e.g., NomCom volunteers) as follows: If there are P pool members, select the first by dividing the first derived random value, treated as an unsigned integer, by P and using the remainder plus one as the position of the selectee in the published list. Select the second by dividing the second derived random value by P-1 and using the remainder plus one as the position in the list with the first selected person eliminated. And so on.
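
Steps 1 through 5 above define the two inputs to HMAC-SHA-256: the key string and the 64-byte message. The following C fragment is a minimal sketch of how those inputs might be assembled; it is not the example code of Section 10, and the function names canonicalize, append_source, and build_message are illustrative assumptions. It assumes each value arrives as a plain decimal string and orders values by converting them with strtod(), which is adequate only for values that a double represents exactly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Step 1: canonical form, e.g. "013." -> "13." and ".420" -> "0.42". */
void canonicalize(const char *in, char *out, size_t outsz)
{
    const char *dot = strchr(in, '.');
    size_t ilen = dot ? (size_t)(dot - in) : strlen(in);
    const char *frac = dot ? dot + 1 : "";
    size_t flen = strlen(frac);
    size_t n = 0;

    while (ilen > 0 && *in == '0') {            /* drop leading zeros */
        in++;
        ilen--;
    }
    while (flen > 0 && frac[flen - 1] == '0')   /* drop trailing zeros */
        flen--;
    assert(ilen + flen + 3 <= outsz);
    if (ilen == 0)
        out[n++] = '0';                         /* single leading zero */
    memcpy(out + n, in, ilen);
    n += ilen;
    out[n++] = '.';
    memcpy(out + n, frac, flen);
    n += flen;
    out[n] = '\0';
}

/* qsort comparison: ascending numeric value (assumes positive values). */
static int by_value(const void *a, const void *b)
{
    double x = strtod(*(const char *const *)a, NULL);
    double y = strtod(*(const char *const *)b, NULL);

    return (x > y) - (x < y);
}

/* Steps 2-4: sort one source's values, canonicalize and concatenate
 * them, add the "/" suffix, and append the result to key.  Call once
 * per source, in the announced source order. */
void append_source(char *key, size_t keysz, const char **vals, size_t count)
{
    char buf[64];

    assert(count > 0);
    qsort(vals, count, sizeof *vals, by_value);
    for (size_t i = 0; i < count; i++) {
        canonicalize(vals[i], buf, sizeof buf);
        assert(strlen(key) + strlen(buf) + 2 <= keysz);
        strcat(key, buf);
    }
    strcat(key, "/");
}

/* Step 5: fill the 64-byte HMAC message for one counter value.
 * chain == NULL gives 5.A (32 copies of the big-endian counter);
 * otherwise 5.B (32-byte hash chain value, then 16 copies). */
void build_message(uint8_t msg[64], uint16_t counter, const uint8_t *chain)
{
    size_t start = 0;

    if (chain != NULL) {
        memcpy(msg, chain, 32);
        start = 32;
    }
    for (size_t i = start; i < 64; i += 2) {
        msg[i] = (uint8_t)(counter >> 8);
        msg[i + 1] = (uint8_t)(counter & 0xff);
    }
}
```

Starting from an empty key string and calling append_source once for each of the three hypothetical sources of Section 7 (29319; then 9, 61, 26, 34, 42, 41; then 55) yields the key string 29319./9.26.34.41.42.61./55./ shown there.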

Any ambiguity in the above procedure is resolved by consulting the example code in Section 10.
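
Step 6 of the algorithm can be sketched in isolation from the hashing. The C fragment below is illustrative only and is not the example code of Section 10; the names remainder256 and select_from_pool are assumptions. It takes the derived random values as already computed 32-byte big-endian integers, reduces each one modulo the current pool size one byte at a time, and removes each selectee from the list before the next selection.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Remainder of a 256-bit big-endian unsigned integer modulo divisor,
 * computed by long division one byte at a time. */
unsigned remainder256(const uint8_t hash[32], unsigned divisor)
{
    uint64_t rem = 0;

    assert(divisor > 0);
    for (int i = 0; i < 32; i++)
        rem = ((rem << 8) | hash[i]) % divisor;
    return (unsigned)rem;
}

/* Step 6: make `count` selections without replacement from a pool
 * numbered 1..pool_size.  hashes[i] is the i-th derived random value.
 * selected[i] receives the original (published) position of the i-th
 * selectee.  Returns 0 on success, -1 if memory cannot be allocated. */
int select_from_pool(const uint8_t hashes[][32], int count,
                     int pool_size, int *selected)
{
    int *remaining;

    assert(pool_size > 0 && count >= 0 && count <= pool_size);
    remaining = malloc((size_t)pool_size * sizeof *remaining);
    if (remaining == NULL)
        return -1;
    for (int i = 0; i < pool_size; i++)
        remaining[i] = i + 1;

    for (int i = 0; i < count; i++) {
        int left = pool_size - i;       /* divisor: P, P-1, P-2, ... */
        int r = (int)remainder256(hashes[i], (unsigned)left);

        selected[i] = remaining[r];     /* remainder plus one, 1-based */
        for (int j = r; j < left - 1; j++)
            remaining[j] = remaining[j + 1];    /* close the gap */
    }
    free(remaining);
    return 0;
}
```

As a small check of the arithmetic, with a pool of 5 and derived values of 7, 4, and 5, the remainders are 7 mod 5 = 2, 4 mod 4 = 0, and 5 mod 3 = 2, which select original positions 3, 1, and 5 in that order.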

Use of alphanumeric random sources is NOT RECOMMENDED due to the much greater difficulty in canonicalizing them in an independently repeatable fashion; however, if the administrator of the selection process chooses to ignore this advice and use an ASCII or similar Roman alphanumeric source or sources, all white space, punctuation, accents, and special characters should be removed, and all letters set to upper case. This will leave only an unbroken sequence of letters A-Z and digits 0-9, which can be treated as a canonicalized single number as above and suffixed with a "./". The administrator MUST NOT use even more complex and harder to canonicalize quantities such as complex numbers or Unicode international text.
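
For completeness, the alphanumeric canonicalization just described might be sketched as follows. This fragment is illustrative only; the function name canonicalize_alnum is an assumption, and the sketch assumes ASCII input, simply dropping every byte that is not an ASCII letter or digit (so an accented letter encoded outside ASCII is dropped rather than mapped to its base letter).

```c
#include <assert.h>
#include <stddef.h>

/* Keep only A-Z and 0-9 (after upper-casing a-z), then append "./".
 * Returns the length of the resulting string. */
size_t canonicalize_alnum(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    assert(outsz >= 3);                 /* room for "./" and the NUL */
    for (; *in != '\0'; in++) {
        char c = *in;

        if (c >= 'a' && c <= 'z')
            c = (char)(c - 'a' + 'A');  /* set letters to upper case */
        if ((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
            assert(n + 3 < outsz);
            out[n++] = c;
        }
    }
    out[n++] = '.';
    out[n++] = '/';
    out[n] = '\0';
    return n;
}
```

Under these assumptions, the input "Sea Biscuit, 3rd!" becomes SEABISCUIT3RD./ and can be concatenated into the key in the same way as a numeric source.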

5. Handling Real World Problems

In the real world, problems can arise in following the steps and flow outlined above. Some problems that have actually arisen are described below with recommendations for handling them.

5.1. Uncertainty as to the Nomcom Pool

Every reasonable effort should be made to see that the published NomCom pool, from which selection is made, consists only of persons who have definitely volunteered and are eligible. However, uncertainties may remain, especially with compressed schedules: for example, a claim to have volunteered and/or to be eligible that has not been resolved by the deadline, or a determination that someone is not eligible that occurs after the publication of the pool.

This is handled by maintaining the announced schedule, including in the published pool those whose eligibility is uncertain, and keeping the published pool list numbering immutable after it is frozen. If one or more people in the pool are later selected by the algorithm and random input but it has been determined they are ineligible, they can be skipped and subsequently selected persons used. (This is referred to in Section 6 as Type A elimination.) Thus, each uncertain pool member affects at most one selection, and no more than U selections are affected when there are U uncertain pool members.

Other courses of action are far worse. Actual insertion or deletion of entries in the pool after its publication changes the length of the list and scrambles who is selected. Even if done before the random numbers are known, such fiddling with the list after its publication looks bad. To avoid schedule slips, there MUST be clear fixed firm public deadlines and someone who challenges their absence from the pool after the published deadline MUST have their challenge automatically denied for tardiness even if their delay is not the fault of the challenger.
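
The skipping of ineligible selectees described in this section amounts to walking down the ordered selection list while leaving the pool numbering untouched. The following C fragment is a minimal sketch of that walk; the function name take_eligible and the flag array are illustrative assumptions and do not appear in the example code of Section 10.

```c
#include <assert.h>
#include <stddef.h>

/* ordered[]     selection order produced by the algorithm, as original
 *               (published) pool positions, 1-based
 * eliminated[]  indexed by pool position; nonzero if that person has
 *               been determined to be ineligible
 * Copies the first `needed` non-eliminated positions into out[] and
 * returns how many were found (fewer than `needed` if the ordered list
 * runs out). */
size_t take_eligible(const int *ordered, size_t n_ordered,
                     const unsigned char *eliminated, int pool_size,
                     int *out, size_t needed)
{
    size_t taken = 0;

    for (size_t i = 0; i < n_ordered && taken < needed; i++) {
        assert(ordered[i] >= 1 && ordered[i] <= pool_size);
        if (!eliminated[ordered[i]])
            out[taken++] = ordered[i];
    }
    return taken;
}
```

Applied to the selection order of the worked example in Section 7 (4, 30, 19, 13, 7, 22, 12, 28, 15, 11, 5, ...), marking any one of the first ten as eliminated leaves the other nine in place and brings in pool member 5 as the tenth.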

5.2. Randomness Ambiguities

The best good faith efforts have been made to specify precise and unambiguous sources of randomness. These sources have been made public in advance and there has not been objection to them. However, it has happened that when the time comes to actually get and use this randomness, the real world has thrown a curve ball and it isn't quite clear what data to use. Problems have particularly arisen in connection with individual stock prices, volumes, and financial exchange rates or indices. If volumes that were published in thousands are published in hundreds, you have a rounding problem. Prices that were quoted in fractions or decimals can change to the other. If you take care of every contingency that has come up in the past, you might be hit with a new one. When this sort of thing happens, it is generally too late to announce new sources, an action which could raise suspicions of its own as well as causing substantial delay. About the only course of action is to make a reasonable choice within the ambiguity and depend on confidence in the good faith of the selection administrator. With care, such cases should be extremely rare.

Based on these experiences, it is again recommended that public lottery numbers or the like be used as the random inputs and financial volumes or prices avoided.

6. Extended NomCom Selection

There may be reasons why one or more of the selected members of the pool need to be eliminated and further selections made. This is particularly true for the IETF NomCom given the strong recommendation above that, in case of doubt or not-yet-resolved eligibility dispute, possible pool members should be left in the pool with the understanding that, in the event they are selected, they can be eliminated should it be decided they are not eligible.

For the IETF NomCom, the reasons for elimination are divided into two categories, A and B, below. Only eliminations for category B reasons require the extension mechanisms of this section.

A.

Elimination due to direct rule enforcement by the administrator. Examples would be someone who did not meet the eligibility requirements, someone whose inclusion would violate the rule (or similar future rules) limiting the number of NomCom voting members with the same sponsor, or all but one occurrence of someone included multiple times due to a name change or similar confusion. When there are such eliminations in the initial selectees, the administrator simply goes further down the ordered list produced with the initial randomness sources until there are the desired number of selectees who are not eliminated by such decisions. The administrator SHOULD announce who has been eliminated and the reason for the administrator's decision to eliminate them.

B.

Eliminations due to inability by the administrator to obtain confirmation of agreement from the selectee to serve before an established deadline. For example, either the selectee declines to serve or, despite reasonable efforts, a response cannot be obtained from the selectee as to whether they are willing to serve.

(The elimination of someone due to non-contactability may be viewed by the individual involved as working a hardship for them if it was due to no fault of their own and they wanted to serve. But there is no reasonable alternative if a NomCom voting membership of volunteers with a confirmed agreement to serve is to be finalized in a timely manner. Since someone so eliminated will, as provided below, be replaced by another randomly selected and fully qualified pool member, there is no problem from the point of view of NomCom composition.)

It will frequently be the case that, after the initial selection from the pool and the handling of any Type A eliminations as above, there will be a small number of Type B eliminations. If no further action were taken, there would be an insufficient number of people selected and not eliminated. If additional selectees were found in such a case by just going further down the ordered list, as with Type A eliminations, this would give initially selected persons the ability, by declining to serve, to in effect transfer their voting NomCom membership to a known different person, since the entire initial ordered list is, at that point, publicly known. Some believe this is a problem, so it is resolved by the administrator iteratively using what is essentially a miniature version of the initial selection to re-randomize the remaining pool members.

6.1. Preparing for Possible Extension

Before the announcement of the public randomness sources, the administrator determines a secret random seed R, possibly using the techniques given in Section 4, using secret sources of randomness which MUST be different from those publicly announced for the initial selection. An example of such a source is multiple rolls of a 20-sided die with numbered sides. The administrator MUST record this secret random seed and SHOULD record its randomness source(s), although these need not be publicly verifiable.

The administrator then secretly calculates and records a hash chain using the SHA-256 [RFC6234] hash function, denoted as H, as follows: denote H(R) as H[1](R), H(H(R)) = H(H[1](R)) as H[2](R), H(H(H(R))) = H(H[2](R)) as H[3](R), ... H(H[N-1](R)) as H[N](R), where N is a number chosen by the administrator as somewhat larger than the maximum plausible number of times it might be necessary to extend selection due to Type B eliminations. It would always be safe to set N to the size of the pool minus the number of people to be selected but, as a practical matter for IETF NomCom selection, an N of 20 or so should be a generous allowance.

The last hash chain value, H[N](R), is publicly announced at the same time as the publicly verifiable randomness sources and algorithm and is used as specified in Step 5.B in Section 4.

6.2. Extension Procedure

  1. The new pool consists of the initial pool in the same order without any selectees who have agreed to serve and without any pool members eliminated by any earlier Type A or B eliminations.
  2. The new randomness is the next earlier value in the hash chain, that is H[N - 1](R). This randomness is used as part of the message being hashed by HMAC-SHA-256 as specified in Step 5.B in Section 4. The key remains the same. (See worked example and the example code below.)
  3. The administrator publicly announces the selectees who were not eliminated, how many additional selections are needed, and H[N - 1](R). Since H[N](R) was previously made public, anyone can check that the administrator has correctly announced H[N - 1](R) by calculating H(H[N - 1](R)) and comparing it with H[N](R). The administrator announces the extended selections and any further selectees from the extended selection due to category A eliminations.
  4. The administrator still needs to check for category B eliminations among the new Extended Selection selectees. At this point in the process, the time constraints are likely to be very tight, so contacting extension selectees to be sure they are still willing to serve MUST be done urgently and with a very tight deadline. Since there may be further category B eliminations among the extended selectees, more than one cycle of Extended Selection may be needed. If so, steps 1 through 4 are repeated with minor modifications as follows: For Step 1, those in the pool before the next extension are all those from the pool who have not been selected or been subject to category A or B elimination so far. In particular, note that because they have been previously eliminated and to avoid various complex disputes and timing race conditions, someone who was uncontactable or declined to serve in an earlier round does NOT become eligible for later rounds even if they later become contactable or change their mind about declining. For Step 2, the next earlier hash in the hash chain is used as the additional randomness in the message hashed. In Step 3, the hash chain value announced is H[N-E](R) where this is the Eth Selection Extension.

The use of a hash chain, as prepared in Section 6.1 and used in step 2 above, is a well-known technique that first appeared in [Lamport] and is used in [RFC1760]. Because the hash function H is assumed to be non-invertible, the public announcement of H[N](R) or any other value in the chain does not reveal any earlier values in the hash chain. While the administrator could try various values of R and could thus influence the value of H[N](R) or other H[*](R), this does not provide any control over the selections because the hash chain value is combined with the output of the pre-specified public randomness sources using HMAC-SHA-256.
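
The hash chain and the public check on each revealed value can be sketched as follows. This C fragment is illustrative only and is not the example code of Section 10, which uses the SHA-256 reference code from [RFC6234]; here a compact SHA-256 is included solely to keep the sketch self-contained. The names sha256, hash_chain, and chain_link_ok are assumptions, as is the choice that each link hashes the raw 32-byte binary value of the previous link.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static const uint32_t K[64] = {
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
    0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
    0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc,
    0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7,
    0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13,
    0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3,
    0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5,
    0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208,
    0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
};

static uint32_t rotr(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* SHA-256 compression of one 64-byte block into the state st. */
static void sha256_block(uint32_t st[8], const uint8_t b[64])
{
    uint32_t w[64], v[8];

    for (int i = 0; i < 16; i++)
        w[i] = (uint32_t)b[4 * i] << 24 | (uint32_t)b[4 * i + 1] << 16 |
               (uint32_t)b[4 * i + 2] << 8 | b[4 * i + 3];
    for (int i = 16; i < 64; i++)
        w[i] = w[i - 16] + w[i - 7] +
               (rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)) +
               (rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10));
    memcpy(v, st, sizeof v);
    for (int i = 0; i < 64; i++) {
        uint32_t t1 = v[7] + (rotr(v[4], 6) ^ rotr(v[4], 11) ^ rotr(v[4], 25)) +
                      ((v[4] & v[5]) ^ (~v[4] & v[6])) + K[i] + w[i];
        uint32_t t2 = (rotr(v[0], 2) ^ rotr(v[0], 13) ^ rotr(v[0], 22)) +
                      ((v[0] & v[1]) ^ (v[0] & v[2]) ^ (v[1] & v[2]));
        memmove(v + 1, v, 7 * sizeof v[0]);    /* shift working variables */
        v[4] += t1;
        v[0] = t1 + t2;
    }
    for (int i = 0; i < 8; i++)
        st[i] += v[i];
}

/* SHA-256 of msg[0..len-1], written to out as 32 bytes. */
void sha256(const uint8_t *msg, size_t len, uint8_t out[32])
{
    uint32_t st[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                       0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 };
    uint8_t blk[64] = { 0 };
    uint64_t bits = (uint64_t)len * 8;
    size_t full = len - len % 64;

    for (size_t i = 0; i < full; i += 64)
        sha256_block(st, msg + i);
    memcpy(blk, msg + full, len - full);
    blk[len - full] = 0x80;                    /* padding marker */
    if (len - full >= 56) {                    /* no room for the length */
        sha256_block(st, blk);
        memset(blk, 0, sizeof blk);
    }
    for (int j = 0; j < 8; j++)
        blk[63 - j] = (uint8_t)(bits >> (8 * j));
    sha256_block(st, blk);
    for (int j = 0; j < 32; j++)
        out[j] = (uint8_t)(st[j / 4] >> (24 - 8 * (j % 4)));
}

/* chain[k-1] receives H[k](R) for k = 1..n; chain[n-1] is published. */
void hash_chain(const uint8_t *seed, size_t seed_len, int n,
                uint8_t chain[][32])
{
    assert(n >= 1);
    sha256(seed, seed_len, chain[0]);
    for (int k = 1; k < n; k++)
        sha256(chain[k - 1], 32, chain[k]);
}

/* Public check: does H(revealed) equal the previously published value? */
int chain_link_ok(const uint8_t revealed[32], const uint8_t published[32])
{
    uint8_t h[32];

    sha256(revealed, 32, h);
    return memcmp(h, published, 32) == 0;
}
```

In this sketch the administrator would publish chain[n - 1] before the randomness is known and reveal chain[n - 1 - E] for the Eth extension; anyone can then confirm the revealed value by calling chain_link_ok with it and the value announced in the previous round.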

Multiple extension cycles may be required, so the selection administrator should allow enough time for at least 5 of them. For example, in the selection of the 2022/2023 NomCom, 3 extensions would have been required: The pool was, by historical standards, huge, with 267 members, the largest up to that time. In the initial selection, one of the 10 potential selectees was category B eliminated because confirmation of their willingness to serve could not be obtained in a timely fashion. In the 1st Extended Selection, the 11th potential selectee was category B eliminated because they declined to serve and the 12th was category A eliminated because there were already two selectees with the same sponsor. In the 2nd Extended Selection, the 13th potential selectee also declined to serve. In the 3rd Extended Selection, the 14th potential selectee became the final voting member of the NomCom when they confirmed their willingness to serve.

7. Fully Worked Examples

>> EXAMPLE NEEDS TO ALSO COVER THE SECTION 5 EXTENSION PROVISIONS. <<

  1. Assume the eligible volunteers published in advance of selection are the numbered list of 31 past NomCom Chairs appearing below in Appendix A.

  2. Assume the following (fake example) ordered list of randomness sources:

    2.1 The Kingdom of Alphaland State Lottery daily number for 1 November 2025 treated as a single five-digit integer.

    2.2 (a) The People's Democratic Republic of Betastani State Lottery six winning numbers for 1 November 2025 and then (b) the seventh "extra number" for that day as if it was a separate random source.

Hypothetical randomness publicly produced:

Source 1: 29319

Source 2a: 9, 61, 26, 34, 42, 41

Source 2b: 55

Resulting seed string:

29319./9.26.34.41.42.61./55./

The table below gives, in Base64, each derived random value computed as in step 5 of Section 4 from the above key string and a two-byte counter that is successively 0x0000, 0x0001, 0x0002, and so on for the 15 values shown. For each value, the table also gives the divisor (the size of the remaining pool at that stage) and the index of the selectee according to the original numbering of the pool.

Table 3
index  Base64 value of SHA-256                       div  selected
  1    fgSNUcziqvUcd1j46xGZdpLQmgyW+OZzGfJAx2/EyS0=   31   >  4 <
  2    kMd2sgTSiCF1o11lM6Rs8yeQeRMLPnZo5k0wSFPMjHw=   30   > 30 <
  3    pwrk69jq8cUF5KrD0vg31SQMOvtf5117Y6Ox5cm38f0=   29   > 19 <
  4    KRXZEdXGiprKvqQ2aSnzYQpzaE0YwlfyDTBBI+R8kv8=   28   > 13 <
  5    K2qq2NImq28ESPaVB9uCVrI0tPT/NOYAtryUcjGpzt8=   27   >  7 <
  6    8PQ4tm652Kr8yV2D2OBKAYrKxWtkddxqtiMvIuknhgU=   26   > 22 <
  7    fJQRVYErqgAmJAs7a01/SoACdnCBNcqzrGbUsFticjM=   25   > 12 <
  8    wlfiQaw6S/bxcbT2u+7oshpAFxrsy6wIZyFD+uWle80=   24   > 28 <
  9    ekEoRHYTkT6p5m2fP3mn354kQSI1pz/B1RKC+Fa8YXA=   23   > 15 <
 10    ggmvds6SzOGPwr8vUwSPNHtk7WIsQLYiO2tl0V3yzZQ=   22   > 11 <
 11    ntjVm6AGBtydG6l9aiTSSojdcp6UcYhk55Rg71y0Z+s=   21   >  5 <
 12    CE14MeW+JUzb+D/gQ82dJF62NBapfROt7Ff2ngkT/XE=   20   > 27 <
 13    ZRYzTo0OZ0ASx5keWlh3YH1Di4o9p5jefz+MCWmWjFk=   19   > 23 <
 14    lvA2rjCw7sT0+SVNOZB29HZOVvIAiS3yA85wqE9ugPk=   18   >  6 <
 15    aQy+Eof9q4MbDZam/D+Sxc5yLixLYdArJ6kr1KmrbKA=   17   > 14 <

Resulting first ten selected, in order selected:

Table 4
 1. G. Huston (4)            6. M. Richardson (22)
 2. R. Salz (30)             7. D. McPherson (12)
 3. S. Krishnan (19)         8. B. Stark (28)
 4. R. Droms (13)            9. L. Dondet (15)
 5. A. Doria (7)            10. R. Draves (11)

Should one of the above turn out to be ineligible or otherwise be eliminated for a Type A reason, the next would be M. St.Johns, number 5.

8. Security Considerations

Careful choice should be made of randomness inputs so that there is no reasonable likelihood that they are under the control of the administrator. Guidelines given above to use a reasonably small number of inputs with a substantial amount of entropy from the last should be followed. Equal care needs to be taken that the algorithm selected is faithfully executed with the designated input values.

Publication of the random inputs and results, including the hash chain seed R (Section 6), and something like a one-week window for the community of interest to duplicate the calculations and protest if there is any discrepancy should give a reasonable assurance of faithful implementation and execution.

9. IANA Considerations

This document requires no IANA actions.

10. Source Code

The C source code below makes use of the SHA-256 reference code from [RFC6234]. The original code in [RFC2777] was written by Donald Eastlake except for the code dealing with multiple floating point number input which was written by Matt Crawford. The [RFC2777] code could only handle pools of up to 255 members and was extended to 2**16-1 by Erik Nordmark for the code in [RFC3797]. Both of these earlier versions used MD-5 [RFC1321] rather than SHA-256.

Python code by Rich Salz to implement the method in [RFC3797] is available at https://github.com/richsalz/ietf-rfc3797

The code below uses HMAC-SHA-256 [RFC6234] and has provisions for extended selections (see Section 6). It has been compiled and tested. While no flaws were found, it is possible that when used with some compiler on some system under some circumstances some flaw will manifest itself.

<CODE BEGINS>

//*****************************************************************
/*  Example code for
 *      "Publicly Verifiable Random Selection"
 *          Donald E. Eastlake 3rd
 *              Original February 2004
 *              Updated August 2022 and June/July 2023
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, is permitted pursuant to, and subject
 * to the license terms contained in, the Revised BSD License
 * set forth in Section 4.c of the IETF Trust's Legal Provisions
 * Relating to IETF Documents
 * (http://trustee.ietf.org/license-info).                       */
//*****************************************************************

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <ctype.h>


//  SHA-256, HMAC, RFC 6234
//  Note: If you build this with the RFC 6234 sources then, because
//        of the way that HMAC dispatches on the SHA type, you have
//        to include in your build not just sha224-256.c and
//        sha-private.h but also sha1.c and sha384-512.c.

#include "sha.h"


// CONSTANTS

#define MAXLINE 256           /* maximum input line */
#define MAXGENERATIONS 99     /* maximum hash chain length */


//  Local prototypes in alphabetic order
//*****************************************************************
void     b64parse ( char *input, uint8_t *output );
void     b64print ( uint8_t *p, int length );
void     b64toHex ( void );
int      b64v ( char );
void     CheckSum ( int gen, uint8_t *data,
                    int datalength, uint8_t *result );
int      getChain ( int *gen, uint8_t *hash64 );
long int getInteger ( char *prompt );
int      getNP ( void );
int      getSeed ( char *key );
void     hashChain ( void );
void     hashSHA256 ( int errreturn, int errloc );
void     hexprint ( uint8_t *p, int length );
void     hexToB64 ( void );
int      longremainder ( unsigned int divisor,
                         uint8_t hash[SHA256HashSize] );
double   NPentropy ( void );
void     pick ( void );    // RFC 3797 but with HMAC-SHA-256
void     selectExt ( void );  // [this document]
void     testCrypto ( void );


//  Global Variables
//*****************************************************************
char           tin[MAXLINE+2];  // type in buffer
int            keysize;
char           key[800];  // where key string is accumulated
unsigned int   N; // Number of items to be selected
unsigned int   P; // Size of pool
char           b64[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                       "abcdefghijklmnopqrstuvwxyz0123456789+/";
int            debug = 0;/* debug level, = 1 print some extra stuff
                           > 1 print even more extra stuff */


//  Main driver/dispatch routine
//*****************************************************************
int main ( int argc, const char * argv[] ) {
 char       *cherr;
 int         i;
 char        ch;

nextcommand:
 printf ( "How may I serve you? " );
 cherr = fgets( tin, MAXLINE, stdin );   // get command
 if ( cherr == NULL )
     exit ( 102 );
 for ( i = 0; i < MAXLINE; ++i ){
     ch = tin[i];
     if ( debug > 0 )
         printf ( "%s%X%s",
             "Command character 0X", ch, "\n");
     switch ( ch ) {
         case '?': // help
             printf ( " ? -> Help\n"
                     " d -> set debug level\n"
                     " e -> entropy needed\n"
                     " h -> hash chain\n"
                     " p -> pick from pool\n"
                     " q -> quit\n"
                     " s -> select with extensions\n" );
             if ( debug )
                 printf ( " t -> test Crypto\n"
                         " 8 -> hex to base64\n"
                         " 9 -> base64 to hex\n" );
             // falls through
         case 0: case '\n': // "gets" string is zero terminated
             goto nextcommand;
         case ' ': case 127: case '\t': // skip white space
         case'\v': case '\r': case '\f': case '\b':
             continue;  // try next character
         case 'd': case 'D': // set debug level
             debug = (int)getInteger ( "Set debug level" );
             if ( debug > 1 ) {
                 printf ( "argc: %i\n", argc );
                 if ( argc > 0 )
                     printf ( "%s\n", argv[0] );
             }
             goto nextcommand;
         case 'e': case 'E':  // calculate entropy needed
             if ( !getNP ( ) )
                 NPentropy ( );
             goto nextcommand;
         case 'h': case 'H': // calculate hash chain
             if ( !getSeed ( key ) )
                 hashChain ( );
             goto nextcommand;
         case 'p': case 'P':  // pick from pool
             if ( !getNP ( ) && !getSeed ( key ) )
                 pick ( );
             goto nextcommand;
         case 'q': case 'Q': case '\a':  // quit
             exit ( 0 );
         case 's': case 'S':  // select with extensions
             if ( !getNP ( ) && !getSeed ( key ) )
                 selectExt ( );  // [this document]
             goto nextcommand;
         case 't': case 'T':
             testCrypto ( );
             goto nextcommand;
         case '8':
             hexToB64 ( );
             goto nextcommand;
         case '9':
             b64toHex ( );
             goto nextcommand;
         default:
             printf ( "%s%s%s", "Undefined command: ",
                      tin, "\n");
             goto nextcommand;
         }  // end switch
 }  // end "for i"
}  // end main


//  parse some base64
//  assumes input already cleaned to just Base64 characters
//    excluding '=' and is a legal length, zero terminated.
//*****************************************************************
void b64parse ( char *input, uint8_t *output ) {
 int     i, k;

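 for ( i = 0, k = 0; i < MAXLINE; i += 4, k += 3 ) {
     if ( !input[i] || !input[i+1] )
         break;  // no complete byte left
     // 6 bits of 1st char, top 2 bits of 2nd char
     output[k] = (uint8_t)( ( b64v ( input[i] ) << 2 ) |
                 ( ( b64v ( input[i+1] ) >> 4 ) & 0x3 ) );
     if ( !input[i+2] )
         break;  // group held only one byte
     // bottom 4 bits of 2nd char, top 4 bits of 3rd char
     output[k+1] = (uint8_t)( ( b64v ( input[i+1] ) << 4 ) |
                   ( ( b64v ( input[i+2] ) >> 2 ) & 0xF ) );
     if ( !input[i+3] )
         break;  // group held only two bytes
     // bottom 2 bits of 3rd char, 6 bits of 4th char
     output[k+2] = (uint8_t)( ( b64v ( input[i+2] ) << 6 ) |
                   b64v ( input[i+3] ) );
 }  // end for i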
}  // end b64parse


//  print binary as base64
//*****************************************************************
void b64print ( uint8_t *p, int length ) {
 uint8_t   nib;

 while ( length > 0 ) {
     nib = p[0] >> 2;
     printf ( "%c", b64[nib] ); // print 1st 6 bits
     nib = ( p[0] & 0x3 ) << 4; // get bottom 2 bits of 1st byte
     if ( --length ) {
         nib += ( p[1] >> 4 ); // get top 4 bits of 2nd byte
         printf ( "%c", b64[nib] ); //print 2nd 6 bits
         nib = ( p[1] & 0xF ) << 2; //get bottom 4 bits of 2nd byte
         if ( --length ) {
             nib += ( p[2] >> 6 ); // get top 2 bits of 3rd byte
             printf ( "%c%c", b64[nib], b64[ p[2] & 0x3F ] );
       }
       else  // length was 2, print rest of 2nd byte
           printf ( "%c=", b64[nib] );
     }
     else  // length was 1, print rest of 1st byte
         printf ( "%c==", b64[nib] );
     p += 3;
     --length;
 } // while
} // end b64print


//  read base64, print as hex
//*****************************************************************
void b64toHex ( void ) {
 char       clean[MAXLINE];
 uint8_t    val64[((MAXLINE*3)/4)+2];
 int        i, j, k, v;
 int        equalsigns = 0;
 char      *cherr;

 printf ( "Type some Base64: " );
 cherr = fgets ( tin, MAXLINE, stdin );
 if ( cherr == NULL )
     exit ( 902 );
 for ( i = 0, j = 0; i < MAXLINE; ++i ) {
     if ( ( tin[i] == '\n' ) || ( tin[i] == 0 ) )
         break;  // end of the line
     if ( isspace ( tin[i] ) )
         continue;  // skip white space
     if ( isalnum ( tin[i] ) ||
          ( tin[i] == '+' ) || ( tin[i] == '/') ) {
         if ( equalsigns ) {
             printf ( "Stuff after an equal sign.\n" );
             return;
             }
         clean[j] = tin[i];
         ++j;
         continue;
         }
     if ( tin[i] == '=' ) {
         switch ( equalsigns ) {
             case 0:
                 v = j % 4;
                 if ( ( v != 2 ) && ( v != 3 ) )
                     printf ( "Wrong length before '='.\n" );
                 // fall through
             case 1:
                 ++equalsigns;
                 break; // out of switch equalsigns
             case 2:
                 printf ( "Too many equal signs.\n" );
                 return;
             } // switch equalsigns
         } // if equal sign
 } // for i
 clean[j] = 0;
 if ( debug ) {
     hexprint ( (uint8_t *)clean, j+1 );
     printf ( " " );
     }
 v = j % 4;
 if ( v == 1 ) {  // one leftover character cannot encode a byte
     printf ( "Wrong Base64 length.\n" );
     return;
     }
 b64parse ( clean, val64 );
 hexprint ( val64, ( (j*3)/4 ) );
 printf ( "\n" );
} // end b64toHex


//  convert a base64 char to int
//*****************************************************************
int b64v ( char ch ) {
 int     i;

 for ( i = 0; i < 64; ++i )
     if ( ch == b64[i] )
         return i;
 exit ( 1 );
} // end b64v


// calculate and store back a 24 bit checksum of the low order
// byte of an int and a block of bytes
// This is an FNV-32 xor folded to 24 bits
//****************************************************************
void CheckSum ( int gen, uint8_t *hash,
                int hashlength, uint8_t *result ) {
 uint32_t    temp = 0x811C9DC5; // FNV32basis
 int         i;

 temp ^= gen & 0xFF;
 temp *= 0x01000193;
 for ( i = 0; i < hashlength; ++i ) {
     temp ^= hash[i];
     temp *= 0x01000193; // FNV32prime
 } // for i
 result[2] = ( temp & 0xFF ) ^ ( temp >> 24 );
 temp >>= 8;
 result[1] = temp & 0xFF;
 result[0] = temp >> 8;
}  // end CheckSum


//  get a hash chain entry
//  return zero for success, non-zero for error/quit
//*****************************************************************
int getChain ( int *gen, uint8_t *hash ) {
 char       *cherr;
 int         j;
 char        hash64[44];
 char        check64[5];  // Base64 checksum as typed
 uint8_t     hashBin[SHA256HashSize];
 uint8_t     checkIn[3];  // checksum read in, decoded
 uint8_t     checkCalc[3];  // checksum calculated

 printf ( "Format is gen-hash=check where gen is the\n"
           " decimal generation number, hash= is the\n"
           " Base64 hash, and check the Base64\n checksum.\n"
           "Input hash chain value (or 'quit'): " );
 cherr = fgets ( tin, MAXLINE, stdin );
 if ( cherr == NULL )
     exit ( 1 );
 j = sscanf ( tin, "%2d-%43s=%4s", gen, hash64, check64 );
 if ( j != 3 ) {
     if ( ( tin[0] == 'q' ) || ( tin[0] == 'Q' ) )
         return 1;  // quit
     printf ( "Bad hash chain entry format.\n" );
     return 1;
     }
 if ( ( strlen ( hash64 ) != 43 ) || ( strlen ( check64 ) != 4 ) ) {
     printf ( "Bad hash chain entry format.\n" );
     return 1;
     }
 if ( ( *gen > MAXGENERATIONS ) || ( *gen <= 0 ) ) {
     printf ( "Bad hash chain generation.\n" );
     return 1;
     }
 b64parse ( hash64, hashBin );
 b64parse ( check64, checkIn );  // decode before comparing
 CheckSum ( *gen, hashBin, SHA256HashSize, checkCalc );
 if ( memcmp ( checkIn, checkCalc, 3 ) ) {
     printf ( "Checksum fails.\n" );
     return 1; // not equal
     }
 memcpy ( hash, hashBin, SHA256HashSize );  // hand hash to caller
 return 0;  // checksum checks
} // end getChain


//  prompt for and get an integer input
//*****************************************************************
long int getInteger ( char *prompt ) {
 long int    i;
 char        *cherr;
 int         j;

 while ( 1 ) {
    printf ( "%s (or 'quit' to exit) ", prompt );
    cherr = fgets ( tin, MAXLINE, stdin  );
    if ( cherr == NULL )
        exit ( 1 );
    j = sscanf ( tin, "%ld", &i );
    if ( j == 1 )
        return i;
    if ( ( tin[0] == 'q' ) ||
         ( tin[0] == 'Q' ) )
        exit ( j );
 }
} // end getInteger


//  get pool size and number of items to pick
//  returns zero for success, non-zero for failure
//****************************************************************
int getNP ( void ) {

 P = (unsigned int)getInteger ( "Type size of pool:" );
 if ( ( P > 65535 ) ||
      ( P <= 0 ) ) {
     printf ( "Pool zero, negative, or too big.\n" );
     return 1;
     }
 N = (unsigned int)getInteger (
     "Type number of items to be selected:" );
 if ( N > P ){
     printf ( "Pool too small.\n" );
     return 1;
     }
 if ( N <= 0 ) {
     printf ( "Selecting zero or negative things?\n" );
     return 1;
     }
 return 0;  // got possibly reasonable values
} // end getNP


//  get the "random" inputs. echo back to user so the user may
//  be able to tell if truncation or other glitches occur.
//
//  Up to 16 inputs each of which can be either up to 16 integers
//      or up to 16 floating point numbers
//
//  output 1 for failure, 0 for success
//****************************************************************
int getSeed ( char *key ) {
 long int    temp, array[16];
 int         i, j, k, k2;
 char        sarray[16][256];
 char        *cherr;

 for ( i = 0, keysize = 0; i < 16; ++i ) {
     if ( keysize > 511 ) {
         printf ( "Too much input.\n" );
         return 1;
         }
 nexttry:
     printf (
 "Type #%d randomness, 'end', or 'quit' followed by new line.\n",
         i+1 );
     if ( i == 0 )
         printf (
         "Up to 16 integers or the word 'float' followed by up\n"
         "to 16 x.y format reals.\n" );
     cherr = fgets ( tin, MAXLINE, stdin );
     if ( cherr == NULL )
         exit ( 403 );
     j = sscanf ( tin,  // try to parse as "long int"s
                "%ld%ld%ld%ld%ld%ld%ld%ld%ld%ld%ld%ld%ld%ld%ld%ld",
                 &array[0], &array[1], &array[2], &array[3],
                 &array[4], &array[5], &array[6], &array[7],
                 &array[8], &array[9], &array[10], &array[11],
                 &array[12], &array[13], &array[14], &array[15] );
     if ( j == EOF )  // empty input
         goto nexttry;
     if ( !j ) {
         if ( ( tin[0] == 'q' ) || ( tin[0] == 'Q' ) ) // "q"uit
             return 1;
         if ( ( tin[0] == 'e' ) || ( tin[0] == 'E' ) ) // "e"nd
             break; // break out of "for i"
         else {   // floating point code by Matt Crawford
             j = sscanf ( tin,
                 "float %ld.%[0-9]%ld.%[0-9]%ld.%[0-9]%ld.%[0-9]"
                 "%ld.%[0-9]%ld.%[0-9]%ld.%[0-9]%ld.%[0-9]"
                 "%ld.%[0-9]%ld.%[0-9]%ld.%[0-9]%ld.%[0-9]"
                 "%ld.%[0-9]%ld.%[0-9]%ld.%[0-9]%ld.%[0-9]",
                 &array[0], sarray[0], &array[1], sarray[1],
                 &array[2], sarray[2], &array[3], sarray[3],
                 &array[4], sarray[4], &array[5], sarray[5],
                 &array[6], sarray[6], &array[7], sarray[7],
                 &array[8], sarray[8], &array[9], sarray[9],
                 &array[10], sarray[10], &array[11], sarray[11],
                 &array[12], sarray[12], &array[13], sarray[13],
                 &array[14], sarray[14], &array[15], sarray[15] );
             if ( ( j == 0 ) || ( j & 1 ) ) {
                 printf ( "Bad format.\n" );
                 return 1;
             }
             else {
                 for ( k = 0, j /= 2; k < j; k++ ) {
                     /* strip trailing zeros, stopping at the
                        start of the string */
                     for ( k2 = (int)strlen(sarray[k]);
                         ( k2 > 0 ) && ( sarray[k][k2-1] == '0' );
                         --k2 )
                             sarray[k][k2-1] = '\0';
                     printf ( "%ld.%s\n", array[k], sarray[k] );
                     keysize += sprintf ( &key[keysize], "%ld.%s",
                                          array[k], sarray[k] );
                 }  // end "for k"
                 keysize += sprintf ( &key[keysize], "/" );
             }
         }  // end floating point "else"
     }  // end "if ( !j )"
         else
         { // sort integer values, not a very efficient algorithm
             for ( k2 = 0; k2 < j - 1; ++k2 )
                 for ( k = 0; k < j - 1; ++k )
                     if ( array[k] > array[k+1] ) {
                         temp = array[k];
                         array[k] = array[k+1];
                         array[k+1] = temp;
                         }
             for ( k = 0; k < j; ++k ) {  // print for user check
                 printf ( "%ld ", array[k] );
                 keysize += sprintf ( &key[keysize], "%ld.",
                                      array[k] );
             }  // end "for k"
             printf ( "\n" );
             keysize += sprintf ( &key[keysize], "/" );
         }  // end "if ( !j )" else
 }  // end "for i"
 if ( i == 0 ) {
     printf ( "No key input.\n" );
     return 1;
     }
 printf ( "Key is:\n %s\n", key );
 return 0;
}  // end getSeed


//  print out a hash Chain based on key
//*****************************************************************
void hashChain ( void ) {
 int            i;
 long int       j;
 SHA256Context  context;
 uint8_t        hash[SHA256HashSize];
 uint8_t        check[3];

 j = getInteger ( "Length of chain to print:" );
 if ( j > MAXGENERATIONS ) {
     j = MAXGENERATIONS;
     printf ( "Chain length clipped at %d.\n", MAXGENERATIONS );
     }
 testCrypto ( );
 hashSHA256 ( SHA256Reset ( &context ), 511 );
 hashSHA256 ( SHA256Input ( &context,
                            (uint8_t *)key,
                            (int)strlen ( key ) ), 512 );
 hashSHA256 ( SHA256Result ( &context, hash ), 513 );
 if ( debug ) {
     printf ( "Hex of SHA-256 of Key:\n" );
     hexprint ( hash, SHA256HashSize );
     printf ( "\n" );
     }
 printf ( "Generation- HashValue= Checksum\n00-" );
 b64print ( hash, SHA256HashSize );
 CheckSum ( 0, hash, SHA256HashSize,check );
 b64print ( check, 3 );
 printf ( "\n" );
 for ( i = 1; i <= j; ++i ) {
     hashSHA256 ( SHA256Reset ( &context ), 521 );
     hashSHA256 ( SHA256Input ( &context, hash, SHA256HashSize ),
                  522 );
     hashSHA256 ( SHA256Result ( &context, hash ), 523 );
     printf ( "%02d-", i );
     b64print ( hash, SHA256HashSize );
     CheckSum ( i, hash, SHA256HashSize, check );
     b64print ( check, 3 );
     printf ( "\n" );
 } // for i
} // end hashChain


//  check SHA256/HMAC return code
//*****************************************************************
void hashSHA256 ( int errreturn, int errloc ) {

 if ( !errreturn )  // zero -> success
     return;
 else
     printf ( "SHA returns error %i at %i.\n",
              errreturn, errloc );
 exit ( 1 );
}  // end hashSHA256


// print out a SHA-256 hash in hex
//****************************************************************
void hexprint ( uint8_t *p, int length ) {
 int    i;

 for ( i = 0; i < length; ++i ) {
     printf ( "%02X", p[i] );
 } // for i
} // end hexprint


//  read hex, print as base64
//*****************************************************************
void hexToB64 ( void ) {
 uint8_t   hexval[(MAXLINE/2)+2];
 char      clean[MAXLINE];
 char     *cherr;
 int       i, j, v;

 printf ( "Type some bytes in hex: " );
 cherr = fgets ( tin, MAXLINE, stdin );
 if ( cherr == NULL )
     exit ( 1102 );
 for ( i = 0, j = 0; i < MAXLINE; ++i ) {
     if ( ( tin[i] == '\n' ) || ( tin[i] == 0 ) )
         break;  // end of the line
     if ( isspace ( tin[i] ) )
         continue;  // skip white space
     if ( isxdigit ( tin[i] ) ) {
         clean[j] = tolower ( tin[i] );
         ++j;
         continue; // for
         }
     printf ( "Non-hex digit encountered: %02X\n", tin[i] );
     return;
 }  // for i
 clean[j] = 0;
 if ( j & 1 ) {
     printf ( "Odd number of hex digits? %i\n", j );
     return;
 }
 for ( i = 0; i < j; ++i ) { // from clean to hexval
     if ( clean[i] >= 'a' && clean[i] <= 'f' )
         v = clean[i] - 'a' + 10;
     else
         v = clean[i] - '0';
     if ( i & 1 )
         hexval[i/2] += v;
     else
         hexval[i/2] = v << 4;
 }  // for i
 if ( debug ) {
     hexprint ( hexval, j/2 );
     printf ( "\n" );
     }
 b64print ( hexval, j/2 );
 printf ( "\n" );
}  // end hexToB64


// get remainder of dividing a SHA-256 hash
//   by a small positive number
//****************************************************************
int longremainder ( unsigned int divisor,
                    uint8_t hash[SHA256HashSize] ) {
 long int     kruft;
 int          i;

 if ( divisor <= 0 )
     exit ( 1 );
 for ( i = 0, kruft = 0; i < SHA256HashSize; ++i )
     {
     kruft = ( kruft << 8 ) + hash[i];
     kruft %= divisor;
     }
 return (int)kruft;
}  // end longremainder


//  calculate how many bits of entropy it takes to select N from P
//      without replacement. Print and return it.
//****************************************************************
/*               P!
    log  ( ----------------- )
       2    N! * ( P - N )!
*/
double NPentropy ( void )
{
 long int    i;
 double      result = 0.0;

 if (    ( N < 1 )   // not selecting anything?
    ||   ( N >= P )  // selecting all of pool or more?
    )
    result = 0.0;    // degenerate case
 else {
     for ( i = P; i > ( P - N ); --i )
         result += log ( i );
     for ( i = N; i > 1; --i )
         result -= log ( i );
     /* divide by [ log (base e) of 2 ] to convert to bits */
     result /= log ( 2 );
 }
 printf ( "Approximately %.1f bits of entropy needed.\n",
                     result );
 return result;
} // end NPentropy


//  Pick N items from the pool of P items using the probe method
//****************************************************************
void pick ( void ) {
 unsigned short     *selected;
 HMACContext        context;
 uint8_t            hash[SHA256HashSize];
 uint8_t            message[64];
 unsigned int       i, remaining, divisor;
 int                j, k;

 selected =
     (unsigned short *)malloc ( P * sizeof ( unsigned short ) );
 if ( !selected ) {
     printf ( "Out of memory.\n" );
     exit ( 1 );
 }
 for ( i = 0; i < P; ++i )
     selected [i] = (unsigned short)(i + 1);
 printf ( " No extensions.\n "
"index       base64 value of HMAC-SHA-256        div selected\n"
         );
 remaining = N;
 divisor = P;
 testCrypto ( );
 for ( i = 0; i < N; ++i, --remaining, --divisor ) {
     hashSHA256 (
         hmacReset ( &context, SHA256,
                     (uint8_t *)key,
                     (int)strlen ( key ) ),
         201 );
     for ( j = 0; j < 64; ++j ) {
         if ( j & 1 )
             message[j] = i & 0xFF;
         else
             message[j] = i >> 8;
     }
     if ( debug > 1 ) {
       printf ( "message:" );
       hexprint ( message, 64 );
       printf ( "\n" );
     }
     hashSHA256 ( hmacInput ( &context, message, 64 ), 202 );
     hashSHA256 ( hmacResult ( &context, hash ), 203 );
     k = longremainder ( divisor, hash );
     for ( j = 0; j < P; ++j) {
       if ( selected[j] )
             if ( --k < 0 ) {
                 printf ( "%3d ", i + 1 );
                 b64print ( hash, SHA256HashSize );
                 printf ( " %3d  >%3d<\n", divisor,
                                           selected[j] );
                 selected[j] = 0;
                 break;  // for j
             }
     } // for j
 } // for i
 free ( (void *)selected );
} // end pick


//  Select items from a pool with possible extensions
//  You must already have a hashChain() you have saved
//****************************************************************
void selectExt ( void ) {
 unsigned short     *selected;
 HMACContext        context;
 uint8_t            hash[SHA256HashSize];
 uint8_t            chainhash[SHA256HashSize];
 uint8_t            message[64];
 unsigned int       remaining, divisor, cumulative = 0;
 int                i, j, k;
 int                stepIn;
 int                stepPrev = 0;
 int                extension = 0;

 selected =
     (unsigned short *)malloc ( P * sizeof ( unsigned short ) );
 if ( !selected ) {
     printf ( "Out of memory.\n" );
     exit ( 1 );
 }
 for ( i = 0; i < P; ++i )
     selected [i] = (unsigned short)(i + 1);
 printf ( "Input final hash chain string.\n" );
 remaining = N;
 divisor = P;

extendloop:
 if ( getChain ( &stepIn, chainhash ) ) {
     free ( (void *)selected );  // do not leak on quit or error
     return;
     }
 if ( stepPrev && ( stepIn != (stepPrev - 1) ) ) {
   printf ( "Wrong generation hash chain string. "
            "Should have been %i.\n", stepPrev - 1 );
   goto extendloop;
   }
 // set first half of message from hash chain
 for ( i = 0; i < 32; ++i )
   message[i] = chainhash[i];
 stepPrev = stepIn;
 if ( extension )
     printf ( " Extension #%i.\n", extension );
 else
     printf ( " Initial selection.\n" );
 printf (
 "index       base64 value of HMAC-SHA-256        div selected\n"
         );
 for ( i = 0; i < N; ++i, --remaining, --divisor ) {
     hashSHA256 (
         hmacReset ( &context, SHA256,
                     (uint8_t *)key, (int)strlen ( key ) ),
         301 );
     for ( j = 32; j < 64; ++j ) {
       if ( j & 1 )  // set second half of message
             message[j] = i & 0xFF;
         else
             message[j] = i >> 8;
     }
     if ( debug > 1 ) {
       printf ( "message:" );
       hexprint ( message, 64 );
       printf ( "\n" );
     }
     hashSHA256 ( hmacInput ( &context, message, 64 ), 302 );
     hashSHA256 ( hmacResult ( &context, hash ), 303 );
     k = longremainder ( divisor, hash );
     for ( j = 0; j < P; ++j) {
       if ( selected[j] )
             if ( --k < 0 ) {
                 printf ( "%3d ", cumulative + i + 1 );
                 b64print ( hash, SHA256HashSize );
                 printf ( " %3d  >%3d<\n", divisor, selected[j] );
                 selected[j] = 0;
                 break;  // for j
             }
     } // for j
 } // for i
 extension += 1;
 cumulative += N;
 N = (unsigned int)getInteger (
     "Number of picks in next extension, 0 to end: " );
 if ( N > 0 )
     goto extendloop;
 free ( (void *)selected );
} // end selectExt


//  Test that SHA-256 and HMAC code seems to be working
//****************************************************************
void testCrypto ( void ) {
 SHA256Context  contexts;
 HMACContext    contexth;
 char           test1[] = "abc";  // SHA-256
 char           test2k[] = "Jefe";  // HMAC key
 char           test2d[] = "what do ya want for nothing?";
 uint8_t        corrects[] = { 0xBA, 0x78, 0x16, 0xBF, 0x8F, 0x01,
     0xCF, 0xEA, 0x41, 0x41, 0x40, 0xDE, 0x5D, 0xAE, 0x22, 0x23,
     0xB0, 0x03, 0x61, 0xA3, 0x96, 0x17, 0x7A, 0x9C, 0xB4, 0x10,
     0xFF, 0x61, 0xF2, 0x00, 0x15, 0xAD };
 uint8_t        correcth[] = { 0x5B, 0xDC, 0xC1, 0x46, 0xBF, 0x60,
     0x75, 0x4E, 0x6A, 0x04, 0x24, 0x26, 0x08, 0x95, 0x75, 0xC7,
     0x5A, 0x00, 0x3F, 0x08, 0x9D, 0x27, 0x39, 0x83, 0x9D, 0xEC,
     0x58, 0xB9, 0x64, 0xEC, 0x38, 0x43 };
 uint8_t        hash[SHA256HashSize];

 hashSHA256 ( SHA256Reset ( &contexts ), 1201 );
 hashSHA256 ( SHA256Input ( &contexts, (uint8_t *)test1, 3 ),
              1202 );
 hashSHA256 ( SHA256Result ( &contexts, hash ), 1203 );
 if ( memcmp ( hash, corrects, SHA256HashSize ) ) {
     printf ( "SHA256 not working.\n" );
     exit ( 1 );
     }
 if ( debug )
     printf ( "SHA256 OK.\n" );
 hashSHA256 ( hmacReset ( &contexth, SHA256,
                          (uint8_t *)test2k,
                          (int)strlen ( test2k ) ),
              1204 );
 hashSHA256 ( hmacInput ( &contexth, (uint8_t *)test2d,
                          (int)strlen ( test2d ) ),
              1205 );
 hashSHA256 ( hmacResult ( &contexth, hash ), 1206 );
 if ( memcmp ( hash, correcth, SHA256HashSize ) ) {
     printf ( "HMAC not working.\n" );
     exit ( 1 );
     }
 if ( debug )
     printf ( "HMAC OK.\n" );
} // end testCrypto


<CODE ENDS>

11. Normative References

[RFC0020]
Cerf, V., "ASCII format for network interchange", STD 80, RFC 20, DOI 10.17487/RFC0020, <https://www.rfc-editor.org/info/rfc20>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC4086]
Eastlake 3rd, D., Schiller, J., and S. Crocker, "Randomness Requirements for Security", BCP 106, RFC 4086, DOI 10.17487/RFC4086, <https://www.rfc-editor.org/info/rfc4086>.
[RFC6234]
Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)", RFC 6234, DOI 10.17487/RFC6234, <https://www.rfc-editor.org/info/rfc6234>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.

12. Informative References

[Lamport]
Lamport, L., "Password Authentication with Insecure Communication", Communications of the ACM 24.11, pages 770-772.
[RFC1321]
Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, DOI 10.17487/RFC1321, <https://www.rfc-editor.org/info/rfc1321>.
[RFC1760]
Haller, N., "The S/KEY One-Time Password System", RFC 1760, DOI 10.17487/RFC1760, <https://www.rfc-editor.org/info/rfc1760>.
[RFC2777]
Eastlake 3rd, D., "Publicly Verifiable Nomcom Random Selection", RFC 2777, DOI 10.17487/RFC2777, <https://www.rfc-editor.org/info/rfc2777>.
[RFC3797]
Eastlake 3rd, D., "Publicly Verifiable Nominations Committee (NomCom) Random Selection", RFC 3797, DOI 10.17487/RFC3797, <https://www.rfc-editor.org/info/rfc3797>.
[RFC5890]
Klensin, J., "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework", RFC 5890, DOI 10.17487/RFC5890, <https://www.rfc-editor.org/info/rfc5890>.
[RFC8713]
Kucherawy, M., Ed., Hinden, R., Ed., and J. Livingood, Ed., "IAB, IESG, IETF Trust, and IETF LLC Selection, Confirmation, and Recall Process: Operation of the IETF Nominating and Recall Committees", BCP 10, RFC 8713, DOI 10.17487/RFC8713, <https://www.rfc-editor.org/info/rfc8713>.
[RFC9389]
Duke, M., "Nominating Committee Eligibility", BCP 10, RFC 9389, DOI 10.17487/RFC9389, <https://www.rfc-editor.org/info/rfc9389>.

Appendix A. History of NomCom Voting Member Selection

For reference purposes, here is a list of the IETF Nominations Committee member selection techniques and chairs so far:

Table 5

Num  YEAR       CHAIR                SELECTION METHOD
---  ---------  -------------------  --------------------------------
1    1993/1994  Jeff Case            Clergy
2    1994/1995  Fred Baker           Clergy
3    1995/1996  Guy Almes            Clergy
4    1996/1997  Geoff Huston         Spouse
5    1997/1998  Mike St.Johns        Algorithm
6    1998/1999  Donald Eastlake 3rd  RFC 2777
7    1999/2000  Avri Doria           RFC 2777
8    2000/2001  Bernard Aboba        RFC 2777
9    2001/2002  Theodore Ts'o        RFC 2777
10   2002/2003  Phil Roberts         RFC 2777
11   2003/2004  Rich Draves          RFC 2777
12   2004/2005  Danny McPherson      RFC 3797
13   2005/2006  Ralph Droms          RFC 3797
14   2006/2007  Andrew Lange         RFC 3797
15   2007/2008  Lakshminath Dondeti  RFC 3797
16   2008/2009  Joel M. Halpern      RFC 3797
17   2009/2010  Mary Barnes          RFC 3797
18   2010/2011  Tom Walsh            RFC 3797
19   2011/2012  Suresh Krishnan      RFC 3797
20   2012/2013  Matt Lepinski        RFC 3797
21   2013/2014  Allison Mankin       RFC 3797
22   2014/2015  Michael Richardson   RFC 3797
23   2015/2016  Harald Alvestrand    RFC 3797
24   2016/2017  Lucy Lynch           RFC 3797
25   2017/2018  Peter Yee            RFC 3797
26   2018/2019  Scott Mansfield      RFC 3797
27   2019/2020  Victor Kuarsingh     RFC 3797
28   2020/2021  Barbara Stark        RFC 3797
29   2021/2022  Gabriel Montenegro   RFC 3797
30   2022/2023  Rich Salz            RFC 3797
31   2023/2024  Martin Thomson       RFC 3797 + hash chain extensions

Clergy = Names were written on pieces of paper, placed in a receptacle, and a member of the clergy picked the NomCom members.

Spouse = Same as Clergy except chair's spouse made the selection.

Algorithm = Algorithmic selection based on similar concepts to those documented in [RFC2777] and [RFC3797].

RFC 2777 = Algorithmic selection using the algorithm and reference code provided in [RFC2777] (but not the fake example sources of randomness).

RFC 3797 = Algorithmic selection using the algorithm and reference code provided in [RFC3797] (but not the fake example sources of randomness).

RFC 3797 + hash chain extensions = As with [RFC3797] but using a hash chain for Extended Selection as generally specified in Section 6.

Appendix B. More Equations and Numbers

You can skip this section unless you want to dig a little bit further into the statistical arguments.

To illustrate the relatively minor effect in practice of less entropy than needed for complete randomization, assume you select N items from a pool of P things and that you do this T times where N << P << T. Obviously, the expected value of the number of times each thing would be selected is


                   N * T
Expected Value = ---------
                     P

Although NomCom selection is done without replacement (since it makes no sense to select the same person more than once), given that N << P we can approximate the selection statistics by assuming selection with replacement. Further approximating the resulting binomial distribution by a Gaussian distribution, the standard deviation of the number of times a thing would be selected is


                                  ___________________
                               2 /     N          N
Standard Deviation of Value =   / T * --- * (1 - ---)
                               V       P          P

Assume the specific case of selecting 10 items from a pool of 200, typical of an IETF NomCom selection near the date of this document. The following table shows, for numbers of item set selections that are various powers of 2, the expected number of times each item would be selected and the standard deviation of that number.

Table 6
Times Set of 10    Base 2      Expected Times      Standard Deviation  SD as a %
Selected           Log(Times)  Each Item Selected  of Times Item       of Expected
                                                   Selected
-----------------  ----------  ------------------  ------------------  -----------
            1,024          10                51.2                22.1        43.2%
        1,048,576          20              52,429                 706        1.35%
    1,073,741,824          30          53,687,091              22,584      0.0421%
1,099,511,627,776          40      54,975,581,389             722,681     0.00131%

Thus, even if more bits are needed for perfect randomness, 40 bits of entropy will ensure that any deviation from completely random selection is insignificant, whether measured as the difference in probability of selection of different pool members, the correlation between the selection of any pair of pool members, or a similar statistic involving a small number of pool members.
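
The two formulas in this appendix can be evaluated directly. The following is a minimal Python sketch, not part of the reference code; the function name selection_stats and its parameter names are hypothetical and chosen only for illustration. It assumes selection with replacement, as the text does, and reads T as the number of set selections.

```python
import math


def selection_stats(n, p, t):
    """Return (expected, std_dev) for one pool member.

    n: items selected per set (N in the text)
    p: pool size (P in the text)
    t: number of set selections (T in the text)
    Assumes selection with replacement, as approximated in the text.
    """
    frac = n / p
    expected = t * frac                         # N * T / P
    std_dev = math.sqrt(t * frac * (1 - frac))  # sqrt(T * N/P * (1 - N/P))
    return expected, std_dev


# Evaluate the formulas for 10 items from a pool of 200 at several powers of 2.
for bits in (10, 20, 30, 40):
    t = 2 ** bits
    expected, sd = selection_stats(10, 200, t)
    print(f"{t:>17,}  {bits:>2}  {expected:>18,.1f}  {sd:>12,.1f}  "
          f"{100 * sd / expected:.3g}%")
```

For N = 10, P = 200, and T = 1,024, this gives an expected value of 51.2, matching Table 6. The standard deviation from the formula as written is about 6.97, whereas the tabulated 22.1 equals the square root of N * T * (N/P) * (1 - N/P). The table may therefore be counting 10,240 individual picks rather than 1,024 set selections; that reading is an inference from the numbers, not something the text states. Under either reading, the standard deviation as a fraction of the expected value shrinks in proportion to 1 over the square root of T, which is the point the table illustrates.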

Appendix C. Changes from RFC 3797

The primary differences between this document and [RFC3797], the previous version, are the following:

Appendix D. Versions Change History

RFC EDITOR NOTE: Please remove this Appendix before publication

D.1. -00 to -01

D.2. -01 to -02

D.3. -02 to -03

D.4. -03 to -04

Acknowledgements

The suggestions and comments on this document from the following persons are gratefully acknowledged: Paul Hoffman and Martin Thomson.

Acknowledgements for RFC 3797: Matt Crawford and Erik Nordmark made major contributions to this document. Comments by Bernard Aboba, Theodore Ts'o, Jim Galvin, Steve Bellovin, and others have been incorporated.

Author's Address

Donald E. Eastlake 3rd
Futurewei Technologies
2386 Panoramic Circle
Apopka, Florida 32703
United States of America