Structured ("Sector") GKR

Source: Tha13, section 5 ("Time-Optimal Protocols for Circuit Evaluation").

Review: Equality MLE

We begin by briefly recalling the MLE (see this section for more details). We first consider the binary string equality function $\text{eq} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$, where

$$\text{eq}(b, c) = \begin{cases} 1 & b = c \\ 0 & \text{otherwise.} \end{cases}$$

This function is $1$ if and only if $b$ and $c$ are equal as binary strings, and $0$ otherwise. We can extend this to a multilinear extension via the following -- consider $\widetilde{\text{eq}} : \mathbb{F}^n \times \mathbb{F}^n \to \mathbb{F}$, where

$$\widetilde{\text{eq}}(x, y) = \prod_{i=1}^{n} \left( x_i y_i + (1 - x_i)(1 - y_i) \right).$$
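As a quick sanity check, the product formula above can be exercised in a few lines of Python over a toy prime field (the modulus here is an illustrative choice, not Remainder's actual field):

```python
import itertools

# Illustrative prime field modulus (not Remainder's actual field)
P = 2**61 - 1

def eq_tilde(x, y):
    """Multilinear extension of binary-string equality:
    eq~(x, y) = prod_i (x_i * y_i + (1 - x_i) * (1 - y_i))."""
    acc = 1
    for xi, yi in zip(x, y):
        acc = acc * (xi * yi + (1 - xi) * (1 - yi)) % P
    return acc

# On the boolean hypercube, eq~ agrees exactly with string equality.
n = 3
for x in itertools.product((0, 1), repeat=n):
    for y in itertools.product((0, 1), repeat=n):
        assert eq_tilde(x, y) == (1 if x == y else 0)
```

Since each factor is multilinear in $x_i$ and $y_i$, the product is the unique multilinear extension of the equality predicate.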

Structured Layerwise Relationship

See Tha13, page 25 ("Theorem 1") for a more rigorous treatment. Note that Remainder does not implement Theorem 1 in its entirety, and that many circuits which do fulfill the criteria of Theorem 1 are currently not expressible within Remainder's circuit frontend.

Structured layerwise relationships can loosely be thought of as data relationships where the bits of the index of the "destination" value in the $i$'th layer are a (optionally subset) permutation of the bits of the index of the "source" value in the $(i+1)$'th layer, for each layer index $i$. As a concrete example, we consider a layerwise relationship where the destination layer is half the size of the source layer, and its values are the element-wise products of the two halves of the source layer's values: Let $\tilde{g}$ represent the MLE of the destination layer, and let $\tilde{f}$ represent the MLE of the source layer.

Let the evaluations of $\tilde{f}$ over the hypercube $\{0,1\}^n$ be $f_0, f_1, \ldots, f_{2^n - 1}$. Then we wish to create a layerwise relationship such that the evaluations of $\tilde{g}$ over the hypercube $\{0,1\}^{n-1}$ are $f_0 \cdot f_{2^{n-1}},\; f_1 \cdot f_{2^{n-1}+1},\; \ldots,\; f_{2^{n-1}-1} \cdot f_{2^n - 1}$. We can actually write this as a simple rule in terms of the (integer) indices of $g$ as follows:

$$g_i = f_i \cdot f_{i + 2^{n-1}} \quad \text{for } i \in \{0, \ldots, 2^{n-1} - 1\}.$$

If we allow for our arguments to be the binary decomposition of $i$ rather than $i$ itself, we might have the following relationship:

$$g_{(b_1, \ldots, b_{n-1})} = f_{(0, b_1, \ldots, b_{n-1})} \cdot f_{(1, b_1, \ldots, b_{n-1})},$$

where $(0, b_1, \ldots, b_{n-1})$ is the binary representation of $i$ and $(1, b_1, \ldots, b_{n-1})$ is the binary representation of $i + 2^{n-1}$. This is in fact very close to the exact form-factor of the polynomial layerwise relationship which we should create between the layers -- we now consider the somewhat un-intuitive relationship

$$\tilde{g}(b_1, \ldots, b_{n-1}) = \sum_{x \in \{0,1\}^{n-1}} \widetilde{\text{eq}}\left((b_1, \ldots, b_{n-1}), x\right) \cdot \tilde{f}(0, x) \cdot \tilde{f}(1, x).$$

One way to read the above relationship is the following: for any (binary decomposition) $b = (b_1, \ldots, b_{n-1})$, the value of the destination layer at the index represented by $b$ should be $\tilde{f}(0, b) \cdot \tilde{f}(1, b)$. We are summing over all possible values of the hypercube above, $x \in \{0,1\}^{n-1}$, and for each value we check whether the current iterated hypercube value $x$ "equals" the argument value $b$. If so, we contribute $\tilde{f}(0, x) \cdot \tilde{f}(1, x)$ to the sum, and if not, we contribute zero to the sum.

In this way we see that for binary $b$, all of the summed values will be zero except for when $x$ is exactly identical to $b$, and thus only the correct value $\tilde{f}(0, b) \cdot \tilde{f}(1, b)$ will contribute to the sum (and thus the value of $\tilde{g}(b)$).

As described, the above relationship looks extremely inefficient in some sense -- why bother summing over all the hypercube values when we already know that all but one of the summands will be zero, because $\widetilde{\text{eq}}(b, x)$ evaluates to zero at every $x$ except $x = b$?

The answer is that it's not enough to only consider $\tilde{g}(b)$ for binary $b$, as our claims will be of the form $\tilde{g}(r) = v$, where $r \in \mathbb{F}^{n-1}$, and $\tilde{g}$ is the multilinear extension of the destination layer's values (see claims section for more information on prover claims). Another way to see this is that the above relationship is able to be shown for each binary $b$ individually, but we want to make sure that the relationship holds for all $2^{n-1}$ indices. Rather than checking each index individually, it's much more efficient to check a "random combination" of all values simultaneously by evaluating at a random point $r \in \mathbb{F}^{n-1}$. We thus have, instead, that

$$\tilde{g}(r) = \sum_{x \in \{0,1\}^{n-1}} \widetilde{\text{eq}}(r, x) \cdot \tilde{f}(0, x) \cdot \tilde{f}(1, x).$$

Since $\widetilde{\text{eq}}$ is identical to $\text{eq}$ (and similarly $\tilde{f}$ to $f$ and $\tilde{g}$ to $g$) everywhere on the hypercube, the above relationship should still hold for all binary $b$ in place of $r$. Moreover, the above relationship is now one which we can directly apply sumcheck to, since we have a summation over the hypercube!
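The identity can be verified numerically. Below is a small Python sketch over a toy prime field (the modulus and helper names `eq_tilde`/`mle_eval` are choices made here, not Remainder's): it builds a random source layer, forms the two-halves product layer, and checks the $\widetilde{\text{eq}}$-summation identity at a random point.

```python
import itertools, random

P = 2**61 - 1  # illustrative prime field modulus

def eq_tilde(x, y):
    """Multilinear extension of binary-string equality."""
    acc = 1
    for xi, yi in zip(x, y):
        acc = acc * (xi * yi + (1 - xi) * (1 - yi)) % P
    return acc

def mle_eval(evals, point):
    """Evaluate the MLE of `evals` (length 2^k, index bits big-endian)
    at `point` by folding out one variable at a time."""
    table = list(evals)
    for r in point:
        half = len(table) // 2
        table = [((1 - r) * table[i] + r * table[i + half]) % P
                 for i in range(half)]
    return table[0]

random.seed(0)
n = 3
f = [random.randrange(P) for _ in range(2**n)]        # source layer
half = 2**(n - 1)
g = [f[i] * f[i + half] % P for i in range(half)]     # products of the two halves

r = [random.randrange(P) for _ in range(n - 1)]       # random point in F^{n-1}
lhs = mle_eval(g, r)
rhs = sum(eq_tilde(r, x)
          * mle_eval(f, [0] + list(x)) % P
          * mle_eval(f, [1] + list(x))
          for x in itertools.product((0, 1), repeat=n - 1)) % P
assert lhs == rhs
```

The right-hand side is exactly the multilinear extension of $b \mapsto f_{(0,b)} \cdot f_{(1,b)}$, so agreement at a random point is expected.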

But wait, you might say. This still seems wasteful -- why are we bothering with this summation and the $\widetilde{\text{eq}}$ polynomial? Why can't we just have something like

$$\tilde{g}(r) = \tilde{f}(0, r) \cdot \tilde{f}(1, r)?$$

Unfortunately the above relationship cannot work, as the variables $r_1, \ldots, r_{n-1}$ appear linearly on the LHS but quadratically on the RHS. The purpose of the summation and the $\widetilde{\text{eq}}$ polynomial is to "linearize" the RHS and quite literally turn any high degree polynomial (such as $\tilde{f}(0, r) \cdot \tilde{f}(1, r)$) into its unique multilinear extension (recall the definition of multilinear extension).
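A tiny worked example makes the degree mismatch concrete. With source evaluations $(1, 2, 3, 4)$ the destination MLE is $3 + 5r$, while the naive product is $(1 + r)(3 + r)$; the two agree on $\{0, 1\}$ but diverge at $r = 2$ (illustrative Python, toy modulus):

```python
P = 2**61 - 1  # illustrative prime field modulus

def mle_eval(evals, point):
    """Evaluate the MLE of `evals` at `point` by folding variables."""
    table = list(evals)
    for r in point:
        half = len(table) // 2
        table = [((1 - r) * table[i] + r * table[i + half]) % P
                 for i in range(half)]
    return table[0]

f = [1, 2, 3, 4]                          # source layer evaluations
g = [f[0] * f[2] % P, f[1] * f[3] % P]    # destination layer: [3, 8]

# The naive identity does hold on the hypercube...
for b in (0, 1):
    assert mle_eval(g, [b]) == mle_eval(f, [0, b]) * mle_eval(f, [1, b]) % P

# ...but fails off it: the LHS is linear in r, the RHS quadratic.
r = 2
lhs = mle_eval(g, [r])                               # 3 + 5r = 13
rhs = mle_eval(f, [0, r]) * mle_eval(f, [1, r]) % P  # (1 + r)(3 + r) = 15
assert lhs == 13 and rhs == 15 and lhs != rhs
```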

We see that the general pattern of creating a "structured" layerwise relationship is as follows:

  • First, write the relationship in terms of the binary indices of values between each layer. In our case, $g_{(b_1, \ldots, b_{n-1})} = f_{(0, b_1, \ldots, b_{n-1})} \cdot f_{(1, b_1, \ldots, b_{n-1})}$.
  • Next, replace $(b_1, \ldots, b_{n-1})$ on the LHS of the equation with formal variables $(r_1, \ldots, r_{n-1})$, and allow the LHS to be a multilinear extension. We now have $\tilde{g}(r_1, \ldots, r_{n-1})$ on the LHS.
  • Next, replace $(b_1, \ldots, b_{n-1})$ on the RHS of the equation with boolean values $x = (x_1, \ldots, x_{n-1})$ and add an $\widetilde{\text{eq}}$ predicate between $r$ and $x$, and add a summation over all values $x \in \{0,1\}^{n-1}$. Additionally, extend all $f$'s to their multilinear extensions (this is importantly only for sumcheck):

$$\tilde{g}(r) = \sum_{x \in \{0,1\}^{n-1}} \widetilde{\text{eq}}(r, x) \cdot \tilde{f}(0, x) \cdot \tilde{f}(1, x).$$

Structured "Selector" Variables

Some relationships between layers are best expressed piece-wise. For example, let's say that we have a destination layer, $\tilde{g}$, and a source layer of the same size, $\tilde{f}$, each with four evaluations, where we'd like to square the first two evaluations but double the last two.

In other words, if $\tilde{f}$ has evaluations $(f_0, f_1, f_2, f_3)$ over the boolean hypercube, then $\tilde{g}$ should have evaluations $(f_0^2, f_1^2, 2f_2, 2f_3)$. If we follow our usual protocol for writing the layerwise relationship here, we would have something like the following for the "integer index" version of the relationship:

$$g_i = \begin{cases} f_i^2 & i \in \{0, 1\} \\ 2 f_i & i \in \{2, 3\}. \end{cases}$$

We notice that in binary form, with $i = (b_1, b_2)$, we have $i \in \{0, 1\}$ whenever $b_1 = 0$ (and $i \in \{2, 3\}$ when $b_1 = 1$). We can thus re-write the above as

$$g_{(b_1, b_2)} = (1 - b_1) \cdot f_{(b_1, b_2)}^2 + b_1 \cdot 2 f_{(b_1, b_2)}.$$

In other words, when $b_1 = 0$ the second summand on the RHS is zero, and the first summand is just $f_{(0, b_2)}^2$ since we already know that $b_1 = 0$, and vice versa for when $b_1 = 1$. At a first glance, this may look similar to the earlier example in which we took the products of paired values from the two halves of a layer, but the two are not the same --

  • First, the current setup is semantically quite different; in the previous example we applied a binary "shrinking" transformation between the source and destination layers by multiplying pairs of values, while in the current example we are "splitting" the circuit into two semantic halves and applying a different unary operation element-wise to each half.
  • Second, the current setup actually has its $b_1$ variable appearing outside of the argument to an MLE representing data in a previous layer. This is precisely what allows us to "select" between the two semantic halves of the circuit and compute an element-wise squaring in the first and a doubling in the second.

Applying the third transformation rule from above and extending everything into its multilinear form, we get

$$\tilde{g}(r_1, r_2) = \sum_{(x_1, x_2) \in \{0,1\}^2} \widetilde{\text{eq}}\left((r_1, r_2), (x_1, x_2)\right) \cdot \left( (1 - x_1) \cdot \tilde{f}(x_1, x_2)^2 + x_1 \cdot 2 \tilde{f}(x_1, x_2) \right).$$

However, now the observation that $\widetilde{\text{eq}}$ should only apply to variables which are nonlinear on the RHS is helpful here -- notice that although $x_2$ as a variable would be quadratic on the RHS (through the squared $\tilde{f}$ term), $x_1$ is linear once we substitute the forced values $\tilde{f}(0, x_2)$ and $\tilde{f}(1, x_2)$ into the two branches, and can thus be removed from the summation altogether and replaced directly with $r_1$:

$$\tilde{g}(r_1, r_2) = (1 - r_1) \sum_{x_2 \in \{0,1\}} \widetilde{\text{eq}}(r_2, x_2) \cdot \tilde{f}(0, x_2)^2 + r_1 \sum_{x_2 \in \{0,1\}} \widetilde{\text{eq}}(r_2, x_2) \cdot 2 \tilde{f}(1, x_2).$$

This layerwise relationship form-factor is called a "selector" in Remainder terminology and in general refers to an in-circuit version of an "if/else" statement where MLEs representing the values of layers can be broken into power-of-two-sized pieces.
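The selector form can be sanity-checked numerically. The sketch below (illustrative Python over a toy prime field; the modulus and helper names are choices made here, not Remainder's) verifies that the MLE of $(f_0^2, f_1^2, 2f_2, 2f_3)$ matches the selector expression, with the selector variable kept outside the sum, at a random point:

```python
import random

P = 2**61 - 1  # illustrative prime field modulus

def eq_tilde(x, y):
    """Multilinear extension of binary-string equality."""
    acc = 1
    for xi, yi in zip(x, y):
        acc = acc * (xi * yi + (1 - xi) * (1 - yi)) % P
    return acc

def mle_eval(evals, point):
    """Evaluate the MLE of `evals` at `point` by folding variables."""
    table = list(evals)
    for r in point:
        half = len(table) // 2
        table = [((1 - r) * table[i] + r * table[i + half]) % P
                 for i in range(half)]
    return table[0]

random.seed(1)
f = [random.randrange(P) for _ in range(4)]   # source layer (f0, f1, f2, f3)
g = [f[0] * f[0] % P, f[1] * f[1] % P,        # square the first half...
     2 * f[2] % P, 2 * f[3] % P]              # ...double the second half

r1, r2 = random.randrange(P), random.randrange(P)

# Selector form: r1 stays outside the sums; only x2 is linearized via eq~.
sq = sum(eq_tilde([r2], [x2]) * pow(mle_eval(f, [0, x2]), 2, P)
         for x2 in (0, 1)) % P
db = sum(eq_tilde([r2], [x2]) * 2 * mle_eval(f, [1, x2])
         for x2 in (0, 1)) % P
assert mle_eval(g, [r1, r2]) == ((1 - r1) * sq + r1 * db) % P
```

Both sides are multilinear in $(r_1, r_2)$ and agree on the hypercube, so they agree everywhere.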

Why Structured Circuits?

Compared to canonical GKR, structured circuits are good for two reasons -- verifier runtime and circuit description size. Note that every layerwise relationship which can be expressed as a structured layer (using $\widetilde{\text{eq}}$) can also be written as an equivalent series of $\widetilde{\text{add}}$ and $\widetilde{\text{mult}}$ gate layers.

To see why structured layers are good for circuit description size, we compare the following layer-wise relationships which describe the same circuit wiring pattern, but with different verifier cost and circuit description complexities (let $\tilde{g}$ be the destination layer's MLE over $n - 1$ variables and $\tilde{f}$ the source layer's MLE over $n$ variables, as before). Firstly, the structured version, which computes an element-wise product between paired evaluations from the two halves of the source layer:

$$\tilde{g}(r) = \sum_{x \in \{0,1\}^{n-1}} \widetilde{\text{eq}}(r, x) \cdot \tilde{f}(0, x) \cdot \tilde{f}(1, x).$$

And secondly, the multiplication gate version:

$$\tilde{g}(r) = \sum_{x, y \in \{0,1\}^n} \widetilde{\text{mult}}(r, x, y) \cdot \tilde{f}(x) \cdot \tilde{f}(y),$$

where $\widetilde{\text{mult}}$ is the multilinear extension of the wiring predicate $\text{mult}(z, x, y)$, which is $1$ exactly when destination gate $z$ computes the product of source values $x$ and $y$ (here, when $x = (0, z)$ and $y = (1, z)$).

We first consider the verifier runtime for both. Note that the sumcheck verifier performs $O(d)$ field operations per round of sumcheck for degree-$d$ round polynomials, plus the work necessary for the oracle query. In the structured relationship's case, there are $n - 1$ rounds of sumcheck, and assuming that $x_1, \ldots, x_{n-1}$ get bound to $r'_1, \ldots, r'_{n-1}$, the oracle query which the verifier must evaluate is of the form

$$s_{n-1}(r'_{n-1}) \stackrel{?}{=} \widetilde{\text{eq}}(r, r') \cdot \tilde{f}(0, r') \cdot \tilde{f}(1, r'),$$

where $s_i$ is the $i$'th univariate polynomial which the prover sends during sumcheck. The prover sends the claimed values for both $\tilde{f}(0, r')$ and $\tilde{f}(1, r')$, and so the verifier doesn't do any work there. The verifier additionally evaluates $\widetilde{\text{eq}}(r, r')$ on its own, which it can do in $O(n)$ time.

Next, we consider the verifier runtime for the multiplication gate case: let $x$ be bound to $r' \in \mathbb{F}^n$ and let $y$ be bound to $r'' \in \mathbb{F}^n$ during sumcheck. The oracle query is then

$$s_{2n}(r''_n) \stackrel{?}{=} \widetilde{\text{mult}}(r, r', r'') \cdot \tilde{f}(r') \cdot \tilde{f}(r'').$$

Similarly to the structured case, the prover sends claimed values for $\tilde{f}(r')$ and $\tilde{f}(r'')$, and so the verifier doesn't have to do any work here. However, the verifier must also evaluate $\widetilde{\text{mult}}(r, r', r'')$ on its own. This requires time linear in the sparsity of the polynomial, i.e. $O(2^{n-1})$ in this example, since there are $2^{n-1}$ nonzero multiplication gates (one for each pair of values in the source layer).

A circuit description size comparison between the two can be seen in a very similar light. In particular, the representation of $\widetilde{\text{eq}}$ requires just $O(n)$ words to store (assuming each word can hold an $n$-bit value), as we simply enumerate the index correspondence between the $x$'s and the $r$'s. On the other hand, storing the sparse representation of $\widetilde{\text{mult}}$ required for linear-time proving requires storing all nonzero evaluations, i.e. $2^{n-1}$ such indices in the above example (although one might argue that the $\widetilde{\text{mult}}$ representation here is quite structured and can therefore be further compressed).
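As a concrete check on the sparsity claim, one can enumerate the boolean evaluations of the wiring predicate for the two-halves product layer (a toy Python count; the predicate encoding below is an assumption of this sketch, not Remainder's representation):

```python
import itertools

n = 3
# Wiring predicate for the "two halves" product layer: destination gate z
# (n-1 bits) multiplies source wires x = (0, z) and y = (1, z).
def mult(z, x, y):
    return 1 if x == (0,) + z and y == (1,) + z else 0

nonzero = sum(mult(z, x, y)
              for z in itertools.product((0, 1), repeat=n - 1)
              for x in itertools.product((0, 1), repeat=n)
              for y in itertools.product((0, 1), repeat=n))
# The sparse representation must store one entry per nonzero gate.
assert nonzero == 2**(n - 1)
```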

A note on claims

The above example (and other similar layerwise relationships) give us a further prover speedup through the structure of the claims which arise from the oracle query during sumcheck. In particular, the claims which the prover makes to the verifier in the structured case are as follows:

$$\tilde{f}(0, r'_1, \ldots, r'_{n-1}), \quad \tilde{f}(1, r'_1, \ldots, r'_{n-1}).$$

These claims can be aggregated with almost no additional cost to the prover and verifier, as the last $n - 1$ challenges are identical between the two and the first challenge is precisely a $0$ and a $1$. In particular, the verifier can simply sample $r^* \in \mathbb{F}$ and have the prover instead show that

$$\tilde{f}(r^*, r'_1, \ldots, r'_{n-1}) = (1 - r^*) \cdot \tilde{f}(0, r') + r^* \cdot \tilde{f}(1, r').$$

On the other hand, the claims generated by the multiplication gate version of the layer above are of the form

$$\tilde{f}(r'_1, \ldots, r'_n), \quad \tilde{f}(r''_1, \ldots, r''_n).$$

These claims have no challenges in common, and can only be aggregated through interpolative or RLC (random linear combination) claim aggregation, both of which are significantly more expensive than the above method.
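The cheap aggregation step can be checked numerically. The sketch below (illustrative Python over a toy modulus; names like `mle_eval` and `r_star` are choices made here) verifies that two claims sharing all but their first coordinate lie on a line in that coordinate, by multilinearity:

```python
import random

P = 2**61 - 1  # illustrative prime field modulus

def mle_eval(evals, point):
    """Evaluate the MLE of `evals` at `point` by folding variables."""
    table = list(evals)
    for r in point:
        half = len(table) // 2
        table = [((1 - r) * table[i] + r * table[i + half]) % P
                 for i in range(half)]
    return table[0]

random.seed(2)
n = 4
f = [random.randrange(P) for _ in range(2**n)]
r_prime = [random.randrange(P) for _ in range(n - 1)]

c0 = mle_eval(f, [0] + r_prime)   # claim f~(0, r')
c1 = mle_eval(f, [1] + r_prime)   # claim f~(1, r')

# f~ is multilinear in its first variable, so a single fresh challenge
# r* aggregates the two claims into one claim at (r*, r').
r_star = random.randrange(P)
agg = ((1 - r_star) * c0 + r_star * c1) % P
assert mle_eval(f, [r_star] + r_prime) == agg
```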

Costs

In general (note that this does not capture all possible structured layer cases, but should be enough to give some intuition on the prover/verifier/proof size costs for most structured layers), let us assume that our layerwise relationship can be expressed in the following manner:

$$\tilde{g}(r) = \sum_{x \in \{0,1\}^{n}} \widetilde{\text{eq}}(r, x) \cdot \sum_{j=1}^{M} \prod_{k=1}^{d} \tilde{f}_{j,k}(x).$$

In other words, we have a layerwise relationship over $n$ variables, where the values in $\tilde{g}$ are a function of those in source layers represented by the MLEs $\tilde{f}_{j,k}$, such that we have $M$ total summand "groups" of products of MLEs over $n$ variables, each of size up to $d$, i.e. the total polynomial degree in each of the $x_i$ is at most $d + 1$ (including the $\widetilde{\text{eq}}$ factor).

The prover costs are as follows:

  • For simplicity here, we assume that the prover has access to any value in $\{\widetilde{\text{eq}}(r, x)\}_{x \in \{0,1\}^n}$ (note that these evaluations must be either precomputed in linear time, e.g. Tha13, or can be streamed in linear time in a clever way, e.g. Rot24).

  • Additionally, we assume that the prover has access to any value in any $\{\tilde{f}_{j,k}(x)\}_{x \in \{0,1\}^n}$, since the prover presumably knows all of the circuit values ahead of time.

  • Finally, we assume that the prover can use the "bookkeeping table folding" trick from Tha13 to compute, after binding $x_i$ to $r'_i$, both $\{\widetilde{\text{eq}}(r, (r'_1, \ldots, r'_i, x))\}_{x \in \{0,1\}^{n-i}}$ from $\{\widetilde{\text{eq}}(r, (r'_1, \ldots, r'_{i-1}, x))\}_{x \in \{0,1\}^{n-i+1}}$ and $\{\tilde{f}_{j,k}(r'_1, \ldots, r'_i, x)\}_{x \in \{0,1\}^{n-i}}$ from $\{\tilde{f}_{j,k}(r'_1, \ldots, r'_{i-1}, x)\}_{x \in \{0,1\}^{n-i+1}}$ in $O(2^{n-i})$ field operations.

  • We thus see that in the first round of sumcheck, the prover must do the following:

    • Evaluate each of the $M$ product terms in the summation for every $x \in \{0,1\}^n$ and sum them together. Each evaluation of the inner sum at a single $x$ takes $O(Md)$ field operations, as there are $M$ summands and each requires up to $d$ multiplications.
    • Since there are $2^n$ values of $x$, the above costs $O(2^n \cdot Md)$ field operations. This is the total cost for the prover to compute the claimed sum for the first round.
    • Next, the prover must evaluate the RHS of the sumchecked equation at $t \in \{2, \ldots, d + 1\}$ in the place of $x_1$ to compute the univariate sumcheck message. Each evaluation of $s_1(t)$, similarly to the above, costs $O(2^{n-1} \cdot Md)$ field operations.
    • The total cost for the prover to compute the claimed sum + univariate message in the first round is thus $O(2^n \cdot Md) + O(d \cdot 2^{n-1} \cdot Md) = O(2^n \cdot Md^2)$ field operations.
  • We can generalize the above to the $i$'th round of sumcheck (let $i = 1$ for the first round) by noting that rather than $n$ free variables, there are $n - i$ variables which are being summed over in the outer sum. Since the other values (those under the already-bound challenges $r'_1, \ldots, r'_{i-1}$) remain constant, the prover's total cost in round $i$ is simply $O(2^{n-i} \cdot Md^2)$.

  • Finally, as mentioned earlier, the prover can generate the necessary precomputed values for the $(i+1)$'th round from those of the $i$'th round in $O(Md \cdot 2^{n-i})$ field operations, as there are $Md + 1$ MLEs (including the $\widetilde{\text{eq}}$ polynomial) which need their bookkeeping tables to be "folded" in $O(2^{n-i})$ each.

  • Putting it altogether, the prover's total cost across all sumcheck rounds is thus

$$\sum_{i=1}^{n} O\left( 2^{n-i} \cdot M d^2 \right).$$

  • Since the above is a geometric series in $i$, we have that the prover's total cost across all rounds is simply $O(2^n \cdot M d^2)$ field operations.

The proof size is as follows:

  • For each of the $n$ rounds of sumcheck, the prover must send over a degree-$(d+1)$ univariate polynomial (i.e. $d + 2$ field elements) to the verifier. Additionally, the prover must send the original sum (although this is actually free in GKR since the verifier already has the prover-claimed sum implicitly through the prover's claim from a previous sumcheck's oracle query).
  • Finally, the prover must send over each of its claimed values for the $\tilde{f}_{j,k}(r')$ at the end of sumcheck. There are at most $Md$ such claims.
  • The proof size is thus simply $O(nd + Md)$ field elements.

The verifier runtime is as follows:

  • For each round $i$ of sumcheck, let the prover's univariate polynomial message be $s_i$. The verifier samples a random challenge $r'_i$ and checks whether

$$s_{i-1}(r'_{i-1}) \stackrel{?}{=} s_i(0) + s_i(1),$$

where $s_0(r'_0)$ is taken to be the original claimed sum.
  • Since the verifier can evaluate $s_i(0)$, $s_i(1)$, and $s_i(r'_i)$ in $O(d)$ and performs this check for each of the $n$ rounds of sumcheck, they can compute all the intermediate checks in $O(nd)$.
  • During the final oracle query, the verifier must check whether

$$s_n(r'_n) \stackrel{?}{=} \widetilde{\text{eq}}(r, r') \cdot \sum_{j=1}^{M} \prod_{k=1}^{d} \tilde{f}_{j,k}(r').$$

  • The verifier can compute $\widetilde{\text{eq}}(r, r')$ on its own in $O(n)$, and has access to each of the prover-claimed values for the $\tilde{f}_{j,k}(r')$. It can thus compute the RHS of the above in $O(Md)$.
  • The verifier's total runtime is thus $O(nd + Md)$.
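To make the round-by-round checks concrete, here is a toy end-to-end sumcheck transcript for the two-halves product layer. This is an illustrative Python sketch, not Remainder's implementation: the field modulus, the helper names, and the naive (non-linear-time, no table folding) prover are all choices made here for clarity.

```python
import itertools, random

P = 2**61 - 1  # illustrative prime field modulus

def eq_tilde(x, y):
    """Multilinear extension of binary-string equality."""
    acc = 1
    for xi, yi in zip(x, y):
        acc = acc * (xi * yi + (1 - xi) * (1 - yi)) % P
    return acc

def mle_eval(evals, point):
    """Evaluate the MLE of `evals` at `point` by folding variables."""
    table = list(evals)
    for r in point:
        half = len(table) // 2
        table = [((1 - r) * table[i] + r * table[i + half]) % P
                 for i in range(half)]
    return table[0]

def interp_eval(ys, t):
    """Lagrange-evaluate, at t, the polynomial taking values ys at 0..len(ys)-1."""
    total = 0
    for j, yj in enumerate(ys):
        num, den = 1, 1
        for k in range(len(ys)):
            if k != j:
                num = num * (t - k) % P
                den = den * (j - k) % P
        total = (total + yj * num * pow(den, P - 2, P)) % P
    return total

random.seed(3)
n = 3
m = n - 1                                        # number of sumcheck variables
f = [random.randrange(P) for _ in range(2**n)]
g = [f[i] * f[i + 2**m] % P for i in range(2**m)]
r = [random.randrange(P) for _ in range(m)]

def q(pt):  # the summand eq~(r, x) * f~(0, x) * f~(1, x); degree <= 3 per variable
    return eq_tilde(r, pt) * mle_eval(f, [0] + pt) % P * mle_eval(f, [1] + pt) % P

claim = mle_eval(g, r)                           # prover's claimed sum g~(r)
rs = []
for i in range(m):
    # prover: round polynomial s_i, sent as 4 evaluations (degree <= 3)
    s = [sum(q(rs + [t] + list(x))
             for x in itertools.product((0, 1), repeat=m - 1 - i)) % P
         for t in range(4)]
    # verifier: round check, then bind a fresh challenge
    assert (s[0] + s[1]) % P == claim
    ri = random.randrange(P)
    claim = interp_eval(s, ri)
    rs.append(ri)

# final oracle query: eq~ evaluated by the verifier, f~ values claimed by the prover
assert claim == q(rs)
```

Note how the final check mirrors the oracle query from the verifier-runtime discussion: the verifier computes the $\widetilde{\text{eq}}$ factor itself and plugs in the prover's two claimed $\tilde{f}$ values.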