Rngs allows the creation of RngStream objects, which make it easy to generate new,
unique random keys on demand. An RngStream is a wrapper around a JAX random key and a
counter. Every time a key is requested, the counter is incremented and the key is
generated from the seed key and the counter using jax.random.fold_in.
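The key-plus-counter mechanism can be sketched with plain JAX. This is only an illustration of the idea, not the RngStream implementation; the names seed_key, count, and next_key are hypothetical.

```python
import jax
import numpy as np

# Hypothetical stand-in for a stream: a seed key plus a Python counter.
seed_key = jax.random.key(0)
count = 0

def next_key():
    """Derive a fresh key from the seed key and the current counter."""
    global count
    key = jax.random.fold_in(seed_key, count)
    count += 1
    return key

key_a = next_key()
key_b = next_key()

# Compare the raw key data: successive keys differ because the counter differs.
keys_differ = not np.array_equal(
    jax.random.key_data(key_a), jax.random.key_data(key_b)
)
```

Because each key depends only on the seed key and the counter value, the sequence of keys is reproducible for a given seed.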
To create an Rngs, pass an integer or a jax.random.key to the
constructor as a keyword argument named after the stream. The key is used as the
starting seed for the stream, and the counter is initialized to zero. Then call the
stream to get a key.
Trying to generate a key for a stream that was not specified during construction
will result in an error being raised:
>>> rngs = nnx.Rngs(params=0, dropout=1)
>>> try:
...   key = rngs.unkown_stream()
... except AttributeError as e:
...   print(e)
No RngStream named 'unkown_stream' found in Rngs.
The default stream can be created by passing a key to the constructor without
specifying a stream name. When the default stream is set, the rngs object can be
called directly to get a key, and calling streams that were not specified during
construction will fall back to the default stream.
Sample Bernoulli random values with given shape and mean.
The values are distributed according to the probability mass function:
\[f(k; p) = p^k(1 - p)^{1 - k}\]
where \(k \in \{0, 1\}\) and \(0 \le p \le 1\).
Parameters:
key – a PRNG key used as the random key.
p – optional, a float or array of floats for the mean of the random
variables. Must be broadcast-compatible with shape. Default 0.5.
shape – optional, a tuple of nonnegative integers representing the result
shape. Must be broadcast-compatible with p.shape. The default (None)
produces a result shape equal to p.shape.
mode – optional, "high" or "low" for how many bits to use when sampling.
Default "low". Set to "high" for correct sampling at small values of
p. When sampling in float32, bernoulli samples with mode="low" produce
incorrect results for p < ~1E-7. mode="high" approximately doubles the
cost of sampling.
Returns:
A random array with boolean dtype and shape given by shape if shape
is not None, or else p.shape.
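As a quick illustration of the parameters above, the sketch below draws Bernoulli samples with jax.random.bernoulli and checks that the fraction of True values is close to p. The seed, p, and sample count are arbitrary choices for the example.

```python
import jax
import jax.numpy as jnp

key = jax.random.key(0)
p = 0.3

# Draw 10,000 boolean samples; shape is given explicitly because p is a scalar.
samples = jax.random.bernoulli(key, p=p, shape=(10_000,))

# The fraction of True values should be close to p.
empirical_mean = float(jnp.mean(samples))
```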
Sample Beta random values with given shape and float dtype.
The values are distributed according to the probability density function:
\[f(x;a,b) \propto x^{a - 1}(1 - x)^{b - 1}\]
on the domain \(0 \le x \le 1\).
Parameters:
key – a PRNG key used as the random key.
a – a float or array of floats broadcast-compatible with shape
representing the first parameter “alpha”.
b – a float or array of floats broadcast-compatible with shape
representing the second parameter “beta”.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with a and b. The default
(None) produces a result shape by broadcasting a and b.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and shape given by shape if
shape is not None, or else by broadcasting a and b.
Sample Binomial random values with given shape and float dtype.
The values are returned according to the probability mass function:
\[f(k;n,p) = \binom{n}{k}p^k(1-p)^{n-k}\]
on the domain \(0 < p < 1\), and where \(n\) is a nonnegative integer
representing the number of trials and \(p\) is a float representing the
probability of success of an individual trial.
Parameters:
key – a PRNG key used as the random key.
n – a float or array of floats broadcast-compatible with shape
representing the number of trials.
p – a float or array of floats broadcast-compatible with shape
representing the probability of success of an individual trial.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with n and p.
The default (None) produces a result shape equal to np.broadcast(n,p).shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by
np.broadcast(n,p).shape.
Sample random values from categorical distributions.
Sampling with replacement uses the Gumbel max trick. Sampling without replacement uses
the Gumbel top-k trick. See [1] for reference.
Parameters:
key – a PRNG key used as the random key.
logits – Unnormalized log probabilities of the categorical distribution(s) to sample from,
so that softmax(logits, axis) gives the corresponding probabilities.
axis – Axis along which logits belong to the same categorical distribution.
shape – Optional, a tuple of nonnegative integers representing the result shape.
Must be broadcast-compatible with np.delete(logits.shape,axis).
The default (None) produces a result shape equal to np.delete(logits.shape,axis).
replace – If True (default), perform sampling with replacement. If False, perform
sampling without replacement.
mode – optional, "high" or "low" for how many bits to use in the gumbel sampler.
The default is determined by the use_high_dynamic_range_gumbel config,
which defaults to "low". With mode="low", in float32 sampling will be biased
for events with probability less than about 1E-7; with mode="high" this limit
is pushed down to about 1E-14. mode="high" approximately doubles the cost of
sampling.
Returns:
A random array with int dtype and shape given by shape if shape
is not None, or else np.delete(logits.shape,axis).
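The Gumbel max trick used for sampling with replacement can be sketched as follows. This illustrates the technique only and is not the library's implementation; the probabilities, sample count, and variable names are assumptions chosen for the example.

```python
import jax
import jax.numpy as jnp

key = jax.random.key(0)
probs = jnp.array([0.1, 0.2, 0.7])
logits = jnp.log(probs)  # any unnormalized log probabilities would do

num_samples = 20_000
# One independent Gumbel noise vector per sample.
noise = jax.random.gumbel(key, shape=(num_samples, logits.shape[0]))

# Gumbel max trick: the index of the largest perturbed logit is a categorical draw.
samples = jnp.argmax(logits + noise, axis=-1)

# Empirical frequencies should approximate softmax(logits).
freqs = jnp.bincount(samples, length=3) / num_samples
```

In practice jax.random.categorical(key, logits, shape=(num_samples,)) would be the direct way to draw such samples.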
Sample Chisquare random values with given shape and float dtype.
The values are distributed according to the probability density function:
\[f(x; \nu) \propto x^{\nu/2 - 1}e^{-x/2}\]
on the domain \(0 < x < \infty\), where \(\nu > 0\) represents the
degrees of freedom, given by the parameter df.
Parameters:
key – a PRNG key used as the random key.
df – a float or array of floats broadcast-compatible with shape
representing the parameter of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with df. The default (None)
produces a result shape equal to df.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by df.shape.
If p has fewer non-zero elements than the requested number of samples,
as specified in shape, and replace=False, the output of this
function is ill-defined. Please make sure to use appropriate inputs.
Parameters:
key – a PRNG key used as the random key.
a – array or int. If an ndarray, a random sample is generated from
its elements. If an int, the random sample is generated as if a were
arange(a).
shape – tuple of ints, optional. Output shape. If the given shape is,
e.g., (m,n), then m*n samples are drawn. Default is (),
in which case a single value is returned.
replace – boolean. Whether the sample is with or without replacement.
Default is True.
p – 1-D array-like, optional. The probabilities associated with each entry in a.
If not given, the sample assumes a uniform distribution over all
entries in a.
axis – int, optional. The axis along which the selection is performed.
The default, 0, selects by row.
mode – optional, "high" or "low" for how many bits to use in the gumbel sampler
when p is None and replace=False. The default is determined by the
use_high_dynamic_range_gumbel config, which defaults to "low". With mode="low",
in float32 sampling will be biased for choices with probability less than about
1E-7; with mode="high" this limit is pushed down to about 1E-14. mode="high"
approximately doubles the cost of sampling.
Returns:
An array of shape shape containing samples from a.
Where \(k\) is the dimension, and \(\{x_i\}\) satisfies
\[\sum_{i=1}^k x_i = 1\]
and \(0 \le x_i \le 1\) for all \(x_i\).
Parameters:
key – a PRNG key used as the random key.
alpha – an array of shape (...,n) used as the concentration
parameter of the random variables.
shape – optional, a tuple of nonnegative integers specifying the result
batch shape; that is, the prefix of the result shape excluding the last
element of value n. Must be broadcast-compatible with
alpha.shape[:-1]. The default (None) produces a result shape equal to
alpha.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and shape given by
shape+(alpha.shape[-1],) if shape is not None, or else
alpha.shape.
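The relationship between the batch shape and the result shape, and the sum-to-one constraint, can be checked with a short sketch using jax.random.dirichlet. The concentration values and batch shape are arbitrary.

```python
import jax
import jax.numpy as jnp

key = jax.random.key(0)
alpha = jnp.array([1.0, 2.0, 3.0])  # concentration, shape (n,) with n = 3

# shape is the batch shape; the last axis of size n is appended to it.
samples = jax.random.dirichlet(key, alpha, shape=(4,))

# Each sample lies on the simplex, so its entries sum to one.
row_sums = samples.sum(axis=-1)
```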
on the domain \(0 < x < \infty\). Here \(\nu_1\) is the degrees of
freedom of the numerator (dfnum), and \(\nu_2\) is the degrees of
freedom of the denominator (dfden).
Parameters:
key – a PRNG key used as the random key.
dfnum – a float or array of floats broadcast-compatible with shape
representing the numerator’s df of the distribution.
dfden – a float or array of floats broadcast-compatible with shape
representing the denominator’s df of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with dfnum and dfden.
The default (None) produces a result shape by broadcasting dfnum.shape
and dfden.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by broadcasting dfnum.shape and dfden.shape.
Sample Gamma random values with given shape and float dtype.
The values are distributed according to the probability density function:
\[f(x;a) \propto x^{a - 1} e^{-x}\]
on the domain \(0 \le x < \infty\), with \(a > 0\).
This is the standard gamma density, with a unit scale/rate parameter.
Dividing the sample output by the rate is equivalent to sampling from
gamma(a, rate), and multiplying the sample output by the scale is equivalent
to sampling from gamma(a, scale).
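The rate conversion can be sketched as follows: standard gamma samples are divided by a rate, and the sample mean is compared with the theoretical mean a / rate. The values of a and rate and the sample count are illustrative.

```python
import jax
import jax.numpy as jnp

key = jax.random.key(0)
a, rate = 2.0, 4.0

# Standard gamma samples (unit scale/rate).
standard = jax.random.gamma(key, a, shape=(10_000,))

# Dividing by the rate gives samples with shape a and rate 4 (scale 0.25).
scaled = standard / rate

# The theoretical mean is a / rate = 0.5.
sample_mean = float(jnp.mean(scaled))
```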
Parameters:
key – a PRNG key used as the random key.
a – a float or array of floats broadcast-compatible with shape
representing the parameter of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with a. The default (None)
produces a result shape equal to a.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by a.shape.
See also
loggamma – sample gamma values in log-space, which can provide improved
accuracy for small values of a.
Sample Geometric random values with given shape and int dtype.
The values are returned according to the probability mass function:
\[f(k;p) = p(1-p)^{k-1}\]
on the domain \(0 < p < 1\).
Parameters:
key – a PRNG key used as the random key.
p – a float or array of floats broadcast-compatible with shape
representing the probability of success of an individual trial.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with p. The default
(None) produces a result shape equal to np.shape(p).
dtype – optional, an int dtype for the returned values (default int64 if
jax_enable_x64 is true, otherwise int32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by p.shape.
Sample Gumbel random values with given shape and float dtype.
The values are distributed according to the probability density function:
\[f(x) = e^{-(x + e^{-x})}\]
Parameters:
key – a PRNG key used as the random key.
shape – optional, a tuple of nonnegative integers representing the result
shape. Default ().
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
mode – optional, "high" or "low" for how many bits to use when sampling.
The default is determined by the use_high_dynamic_range_gumbel config,
which defaults to "low". When drawing float32 samples, with mode="low" the
uniform resolution is such that the largest possible gumbel logit is ~16;
with mode="high" this is increased to ~32, at approximately double the
computational cost.
Returns:
A random array with the specified shape and dtype.
The benefit of log-gamma is that for samples very close to zero (which occur frequently
when a << 1) sampling in log space provides better precision.
Parameters:
key – a PRNG key used as the random key.
a – a float or array of floats broadcast-compatible with shape
representing the parameter of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with a. The default (None)
produces a result shape equal to a.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by a.shape.
sigma – a float or array of floats broadcast-compatible with shape representing
the standard deviation of the underlying normal distribution. Default 1.
shape – optional, a tuple of nonnegative integers specifying the result
shape. The default (None) produces a result shape equal to ().
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape.
n – number of trials. Should have shape broadcastable to p.shape[:-1].
p – probability of each outcome, with outcomes along the last axis.
shape – optional, a tuple of nonnegative integers specifying the result batch
shape, that is, the prefix of the result shape excluding the last axis.
Must be broadcast-compatible with p.shape[:-1]. The default (None)
produces a result shape equal to p.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
unroll – optional, unroll parameter passed to jax.lax.scan() inside the
implementation of this function.
Returns:
An array of counts for each outcome with the specified dtype and with shape
p.shape if shape is None, otherwise shape+(p.shape[-1],).
where \(k\) is the dimension, \(\mu\) is the mean (given by mean) and
\(\Sigma\) is the covariance matrix (given by cov).
Parameters:
key – a PRNG key used as the random key.
mean – a mean vector of shape (...,n).
cov – a positive definite covariance matrix of shape (...,n,n). The
batch shape ... must be broadcast-compatible with that of mean.
shape – optional, a tuple of nonnegative integers specifying the result
batch shape; that is, the prefix of the result shape excluding the last
axis. Must be broadcast-compatible with mean.shape[:-1] and
cov.shape[:-2]. The default (None) produces a result batch shape by
broadcasting together the batch shapes of mean and cov.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
method – optional, a method to compute the factor of cov.
Must be one of ‘svd’, ‘eigh’, and ‘cholesky’. Default ‘cholesky’. For
singular covariance matrices, use ‘svd’ or ‘eigh’.
Returns:
A random array with the specified dtype and shape given by
shape+mean.shape[-1:] if shape is not None, or else
broadcast_shapes(mean.shape[:-1],cov.shape[:-2])+mean.shape[-1:].
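A short sketch of how the batch shape combines with the event dimension, using jax.random.multivariate_normal. The mean, covariance, and batch shape are arbitrary example values.

```python
import jax
import jax.numpy as jnp

key = jax.random.key(0)
mean = jnp.zeros(2)                 # shape (n,) with n = 2
cov = jnp.array([[2.0, 0.5],
                 [0.5, 1.0]])       # positive definite, shape (n, n)

# shape is the batch shape; the event dimension n is appended to it.
samples = jax.random.multivariate_normal(key, mean, cov, shape=(5,))

# With shape=None the batch shape comes from broadcasting mean and cov.
single = jax.random.multivariate_normal(key, mean, cov)
```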
If the dtype is complex, sample uniformly from the unitary group U(n).
For unequal rows and columns, this samples a semi-orthogonal matrix instead.
That is, if \(A\) is the resulting matrix and \(A^*\) is its conjugate
transpose, then:
If \(n \leq m\), the rows are mutually orthonormal: \(A A^* = I_n\).
If \(m \leq n\), the columns are mutually orthonormal: \(A^* A = I_m\).
Parameters:
key – a PRNG key used as the random key.
n – an integer indicating the number of rows.
shape – optional, the batch dimensions of the result. Default ().
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
m – an integer indicating the number of columns. Defaults to n.
Returns:
A random array of shape (*shape, n, m) and specified dtype.
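For the square case the orthonormality property can be checked directly. The sketch below uses jax.random.orthogonal with only the n argument; the size 3 and the seed are arbitrary.

```python
import jax
import jax.numpy as jnp

key = jax.random.key(0)

# A single 3 x 3 real orthogonal matrix (default batch shape and dtype).
q = jax.random.orthogonal(key, 3)

# For a real orthogonal matrix, Q Q^T should be the identity up to rounding.
gram = q @ q.T
```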
Sample Pareto random values with given shape and float dtype.
The values are distributed according to the probability density function:
\[f(x; b) = b / x^{b + 1}\]
on the domain \(1 \le x < \infty\) with \(b > 0\).
Parameters:
key – a PRNG key used as the random key.
b – a float or array of floats broadcast-compatible with shape
representing the parameter of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with b. The default (None)
produces a result shape equal to b.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by b.shape.
Sample uniform random values in [minval, maxval) with given shape/dtype.
Parameters:
key – a PRNG key used as the random key.
shape – a tuple of nonnegative integers representing the shape.
minval – int or array of ints broadcast-compatible with shape, a minimum
(inclusive) value for the range.
maxval – int or array of ints broadcast-compatible with shape, a maximum
(exclusive) value for the range.
dtype – optional, an int dtype for the returned values (default int64 if
jax_enable_x64 is true, otherwise int32).
Returns:
A random array with the specified shape and dtype.
Note
randint() uses a modulus-based computation that is known to produce
slightly biased values in some cases. The magnitude of the bias scales as
(maxval-minval)*((2**nbits)%(maxval-minval))/2**nbits:
in words, the bias goes to zero when (maxval-minval) is a power of 2,
and otherwise the bias will be small whenever (maxval-minval) is
small compared to the range of the sampled type.
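The bias expression above is plain integer arithmetic and can be evaluated directly. The helper name randint_bias_scale is hypothetical, and nbits=32 is an assumption matching a 32-bit sampled type.

```python
def randint_bias_scale(minval, maxval, nbits=32):
    """Evaluate (maxval - minval) * ((2**nbits) % (maxval - minval)) / 2**nbits."""
    span = maxval - minval
    return span * ((2 ** nbits) % span) / 2 ** nbits

# A power-of-two span divides 2**nbits exactly, so the expression is zero.
power_of_two = randint_bias_scale(0, 16)

# A small span that is not a power of two gives a tiny but nonzero value.
small_span = randint_bias_scale(0, 10)
```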
To reduce this bias, 8-bit and 16-bit values will always be sampled at 32-bit and
then cast to the requested type. If you find yourself sampling values for which
this bias may be problematic, a possible alternative is to sample via uniform:
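A minimal sketch of that alternative, assuming a helper named randint_via_uniform (the name, the int32 default, and the example range are illustrative): sample a float uniformly from [minval - 0.5, maxval - 0.5), round it, and cast.

```python
import jax
import jax.numpy as jnp

def randint_via_uniform(key, shape, minval, maxval, dtype=jnp.int32):
    """Sample integers in [minval, maxval) by rounding uniform floats."""
    u = jax.random.uniform(key, shape, minval=minval - 0.5, maxval=maxval - 0.5)
    return jnp.round(u).astype(dtype)

samples = randint_via_uniform(jax.random.key(0), (1000,), 0, 10)
```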
But keep in mind this method has its own biases due to floating point rounding
errors, and in particular there may be some integers in the range
[minval,maxval) that are impossible to produce with this approach.
Sample Rayleigh random values with given shape and float dtype.
The values are returned according to the probability density function:
\[f(x;\sigma) \propto xe^{-x^2/(2\sigma^2)}\]
on the domain \(0 \le x < \infty\), and where \(\sigma > 0\) is the scale
parameter of the distribution.
Parameters:
key – a PRNG key used as the random key.
scale – a float or array of floats broadcast-compatible with shape
representing the parameter of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with scale. The default (None)
produces a result shape equal to scale.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by scale.shape.
Where \(\nu > 0\) is the degrees of freedom, given by the parameter df.
Parameters:
key – a PRNG key used as the random key.
df – a float or array of floats broadcast-compatible with shape
representing the degrees of freedom parameter of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with df. The default (None)
produces a result shape equal to df.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by df.shape.
Sample Triangular random values with given shape and float dtype.
The values are returned according to the probability density function:
\[\begin{split}f(x; a, b, c) = \frac{2}{c-a} \left\{ \begin{array}{ll} \frac{x-a}{b-a} & a \leq x \leq b \\ \frac{c-x}{c-b} & b \leq x \leq c \end{array} \right.\end{split}\]
on the domain \(a \leq x \leq c\).
Parameters:
key – a PRNG key used as the random key.
left – a float or array of floats broadcast-compatible with shape
representing the lower limit parameter of the distribution.
mode – a float or array of floats broadcast-compatible with shape
representing the peak value parameter of the distribution, value must
fulfill the condition left<=mode<=right.
right – a float or array of floats broadcast-compatible with shape
representing the upper limit parameter of the distribution, must be
larger than left.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with left, mode and right.
The default (None) produces a result shape equal to left.shape, mode.shape
and right.shape.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by left.shape, mode.shape and right.shape.
Sample truncated standard normal random values with given shape and dtype.
The values are returned according to the probability density function:
\[f(x) \propto e^{-x^2/2}\]
on the domain \(\rm{lower} < x < \rm{upper}\).
Parameters:
key – a PRNG key used as the random key.
lower – a float or array of floats representing the lower bound for
truncation. Must be broadcast-compatible with upper.
upper – a float or array of floats representing the upper bound for
truncation. Must be broadcast-compatible with lower.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with lower and upper. The
default (None) produces a result shape by broadcasting lower and
upper.
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and shape given by shape if
shape is not None, or else by broadcasting lower and upper.
Returns values in the open interval (lower,upper).
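A brief sketch showing that samples from jax.random.truncated_normal stay strictly inside the bounds. The bounds, seed, and sample count are arbitrary example values.

```python
import jax
import jax.numpy as jnp

key = jax.random.key(0)
lower, upper = -1.0, 2.0

# Standard normal samples restricted to the open interval (lower, upper).
samples = jax.random.truncated_normal(key, lower, upper, shape=(1000,))

smallest = float(samples.min())
largest = float(samples.max())
```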
on the domain \(0 < x < \infty\), and where \(\mu > 0\) is the location
parameter of the distribution.
Parameters:
key – a PRNG key used as the random key.
mean – a float or array of floats broadcast-compatible with shape
representing the mean parameter of the distribution.
shape – optional, a tuple of nonnegative integers specifying the result
shape. Must be broadcast-compatible with mean. The default
(None) produces a result shape equal to np.shape(mean).
dtype – optional, a float dtype for the returned values (default float64 if
jax_enable_x64 is true, otherwise float32).
Returns:
A random array with the specified dtype and with shape given by shape if
shape is not None, or else by mean.shape.
node – the base node containing the rng states to split.
splits – an integer or tuple of integers specifying the
shape of the split rng keys.
only – a Filter selecting which rng states to split.
graph – If True (default), uses graph-mode which supports the full
NNX feature set including shared references. If False, uses
tree-mode which treats Modules as regular JAX pytrees, avoiding
the overhead of the graph protocol.
Returns:
A SplitBackups iterable if node is provided, otherwise a
decorator that splits the rng states of the inputs to the
decorated function.
split_rngs returns a SplitBackups object that can be used to restore the
original unsplit rng states with nnx.restore_rngs(). This is useful
when you only want to split the rng states temporarily.
node – the base node containing the rng states to fork.
graph – If True (default), uses graph-mode which supports the full
NNX feature set including shared references. If False, uses
tree-mode which treats Modules as regular JAX pytrees, avoiding
the overhead of the graph protocol.
Returns:
A SplitBackups iterable if node is provided, otherwise a
decorator that forks the rng states of the inputs to the
decorated function.
fork_rngs returns a SplitBackups object that can be used to restore the
original unforked rng states with nnx.restore_rngs(). This is useful
when you only want to fork the rng states temporarily.
Update the keys of the specified RNG streams with new keys.
Parameters:
node – the node to reseed the RNG streams in.
graph – If True (default), uses graph-mode which supports the full
NNX feature set including shared references. If False, uses
tree-mode which treats Modules as regular JAX pytrees, avoiding
the overhead of the graph protocol.
policy – defines how the new scalar key for each RngStream is used to
reseed the stream. If 'scalars_only' is given (the default), an error is raised
if the target stream key is not a scalar. If 'match_shape' is given, the new
scalar key is split to match the shape of the target stream key. A callable
of the form (path, scalar_key, target_shape) -> new_key can be passed to
define a custom reseeding policy.
**stream_keys – a mapping of stream names to new keys. The keys can be
either integers or jax.random.key.