SEEDING

  In the original version by Makoto Matsumoto and Takuji Nishimura,
  mti == N + 1 indicates the the generator is not seeded.  mti should
  never exceed N once one has started random number generation.
  Therefore, mti == N + 1 means that one is about to poll the first
  random number, and then one has to seed first.  After seeding, mti
  is left at N.

  In the MesoRD C++ version, the generator is seeded from the
  constructor.  One does therefore not need to check whether seeding
  is required before polling a random number in genrand_int32().
  Because the seed() functions leave mti at N, genState() will be
  called from the first invocation to genrand_int32().


TWIST

  The Makoto Matsumoto and Takuji Nishimura original:

    static unsigned long mag01[2] = {0x0UL, MATRIX_A};
    ...
    for (kk = 0; kk < N - M; kk++) {
      y      = (mt[kk] & UPPER_MASK) | (mt[kk + 1] & LOWER_MASK);
      mt[kk] = mt[kk + M] ^ (y >> 1) ^ mag01[y & 0x1UL];
    }
    for ( ; kk < N - 1; kk++) {
      y      = (mt[kk] & UPPER_MASK) | (mt[kk + 1] & LOWER_MASK);
      mt[kk] = mt[kk + (M - N)] ^ (y >> 1) ^ mag01[y & 0x1UL];
    }
    y         = (mt[N - 1] & UPPER_MASK) | (mt[0] & LOWER_MASK);
    mt[N - 1] = mt[M - 1] ^ (y >> 1) ^ mag01[y & 0x1UL];
    mti       = 0;

  The three assignments to mt[] are on the form "a ^ b ^ c", where

    (1) a = mt[d], for some index d.

    (2) b = e >> 1, where e in turn has its high bits from mt[f] and
        its low bits from mt[g], for some indices f and g.  e is
        computed by the mixBits() function in MesoRD,

          inline uint32_t
          mixBits(const uint32_t u, const uint32_t v) const
            {
              return ((u & UPPER_MASK) | (v & LOWER_MASK));
            }

        where u corresponds to mt[f] and v corresponds to mt[g].
        Richard Wagner's mixBits() function should be identical,

          uint32 hiBit(const uint32& u) const {
            return u & 0x80000000UL;
          }
          uint32 loBits(const uint32& u) const {
            return u & 0x7fffffffUL;
          }
          uint32 mixBits(const uint32& u, const uint32& v) const {
            return hiBit(u) | loBits(v);
          }

        except the logical AND is split into separate functions.  On
        the contrary, Jasper Bedaux does the AND directly in the
        corresponding twist() function, to be discussed next.  See the
        first term in the twiddle() function below.

    (3) c = 0 if LSB(e) == 0, and c = MATRIX_A otherwise.  Here, LSB()
        could be defined as LSB(z) = z & 0x1.  Because the low bits in
        e come from mt[g], c = 0 if LSB(mt[g]) == 0 and c = MATRIX_A
        if LSB(mt[g]) == 1.  The two last terms in the XOR "a ^ b ^ c"
        are calculated as follows by Jasper Bedaux

          inline unsigned long
          twiddle(unsigned long u, unsigned long v) {
            return (((u & 0x80000000UL) | (v & 0x7FFFFFFFUL)) >> 1)
              ^ ((v & 1UL) ? 0x9908B0DFUL : 0x0UL);
          }

        which would correspond to

          (mixBits(u, v) >> 1) ^ ((v & 1) ? MATRIX_A : 0).

        Hence the MesoRD way of computing "a ^ b ^ c"

          inline uint32_t
          twist(uint32_t u, uint32_t v, uint32_t w) const
            {
              return (u
                ^ (mixBits(v, w) >> 1)
                ^ (((w & 0x1) != 0) ? MATRIX_A : 0));
            }

        an XOR of all three terms.  It should be identical to Richard
        Wagner's:

          uint32 loBit(const uint32 u) const {
            return u & 0x00000001UL;
          }
          uint32 magic(const uint32 u) const {
            return loBit(u) ? 0x9908b0dfUL : 0x0UL;
          }
          uint32 twist(const uint32& m,
                       const uint32& s0,
                       const uint32& s1) const {
            return m ^ (mixBits(s0,s1)>>1) ^ magic(s1));
          }

  In summary, the assignments can be written as a function of mt[d],
  mt[f] and mt[g].  This yields the MesoRD implementation,

    inline void
    genState(){
      size_t kk;
      for (kk = 0; kk < N - M; kk++)
        mt[kk] = twist(mt[kk + M], mt[kk], mt[kk + 1]);
      for (kk = N - M; kk < N - 1; kk++)
        mt[kk] = twist(mt[kk + (M - N)], mt[kk], mt[kk + 1]);
      mt[N - 1] = twist(mt[M - 1], mt[N - 1], mt[0]);
      mti = 0;
    }

  which should be identical to the Makoto Matsumoto and Takuji
  Nishimura original, apart from that the assignments are done by
  twist().  Richard Wagner's implementation should be identical, too,
  except a pointer p is used instead.  At the last assignment p points
  to the last element, so p[M - N] is the (M - N):th element from the
  end, or the (M - 1):th element from the beginning.

    inline void MTRand::reload() {
      static const int MmN = int(M) - int(N);
      register uint32 *p = state;
      register int i;
      for (i = N - M; i--; ++p)
        *p = twist(p[M], p[0], p[1]);
      for (i = M; --i; ++p)
        *p = twist(p[MmN], p[0], p[1]);
      *p = twist(p[MmN], p[0], state[0]);
      left = N, pNext = state;
    }

  Because Jasper Bedaux's twiddle() function, which is roughly
  equivalent to twist() in MesoRD, does an XOR with the two last
  terms, the third XOR has to be done in gen_state()

    void MTRand_int32::gen_state() { // generate new state vector
      for (int i = 0; i < (n - m); ++i)
        state[i] = state[i + m] ^ twiddle(state[i], state[i + 1]);
      for (int i = n - m; i < (n - 1); ++i)
        state[i] = state[i + m - n] ^ twiddle(state[i], state[i + 1]);
      state[n - 1] = state[m - 1] ^ twiddle(state[n - 1], state[0]);
      p = 0; // reset position
    }


ACKNOWLEDGEMENTS

  We thank Shawn Cokus, Jasper Bedaux, Makoto Matsumoto, Takuji
  Nishimura, and Richard J. Wagner for their respective BSD licensed
  implementations.

  We also thank Martina Froehlich for bringing the problem with the
  previous MesoRD implementation to our attention.
