Monday, December 31, 2012

Evaluating probabilistic cellular automata is comonadic, too! (part II)


Part I is here. The complete code is here.

We have defined our CA datatype and a local update rule for it. Since our CA is probabilistic, the update rule has a monadic effect in the Rand monad.

To perform a "global update" of the CA, we pass the current CA state and the local update rule to the extend function of Comonad:

extend localRule ca :: EnvT Probs U (Rand StdGen Bool)

The returning type is almost, but not quite, what we need. We now have a CA in which each cell is a monadic random effect. What we would like is to have the whole CA inside the Rand monad, so that a single call to runRand or evalRand gives us a value of EnvT Probs U Bool. We need a way to aggregate the monadic effects of each singular cell into a global monadic effect for the whole CA.

This is similar to the problem of transforming a list of IO actions into an IO action that returns a list. List is an instance of Traversable, so we can use the sequence function on it. Likewise, we must make our type U an instance of Traversable.

The easiest way is to have the instance derived for you automatically. Just use the DeriveTraversable extension and add Traversable to the deriving clause when declaring the U datatype.

But it doesn´t work. More precisely, it works only for one half of the universe! I wanted to print the CA's new state, 8 cells around the center (the showca function is defined in the gist):

    putStrLn . showca 8 
             . lower 
             . flip evalRand (mkStdGen 7) 
             . sequence 
             $ extend localRule ca

This printed only the left side of the universe. When it tried to print the first cell to the right, the whole computation just hanged. Why?

sequence is defined in terms of traverse. The default definition provided by the compiler first traverses the left side of the universe, then the center cell and the right side last. The StdGen generator is threaded throughout. But each side is infinite; when traversing the left side, we can get the updated generator to use on the right side only after an infinite number of steps. Laziness doesn't help here, because there is a data dependency.

This stumped me for a while, but finally I came up with an alternate version of traverse which doesn't suffer this problem:

    instance Traversable U where
        traverse f (U lstream focus rstream) = 
            let pairs =  liftA unzip 
                       . sequenceA . fmap (traversepair f) 
                       $ zip lstream rstream 
                traversepair f (a,b) = (,) <$> f a <*> f b
                rebuild c (u,v) = U u c v
            in rebuild <$> f focus <*> pairs

The trick is to zip the two streams together before traversing them, and unzip them afterwards inside the monad. This way, accessing the value in the first cell to the right doesn't require an infinite number of steps.

That solved, there is still one problem left. When we use evalRand to obtain the new state of the CA, we don't get an updated generator alongside it. And we can't use runRand to get it because the CA is potentially infinite. As we lazily consume the returned CA state, the generator we supplied is still used "under the hood" for generating new values. If we feed the updated generator returned by runRand to the following iteration, the whole computation would hang, again.

Fortunately, the MonadRandom class provides a getSplit operation which "bifurcates" a generator. At each iteration, we split the generator, throwing one half into the black hole of evalRand, and keeping the other half for the next iteration. We now know enough to define a function that, given a generator and an initial state of the CA, produces a lazy stream of all the succesive states of the automaton:

evolve :: EnvT Probs U Bool -> Rand StdGen (EnvT Probs U Bool)
evolve ca = sequence $ extend localRule ca

history :: StdGen -> EnvT Probs U Bool -> Stream (EnvT Probs U Bool)
history seed initialca =     
   let unfoldf (ca,seed) = 
           let (seed',seed'') = runRand getSplit seed 
               nextca = evalRand (evolve ca) seed'
           in  (nextca,(nextca,seed''))
   in unfold unfoldf (initialca,seed)

That's all folks! Happy New Year.

Edit: I wasn't sure of the correctness of my Traversable instance, so I posted a question on Stack Overflow.

Sunday, December 30, 2012

Evaluating probabilistic cellular automata is comonadic, too! (part I)

I'm trying to develop my intuition for comonads and comonad transformers in Haskell. This article in A Neighborhood of Infinity helped me grasp the basic concepts, so I decided to continue in the same vein, and explore how to handle simple probabilistic cellular atomata like the ones described in this link.

Let's start by adapting sigfpe's datatype to make it use the standard Comonad typeclass from the comonad package:

    data U x = U (Stream x) x (Stream x) deriving (Functor,Foldable)

    right (U a b (c:>cs)) = U (b:>a) c cs
    left  (U (a:>as) b c) = U as a (b:>c)

    instance Comonad U where
       extract (U _ b _) = b
       duplicate a = U (tail $ iterate left a) a (tail $ iterate right a)

I have used the Stream datatype (from the streams package) instead of lists, to actually enforce that the universe is infinite in both directions.

Now, to allow probabilistic "evolution" of the CA, we need two things:

  • Some way of storing and accessing the probability parameters. Think of them as the "basic constants" of the CA's universe.
  • Some way of generating random values that plays well with the Comonad instance. Remember that randomness is a kind of monadic effect.

The first one is easy, we'll just use the EnvT comonad transfomer. This transformer attaches "environment values" to comonadic values. The environment can be later extracted from the transformed comonadic value using the ask function.

Let's add probabilities to our U datatype. We pass a vanilla U value to the EnvT constructor, along with a probabilities tuple:

    type Probs = (Float,Float,Float,Float)

    ca :: EnvT Probs U Bool
    ca = EnvT (0.0,0.6,0.7,0.0) $ U (repeat False) True (repeat False)

    probs :: Probs
    probs = ask ca
 
Easy as a pie! Notice the following: with monads, there is a uniform interface to "put things in" (return) but the interfaces to "get things out" are specialized (runReaderTrunWriterT) or don't even exist (like with IO). With comonads, there is a uniform interface to "get things out" (aptly named extract) but the interface to "put things in" depends on the comonad (the EnvT constructor in this case).

Let's now define a probabilistic local update rule which calculates the next state of a cell from the state of its neighbours. We are using the MonadRandom package:

    localRule :: EnvT Probs U Bool -> Rand StdGen Bool
    localRule ca =
        let (tt,tf,ft,ff) = ask ca
            black prob = (<prob) <$> getRandomR (0,1)
        in case lower ca of
            U (True:>_) _ (True:>_) -> black tt
            U (True:>_) _ (False:>_) -> black tf
            U (False:>_) _ (True:>_) -> black ft
            U (False:>_) _ (False:>_) -> black ff

We already know what ask does. But what does lower do? lower is the dual of lift. lift adds new effects to a base monadic value; lower strips one layer of a transformed comonadic value and returns the comonadic value underneath. Here we use it to access the base U value and pattern match against it.

We are almost there. Having defined localRule, we can use it together with extend to perform a "global update" that transforms an EnvT Probs U Bool into an EnvT Probs U (Rand StdGen Bool). But now we've hit a bit of a block. What we really want to get is a Rand StdGen (EnvT Probs U Bool) which we could then "run" using evalRand and obtain the next state of the CA. How to do this is explained in the next post.

A gist with the Haskell code can be found here.

Part II of this post is here.