“Coroutine Pipelines” notes

These are my notes from reading an excellent paper by Mario Blažević on Haskell coroutines, “Coroutine Pipelines”. It is the last paper of The Monad Reader, Issue 19.

Monad Trampolines

As we know, a monad is of great help when you want to sequence your computations (though this is not all a monad is about, it is probably the biggest motivation for introducing it in the first place). Now, when we execute a monadic computation, what if we want to be able to pause it for whatever reason? A Trampoline monad achieves this by wrapping the underlying monad in a new type that will either run to completion or return the current continuation and let the caller decide when to continue. I highlighted either to emphasize that we are calling for a… well, Either type to wrap around the underlying monad.

Let’s define such a type:

newtype Trampoline m r = Trampoline{
    bounce :: m (Either (Trampoline m r) r)
}

Trampoline m r wraps an underlying monad m, and its bounce function runs the trampoline one step, returning either a continuation (the paused rest of the computation) or the final result.

It is of course itself a monad:

instance Monad m => Monad (Trampoline m) where
    return r = Trampoline $ return (Right r)
    t >>= f = Trampoline (bounce t >>= either
                                       (return . Left . (>>=f))
                                       (bounce . f))

Note that in the definition of the bind function we first bounce t; if we get a continuation back, we bind f onto it, wrap it in the Left constructor and return, otherwise we bounce the Trampoline returned by the monadic function f.

Also note how the bounce action is correctly built by both return and the bind function for every Trampoline value we create.

It is also a monad transformer, so we can easily lift the underlying monad into Trampoline.

instance MonadTrans Trampoline where
    lift = Trampoline . liftM Right

We then define pause to be a trampoline that simply returns a continuation, and run to be a function that runs a trampoline to completion, skipping all pauses.

pause :: Monad m => Trampoline m ()
pause = Trampoline (return $ Left $ return ())

run :: Monad m => Trampoline m a -> m a
run t = bounce t >>= either run return

Now we have all we need for a trampoline computation that can pause anywhere we want:

*Main> let hello = do{lift (putStr "Hello, "); pause; lift (putStr "World! ")}
*Main> Left cont <- bounce hello
Hello,
*Main> putStr "wonderful "
wonderful
*Main> run cont
"World!"
()

Because of the pausing/suspension feature, we can now easily interleave two trampolines; see the original paper for an example where multiple trampolines are synchronized (those that reach a pause faster are held until all trampolines have reached their next pause). Notice how this approach executes a list of actions that cooperate with each other, without introducing any threads (or the logic to coordinate those threads!).
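
Here is a minimal sketch of the idea (the name and the pairwise form are mine, not the paper's): run two trampolines side by side, switching at every pause until both finish.

-- run two trampolines in lock step, alternating at every pause
interleave :: Monad m => Trampoline m a -> Trampoline m b -> m (a, b)
interleave ta tb = do
    ea <- bounce ta
    eb <- bounce tb
    case (ea, eb) of
      (Left ca, Left cb) -> interleave ca cb
      (Left ca, Right b) -> run ca >>= \a -> return (a, b)
      (Right a, Left cb) -> run cb >>= \b -> return (a, b)
      (Right a, Right b) -> return (a, b)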

Suspension Functors

We saw that a trampoline pauses the program to let the caller do whatever it wants during the pause. We can introduce two more types with the same pausing feature, but which offer more during the pause: a Generator hands a value to the caller at each pause, and an Iteratee expects a value from the caller in order to proceed after the pause.

Naturally, a Generator is a type that, when bounced, will either produce a value together with a continuation, (a, Generator a m x), or a result directly.

newtype Generator a m x = Generator {
      bounceGen :: m (Either (a, Generator a m x) x)
}

instance Monad m => Monad (Generator a m) where
    return r = Generator $ return (Right r)
    g >>= f =  Generator (bounceGen g >>= either
                                    (\(a,cont) -> 
                                        return $ Left (a, cont >>= f)
                                    )
                                    (bounceGen . f)
                         )

instance MonadTrans (Generator a) where
    lift = Generator . liftM Right

yield :: Monad m => a -> Generator a m ()
yield a = Generator (return $ Left (a, return ()))

runGenerator :: Monad m => Generator a m x -> m ([a], x)
runGenerator = run' id where
    run' f g = bounceGen g
               >>= either
                   (\(a,cont) -> run' (f . (a:)) cont)
                   (\x -> return (f [], x))

As for Iteratee, just replace the tuple with a function that takes the expected value and returns a new Iteratee.

newtype Iteratee a m x = Iteratee{
      bounceIter :: m (Either (a -> (Iteratee a m x)) x)
}


instance Monad m => Monad (Iteratee a m) where
    return r = Iteratee $ return (Right r)
    t >>= f = Iteratee (bounceIter t 
                        >>= either
                            (\cont -> return $ Left ((>>=f) . cont))
                            (bounceIter . f)
                       )

instance MonadTrans (Iteratee a) where
    lift = Iteratee . liftM Right

await :: Monad m => Iteratee a m a
await = Iteratee (return $ Left return)

runIteratee :: Monad m => [a] -> Iteratee a m x -> m x
runIteratee (a:rest) i = bounceIter i
                         >>= either
                             (\cont -> runIteratee rest (cont a))
                             return
runIteratee [] i = bounceIter i
                   >>= either
                       (\cont -> runIteratee []
                                 (cont $ error "No more values to feed")
                       )
                   return
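
As a quick usage sketch (my own example, not from the paper), an Iteratee that awaits two numbers and returns their sum can be fed a list with runIteratee:

-- await two values and return their sum
addTwo :: Monad m => Iteratee Int m Int
addTwo = do x <- await
            y <- await
            return (x + y)

-- runIteratee [1,2,3] addTwo yields 3 in the underlying monad
-- (the third value is never demanded)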

Now, Trampoline, Generator and Iteratee all have a very similar structure in their implementation. In fact, they only differ in what the Left constructor of the Either holds: for Trampoline it is the bare continuation (an Identity, in effect), for Generator it is a pair (a, continuation), and for Iteratee it is a function expecting a value. The implementations only differ in how they handle these shapes.

The key idea of the Coroutine type is the realization that all of these shapes (Identity, ((,) a), ((->) a)) are functors. So we define a Coroutine type parameterized by an arbitrary suspension functor.

newtype Coroutine s m r = Coroutine{
      resume :: m (Either (s (Coroutine s m r)) r)
}


instance (Functor s, Monad m) => Monad (Coroutine s m) where
    return r = Coroutine $ return (Right r)
    t >>= f = Coroutine (resume t 
                        >>= either
                            (return . Left . (fmap (>>=f)))
--                            same impl but less elegant
--                            (\fcont -> return $ Left (fmap (>>=f) fcont))
                            (resume . f)
                       )

instance MonadTrans (Coroutine s) where
    lift = Coroutine . liftM Right

suspend :: (Monad m, Functor s) => s (Coroutine s m x) -> Coroutine s m x
suspend s = Coroutine( return (Left s)
                     )

And now Trampoline, Generator and Iteratee can all be defined in terms of the Coroutine type.

type Trampoline' m x = Coroutine Identity m x
type Generator' a m x = Coroutine ((,) a) m x
type Iteratee' a m x = Coroutine ((->) a) m x

The definitions above tell how each of these Coroutines is composed as a monad and a monad transformer.

We now define the run and pause actions for each of them. Note how similar they look to the original definitions.

pause' :: Monad m => Trampoline' m ()
pause' = suspend (Identity $ return ())

yield' :: Monad m => x -> Generator' x m ()
yield' x = suspend (x, return ())

await' :: Monad m => Iteratee' x m x
await' = suspend return

run' :: Monad m => Trampoline' m x -> m x
run' t = resume t >>= either (run' . runIdentity) return

runGenerator' :: Monad m => Generator' x m r -> m ([x], r)
runGenerator' = run' id where
    run' f g = resume g
               >>= either
                   (\(x,cont) -> run' (f . (x:)) cont)
                   (\r -> return (f [], r))

runIteratee' :: Monad m => [x] -> Iteratee' x m r -> m r
runIteratee' (x:xs) i = resume i 
                        >>= either
                            (\cont -> runIteratee' xs (cont x))
                            return
runIteratee' [] i = resume i
                    >>= either
                        (\cont -> runIteratee' [] (cont $ error "No more values to feed"))
                        return

We also define a utility function to map a Coroutine over suspension functor s to a Coroutine over suspension functor s'.

mapSuspensions :: (Functor s, Monad m) => (forall a. s a -> s' a)
                  -> Coroutine s m x -> Coroutine s' m x
mapSuspensions f cort = Coroutine {resume = liftM map' (resume cort)}
                        where map' (Right r) = Right r
                              map' (Left s) = Left $ f (fmap (mapSuspensions f) s)

By now, the idea of a coroutine should be clear: it is a computation with predefined pausing actions. The sequence between two pauses is “atomic”, meaning it runs as one single computation; the computation stops when it hits a pause, returning a continuation instead of the final value. The suspension might be there just for the pausing effect, or it can do some more work, like generating a value or expecting a value. With these pausing continuations, it is possible to mix many coroutines together so that they work cooperatively (thus the name “co-routine”): each one is “continued” by its neighbors, and it “continues” its neighbors when it completes its part and decides it is time to let them run. We will show this below.

Communicating coroutines

Before we dive into the piping usage of coroutines, which is the core of the Haskell libraries pipes and conduit, there is one thing I want to point out: if we make a coroutine whose suspension functor simply pauses and asks a global scheduler to switch to another coroutine (randomly, perhaps), then we have ourselves a non-preemptive green-thread system (a.k.a. symmetric coroutines), where the threads themselves decide when to give up the CPU and let others run. This is only one of many possible usages of coroutines. We could further improve the scheduler to run coroutines on different OS threads to benefit from multiple CPUs. All of this can be done relatively easily!

Now, let's chain together two coroutines that are dual to each other: Generator and Iteratee. Naturally, one produces values and the other consumes them.

bindM2 :: Monad m => (a -> b -> m c) -> m a -> m b -> m c
bindM2 f ma mb = do{a<-ma;b<-mb;f a b}

pipe1 :: Monad m => Generator' a m x -> Iteratee' a m y -> m (x,y)
pipe1 g i = bindM2 proceed (resume g) (resume i) where
    proceed (Left (a,c)) (Left f) = pipe1 c (f a)
    proceed (Left (a,c)) (Right y) = pipe1 c (return y) -- more generated values
    proceed (Right x) (Left f) = error "The producer ended too soon"
    proceed (Right x) (Right y) = return (x,y)

-- better error handling with Maybe type
pipe2 :: Monad m => Generator' a m x -> Iteratee' (Maybe a) m y -> m (x,y)
pipe2 g i = bindM2 proceed (resume g) (resume i) where
    proceed (Left (a,c)) (Left f) = pipe2 c (f $ Just a)
    proceed (Left (a,c)) (Right y) = pipe2 c (return y) -- more generated values
    proceed (Right x) (Left f) = pipe2 (return x) (f Nothing)
    proceed (Right x) (Right y) = return (x,y)

pipe1 and pipe2 do the same thing: they resume both the Generator and the Iteratee until either all the generated values have been consumed by the iteratee and the computation ends, or the iteratee demands more than the generator produced, in which case pipe1 reports a runtime error while pipe2 uses the Maybe type to let the iteratee handle it gracefully.

Example:

*Main> let gen = do {lift $ putStrLn "yielding one"; yield' 1; lift $ putStrLn "then two"; yield' 2; lift $ putStrLn "returning 3"; return 3}
*Main> runGenerator' gen
yielding one
then two
returning 3
([1,2],3)
*Main> let iter = do {lift $ putStrLn "Enter two numbers"; x <- await' ; y <- await'; lift $ putStrLn $ "sum is " ++ show (x+y)}
*Main> pipe1 gen iter
yielding one
Enter two numbers
then two
returning 3
sum is 3
(3,())
*Main> let iter2 s = do {lift $ putStrLn "Enter two numbers"; x <- await' ; maybe (lift $ putStrLn $ "sum is " ++ show s) (\n -> iter2 (s + n)) x}
*Main> pipe2 gen (iter2 0)
yielding one
Enter two numbers
then two
Enter two numbers
returning 3
Enter two numbers
sum is 3
(3,())
*Main> pipe2 (return ()) (iter2 0)
Enter two numbers
sum is 0
((),())
*Main> pipe2 (gen >> gen) (iter2 0)
yielding one
Enter two numbers
then two
Enter two numbers
returning 3
yielding one
Enter two numbers
then two
Enter two numbers
returning 3
Enter two numbers
sum is 6
(3,())
*Main> pipe1 (gen >> gen) iter
yielding one
Enter two numbers
then two
returning 3
yielding one
sum is 3
then two
returning 3
(3,())
*Main> 

Note the difference between the last two outputs: iter2 is defined so that it can keep running, consuming all generated values, while iter is not.

The above example is quite limited and manual. We now look to generalize this process. To do that, we introduce a new kind of Coroutine: the Transducer, or Enumeratee.

A Transducer is also a Coroutine: its suspension functor (the s in the definition above) is the sum of an Iteratee-style functor and a Generator-style functor, so at each suspension it either awaits a value or yields one. For this we need an EitherFunctor type, which takes two functors and combines them into a new functor:

data EitherFunctor l r x = LeftF (l x) | RightF (r x)
instance (Functor l, Functor r) => Functor (EitherFunctor l r) where
    fmap f (LeftF l) = LeftF (fmap f l) 
    fmap f (RightF r) = RightF (fmap f r) 

Now we define Transducer.

type Transducer a b m x = Coroutine (EitherFunctor ((->) (Maybe a)) ((,) b)) m x

awaitT :: Monad m => Transducer a b m (Maybe a)
awaitT = suspend (LeftF return)

yieldT :: Monad m => b -> Transducer a b m ()
yieldT x = suspend (RightF (x, return ()))

We can show that we are able to lift pure or stateful computations into a Transducer easily:

-- lift a pure function to a Transducer that repeatedly awaits a value of type `a`, yields `f a` once it is given, and then resumes the computation
lift121 :: Monad m => (a -> b) -> Transducer a b m ()
lift121 f = awaitT >>= maybe (return ()) (\a -> yieldT (f a) >> lift121 f)

-- lift a pure function that, given a, produces a list of b values. The resulting transducer awaits an a, yields the list of values, then resumes
liftStateless :: Monad m => (a -> [b]) -> Transducer a b m ()
liftStateless f = awaitT >>= 
                  maybe (return ()) 
                        (\a -> mapM_ yieldT (f a) >> liftStateless f)

-- lift a stateful computation. The new state is passed to the resumed computation.
liftStateful :: Monad m => (state -> a -> (state, [b])) -> (state -> [b]) -> state ->
                Transducer a b m ()
liftStateful f eof s = awaitT
                       >>= maybe (mapM_ yieldT (eof s))
                           (\a -> let (s', bs) = f s a
                                  in mapM_ yieldT bs >> liftStateful f eof s'
                           )
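
As an example (and because the GHCi sessions below use it), a double transducer that simply yields every awaited value twice can be built with liftStateless. This matches the behavior shown in the sessions, though the paper may define it differently:

-- yield every awaited value twice
double :: Monad m => Transducer a a m ()
double = liftStateless (\a -> [a, a])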

We could now pipe a Generator to a Transducer, or a Transducer to an Iteratee, or chain all three of them. However, if that were all we could do after all these careful type and function definitions, it would be very limited. Indeed, the reason we went through the trouble of defining Transducer is that it is a generalized form of both Generator and Iteratee. Recall that we defined mapSuspensions earlier, which translates between different suspension functors. Using it, we can convert every Coroutine type we have seen into a Transducer, and if we define a function to pipe two Transducers together, we have a generalized framework for piping any of these Coroutine types!

Here we go:

data Naught
fromGenerator :: Monad m => Generator' a m x -> Transducer Naught a m x
fromGenerator = mapSuspensions RightF

fromIteratee :: Monad m => Iteratee' (Maybe a) m x -> Transducer a Naught m x
fromIteratee = mapSuspensions LeftF

toGenerator :: Monad m => Transducer Naught a m x -> Generator' a m x
toGenerator = mapSuspensions (\(RightF a) -> a)

toIteratee :: Monad m => Transducer a Naught m x -> Iteratee' (Maybe a) m x
toIteratee = mapSuspensions (\(LeftF a) -> a)
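
The sessions below also use a toTrampoline conversion to run a fully connected pipe (Naught on both ends). Here is a sketch in the same spirit; the paper's exact definition may differ. Since neither end can carry a real value, any leftover suspension is collapsed into a plain pause:

toTrampoline :: Monad m => Transducer Naught Naught m x -> Trampoline' m x
toTrampoline = mapSuspensions convert where
    convert (LeftF f)      = Identity (f Nothing)  -- nothing left to feed
    convert (RightF (_,c)) = Identity c            -- a Naught value cannot actually occur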

(=>=) :: forall a b c m x y . Monad m=>
         Transducer a b m x -> Transducer b c m y -> Transducer a c m (x,y)
-- Cont of t1 is a Iteratee
t1 =>= t2 = Coroutine (bindM2 proceed (resume t1) (resume t2)) where
    -- t1 is demanding value, return a Transducer demanding value too
    proceed (Left (LeftF s)) c = 
        return (Left $ LeftF $ fmap (=>= Coroutine (return c)) s)
    -- t2 is generating value, return a Transducer generating value too
    proceed c (Left (RightF s)) = 
        return (Left $ RightF $ fmap (Coroutine (return c) =>=) s)
    -- t1 is generating and t2 is consuming, happy!
    proceed (Left (RightF (b, c1))) (Left (LeftF f)) = 
        resume (c1 =>= f (Just b))
    -- t1 is generating and t2 finished, return a transducer that continue to run t1
    proceed (Left (RightF (b,c1))) (Right y) = 
        resume (c1 =>= (return y :: Transducer b c m y))
    -- t1 finished, and t2 is still demanding, feed Nothing to t2
    proceed (Right x) (Left (LeftF f)) = 
        resume ((return x :: Transducer a b m x) =>= f Nothing)
    -- both finished!
    proceed (Right x) (Right y) = return $ Right (x,y)


Let’s play around:

*Main> let iter2 s = do {lift $ putStrLn "Enter a number"; x <- await' ; maybe (lift $ putStrLn $ "sum is " ++ show s) (\n -> iter2 (s + n)) x}
*Main> runIteratee' [Just 3, Nothing] (toIteratee $ double =>= fromIteratee (iter2 0))
Enter a number
Enter a number
Enter a number
sum is 6
((),())
*Main> let gen = do {lift $ putStrLn "yielding one"; yield' 1; lift $ putStrLn "then two"; yield' 2; lift $ putStrLn "returning 3"; return 3}
*Main> runGenerator' (toGenerator $ fromGenerator gen =>= double)
yielding one
then two
returning 3
([1,1,2,2],(3,()))
*Main> run' (toTrampoline $ fromGenerator (yield' 3) =>= double =>= fromIteratee (iter2 0))
Enter a number
Enter a number
Enter a number
sum is 6
(((),()),())
*Main> run' (toTrampoline $ fromGenerator (yield' 3) =>= double =>= double =>= fromIteratee (iter2 0))
Enter a number
Enter a number
Enter a number
Enter a number
Enter a number
sum is 12
((((),()),()),())

Note how the converted Transducers are chained together, and how their combined behavior is exactly what the general pipe function =>= defines.

More generalizations

OK, we have done some really cool generalization with Transducer. Now, we are ready to move on to the next level.

We can define a Coroutine that yields values to two downstream Coroutines; basically, it splits its generated values. Similarly, we can have a Coroutine that awaits two data sources and joins them together.

With Splitter and Join, we are able to do non-linear piping, and define conditional Coroutines.

type Splitter a m x = Coroutine (EitherFunctor ((->) (Maybe a)) (EitherFunctor ((,) a) ((,)a))) m x

type Join a m x = 
    Coroutine (EitherFunctor (EitherFunctor ((->) (Maybe a)) ((->) (Maybe a))) ((,) a)) m x

I'll stop here; please refer to the original paper for the yield and await definitions. The paper also gives the type signatures of conditional combinators like ifThenElse, not, and, and others, and it sketches a possible grep implementation using these combinators together with Splitter, Join and Transducer.

Summary

This post tries to show how Iteratee and Enumeratee are implemented in Haskell. It gives a very generic and non-optimized implementation compared to the libraries that currently exist. Nevertheless, it helps to gain a deep understanding of Coroutine and its application to pipelined IO (pipes and conduit).

Under the hood, Coroutine provides a way to pause/resume a computation. Exactly what happens while it is paused is controlled by the functor argument used to construct the Coroutine. With different functors, we get Trampoline, Generator, Iteratee, Transducer, Splitter and Join. With these types, it is possible to construct any stream processor that consumes incoming values and generates new ones; in other words, it is possible to pipe a list of these coroutines together into one resulting coroutine that knows how to process the incoming values.

The benefit lies in the fact that all these Coroutine types are piped together using continuations: they know how to pass values to the next Coroutine, and each one is woken up automatically when the value it awaits is ready. There are no context switches, as in multithreaded IO programming, yet the programmer can work as if each Coroutine ran in a separate thread. It achieves the efficiency of async IO without all the callbacks, thanks to the underlying continuations.

As for the concern that a pipeline only uses one OS thread when it runs: there is not much point in running its parts on many threads, because they depend on each other and would spend most of their time waiting, creating unnecessary context switches. In applications, a pipe/conduit is typically constructed from many small Coroutines that cooperate within one OS thread, minimizing context switches. When you have two independent streams, you construct two pipelines and let them run on two OS threads. Each of them is optimal on its own, and we still achieve parallel IO this way.

What you need to know about network flows in programming contests

References:

Given that what follows is just my notes from reading some materials, and that it will be hard for anyone else to follow since there are no illustrations, I'll list the references at the start so people have a better place to begin.

  1. TopCoder tutorial
  2. MIT Advanced Algorithms material
  3. Algorithm Design
  4. A Chinese discussion of a min-cost problem.

Introduction

Network flow is a graph model that captures a large number of interesting problems. Specifically, it is a graph whose edges (and sometimes vertexes too) are assigned numbers representing the requirements the flow/traffic passing through them should meet. Despite being mainly a tool to study flows, it often arises in problems where there is no concept of flow/traffic at all. Those problems are typically optimization or feasibility problems, involving “selecting a certain set” to maximize/minimize an objective, or asking whether “a selection of elements is possible”, usually with many restrictions on the selection.

One important point is that every network flow problem can be converted to a linear program. The constraints on edges and vertexes become linear inequalities, and the structure of the network determines the set of inequalities and how they relate to each variable. Naturally, the “selection” in many network flow problems is modelled by a variable x with upper bound 1 and lower bound 0, and because the optimal solution is always integral in these problems, x is effectively a boolean variable representing whether or not the element is selected. However, LP algorithms (Simplex and others) are designed to solve general LP problems, and network flow problems have extra structure which makes much more efficient algorithms possible. One can find a discussion of solving network flow problems with specialized LP algorithms in chapter 10 of this book. LP algorithms are hard to implement in a contest setting and are rarely required, so we will skip them in our discussion.
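
As a sketch of that correspondence (the standard max-flow LP, with edge capacities c_{uv}, source s and sink t; not specific to any reference listed here):

maximize    \sum_{v} f_{sv} - \sum_{v} f_{vs}
subject to  0 \le f_{uv} \le c_{uv}                    for every edge (u,v)
            \sum_{u} f_{uv} = \sum_{w} f_{vw}          for every vertex v other than s and t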

Maximum flow and Ford–Fulkerson method

The Ford–Fulkerson method is usually THE algorithm used in a contest. The key here is the introduction of the “residual graph”. There is a residual graph for any legal flow in the network, and the key realization is that the residual graph encodes all the options we have for modifying the given flow. If some combination of the options leads to an increase in the flow value, we obtain a new flow; if no such combination exists, we are at the maximum. Finding such a combination amounts to finding a path from source to sink in the residual graph.

One way to aid understanding is to note that Ford–Fulkerson follows the same idea as heuristic optimization: from a given position, it looks around to see if there is a way to move closer to the optimum. In a setting like gradient descent, the residual graph corresponds to the gradient, which tells you how to proceed. The difference is that with FF there is one global optimum, and FF can always find it if all capacities are integers. In other words, FF is a heuristic method that happens to have no local optima and is guaranteed to find the optimum in networks whose capacities are all integers.

A common improvement on the plain Ford–Fulkerson method is the Edmonds–Karp algorithm. Instead of finding an arbitrary path in the residual graph, it finds the path with the minimum number of edges from source to sink. It does so by assigning unit length to every edge and doing a BFS until the sink appears in the BFS queue. Edmonds–Karp improves the complexity to O(VE^2). See CLRS for the proof.

Adding lower bounds to edge capacity

One common extension to the maximum flow problem is adding (non-zero) lower bounds to the edge capacities. When lower bounds are added, it becomes obvious that there may not be a legal flow satisfying all of them; for example, the zero flow is no longer legal when some lower bound is non-zero. We then face a feasibility issue before tackling the maximization goal: to apply the Ford–Fulkerson method, one needs to start from a legal flow, but the zero flow is no longer legal here!

This situation closely resembles finding a feasible starting point for Simplex when solving a linear programming problem. We are usually first introduced to Simplex with examples where the origin (all variables set to zero) serves as a legal start. And the remedies resemble each other too: just as finding a legal start for Simplex is cast as a new LP problem, finding a legal starting flow is itself a max-flow problem (though other methods exist for special cases)!

We do this by converting the original network to a new one. In the new graph, lower bounds are moved from edges to vertexes. Specifically, each vertex now has a value indicating the net flow that is supposed to go through it: a negative value means the vertex is generating flow while a positive value means it is consuming flow (or the other way around, as long as it is consistent throughout the algorithm). For example, if we have an edge (u,v) with lower bound l and capacity c, then in the new graph (u,v) has no lower bound and capacity c - l; instead, flow_u is decreased by l and flow_v is increased by l. It is easy to see that a legal flow in the new graph corresponds to a legal flow in the original one (add l back on each converted edge).

We then create a new source and a new sink in the new graph. The source connects to all vertexes with negative (generating) values, with edge capacity |flow_v|; similarly, every vertex with a positive value connects to the sink with capacity flow_v. Now it is easy to see that the original graph has a feasible flow if and only if the maximum flow saturates all these new edges, i.e. its value equals the sum of |flow_v| over the generating vertexes. And the maximum flow itself gives us the feasible flow we are looking for, in both the new and the original graph.

Now we have a legal flow to start with. Note that this flow runs from the source to the sink that we created, but we are looking for a max flow between two given nodes of the original graph. This should not stop us: we build the residual graph from this flow and then look for augmenting paths between the two given vertexes. We do need to keep the lower bounds (or the flow values on vertexes) satisfied in the residual graph, but that is not much to add to Ford–Fulkerson, just more bookkeeping.

Min cost max flow

Another commonly seen extension is to add a cost to each edge: the goal is then to find, among the flows satisfying all lower bounds of the network, one of minimum cost. The cost of a flow is defined as the sum over all edges of f_e * c_e, where c_e is the cost of edge e and f_e the flow through it. This is the so-called minimum cost circulation problem. It is obvious that if the network has no positive lower bounds, the zero flow is the optimal flow.

In the last section we converted a network with positive lower edge bounds into a network whose vertexes have demand/supply values (a.k.a. a transportation network), then solved a max-flow problem in the transportation network to get a legal flow in the original network. Since a network with positive lower edge bounds and its transportation network are equivalent, the min cost circulation is exactly the maximum flow of minimum cost in the transportation network. Therefore, the minimum cost circulation problem is the same as the min cost max flow problem.

Two facts are important for solving this problem.

1. A circulation is optimal (min-cost) iff there are no negative cost cycles in the residual network.

To understand this, suppose there is a negative cost cycle in the residual network of a circulation f. Then flow can be pushed around that cycle and added to f: the flow value stays the same while the cost drops, which means the cost of f was not minimal. See the references for a formal proof of both directions. What is important here is the converse: if f's residual network has no negative cycles, f is optimal, that is, f has the lowest possible cost among all flows carrying the same value.

2. In the equivalent transportation network, for an optimal flow f, if instead of finding any path from S to T we find the min cost path and use it to augment the flow, we get a new flow f' that is still optimal, but with a greater flow value.

To see this, we show that augmenting along a min cost path (call it p) in the residual network of an optimal flow f does not introduce negative cycles into the residual graph of the new flow. For if it did, p would not have been a min cost path: if w is the new negative cycle introduced, then some part of w overlaps with p; replacing that part of p with the rest of w gives another path from S to T with lower cost. Contradiction.

With the above two facts, an iterative algorithm is easy to see: start with an optimal flow f, find the min cost path in the residual network G_f, and use it to augment the flow, obtaining a new optimal flow f'. Continue the process until no path from S to T can be found.

There is one more complication: how do we find an initial optimal flow to start with? It turns out we just find an initial flow the same way as in the last section and check whether it is optimal. If it is, we are all set; if not, it means the cost is not bounded below: as we continue to increase the flow, the cost can drop further, and the min cost problem is ill-posed.

In terms of implementation, because of the possibility of negative cost edges, we need Bellman–Ford to find the shortest path; its complexity is O(VE). However, there is an improvement of Bellman–Ford called SPFA whose complexity is O(kE), where k is close to 2 on average but becomes V in the worst case. More importantly, it is very easy to implement: it is basically a BFS where you track whether a vertex is in the queue, and whenever a vertex's distance is shortened you put it back into the queue. Here is the pseudocode from the Wikipedia page:

procedure Shortest-Path-Faster-Algorithm(G, s)
  1    for each vertex v ≠ s in V(G)
  2        d(v) := ∞
  3    d(s) := 0
  4    offer s into Q
  5    while Q is not empty
  6        u := poll Q
  7        for each edge (u, v) in E(G)
  8            if d(u) + w(u, v) < d(v) then
  9                d(v) := d(u) + w(u, v)
 10                if v is not in Q then
 11                    offer v into Q
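
For concreteness, here is my own Haskell translation of that pseudocode (a sketch, assuming an adjacency-list graph with possibly negative edge weights and no negative cycles):

import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Sequence as Seq
import Data.Sequence (ViewL(..), (|>), viewl)

-- shortest distances from the source to every reachable vertex
spfa :: Ord v => M.Map v [(v, Int)] -> v -> M.Map v Int
spfa graph s = go (Seq.singleton s) (S.singleton s) (M.singleton s 0)
  where
    go queue inQueue dist = case viewl queue of
      EmptyL    -> dist
      u :< rest ->
        let du = dist M.! u
            relax (q, iq, d) (v, w)
              | du + w < M.findWithDefault maxBound v d =
                  ( if S.member v iq then q else q |> v   -- re-enqueue only if not queued
                  , S.insert v iq
                  , M.insert v (du + w) d )
              | otherwise = (q, iq, d)
            (q', iq', d') = foldl relax (rest, S.delete u inQueue, dist)
                                        (M.findWithDefault [] u graph)
        in go q' iq' d'

-- e.g. spfa (M.fromList [(1,[(2,5),(3,2)]), (3,[(2,1)]), (2,[])]) 1
--      == M.fromList [(1,0),(2,3),(3,2)]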

One last note: many materials suggest using prices (potentials) on vertexes. Among other benefits, they remove the negative edges in the residual graph, which enables Dijkstra's algorithm for the shortest path. I don't include this here because it is harder to grasp, and SPFA is very easy to implement with good performance; in a programming contest it is usually the better way to go. Of course, in real applications this might not be true.

Summary

Network flow problems are deep and rich, in that a large set of seemingly unrelated problems can be modelled as network flows. The Ford–Fulkerson method is the one most commonly used to solve them; it can be summarized as follows:

  1. Find a legal initial flow (the zero flow if there are no positive lower bounds; otherwise construct the transportation network to get one).
  2. Find a path from source to sink in the residual network (any path in the common case, the min cost path if we are solving a min cost problem).
  3. Augment the flow along it to get a new flow.
  4. Repeat 2–3 until no further path can be found.

The key to solving these problems is to realize that many of them need a bit of tweaking before they obviously become network problems; one must carefully design the capacity upper/lower bounds and the cost values to construct a network that solves the problem. And the same structure can be modelled by different networks (transportation network vs. positive lower bounds, etc.).

Categories, Functors and Free Monads

This post is based on an article from haskellforall.com. It's basically a collection of my own thoughts and notes from reading that article. Anyone interested in understanding free monads should read the original article first.

What is a category, in programmer’s term?

A category is a set of things together with a composition operator, and there is a left identity and a right identity for the operator. The composition operator satisfies the associative law ((a comp b) comp c = a comp (b comp c)). For example, all numbers with the addition operator (+) form a category, with the left and right identity being 0.

Functions form a category too, with the composition operator being (.), defined by (f . g) x = f (g x), and the left and right identity being id x = x.

Monadic functions form their own category as well, with the identity being

return :: (Monad m) => a -> m a

and the composition operator being the monadic (Kleisli) composition function

(<=<) :: (Monad m) => (b -> m c) -> (a -> m b) -> (a -> m c)
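
Its standard definition (this is how Control.Monad defines it) just runs the second function and binds the first onto the result:

f <=< g = \a -> g a >>= f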

What is a functor?

Now, what are functors? Functors are transformations that map one category into another, typically from simple categories to more ‘featureful’ ones. For example, the function length is actually a functor from the category formed by lists (with ++) to the category formed by the non-negative integers (with +). The list identity is [] and the non-negative integer identity is 0. length [] = 0 means the transformation maps the source identity to the target identity; this is functor law #2. length ([1] ++ [2,3]) = length [1] + length [2,3] means the transformation maps the source composition to the target composition; this is functor law #1. With these two laws the two categories are “perfectly matched”, and we can be sure that what the transformation produces is well defined.

Another easy-to-see example is concat as a functor from the category of lists to itself.

Functors are very general, and they show up everywhere you need to map from one category to another.

The Haskell type class Functor, however, captures just a special kind of functor, one that maps between categories of ordinary functions. In particular, it maps an ordinary function

a -> b

to a more “meaningful” function

f a -> f b

With this, programmers can reuse all the tools written for ordinary functions when they are dealing with a specific functor f.

Bonus: monad transformers form a category, and

lift :: (Monad m, MonadTrans t) => m a -> t m a

is a functor that transforms an ordinary monad to a monad transformer!

What is a free monad?

Now that we know what a functor is, it will help us understand free monads.

A free monad takes a functor and gives back a monad whose sequencing of that functor simply builds up a recursive data type.

If that is confusing, let us begin with the definition of Free.

data Free f r = Free (f (Free f r)) | Pure r

Note that this is a recursive data type. And now the monad definition of Free:

instance (Functor f) => Monad (Free f) where
    return = Pure
    -- note: x :: f (Free f r), g :: r -> Free f r', so (>>= g) :: Free f r -> Free f r'
    (Free x) >>= g = Free (fmap (>>= g) x)
    (Pure x) >>= g = g x

If you look closely, you will notice that the bind function of a free monad just keeps applying the monadic function to the recursive part of Free. Note that normally a monad's bind function can be defined as:

(>>=) = (join .) . flip fmap
join :: m (m a) -> m a

so that the monadic context is collapsed into a single layer.

With the free monad, the context (Free f) does not collapse; since join is never used, the free monad preserves the structure of the monadic context. That is, it gives a monadic way to construct the recursive type Free f r.

For example, if we have a type type MaybeFree a = Free Maybe a, and we want to construct a value with 3 layers of the Free Maybe type constructor, we could write threeLayer = Free (Just (Free (Just (Free (Just (Pure 1)))))). Constructing recursive data types like this is cumbersome and counter-intuitive. With monadic construction, we can instead write:

oneLayer :: MaybeFree Int
oneLayer = Free (Just (Pure 1))
-- or
oneLayer = liftF (Just 1)
threeLayer' :: MaybeFree Int
threeLayer' = do oneLayer
                 oneLayer
                 oneLayer
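
liftF is the usual helper from the free package that wraps one functor layer as a Free value; if you are not importing it, it can be defined as:

liftF :: Functor f => f a -> Free f a
liftF = Free . fmap Pure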

The monadic way is much clearer. The example above shows only a very simple usage; we have the entire Haskell monad machinery at our disposal for building the recursive value, for example accumulating the outer layers' values and summing them up as the end result.

If you still have not seen the true value of the free monad, notice that instead of getting back a one-layer monadic result (where all the computation has already happened by the time you get it), we now get back a nested recursive data structure, which means no real computation has occurred yet. This is the true value of the free monad: the computation is deferred, and an interpreter will later make sense of the recursive value. Since we have the value (really an AST) ready, we can implement multiple interpreters for different purposes: pretty printing, static analysis, or actually running the program.

Another example, used on the free package's API page, is a binary tree with data stored only in the leaves.

data Tree a = Bin (Tree a) (Tree a) | Tip a 

instance (Show a) => Show (Tree a) where
    show (Bin l r) = "(bin " ++ show l ++ " " ++ show r ++ ")";
    show (Tip a) = "(Tip " ++ show a ++ ")"

instance Monad Tree where
    return = Tip
    Tip a >>= f = f a
    Bin l r >>= f = Bin (l >>= f) (r >>= f)

tip = Tip
bin = Bin (tip 1) (tip 2)

build = do x <- bin
           y <- bin
           tip (2 * y)

Note that here we manually implemented the free-monad behavior for Tree, where we could instead define type Tree a = Free Pair a (with data Pair a = Pair a a as the functor). Note that when build is evaluated it demonstrates a forking behavior automatically, and the end result is a full binary tree with 4 leaves. Note also that if we want to build more complex trees, we can do so, since we already have the values x and y from the outer layers.

*Main> build
(bin (bin (Tip 2) (Tip 4)) (bin (Tip 2) (Tip 4)))

MaybeFree and the binary tree are simple examples. But imagine that instead of just the Just or Bin data constructors, we have a type with a rich set of constructors; this gives us a way to interpret the free monad according to those constructors, and if the constructors correspond to language primitives, we get a full interpreter. To show the true value of the free monad, we now give a more complex example, stolen from “you could have invented free monads”.

A true free monad interpreter

Suppose we run an online game system. We want to give users the ability to program their agents but keep them from doing things we don't want them to do… basically, we need a sandbox.

We create a recursive data type to capture the interactions a user can have with our system:

data Interaction next =
    Look Direction (Image -> next)
  | Fire Direction next
  | ReadLine (String -> next)
  | WriteLine String (Bool -> next)

Note the use of functions in the recursive part of the type; these continuations give us the ability to chain the interactions and move to the next one.

Now, of course, Interaction needs to be a functor.

data Interaction ... deriving (Functor)
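
Deriving Functor needs the DeriveFunctor extension; written by hand, the instance just maps the given function over the continuation in each constructor:

instance Functor Interaction where
    fmap f (Look dir g)    = Look dir (f . g)
    fmap f (Fire dir next) = Fire dir (f next)
    fmap f (ReadLine g)    = ReadLine (f . g)
    fmap f (WriteLine s g) = WriteLine s (f . g)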

We then define a type synonym to represent the free monad:

type Program = Free Interaction

Because of the continuation functions inside Interaction, we can build a Program a and encode the user logic in those functions:

easyToAnger = Free $ ReadLine $ \s -> case s of
    "No" -> Free $ Fire Forward
          $ Free $ WriteLine "Take that!" (\_ -> easyToAnger)
    _    -> easyToAnger

This corresponds to the manual construction of recursive data types, and it is hard to grasp.

However, since we use the free monad, we can build this with do notation:

look :: Direction -> Program Image
look dir = liftF (Look dir id)

fire :: Direction -> Program ()
fire dir = liftF (Fire dir ())

readLine :: Program String
readLine = liftF (ReadLine id)

writeLine :: String -> Program Bool
writeLine s = liftF (WriteLine s id)

easyToAnger :: Program a
easyToAnger = forever $ do
    str <- readLine
    when (str == "No") $ do
        fire Forward
        -- Ignore the Bool returned by writeLine
        _ <- writeLine "Take that!"
        return ()

easyToAnger now constructs the free monad value in a monadic way! We can do whatever we want with it: interpret and run it, display the strategy, etc. Here is an interpreter for completeness.

interpret :: Program r -> Game r
interpret prog = case prog of
    Free (Look dir g) -> do
        img <- collectImage dir
        interpret (g img)
    Free (Fire dir next) -> do
        sendBullet dir
        interpret next
    Free (ReadLine g) -> do
        str <- getChatLine
        interpret (g str)
    Free (WriteLine s g) -> do
        putChatLine s
        interpret (g True)
    Pure r -> return r

Conclusion

The key to grasping the free monad is to realize its connection to recursive data types and how it helps to build a value of a recursive type in a monadic way. By not “collapsing” or “joining” the monadic context, the free monad preserves the structure and order of the monadic actions used. After constructing the free monad value (the recursive structure), we have the flexibility to treat it however we want, while normal monads implicitly require “collapsing”, i.e. performing the computation on the spot. Because of this, the free monad is of great help for building embedded DSL interpreters, or anywhere you want to construct a recursive value in a monadic way!

Haskell Monad and Monadic Parser

So this is a blog post to explain the idea of monadic parsers in Haskell. I am basically stealing the code and structure from this great paper: http://www.cs.uwyo.edu/~jlc/courses/3015/parser_pearl.pdf, but using my own words so that I understand it better.

Monads are a dreadful topic for anyone new to Haskell. Those who are experienced with other programming languages, in particular, find them hard to grasp. There are many tutorials trying to explain them, and many people are still confused. But monads should not be so hard… because a monad is just a way for a purely functional lazy language to specify a sequential evaluation order, something that is simply natural in other languages.

So, Haskell is a purely functional language with laziness as the default behavior. Because of this, the evaluation order of two computations A and B is not specified; the compiler can choose whatever order it likes to evaluate them, when they need to be evaluated at all. This is a nice thing to have, except when they are meant to be evaluated in a particular order.

Let's say A has to be evaluated after B. In normal languages, you call B first and then A, and the compiler will do the right thing (modern compilers reorder too, but only when they are sure A and B do not affect each other). How do we do this in Haskell, then? Easy: you make the computation of A depend on B by giving that computation a parameter that is the result of B. This way, in order to perform the computation of A, you have to obtain the result of B and supply it. This pattern of chaining computations is what a monad captures.

I think monads were originally introduced to Haskell as a way to cope with IO actions, where the order is important. But later, people found they were everywhere: reporting an error in a series of computations as soon as it occurs, maintaining state across a list of computations, etc. It turns out that this way of explicitly specifying evaluation order (plus the fact that Haskell is purely functional) allows for a better organization of computations, despite initially looking like something people had to add to compensate for something missing from the language. One example of that advantage is monadic parsers.

What is a parser? What does it do? A parser can be viewed as a function that takes an input string, consumes a prefix of the input and makes some sense out of that prefix. It returns what is left of the input along with the ‘sense’ it made of the prefix. In fact, I should probably say it consumes all possible prefixes it can make sense of and returns the results as a list. Here is a definition:

newtype Parser a = Parser (String -> [(a, String)])

A Parser a wraps a function that tries to consume a prefix of the input, converts every valid prefix to a value of type a, and returns the converted results together with the remaining string.

Parsing is a sequential computation: the next parser can only be applied to what is left over by the previous parser. Therefore, we make Parser an instance of Monad.

instance Monad Parser where
    return a = Parser (\cs -> [(a, cs)])
    p >>= f = Parser (\cs -> concat [parse (f a) cs' | 
                                     (a, cs') <- parse p cs])

parse :: Parser a -> String -> [(a,String)]
parse (Parser p) = p

So, return gives back a parsed value without consuming any input. >>= (bind) binds a first parser to a monadic function that wraps a second parser; it returns a new Parser value that first applies the input to the first parser and hands each result to the monadic function. The monadic function uses the result to produce a new parser based on the second parser (it can simply return the second parser, or a parser that combines the results of the first and the second). Once we have that parser, we apply it to the remaining string left by the first parser. Note that the result of >>= is a new parser (function) that combines the two parsers; this makes it extremely easy to compose parsers.

Since Parser is now a monad, we have a way of sequentially applying a list of parsers; in fact, the chained parsers are just one big parser that will do the right thing when applied to an input string.

We now handle the case of an or relationship: we have two parsers for the same input, and either one returning a valid result lets the parsing proceed. For that, we first need to define a result that indicates the parse is invalid. We represent an invalid result as an empty list. Why? Because an empty list means there is nothing left to feed to the parsers that have not yet been applied in the chain, so they will just return empty lists too. In other words, as soon as there is an invalid result, the whole parse returns an empty list.

class MonadZero m where
    zero :: m a

instance MonadZero Parser where
    zero = Parser (\_ -> [])

Now we give the definition of the or combinator. With zero defined, it is easy: we just concatenate the two lists returned by both parsers.

class MonadPlus m where
    (.++) :: m a -> m a -> m a

instance MonadPlus Parser where
    p .++ q = Parser (\cs -> parse p cs ++ parse q cs)

Note that .++ is exhaustive, meaning it will try all possible matches for both parsers. We can define a new combinator that stops as soon as there is a valid result, by exploiting Haskell's laziness.

p +++ q = Parser (\cs -> case parse (p .++ q) cs of
                      [] -> []
                      (x:xs) -> [x])

Now we have a fully functioning parser framework. We can build other helper parsers and combinators on top of just this infrastructure!

Below are some commonly useful parsers and combinators:
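
These rely on the paper's item parser, which consumes exactly one character (and fails on empty input):

item :: Parser Char
item = Parser (\cs -> case cs of
                        ""      -> []
                        (c:cs') -> [(c, cs')])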

sat :: (Char -> Bool) -> Parser Char
sat pre = do c <- item
             if pre c then return c else zero

char :: Char -> Parser Char
char c = sat (==c)

string :: String -> Parser String
string "" = return ""
string (c:cs) = do x <- char c
                   xs <- string cs
                   return (x:xs)

-- match the given parser 0 to INF times. Note the use of +++ to catch the case of no matches.
many :: Parser a -> Parser [a]
many p = many1 p +++ return []

-- returns zero (parse failure) on failure, which causes all subsequent parsers to be skipped. Note
-- the use of `many`: this means `many1` matches 1 to INF times.
many1 :: Parser a -> Parser [a]
many1 p = do a <- p
             as <- many p
             return (a:as)

sepBy :: Parser a -> Parser b -> Parser [a]
p `sepBy` sep = (p `sepBy1` sep) +++ return []

sepBy1 :: Parser a -> Parser b -> Parser [a]
p `sepBy1` sep = do a <- p
                    remain <- many (do {sep; p})
                    return (a:remain)

chainl :: Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op def = (p `chainl1` op) +++ return def

chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
p `chainl1` op = do a <- p
                    rest a
                         where rest a = (do f <- op
                                            b <- p
                                            rest (f a b))
                                        +++ return a

With a couple more parsers for lexical analysis, we are all set for a fully functional arithmetic interpreter. See the original paper for details.
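
As a tiny sketch of how these combinators compose (my own example, not from the paper; isDigit comes from Data.Char), here is a parser that sums digits separated by '+':

-- assumes: import Data.Char (isDigit)
digit :: Parser Int
digit = do c <- sat isDigit
           return (fromEnum c - fromEnum '0')

sumExpr :: Parser Int
sumExpr = digit `chainl1` (do {char '+'; return (+)})

-- parse sumExpr "1+2+3"  gives  [(6,"")]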

Haskell monadic parsers are very useful for casual parsing, where performance is not critical and clarity is valued more. With the power of monads, pure functions and function combinators, writing a parser is merely translating the grammar definition from English to Haskell.

 

Art and Craft of Problem Solving

Strategies

mental toughness

– if you cannot solve the original problem, solve an easier one.

creativity

– Learn to shamelessly appropriate new ideas and make them your own!

– spend a minute to make sure we are not imposing unnecessary rules on ourselves; try to bend or break the rules.

– Toughen up, loosen up and practice!

To get started

Orientation

– read the problem carefully

– classify the problem, try to think of a similar problem

– list out the hypothesis and conclusion

– try brainstorming

  • think about notation
  • does a particular method seem plausible
  • can you guess a solution?
  • what are the important key words and concepts?

– do orientation again!

– being able to restate the problem in your own words marks the end of orientation.

Now I’m oriented

– get your hands dirty

– penultimate step, especially when the end state/conclusion is well described

– Wishful thinking and “make it easier”: what makes the problem so difficult? Can you solve it without that part, or after replacing it with an easier statement? What does that tell you?

methods of arguments

– good argument skills can steer and modify your investigation

– deduction

– contradiction

  • What happens if i negate the conclusion? Will it give me something easier to work on?
  • Anything furthers investigation is good!
  • ex 2.3.22: I took the negation of the conclusion wrong!!

– mathematical induction

  • standard and strong induction
  • the point of strong induction is we should be flexible in defining hypotheses and conclusion

other important strategies

– draw a picture

– recast the problem in other ways, crossovers. While reformulation is strategic, its implementation is tactical, requiring specific knowledge, see chapter 4.

– Change your point of view.

Tactics

symmetry

– considering symmetry in a problem is almost always helpful

– it is OK if you only find an almost-symmetric case; sometimes what you find is just “harmony”, and it will help you even if you don't know how to define exactly what you found.

– The application is broad, from geometry to algebra, to equations and inequalities.

extreme principle

– assume the elements of the problem are in order, consider the largest or smallest element, as they may be constrained in interesting ways

pigeonhole principle

– if p pigeons are put into h holes, then at least one hole contains at least ⌈p/h⌉ pigeons, i.e. the smallest integer no less than p/h.

  • recognize that the pigeonhole principle might help solve the problem
  • find out what the pigeons and the holes must be; this is usually the crux move (sometimes it is the holes that are hard to find)
  • usually there is more to be done after that; work on the result from step 2.

invariants

– A lot of problems are about extracting information from chaos. Invariants are aspects of a problem that don't change while other properties change.

– Symmetry is actually a special kind of invariant.

– Try to find easy “invariants”; change your problem to get to simple numbers like 0 or 1.

– Parity is an important invariant to look for, especially when integers are involved.

– Modular arithmetic is parity++

– problems that invariants principle usually help:

  • can an end state be reached?
  • find all end states
  • is there a convergence to an end state?
  • find all periods with or without tails

 

Crossovers

 

Generating Function

 

  • since x^m * x^n = x^{m+n}, multiplying generating functions corresponds to convolving their coefficient sequences
  • local knowledge of the coefficients and global knowledge of the GF can sometimes provide information about each other, by plugging in values or applying transformations like differentiation
  • multiplying or dividing by x gives the GF of the same series shifted one position in either direction (see the identities below)
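
For instance, if F(x) = \sum_{n \ge 0} a_n x^n, then:

x \cdot F(x) = \sum_{n \ge 1} a_{n-1} x^n                 (coefficients shifted one position up)
\frac{F(x) - a_0}{x} = \sum_{n \ge 0} a_{n+1} x^n          (coefficients shifted one position down)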

Summaries on Theory of Poker

Chapter Three – The fundamental theorem of Poker

The best way to play is the way you would play if you knew your opponents' cards. A good player should always try to play that way, and to make opponents play differently from how they would if they knew your cards.

Chapter Four – The Ante Structure

The size of the ante determines the way to play: a large ante calls for loose play while a small ante calls for tight play.

Large – 15% or more of the average future bets

– play loose because of the better pot odds; it costs too much to wait for a big hand, opponents are playing loose, and they will give you no action when you do have a big hand. Play loose at the beginning and in later rounds, because the weaker starting requirements carry over.

– you should try to steal the ante when you find somebody playing tight.

– when you have a good hand in a large-ante game, you should raise instead of slowplaying, because it makes your opponents' calls expensive relative to the pot odds and sometimes makes them call with improper odds.

Small – 5% or less of the average future bets

– play tight, because your opponents play tight, which means the average hand value is high.

– slowplay when you have a good hand, to draw people in, because the pot odds are not good with a low ante.

– when the ante and initial bets are small, you should play loose on the initial round, then fold if your hand has not improved, because the price to develop a hand is cheap in this case.

Chapter Five – Pot Odds

– Calling when all the cards are out requires the ability to read players and hands, something learned only in the field.

– When there are more cards to come, consider:

  • favorable cards vs. unseen cards (your chance of improving)
  • Notice the exposed cards
  • Position is also important in determining pot odds; different positions give different pot odds, something you need to experience in the field
  • Extra outs improve the odds of making your hand quite a bit.
  • The probability that you lose even after you make your hand; it decreases your effective pot odds.

Chapter Six – Effective Odds

– The real pot odds when you include possible future bets with more than one card to come

– Think about the money you need to bet to see your hand through vs. the money you win when you make your hand.

– When the last card is almost free, say you and your opponent are all-in or almost all-in, there is no need to apply effective odds.

– Sometimes, when the bet sizes on later rounds increase a lot over the current round, it makes sense to see the first card since it is cheap.

Ex: You have a 4-flush after the flop; the chance of making it by the river is 1 - \frac{38*37}{47*46} \approx \frac{9}{25}. It is a 10-20 game with 20 in the pot, and your opponent bets 10, so the effective odds are 50-to-30. The chance of making it on the next card alone is about 1/5, while the immediate pot odds are 3-to-1, still not worth calling for just the next round. You should fold.

However, suppose the current pot is 40, the game is 10-50, and your opponent has already bet 10. The immediate pot odds are 5-to-1 and the effective odds are 5-to-3. The odds against making your hand are about 16-to-9, which makes it not worth it to see the hand through to the end. But the odds against making it on the next card alone are actually slightly less than 5-to-1 (38-to-9), so the immediate 5-to-1 pot odds justify calling the next bet; fold on the following round if you don't hit your suit.

Chapter Seven – Implied Odds and Reverse Implied Odds
– In the above example, second scenario, it is justify to call when the effective odds are low because the later beds increase a lot, which make the implied odds good enough to call. Implied odds in this case is 100-to-10, meaning your expected winning is 100, and you only need to spend 10 to keep it going. The best fold opportunity is when the next card does not hit.

– Implied odds is the total expected winning vs. the price you need to pay at present. Generally, when later bet size increase a lot, implied odds also high.

– elements to consider

  • The size of future bets
  • Your hand needs to be well hidden (otherwise people won't pay you off)
  • The ability of your opponents (weaker opponents give you higher implied odds)
  • Your hand should be strong enough to hold up after you make it

– Reverse implied odds arise when you are not sure where you stand and have only a small chance of improving enough to beat your opponent, while your opponent can back off at any time (if he does not improve). So when you lose, you lose the maximum; when you win, you win the minimum. (You had better have an accurate estimate of where you stand and of your chance of improving; more reading is needed.)

Chapter Eight – The Value of Deception

– You would surely lose if your opponents could see your cards; the goal of deception is to induce your opponents to make mistakes.

– Use more deception when you play against tough opponents (or good hand readers), and play straightforwardly when your opponents are weak.

– Deception becomes less useful with a large pot, since the pot size, rather than what they think you hold, becomes the main factor in whether people call.

– The bigger the difference between early-round and later-round bet sizes, the more worthwhile it is to disguise a good hand in the early rounds (good implied odds!).

– The more opponents you have, the less you gain by disguising your hand: it is less likely you can make everyone fold when you have a weak hand; you may lose future bets if you do not raise with a strong hand (not fully understood!); and you let other people in too cheaply, so one of them might get a chance to beat you!

– To summarize: use deception against tough opponents (hand readers), when the pot is small, when the future bets are large, when there are only 1 or 2 opponents left, or when you have a monster hand.

Chapter Nine – Win the Big Pots Right Away

– When the pot is large enough, it is correct to try to make your opponents fold even when you have the best hand, because the pot odds are large enough for them to call correctly, and you still stand a chance to lose if they improve (in other words, you should make them feel it is not worth calling).

– If you do not bet with the best hand, you are giving your opponents a free chance to improve!

– You should always try to win a large pot right away when you have the best or second-best hand, unless what you have is a monster.

Chapter Ten – The Free Card

– Giving a free card means giving your opponent a free chance to beat you.

– You should only consider giving a free card when you have a monster hand and the pot is small.

– You can try to get a free card for yourself by raising in an early round, or by tricks and ploys.

– Being last to act can help you get a free card. When you are first to act, you should avoid giving a free card.

– With marginal hands, the facts below argue for betting/raising:

  • Your chances of improving are relatively good
  • The pot is large
  • Your opponent's chance of improving is good
  • The odds that you will beat your opponent even after you complete your hand are not good

– Overall, failing to bet (giving a free card) can cost you the whole pot, while getting called after betting only costs you the bet or raise.

Chapter Eleven – The Semi-Bluff

– A semi-bluff is a bet with a hand that, if called, is probably not the best hand at the moment, but still has a reasonable chance to outdraw the other hands on future cards.

– If you have a hand you would call with anyway, it is usually better to bet or raise yourself: it adds the probability that your opponent folds incorrectly, and therefore increases your expected income.

– In stud, where there are no shared community cards, it can still be good to bet even if you know your opponent will not fold now, because the bet represents a bigger hand; if your hand visibly improves on the next round, he might then fold. Obviously this does not work the same way in Hold'em.

– Why semi-bluff:

  • The chance that your opponent folds incorrectly increases
  • You avoid giving a free card to your opponent (valuable even when you actually have the best hand)
  • It adds deceptiveness: it disguises the hand you are drawing to, making a moderate card look dangerous and a really good card look insignificant
  • It often gets you a free card on the next round, since your opponent will tend to check to you

– Semi-bluffing, betting with a marginal hand, and not giving a worse hand a free card are all cases of one general principle: it is better to be betting than just calling. Even getting caught in a semi-bluff has value as future deception.

– When the pot is larger, betting becomes easier to justify and you depend less on the chance that your opponent folds incorrectly. Game theory suggests otherwise, but in practice semi-bluffs and bluffs work out better when the pot is larger.

– You should semi-bluff less when you are sure to be called; that removes the combined value of "opponent folds" plus "hand improves" and leaves you only one way to win: improving your hand.

– Last position is also not a good spot to semi-bluff: you can simply take the free card, and if you bet, somebody might check-raise you.

Chapter Twelve – Defense Against the Semi-Bluff

– The semi-bluff is powerful; defending against it is hard and therefore very important.

– Frequently you should simply fold when you think someone is semi-bluffing and the pot is small. Consider other responses only when (1) the pot is not small and (2) your hand has a good chance of beating him.

– It is usually not correct to just call with a fair hand when you think your opponent is semi-bluffing; he has too many ways to beat you:

  • He could already have you beaten with a better hand
  • He may outdraw you on later cards
  • A scary card on a later round may force you to fold

– A curious case arises when both you and your opponent appear to have promising (improvable) hands and you think your opponent is semi-bluffing.

– The correct play is usually to raise his bet. You pressure him to fold if he does not have a good hand; and if he does, remember you are raising with a promising hand yourself, so your chances are still not bad when he calls.

– An extra advantage is that it may deter your opponent from semi-bluffing against you in the future.

– In most cases you still have to fold (no promising hand), but when you do have a good hand, raise instead of calling.

– With a medium-value hand, consider the items below to decide whether to fold or to raise:

  • The chance your opponent is semi-bluffing or bluffing
  • The chance your opponent outdraws you, assuming he has the worse hand at present
  • The chance you outdraw your opponent, assuming he has the better hand at present

– The larger items 1 and 3 are, the more inclined you should be to raise; the larger item 2 is, the less you should consider raising.

– There are times when calling is the best play, though:

  • When the pot is large, you might want to call a semi-bluff with a reasonable hand because you do not want to give away a large pot; raising will not benefit you, because your opponent will most likely call the raise given the large pot.
  • When your opponent is betting on the come, you can call and then bet right out on later rounds if the card he needs does not come. If you raise instead, he is most likely to call anyway when drawing to a straight or a flush. When the card he needs does come, you should check or fold (unless the pot is large enough for you to stay!).
  • A delayed semi-bluff raise is useful against tough players: calling first and then raising on a later round suggests you have made your hand; they will usually fold or check, and you can get free cards on later rounds to try to beat them (more to be understood).

– Deciding how to respond to a possible semi-bluff or bluff (fold, call, or raise; bet or check later) is among the trickiest things in poker.

Chapter Thirteen – Raising

– Raise to get more money in the pot. The better your hand is, the less you should raise in the early rounds. In later rounds, however, it is usually fine to give your hand away by raising, since the pot is usually large enough to make opponents call; if they fold, you win for sure, and if they call, you are raising with a good hand and are still the favorite to beat them.

– When all the cards are out, you may not want to raise if you are sure the players behind you hold worse hands but might call the original bet; raising would make them fold, which you do not want. But if you are not sure (especially if they would fold anyway), raise instead of going for the overcalls.

– A raise is also a way to cut down the pot odds for the players after you: you can make them fold or call against the odds. With what you think is the best hand, you should raise to cut down the odds that would justify a call, so you either drive them out and win for sure, or, best of all, they call against the odds.

– A raise can be used as a bluff or semi-bluff, and as a defense against a semi-bluff. As noted in previous chapters, you can win by (a) your opponents folding; (b) outdrawing them on later rounds; (c) getting free cards on later rounds; or (d) in the worst case, having increased the size of a pot you are the favorite to win.

– A raise can get you a free card if you are in the right position, and it also gives you information about your opponents (whether they call or re-raise). In general, though, you should not raise just to get a free card next round or to gain information; there should be other factors justifying the raise (cutting the pot odds, semi-bluffing, or at least getting more money into the pot).

– A raise can drive out worse hands when you think you have the second-best hand, and therefore increase your chance to win (you are still second best, but now with proper pot odds). A raise can even drive better hands out, when you and a player to your right both look very strong, which again improves your chances. You have to be fairly sure they will fold, though; otherwise you should just fold yourself. Calling is the worst play in this case.

– In many cases calling is the worst choice: you rely purely on outdrawing your opponents, while by raising you give yourself more ways to win. And if your hand does not look promising at all, especially against many players, just fold. See the last section of the previous chapter for when calling is right.

Chapter Fourteen – Check-Raising

– Check-raising is checking a hand with the intention of raising in the same round. It can drive opponents out or even win the pot right there.

– Two conditions are required:

  • You are pretty sure you have the best hand, but not one strong enough to slowplay
  • Someone behind you will bet! This is a must; otherwise you lose a bet you could have won and give a free card away

– Position determines how it is played:

  • With a reasonably good hand, you want the probable bettor to your right, so that you can raise his bet and make the other players fold.
  • With a much stronger hand, you want the probable bettor to your left, so that the other players call his bet first; then you raise, collecting all of those calls.
  • People will usually fold to a double bet but will call a single bet twice, which is what makes the two plays above different.

– You can also check-raise with the second-best hand in order to drive the other opponents out and increase your chance of winning. In this case, you want the best hand to your right.

In short: a strong hand, a likely bettor, and the right position.

Chapter Fifteen – Slowplaying

– Slowplaying is for a very strong hand.

– Use it only when the pot is small and your strength is not apparent. If either condition is not met, you lose the deceptive value of slowplaying. What's more, with a large pot you do not need to slowplay to get more money in, since your opponents will call anyway.

– Ideally, a slowplay lets your opponents make the hand they are drawing to and still lose.

– Watch out for the possibility that somebody can outdraw you! If there is a realistic draw against you, quit slowplaying and stop giving free or cheap cards.

Chapter Sixteen – Loose and Tight Play

– Good players adjust between loose and tight according to the game and to individual opponents.

– In a tight game / against tight players:

  • Loosen up on bluffs/semi-bluffs; they are more likely to fold to a bluff or semi-bluff, or when a scary card comes later
  • Tighten up on legitimate hands; their starting requirements are high
  • Play fewer drawing hands, because the pot odds are usually not good and you will not get paid much even when you make your hand

– In a loose game / against loose players:

  • Tighten up on bluffs/semi-bluffs, as opponents do not like to fold, which makes outdrawing them the only way to win
  • Loosen up on legitimate hands, as opponents' requirements to play are low and their average hand cannot be good
  • Play more drawing hands, because the pot odds are usually good

Chapter Seventeen – Position

– Position is an underrated but important factor in poker.

– You want to be last to act because:

  • You simply have more information when making a decision
  • With a fair-to-good hand you would like to call, and in last position you have no fear of someone raising behind you
  • With a big hand, in early or middle position you must decide whether to check-raise, bet, raise (driving some players out), or just call to keep people in; in last position you only need to decide between raising and calling
  • With a mediocre hand, in last position you can call without fearing a raise, and sometimes players with better hands in front of you will check, giving you a free card
  • With only two players left, it is more advantageous to be last: you simply call or raise, while the first to act has to choose between check-raising and betting, both of which can lose value
  • The only real threat is an opponent who check-raises

– Advantages of first position: check-raising, betting and re-raising, and driving players out.

– In last position you can play loosely, because there will be no raises behind you, so try to develop marginal hands; in first position you need to tighten up and fold almost all marginal or mediocre hands, because raises behind you are very likely and would cost you unnecessary bets.

– With a strong hand, being last can win you a large pot: just wait, then raise, collecting single bets from those who fold and double bets from those who call. Being first or early, you either raise and lose the bets of those who fold, or just call and give your opponents a cheap price to develop their hands.

– Note: next to first position with a marginal hand, it is usually not worth calling; even if you make a big hand, your implied odds are small because the others will fold when you raise. So in most cases give up marginal hands when you are next to first.

– You usually want tight players after you and loose players before you. (Why? Fewer surprises from the loose ones, since they act before you, and you can trap them with semi-bluffs and bluffs.)

Chapter Eighteen – Bluffing

– Bluffing is no more important than playing legitimate hands correctly.

– In theory, how often a bluff needs to succeed is determined by the pot odds you are getting on it.

– In the early rounds a semi-bluff is preferred over a pure bluff. If you bluff outright anyway, fold when somebody raises.

– In the early rounds your bluff is usually called, so you need to work with the effective pot odds rather than the immediate ones: the chance your opponent folds should be weighed against the effective odds you are getting.

– It takes experience to know when to keep firing and when to give up. If your opponent calls your first bet, he probably has something; quit bluffing if the next card seems to improve his hand, otherwise you may continue.

– Bluffing when all the cards are out can only be learned from experience; you need to learn to read hands and read players.

– First position is better for a bluff than second: if you are in second position and your opponent checks, it usually means he has a little something (otherwise he would bet, for value or as a bluff), and when you then bluff, he may read it as an attack on the weakness he just showed and call.

– When you did not make your hand and you suspect your opponent did not make his either, you should bet, even with something like A-J high: if he really missed, he will fold, so your chances improve. On the other hand, if you made a small hand and you suspect your opponent is also small, check and call rather than bet: betting only makes him call with a better hand or fold a worse one, while checking and calling can induce him to bet a weak hand that yours will beat.

– Bluffing against more than one opponent is usually not good, as the chance that all of them fold is small. The player in the middle is relatively easier to make fold, though.

– The choice between bluffing and betting for value usually comes on the last round, when you have either made your hand or not. It is usually incorrect to do neither: when you missed, bluff; when you made your hand, bet for value.

– Even if you get caught bluffing, it will help you get a lot more calls later; similarly, a bad call shown down in the early rounds will help you get bluffs through later.

Chapter Nineteen – Game Theory and Bluffing

– An optimum bluffing strategy (it is actually a kind of semi-bluff) can be described as follows:

  • Randomize your bluffs by tying the decision to the cards you catch rather than to a whim.
  • Pick your bluffing cards so that the ratio of legitimate made hands to bluffs equals the pot odds your opponent is getting; this guarantees a mathematically maximal expectation regardless of your opponent's strategy. (A worked example follows this list.)
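
A worked example with made-up numbers (mine, not from the book): suppose on the end the pot is 100 and the bet size is 20. Your opponent is getting \frac{100+20}{20} = 6-to-1 on a call, so arrange your betting so that made hands : bluffs = 6 : 1, for example by betting the 12 cards that complete your hand and also bluffing on 2 pre-chosen blanks. His call then breaks even whatever he does: \frac{1}{7} \cdot 120 - \frac{6}{7} \cdot 20 = 0.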

– This strategy is best against expert players; against more "average" players, shift the bluff-to-made-hand ratio according to their playing style.

– You can use game theory to determine whether to call a possible bluff, especially when your hand can only beat a bluff:

  • Figure out the pot odds your opponent is getting on his bet
  • Randomize your calls so that the ratio of hands you fold to hands you call with equals the ratio of your opponent's bet to the pot he wins if you fold

– The above strategy can only reduce your loss to a minimum; there is no profitable response to an optimum bluffing strategy.

– You should only apply game-theory bluffing/calling when:

  • You are against players better than you, or you simply do not know the players
  • The bettor (you or your opponent) is clearly either bluffing or holding the best hand (as with a flush draw); if he may be betting a legitimate but possibly second-best hand, apply "heads-up on the end" from chapter 21

Chapter Twenty – Inducing and Stopping bluffs

– When you induce a bluff, you must always call when your opponent bets.

– When you stop a bluff, you must always fold when your opponent bets, because you should only be trying to stop a bluff with a weak hand; stopping people from bluffing when you have a good hand is a mistake.

– Needs to be revisited.

Chapter Twenty-One – Heads Up on the End

– Bluffing on the end is like bluffing at any other time: weigh the pot odds against your estimate of the probability that your opponent will fold. Bluff-raising is more difficult, since most people, except really tough players, will call a raise with a not-too-bad hand. (By the same token, you can raise with a pretty good hand to lure in more money.)

– Last position, when your opponent has checked: bet only if you figure to win often enough when your bet is called or raised (he has made something or is check-raise bluffing). A bet that merely makes him fold gains nothing, since you would have won the pot anyway; the bet only matters in the cases where it is called or raised.

– Last position, when your opponent has bet: fold, call, or raise based on your chance of winning (the probability he made his hand, the chance you beat it, and the chance he is bluffing) versus the pot odds you are getting. Average players will almost certainly call if you raise, so raise only when you are a big favorite to win; against very, very good players you may raise with somewhat less, hoping they make a tough (incorrect) fold.

– First position with a good hand: consider check-raising if you think the chance your opponent bets after your check and then calls your raise is more than half the chance he would call if you simply bet (a successful check-raise wins two bets instead of one). This works better against reasonably good players; weak players often will not bet when you check, and the same goes for tough players (they follow the rule in the second item above).

– If you have a good hand but check-raising might not work, just bet.

– If you have a favorable hand, but not one strong enough to check-raise:

  • Bet if your opponent will call with more hands than he would bet after your check; most players are like this
  • Check and call if your opponent will bet more hands than he would call with; these are players more prone to bluffing

– If you have an underdog hand:

  • Bet if your opponent will call with more hands than he would bet
  • Check and call if your opponent will bet with more hands than he would call
  • Check and fold if your opponent never bets with hands worse than yours

– The key in first position is to recognize your opponent's style: is he more likely to call, to bet after a check, to call a check-raise? Combine that with how favorable your hand is, and maximize the expected return.

Chapter Twenty-Two – Reading Hands

– Analyze the meaning of each check, bet, and raise, and combine them with the exposed cards throughout the hand; that is how a hand is read.

– Start with guesses about what your opponent might have, then eliminate possibilities based on how he has played. Would he have played this way with a medium, good, or great hand? Does it look more like a bluff?

– A complementary method is to work backward from the end of the hand.

– After the elimination, use math to estimate a probability distribution over the types of hands your opponent might have.

– With multiple players in the game, tighten up when someone before you has already called the initial bet: it is highly unlikely that both players are bluffing, so assess your hand more cautiously; its value has decreased. If the bet is raised, you should usually fold, unless you have a great hand and were slowplaying or planning a check-raise (not likely, though, since you are not in an early position).

Chapter Twenty-Three – The Psychology of Poker

– Think about what your opponent thinks you have.

  • If you think he thinks you have a calling hand, yet he still bets, he is very likely betting for value; consider folding an average-to-good hand.
  • If you think he thinks you have a weak hand and he bets, he might be bluffing; consider calling.
  • If you think he thinks you have a good hand, you may try bluffing with a weak hand.
  • If you think he thinks you have a weak hand, do not try to bluff, because you will get caught; bet for value instead.

– Against expert-level players, applying game theory is required. And be aware of your opponents' level of thinking; you cannot get too many levels away from them.

Chapter Twenty-Four – Analysis at the Table

– You must be quick at the table; thinking too long gives away information about your hand. Sometimes, though, you may want to pause deliberately when there is no need to.

– In theory:

  • Figure out the possible hands your opponent may hold
  • Estimate the chances of each hand
  • Determine your best play against each hand
  • Pick the play that will most often be correct
  • Sometimes the play that is most often correct is not the best one: you also need to consider the pot (the cost of a mistake) and how much a play will cost you if it turns out to be wrong

Chapter Twenty-Five – Evaluating the Game

– Two questions should be asked: is the game worth playing, and how should you play it?

Forced bets: the key question is how they relate to the betting limits. Find a game that suits your style first, then try different structures and adjust to them.

Betting limits: do they increase drastically in later rounds? Play tight if they do not (limit hold'em) and avoid hands that need luck; play looser if they do (pot-limit or no-limit hold'em), since that means high implied odds in the earlier rounds.

Know the rules and do not make stupid mistakes. Also adjust to them: if check-raising is not allowed, late position is even more powerful, so semi-bluff and bet more when you are in late position.

– Adjusting to the players is more important than adjusting to the structure. Make sure you are not the sucker; find the mistakes the bad players make and take advantage of them.

Loose players: play tight, enter with better hands, and bluff less than game theory requires (except maybe in the early stage of the game).

Tight players:

  • Tight in early rounds: steal the antes; raise the forced bet more often than game theory demands, maybe 2/3 of the time.
  • Tight early, looser later: this type is common; having waited for a good hand, they hate to give it up, so quit bluffing because they will not fold. Only bet out when you think you have a good chance of holding the best hand.
  • Tight all the time: bluff more than game theory suggests and semi-bluff constantly.

Mistakes and strategies:

  • Bluffs too much – induce even more bluffs, then call him down.
  • Bluffs too little – don't try to induce bluffs; fold according to your strength when he bets.
  • Never folds a fair hand at the end – never bluff him, but bet more good hands for value.
  • Rarely folds a fair hand on any round – don't slowplay; bet decent hands for value.
  • Folds too often on the end – bluff more, but don't bet your fair hands for value.
  • Tight on the first round, loose later – steal the antes when they have not yet called; quit bluffing once they have called, and bet fair hands for value.
  • Never check-raises – bet more hands behind him (and semi-bluff more), since his check tells you he has at most a fair hand.
  • Never bluff-raises – fold fair-to-good hands when he raises, and bet weaker hands, since his response will give you information.

Short answers to "My Best"

The Returning Explorer

infinite

Draw Poker

<not catching>

The mutilated Chessboard

Hint: colors

The Fork in the Road

Hint: F(T(Q)) and T(F(Q)) will always give false answer.

Scrambled Box Tops

1

Cutting the Cube

Hint: centered cube

Bronx vs. Brooklyn

<obvious>

The Early Commuter

55′

The counterfeit Coins

1 test

The Touch Cigarettes

<skip>

Two Ferry Boats

Hint: two equations

Guess the Diagonal

10

Cross the Network

The 12 Matches

Hint: How to remove(balance out) the irrational length in non-square shape?

what about 6 square unit?

Hole in the Sphere

The Amorous Bugs

10pi/4

above is wrong

find the invariance, sometimes it’s a recursion, sometime its a state that wont change during the process.

How Many Children

Hint: Why the knowledge of the smallest family can give a definite answer?

2, 3, 4, 5

The Twiddled Bolts

The Flight around the world

3 planes.

The Repetitious Number

Becomes obvious when  you write down 7*11*13

The Colliding Missiles

150+350=500, 1317 is distraction

The Sliding Pennies

5, backward thinking.

Handshakes and Networks

Start with a single handshake, then consider all cases for the next possible handshake: how does each change the number of people who have shaken hands an odd number of times?

The Triangular Duel

P(S) = 29/120, p(B) = 14/45, P(J) = 161/260.

Work first P(B%BJ)/P(B%JB)/etc, then apply to three person case, nothing but carefulness.

If you consider sky-shot, P(S) = 3/10,  p(B) = 8/45, P(J) = 47/90.

Crossing the Desert

10/3.

No upper limit as the value is ln\infty - ln\frac{1}{2}

Lord Dunsany's Chess Problem

The Lonesome 8

\frac{10040536}{142} = 77800

find a x where 8x is 4-digit and 990<=7x<=999, notice that it must be 7x, no others, why?

above result is not correct because it’s actually the case where 8 appears in the last digit, the argument is correct though.

Dividing the Cake

The Folded Sheet

Water and Wine

Not possible; the only way to reach that state is to pour all the liquid from one glass into the other and then pour it back.

The Absent-Minded Teller

linear equation on integer space, easy

Acute Dissection

not possible, recursion

How long is a “lunar”

Isn't it just applying the area and volume formulas of a sphere? Too simple?

The game of Googol

* Failed to see the probability of the event that the k-th item is the first one larger than the first x items. Prior knowledge actually kept me from seeing how simple it is. When hit with a complex process, step back and see whether there is a simpler way. Write down the sub-problem and treat it as a full problem!

Marching Cadets and a Trotting Dog

A practice in listing the unknowns and conditions and finding the equations. The equation for the second part is hard to solve, though.

White, Black and Brown

list and reasoning, easy

The Plane in the Wind

it will be slower

What Price Pets

Word problem; list the equations.

The Game of Hip

more reading is needed

A Switching Puzzle

solved, hard to write solution

Beer Signs on the Highway

1/6 miles, easy

The Sliced Cube and the Sliced Doughnut

The cube is easy; the doughnut is too hard.

Bisecting Yin and Yang

Didn't catch the meaning of "bisecting".

The Blue-Eyed Sister

4

How old is the Rose-Red city

equations

Tricky Track

Use the hint the problem gives: all 1st places must be taken by the champion school; then enumerate the possibilities and eliminate the impossible ones.

Termite and 27 Cubes

Impossible. Consider a 3×3 layer first: the only possible traversal runs from one corner to the opposite corner. Then consider 3×3×3: when the middle layer is the last one left, it must begin with two opposite corners (and possibly others) already taken, and no matter what the layout is, it is impossible to finish the traverse with the center cube as the last one.

Collating the Coins

4, work backwards

Time the Toast

?

A Fixed-Point Theorem

easy

How did Kant set his clock

Easy: the home clock only needs to be running (even at the wrong time), and it is easy to make it run!

Playing twenty questions when probability values are known.

Found a strategy with an average of 3 tries, though I am not solid on how it is constructed; reading up on Huffman trees is needed.

Dont Mate in One

Find the Hexahedrons

Out with The Onion

easy.

Cut down the cuts

Dissection Dilemma

Interrupted Bridge

backwards

Dash It All

faster being faster

Move the Queen

Read the Hieroglyphics

Crazy Cut

Find the Oddball

2, easy. Consider other variations; matrix67 has a blog post on this.

Big Cross-Out Swindle

Reverse the Dog

Funny Fold


A generating-function solution to the problem of testing from which storey a dropped object will break

Problem:

There is a building, and we have m identical objects; we want to find the lowest floor from which a dropped object breaks.

Although the height of the building is what is fixed when the problem is normally asked, the key to this problem is instead to fix the number of trials we may make with the m objects.

Solution:

We will be looking for a function

f(m, n) = the number of floors we can distinguish with m objects and n trials.

Then we have:

f(m,n) = 1 + f(m-1, n-1) + f(m, n-1)

with f(m,1) = 1 and f(0,n) = f(m,0) = 0. (Drop the first object from floor f(m-1, n-1) + 1: if it breaks, the remaining m-1 objects and n-1 trials pin down the floors below; if it survives, the same m objects with n-1 trials cover the f(m, n-1) floors above.)

Look for generating function

A_n(x) = \sum_{k>=0} f(k,n)x^k

by multiplying both sides of the recursive equation by x^k (for k >= 1; the k = 0 term vanishes since f(0, n) = 0) and summing up:

A_n(x) = \frac{x}{1-x} + xA_{n-1}(x) + A_{n-1}(x)

This will give us below:

A_n(x) = \frac{x((1+x)^{n-1} + \cdots + 1)}{1-x} = \frac{(1+x)^n - 1}{1-x}

expand above we will get

A_n(x) = \sum_{k>=0} (-1 + \sum_{i+j=k} \binom{n}{i})x^k = \sum_{k>=0} (-1 + \sum_{i=0}^{k} \binom{n}{i})x^k

so for any value m, we have

f(m, n) = -1 + \sum_{i=0}^{m} \binom{n}{i}.

When m=2, f(2, n) = \frac{n(n+1)}{2}; since f(2, 13) = 91 and f(2, 14) = 105, n = 14 is the first n for which f reaches 100.
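
A minimal Haskell sketch (my own, not part of the original note; the names f and closedForm are mine) that computes f from the recursion and checks it against the closed form:

-- f m n: the number of floors we can distinguish with m objects and n trials
f :: Integer -> Integer -> Integer
f _ 0 = 0
f 0 _ = 0
f m n = 1 + f (m-1) (n-1) + f m (n-1)

-- closed form: f m n = sum_{i=1..m} C(n,i); the "-1" cancels the i = 0 term
closedForm :: Integer -> Integer -> Integer
closedForm m n = sum [ choose n i | i <- [1..m] ]
  where choose p q = product [p-q+1..p] `div` product [1..q]

main :: IO ()
main = do
  print [ f 2 n | n <- [1..14] ]                                       -- 1,3,6,...,91,105
  print (and [ f m n == closedForm m n | m <- [0..5], n <- [0..10] ])  -- True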

Primes that are sums of perfect squares

q: There is a theorem: if a prime number p is one more than a multiple of 4, then it can be written as the sum of two perfect squares, p = m^2 + n^2 where m, n are integers.

1) Show that if a prime is one more than a multiple of 8, it can be written as p = x^2 + 16y^2, where x, y are integers.

2) Show that if a prime is five more than a multiple of 8, it can be written as p = (2x+y)^2 + 4y^2.

Some notes on Chapter 1 of “What is mathematics”

1. a) To find all divisors of a given number a, we can proceed as follows:

a = p_1^{\alpha_1}p_2^{\alpha_2}\cdots p_r^{\alpha_r} where the p_i are prime numbers.

All divisors have the form b = p_1^{\beta_1}p_2^{\beta_2}\cdots p_r^{\beta_r} where 0 <= \beta_i <= \alpha_i.

b) find the numbers of divisor of a.

a)

proof:

1). Suppose b has a prime divisor p_0 that is not in the set p_1, p_2, \cdots, p_r. Because b is a divisor of a, the number a could then be written with p_0 as one of its prime factors, which contradicts the uniqueness of the prime factorization of a.

2). From 1) we know b has only prime divisors from p_1,p_2,\cdots,p_r, so

b = p_1^{\beta_1}p_2^{\beta_2}\cdots p_r^{\beta_r} with \beta_i >= 0.

Suppose there were some \beta_i > \alpha_i.

Then \frac{a}{b} = \frac{\cdots}{p_i^{\beta_i - \alpha_i}} is an integer, so p_i^{\beta_i - \alpha_i} must divide the numerator. By the same contradiction as in 1), this cannot be true.

done.

b)

we know all divisors are in the form of b=p_1^{\beta_1}p_2^{\beta_2}\cdots p_r^{\beta_r} where 0<=\beta_i<=\alpha_i.

a divisor might have p_1^{\beta_1} with it, where \beta_1 takes the value 0,1,\cdots,\alpha_1, which are \alpha_1 + 1 different values.

Combine the cases with p_2, p_3, \cdots, p_r we have the answer to be:

(\alpha_1 + 1)(\alpha_2 + 1)\cdots(\alpha_r + 1)
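
As a quick illustration (my own example): 360 = 2^3 \cdot 3^2 \cdot 5, so it has (3+1)(2+1)(1+1) = 24 divisors.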

2. Ever since learning the trick that an integer is divisible by 3 when its sum of digits is, I have wondered what the general rule is for testing whether an integer is divisible by any given prime. Here it is, from "What is Mathematics".

Any decimal number A: a_na_{n-1}\cdots a_2a_1a_0 can be written as:

A=a_0 + a_1{10} + a_2{10}^2 + \cdots + a_n{10}^n.

For a given prime number p, denote r_i as the remainder of \frac{10^i}{p}.

A = a_0(m_0 p + r_0) + a_1(m_1 p + r_1) + \cdots + a_n(m_n p + r_n) = p(a_0 m_0 + \cdots + a_n m_n) + a_0 r_0 + a_1 r_1 + \cdots + a_n r_n

so we have

A \equiv (a_0 r_0 + a_1 r_1 + \cdots + a_n r_n) \mod{p}, where r_i is the remainder of 10^i divided by p.

When p = 3 all r_i = 1, which is why we can check divisibility by 3 by adding up the digits. The same process applies to any other prime.
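
As an illustration (my own example, not from the book), take p = 11: the remainders of 10^i are 1, 10, 1, 10, \cdots, and since 10 \equiv -1 \pmod{11} this gives the alternating-sum rule A \equiv a_0 - a_1 + a_2 - a_3 + \cdots \pmod{11}. For example, for 918082 we get 2 - 8 + 0 - 8 + 1 - 9 = -22 \equiv 0 \pmod{11}, so 918082 is divisible by 11 (it is 11 \cdot 83462).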

3. Euclidean algorithm

The Euclidean algorithm is used to find the greatest common divisor(GCD) of 2 numbers. Let’s consider all common divisors of two positive integer a and b.

For any common divisor of a and b, denoted u, we have

a=su (1)

and

b=tu. (2)

Because we can always find n, r to satisfy below

a=bn+r.

Substituted with (1) and (2), we have

r = (s - tn)u, therefore u is a divisor of r.

Similarly, consider the common divisors of b and r: for any such divisor v, write b = s'v and r = t'v; then

a = bn + r = (s'n + t')v, so v is a divisor of a as well.

So the pairs (a, b) and (b, r) have exactly the same set of common divisors.

So the algorithm is:

For any a and b, we have

a=bn_1 + r_1 where 0<=r_1<b

b=r_1n_2 + r_2 where 0<=r_2<r_1

r_1 = r_2 n_3 + r_3 where 0<=r_3<r_2

\cdots

r_{n-2} = r_{n-1} n_n + r_n where r_n = 0.

a >= b > r_1 > r_2 > \cdots > r_{n-1} > 0, and r_{n-1} is the GCD.

So the GCD is always the last positive integer in the r_i series.

From above equations, we have

r_1 = a - bn_1

r_2 = b - r_1 n_2 = b - (a - b n_1)n_2 = (n_1 n_2 + 1)b - n_2 a

\cdots

keep doing this will show that all r_i can be written in the way

r_i = k_ia + l_ib

Using this, we can prove that if a prime p divides ab, then p divides a or p divides b. (If p does not divide a, then (p, a) = 1, so 1 = ka + lp for some integers k, l; multiplying by b gives b = kab + lpb, and p divides both terms, hence p divides b.)
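
A minimal Haskell sketch (my own illustration; the name egcd is mine) of the extended Euclidean algorithm, which performs exactly this back-substitution and returns the coefficients along with the GCD:

-- egcd a b = (g, k, l) with k*a + l*b == g == gcd a b
egcd :: Integer -> Integer -> (Integer, Integer, Integer)
egcd a 0 = (a, 1, 0)
egcd a b = (g, l, k - q * l)          -- back-substitute r = a - q*b
  where
    (q, r)    = a `divMod` b          -- a = q*b + r
    (g, k, l) = egcd b r              -- k*b + l*r == g

-- e.g. egcd 252 198 == (18, 4, -5), and indeed 4*252 - 5*198 == 18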

4. Linear diophantine equations
A linear diophantine equation is an equation over the integers:

ax+by=c where a,b,c are integers.

The equation has integer solutions only when c is divisible by (a, b).

And when c is divisible by (a, b), the equation must have a solution, because there exist integers k, l such that

ka + lb = (a,b), so x^` = k\frac{c}{(a,b)}, y^` = l\frac{c}{(a,b)} is a solution of the equation.

Other solutions can be reached on the base of this solution, denote other solutions as x^*, y^*, we have:

a(x^*-x^`) + b(y^*-y^`) = 0.

We know the equation ax + by = 0 has the integer solutions x = r\frac{b}{(a,b)}, y = -r\frac{a}{(a,b)} for any integer r.

So x^* = k\frac{c}{(a,b)} + r\frac{b}{(a,b)}

y^* = l\frac{c}{(a,b)} - r\frac{a}{(a,b)}

Where r is any integer.
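
A small worked example (my own numbers): take 6x + 10y = 8. Here (6, 10) = 2, which divides 8, and 2 \cdot 6 + (-1) \cdot 10 = 2, so k = 2, l = -1, giving the particular solution x^` = 2 \cdot \frac{8}{2} = 8, y^` = -1 \cdot \frac{8}{2} = -4. The full solution set is then x^* = 8 + 5r, y^* = -4 - 3r for any integer r; indeed 6(8 + 5r) + 10(-4 - 3r) = 8.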