A Crystal library for building Markov Chains and running Markov Processes.
What is a Markov Chain?
A Markov Chain is essentially a mechanism for guessing probable future events based on a sample of past events. For a great explanation, watch this Khan Academy video.
Visit the API Documentation for a more in-depth look at the library's functionality.
Add this to your application's
dependencies: markov: github: mccallofthewild/markov
In your terminal, install Crystal dependencies with:
$ shards install
$ crystal deps
Begin by requiring the
Basic -- Hello Markov
A classic Markov text generator. This example will work well for small (array-sized) data sets.
Markov::Chain is a generic type which contains, receives and generates elements of
We'll start with the sample text:
example_string = "how much wood would a woodchuck chuck if a woodchuck could chuck wood"
There are several
Markov::Chain constructors to choose from. The simplest one takes in a
LinkType array of elements as
sample and a
seed is the element in
sample you want to start the chain with. If not provided, a random element will be chosen.
example_arr = example_string.split(" ") #=> ["how","much","wood","would","a","woodchuck","chuck","if","a","woodchuck","could","chuck","wood"] seed = example_arr #=> "how" example_chain = Markov::Chain(String).new sample: example_arr, seed: seed
Finally, we'll generate a probable sequence of elements with the
["much", "wood", "would", "a", "woodchuck", "could", "chuck", "if", "a", "woodchuck"]
If we wanted to get the elements one at a time, we could use the
Markov::Chain#next method instead:
puts example_chain.next #=> "much" puts example_chain.next #=> "wood" puts example_chain.next #=> "would"
This implementation was built for larger data sets, with asynchronous input in mind.
In this example, we will create a
Markov::Chain which can generate realistic movie titles.
To begin, we instantiate a
TransitionTable is a mechanism for training and implementing Markov processes.
example_table = Markov::TransitionTable(String).new
Now we'll add a movie title using the
movie_one = %w(the great gatsby) # shortcut syntax for ["the","great","gatsby"] movie_one.each do |word| example_table.add(word) end
For syntactic sugar, if we have an array of elements, we can avoid looping through and
#add-ing them by using the
Markov::TransitionTable#fill method instead:
movie_one = %w(the great gatsby) # shortcut syntax for ["the","great","gatsby"] example_table.fill table_with: movie_one
A problem arises at this point:
movie_two = %w(great expectations) example_table.fill table_with: movie_two
The above code sequentially adds each word to the
TransitionTable. But The Great Gatsby and Great Expectations are two separate movie titles; the "Great" at the beginning of Great Expectations is not a probable transition from the "Gatsby" at the end of The Great Gatsby.
To solve this, use
#reset clears the
TransitionTable's last added key, allowing us to separate titles like so:
movie_one = %w(the great gatsby) example_table.fill table_with: movie_one example_table.reset movie_two = %w(great expectations) example_table.fill table_with: movie_two example_table.reset movie_three = %w(the great escape) example_table.fill table_with: movie_three
TransitionTable with a
Finally, we can put the
TransitionTable to use by passing it to a
Markov::Chain constructor as
example_chain = Markov::Chain(String).new transition_table: example_table, seed: "great"
Handling Dead Ends
With small and/or unique data sets, Markov chains are fallible to reaching dead ends. That is, they can often reach a point where there is nothing to transition to.
When this happens in the
Markov::Exceptions::EmptyTransitionMatrixException is raised.
dead_end_array = %w(some say the world will end in fire) dead_end_chain = Markov::Chain(String).new sample: dead_end_array, seed: "fire" # nothing comes after "fire", so the chain is at a dead end. dead_end_chain.next # raises `EmptyTransitionMatrixException`
To prevent this, use the
Markov::Chain#on_dead_end exception handler.
The block's return value of
LinkType fills in as the next item in the chain.
dead_end_array = %w(some say the world will end in fire) dead_end_chain = Markov::Chain(String).new sample: dead_end_array, seed: "fire" dead_end_chain.on_dead_end do |transition_table, chain, exception| "some" end dead_end_chain.next #=> "some" dead_end_chain.next #=> "say" dead_end_chain.next #=> "the"
- Fork it ( https://github.com/mccallofthewild/markov/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
- McCall Alexander mccallofthewild - creator, maintainer