Documentation:Tutorials:Strongly-typed GP

From Beagle

Contents

Section Prerequisites

Before starting this tutorial you should know the basics of using standard tree based GP with Open Beagle. It is also a great advantage to know a little bit of how RTTI(RunTime Type Information) works. As for the OB library it needs to be built with RTTI support.

Section Introduction

Illegal if-then-else branch
Enlarge
Illegal if-then-else branch
Legal if-then-else branch
Enlarge
Legal if-then-else branch

Strongly Typed Genetic Program was first proposed by Montana in ... Simply put STGP is a method ensuring that generated trees are "legal" with respect to the type constraints on the different nodes. E.g. let's say we want a primitive doing a if-then-else operation. This primitive has three parameters. The first is the condition, while the second and third are the "branch bodies". In our case we want the body to be of type double. So depending on the condition either the first or second double value should be passed to the node's parent in the tree. If there are no constraints when generating new trees, there will probably be some "illegal" ones in the population. A tree is considered illegal if the return type of a node's child does not match the expected type for that childs "parameter" place. Illegal trees can be generated both in the initialization and reproduction phases.

OB implements STGP by utilizing the C++ RTTI functionality.

Section Overview

In short the following is what must be done different when using STGP in your program:

  • Decide which types you need in your program
  • Decide the return type of your program
  • Implement your own primitives
    • Include typeinfo header
    • Override getArgType and getReturnType functions
    • Make sure the execute function use the correct types
  • Implement your evaluation operator.
    • Make sure to pass in the correct type to the individual.run(..).
  • Implement the main method
    • Make a primitiveset. Use the RTTI specific constructor
    • Insert your primitives and evaluation operator into the set.
    • Use the constrained package instead of the base package
  • Make sure the configuration file is setup with the constrained versions of the evoluationary operators

The following sections will describe each point more in-depth.

Section Example

In this tutorial we will try to use OB for stock market predicition. Note that this is a purely educational example, and has no base in actual reality(All values are just random generated). Let's say that we have a data set with the last five days closing price for a specific stock. These values are represented as doubles. The only information we want to predict is if the closing price tomorrow will be higher than what is was today. We do not want to know the exact closing price, so the result of an individual can just be a boolean (true or false). If it goes up, the result should be true else false. We have a "training set" of 100 vectors(randomly generated) which is the base for our fitness calculation. An individual's fitness is calculated as the percentage of correct predicitons in the dataset.

Section Primitives

OB contains a lot of already implemented primitives, such as Add and AND. These primitives handles only one type. For example one of the Add version use the double type while the AND use booleans. In our example we want to use double as input to the program and get a boolean value out. If there are no primitives taking doubles as parameters and returning a boolean no "legal" trees can be generated. For this simple example we'll use two.

Greater-Than

Our first STGP primitive is the GreaterThan class. It takes in two booleans and returns true if the first is greater than the second.

 #include <typeinfo> 

This is the first important STGP line. This includes the typeinfo header which gives us access to important data structures and methods.

 virtual const std::type_info* getArgType(unsigned int inN, Beagle::GP::Context& ioContext) const; 

This line is for overriding a function from the Primitive class. The function is used by OB when it checks if trees are legal. It takes in the parameter number, and the function should return the type for it.

 virtual const std::type_info* getReturnType(Beagle::GP::Context& ioContext) const; 

This function should return the type information for the return value of this primitive.

That was all the STGP specific stuff in the header file. Now let us see what is important in the cpp file.

const std::type_info * GreaterThan::getArgType(unsigned int inN, Beagle::GP::Context & ioContext) const
{
  // Since there are only two parameters allowed for this primitive, parameters with number 2 and higher are not allowed.
  Beagle_AssertM(inN < 2);
  // Both parameters should be Doubles so return the same for both of them.
  return &typeid(Beagle::Double);
}

This code states that there should be only two arguments for this node, and that both are of type Beagle::Double

const std::type_info * GreaterThan::getReturnType(Beagle::GP::Context & ioContext) const

Here we return the type info for the return value of this primitive.

Last but not least we need to take STGP into account in the execute function. This primitive will become a node in a solution tree. In OB the execution of a tree will start at the top. The values for the parameters are gathered in a "lazy" way, which means they won't be available before you use one of the getArguments functions. When one of these are called it the childs execution function is called, which again must gather its parameters and so on. This goes on until e.g. a terminal node is encountered. Then the flow goes back up the tree.

void GreaterThan::execute(Beagle::GP::Datum & outDatum, Beagle::GP::Context & ioContext)
{
  // All valid trees will give a Bool datum to this primitive.
  Beagle::Bool& lResult = Beagle::castObjectT<Beagle::Bool&>(outDatum);
  // Prepare an array of two Doubles(the two arguments to this primitive)
  Beagle::Double lArgs[2];
  // The getArguments function will execute children nodes and put the correct results in the Double array
  getArguments(lArgs, sizeof(Beagle::Double), ioContext);
  // If the returned value from the left child is greater than the one returned from the right child put True in the datum passed down by the parent
  lResult = lArgs[0].getWrappedValue() > lArgs[1].getWrappedValue();
}

Most of this code is self explanatory. First the Datum is converted to a Beagle::Bool which is the return type of this primitive. Then two an array of two Beagle::Double are made and passed to the children, for value insertion, with the getArguments section. The last line puts the correct boolean value into the object passed down by the parent node.

If-Then-Else

Now you should know the basic primitive implementation. The IfThenElse class is implemented in the same way. However, it does some parts a bit differently so some of the code will be briefly explained here.

const std::type_info * IfThenElse::getArgType(unsigned int inN, Beagle::GP::Context & ioContext) const
{
  // Make certain there are no more than three parameters
  Beagle_AssertM(inN < 3);
  // The first parameter is a Bool(decision parameter)
  if(inN == 0) return &typeid(Beagle::Bool);
  // The other parameters are Doubles
  return &typeid(Beagle::Double);
}

This primitive has different parameter types, so we need to return type info depending on the parameter number. Since the first parameter is boolean, the Beagle::Bool is return if the number is one, else Beagle::Double is returned.

void IfThenElse::execute(Beagle::GP::Datum & outDatum, Beagle::GP::Context & ioContext)
{
  // Make and get the first parameter.
  Beagle::Bool lCondition;
  get1stArgument(lCondition, ioContext);
  // Check the condition and just pass the datum to the correct child
  if(lCondition == true) get2ndArgument(outDatum, ioContext);
  else get3rdArgument(outDatum, ioContext);
}

This code show an alternative way of getting arguments.

Section Evaluation Operator

Now that the primitives are finished, we need an evaluation operator. This operator is basically implemented as in standard tree based GP. However, there is one thing that needs to be kept in mind. When running the individual you need to make sure that the object passed to the root node is of the correct type.

Beagle::Bool prediction;
inIndividual.run(prediction, ioContext);

Here we make an object of the correct type(Beagle::Bool) and pass it down to the root node in the inIndividual.run() call.

Section Main Method

The main method are implemented in almost the same way as for standard GP. However, a few things must be done differently.

#include <typeinfo>

First of all we need to include the "typeinfo" header.

GP::PrimitiveSet::Handle lSet = new GP::PrimitiveSet(&typeid(Bool));

Here comes the first important part. In the constructor for the primtiveset the typeid of the wanted return type for the tree must be given.

Then the operators are added etc as in normal GP, but because of some configuration file specific things, we need to use the constrained package.

lSystem->addPackage(new GP::PackageConstrained(lSet));

Section Configuration File

Almost finished! The only thing left is setting up the configuration file. The only difference from a normal setup is that all the evolutionary operators must be switched out with their constrained equivalent.

<?xml version="1.0" encoding="ISO-8859-1"?>
<Beagle>
  <Evolver>
    <BootStrapSet>
      <GP-InitHalfConstrainedOp/>
      <StockEvaluationOperator/>
      <GP-StatsCalcFitnessSimpleOp/>
      <TermMaxGenOp/>
      <TermMaxFitnessOp fitness="1.0"/>
      <MilestoneWriteOp/>
    </BootStrapSet>
    <MainLoopSet>
      <SelectTournamentOp/>
      <GP-CrossoverConstrainedOp/>
      <GP-MutationStandardConstrainedOp/>
      <GP-MutationShrinkConstrainedOp/>
      <GP-MutationSwapConstrainedOp/>
      <GP-MutationSwapSubtreeConstrainedOp/>
      <StockEvaluationOperator/>
      <GP-StatsCalcFitnessSimpleOp/>
      <TermMaxGenOp/>
      <TermMaxFitnessOp fitness="1.0"/>
      <MilestoneWriteOp/>
    </MainLoopSet>
  </Evolver>
  <System>
    <Register>
      <Entry key="lg.file.level">3</Entry>
      <Entry key="lg.console.level">3</Entry>
    </Register>
  </System>
</Beagle>

Section Finish

And now you hopefully have a working STGP example. You shouldn't expect much increase in fitness during evolution, since the values are randomly generated :)