Documentation:Manual:Introduction
From Beagle
This document introduces the Open BEAGLE framework for Evolutionary Computations (EC). It contains a tutorial, an overview of the object oriented software architecture, and a complete user manual. It is aimed at both the novice and expert user of EC techniques. For more implementation details, the reader is referred to the Open BEAGLE Reference Manual which was distiled from the source code, using the doxygen documentation generator. The detailed installation and compilation instructions are given in the Compilation HOWTO.
Contents |
What is Evolutionary Computations
Evolutionary computations is a discipline that involves simulating the natural evolution process on computers. This can be seen as an optimization process in which a population evolves over time, to fit a given environment. The EC principles have been successfully applied to numerous situations where classical methods of optimization, classification and automatic design were not able to produce adequate solutions. Generally, EC is presented in four major flavors: Genetic Algorithms (GA), Genetic Programming (GP), Evolution Strategy (ES), and Evolutionary Programming (EP).
Genetic Algorithms (GA) involve the evolution of a population of individuals representing possible solutions to a problem. An individual is usually a string of symbols defined over a given alphabet, most often bits (individual = string of bits). The idea is inspired from genetic DNA that composes every living creature. The search process is made by an iterative application of genetic operators, such as crossover, mutation, and Darwinian selection operators biased toward the fittest individuals. Using this Darwinian paradigm, a population of solutions evolves until some stopping criterion is reached. GA has been widely used as a numerical method to optimize parameters of systems without having any a priori knowledge about the search space. Only a feedback criterion is needed to give an objective value that guides the search.
Genetic Programming (GP) is a paradigm that allows automatic programming of computers by heuristics inspired from the same evolution principles as GA: genetic operations of crossover and mutation, and natural selection operation. The difference between GA and GP lies mainly in the representation used, for which GP is similar to a computer program structure. Canonical GP was first formally expressed by Koza in the beginning of the 1990s. Koza's GP represents programs as trees, that is acyclic and undirected graphs, where each node is associated to an elementary operation specific to the problem domain. Others have experimented with different representations, such as linear programs or cyclic graphs. GP is particularly adapted to the evolution of variable length structures.
The Evolution Strategy (ES) paradigm was developed by I. Rechenberg and H.-P. Schwefel at the Berlin Technical University in the 1960s. In ES, each individual is a set of characteristics of a potential solution. This set is generally represented as floating-point vectors of fixed length. ES is applied to a parent population (of size μ) from which individuals are randomly selected to generate an offspring population (of size λ >> μ). Offsprings are modified by mutation, which consists in adding a randomly generated value that follows some parametrized probability density function. The parameters of this probability density function, called the strategy parameters, themselves evolve over time following the same principles. To engender a new population of size μ, the best μ individuals are chosen within either the λ offsprings (approach (μ,λ)), or the μ parents and λ offsprings (approach (μ+λ)). Modern ES can also be nested.
Evolutionary Programming (EP) has been developed by L.J. Fogel in the 1960s and later by D.B. Fogel et al. in the 1990s. EP was initially designed to evolve finite state machines and has been later extended to parameter optimization problems. The approach is more focused on the relation between parents and offsprings than on the simulation of nature-inspired genetic operators. Contrary to the three first EC flavors, EP doesn't involve the use of a particular representation, but rather a high-level evolutionary model and a representation appropriate for the problem to solve. Only a mutation operator specific to the representation is needed. To do EP, a population of μ solutions is randomly generated. Each individual of the population produces λ offsprings resulting from mutation. Then, a natural selection operation is applied to produce a new population of μ individuals. The mutation - selection process is applied iteratively until a good solution is found.
What is Open BEAGLE
Since 1999, we have developed Open BEAGLE, a versatile C++ environment primarily designed to do GP, but general enough to do other kinds of Evolutionary Computations (EC). The name BEAGLE stands for the acronym the Beagle Engine is an Advanced Genetic Learning Environment (in French, BEAGLE stands for Beagle est un Environnement d'Apprentissage Génétique Logiciel Évolué). Beagle is also the name of an English vessel, the HMS Beagle, in which Charles Darwin engaged himself as a naturalist for a circumnavigation of the earth. The name Beagle was previously used in a very old (in term of technological advancements) software that does pattern recognition using EC principles. Although, the adjective Open was added to the current BEAGLE framework to distinguish the two designations, it also puts the emphasis on the open source aspect of the project. In this document, the term BEAGLE always designates the actual framework, Open BEAGLE.
The Open BEAGLE architecture follows the principles of Object Oriented (OO) programming, where some abstractions are represented by loosely coupled objects and where it is common and easy to reuse code. Open BEAGLE is built on a generic OO foundation, independent for the different EC flavors. Over this foundation, both GP and GA (vector-based) frameworks are currently implemented. Open BEAGLE is made to be extensible, to allow the user to implement his own algorithm, with little efforts and minimal code writing.
Open BEAGLE was designed with the objectives of providing an EC framework that is generic, user friendly, portable, efficient, robust, elegant, and free.
Genericity
With Open BEAGLE, the user can execute any kind of EC, as far as it fulfills some minimum requirements. The only necessary condition is to have a population of individuals to which a sequence of evolving operations is iteratively applied. So far, Open BEAGLE supports most mainstream EC flavors such genetic programming; bit string, integer-valued vectors, and real-valued vectors genetic algorithms; and evolution strategy. It also includes support for advanced EC techniques such multiobjective optimization and co-evolution. The user can take any of these specialized frameworks and modify them further to create his own specialized flavor of evolutionary algorithms.
User Friendliness
Considerable efforts were deployed to make the use of Open BEAGLE as easy and pleasant as possible. Open BEAGLE possesses several mechanisms that offer a user friendly programming interface. For example, the memory management of dynamically allocated objects is greatly simplified by the use of reference counting and automatic garbage collection. The programming style promoted is high-level and allows rapid prototyping of applications.
Portability
The Open BEAGLE code is compliant with the C++ ANSI/ISO 3 standard. It requires the Standard Template Library (STL). The framework also includes a subset of the Portable Agile C++ Classes (PACC) collection, which encapsulates calls made to the operating system in a portable fashion.
Efficiency
To insure efficient execution, particular attention was given to optimization of critical code sections. Detailed execution profiles of these sections were done. Also, the fact that Open BEAGLE is written in C++ contributes to its overall good performance.
Robustness
Many verification and validation statements are embedded into the code to ensure correct operation and to inform the user when there is a problem. Robust mechanisms for periodically saving the current evolution state have also been implemented in order to enable automatic restart of interrupted evolutions.
Elegance
The interface of Open BEAGLE was developed with care. Great energy was invested in designing a coherent software package that follows good OO and generic programming principles. Moreover, strict programming rules were enforced to make the C++ code easy to read, understand and, eventually, modify. The use of XML as file format is also a central aspect of Open BEAGLE, which provide a common ground for tools development to analyze and generate files, and to integrate the framework with other systems.
Free Sourceness
The source code of Open BEAGLE is free, available under the GNU Lesser General Public License (LGPL). It can thus be distributed and modified without any fee. Open BEAGLE is available on the Web.
Survey of Existing Frameworks
NOTE: This section can be reviewed taking the survey of existing frameworks available in the IJAIT paper.
Open BEAGLE shares a lot of features with other GP and EC systems available on the Web (because there are currently few generic EC systems that do GP, it is natural to compare Open BEAGLE to both GP and EC packages). We have identified four of them for comparison with Open BEAGLE: lil-gp, gpc++, EO and ECJ.
lil-gp is a C implementation of the original little-lisp code described in the Genetic Programming I book from Koza. It provides the majors features of a complete GP environment. It is coded in the widely used C language, providing a fast and efficient implementation. However, it has several drawbacks, essentially from the functional structure of the code. Indeed, it is harder to extend this kind of GP implementation, compared to an object oriented one. There is a numbers of patches for lil-gp currently available\footnote{We know the existence of four of them: a multi-threads safe implementation (mtlilgp), a constrained GP version (cgp-lilgp), a Windows 95 port, and mix of these patches (patched lil-gp).} that are more or less compatible. It might be a long work to adapt and make them work together and add new functionalities.
gpc++ is one of the first known C++ framework for tree-based GP. Although gpc++ and Open BEAGLE share some philosophic and implementation aspects, gpc++ contains C-like constructs which do not promote good OO practices. For example, the GP tree is structured as a prefix list of the function and terminal names (a list of char*). To evaluate the trees, the user needs to define a complex switch case that is very hard to recycle. Also, gpc++ does not profit from design pattern.
EO, which stands for Evolving Objects, is a C++ framework for EC. Open BEAGLE and EO share some design concepts, essentially by separating the algorithms (like genetic operators) from the data structure (the populations for instance). But their implementations are somewhat different: EO uses generic programming extensively (sometimes called static polymorphism), as opposed to Open BEAGLE which uses some generic programming concepts to enhance the user experience, but mostly uses polymorphism by inheritance (sometimes called dynamic polymorphism). We are thus convince that dynamic polymorphism is necessary for a programmer-friendly EC environment and for rapid application developments.
Finally, ECJ is a complete, Java-based environment for EC. We chose ECJ among all available Java-based EC systems because it is a full featured, well-made OO system. Like Open BEAGLE, it is designed following an OO methodology, and uses polymorphism by inheritance extensively. However, although Java is a nice coherent language, it suffers from relatively poor execution speed and big memory footprints, which is indeed a serious limitation for resource-intensive tasks like EC. (In principle, using an optimizing compiler that produce machine code, Java and C++ programs should be able to achieve comparable speeds. However, at the time of writing, several benchmarks on the Web report that this is not true in practice.)
Document Content
The document is divided into three parts. First is presented a tutorial on how to implement a GP application with Open BEAGLE. This section presents an overview on how to use the framework, and is targeted not only for the GP users, but also for any other EC users. Second, the object oriented architecture is specified. It presents the important entities and mechanisms deployed in Open BEAGLE, and gives a comprehensive description of the framework. Finally, the document ends with a detailed user manual where all essential details needed to rapidly implement an EC application with Open BEAGLE are given.
Users familiar with the C++ programming language can read the present document from the first page to the last. For those less familiar with C++ and object oriented concepts, the first reading should focus on the GP tutorial chapter, and sections Miscellaneous Architecture Elements, Guidelines, Building a System for an Evolution, and GA User Manual. Thereafter, with a good C++ reference manual on their side, they can read the entire document to better understand the general organization of the framework.
Next chapter: GP Tutorial.
