Gdf.dvi

GDF: A tool for function estimation through Ioannis G. Tsoulos (1)∗, Dimitris Gavrilis(2), Evangelos Dermatas(2) (1)Department of Computer Science, University of Ioannina, P.O.
(2)Department of Electrical & Computer Egineering, University of Abstract
This article introduces a tool for data fitting that is based on genetic programming and especially on the grammatical evolution technique. The user needs to input a series of points and the accompanied dimensionality n and the tool will produce via the genetic programming paradigm a function f : Rn → R that is optimal in the least squares sense. The tool is entirely written in ANSI C++ and it can be installed in every UNIX PACS:02.30.Mv;02.60.-x;02.60.Ed ;07.05.Mh;
∗Corresponding author. Email: sheridan@cs.uoi.gr PROGRAM SUMMARY
Program available from: CPC Program Library, Queen’s University of Belfast, Computer for which the program is designed and others on which it has been tested : The tool is designed to be portable in all systems running the GNU Installation: University of Ioannina and University of Patras, Greece.
Programming language used : GNU-C++.
Memory required to execute with typical data: 200KB.
Has the code been vectorised or parallelized? : No.
No. of bytes in distributed program,including test data etc.: 30 Kbytes.
Distribution format : gzipped tar ﬁle.
Keywords: Function approximation, stochastic methods, genetic programming, Method of solution: Genetic programming.
LONG WRITE UP
Introduction
The problem of function estimation consists of ﬁnding a function that will best approximate a set of n-dimensional points given their output. Function es- timation ﬁnds many applications in physics, chemistry, signal processing etc.
and it can be formulated as follows: Given M points and associated values (xi, yi) , i = 1, . . . , M with xi ∈ Rn estimate a function f : Rn → R that minimizes the least squares “Error” Through the years many methods have been proposed for this problem, such as spline based [2, 3] or neural network based [4, 5]. Although these techniques have been applied successfully to many data ﬁtting problems, they produce functional forms which consist of applications of speciﬁc functions such as polynomials or sigmoidal functions. The proposed programming tool takes as input the points (xi, yi) and creates a functional form that minimizes the quantity in equation 1 through the procedure of Grammatical Evolution [1]. Grammatical Evolution is an evolutionary processes that can create programs in an arbitrary language. The production is performed using a mapping process governed by a grammar expressed in Backus Naur Form. Grammatical evolution has been applied successfully to problems such as symbolic regression [7], discovery of trigonometric identities [8], robot control [9], caching algorithms [10], ﬁnancial prediction [11] etc. The rest of this article is organized as follows: in section 2 the contents of the distribution are presented, in section 3 the installation steps for any UNIX programming environment are expressed in detail, in section 4 the main parts of the distribution such as the underlying algorithm, the grammar speciﬁcations and the gdf program are thoroughly analyzed and ﬁnally in section 5 some conclusions from the application of the tool are listed.
Distribution
The package is distributed in a tar.gz ﬁle named GDF.tar.gz and under UNIX systems the user must issue the following commands to extract the associated These steps create a directory named GDF with the following contents: 1. bin: It is a directory which is initially empty. After the compilation
of the package it will contain the executable gdf and the text ﬁle named grammar.txt. The ﬁrst ﬁle is the created programming tool and the second one is an auxiliary ﬁle that contains the grammar of the tool, expressed 2. doc: This directory contains the documentation of the package (this ﬁle)
in diﬀerent formats: A LYX ﬁle, A LATEX ﬁle, a PostScript ﬁle and a pdf 3. include: The directory which contains the header ﬁles for all the classes
4. src: The directory with the source ﬁles of the package.
5. Makefile: It is the ﬁle that will be read by the make utility in order to
built the tool. There is no need for the user to modify this ﬁle.
Installation
The following steps are required in order to build the tool: 1. Uncompress the tool as described in the previous section.
After the compilation the binary ﬁle gdf will be placed in the bin subdirectory accompanied with the text ﬁle grammar.txt The tool is entirely written in GNU C++ version 3.2.3, but it can be installed in systems with diﬀerent ANSI C++ compiler. The only modiﬁcation required is to replace the line in the ﬁle Makefile under the src subdirectory with the following one where mycpp is the name of the corresponding ANSI C++ compiler in the Program details
The underlying algorithm
The programming tool is based on the following stochastic algorithm: Initialization Step:
1. The program reads the data to be ﬁtted from a text ﬁle.
2. The program reads the used grammar from a text ﬁle.
3. Every chromosome in the genetic population is initialized. The initializa- tion is performed by a randomly selection of a number in the range [0,255] for every element of each chromosome.
4. The values for the parameters selection rate and mutation rate are
selected. The selection rate denotes the fraction of the number of chro-
mosomes that will go through unchanged to the next generation. That means that the probability for crossover is set to 1 - selection rate. The
values for these parameters are mutually independent and they must be 5. Set k = 0, where k is the amount of the generations.
6. Set the value for the parameter maxK, where maxK is the maximum Evolution Step:
1. For every chromosome in the population, a function is created through the process of Grammatical Evolution.
2. The ﬁtness of each member of the population is evaluated.
3. The chromosomes are sorted in descending order according to their ﬁtness 4. A bunch of (1-selection rate)×population size new chromosomes is cre- ated. Every new chromosome is formed from two selected individuals (parents) of the current population with one - point crossover. In that procedure the chromosomes are cut at a randomly chosen point and their right-hand-side subchromosomes are exchanged, as shown in ﬁgure 1. For every new chromosome the selection of every parent is performed through (a) A group of K ≥ 2 randomly selected chromosomes is created.
(b) The chromosome with the best ﬁtness value in the group is selected, 5. The mutation procedure is applied to each member of the population with probability equal to mutation rate.
6. Set k = k + 1
7. If k > maxK or the best ﬁtness value has fallen below a predeﬁned thresh- old, then the Evolution Step is terminated.
Grammar specification
The ﬁle that contains the grammar speciﬁcation must be determined by the user with the -g option from the command line. The grammar must be speciﬁed in any simple text (ASCII) ﬁle with the format shown in ﬁgure 2.
The start symbol (<S>) is required and must be speciﬁed in the above form. The start symbol can give only one non-terminal symbol (e.g. <expr>).
The available non-terminal symbols are <expr>, <func>, <op>. The available terminal symbols are +,-,*,/,(,). The numbers are represented as lists of digits (including “.”) and can be speciﬁed by <terminal>. The subrules for the Figure 3: The rules of the symbol <terminal> <digit>::= 0 | 1 | 2 |3 |4 |5 |6 |7 |8 |9 <terminal> symbol are ﬁxed in the code and they can not changed by the user. These rules are shown in ﬁgure 3. The symbol d in the rule for <xxlist> denotes the dimensionality of the objective function. The available functions are: sin, cos, log, exp, log10, tan, abs, sqrt, int, atan, acos, asin. If a non- teminal speciﬁcation has more that one rules, those rules can be speciﬁed with a “|” instead of typing the entire left hand (e.g. in the second rule of <expr> , “<expr> ::=” is replaced by “|”). In this way, the user can easily alter the program parameters by specifying a diﬀerent grammar. If, for example, it is known that log or log10 cannot exist in the desired output, the user can remove them from the grammar speciﬁcation.

The main program gdf
The created executable gdf takes the following series of parameters in the com- 1. -h: The program prints a help screen to the user with a description for
each command line parameter and it terminates.
2. -g grammar ﬁle: The parameter grammar ﬁles determines a ﬁle with a
valid grammar for the tool. The user must have read access to the speciﬁed ﬁle. The default value for this parameter is grammar.txt, which is the default grammar and it is copied after the installation in the subdirectory 3. -p problem ﬁle: The parameter problem ﬁle determines a ﬁle containing
the points where the data ﬁtting procedure will be applied. The user must have read access to the speciﬁed ﬁle and the contents of the ﬁle must conform to the format of the ﬁgure 4. The integer number D in the ﬁle determines the dimensionality of the speciﬁc problem, the number M determines the amount of points in the ﬁle and each consecutive line deﬁnes a point where the data ﬁtting procedure will be applied. This parameter is the only one required from the program 4. -t test ﬁle: The parameter test ﬁle determines a ﬁle in the same format
as the problem ﬁle, where the produced function will be tested after the termination of the genetic algorithm. The user must have read access to

xM1 xM2 . . . xMD yM the speciﬁed ﬁle and the dimension in the ﬁle test ﬁle must be the same 5. -c count: The parameter count speciﬁes the number of chromosomes in
the genetic population. The default value for this parameter is 500.
6. -l length: The parameter length speciﬁes the length of each chromosome
in the genetic population. The default value for this parameter is 100. The standard GE approach uses variable - length chrosomes, but the tool GDF uses chromosomes with static length in order to prevent it from creating 7. -s srate: The parameter srate speciﬁes the value for the parameter selec-
tion rate of the genetic algorithm. The default value for this parameter
8. -m mrate: The parameter mrate determines the value for the parame-
ter mutation rate of the genetic algorithm. The default value for this
9. -n generations: The integer parameter generations determines the max-
imum number of the generations allowed for the genetic algorithm. The default value for this parameter is 2000.
10. -r seed: The parameter seed speciﬁes the seed for the random number
generator. The default value for this parameter is 1.
In each generation the program prints in the screen the following quantities: 1. The number of generations passed.
3. The ﬁtness value of the best discovered function.
Test runs
The performance of the proposed tool was measured by using 5 diﬀerent datasets: one for the continuous function f (x) = x sin(x2) and 4 real life problems.
The function f (x) = x sin x2
The tool was tested on this function using a dataset with 100 random points from the function in the range [-2,2]. The tool was issued with the following

where xsinxx.data is the ﬁle containing the points for the data ﬁtting. The last 10 lines from the output of the above program are the following: generation=156 f(x)=sqrt(sqrt((log(2.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-1.260575367e-06 generation=157 f(x)=sqrt(sqrt((log(2.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-1.260575367e-06 generation=158 f(x)=sqrt(sqrt((log(2.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-1.260575367e-06 generation=159 f(x)=sqrt(sqrt((log(2.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-1.260575367e-06 generation=160 f(x)=sqrt(sqrt((log(2.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-1.260575367e-06 generation=161 f(x)=sqrt(sqrt((log(2.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-1.260575367e-06 generation=162 f(x)=sqrt(sin((log(4.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-4.10576003e-07 generation=163 f(x)=sqrt(sin((log(4.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-4.10576003e-07 generation=164 f(x)=sqrt(sin((log(4.72))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-4.10576003e-07 generation=165 f(x)=sqrt(sin((log(4.82))))*sin(abs(exp(log(((x1))*x1))))*x1 fitness=-4.832536538e-11 The Ailerons problem
This problem has 40 attributes and consists of 7150 points. This data set ad- dresses a control problem, namely ﬂying a F16 aircraft. The attributes describe the status of the aeroplane, while the goal is to predict the control action on the ailerons of the aircraft. The original owner of the database is Rui Camacho (rca- macho@garﬁeld.fe.up.pt). The program gdf were trained with 200 points from the dataset and the resulting expression was tested over the rest 6950 points.
Ten independent experiments were conducted with diﬀerent random seeds each time and in all cases the absolute value of the ﬁtness was below 3 × 10−6. The 1187 − x8 − 756.92x3

with ﬁtness −1.65 × 10−6 and test error per point 1.07 × 10−7. The resulting function depends only on 5 from the 40 attributes .
The Elevators problem
The original source of this problem is the experiments of Rui Camacho (rca- macho@garﬁeld.fe.up.pt). The problem has 18 attributes and this data set is also obtained from the task of controlling a F16 aircraft, although the target variable and attributes are diﬀerent from the ailerons domain. In this case the goal variable is related to an action taken on the elevators of the aircraft. From this dataset 200 points were used for training and 8452 for testing. The best − 9.8x13 − x18 + − 9.91x10 exp (x4) with ﬁtness value −1.64 × 10−3 and mean test error 3.41 × 10−5.
The Pyrimidines problem
http://www.ncc.up.pt/~ltorgo/Regression/DataSets.html and it is a problem of 27 attributes and 74 number of patterns. The task consists of Learning Quantitative Structure Activity Relationships (QSARs).
The Inhibition of Dihydrofolate Reductase by Pyrimidines.The data are de- scribed in: King, Ross .D., Muggleton, Steven., Lewis, Richard. and Sternberg, Michael.J.E. Drug Design by machine learning: the use of inductive logic pro-

gramming to model the structure-activity relationships of trimethoprim analo- gies binding to dihydrofolate reductase. From the above dataset 50 patterns were used for training and 24 for testing. The best discovered function was: (x) = cos (cos ( x20)) cos (x22 − log (sin (exp (x6))))+ with ﬁtness value −1.33 × 10−1and mean test error 7.25 × 10−3.
The Basketball problem
The source of this dataset is from Smoothing Methods in Statistics available which is a problem of four attributes and it tries to identify the points scored per minute from the attributes “assists per minute”, “player height”,”time played” and “player age”. From the 96 available patterns 60 were used for training and 36 for testing. The best discovered function was: 33.1−0.245 cos(x3x2) with ﬁtness value −3.78 × 10−1 and mean test error 6.99 × 10−3.
Conclusions
The introduced tool is a program aimed to ﬁt a function in a series of points of an arbitrary dimension. The applied function is created through the evolutionary process of Grammatical Evolution and as a consequence there is no guarantee that the goal will be achieved. However, the user can apply the tool even in cases where the existence of an analytical solution is diﬃcult to be found. Also, the tool is provided with the ability of changing the underlying grammar according References
[1] M. O’Neill and C. Ryan, “Grammatical Evolution,” IEEE Trans. Evolu- tionary Computation, Vol. 5, pp. 349-358, 2001.
[2] De Boor C., “A practical guide to splines”, Springer - Verlang, New York, [3] Kincaid D., and Cheney W., “Numerical Analysis”, Brooks/Cole Publish- [4] Hornik K., Stinchcombe M., and White H., Neural Networks 2 (1989) 359.
[5] Cybenko G., “Approximation by superpositions of a sigmoidal function”, Mathematics of Control Signals and Systems 2 (1989) 303-314.
[6] J. R. Koza, Genetic Programming: On the programming of Computer by Means of Natural Selection. MIT Press: Cambridge, MA, 1992.
[7] M. O’Neill and C. Ryan, Grammatical Evolution: Evolutionary Automatic Programming in a Arbitrary Language, volume 4 of Genetic programming.
[8] C. Ryan, M. O’Neill, and J.J. Collins, “Grammatical Evolution: Solving Trigonometric Identities,” In proceedings of Mendel 1998: 4th International Mendel Conference on Genetic Algorithms, Optimization Problems, Fuzzy Logic, Neural Networks, Rough Sets., Brno, Czech Republic, June 24-26 1998. Technical University of Brno, Faculty of Mechanical Engineering, pp.
[9] Collins J. and Ryan C., “Automatic Generation of Robot Behaviors using Grammatical Evolution,” In Proc. of AROB 2000, the Fifth International Symposium on Artiﬁcial Life and Robotics.
[10] M. O’Neill and C. Ryan, “Automatic generation of caching algorithms,” In Kaisa Miettinen, Marko M. Mkel, Pekka Neittaanmki, and Jacques Peri- aux (eds.), Evolutionary Algorithms in Engineering and Computer Science, Jyvskyl, Finland, 30 May - 3 June 1999, John Wiley & Sons, pp. 127-134, [11] A. Brabazon and M. O’Neill, “A grammar model for foreign-exchange trad- ing,” In H. R. Arabnia et al., editor, Proceedings of the International con- ference on Artiﬁcial Intelligence, volume II, CSREA Press, 23-26 June 2003,

Source: http://www.wcl.ece.upatras.gr/publications/gavrilis/gdf.pdf

Lmx grease

LMX GREASE Lithium complex grease DESCRIPTION LMX Grease is a lithium-complex thickened mineral oil based grease of NLGI No.2 consistency having extreme pressure properties andinhibited against oxidation and corrosion. APPLICATIONS Lithium complex greases have similar basic properties to conventionalSuch performance characteristics mean that lithium complex greaseslithium soap grease

Viagra®

Was ist Viagra und wann wird es angewendet?Viagra ist der erste Vertreter einer neuen Medikamentengruppe mit der Bezeichnung Phosphodiesterase-Typ 5-Inhibitoren. Es wirkt, indem es bei sexuel er Erregung die Entspannung der Blutgefässe in Ihrem Penis unterstuetzt. Dadurch kann Blut leichter in den Penis fliessen und Sie erreichen auf natuerliche Weise eine Erektion. Sie sol en Viagra nicht einn