

9.6.4.12 Extending

There are two ways to extend the libraries with your own functions. The hard way is to hand-code the function as a class in C++ and then modify the corresponding loadlib function in ExNode.cpp. The other way is to use the code generation script in tools/scheduler.

Useful conversion functions

There are a few built in functions for converting to and from the std::string type. These functions can be used in your new functions.

Adding new libraries or library functions (C++ way)

Library functions are added in two steps. The first step is creating a new ExFun class for the function. The second step is adding it to the appropriate loadlib function in ExNode.cpp.

Creating a new ExNode

The ExFun class derives from ExNode, which is an expression tree node. The example here is a function for extracting a substring from a given string.

Functions take a fixed number of parameters. The function class is given the number of parameters and their types through the setSignature function when it is instantiated in the symbol table. For now we assume the parameter list (mrs_string,mrs_natural,mrs_natural) for the original string, the start index, and the end index.

Creating an ExFun class requires defining the constructor, a calc() method that returns the result of the function call, and a copy() method that returns a copy of the function without its parameters. The example in Figure 9.12 shows the class for the substring function.

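     class ExFun_StrSub : public ExFun { public:
         // t: type the function evaluates to; n: function signature; true marks the function as pure.
         ExFun_StrSub(std::string t, std::string n) : ExFun(t,n,true) { }
         virtual ExVal calc() {
             // Each parameter is an expression (ExNode*), so evaluate it before use.
             std::string str = params[0]->eval().toString();
             int s = params[1]->eval().toNatural();   // start index
             int e = params[2]->eval().toNatural();   // end index
             int l = static_cast<int>(str.length());
             // Clamp the start index to [0,l] and the end index to [s,l].
             if (s<0) { s=0; } else if (s>l) { s=l; }
             if (e<s) { e=s; } else if (e>l) { e=l; }
             // std::string::substr takes a start position and a length, not an end index.
             return str.substr(s,e-s);
         }
         ExFun* copy() { return new ExFun_StrSub("mrs_string","String.sub(mrs_string,mrs_natural,mrs_natural)"); }
     };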

Figure 9.12: String Substring ExFun class.

Note that the constructor takes two parameters, where t is the type that the function evaluates to and n is the signature of the function. These parameters are simply passed on to the ExFun parent constructor along with a third boolean parameter for the purity of the function. Purity is a flag that indicates whether the function is free of side effects. If the parameters to a pure function can be determined to be constant, the function can be evaluated at parse time to a constant value.
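The effect of the purity flag can be illustrated with a small stand-alone sketch. The Node, Constant, Variable and Call types and the fold function below are hypothetical and are not the Marsyas classes; they only assume that a parser can ask each parameter whether it is constant.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical miniature expression tree. These are NOT the Marsyas classes;
// they only illustrate folding a pure call with constant arguments.
struct Node {
    virtual ~Node() = default;
    virtual double eval() const = 0;
    virtual bool isConstant() const { return false; }
};

struct Constant : Node {
    double value;
    explicit Constant(double v) : value(v) {}
    double eval() const override { return value; }
    bool isConstant() const override { return true; }
};

struct Variable : Node {
    const double* slot;  // read at evaluation time, so never constant
    explicit Variable(const double* s) : slot(s) { assert(s != nullptr); }
    double eval() const override { return *slot; }
};

struct Call : Node {
    using Fn = std::function<double(const std::vector<double>&)>;
    Fn fn;
    bool pure;  // true if fn has no side effects
    std::vector<std::unique_ptr<Node>> params;

    Call(Fn f, bool isPure) : fn(std::move(f)), pure(isPure) {}
    void add(Node* p) { params.emplace_back(p); }  // takes ownership of p
    double eval() const override {
        std::vector<double> args;
        for (const auto& p : params) { args.push_back(p->eval()); }
        return fn(args);
    }
};

// Parse-time step: a pure call whose parameters are all constant is replaced
// by the constant it evaluates to; any other call is returned unchanged.
std::unique_ptr<Node> fold(std::unique_ptr<Call> call) {
    bool foldable = call->pure;
    for (const auto& p : call->params) { foldable = foldable && p->isConstant(); }
    if (foldable) { return std::unique_ptr<Node>(new Constant(call->eval())); }
    return std::unique_ptr<Node>(std::move(call));
}
```

In this sketch a call that is not marked pure is never folded, even when all of its parameters are constants, because evaluating it early could skip or duplicate a side effect.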

The calc() method uses the three parameters from the params[] array. These parameters are set at parse time and placed in the params[] array. Each element of params[] is an expression of type ExNode*, so you need to evaluate it before using its value. To evaluate, call the eval() method of the ExNode, not the calc() method. eval() makes sure that each expression in a list of expressions is evaluated, whereas calc() only calculates the current node.
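The index handling of a substring function can be tried in isolation. The following is a minimal stand-alone sketch, assuming the third parameter is an end index as described above; clampedSubstring is a hypothetical helper that uses no Marsyas types.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Marsyas): returns the characters of str in
// the half-open index range [start, end), clamping both indices to the string.
std::string clampedSubstring(const std::string& str, int start, int end) {
    const int len = static_cast<int>(str.length());
    // Clamp start to [0, len] and end to [start, len].
    if (start < 0) { start = 0; } else if (start > len) { start = len; }
    if (end < start) { end = start; } else if (end > len) { end = len; }
    assert(0 <= start && start <= end && end <= len);
    // std::string::substr takes (position, count), so convert the end index to a length.
    return str.substr(static_cast<std::string::size_type>(start),
                      static_cast<std::string::size_type>(end - start));
}
```

With this helper, out-of-range indices never throw; an end index that lies before the start index simply yields an empty string.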

Adding the function to the library

The function is added to the library by adding a line to the loadlib_String method in ExNode.cpp. The addReserved call on the symbol table adds a reserved word. There is some flexibility in how the name appears in the symbol table, which in turn defines how it may be used in an expression. Segments of the path to a function are separated by the '.' symbol. Multiple names for a segment can be defined by separating them with the '|' symbol, where the first of the alternatives is the 'true' name. For example, String|Str|S.sub defines three different leading names; the true name is String.sub, but Str.sub and S.sub refer to the same function. The same is possible for whole parameter tuples but not for individual parameters. For example, Real|R.cos(mrs_real)|(mrs_natural) allows the call Real.cos(0.5) as well as Real.cos(1), where 1 is a natural and not a real. This type information is used to type check function calls in the parser.

     st->addReserved("String|S.sub(mrs_string,mrs_natural,mrs_natural)",
                     new ExFun_StrSub("mrs_string",
                                      "String.sub(mrs_string,mrs_natural,mrs_natural)"));

Figure 9.13: Adding a function to the library.

The second parameter to the addReserved call is a new instance of the function object. Here the return type is the first parameter to the constructor and the 'correct' or 'true' full function signature is the second parameter. This information is used for type checking parameters. While type errors are not possible if the first parameter to the addReserved call is correct, the type information in the second parameter to the constructor is what is actually used for type coercion, such as promoting naturals to reals.
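The '.' and '|' naming rules can be made concrete with a short stand-alone sketch. The split and expandAliases functions below are hypothetical and are not how the Marsyas symbol table is implemented; they only enumerate the dotted paths that a name specification without a parameter list would accept, with the true name first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits text at every occurrence of sep; "a|b" with '|' gives {"a", "b"}.
std::vector<std::string> split(const std::string& text, char sep) {
    std::vector<std::string> parts;
    std::string::size_type begin = 0;
    while (true) {
        const std::string::size_type end = text.find(sep, begin);
        if (end == std::string::npos) { parts.push_back(text.substr(begin)); break; }
        parts.push_back(text.substr(begin, end - begin));
        begin = end + 1;
    }
    return parts;
}

// Hypothetical helper: expands a specification such as "String|Str|S.sub"
// into every accepted dotted path. The first entry is the true name, built
// from the first alternative of each segment.
std::vector<std::string> expandAliases(const std::string& spec) {
    assert(!spec.empty());
    std::vector<std::string> paths(1, "");
    for (const std::string& segment : split(spec, '.')) {
        std::vector<std::string> next;
        for (const std::string& prefix : paths) {
            for (const std::string& name : split(segment, '|')) {
                next.push_back(prefix.empty() ? name : prefix + "." + name);
            }
        }
        paths.swap(next);
    }
    return paths;
}
```

For "String|Str|S.sub" the sketch yields String.sub, Str.sub and S.sub, in that order.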

Adding new libraries or library functions (Code Gen way)

In the tools/scheduler directory is a Python script for generating library functions from simplified source code. The easiest way to explain the process is through an example:

      1: lib Foo|F.Bar|B
      2:
      3: pure mrs_string fun|alt(mrs_real a, mrs_natural b)
      4:     mrs_string s="hello";
      5:     mrs_bool x;
      6: {
      7:     mrs_natural z=a+b;
      8:     if (x) { s=s+" "+ltos(z); }
      9:     x = z < 0;
     10:     return s;
     11: }

Figure 9.14: Adding a function to the library.

Though this is not a useful function, it does demonstrate the full extent of the code generation syntax.

Line 1. A library definition starts with the keyword 'lib'; the names that follow denote the path to the library. The true path is Foo.Bar, and all functions defined after this statement, until the next lib definition, belong to this library. This means that the function fun is called as 'Foo.Bar.fun'. Alternate names, or aliases, for portions of the path can be defined using the | symbol. In the above example F is an alias for Foo and B is an alias for Bar, so the path to fun could also be written as 'Foo.B.fun', 'F.B.fun', etc.

Line 3. The function definition may start with 'pure', which implies that if the parameters to the function are constants then the function can be evaluated at parse time to a constant, i.e. it has no side effects. If pure isn't specified then the function is not pure. The return type must be a type supported by the ExVal class (names starting with 'mrs_'). The function name can also have aliases separated by the | symbol, where the first name is the true name. Parameters must be declared using the 'mrs_' type names.

Lines 4-5. Normally functions do not have state, but as a bonus, variables whose values persist may be defined after the parameter list and before the opening brace of the function body. Their types can be the 'mrs_' types or valid C++ types.

Line 6. The function body begins with an opening brace {.

Lines 7-10. The function body contains valid C++ code and will likely use the parameter values defined on line 3.

Line 11. The function body ends with a closing brace }.
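To see what the persistent variables on lines 4 and 5 mean in practice, the example can be hand-translated into a plain C++ function object. This is only a sketch and is not the output of the generation script: FooBarFun is a hypothetical name, mrs_real and mrs_natural are assumed to map to double and long, std::to_string stands in for ltos, and x is given an initial value of false because the example leaves it unset.

```cpp
#include <cassert>
#include <string>

// Hypothetical hand-written analogue of the example function; NOT generated code.
struct FooBarFun {
    // State that persists between calls (lines 4 and 5 of the example).
    std::string s = "hello";
    bool x = false;  // assumption: the example does not initialise x

    std::string operator()(double a, long b) {
        long z = static_cast<long>(a + b);           // line 7: the real sum is truncated
        if (x) { s = s + " " + std::to_string(z); }  // line 8: uses x from the previous call
        x = z < 0;                                   // line 9: remembered for the next call
        return s;                                    // line 10
    }
};
```

Because x is only updated after it has been tested, a negative sum affects the string on the following call rather than on the current one, and s keeps growing across calls. This is also why the example function could not safely be folded to a constant at parse time if it were treated as a stateless function.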