Thursday, October 16, 2014

What's new: ELENA 1.9.17 - generic methods

In this post I will discuss the generic methods introduced in version 1.9.17.

Let's start with two basic things. First, any message in the language consists of a verb (a predefined action: get, set, insert, add, ...), a signature (user defined) and a number of parameters.

For example in the following expression

aBinary insert &index:0 &literal:"0"

a message

insert&index&literal[2] 

is used. Here insert is the verb, index&literal is the signature (consisting of two subjects: index and literal) and 2 is the number of parameters.

Or in this message call

anS length

the message get&length[0] is used.
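
The naming scheme above can be sketched outside ELENA. The following Python snippet is only an illustration: the helper message_name is hypothetical and not part of ELENA. It joins a verb, the signature subjects and the parameter count in the same notation as the examples in this post.

```python
def message_name(verb, subjects=(), param_count=0):
    """Build a message name such as insert&index&literal[2].

    Hypothetical helper: it only mirrors the notation used in this post.
    """
    parts = [verb, *subjects]
    return "&".join(parts) + "[" + str(param_count) + "]"


print(message_name("insert", ["index", "literal"], 2))  # insert&index&literal[2]
print(message_name("get", ["length"], 0))               # get&length[0]
```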

Secondly, a message can be created dynamically at run time by combining a signature symbol with a generic message (a message without a signature):

anS ~ %length get.

These two principles are used in the generic method implementation. A generic method may respond to any message with the same verb and the same number of parameters, whatever its signature. The original signature is preserved and can be used inside the method.

Let's consider a simple example. Suppose we have the class containing coordinates

#class Point
{
   #field theX.
   #field theY.

   #constructor new &x:anX &y:anY
   [
       theX := anX.
       theY := anY.  
   ]

   #method x = theX.

   #method y = theY.
      
   #method set &x:anX
   [
       theX := anX.
   ]
      
   #method set &y:anY
   [
       theY := anY.
   ]

   #method clone = Point new &x:theX &y:theY.

   #method literal = "Point(x:" + theX literal + ", y:"
                          + theY literal + ")".
}

Let's create a variable which can contain a point

#class PointVariable
{
   #field thePoint.

   #constructor new &point:aPoint
   [
      thePoint := aPoint.
   ] 

   #method value = thePoint.
}

Now let's define the operations on the point coordinates. We will use generic methods

#class PointVariable                                                                
{  
   ...

   #method(generic) append : aValue
   [
      thePoint~$subject set:(thePoint~$subject get + aValue).
   ]
       
   #method(generic) reduce : aValue
   [
      thePoint~$subject set:(thePoint~$subject get - aValue).
   ]
       
   #method(generic) multiplyBy : aValue
   [
      thePoint~$subject set:(thePoint~$subject get * aValue).
   ]
       
   #method(generic) divideInto : aValue
   [
      thePoint~$subject set:(thePoint~$subject get / aValue).
   ]
}

How does it work? Let's consider a simple use case

   #var aVar := PointVariable new 
                   &point:(Point new &x:1 &y:1).

   aVar append &x:2.
   
   console writeLine:(aVar value literal).

When the append&x[1] message is sent to an instance of PointVariable, the class dispatcher first tries to resolve the message directly. If no match is found, it calls the generic method append, and the built-in variable $subject contains our original signature, %x. So

thePoint~ %x get 

is similar to

thePoint x

and

thePoint ~%x set:aValue

is similar to

thePoint set &x:aValue

Note that our example will work with a Point class containing an arbitrary number of coordinates (one, two, three and so on).
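
For readers who do not know ELENA, here is a rough analogy in Python, not a translation of the code above. Python calls __getattr__ only when normal lookup fails, which resembles the dispatcher falling back to a generic method. The method names of the form verb_subject (for example append_x) are my own assumption to encode the verb and the subject in one name; the classes are simplified stand-ins for Point and PointVariable.

```python
import operator


class Point:
    def __init__(self, **coords):
        # Any number of coordinates: x, y, z, ...
        self.__dict__.update(coords)


class PointVariable:
    # verb -> arithmetic applied to the selected coordinate
    _VERBS = {
        "append": operator.add,
        "reduce": operator.sub,
        "multiplyBy": operator.mul,
        "divideInto": operator.truediv,
    }

    def __init__(self, point):
        self.value = point

    def __getattr__(self, name):
        # Called only when normal lookup fails, much like the fallback
        # to a generic method after direct resolution finds no match.
        verb, _, subject = name.partition("_")
        if verb not in self._VERBS or not subject:
            raise AttributeError(name)
        op = self._VERBS[verb]

        def generic(arg):
            current = getattr(self.value, subject)          # like thePoint~$subject get
            setattr(self.value, subject, op(current, arg))  # like thePoint~$subject set:...

        return generic


var = PointVariable(Point(x=1, y=1, z=5))
var.append_x(2)
var.multiplyBy_z(3)
print(var.value.x, var.value.y, var.value.z)  # 3 1 15
```

The third coordinate z needs no extra code in PointVariable, which is the point of the note above.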

The same principle is used in the system'dynamic'DynamicStruct class.

   #var r1 := system'dynamic'DynamicStruct new.
   #var r2 := system'dynamic'DynamicStruct new.

   r1 set &Price:20.5r set &Count:3.
   r2 set &Name:"John" set &LastName:"Smith".
   r1 set &Supplier:r2.

Monday, October 6, 2014

Lexical Structure

An ELENA module consists of one or more source files. A source file is an ordered sequence of Unicode characters (usually encoded with the UTF-8 encoding).

There are several kinds of input elements: white space, comments and tokens. The tokens are the identifiers, keywords, literals, operators and punctuators.

The raw input stream of Unicode characters is reduced by ELENA DFA into a sequence of <input elements>.

	<input> :
			{ <input element> }*
		
	<input element> :
			<white space>
			<comment>
			<token>
			
	<token> :
			<identifier>
			<full identifier>
			<local identifier>
			<keyword>
			<literal>
			<operator-or-punctuator>

Of these basic elements, only tokens are significant in the syntactic grammar of an ELENA program.

White space

ELENA white space consists of spaces, horizontal tabs and line terminators. White space is used to separate tokens.

	<white space> :
		SP (space)
		HT (horizontal tab)
		CR (return)
		LF (new line)

Comments

ELENA uses C++-style comments:

   /* block comment */

   // end-of-line comment

	<comment> :
		<block comment>
		<end-of-line comment>
		
	<block comment> :
		'/' '*' <block comment tail>
		
	<end-of-line comment> :	
		'/' '/' { <not line terminator> }*
		
	<block comment tail> :
		'*' <block comment star tail>
		<not star> <block comment tail>
		
	<block comment star tail> :
		'/'
		'*' <block comment star tail>
		<neither star nor slash> <block comment tail>
		
	<not star> :
		any Unicode character except '*'
		
	<neither star nor slash> :
		any Unicode character except '*' and '/'

	<not line terminator> :
		any symbol except CR and LF

ELENA comments do not nest. Comments do not occur inside string literals.
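
The two rules above (no nesting, no comments inside strings) can be sketched as a small scanner. This is a Python illustration under my own assumptions, not the ELENA DFA itself: strip_comments is a hypothetical helper, and an unterminated block comment is simply dropped up to the end of the input.

```python
def strip_comments(source):
    """Remove /* ... */ and // ... comments, leaving string literals intact."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch == '"':
            # Copy the string literal verbatim; "" inside it is an escaped quote.
            j = i + 1
            while j < n:
                if source[j] == '"':
                    if j + 1 < n and source[j + 1] == '"':
                        j += 2
                        continue
                    break
                j += 1
            out.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("/*", i):
            # Block comments do not nest: the first */ closes the comment.
            end = source.find("*/", i + 2)
            i = n if end == -1 else end + 2
        elif source.startswith("//", i):
            # Skip up to, but not including, the line terminator.
            while i < n and source[i] not in "\r\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Because comments do not nest, the input /* a /* b */ c */ loses only the part up to the first */ and the text " c */" remains.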

Identifiers

An identifier is a sequence of letters, underscores and digits starting with a letter or an underscore. The identifier length is restricted in the current compiler design (255 characters at most).

	<identifier> :
		<letter> { <letter or digit> }*
		
	<letter> :
		Unicode character except white space, 
                        punctuator or operator
		'_'
		
	<letter or digit> :
		<letter>
		Digit 0-9

ELENA identifiers are case sensitive.
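
A minimal sketch of the identifier rule in Python, assuming that the characters excluded from <letter> are the ones appearing in the operator and punctuator list of this article. The extra reserved characters (quotes, '$', '#', '%', '~') are my assumption, made because other tokens in this document use them; the helper names are hypothetical.

```python
# Characters taken from the operator and punctuator list of this article.
OPERATOR_CHARS = set("()[]<>{}.,|:=+-*/&^")
# Assumption: these also cannot be letters, since other tokens use them.
OTHER_RESERVED = set("'\"$#%~")
DIGITS = "0123456789"
MAX_LENGTH = 255


def is_letter(ch):
    """True for '_' and for any character that is not a digit,
    white space, operator or punctuator."""
    if ch == "_":
        return True
    if ch.isspace() or ch in DIGITS:
        return False
    return ch not in OPERATOR_CHARS and ch not in OTHER_RESERVED


def is_identifier(text):
    """Check <letter> { <letter or digit> }* plus the length limit."""
    if not text or len(text) > MAX_LENGTH:
        return False
    if not is_letter(text[0]):
        return False
    return all(is_letter(ch) or ch in DIGITS for ch in text[1:])


print(is_identifier("theX"))   # True
print(is_identifier("1abc"))   # False
```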

Full identifiers

A full identifier is a sequence of identifiers separated with "'" characters. It consists of a namespace and a proper name. The full identifier length is restricted in the current compiler design (255 characters at most).

	<full identifier> :
		[ <name space> ]? "'" <identifier>		
		
	<name space> :
		<identifier> [ "'" { <identifier> } ]*
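
To show how a full identifier splits into its namespace and proper name, here is a short Python sketch. The function split_full_identifier is hypothetical; it only follows the grammar above, in which the last "'" separates the (possibly empty) namespace from the proper name.

```python
def split_full_identifier(full_name):
    """Return (namespace, proper_name); the namespace may be empty."""
    if "'" not in full_name:
        raise ValueError("not a full identifier: " + full_name)
    namespace, _, name = full_name.rpartition("'")
    if not name:
        raise ValueError("missing proper name: " + full_name)
    return namespace, name


print(split_full_identifier("system'dynamic'DynamicStruct"))
# ("system'dynamic", 'DynamicStruct')
```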

Local identifiers

A local identifier is a sequence of letters, underscores and digits starting with the '$' character. The local identifier length is restricted in the current compiler design (255 characters at most).

	<local identifier> :
		'$' <identifier>

Keywords

A keyword is a sequence of letters starting with the '#' character. Currently only the following keywords are used, though others are reserved for future use: #class, #symbol, #static, #field, #method, #constructor, #var, #loop, #define, #type, #throw, #break. Keywords can be placed only at the beginning of a statement.

	<keyword> :
		'#' { <letter> }+
	
	<letter> :
		Unicode characters

Literals

A literal is the source code representation of a value.

	<literal> :
		<integer>
		<float>
		<string>

Integer literals

An integer literal may be expressed in decimal (base 10) or hexadecimal (base 16).

	<integer> :
		<decimal integer>
		<hexadecimal integer>
		
	<decimal integer> :
		[ <sign> ] { <digit> }+

	<sign> :
		"+"
		"-"
		
	<digit> :
		digit 0-9
		
	<hexadecimal integer> :
		<digit> <digit or hexdigit>* 'h'
		
	<digit or hexdigit> :
		<digit>		
		one of following character - 
                       a b c d e f A B C D E F
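
The integer grammar above can be checked with two regular expressions. The Python sketch below is an illustration only (parse_integer is a hypothetical name). Note that, following the grammar, a hexadecimal literal must start with a decimal digit and end with 'h', so 0FFh is a number while FFh is not.

```python
import re

DECIMAL = re.compile(r"[+-]?[0-9]+")
# A hexadecimal literal starts with a decimal digit and ends with 'h'.
HEXADECIMAL = re.compile(r"[0-9][0-9a-fA-F]*h")


def parse_integer(text):
    """Return the value of a decimal or hexadecimal integer literal."""
    if DECIMAL.fullmatch(text):
        return int(text, 10)
    if HEXADECIMAL.fullmatch(text):
        return int(text[:-1], 16)  # drop the trailing 'h'
    raise ValueError("not an integer literal: " + text)


print(parse_integer("-42"))   # -42
print(parse_integer("0FFh"))  # 255
```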

Floating-point literals

A floating-point literal has the following parts: a whole-number part, a decimal point, a fractional part and an exponent, followed by the suffix 'r'. The exponent, if present, is indicated by the Unicode letter 'e' or 'E' followed by an optionally signed integer.

At least one digit, in either the whole number or the fraction part, and a decimal point or an exponent are required. All other parts are optional.

	<float> :
		{ <digit> }* '.' { <digit> }* [ <exponent> ] 'r'
		{ <digit> }+ <exponent> 'r'
		
	<digit> :
		digit 0-9

	<exponent> :
		<exponent sign> <integer>
		
	<exponent sign> :
		either 'E' or 'e'
		
	<integer> :
		<sign>? <digit>+
		
	<sign> :
		"+"
		"-"

Real literals are represented in the 64-bit double-precision binary floating-point format.
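
Here is a small Python sketch of the floating-point rules, written directly from the two grammar alternatives and the "at least one digit" requirement. The name parse_real is hypothetical. Python's float is also a 64-bit double, so the conversion keeps the same precision.

```python
import re

FLOAT = re.compile(
    r"(?:(?P<whole>[0-9]*)\.(?P<frac>[0-9]*)(?:[eE][+-]?[0-9]+)?"
    r"|[0-9]+[eE][+-]?[0-9]+)r"
)


def parse_real(text):
    """Return the value of a real literal such as 20.5r or 1e3r."""
    m = FLOAT.fullmatch(text)
    if m is None:
        raise ValueError("not a real literal: " + text)
    # The first alternative needs a digit on at least one side of the point.
    if m.group("whole") is not None and not (m.group("whole") or m.group("frac")):
        raise ValueError("no digits in: " + text)
    return float(text[:-1])  # drop the trailing 'r'


print(parse_real("20.5r"))  # 20.5
print(parse_real("1e3r"))   # 1000.0
```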

String literal

A string literal consists of zero or more characters enclosed in double quotes. Characters may be represented by escape sequences.

	<string> : 
		'"' <string tail> '"'
		
	<string tail> :
		<string character> { <string tail> }*
		<escape sequence>  { <string tail> }*
		'%' '%' { <string tail> }*
		'"' '"' { <string tail> }*
		
	<string character> :
		any character except CR or LF or '"'

String literal escape sequences

The string literal escape sequences allow for the representation of some non-graphic characters as well as the double quote and percent characters.

	<escape sequence> :
		'%' <decimal escape>
		
	<decimal escape> :
		{ <digit> }+
		<alert>
		<backspace>
		<horizontal tab>
		<carriage return>
		<new line>
		
	<digit> :
		digit 0-9

	<alert> :
		'a'

	<backspace> :
		'b'

	<horizontal tab> :
		't'

	<carriage return> :
		'r'

	<new line> :
		'n'
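
The string rules can be put together in a small decoder. This Python sketch rests on two assumptions of mine that the grammar does not spell out: the digits after '%' are read as a decimal character code, and as many digits as possible are consumed. The name decode_string is hypothetical.

```python
SIMPLE_ESCAPES = {"a": "\a", "b": "\b", "t": "\t", "r": "\r", "n": "\n"}
DIGITS = "0123456789"


def decode_string(literal):
    """Decode a string literal given with its surrounding double quotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("not a string literal")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == '"':
            # Only the doubled form "" is allowed inside the body.
            if body[i + 1:i + 2] != '"':
                raise ValueError("unescaped quote")
            out.append('"')
            i += 2
        elif ch == "%":
            nxt = body[i + 1:i + 2]
            if nxt == "%":
                out.append("%")
                i += 2
            elif nxt in SIMPLE_ESCAPES:
                out.append(SIMPLE_ESCAPES[nxt])
                i += 2
            else:
                j = i + 1
                while j < len(body) and body[j] in DIGITS:
                    j += 1
                if j == i + 1:
                    raise ValueError("bad escape sequence")
                # Assumption: the digits are a decimal character code.
                out.append(chr(int(body[i + 1:j])))
                i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, the literal "100%%" decodes to 100% and "%65" decodes to A under the character-code assumption.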

Operators and punctuators

There are several kinds of operators and punctuators. Operators are a shorthand form of messages taking one operand. Punctuators are used for grouping and separating.

	<operator-or-punctuator> : one of
		'(', ')', '[', ']', '<', '>', '{', '}',
                '.', ',', '|', ':', '::', '=', '=>', 
		 '+', '-', '*', '/', '+=', '-=', '*=', '/=', 
                 '||', '&&', '^^', '<<', '>>', ':='
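
Several of these tokens share a prefix (':' and ':=', '+' and '+=', and so on). The Python sketch below assumes the scanner prefers the longest spelling, which is the usual behaviour of a DFA-based lexer but is not stated in this article. The name match_operator is hypothetical.

```python
OPERATORS = [
    "(", ")", "[", "]", "<", ">", "{", "}",
    ".", ",", "|", ":", "::", "=", "=>",
    "+", "-", "*", "/", "+=", "-=", "*=", "/=",
    "||", "&&", "^^", "<<", ">>", ":=",
]
# Try longer spellings first so ':=' is not read as ':' followed by '='.
BY_LENGTH = sorted(OPERATORS, key=len, reverse=True)


def match_operator(text, pos=0):
    """Return the longest operator or punctuator starting at pos, or None."""
    for op in BY_LENGTH:
        if text.startswith(op, pos):
            return op
    return None


print(match_operator("theX := anX", 5))  # :=
print(match_operator("a:b", 1))          # :
```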