The Excellent Language for Executing Strongly Typed Objects

Telesto

The Telesto Programming Language is a general-purpose, multi-paradigm language. This page covers both the language and its compiler, which are being developed concurrently.

 
About

Telesto is intended for just about any programming task. We envision Telesto being used for everything from scripting to web and desktop applications to embedded development. It is based on C++, but includes features found in other languages such as Java. It is far easier to parse than C++, which makes it easier for newbies to learn and easier for experienced programmers to read and write. It also means that IDEs will be able to parse it quickly.

The compiler for Telesto will use a hand-written (mostly) recursive-descent parser, written in C++. Initially, the compiler will generate LLVM assembly code, which can then be assembled and linked using LLVM.
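For readers who haven't met the technique, here is a minimal sketch of what a recursive-descent parser looks like in C++. It is not taken from the Telesto compiler and does not parse Telesto; the ExprParser class and its toy arithmetic grammar are made up purely for illustration. Each grammar rule becomes one function, and the functions call each other the same way the rules refer to each other.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical toy grammar (not Telesto's):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class ExprParser {
public:
    explicit ExprParser(std::string text) : src(std::move(text)) {}

    // Parses the whole input and returns the computed value.
    long parse() {
        long value = expr();
        skipSpaces();
        if (pos != src.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    std::string src;
    std::size_t pos = 0;

    void skipSpaces() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
    }

    // Consumes the character c if it is next in the input.
    bool accept(char c) {
        skipSpaces();
        if (pos < src.size() && src[pos] == c) { ++pos; return true; }
        return false;
    }

    long expr() {
        long value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    long term() {
        long value = factor();
        for (;;) {
            if (accept('*')) value *= factor();
            else if (accept('/')) {
                long rhs = factor();
                if (rhs == 0) throw std::runtime_error("division by zero");
                value /= rhs;
            }
            else return value;
        }
    }

    long factor() {
        if (accept('(')) {
            long value = expr();  // recursion: a factor may contain a whole expression
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        skipSpaces();
        if (pos >= src.size() || !std::isdigit(static_cast<unsigned char>(src[pos])))
            throw std::runtime_error("expected number");
        long value = 0;
        while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
            value = value * 10 + (src[pos++] - '0');
        return value;
    }
};
```

Calling ExprParser("2 + 3 * (4 - 1)").parse() walks expr, term and factor in turn; a real compiler would build AST nodes in those functions instead of computing a number.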

This makes the compiler cross-platform as long as LLVM has a backend for the desired platform (and because it can generate free-standing binaries, that's a lot of platforms). The language designer intends to document the structure of the compiler so that its internal API can be exposed and used by a project to extend the language for a particular program's needs.

Telesto makes every attempt to make the functioning of the language easy to figure out, so there are fewer rules that can cause bugs. For example, there is no cross-promotion between signed and unsigned types: if you want an unsigned int and try to assign to it from a signed short, you need to use the zero-extending cast. The type object is the base of the inheritance tree. Variables of type object can be assigned to anything, but doing so may cause an InvalidCastException to be thrown. If you use a cast, the check is not performed. Also, all object variables are references. There are no pointers and no equivalent of C++'s auto objects (ones on the stack). The compiler may decide to place an object on the stack, and there's even a way to insist to the compiler that the variable be placed on the stack, but the variable is still a reference. That said, for every mechanism there's either an override or a convenience function to help out. There's no reason why an operating system could not be written in Telesto.
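To show why the signed/unsigned rule matters, here is a small C++ sketch of the two ways a signed 16-bit value can be widened to an unsigned 32-bit one. The function names zero_extend and sign_extend are made up for this example; Telesto's actual cast spelling is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension: treat the 16 bits as unsigned, then widen with zero bits.
std::uint32_t zero_extend(std::int16_t value) {
    return static_cast<std::uint32_t>(static_cast<std::uint16_t>(value));
}

// Sign extension: widen as a signed value first, so the sign bit is replicated.
std::uint32_t sign_extend(std::int16_t value) {
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(value));
}
```

For a short holding -1, zero_extend gives 65535 while sign_extend gives 4294967295. C++ picks the second silently; requiring an explicit cast makes the programmer say which one is meant.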

Ultimately, Telesto aims to find the sweet spot between brevity and expressiveness, with only a few easy-to-remember rules that aren't reflected in the written code.


Language

Telesto is a general-purpose, multi-paradigm programming language. This means that while Telesto has features that may be categorized as functional, object-oriented, or structural/procedural, it does not consider itself to be any of those things. While some other languages think there's only one (possibly obvious) way to do it, or that there should be more than one way to do it, in Telesto, once figured out, the way to do it should be easy to write without error and frustration. Note: the following is written as an intro/tutorial for experienced programmers. For an informal description of the language grammar, see the Compiler section.

How about some code blocks, eh? Telesto is a curly-brace language. All code in Telesto is surrounded by "{}" (except the typical one-liner after an if/for/while/etc.). Telesto descends from C in almost this way only. The other C-like aspect is that the semicolon is a statement terminator. Oh, and comments and array initializers.

Functions in Telesto can look like either JavaScript or Pascal (except with curly brackets). Types are inferred from expressions. For the most part, if no type declarations are used, Telesto can look indistinguishable from JavaScript, although Telesto is strongly typed. Sometimes a type declaration is necessary, however. For example, initializing a variable with the literal number 5 will result in a signed 32-bit number (on 32-bit systems). If an unsigned number, or a number of a different size (8 bits, for example), is desired, the type must be specified. Here is a convoluted code sample that computes a factorial using all possible type declarations, all of which are optional (including the preceding colon character):

	function factorial( n : long ) : long
	{
		var result : long = n;
		// Base case: 0! and 1! are both 1 (this also stops the recursion for n < 1)
		if( n <= 1 )
			return 1;
		result *= factorial( n - 1 );
		return result;
	}

The three type declarations in the preceding code sample being optional makes another point: function templates are easy to write in Telesto. Without any type declarations, Telesto will either dynamically use reflection or make the function a template (depending on optimization settings and code attributes). Notice that only one variable is being initialized. If initialization is desired, only one variable can be declared in that var statement. This will be changed eventually.
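For comparison, this is roughly what the "make the function a template" option corresponds to in C++. It is only an analogy written for this page, not output from the Telesto compiler: the type parameter T plays the role of the omitted type declarations.

```cpp
#include <cassert>

// The same factorial with the concrete types replaced by a template parameter.
template <typename T>
T factorial(T n) {
    if (n <= 1) return static_cast<T>(1);  // base case covers 0 and 1
    return n * factorial(n - 1);
}
```

The compiler stamps out one version per argument type used, so factorial(5L) works on long and factorial(10ULL) on unsigned long long without any declarations being repeated.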

Objects

In Telesto, all variables, except those of base number types, are references to objects. This is similar to Java. It is usually up to the compiler whether an object is placed on the stack or the heap; however, the programmer can use auto instead of new to place an object explicitly on the stack.

This effectively means that function arguments are call-by-value if you think of the reference variables as non-alterable pointers. To pass an actual reference to a variable to a function, precede the argument's name in the function declaration with an ampersand. To pass a reference to a copy of an object, use the copy() method of the object class. It has several overloads specifying how deep the copy should go.
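The three ways of passing an object can be imitated in C++ with std::shared_ptr standing in for a Telesto reference. Widget, WidgetRef and the three functions below are hypothetical names invented for this sketch, and copyOf only models a shallow copy.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Widget { std::string name; };
using WidgetRef = std::shared_ptr<Widget>;  // stand-in for a Telesto object reference

// Default passing: the reference is copied, so the callee can change the object,
// but rebinding the parameter does not affect the caller's variable.
void renameAndRebind(WidgetRef w) {
    w->name = "renamed";
    w = std::make_shared<Widget>(Widget{"local only"});
}

// Analogue of the ampersand form: the caller's variable itself can be rebound.
void replace(WidgetRef& w) {
    w = std::make_shared<Widget>(Widget{"replaced"});
}

// Analogue of passing the result of copy(): a separate object is created.
WidgetRef copyOf(const WidgetRef& w) {
    return std::make_shared<Widget>(*w);
}
```

After renameAndRebind(a), a still refers to the same object, now named "renamed". After replace(a), a refers to a different object entirely.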

The typical way to create an object is with the new operator. The object will typically be placed on the heap, but if the compiler can determine that it never leaves the function that created it, it may put it on the stack instead. An object can also be explicitly placed on the stack by creating it with the auto operator. In this case, if it is passed to another function or assigned to a variable that is not local to the function, the copy() method will be implicitly called to create a shallow copy.

Objects that are part of the same inheritance tree can be assigned to each other. This is an implicit cast and will cause extra checking along with an InvalidCastException if the objects don't have a linear ancestral relationship. To avoid the check (and possibly just allow the program to crash), use the ugly looking cast (the only built-in cast) cast<type>(). Objects of type object can be assigned to anything, as object is the parent of all inheritance trees. The same semantics apply.
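The closest C++ analogy to the checked and unchecked forms is dynamic_cast versus static_cast. In this sketch std::bad_cast stands in for InvalidCastException, and the Object, Animal, Dog and Rock types are invented for the example.

```cpp
#include <cassert>
#include <typeinfo>

struct Object { virtual ~Object() = default; };
struct Animal : Object {};
struct Dog : Animal { int barks = 0; };
struct Rock : Object {};

// Checked conversion: throws std::bad_cast if o is not really a Dog.
Dog& checkedToDog(Object& o) { return dynamic_cast<Dog&>(o); }

// Unchecked conversion, in the spirit of cast<type>(): no run-time test,
// and undefined behaviour if o is not really a Dog.
Dog& uncheckedToDog(Object& o) { return static_cast<Dog&>(o); }
```

Passing a Rock to checkedToDog throws, while passing it to uncheckedToDog compiles fine and may simply crash later, which is the trade-off described above.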

Classes / Namespaces

Classes work essentially just like in C++, except that only single inheritance is supported, all class methods are virtual, and methods are implemented directly in the class declaration. Also, object is the parent of any class that does not declare one explicitly. Templates do not require a separate keyword; template parameters are placed in angle brackets directly after the class name. Template type names are merely listed; there is no typename keyword. Following a template type name with a colon and an existing type places a constraint on the types that can be used: the type supplied must inherit from the type after the colon. Here's a contrived example:

	class hashtable< temptype : string >
	{
	public:
		function hashtable()
		{
			/* constructor */
		}
	private:
		static var tables = 0;
	}

Any type used to instantiate the template must now inherit from string. There will be some yet-to-be-determined limits on templates, specifically to prevent template metaprogramming and also the excessive templating (the two are not necessarily the same thing) that helps make the C++ Standard Library produce such annoying error messages.
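In C++ the same kind of restriction has to be spelled out by hand. The sketch below uses static_assert with std::is_base_of to get a similar effect; Text, Label and HashTable are hypothetical names for this example, and the vector is placeholder storage rather than a real hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

struct Text { std::string value; };  // stand-in for Telesto's string class
struct Label : Text {};              // inherits from Text, so it is allowed

template <typename Key>
class HashTable {
    // Compile-time constraint: Key must be Text or inherit from it.
    static_assert(std::is_base_of<Text, Key>::value, "Key must inherit from Text");
public:
    void add(const Key& key) { keys.push_back(key); }
    std::size_t size() const { return keys.size(); }
private:
    std::vector<Key> keys;  // placeholder storage only
};
```

HashTable<Label> compiles, while HashTable<int> stops with the single message given in the static_assert, which is the sort of short error the colon syntax is meant to make possible.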

Namespaces work similarly to Java's packages. The namespace declaration, if any, must be the first non-comment line of code in a source file, except for any shebang line for an interpreter. The declared namespace is the namespace of all code in the source file and cannot be changed. Global variables declared private are visible only in their namespace.

Native APIs / Structs

Telesto is most often compiled to native code. There will eventually be a Native Library Description Language (NLDL) along with various interfaces and functions to assist in marshalling between Telesto objects and functions and native API function parameters. There are, however, some simple rules to make it easier. All base number types are almost always identical to their native counterparts. String objects are automatically converted to C-style NUL-terminated ASCII strings (similar to calling c_str() on a C++ std::string). All of this can be changed, however, if the NLDL for the native API specifies a different transformation.

Unlike Java, Telesto supports structures. Telesto structures work similarly to those in C++, but they can only include base number types, string objects, and references to other structures. These structures are strictly for use with native APIs and other C libraries. If you think you need a struct, but you only call Telesto functions, what you need is a class with everything declared public. Here's an example of a struct:

	struct named_struct
	{
		var count : int;
		var cstring[12] : char;
		var str_ptr : c_string;
		var other_struct : named_struct2; // a pointer to a named_struct2
		static var yet_another : named_struct2; // direct composition
	}

Note the use of the c_string type and the array of characters. Any assignment from a regular string object to a character array will assign into the array from the beginning of the string as many characters as the array can handle. Any assignment from a string object to a c_string will cause the string object to change its internal representation into a c_string, and return a native pointer to it. There will likely be a static hashtable so that such native C-string pointers can be ref-counted independently of their Telesto string objects (a string object will not release any sections of itself (i.e. the beginning) that have been c_str()'d).
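The character-array rule can be pictured with the following C++ sketch. The helper assignToCharArray is made up for this page. It assumes no slot is reserved for a terminating NUL and that unused slots are zero-filled, neither of which is specified above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Copies as many leading characters of source as dest can hold and returns
// how many were copied. Assumption: leftover slots are zero-filled and no
// room is reserved for a terminator.
template <std::size_t N>
std::size_t assignToCharArray(std::array<char, N>& dest, const std::string& source) {
    const std::size_t count = std::min(N, source.size());
    auto rest = std::copy_n(source.begin(), count, dest.begin());
    std::fill(rest, dest.end(), '\0');
    return count;
}
```

With a 12-character array, assigning "hello" copies 5 characters, and assigning a longer string keeps only its first 12.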

Arrays / Pointers

There are no plain arrays in Telesto. That said, the various collections available are very smart about their bounds checking, as the compiler will be. Bounds checking itself, however, can be toggled with a code attribute (either the old style #pragma, or the new style #attr{...} ). Also, pointers are available. They look like templated classes, but are implemented as plain pointers. Methods called on the pointers affect the compiler's use of them in the current function only. Said methods are also available as attribute options to affect the rest of a source file.

I've essentially lied that there are no plain arrays in Telesto. Adding empty square brackets to a variable declaration automatically makes that variable a dynamically sized array (like a C++ vector) of the type otherwise declared. If the brackets are filled with an expression (which need not be constant), the dynamic array gets that initial size rather than the default.

Functional features

Telesto supports limited functional programming. Variables can be of type function, and can be assigned the name of a function or a pointer object cast to type function. Functions can therefore be passed to each other easily. When declaring a function variable, the prototype of the function is optional, but including it makes it easier for the compiler to type check (and prevents slow templated results). In such a prototype, no name is provided, and it must be complete (argument names can be left out, but no types can be). Functions can be nested, and any time a function takes a parameter that is a function, that function can be written either inline like any other function, or shortened using the function(params) -> expr syntax.

When objects are caught in a closure, only the references are caught. This is probably the only non-intuitive special rule of Telesto. If a copy of an object is desired, a different variable must be initialized with exactly the result of a copy() method call. The compiler will initialize all such variables of a closure before creating it. Note that such variables cannot be placed on the stack; such is a limitation of closures without multi-stack continuations. Basic number types are always closed upon by value copy. A closure is implemented as an object containing a pointer to the closure code and the various "closed-upon" data (number values and object references). Such closure objects are reference counted; this is made possible by such objects being immutable, which makes circular references impossible. Note that object references in a closure can leak memory if no form of garbage collection is being used. It is recommended that at least reference counting (see Garbage Collection) be enabled for programs using extensive functional features.
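The difference between catching a reference and catching a copy can be shown with C++ lambdas, again using std::shared_ptr as a stand-in for a Telesto reference. Counter, captureReference and captureCopy are hypothetical names for this sketch only.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct Counter { int value = 0; };

// The closure holds the reference: it and the caller share one Counter.
std::function<int()> captureReference(std::shared_ptr<Counter> c) {
    return [c]() { return c->value; };
}

// A copy is made into a separate variable before the closure is created,
// so later changes to the original are not seen.
std::function<int()> captureCopy(const std::shared_ptr<Counter>& c) {
    auto snapshot = std::make_shared<Counter>(*c);
    return [snapshot]() { return snapshot->value; };
}
```

If the counter is changed after both closures are built, the first closure reports the new value and the second still reports the old one.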

Lists

Currently, the only notation for a list is equivalent to C's array initialization syntax. Unlike C, though, literal lists in Telesto can be assigned to any collection object.

Properties

Properties can only be a member of a class. They provide get/set functions for a value, while making access appear as a member variable. Syntax looks like the following:

	property size :long
	{
		get { return size; }
		set(value) { 
			if( value < 0 )
				size = 0;
			else
				size = value;
		}
	}

Parentheses are optional after get. A type declaration for the set parameter is not valid, because a property MUST have a declared type.
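C++ has no property syntax, so the nearest equivalent is a getter/setter pair over a private field. The sketch below mirrors the clamping logic of the example above; the Buffer class and the size_ backing field are assumptions made for illustration.

```cpp
#include <cassert>

class Buffer {
public:
    // Plays the role of "get".
    long size() const { return size_; }

    // Plays the role of "set(value)": negative sizes are clamped to zero.
    void size(long value) { size_ = value < 0 ? 0 : value; }

private:
    long size_ = 0;  // hypothetical backing field
};
```

Callers have to write b.size(7) and b.size() here, whereas a property lets the same code read as plain member access.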

Delegates

Telesto also provides the delegate statement as a way to notate aliases for functions. For example, one can write a "size" function for a class, and also provide a length function by using a delegate statement. One can also delegate from within a member variable; useful for simulating multiple inheritance. The full delegate syntax has not been finalized, but some of the more basic uses look like:

	delegate size -> length; // old -> new aliased name
	delegate buffer.size -> buffer_length; //"buffer" is a member variable

Operator Overloading

Operator overloading works much like C++, except that an operator overload function must always be a member of a class. Said class must be the explicitly declared type of the first parameter except in the case of binary operators when the function represents the opposite ordering of operands to the operator. All operator overload functions are always implicitly static.

The cast operator can also be overloaded. In this case, the name of the function is ~cast, it must take no parameters (an empty parameter list), and the type being cast to MUST be declared as the return type.

Garbage Collection

There are three garbage collection modes that can be chosen for a Telesto program. The first is manual control. This works just like C++: you must release all memory yourself using the delete operator. You can also choose to use destructors (function ~delete()). The second garbage collection mode is reference counting. You still have full control over deletion of objects, but extra checking will be generated by the compiler to try to detect unused objects by having each object maintain a count of references to it. This adds overhead in speed and code size, but generally this is the best mode to use. The third available garbage collection mode is full automatic garbage collection. In this mode, all delete operator usage is ignored, and the destructor is used as a finalizer.
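The reference counting mode behaves much like std::shared_ptr in C++, which is used below as a stand-in. This is only an illustration of the idea, not how the Telesto runtime is implemented; Tracked and destroyedOnLastRelease are made-up names.

```cpp
#include <cassert>
#include <memory>

// Records in an external flag when its destructor has run.
struct Tracked {
    explicit Tracked(bool& flag) : destroyed(flag) { destroyed = false; }
    ~Tracked() { destroyed = true; }
    bool& destroyed;
};

// Returns true if the object survived while one reference remained and
// was destroyed as soon as the last reference was released.
bool destroyedOnLastRelease() {
    bool gone = false;
    auto first = std::make_shared<Tracked>(gone);
    auto second = first;   // count is now 2
    first.reset();         // count drops to 1, object stays alive
    const bool aliveWithOneRef = !gone && second.use_count() == 1;
    second.reset();        // count reaches 0, destructor runs
    return aliveWithOneRef && gone;
}
```

The extra bookkeeping on every copy and release of a reference is the speed and code size overhead mentioned above.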

Eventually it is hoped that code attributes will allow the various forms of garbage collection to be applied more granularly than the whole program.


Compiler

The compiler is being written in C++. Once the compiler is feature-complete, it may be rewritten in Telesto so as to be self-hosting. An informal grammar is currently the most complete technical specification of the language, although it is outdated.

Status

Scanner

The Scanner is mostly complete. The major omissions include character literals and user defined delimiters for string literals. The only other thing to add would be to track the current column, but that's not an essential feature.

Parser

The Parser currently parses most of what constitutes/passes for functions in Telesto, which is enough to move on to further passes.

Symbol Table Builder

The Symbol Table Builder takes a list of file ASTs and builds the Global Symbol Table, including the standard library and any installed libraries. It is at this time that any duplicate items are found (and errors added to the Error Manager); compilation continues, however, just without any of the duplicate items appearing to exist to any downstream parts of the compiler.
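A much simplified sketch of that duplicate handling follows. It is not the compiler's actual code: each file's AST is reduced to a list of top-level names, BuildResult and buildGlobalSymbolTable are invented for the example, and it assumes that every declaration of a duplicated name is hidden from later passes.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct BuildResult {
    std::set<std::string> symbols;    // names visible to downstream passes
    std::vector<std::string> errors;  // stand-in for the Error Manager
};

// Each inner vector holds the top-level names declared in one source file.
BuildResult buildGlobalSymbolTable(const std::vector<std::vector<std::string>>& files) {
    std::map<std::string, int> counts;
    for (const auto& file : files)
        for (const auto& name : file)
            ++counts[name];

    BuildResult result;
    for (const auto& entry : counts) {
        if (entry.second == 1)
            result.symbols.insert(entry.first);
        else  // report the duplicate and keep going without it
            result.errors.push_back("duplicate symbol: " + entry.first);
    }
    return result;
}
```

Because the builder keeps going after recording an error, one run can report every duplicate instead of stopping at the first.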

Code Generator

The code generator takes the same list of file ASTs, and uses LLVM to actually compile the code. Any names referenced by the code being compiled are resolved using either Local Symbol Tables or the Global Symbol Table. Templates are queried from the Global Symbol Table, and the code generator may pause to compile a template if the template has not yet been instantiated in the desired way.

The code generator takes on some of the traditional tasks of Semantic Analysis, as it checks basic type promotion and casting on the fly. Checking whether types are the same or are related (i.e., by inheritance) is not so unrelated to code generation as to require its own pass (in fact, with templates, it would be rather difficult to implement as its own pass).

Metacode Generator

Not sure what to call this, but this phase may likely be what generates type information to be used with reflection.

Other Modules

Currently the design for the parser only includes limited error checking. The compiler will terminate execution as soon as it finds an error. A module to be written in the future should compile the list of found errors. It should have the following features:


Contact

If you want to join the project, you can do so through the SourceForge project page. To send questions, comments, or suggestions, feel free to use the project page or send me email through the SourceForge user alias address.


Links

This is just a series of very loosely sorted links to sites of interest:

Useful Libraries / Toolkits

LLVM - Compiler back-end with support for heavy optimization and many platforms.
GTK+ - Not used in a compiler, but highly recommended for cross-platform user interface development