Design Document

This document outlines several decisions and design considerations for the SquirrelJME Virtual Machine. This document should reflect the most recent path that SquirrelJME will be taking in its design. Note that the design may change and this is not forever concrete (that would be foolish).

For day to day notes on my current thought process you may look within the developer notes.

For the current route of development see the release route document.

Because Java ME is different from Java SE, there are some considerations, advantages, and disadvantages to consider when reading this document. Note that this document is mostly within the scope of Java ME and not Java SE.

There are also disadvantages however:

For specific APIs, one should read the Project Scope document which outlines the APIs which exist for Java ME and whether they would be implemented in SquirrelJME or third party vendors.

Java SE vs SquirrelJME

This section is more a bit in depth of the differences between SquirrelJME and Java SE.

Memory

Java SE has a standard garbage collector which in general will free objects when they are no longer referenced using a mark and sweep algorithm. In most cases it will try to avoid freeing memory in the event that it could be used again, also doing major garbage collection sweeps can be computationally expensive. Despite being very virtual machine specific, this gives Java the appearance of using quite the amount of memory (it has historically and currently is seen as a memory and resource hog) especially if code running on it has not be designed in an optimal way (which is usually the case). The garbage collector in large instances may take awhile to execute and can take considerable resources. The garbage collector generally is designed to run in another thread and hopefully can concurrently garbage collect objects making use of extra CPUs to decrease cleanup time. In general this garbage collector is made for speed where it only affects program execution speed when needed. Recent JVM advancements do have stack allocations which do not operate with the heap at all, which makes cleanup much easier if these optimizations can be used.

SquirrelJME on the other hand will try to use the lowest amount of memory it can by freeing objects from memory as soon as possible (in well designed code, one that does not use circular references). Once it is known that a given object can be freed from memory it will be freed to make up space for other programs and objects (within SquirrelJME). So virtually in most cases SquirrelJME will use the minimum memory footprint needed to run it. Although the lowest amount of memory will be used, a reference counted garbage collector has some slight overhead in that it needs to count objects and be able to free the objects when they are no longer counted. Parts of code which heavily use objects will run a bit slower due to the counting required. This is definitely a trade-off as it gives memory a more important consideration than speed. Since SquirrelJME targets low end systems, this extra memory can be very important. However, unlike Java SE all allocations are on the heap since the size of the stack may be very small.

Programming Language

SquirrelJME is written entirely in Java. This means that it only requires a Java virtual machine and a Java compiler to be built. There are also no other dependencies apart from what is within SquirrelJME itself, it is entirely standalone and self contained.

One may ask why Java and not another language such as C? Well, Java is a much simpler language compared to C when it comes to syntax (C has the preprocessor, structures, pointers, typedefs, function pointers, etc.). One main advantage of Java is the consistency of the code.

One misconception about using Java is that it is impossible to use native code or one will require and assembler to assemble assembly code for things which Java cannot do. This is not the case for SquirrelJME. The major and most important part of SquirrelJME is the compiler which can turn Java byte code into native machine code. Since the compiler is very much integrated into SquirrelJME this means that certain aspects of interacting with the host environment can be accessed by changing compilation for certain aspects in a way where it remains compatible with Java but also provides native access when needed. Native access is provided by replacing method calls to special static methods within a special class by the appropriate machine code rather than invoking a method call. These special rewrites affect everything within a special package. This is used to provide support for multiple operating systems and environments without causing name collisions (since you cannot have classes with the same name in multiple projects when they are merged together).

Portability

I intend for SquirrelJME to be very portable so that it can be built for and built on a large number of systems.

Self Hosting

I intend SquirrelJME to be self hosting in that it can build itself.

In the future a Java compiler will be written which can run on SquirrelJME itself and allow building and compiling itself from source. This would also allow other programs to be built from source and can be used as a self contained Java development environment.

Environment

This details the environment in which SquirrelJME operates within the host operating system.

APIs

This describes the design of standard implementations.

javax.microedition.lcdui.*

The LCDUI API is used by a large number of older J2ME applications to display widgets and graphics on the screen. On SquirrelJME everything is done by SquirrelJME itself on its own framebuffer which it draws into, this makes portability easier.

javax.microedition.lcdui.Image

Since many systems may have varying image formats and supported native pixel formats, instances of this class may be specific to the given hardware and may have the screen display limitations. Images generally may use 32-bits to store their pixel data, but some implementations may use images with a lower bit density (such as 256 color images).

Pathname Handling

Instances of the Path class will be strictly limited to the limitations of the host system and will not provide support for allowing limitations to be skirted as that complicates compatibility.

As an example for DOS, there are severe filename limitations such as a maximum of 8 characters for a file name and 3 characters for an extension along with other naming restrictions. As such getting a path which does not produce a valid DOS pathname will result in an exception being thrown.

Compilation Time

This section contains information related to the operation of the Ahead-Of-Time Compiler and the Just-In-Time Compiler.

Virtual Machine

This section contains information related to the target independent virtual machine at run-time.

Garbage Collection

For simplicity the garbage collector is a reference counter with sweeping when no more memory is available (or GC is called manually). As such, cyclic object references will not be freed unless one or both directions are weakly referenced (using WeakReference). This means that the following situations would permit both objects to be potentially collected:

Although reference counting may increase lock contention on the CPU and memory buses it simplifies the design greatly by not requiring complex garbage collection algorithms. In most cases with reference counting, SquirrelJME is capable of using always a minimum footprint of memory depending on whether that memory should be freed to the operating system or within SquirrelJME's own memory for other programs running in it.

Objects will have two counts: The number of strong references pointed to this object and the number of weak references pointed to this object. These two counts determine if an object can be garbage collected. If any object has a strong reference count that is non-zero then it will be garbage collected as long as it has zero weak count also. Thus, an object which is never referenced at all will be garbage collected. In a standard Java VM, a WeakReference in most cases will only give an object if it has at least one strong reference.

Strongly reached, weakly reached (WeakReference)

In Java ME with Java being garbage collected, there are two types of references to objects which affects how garbage collection is performed. Similar to other languages this can be seen as a smart pointer or a special kind of reference counting pointer.

A strong reference is a normal reference to an object, like a value which is placed within a field (a static field or an instance field) or a local variable in a method. Any objects that are strongly referenced cannot be garbage collected because they are used.

A weak reference is another kind of reference to an object which is provided by the WeakReference class. Essentially it does not have a strong bond to the object it points to and that object may be garbage collected and return null as long as no strong references point to it. Weak references cannot really be used as a cache due to the way the garbage collector works, for a cache a SoftReference should be used by Java ME does not have such a class. At least with weak references, the virtual machine will in the most average case always use the least amount of memory. In short, weak references are garbage collected as soon as possible. If used for memoization it will not have the best intended effect of reducing calculations but it would reduce the memory footprint.

Strong Count of Zero, Non-Zero Weak Count

When an object has no strong references and only weak references, that means that it can soon be garbage collected. For simplicity when WeakReference detects that a target object has no strong references to it, it will detach itself from that object, reduce the weak reference count, and return null. If a WeakReference is no longer strongly referenced it will also cause a detach to occur.

If the system is out of memory then all objects will be iterated and any objects which only have weak references to them will be removed.

Strong Count of Zero, Zero Weak Count

This object can be garbage collected, it will be removed and that memory will be made available for other allocations.

Tasks

MIDP 3 allows multiple programs to be ran at the same time (provided they are actual different MIDlets). One thing to simplify the design of SquirrelJME without needing much work.

Synchronization and Locks

Java naturally provides synchronization which is used for writing code which is thread safe. Since synchronization and monitors are very intertwined, the design will reference future information. Since there are a number of different ways different CPUs and targets could have thread safety, those details have been removed and replaced with easy to determine common means.

Loop Threading for Multi-Threading

Since Java is multi-threaded and SquirrelJME may run on top of a number of system which may have different threading models, the following differences determine what happens when a loop needs to be repeated due to a failed operation.

Preemptive

The loop should perform a given number of checks, then once a certain threshold is reached a longer duration sleep should be entered so that CPU cycles are not spent deadlocking. Essentially it waits upon a signal where possible.

Cooperative

The loop should yield and not attempt another try (because only a single thread can run at one time) that way another thread which is able to be ran can executed, potentially one which controls the monitor for the given object.

It is possible that an internal threading manager can determine the best thread to choose for consecutive execution.