# Design Document This document outlines several decisions and design considerations for the SquirrelJME Virtual Machine. This document should reflect the most recent path that SquirrelJME will be taking in its design. Note that the design may change and this is not forever concrete (that would be foolish). For day to day notes on my current thought process you may look within the [developer notes](assets/developer-notes/index.mkd). For the current route of development see the [release route](route.mkd) document. Because Java ME is different from Java SE, there are some considerations, advantages, and disadvantages to consider when reading this document. Note that this document is mostly within the scope of Java ME and not Java SE. * Reduced library subset. * The main library is much smaller which means it will load faster and use less memory. Less memory means it can run on smaller/weaker systems. * No reflection. * There is `Class.forName()` and `Class.newInstance()`, however they are trivial to support. * More secure because access checks do not have to be performed at run-time to determine if it is permissible to access an object. * Random object fields (and finals) can be cached because their reference or primitive values will never change. * More secure because for example changing `Boolean.TRUE` to be `false` may cause security exploits with code that relies on it being `true`. * The resulting virtual machine is smaller because information such as the types of fields and methods that exist and their type information does not have to be included within the virtual machine. This reduces the bloat within the virtual machine. * Produces faster code because `final` variable and especially `static final` values can directly be accessed rather than requiring a read of a field pointer value. * No finalizers (the `finalize()` method). * Finalizers are very integrated with how the garbage collector works. * It is never known when they will actually be called (if ever). * Timing attacks could be performed when finalizers are called between garbage collection runs. * No serialization. * No objects are `Serializable`. * Serialization uses virtual machine magic to access internal details, since there is no reflection the information that would be used for serialization does not exist. * The `transient` keyword becomes obsolete. * Simplifies implementation. * No `invokedynamic` instruction. * Simplifies virtual machine operation, at the cost of lambdas (which could be smartly wrapped in anonymous classes by a compiler). There are also disadvantages however: * Without reflection, one cannot include plugins dynamically from the program. * However, Java ME 3 (or so) added LIBlets which may be optional and provide a slight alternative to plugins. These however are fixed to the JAR/JAD which means that the difficulty is increased. * Alternatively, Java ME 8 has `ServiceLoader` which enables JARs to potentially be merged to provide services. Also using `Class.forName()` and `Class.newInstance()`, plugins using a common interface can be initialized when they are not directly known. For specific APIs, one should read the [Project Scope](scope.mkd) document which outlines the APIs which exist for Java ME and whether they would be implemented in SquirrelJME or third party vendors. # Java SE vs SquirrelJME This section is more a bit in depth of the differences between SquirrelJME and Java SE. ## Memory Java SE has a standard garbage collector which in general will free objects when they are no longer referenced using a mark and sweep algorithm. In most cases it will try to avoid freeing memory in the event that it could be used again, also doing major garbage collection sweeps can be computationally expensive. Despite being very virtual machine specific, this gives Java the appearance of using quite the amount of memory (it has historically and currently is seen as a memory and resource hog) especially if code running on it has not be designed in an optimal way (which is usually the case). The garbage collector in large instances may take awhile to execute and can take considerable resources. The garbage collector generally is designed to run in another thread and hopefully can concurrently garbage collect objects making use of extra CPUs to decrease cleanup time. In general this garbage collector is made for speed where it only affects program execution speed when needed. Recent JVM advancements do have stack allocations which do not operate with the heap at all, which makes cleanup much easier if these optimizations can be used. SquirrelJME on the other hand will try to use the lowest amount of memory it can by freeing objects from memory as soon as possible (in well designed code, one that does not use circular references). Once it is known that a given object can be freed from memory it will be freed to make up space for other programs and objects (within SquirrelJME). So virtually in most cases SquirrelJME will use the minimum memory footprint needed to run it. Although the lowest amount of memory will be used, a reference counted garbage collector has some slight overhead in that it needs to count objects and be able to free the objects when they are no longer counted. Parts of code which heavily use objects will run a bit slower due to the counting required. This is definitely a trade-off as it gives memory a more important consideration than speed. Since SquirrelJME targets low end systems, this extra memory can be very important. However, unlike Java SE all allocations are on the heap since the size of the stack may be very small. # Programming Language SquirrelJME is written entirely in Java. This means that it only requires a Java virtual machine and a Java compiler to be built. There are also no other dependencies apart from what is within SquirrelJME itself, it is entirely standalone and self contained. One may ask why Java and not another language such as C? Well, Java is a much simpler language compared to C when it comes to syntax (C has the preprocessor, structures, pointers, typedefs, function pointers, etc.). One main advantage of Java is the consistency of the code. One misconception about using Java is that it is impossible to use native code or one will require and assembler to assemble assembly code for things which Java cannot do. This is not the case for SquirrelJME. The major and most important part of SquirrelJME is the compiler which can turn Java byte code into native machine code. Since the compiler is very much integrated into SquirrelJME this means that certain aspects of interacting with the host environment can be accessed by changing compilation for certain aspects in a way where it remains compatible with Java but also provides native access when needed. Native access is provided by replacing method calls to special static methods within a special class by the appropriate machine code rather than invoking a method call. These special rewrites affect everything within a special package. This is used to provide support for multiple operating systems and environments without causing name collisions (since you cannot have classes with the same name in multiple projects when they are merged together). ## Portability I intend for SquirrelJME to be very portable so that it can be built for and built on a large number of systems. ## Self Hosting I intend SquirrelJME to be self hosting in that it can build itself. In the future a Java compiler will be written which can run on SquirrelJME itself and allow building and compiling itself from source. This would also allow other programs to be built from source and can be used as a self contained Java development environment. # Environment This details the environment in which SquirrelJME operates within the host operating system. ## APIs This describes the design of standard implementations. ### `javax.microedition.lcdui.*` The LCDUI API is used by a large number of older J2ME applications to display widgets and graphics on the screen. On SquirrelJME everything is done by SquirrelJME itself on its own framebuffer which it draws into, this makes portability easier. #### `javax.microedition.lcdui.Image` Since many systems may have varying image formats and supported native pixel formats, instances of this class may be specific to the given hardware and may have the screen display limitations. Images generally may use 32-bits to store their pixel data, but some implementations may use images with a lower bit density (such as 256 color images). ## Pathname Handling Instances of the `Path` class will be strictly limited to the limitations of the host system and will not provide support for allowing limitations to be skirted as that complicates compatibility. As an example for DOS, there are severe filename limitations such as a maximum of 8 characters for a file name and 3 characters for an extension along with other naming restrictions. As such getting a path which does not produce a valid DOS pathname will result in an exception being thrown. # Compilation Time This section contains information related to the operation of the Ahead-Of-Time Compiler and the Just-In-Time Compiler. # Virtual Machine This section contains information related to the target independent virtual machine at run-time. ## Garbage Collection For simplicity the garbage collector is a reference counter with sweeping when no more memory is available (or GC is called manually). As such, cyclic object references will not be freed unless one or both directions are weakly referenced (using `WeakReference`). This means that the following situations would permit both objects to be potentially collected: * A has a strong reference to B, B does not reference A. * A has a weak reference to B, B does not reference A. * A has a strong reference to B, B has a weak reference to A. * A has a weak reference to B, B has a strong reference to A. * A and B both have weak references to each other. Although reference counting may increase lock contention on the CPU and memory buses it simplifies the design greatly by not requiring complex garbage collection algorithms. In most cases with reference counting, SquirrelJME is capable of using always a minimum footprint of memory depending on whether that memory should be freed to the operating system or within SquirrelJME's own memory for other programs running in it. Objects will have two counts: The number of strong references pointed to this object and the number of weak references pointed to this object. These two counts determine if an object can be garbage collected. If any object has a strong reference count that is non-zero then it will be garbage collected as long as it has zero weak count also. Thus, an object which is never referenced at all will be garbage collected. In a standard Java VM, a `WeakReference` in most cases will only give an object if it has at least one strong reference. ### Strongly reached, weakly reached (`WeakReference`) In Java ME with Java being garbage collected, there are two types of references to objects which affects how garbage collection is performed. Similar to other languages this can be seen as a smart pointer or a special kind of reference counting pointer. A strong reference is a normal reference to an object, like a value which is placed within a field (a static field or an instance field) or a local variable in a method. Any objects that are strongly referenced cannot be garbage collected because they are used. A weak reference is another kind of reference to an object which is provided by the `WeakReference` class. Essentially it does not have a strong bond to the object it points to and that object may be garbage collected and return `null` as long as no strong references point to it. Weak references cannot really be used as a cache due to the way the garbage collector works, for a cache a `SoftReference` should be used by Java ME does not have such a class. At least with weak references, the virtual machine will in the most average case always use the least amount of memory. In short, weak references are garbage collected as soon as possible. If used for memoization it will not have the best intended effect of reducing calculations but it would reduce the memory footprint. ### Strong Count of Zero, Non-Zero Weak Count When an object has no strong references and only weak references, that means that it can soon be garbage collected. For simplicity when `WeakReference` detects that a target object has no strong references to it, it will detach itself from that object, reduce the weak reference count, and return `null`. If a `WeakReference` is no longer strongly referenced it will also cause a detach to occur. If the system is out of memory then all objects will be iterated and any objects which only have weak references to them will be removed. ### Strong Count of Zero, Zero Weak Count This object can be garbage collected, it will be removed and that memory will be made available for other allocations. ## Tasks MIDP 3 allows multiple programs to be ran at the same time (provided they are actual different MIDlets). One thing to simplify the design of SquirrelJME without needing much work. ## Synchronization and Locks Java naturally provides synchronization which is used for writing code which is thread safe. Since synchronization and monitors are very intertwined, the design will reference future information. Since there are a number of different ways different CPUs and targets could have thread safety, those details have been removed and replaced with easy to determine common means. ### Loop Threading for Multi-Threading Since Java is multi-threaded and SquirrelJME may run on top of a number of system which may have different threading models, the following differences determine what happens when a loop needs to be repeated due to a failed operation. #### Preemptive The loop should perform a given number of checks, then once a certain threshold is reached a longer duration sleep should be entered so that CPU cycles are not spent deadlocking. Essentially it waits upon a signal where possible. #### Cooperative The loop should yield and not attempt another try (because only a single thread can run at one time) that way another thread which is able to be ran can executed, potentially one which controls the monitor for the given object. It is possible that an internal threading manager can determine the best thread to choose for consecutive execution.