This weeken is holiday weekend, with Monday being the 4th of July.
I should rework the documentation a bit and have a completely standalone port section of sorts. Then I can have user and developer bits in their specific sections.
One thing I need when it comes to the class file decoder is remembering and storing the class flags for potential usage later. I will need to document the blob format in a way where it allows blobs to be output without needing future details. So as such this means that the table of contents in a blob will be last.
I should have a class which can be given a byte buffer or some other class
which is used to read from say an
byte array to access the
details within a blob.
As alternative to a table of contents kind of thing, I can have a linked list
of sorts through the executable. However, backlinks would not operate at all.
I would say that for simplicity, the blob can be directly memory mapped and
have its structure accessed directly. Also it may be reasonable to have a case
where there are two binaries, one which contains the raw data and another
which contains the table of contents. If the table of contents remains apart
from the executable code they can be linked into the binary using different
means. Alternatively, instead of an
OutputStream passed into the SSJIT,
there is instead a result of a compilation. So this "smarter" class would have
stuff such as "create new field" and other such things. Then an implementation
of the given writer can be used. Then this way the
SSJIT is not locked to
a single output format but one which could be wrapped using multiple means.
Then if the output format is not that nice, it can completely be replaced with a better one without changing any code. So I could literally have an output which writes ELF binaries.
Essentially what could happen for example when it comes to Linux, is that classes could be compiled to ELFs, then when they need to be opened they can be linked in as such. Although, I am not sure if Linux supports dynamic linking of libraries that exist only in memory. Looking into it, it does not. So when it comes to generation, I will need to use blobs and such.
When it comes to the output, I should support multiple classes being output
into a single
SSJITOutput. This way when working with the initial SquirrelJME
binary which the user would use, all classes exist and are pre-merged into it.
The native code would have to be generated in a way where all code acts
together as a single unit. So in this case, entire JARs will be merged into a
single fragment. However, I will need to devise a means where there can be
multiple namespaces within the output (for multi-JAR support).
For example, for the ELF format all symbols and such will be generated and placed in an output ELF file. Although doing this within an ELF may be complex because there may be some things which are unknown (such as how large a given section is).
So I suppose for simplicity, keep with the blobs and instead of a container. Ther container would be the ELF which loads and initializes the classes stored within the blob. But still allow for multiple JARs and resources to be namespaced in a single blob.
Thinking about it, when it comes to generating the bootstrap code, it will be very difficult to have a split apart JIT when it comes to building the initial binary which would generate the machine code. Another consideration is the number of objects that will need to be initialized for every class. So I suppose I need a separated native code generator, which there is a single instance for (with functions as I currently have it) where it is then attached and associated with a state tracker. This way there is a single instance, if multiple functions need to interact with the same state, they can use the passed state tracker instead.
SSJIT is given a
NCGManager will then create any
associated output with a given state and singular set of instances for
Also going to place anything related to the JIT in
net.multiphasicapps.squirreljme.jit, then branch from that for example. The
code generators will be a bit higher level. I would suppose below the
code generator there would be the assembler.
Then this way, when it comes to generating native code I can have similar means of generating. I will have variants, but instead of operating system modifications of functions, I instead will have modifications of the code generator. The assembler, code generator, and JIT should at best be of a single state which will enable only a single assembler or other instance to exist at a time. This would reduce the allocation count. I will suppose that for the assembler, it will be given an output stream. The only consideration is that there may be the potential for sub-variants of variants (such as big endian and such).
So what I will need is a means where I can easily specify all of the flags
which may change how an assembler operates, such as which features are
available and such. I suppose what I can do is have a standard setup of sorts.
There would be the variant, but that would define a standard set of flags
which are supported by a system. So
powerpc32+g4 will turn into
powerpc32+g4,altivec,big. A variant will by default give a set of flags and
otherwise which are set to on or off. So the big flag for big endian would
be set by default, so having a suppose
~ before it so it is ~big would
disable it, even if it were on by default with a given variant. The
variant would just choose a set of flags to use. So I suppose for simplicity,
a given CPU could have their variants mapped to enumeration entries
potentially as a kind of additive set of extra variants and such.
I would suppose instead of this, that I should have an assembler configuration which is used to initialize the assembler.
Actually, having it where the assembler has add operations with types and such
will be a bit complex. What I could have instead is a kind of pipeline of
sorts similar to Java 8's streams. So basically there will be register
selection, essentially something such as
source(RegisterType.INT, "r1") which
will act as the source register. Then there will be the same for the
destination. Then following that, there will be an execution which performs the
operation (such as an
add). The native part of the assembler can check if the
given operation is valid between two given registers.
I will also need some kind of native assembly provider functions, ones that
could be given a single instance, where they are given the
options can be read from (such as the selected source/dest registers), the
configuration, and the output stream. However the constant handling of this
would be a bit ugly in a way. So I suppose the assembler becomes abstract.
However a problem with this is for each architecture I could only ever have a
single assembler. The assembler would never be patchable and third parties
would never be able to add support for an unsupported CPU.
So I suppose something similar as I have previously talked about. Basically I
will have an
ASMGenlet which would be an interface. Multiple genlets could be
created and have a given priority. The first argument to the genlet would be
the assembler itself. However, that would create much complexity. There would
be multiple instances of genlets created for every method. This definitely
would not be fast. Then genlets in every case would have to check which CPU
was selected for code generation. So when it comes to code generation, it will
definitely not be small, nor will it be fast, nor will it be simple. I suppose
I am thinking of far too complex of a solution. So instead, just have an
Assembler with the source and destination as I have thought up of. If done
correctly, I can even have it where a single assembler could be created and
shared across all JIT instances (there would need to be a state reset). Since
some operations might not be supported by a given CPU, I would need to have
failure cases that throw an unsupported assembly operation.
I can then layer the code generator on top which performs virtualized operations in the event the assembler does not support it (so if hardware float is not supported, then it will be handled using software instead by replacing the code with method calls potentially). However, the layering of the JIT goes deep through all levels. Having three levels will keep things complicated. So instead of three things, the JIT should be merged into one. When it comes to support for native architectures I can have a generator class and a set of registers which are supported by a given CPU. The operating system variant could change how registers are used and such. Although it is still complex.
One thing I do not want to have is a few thousand different implementations of the native JIT for a given CPU. I can have an architecture specific one for a given operating system however. But generally I only want a single for each architecture, so the PowerPC one would target only PowerPC and there would be no other implementation of a PowerPC assembler.
Ultimately I could support only a single CPU variant and potentially if it has
floating point or not. I could ignore vectorization (that is a whole complex
another subject anyway). So generated code for
PowerPC would quite literally
target the G1 (603) processor for the most part (the original). However some
CPU architecture have better instructions in later generations or have
removed older ones.
So right now I am going in circles. I want a small and fast JIT, but one that
is simple to write and does not consist of overly complex code. So basically
something similar to
SSJIT except there is one instance of it and it is
given an input class to transform (instead of one instance per class). It
would not be multi-threaded capable. I would also say that support for a given
CPU and its variant can be used. Support for a given architecture would extend
the actual JIT class to provide the native functionality. To keep internal
details hidden, package private will be used in many places. This way I can
have the class decoder be external and call into the JIT.
Actually a single instance
JIT would be very ugly.
I can have a slight JIT modifier be used which if found will be passed a JIT. It can then see which instance the JIT is and possibly perform register banning or other minor tweaks.