07:54
Thinking about it, ByteArrayOutputStream
could use ByteDeque
which would
minimize buffer allocations. I just need co-dependencies for that however.
07:58
So what I can do following this is create a class which holds the individual blob buffers and creates a sort of table of contents that I can use. One thing the output will need is some very simple bootstrap code which sets some values and then jumps to the entry point. Just a single variable is needed and all it would do is be told where the combined namespace blob is located. So that would be about 4 or 5 MIPS instructions.
15:12
I should probably use sections regardless even though they technically are not needed. This would be for objdump purposes.
15:47
QEMU just gives me invalid argument when trying to load the binary in user mode Linux. I suppose it may be due to the lack of an entry point.
16:39
The current blob format is not going to work out. What I need instead is an
executable output of sorts wrapped to the cache manager which stores the
executable data in memory (can use ByteDeque
for this). Having code and
data together will be quite nasty and lead to vulnerabilities with
specially crafted classes in the event there is an exploit. I would also want
a kind of symbol table that I can use to debug the actual output binaries if
the object format supports such things.
16:43
So GenericClassWriter
just writes to a different format, say a
GenericExecutable
kind of thing. Then the ELF loader would use such a thing
where it provides code, symbols, and other such things. GenericExecutable
would best be an interface although there can be a default implementation. I
would say that the best result would be to keep that in jit-generic
. Then
the generic binary executable handlers in the VM can use a basic interface for
handling the blob format. Thinking about it, it can still be placed within the
cache, except the generic writer is a bit more sane (it keeps code and data
separated and has symbol information as needed). Then when that writer is
closed the stuff is written to the cache form. Then when it comes time to link,
a generic executable handler parses the raw byte data to provide access to the
binary data as appropriate (such as data and code sections). So really the
thing that needs to be changed is the GenericNamespaceWriter
and the
GenericClassWriter
. Then this way, the conditions of the generic writer does
not interfere with the writers in other cases.
17:01
I also need a ByteDequeOutputStream
so I can add the bytes to the end of it
for the EDOS usage.
17:10
And ByteDequeOutputStream
was easy to write.
18:40
I should likely pre-encode method descriptors and such, that would be much more efficient rather than parsing them at run-time. I should also use a generic table that can be used for linking and such. This would be like a giant link table which refers to classes and such. The actual generated code would use such a table to refer to other things. When a namespace is loaded this would be intialized to point to the appropriate structures and such. This information would really be needed for each process that runs on the JVM however. However, not all of it is needed at once. One thing which would be useful is if I could statically do everything and not really require initialization much at all. So when the eventual ELF is created, all of the classes statically refer to another directly in memory. The initial JVM could use a prelinked table with the correct classes linked in and such. I would suppose that storage of statics would be on a per-namespace basis. Static fields are quite literally known at compile time. So instead of initializing a class and determine where statics are located I can quite literally just refer to a given offset in a structure directly. The only thing that would be needed to actual initialization of the data (for anything that is constant). So there would be a template. Then code that refers to static variables would just reference the static namespace storage area via a pointer. One issue however is that namespaces would need to be linked together somehow.
18:49
Also, classes can refer to static variables that exist in other namespaces, so everything would essentially be done by a set of pointers. So essentially, each namespace is given a size and a data template which stores all static values. This would essentially be an array of pointers to static variable locations.
18:55
This could actually be done for classes and methods also, refer to them just by
pointer. The only issue are fields, however due to single inheritence this can
be done by using offset pointers. So when a field is read, it can just do
*(object_ptr + field_offset)
to access the data.
18:59
Just at the entry of every method there will need to be a way to point to the currently active method so it knows the import table to be used.
19:04
It would be best if it were made so that during the final linking process, that all strings are combined into one. So I suppose that any references to strings in the code could also be done via the import table. Since it is defined in the virtual machine that constant strings are like interned strings.
19:06
I would support for simplicity, instead of having blobs with sizes and such
where all classes belong, all of the classes appear in a single tread. They
import a string for their name, super class, and interfaces. However, super
and interface classes could be referred to via the import table. So while the
name is known, the super and interface classes are treated as if they were
an import. So then this way super classes and interfaces are initialized when
the import table is initialized. This would save a bunch of processing time on
the output and would make Class.getSuperClass()
virtually instant since it
would refer to a class object already (or similar).
19:11
Since everything would be in memory, this essentially would make limited systems a bit unable to cross compile if namespaces and the output ELFs get too large. That however could be remidied by swap and/or a virtual machine. At a speed penalty an emulated environment could be used for building that can run the actual self binary.
19:15
This then means that the order checks in the namespace writer are not required although I should still have them for other targets and such.
20:10
Ok so generally when it comes to writing classes, they will not write any data and they will just instead write to the code area.