2016/03/20

00:12

The refactor is going well; things are much nicer now.

00:20

First day of spring also.

15:21

This refactored code is much simpler now and likely much faster too.

15:57

I believe I have a read somewhere that is either too short or too long.

16:04

Based on the dump and the values I am seeing, I am short two bytes.

16:07

And looking at the hex dump of the class, those missing two bytes are for the attribute I am ignoring, since its length field is an integer and not an unsigned short.
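
For reference, a minimal sketch of reading an attribute header this way, assuming a DataInputStream over the class data (class and method names here are hypothetical); the point is that the length after the name index is four bytes, read with readInt(), not two:

import java.io.DataInputStream;
import java.io.IOException;

public class AttributeSkip {
    /** Skips one attribute; the length field is an int, not an unsigned short. */
    public static void skipAttribute(DataInputStream in)
        throws IOException {
        int nameIndex = in.readUnsignedShort(); // two byte attribute name index
        int length = in.readInt();              // four byte attribute length

        // Skip the attribute body; skipBytes() may skip fewer bytes than asked
        for (int left = length; left > 0;) {
            int count = in.skipBytes(left);
            if (count <= 0)
                throw new IOException("Short read while skipping attribute");
            left -= count;
        }
    }
}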

16:15

Method handles will require slightly more complex work to make them cacheable because identifiers can contain ( and ), and they are also not in fixed positions. So I suppose what I will do is keep a fixed offset and length, similar to the binary name, then substring as required.
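
As a rough sketch of that offset-and-length idea (names hypothetical), the full identifier string is kept around and the interesting region is only substringed when it is actually asked for:

/** Lazily caches a region of a larger identifier by offset and length. */
public final class CachedToken {
    private final String source;
    private final int offset, length;
    private String cached;

    public CachedToken(String source, int offset, int length) {
        this.source = source;
        this.offset = offset;
        this.length = length;
    }

    /** Builds the substring only on first use, then reuses it. */
    public String get() {
        String rv = this.cached;
        if (rv == null)
            this.cached = rv = this.source.substring(this.offset,
                this.offset + this.length);
        return rv;
    }
}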

18:49

sloccount gives me java: 13646 (100.00%). Also, the parsing of the block of code will be inline, so no byte arrays have to be filled from the attribute and then parsed separately. Thus, the class parsing code will remain light and will not require heavy allocations beyond the usual ones. During my pass, I can also track only what the method actually uses and ignore everything that is not used. Since the constant pool is shared by everything, some of it might not be needed at all for specific methods.

19:01

The good thing is that the subroutine byte codes are completely unsupported, which means that parsing classes is simpler and I am less likely to have to worry about recursive states where a byte code can have a few thousand different states.

19:05

It looks like StackMap is a slightly less compact version of the StackMapTable. That is, the information appears to be similar so far, except that it is simpler to parse. Basically, think of it as if every entry in the StackMapTable were a full frame. I see the very familiar verification_type_info, and reading the specification, the type information is exactly the same. So when writing the checker, I can combine the two and check both forms accordingly. This means that StackMap and StackMapTable will be folded into full states which are then checked against the incoming instructions.
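
As a sketch, a single reader for verification_type_info could serve both attributes; the tag values follow the class file specification, and returning plain strings here is purely for illustration:

import java.io.DataInputStream;
import java.io.IOException;

public class VerificationType {
    /** Reads one verification_type_info entry and describes it. */
    public static String read(DataInputStream in)
        throws IOException {
        int tag = in.readUnsignedByte();
        switch (tag) {
            case 0: return "top";
            case 1: return "int";
            case 2: return "float";
            case 3: return "double";
            case 4: return "long";
            case 5: return "null";
            case 6: return "uninitialized this";
            case 7: return "object #" + in.readUnsignedShort();        // constant pool index
            case 8: return "uninitialized @" + in.readUnsignedShort(); // offset of the new
            default: throw new IOException("Invalid tag " + tag);
        }
    }
}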

21:34

Thinking about it, the StackMapTable might not even be needed. I could technically skip it if I enforce the requirements as if the StackMapTable were there. The way Java byte code runs on modern VMs is that each address in the byte code must have only a single valid state, and the Java compiler enforces this. However, I would have to check whether older J2ME classes also share this property. Ignoring the stack map table would simplify things greatly because I would not need to parse it at all, so there would quite literally be only two attributes. However, if a StackMapTable is illegal then a normal VM would fail on it. The table does help in cases where exceptions are handled, to know whether local variables and such are filled or not. As for the byte code, I am going to transform it into a kind of register based format which will be used by the interpreter and, in the future, the native recompiler. Ignoring the table would be faster and would produce code faster. It is kind of redundant, although skipping it would violate the specification.

22:07

So the thing to do would be to generate a variant of the Java byte code which uses virtual registers instead of a stack. Registers simplify execution and translation to machines which are generally register based. For speed and simplicity I would likely not choose SSA form (register coloring is fun). There would be basic operations, where certain stack sequences (push 2 into r2, push 2 into r3, followed by add) are instead compacted into a single form (add r2 r3 r4).
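
A toy sketch of how that folding could work, where the register numbering is arbitrary and only chosen to match the example above: each stack slot gets a virtual register, so pushing constants fills registers and the add pops two of them and writes a third:

import java.util.ArrayDeque;
import java.util.Deque;

public class StackToRegister {
    private final Deque<Integer> stack = new ArrayDeque<>();
    private int next = 2; // start at r2 purely to match the example above

    /** A constant push lands straight in a fresh register. */
    public void pushConstant(int value) {
        int r = this.next++;
        System.out.println("const " + value + " -> r" + r);
        this.stack.push(r);
    }

    /** An add pops two source registers and writes a destination. */
    public void add() {
        int b = this.stack.pop(), a = this.stack.pop(), d = this.next++;
        System.out.println("add r" + a + " r" + b + " -> r" + d);
        this.stack.push(d);
    }

    public static void main(String... args) {
        StackToRegister t = new StackToRegister();
        t.pushConstant(2); // const 2 -> r2
        t.pushConstant(2); // const 2 -> r3
        t.add();           // add r2 r3 -> r4
    }
}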

22:12

So I will need a kind of linear program editor which can act on a bunch of buffers to store the register based code. I am not going to use a large number of classes to represent single operations. Essentially it would be like a ListIterator where there is a current instruction pointer which points into the register instruction code. I do however need a compact operation form so as not to waste space. RISC would likely be the best choice. I should also probably have a fixed instruction size for simplicity, where all instructions share the same form. This would be similar to, for example, MMIX. Alternatively, I could just use MMIX operations to represent my programs and use that as an IL. MMIX is a 64-bit instruction set, however, and it is big endian. It also implements IEEE 754, which should match Java.
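
A minimal sketch of such a cursor over a flat buffer, assuming fixed four byte instructions read big endian (both of those are assumptions for illustration, not decisions):

public class InstructionCursor {
    private final byte[] buffer;
    private int position;

    public InstructionCursor(byte[] buffer) {
        this.buffer = buffer; // length assumed to be a multiple of four
    }

    public boolean hasNext() {
        return this.position < this.buffer.length;
    }

    /** Reads the next fixed size instruction word and advances the pointer. */
    public int next() {
        byte[] b = this.buffer;
        int p = this.position;
        this.position = p + 4;
        return ((b[p] & 0xFF) << 24) | ((b[p + 1] & 0xFF) << 16)
            | ((b[p + 2] & 0xFF) << 8) | (b[p + 3] & 0xFF);
    }
}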

22:19

However, MMIX is really meant for a real CPU, while I just need an intermediate language which is speedy and compact. It needs to be compact because there may be a JIT running on systems with, for example, only 64KiB of memory available; with SquirrelJME in memory there might only be 32KiB left. So the code would have to be compiled to native code very compactly.

22:22

Time to do some thinking.

111111100 00000000
654321098 76543210
========= ========
ooooodddd ddssssss

If I have 5 bits for opcodes then I can have 32 instructions, which is quite small. Then with 6-bit operands I have 64 registers. Using a source and a destination would be compact; thus add would be d = s + d, for example. I should however have an "infinite" number of registers to choose from.
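
Packing that layout into the low 17 bits of an int would look roughly like this (purely a sketch of the bit diagram above):

public class OpPack {
    /** 5-bit opcode, 6-bit destination, 6-bit source. */
    public static int encode(int op, int dest, int src) {
        return ((op & 0x1F) << 12) | ((dest & 0x3F) << 6) | (src & 0x3F);
    }

    public static int opcode(int word)      { return (word >>> 12) & 0x1F; }
    public static int destination(int word) { return (word >>> 6) & 0x3F; }
    public static int source(int word)      { return word & 0x3F; }
}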

22:42

I suppose I can then design something novel as an intermediate code. I could use a prefix, so to speak, as a kind of Huffman-style detail. However, if I were to use 3-byte operations, things might get simpler.

23:07

Here is an idea. It would be register based, but CISC (for compactness). Basically every operation would start with a single byte: 6 bits are used for the operation (add, subtract, etc.) while 2 bits are used as flags which specify the registers to use. There are two register slots, A and B. When a flag is set for a slot then another byte is expected which sets the new A or B register. The next operation which lacks the flag will use the register set by the previous operation. Operations such as new, anewarray, multianewarray, and a few others will just be method calls (since that is sane) or traps. I would limit the number of active registers to around 256 though. If methods are so complex that they use more registers than that, then they would have to fail due to exceeding the limit. This would mean that methods which are gigantic and extremely complex (probably horrible code) would not work. However, this environment is made for embedded systems, so it would be insane to have code like that.
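
A sketch of decoding that first byte, where the exact bit placement is an assumption: the upper 6 bits select the operation and the low 2 bits flag whether new A and B register bytes follow; unset flags keep the previously implied registers:

import java.io.DataInputStream;
import java.io.IOException;

public class CompactDecoder {
    private int registerA, registerB; // implied until changed

    /** Decodes one operation, pulling extra register bytes when flagged. */
    public void decodeOne(DataInputStream in)
        throws IOException {
        int head = in.readUnsignedByte();
        int operation = head >>> 2;    // upper 6 bits: the operation
        if ((head & 0b10) != 0)        // flag: a new A register follows
            this.registerA = in.readUnsignedByte();
        if ((head & 0b01) != 0)        // flag: a new B register follows
            this.registerB = in.readUnsignedByte();
        System.out.println("op " + operation
            + " a=r" + this.registerA + " b=r" + this.registerB);
    }
}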

23:13

One problem though is branches going backward and forward. The implied register may vary depending on how an instruction is reached, so I suppose the thing to do would be to have a linear fallback scan to find the most recent implied register. However, seeking backwards will be complicated because it might not be known whether a given byte is an instruction or a register specification. So to fix that, I would limit the instruction ID to 5 bits and registers to 127. If the uppermost bit is set then the byte specifies an instruction, otherwise it is a register implication. This could also be used for consistency checking to make sure instructions are valid. The next thing to consider is which 32 instructions are used to represent operations. I suppose what I should do is write documentation for the intermediate language format. That would keep the details together and allow for any future work around it.
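
The revised tagging would make any byte self-identifying, roughly like this (again just a sketch of the idea, with the flag bits placed as in the earlier decoder):

public class ByteTag {
    /** The uppermost bit marks an instruction byte. */
    public static boolean isInstruction(int b) {
        return (b & 0x80) != 0;
    }

    public static int opcode(int b)  { return (b >>> 2) & 0x1F; } // 5-bit instruction ID
    public static int flags(int b)   { return b & 0b11; }         // A/B register flags
    public static int implied(int b) { return b & 0x7F; }         // register 0..127
}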

23:26

I should perhaps take a poll as to how many registers are truly needed. I also need to consider floating point.