2016/04/10

09:31

Now to continue with code generation and such.

10:21

CPVariableStates may require the list of jump sources and the verification state so derivation can determine what is what.

10:48

Actually it is not needed to pass it to them, the derivation can determine such things.

12:14

I do wonder though if the current classprogram is overly complex and cached. Right now I have a mashup of states which are all mutable and rather intertwined with other states. However due to my recent refactoring and split placement I should be able to change the entire classprogram structure without requiring that I rework parts of the interpreter and such since I do like some existing things. I like the operation handlers and the computational state. The one thing that I would change is all the dynamic variable states and the operations. The operations would be kept logical except their type would be known and there could be possible derivations of CPOp for specific instructions. Operations could still use computations. However the goal of this refactor would be a very broadband simplification of the code. I would still have SSA in the end however. I would propose that I:

  1. Get logical instruction orderings like I already do.
    • This is good because there are no gaps, the next instruction is always +1 and not +3 in some cases +2 in others.
    • Simplifies jumps and checking of states.
  2. Have CPOp for every instruction precreated with their initialized state with verifications.
    • These operations would chain to others potentially.
  3. Determine the variable states for every instruction along with their type inside of CPOp.
  4. Handle exceptions and such.
    • The SSA variable IDs would be merged and potentially "phi"ed if an instruction has multiple entry points and local variables contain different states.
    • Would also need to handle conditions where a variable may be used in an exception handlers and branches so that their values are saved even if they are pointless operations. However, this would only be done when it is a local to local copy, copies from locals to the stack are still pointless.
    • As such, the stack NEVER gets "phi"ed.
  5. Verify the states are correct by parsing the StackMap/StackMapTable and then dropping the verification info since it would not be needed.
    • Local variables which get removed away would not require that they be saved.
  6. Each operation would be able to be passed to a computational machine.
    • Operations that generally have no effect (they are copies) would just not perform any computations.
    • However, if a variable is "phi"ed then a computation is performed on it.
    • The general cases, storing values into locals would result in them potentially getting stored even if they are not actually used in the future. Later optimization passes could determine this however.

I would also keep the combined local and stack elements since uniformity is nice.

As for "phi", the variable states for each operation would have a flag which specifies that it is a junction of other types and that its current value is not really that known yet.

12:29

This should in general result in cleaner and easier to maintain code, with an added speed bonus at the cost of using extra memory (since now entire methods would have to exist within memory). Thus most of the existing classes I have would generally still exist, however they would be radically different compared to before.

12:32

CPProgramBuilder would be merged into CPProgram and become its constructor.

19:11

Instead of jump target, the following addresses could just be operations. I know of a way where I can determine the jump targets and have it all in the operation code as required. The same could be said for operation sources. With this I would completely remove the jump source and jump target list.

19:17

After I determine the natural and exception flow of instructions I can follow with determine the states of variables and the SSA required bits such as "phi" junctions and uniquely set values. However in retrospect, this new operation handling is far simpler than before despite it having some recursion involved. In general it should result in code that is much cleaner to maintain since the current existing program representation was just a gigantic mess of mutables, references, and other things. In the current case, the program description is completely mutable which means multiple threads can safely work on it without requiring locks. Thus as a result, the interpreter speed would be increased.

22:02

It would be too complex in the constructor to calculate the SSA states and such, thus that will be delayed so that it is calculated when it is needed for example.

22:16

I suppose I shall sleep soon, however with part of my pre-existing base and refactoring the operations, things are a bit simpler now. The control flow graph is more direct in that operations are used rather than jump targets and jump sources. So a slight level of indirection was removed. Following this, the variable types and SSA variables will need to be initialized. For that, the types can be implicit as specified by the stack map, or explicit which is based on actual operations. So the compute machine and the operation handlers will return as such. They have not really gone away at all anyway. The class program code is still the most line heavy code however at 1733 sloccount lines. However once every operation is handled and the variable states are setup, I predict it may be aroung 4000 lines total. 4000 is quite a high number, however the operation handlers for each instruction that exists will add to the count by quite a number. Once the SSA states and other such things get enough of a base I can continue interpretation. The operation initialization is a bit recursive however. The exception handler linking stuff by having it scan around makes it a O(n^2) operation which is quite slow. So a gigantic method containing 65535 logical instructions will require about 4 billion iterations. I would say tommorrow that I do something I did similarly to the jump sources. Otherwise it will just be really really slow when more bulky methods come into play.

22:32

While I prepared to go to sleep, I found a bug in my implementation. I just assign exceptions to every single instruction and use all of them instead of checking to see if they are in range or not. So this needs fixing, also in the code which does that I can do the similar thing to jump sources.