Figured out what that bl_count means.
For the sample input, I now have:
DEBUG -- B=<0,3> H= 10 DEBUG -- B=<1,3> H= 11 DEBUG -- B=<2,3> H= 100 DEBUG -- B=<3,3> H= 101 DEBUG -- B=<4,3> H= 110 DEBUG -- B=<5,2> H= 0 DEBUG -- B=<6,4> H= 1110 DEBUG -- B=<7,4> H= 1111
Which appears to be correct. I just need to figure out how this stuff is used now that it is correct.
Also figured out how I can have a huffman tree just be based on an array of sorts that kind of builds up as entries are added. I can probably use that instead because it would be much smaller, although it does not have to be limited to specific data types. Instead of having nodes for every part of the tree I can instead just have an array of index references. Each index is in a pair always, so the array is a power of two. If the uppermost bit is set then a value is used, otherwise then a pointer is used instead. So basically it would be in a linear jump fashion so to speak.
Using the new huffman code, need to figure out a way to find the associated symbol and the bitmask, if applicable.
Appears a thunderstorm is approaching.p
Appears my huffman tree is rather lopsided.
And my traverser is not correct, it stops a bit short of a value so instead I have it stop on an actual value.
Seems my file server disk is failing, it only boots and loads the OS when it is on its side, but it does it rather slowly. Either that or it is a loose cable or the SATA port is not working correctly. If it is the SATA port then Intel makes rather horrible ports if they die so often.
I can access my data just fine when using another system and the same cable, so I suppose that the SATA ports are flaky and not working correctly, this means I will have to replace the board and such. I may be able to find a replacement, hopefully.
Well, I suppose that board lasted about 5 years. I could though use a PCI SATA adapter but that might not be worth it, I will just find another system to use instead. This will be done after I sleep and wake up, since I have been up all night programming since I could not sleep. However I did get a bunch of huffman work going on, which is good. A slight break from the interpreter compiler part.
Reading of the distance codes and literal codes are combined, so I should handle that case.
Suppose to solve the huffman data, I just needed to read the top of the RFC document and just make the stuff there.
My fixed huffman parser seems wrong. I read bits for the distance and the length yet it works.
Yes it was swapped. Also I kept missing that the literals was combined with the length, so the same tree is used so to speak.
Actually my length and distance methods were entirely swapped, the length method was handling distances and the distance one was handling lengths.
I believe my sliding window has an off by one where the distance code which was off by one worked, but now that it has been corrected it is incorrect.
Got spare computer in the mail today. Oddly it has a UK keyboard layout I believe anyway. It has the Euro and the british pound sign. However It has a plus over minus key which also has that double S section symbol. It was not working properly, however I just needed new RAM for it. It has a few dents in it but with the aluminum case I can sort of bend it back into shape, I just need to replace the LCD frame though since it has a dent in it. But anyway, I must fix the sldiing window offset off by one now.
However, it seems I cannot install Mac OS X on it, so I suppose I would instead just go pure Linux instead if I cannot get it installed at all.
Ok so the distance can exceed the end of the sliding window (just not the start). Looks like for any extra bytes it just repeats the sequence. Thus I will have to handle that specially as a kind of overlap thing.
Ok so it now appears to work, however the input and output class file are not
matched properly. My current inflater has a single extra byte of output. There
are also a bunch of shifted areas too. The input has a sequence containing
0100046d61696 but mine is instead
0100036d61696e at least for that
portion. The debug sequence is this
DEBUG -- c 103 g DEBUG -- W 67 g DEBUG -- Read DC 59 DEBUG -- c 59 ; DEBUG -- W 3b ; DEBUG -- in 60 DEBUG -- in d2 DEBUG -- Read DC 257 DEBUG -- c 257 ? DEBUG -- Lent Code 257 > 3 DEBUG -- Length 3 DEBUG -- Dist Code 10 > 43 DEBUG -- Distance 43 DEBUG -- W 01 ^A DEBUG -- W 00 ^@ DEBUG -- W 03 ^C ---- Should be 4 DEBUG -- Read DC 109 DEBUG -- c 109 m DEBUG -- W 6d m DEBUG -- in 78 DEBUG -- in 7f DEBUG -- Read DC 97 DEBUG -- c 97 a DEBUG -- W 61 a DEBUG -- Read DC 105 DEBUG -- c 105 i DEBUG -- W 69 i DEBUG -- Read DC 110 DEBUG -- c 110 n DEBUG -- W 6e n DEBUG -- in f1 DEBUG -- in 96 DEBUG -- Read DC 1 DEBUG -- c 1 ^A DEBUG -- W 01 ^A
Unrelated but rather freaky is that when the screen is off on this used laptop I can see the Apple logo shining through the LCD. It is dim however so I suppose the Sun is bright enough even in cloudy weather to cause this event.
Actually, getting the smaller of length and distance to use as the distance back is not correct. I need to check if it exceeds the bounds, then cap to the distance if it is larger.
Actually the corruption happens much earlier than I expect.
DEBUG -- W 00 ^@ DEBUG -- W 09 DEBUG -- W 00 ^@ DEBUG -- in 39 DEBUG -- in 8f DEBUG -- Read DC 39 DEBUG -- c 39 ' DEBUG -- W 27 ' DEBUG -- Read DC 258 DEBUG -- c 258 ? DEBUG -- Lent Code 258 > 4 DEBUG -- Length 4 DEBUG -- Dist Code 8 > 21 DEBUG -- Distance 21 DEBUG -- W 08 ----- should be 0a DEBUG -- W 00 ^@ DEBUG -- W 23 # DEBUG -- W 0a
My code sequence is as follows, which is correct:
However it appears looking back that 08 is read instead of 0a, so it is possible that my sliding window is reading from the wrong end.
DEBUG -- BACKFRAG dx=7 sz=4 bf=1 ln=2 lbfs=3 DEBUG -- src=0 rat=0 (0/6) DEBUG -- BACKFRAG dx=6 sz=4 bf=1 ln=2 lbfs=2 DEBUG -- src=0 rat=1 (1/6) DEBUG -- BACKFRAG dx=5 sz=4 bf=1 ln=2 lbfs=1 DEBUG -- src=0 rat=2 (2/6) DEBUG -- BACKFRAG dx=4 sz=4 bf=1 ln=2 lbfs=0 DEBUG -- src=0 rat=3 (3/6) DEBUG -- BACKFRAG dx=3 sz=4 bf=0 ln=2 lbfs=3 DEBUG -- src=1 rat=0 (4/6) DEBUG -- BACKFRAG dx=2 sz=4 bf=0 ln=2 lbfs=2 DEBUG -- src=1 rat=1 (5/6)
Tells me that entries are being read I believe in the reverse order. The output index 0 should be the oldest of the distance and not the first.
This looks wrong:
DEBUG -- hlit/dist 290: 6
Actually it is part of the distance codes, they are combined together as one.
Well, reading the values correctly leads to the uncompressed class file being the same length while the one sent through my version of the algorithm ended up with an extra byte or so.
Now I believe the sliding window is not working correctly, since I see many
cafebabe and similar to that in the output stream.
Well I made a bunch of decompressor progress, however I appear to be stuck on the sliding window part. I made lots of progress on the code since I last touched it however and it is mostly complete except for this last remaining issue. So I shall put it on hold to clear my mind of it.