I should also refactor and redo the ZipFile code, drop plans for 64-bit ZIP
files, and remove the very slow structured reading code. Basically, simplify
everything so that it is much nicer.
One issue with the chunks is that when a byte is added or removed, all of the chunks that follow must have their positions recalculated. I do wonder if I can avoid doing that. I am thinking of a binary tree-like structure for this which stores only relative sizes. It would have to remain sorted and well balanced, perhaps by using a red/black tree. The chunks would then have to be arranged so that each one represents a part of the file. However, this would be a bit complex. Alternatively, the chunks in a given area could cover just the part of the file they belong to. The only issue with that is moving things over, so things will get complicated when it comes to buffer management. One possibility, however, is to associate cards with a chunk of data. Each chunk has a head and a tail marking where data is stored. However since the position...
...is not known, cards can be used as notes of where data should be placed, and they can be kept in a sorted list of sorts. Each card would be associated with a chunk. However, this still complicates things in a way.
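The relative-size tree idea can be sketched roughly as follows. This is only an illustration under my own assumptions: the class `SizeNode` and its fields are hypothetical names, and balancing (the red/black part) is left out entirely. The point is that each node only knows how many bytes it and its subtrees hold, so growing a chunk touches just its ancestors and absolute positions are derived on demand.

```java
// Hypothetical sketch: a tree node which only stores relative sizes.
// Balancing (for example red/black rotations) is intentionally omitted.
final class SizeNode {
    SizeNode left, right, parent;
    int own;      // bytes held by this chunk
    int subtree;  // bytes held by this chunk plus both subtrees

    SizeNode(int own) {
        this.own = own;
        this.subtree = own;
    }

    static int size(SizeNode n) {
        return (n == null ? 0 : n.subtree);
    }

    // Links children to this node; children must already be complete.
    void attach(SizeNode l, SizeNode r) {
        this.left = l;
        this.right = r;
        if (l != null)
            l.parent = this;
        if (r != null)
            r.parent = this;
        this.subtree = this.own + size(l) + size(r);
    }

    // Grows or shrinks this chunk; only the ancestors are updated,
    // chunks which follow in the file are never renumbered.
    void resize(int delta) {
        this.own += delta;
        for (SizeNode n = this; n != null; n = n.parent)
            n.subtree += delta;
    }

    // Absolute start offset of this chunk, calculated when asked for.
    int start() {
        int pos = size(this.left);
        for (SizeNode n = this; n.parent != null; n = n.parent)
            if (n.parent.right == n)
                pos += size(n.parent.left) + n.parent.own;
        return pos;
    }

    // Finds the chunk which contains the given absolute offset.
    static SizeNode find(SizeNode root, int offset) {
        SizeNode n = root;
        while (n != null) {
            int l = size(n.left);
            if (offset < l)
                n = n.left;
            else if (offset < l + n.own)
                return n;
            else {
                offset -= l + n.own;
                n = n.right;
            }
        }
        return null;
    }
}
```

With three chunks of 10, 20 and 30 bytes, growing the first chunk by 5 bytes changes two counters, and the start of the last chunk is then reported as 35 without it ever being rewritten.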
The big slowdown would always be keeping the buffer positions valid. Inserting and removing bytes would also add some issues. I like the buffer for quick insertion and removal of data. Perhaps what could be used is a kind of sub-partitioning scheme of sorts. Actually, I could have chunks which just manage data and know nothing of their storage location. Basically, each chunk would be associated with a partition map which does know the positioning. Something that might be a bit more efficient would be to have dynamically sized chunks. For example, when a bulk write is performed at a location and there is no room to store the data in the given chunk, the bulk operation just splits the chunk at the write position and then creates a new chunk with the bulk written data. It would write to the appropriate locations on two chunks before creating a new one.
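A minimal sketch of the split-on-bulk-write idea, with assumptions stated up front: `SplitBuffer` is a hypothetical name, every chunk here is an exactly full array (so there is never spare room and the write always splits), and the step of first filling free space in the two neighbouring chunks is skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: dynamically sized chunks, split at the write position.
final class SplitBuffer {
    // Chunks in file order; each array is exactly as long as its data.
    private final List<byte[]> chunks = new ArrayList<>();

    // Bulk inserts data at the given position.
    void insert(int pos, byte[] data) {
        if (pos < 0)
            throw new IndexOutOfBoundsException("pos " + pos);
        int base = 0;
        for (int i = 0; i < chunks.size(); i++) {
            byte[] c = chunks.get(i);
            if (pos <= base + c.length) {
                // Split the chunk at the write position.
                int at = pos - base;
                byte[] head = Arrays.copyOfRange(c, 0, at);
                byte[] tail = Arrays.copyOfRange(c, at, c.length);
                chunks.remove(i);
                int j = i;
                if (head.length > 0)
                    chunks.add(j++, head);
                // The bulk data becomes a chunk of its own.
                chunks.add(j++, data.clone());
                if (tail.length > 0)
                    chunks.add(j, tail);
                return;
            }
            base += c.length;
        }
        // Only reached when the buffer is empty or pos is past the end.
        if (pos != base)
            throw new IndexOutOfBoundsException("pos " + pos);
        chunks.add(data.clone());
    }

    int chunkCount() {
        return chunks.size();
    }

    // Flattens all chunks, mostly useful for checking the result.
    byte[] toByteArray() {
        int total = 0;
        for (byte[] c : chunks)
            total += c.length;
        byte[] out = new byte[total];
        int at = 0;
        for (byte[] c : chunks) {
            System.arraycopy(c, 0, out, at, c.length);
            at += c.length;
        }
        return out;
    }
}
```

Writing two bytes into the middle of a four byte chunk leaves three chunks behind: the head, the new data, and the tail. No existing byte is moved within a chunk, which is the appeal over shifting data in a fixed-size block.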
The default block size could then just act as a minimum block size. Going to take a walk and figure something out hopefully.
Actually, something that might work would be a partition list of sorts, with a logical partition location and a physical partition location. Bytes could literally be written anywhere as long as there is free space somewhere. When bytes are added at a position, a new partition could be created if there is no free space anywhere. The logical partition information will then just point to where the data really is. Removal of bytes would just entail shifting the partition start/end over accordingly, or splitting the partition. There would need to be two lists which point to the same partition data, one sorted by logical position and the other by physical position. However, since the data is very much the same, they can share the same objects. If any bytes are removed then the logical partitions that follow just get their positions shifted accordingly. So this would be like a hard disk filesystem.
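Roughly, the partition list could look like the sketch below. The names `Partition` and `PartitionMap` are hypothetical, lookups are linear for brevity, and removal is assumed to stay within a single partition. It shows the two lists sharing the same objects, removal by shifting a start/end or splitting, and the shift of the logical positions that follow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one partition with a logical and a physical position.
final class Partition {
    int logical;   // where the bytes appear in the file
    int physical;  // where the bytes really are in storage
    int length;

    Partition(int logical, int physical, int length) {
        this.logical = logical;
        this.physical = physical;
        this.length = length;
    }
}

final class PartitionMap {
    // Both lists hold the very same objects, only the order differs.
    private final List<Partition> byLogical = new ArrayList<>();
    private final List<Partition> byPhysical = new ArrayList<>();

    void add(int logical, int physical, int length) {
        Partition p = new Partition(logical, physical, length);
        int i = 0;
        while (i < byLogical.size() && byLogical.get(i).logical < logical)
            i++;
        byLogical.add(i, p);
        insertPhysical(p);
    }

    private void insertPhysical(Partition p) {
        int i = 0;
        while (i < byPhysical.size()
            && byPhysical.get(i).physical < p.physical)
            i++;
        byPhysical.add(i, p);
    }

    private int indexOfLogical(int pos) {
        for (int i = 0; i < byLogical.size(); i++) {
            Partition p = byLogical.get(i);
            if (pos >= p.logical && pos < p.logical + p.length)
                return i;
        }
        throw new IndexOutOfBoundsException("No partition at " + pos);
    }

    // Translates a logical position to where the byte really is.
    int toPhysical(int pos) {
        Partition p = byLogical.get(indexOfLogical(pos));
        return p.physical + (pos - p.logical);
    }

    // Removes count bytes at pos; the range must fit in one partition.
    void remove(int pos, int count) {
        int i = indexOfLogical(pos);
        Partition p = byLogical.get(i);
        int off = pos - p.logical;
        if (count <= 0 || off + count > p.length)
            throw new IllegalArgumentException("Range spans partitions");
        // Logical partitions which follow are shifted down.
        for (int j = i + 1; j < byLogical.size(); j++)
            byLogical.get(j).logical -= count;
        if (off == 0) {
            p.physical += count;            // shift the start over
            p.length -= count;
        } else if (off + count == p.length)
            p.length -= count;              // shift the end over
        else {
            // Removal in the middle splits the partition in two.
            Partition tail = new Partition(pos, p.physical + off + count,
                p.length - off - count);
            p.length = off;
            byLogical.add(i + 1, tail);
            insertPhysical(tail);
        }
        if (p.length == 0) {
            byLogical.remove(p);
            byPhysical.remove(p);
        }
    }

    int size() {
        return byLogical.size();
    }
}
```

No stored byte moves on removal here; only the partition records change. The cost is the loop which shifts every following logical position, which is the renumbering that per-chunk relative positions would avoid.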
To simplify things and not require renumbering on large buffers, each chunk's physical and logical positions would be relative to that chunk only. Each chunk would then keep a note of the number of bytes that are in the chunk.
Each chunk would then have an indicator which says where it is located. For simplicity, all chunks will be linear physically, so a chunk at a higher index will always have a greater address. If insertion in the middle causes a chunk to be filled then a new chunk will be created in the middle. So with the partition scheme, if there are ever any removals, a single value just has to be incremented or decremented (or a partition created). Since partitions can appear and disappear, it might be best to have a pool of partitions waiting to be used so they can be reassociated within a chunk. However, they should just remain tied to a single chunk.
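To make the per-chunk bookkeeping concrete, here is a small sketch under my own assumptions: `Chunk` and `ChunkList` are hypothetical names, partitions inside a chunk and the partition pool are left out, and a full chunk is simply halved when an insertion hits it. Each chunk only knows its own byte count, so an insertion changes one counter and absolute positions come from summing counts while walking the list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a chunk knows its byte count, not its file position.
final class Chunk {
    final byte[] data;
    int count;  // number of bytes actually used in this chunk

    Chunk(int capacity) {
        this.data = new byte[capacity];
    }
}

final class ChunkList {
    // Chunks are linear: a higher index always means a greater address.
    private final List<Chunk> chunks = new ArrayList<>();
    private final int capacity;

    ChunkList(int capacity) {
        if (capacity < 2)
            throw new IllegalArgumentException("capacity must be >= 2");
        this.capacity = capacity;
        this.chunks.add(new Chunk(capacity));
    }

    // Inserts a single byte at the given absolute position.
    void insert(int pos, byte b) {
        int i = 0, base = 0;
        while (i < chunks.size() - 1 && pos > base + chunks.get(i).count) {
            base += chunks.get(i).count;
            i++;
        }
        Chunk c = chunks.get(i);
        int at = pos - base;
        if (at < 0 || at > c.count)
            throw new IndexOutOfBoundsException("pos " + pos);
        if (c.count == capacity) {
            // Chunk is full: create a new chunk in the middle of the
            // list and move the upper half of the data into it.
            Chunk n = new Chunk(capacity);
            int half = c.count / 2;
            n.count = c.count - half;
            System.arraycopy(c.data, half, n.data, 0, n.count);
            c.count = half;
            chunks.add(i + 1, n);
            if (at > half) {
                c = n;
                at -= half;
            }
        }
        // Only this chunk changes; later chunks are not renumbered.
        System.arraycopy(c.data, at, c.data, at + 1, c.count - at);
        c.data[at] = b;
        c.count++;
    }

    // Reads a byte by walking the chunks and summing their counts.
    byte get(int pos) {
        int base = 0;
        for (Chunk c : chunks) {
            if (pos >= base && pos < base + c.count)
                return c.data[pos - base];
            base += c.count;
        }
        throw new IndexOutOfBoundsException("pos " + pos);
    }

    int chunkCount() {
        return chunks.size();
    }
}
```

The walk in `get` is linear in the number of chunks, which is one more reason a tiny chunk size hurts: more chunks means a longer walk for every access.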
It might also be best for the default chunk size to be managed by the chunk manager itself (unless a system property is used). Using small chunk sizes such as 32 (the current default) may be a bit inefficient.
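One way the chunk manager could pick its default is sketched below. Everything named here is an assumption for illustration: the class `ChunkSizes`, the property name `example.chunksize`, and the fallback of 4096 are all made up, with 32 kept only as the minimum. It sticks to `System.getProperty` and `Integer.parseInt` so that it does not need much from the class library.

```java
// Hypothetical sketch: the chunk manager decides the default chunk size.
final class ChunkSizes {
    // Made-up property name, not an existing setting.
    static final String PROPERTY = "example.chunksize";

    // Assumed fallback when no property is given.
    static final int FALLBACK = 4096;

    // The old default of 32 acts as the minimum chunk size.
    static final int MINIMUM = 32;

    private ChunkSizes() {
    }

    // Returns the chunk size to use, honoring the system property.
    static int defaultChunkSize() {
        int size = FALLBACK;
        String value = System.getProperty(PROPERTY);
        if (value != null)
            try {
                size = Integer.parseInt(value.trim());
            } catch (NumberFormatException e) {
                // Unparseable values fall back to the built-in default.
                size = FALLBACK;
            }
        return Math.max(MINIMUM, size);
    }
}
```

Callers would then never pass a chunk size themselves, and a value below the minimum is simply raised to it.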
Any refactored ZIP file code should not rely on the full NIO classes so it can be used by lighter JVMs.
Something that could work is a contributory setup in a way. That is, similar to
src, I also have contrib where packages can be placed. This way extra packages
which are not part of the SquirrelJME source tree can be compiled as if they
were part of it. This could be used for ports.