Note: This is the second part of a tutorial; read part one first if you haven’t.
Compressing the data set
Storing the delta coordinates in 16-bit Unicode characters was not efficient if the prod is going to be crushed by a compressor like JSCrush. With this kind of character, the compressor cannot group repeated ones together with the characters used in the code itself. I decided to pack the data into simple 8-bit chars. Then the headaches and problems started. It looks like JavaScript doesn’t have a fully compatible 8-bit ASCII table: some characters (especially those above 127) got encoded as multi-byte (16-bit) ones, increasing the size of the prod.
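To put numbers on that size problem, here is a minimal sketch that measures how many bytes a character takes once the source is encoded. It assumes the prod is saved as UTF-8 (the post doesn’t say which encoding was used), and `utf8Bytes` is a hypothetical helper name, not part of the released code.

```javascript
// Assumption: the prod is saved/served as UTF-8.
// utf8Bytes is a hypothetical helper, not part of the original prod.
function utf8Bytes(text) {
  // TextEncoder always produces UTF-8 and is available in browsers and Node.
  return new TextEncoder().encode(text).length;
}

console.log(utf8Bytes("A"));      // char code 65: 1 byte
console.log(utf8Bytes("\u007f")); // char code 127: still 1 byte
console.log(utf8Bytes("\u00e9")); // char code 233: 2 bytes
console.log(utf8Bytes("\u2603")); // char code 9731: 3 bytes
```

Under that assumption, only char codes up to 127 cost a single byte, which is why a 7-bit range is the safe target.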
While messing with this issue, I noticed something particular about the coordinate set. Check the following plot, which shows all the coordinates scattered on an X/Y plane.
The majority of them are in the [-3, 4] range for the X axis and the [-7, 8] range for the Y axis. Only 8 deltas fall outside those ranges. Storing the X coords takes only 3 bits: [0, 7], which becomes [-3, 4] after subtracting 3 from both limits. The Y coords take 4 bits: [0, 15], which becomes [-7, 8] after subtracting 7. That is 7 bits in total, so storing each delta in a simple 7-bit char was achievable.
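The packing described above can be sketched as follows. The bit layout (X in the upper 3 bits, Y in the lower 4) is taken from the decoding expression that appears later in this post; the function names `packDelta` and `unpackDelta` are hypothetical, and the editor’s real export code may look different.

```javascript
// Hypothetical helpers illustrating the 7-bit packing; not the editor's actual code.
// Layout: bits 4..6 hold dx + 3 (0..7), bits 0..3 hold dy + 7 (0..15).
function packDelta(dx, dy) {
  if (dx < -3 || dx > 4 || dy < -7 || dy > 8) {
    throw new RangeError("delta out of the 7-bit range: " + dx + "," + dy);
  }
  return String.fromCharCode(((dx + 3) << 4) | (dy + 7));
}

function unpackDelta(ch) {
  const p = ch.charCodeAt(0);
  return [(p >> 4) - 3, (p % 16) - 7];
}

// Round trip for one delta: one step right, two steps down.
const packed = packDelta(1, 2);
console.log(packed.charCodeAt(0), unpackDelta(packed));
```

Every valid delta maps to a char code between 0 and 127, so a delta never needs more than one 7-bit character.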
At this point, I was really tired of manipulating values manually in the array and checking the results. Also, a lot of the deltas were not needed: the Hershey font uses a lot of points to get good glyph rendering, especially at big font sizes. In this case I knew the font size was not going to be really big, so I was able to discard many of the intermediate points. With all this manual tweaking, it was clear I needed a font editor. I made one quickly, using JavaScript.
With the editor in place, the data manipulation went very fast. Tweaking out-of-range coordinates and changing the starting point of several glyphs to fix the kerning problems were easy tasks. The dataset now fit inside the 7-bit storage ranges.
Still, there were around 500 coords, which equals 500 chars for the whole font set: a saving of 100 bytes. Looking at the glyph shapes, it is easy to notice that a lot of segments are repeated across several glyphs.
For example, the chars ‘f’, ‘h’, ‘k’ and ‘l’ share the same upper part, and the same happens with the bottom of ‘f’, ‘g’, ‘j’, ‘y’ and ‘z’. When these segments are stored only once, the code can draw them through calls from different places in the dataset. The only thing I needed was a way of storing these “segment calls” inside the data. If you look at the plot above, there is a big empty area with no coords in use near the -3 value on the X axis. This means the glyphs do not use any of the points from [-3, -1] to [-3, -7]. I was able to use these points to store up to 7 different segment calls: the drawing code draws the complete segment whenever it finds one of these points in the drawing loop.
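A minimal sketch of how such segment calls could be expanded is shown below. With the packing used in this post, the points [-3, -7] to [-3, -1] correspond to char codes 0 to 6. Everything else here is an assumption for illustration: the names `isSegmentCall` and `expandSegments`, the idea of keeping the segments in a separate array, and the mapping of code N to segment N. The released prod stores and resolves its segments in its own way.

```javascript
// Hypothetical sketch; the real prod keeps its segments inside the dataset itself.
// A segment call is a point with dx = -3 and dy in [-7, -1], i.e. char codes 0..6.
function isSegmentCall(p) {
  return (p >> 4) === 0 && (p % 16) < 7;
}

// Replace every segment call in `data` with the packed deltas of that segment.
// `segments` is an assumed array of up to 7 packed strings, indexed by code.
function expandSegments(data, segments) {
  let out = "";
  for (let i = 0; i < data.length; i++) {
    const p = data.charCodeAt(i);
    out += isSegmentCall(p) ? segments[p % 16] : data[i];
  }
  return out;
}

// Usage with made-up data: code 2 pulls in the shared segment "xyz".
console.log(expandSegments("A\u0002B", ["", "", "xyz"]));
```

The design point is that a shared stroke costs one character per use instead of its full list of deltas.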
I updated the editor with this improvement very fast. Several glyph shapes were changed to make use of the defined segments. The letters ‘r’ and ‘q’ changed shape radically. During this update many points were removed, as the font rendering size was already small and they were not improving the drawing quality. The rendering loop used to jump to the next glyph’s starting point after drawing the current one, using an array variable that stored all the glyph widths. The font editor allowed me to tweak all the glyphs so that they always start and finish at the same Y height; this way the rendering loop finishes drawing each glyph already advanced by the correct width. The problem was the ‘x’ character: I had to change it completely into one continuous stroke to fit this modification. With all these improvements in place, I got down to only 246 chars. Check the rendering difference between the original Hershey font and the compressed one:
'86670S050s0Sp`0Js5)LxeT0)0C0.Rsh:02020.RsiW090,Rsh:00T#*jge0bRQQdxKhvs0QQC00b9l)Wfc0pw`00S0C0Sg[%ege%[gS00cg[j-Dsts0SJfc0S\n$qtud0TbR$\n+.0)*ZgV0)JWeT0sIRcgK0/Xfb4\\v'
That’s the complete dataset, 246 chars (some characters do not show up here because they are in the lower part of the ASCII table: escape codes and so on).
The segments appear in different colors. The first one is red, used 7 times. The second is green, 4 times. The third is black, 5 times. Next comes the orange one, 6 times. The gray one is used 7 times. Yellow and violet are the least used, only 3 times. All the blue lines are data not belonging to any segment, needed to complete the glyphs.
Scripting animation
Once the compression was finished, I had enough space available for the rendering code and the notebook styles. The font rendering code is very simple: all the chars are translated into the delta coords needed, in a sequence array (Z), taking the segment calls into account. Then a loop takes values from the array and draws with the canvas lineTo method until the array is empty.
p=Z.charCodeAt(); a.beginPath(); a.moveTo(x,y); a.lineTo(x+=((p>>4)-3)*1.2,y+=((p%16)-7)*1.2); a.stroke();
There are several special cases handled in the code, like pressing keys outside the [A..Z] range or the dots of the ‘i’ and ‘j’ glyphs.
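For readers who prefer an un-golfed version, the decoding one-liner above can be unpacked roughly as follows. This is a sketch, not the released code: the function names `deltasToPoints` and `drawDeltas` are hypothetical, the input is assumed to be a plain string of packed deltas with segment calls already resolved, and the special cases mentioned above are left out. The 1.2 scale factor and the bit arithmetic are taken from the original line.

```javascript
// Hypothetical, readable version of the decoding step; special cases omitted.
// Turns a string of packed deltas into absolute points, starting at (x0, y0).
function deltasToPoints(data, x0, y0, scale = 1.2) {
  const points = [[x0, y0]];
  let x = x0;
  let y = y0;
  for (let i = 0; i < data.length; i++) {
    const p = data.charCodeAt(i);
    x += ((p >> 4) - 3) * scale; // upper 3 bits: X delta in [-3, 4]
    y += ((p % 16) - 7) * scale; // lower 4 bits: Y delta in [-7, 8]
    points.push([x, y]);
  }
  return points;
}

// Draws one short line per delta, like the original snippet does.
// `ctx` is assumed to be a CanvasRenderingContext2D.
function drawDeltas(ctx, data, x0, y0, scale = 1.2) {
  const pts = deltasToPoints(data, x0, y0, scale);
  for (let i = 1; i < pts.length; i++) {
    ctx.beginPath();
    ctx.moveTo(pts[i - 1][0], pts[i - 1][1]);
    ctx.lineTo(pts[i][0], pts[i][1]);
    ctx.stroke();
  }
}
```

In a page with a canvas element, something like `drawDeltas(canvas.getContext("2d"), "0G", 50, 50)` would stroke two short lines. Keeping the point computation separate from the drawing makes the decoding easy to check without a canvas.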
The flower glyph
During the whole development time, I was in touch with my good friend Javier Guerrero (aka Infern0). He was testing and checking for bugs, and at the same time giving me feedback on improvements or better-looking styles. To fit the contest topic, “spring”, a hand-written flower couldn’t be a better choice. I sent Javi all the segments used in the compression stage, and he came back to me very fast with a proposal.
Just a lot of ‘c’ segments and the lower-part segment from the ‘j’. Special cases were added to the rendering loop to move the drawing coordinates and make the correct jumps for drawing the segments in the flower shape.
That is the complete explanation of the released prod. In the next chapter: what didn’t make it, a nice code-generated paper texture that couldn’t fit in the 1024-byte size.
Very nice. Have you considered coding this in glsl so it might be used within webgl?
Hi Dan, I didn’t consider it. You mean there are no vector-font libraries or font rendering capabilities in webgl? It should be no prob to port the Hershey package to webgl then. The font data can be used for font rendering on screen or for funnier things, like carving letters into 3D blocks or animated paths.
I can do that, when I get some spare time…
There’s nothing in WebGL itself (which only seems to care about triangles and pixels); various Javascript libraries add that capability, e.g. http://threejs.org/examples/webgl_geometry_text.html … and for serious text handling with lots of control, you’d want to do that sort of thing instead.
The experiment in http://glsl.heroku.com/e#9725.16 is about asking whether something basic but readable could be done completely on the GPU using these tiny A-Z fonts you’ve been exploring.
Update: http://glsl.heroku.com/e#9743.15 (just to link also from here since I ended up posting comments on both your blog posts by accident)