GSoC Weekly update #2

The GSoC coding period started last week, and I have been working on a very simple implementation of my proposal, consisting of a C program and two tiny bash scripts. My code is available here:

The first step in testing with these scripts is to create a file containing the words whose rendering is to be checked. Here I have used a sample test data file created by SMC a while ago (ml-harfbuzz-testdata.txt). Pass this file to the first script along with the font file to be tested. That is:

./ ml-harfbuzz-testdata.txt /path/to/fontfile

This will create a file named rendered_glyphs.txt that contains the output of HarfBuzz’s hb-shape utility, i.e. each glyph name followed by some additional numbers (which will be ignored for now).
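To make the "glyph name followed by numbers" part concrete, here is a minimal sketch in C of how the names could be pulled out of one line of hb-shape output. It assumes the file holds hb-shape's default text format, where each glyph is written as name=cluster+advance and glyphs are separated by '|' inside square brackets. The function name extract_glyph_names, the size limits and the sample glyph names are made up for illustration and are not taken from my actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_GLYPHS 64   /* hypothetical limit on glyphs per word   */
#define MAX_NAME   64   /* hypothetical limit on glyph name length */

/* Copy the glyph names out of one hb-shape output line such as
 * "[k1=0+1038|xx=0+0]". Everything from '=' up to the next '|' or ']'
 * (cluster, offsets, advances) is skipped. Returns the number of names
 * stored in `names`. */
static int extract_glyph_names(const char *line,
                               char names[][MAX_NAME], int max_names)
{
    assert(line != NULL && names != NULL);

    int count = 0;
    const char *p = strchr(line, '[');
    if (p == NULL)
        return 0;                         /* not an hb-shape line */
    p++;                                  /* step past '[' */

    while (*p != '\0' && *p != ']' && count < max_names) {
        size_t len = strcspn(p, "=|]");   /* length of the glyph name */
        if (len >= MAX_NAME)
            len = MAX_NAME - 1;           /* truncate over-long names */
        memcpy(names[count], p, len);
        names[count][len] = '\0';
        count++;

        p += strcspn(p, "|]");            /* skip the numeric fields */
        if (*p == '|')
            p++;                          /* move on to the next glyph */
    }
    return count;
}
```

Calling it with a char names[MAX_GLYPHS][MAX_NAME] buffer on the sample line "[k1=0+1038|xx=0+0]" should leave "k1" and "xx" in the first two slots, with the numbers discarded.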

Now create a file that contains the actual glyph names for the words in the test data file. I got this data from FontForge. The file has to be created manually and, as of now, must obey the following structure:






Also make sure that the glyph names for each word appear in the same order as the corresponding words in the test data file. I have named this file orig_glyphs.txt. Once this is done, we can pass the above two files to the executable built from rendering_testing.c, say rendering_testing. That is:

./rendering_testing orig_glyphs.txt rendered_glyphs.txt

This program compares the glyphs in order and, if it finds any pair that doesn’t match, writes to a file, result.txt, the line number at which the word appears in the test data file. Otherwise it tells you that the renderings are perfect.
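The comparison step can be sketched roughly as follows. This is not the code in rendering_testing.c, only an illustration under a simplifying assumption: both inputs have already been reduced to the same hypothetical layout, one word per line with its glyph names separated by single spaces, so that two lines match exactly when the glyph sequences match. The names compare_glyph_lines, chomp and MAX_LINE are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024   /* hypothetical limit; longer lines are not handled */

/* Remove a trailing "\n" or "\r\n" in place. */
static void chomp(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}

/* Read `expected` and `rendered` line by line in parallel. For every pair
 * of lines that differ, write the 1-based line number to `report`. If one
 * stream ends before the other, each leftover line counts as a mismatch.
 * Returns the number of mismatching lines. */
static long compare_glyph_lines(FILE *expected, FILE *rendered, FILE *report)
{
    assert(expected != NULL && rendered != NULL && report != NULL);

    char exp_line[MAX_LINE], ren_line[MAX_LINE];
    long line_no = 0, mismatches = 0;

    for (;;) {
        char *e = fgets(exp_line, sizeof exp_line, expected);
        char *r = fgets(ren_line, sizeof ren_line, rendered);
        if (e == NULL && r == NULL)
            break;                        /* both inputs exhausted */
        line_no++;

        int differ = 1;                   /* a missing line is a mismatch */
        if (e != NULL && r != NULL) {
            chomp(exp_line);
            chomp(ren_line);
            differ = strcmp(exp_line, ren_line) != 0;
        }
        if (differ) {
            fprintf(report, "%ld\n", line_no);
            mismatches++;
        }
    }
    return mismatches;
}
```

A return value of 0 corresponds to the "renderings are perfect" case; otherwise the report stream holds one line number per mismatching word. Checking the results of fgets before touching the buffers, and bounding every read with sizeof, is the kind of care that helps avoid crashes on unexpected input.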

Once this is done, to see the wrongly rendered words we have to run the third script. It takes as input the result.txt file, the test data file and the font file. That is:

./ result.txt ml-harfbuzz-testdata.txt /path/to/fontfile

This script will create PNG images of the wrongly rendered words in the current directory.

That is all about my scripts. The C code, however, is very inefficient, and it even crashes with a segmentation fault on some files. Once I have confirmed with my mentor that I am on the right path, I will work on improving my algorithm and making this code better. That will be next week’s work.


GSoC – Community engagement period

The list of approved GSoC 2013 projects was published on May 27th, and the community engagement period began on May 29th. During this period students are supposed to bond with their mentors, read the documentation and finalize their plans so that they can get a head start on their projects. The project I have been accepted for is “Automated rendering testing”, and I will be completing it under Swathanthra Malayalam Computing. I have learned a lot of new things so far during this community bonding period, with a great deal of help from my mentor Rajeesh K Nambiar, although I haven’t started actual coding yet. I will try to explain the status of my proposal and the next steps here in detail.

Basically, my project idea is to create an automated way to test how rendering engines like HarfBuzz render Indic fonts. The procedure I wish to follow is quite simple. First, create a test file containing a set of words, mostly ones with ligatures, that will be used for testing the rendering. Along with that, I will maintain a file that contains the glyph information for the words/characters in the test file for a particular font, say the Malayalam font Rachana. As of now I am preparing it manually, but I can switch to FontForge scripts if required.

Once I have all the test data, my main script will take the entries in the test file and shape them with HarfBuzz using the font Rachana. The words will be shaped using hb-shape, and the output glyph indices will be compared with the original glyph indices of these words that I have collected manually. If the glyph indices don’t match, an error flag will be set for that particular word. At the end of the comparisons, the words with the error flag set can be rendered using hb-view and stored in a separate HTML file. This file can then be inspected for rendering issues.
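The last step, collecting the flagged words into an HTML file, could look something like the sketch below. It is only one possible shape for that report and makes several assumptions: that an image has already been produced for each flagged word (for instance with hb-view), that the images follow a hypothetical word_<line number>.png naming scheme, and that the flagged line numbers are available in an array. The function name write_html_report is invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Write a minimal HTML page with one <img> per flagged line number.
 * The "word_<line>.png" file names are a hypothetical convention.
 * Returns the number of entries written. */
static size_t write_html_report(FILE *out, const long *lines, size_t count)
{
    assert(out != NULL);
    assert(lines != NULL || count == 0);

    fputs("<!DOCTYPE html>\n<html>\n<head><meta charset=\"utf-8\">"
          "<title>Rendering mismatches</title></head>\n<body>\n", out);

    for (size_t i = 0; i < count; i++) {
        /* One paragraph per flagged word, labelled with its line number. */
        fprintf(out,
                "<p>Line %ld: <img src=\"word_%ld.png\" alt=\"line %ld\"></p>\n",
                lines[i], lines[i], lines[i]);
    }

    fputs("</body>\n</html>\n", out);
    return count;
}
```

Keeping the report as plain images means the page shows exactly what was rendered by the engine under test, rather than whatever the browser's own text shaping would produce for the same words.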

This is what I will be implementing first. Depending on its efficiency, I may move to other solutions. In the above procedure, the most inefficient steps, I think, are creating the test file and collecting the glyph indices. We can resolve the latter by, maybe, using scripts for extraction or using the .ttx file of the font (which is quite complex). But the former is a real issue. If a user wants to check for rendering issues in a font, she will have to create this file with a set of words manually. I will have to think of a way to overcome this issue.

That’s it for now!