If I had any knowledge of .po files, I wouldn't mind having a go at it.
If I understand it correctly, msgid and msgstr (if not "") should hold the same format arguments?
I assume msgstr is the translation of msgid?
I also assume we would check the created bla.xx.po against bla.po?
Yes, msgstr is the translation of msgid, and msgid is originally defined in a Pascal source file under a resourcestring section.
For the validator program's purposes, the "master .po file" (bla.po in your example) can be used as the main source for resource strings.
The format params (%x) in those strings should be compared with the ones in the translated strings. The country code in translation files is typically 2 chars (like .de.po) but can be 5 chars (like .pt_BR.po).
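To make the comparison concrete, here is a rough sketch in Python (just for illustration; the real tool would be written in Pascal). The names format_args, args_match and language_code are made up for this example, and the regular expression covers only an assumed subset of the Format() syntax, so treat it as a starting point rather than a complete check.

```python
import re

# Assumed subset of the Pascal Format() syntax: %[index:][-][width][.prec]type
# "%%" is a literal percent sign and carries no argument.
FORMAT_SPEC = re.compile(r"%(?:%|(?:(\d+):)?-?\d*(?:\.\d+)?([a-zA-Z]))")

# Language suffix of a translation file: ".de.po" or ".pt_BR.po".
LANG_SUFFIX = re.compile(r"\.([a-z]{2}(?:_[A-Z]{2})?)\.po$")


def format_args(text):
    """Return the set of (argument position, type letter) pairs used in text."""
    args = set()
    pos = 0
    for m in FORMAT_SPEC.finditer(text):
        index, kind = m.group(1), m.group(2)
        if kind is None:
            continue  # literal "%%"
        if index is not None:
            # Assumption: an explicit index resets the running position.
            pos = int(index)
        args.add((pos, kind.lower()))
        pos += 1
    return args


def args_match(msgid, msgstr):
    """True if the translation uses the same arguments as the original."""
    if msgstr == "":
        return True  # untranslated entry, nothing to compare
    return format_args(msgid) == format_args(msgstr)


def language_code(filename):
    """Return 'de' for 'bla.de.po', 'pt_BR' for 'bla.pt_BR.po', else None."""
    m = LANG_SUFFIX.search(filename)
    return m.group(1) if m else None
```

Comparing (position, type) pairs instead of plain lists lets a translation reorder its arguments with explicit indexes (for example %1:s before %0:d) without being flagged.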
This should not be too difficult then?
What exactly do you mean by unused resourcestrings?
Can you also give an example of a duplicate resourcestring?
"Unused resourcestrings" means that a string is defined but not used anywhere in the project's Pascal source.
To find out if the string is used or not you need to scan all the source files. A simple search operation without context checking should be enough.
You will always find one instance of the string which is its definition. You can make a rule:
- if no instances are found -> something is wrong, should not happen.
- one instance found -> unused in source.
- more than one instance -> ok, used.
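The rule above could be sketched roughly like this (Python for illustration only; classify_usage and the sample sources are hypothetical). It assumes the whole source text is already loaded into memory and that a case-insensitive whole-word search is good enough, since Pascal identifiers are case-insensitive.

```python
import re


def classify_usage(name, sources):
    """Classify a resourcestring name by counting its occurrences.

    sources is a list of source file contents (strings). The search is a
    plain whole-word match with no context checking, so comments and
    string literals are counted too.
    """
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    count = sum(len(pattern.findall(text)) for text in sources)
    if count == 0:
        return "missing"  # not even the definition was found
    if count == 1:
        return "unused"   # only the definition itself
    return "used"


sources = [
    "resourcestring\n  rsHello = 'Hello';\n  rsBye = 'Bye';\n",
    "ShowMessage(rsHello);\n",
]
print(classify_usage("rsHello", sources))
print(classify_usage("rsBye", sources))
```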
"Duplicate resourcestring" means that 2 or more string definitions have the exact same text.
However they can't always be combined into 1 resourcestring because their meaning may depend on context and they may need a different translation.
In practice you need to keep all the string names and values in a hash map or tree map for fast lookup.
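As a small illustration of the hash-map idea (again in Python, with a made-up find_duplicates function and made-up string names), the values can be used as keys and the names collected per value. The result is only a report; whether two entries can really be merged still needs a human decision because of the context issue.

```python
from collections import defaultdict


def find_duplicates(strings):
    """Map each text that occurs more than once to the names that define it.

    strings maps resourcestring names to their text values.
    """
    by_value = defaultdict(list)
    for name, value in strings.items():
        by_value[value].append(name)
    return {value: names for value, names in by_value.items() if len(names) > 1}


strings = {
    "rsOpen": "Open",
    "rsOpenFile": "Open",
    "rsClose": "Close",
}
print(find_duplicates(strings))
```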
Would we want it to be a GUI or a console program (with batch processing of all .po files)?
Being a Lazarus package it should have a GUI. It could show reports of its findings in listboxes for example.
It could be simple at first. It usually happens that people (and the author himself) start to get ideas for improvements after a first simple version is done.
Like in my Tools -> Example Projects ... feature. It was very simple at first. Then Martin suggested a load of improvements and I figured out some more myself.
In other words, keep it simple initially. The refinement and complexity come later by themselves.
This validator would benefit all translated applications made with Lazarus, not only the Lazarus project.
Juha