About the disk storage.
You may want to consider using a separate file to store which words were already played.
That way you do not need to modify the original data file (or the update packs).
If you do a new release and the data file comes with more entries (added to the same main data file), you can replace the old data file and still keep the "played" info in the separate file.
It would also be in line with the recommendations of most OSes, at least multi-user ones like Windows, Linux and Mac.
If you distribute the app, it may be played by more than one person (OS user account) on one PC. So the "played" file could be saved in the user's home directory.
Also consider that if the app is installed normally (C:\Program Files\ or /usr/local/bin, /usr/local/share), the folder with the "words" file may not be writable by the user, so the file cannot be updated there. (Of course, you can install somewhere else...)
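A minimal sketch of picking a per-user location for the "played" file, assuming Free Pascal's SysUtils. GetAppConfigDir(False) returns a per-user configuration directory for the application; the file name played.txt is only a placeholder I made up.

```pascal
program PlayedPath;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  Dir, PlayedFile: string;
begin
  // False = per-user directory (True would ask for the system-wide one).
  Dir := GetAppConfigDir(False);
  // The directory may not exist yet on first run.
  if not ForceDirectories(Dir) then
    raise Exception.Create('Cannot create ' + Dir);
  // 'played.txt' is a hypothetical name, pick whatever suits your app.
  PlayedFile := IncludeTrailingPathDelimiter(Dir) + 'played.txt';
  WriteLn('Played words would be stored in: ', PlayedFile);
end.
```

The static "words" file can stay wherever the installer put it, since it is only read.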
The next question is whether you want the files to be human readable. That is probably a good idea.
If so, keep it simple. Any line-based file format will be fine.
For the static ("words" / "categories") files, ini files are OK, since there is existing code to read/write them.
For the "already played" file I would suggest just dumping the words, one per line, into a text file (TStringList can do that).
Note that if you keep the order in which they were played, you can limit how many words you save.
You could limit it to the last 1000 played words. Older ones can then be played again (if you want).
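As a rough sketch of the above: a TStringList kept in play order, trimmed to the newest entries and written one word per line. The limit of 1000 is the figure from above; the file name, the AddPlayed helper and the sample word are placeholders of mine.

```pascal
program PlayedHistory;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

const
  MaxPlayed = 1000; // how many played words to remember

// Append a word and drop the oldest entries beyond the limit.
// The list is kept in play order, so the oldest entry is at index 0.
procedure AddPlayed(Played: TStrings; const AWord: string);
begin
  Played.Add(AWord);
  while Played.Count > MaxPlayed do
    Played.Delete(0);
end;

var
  Played: TStringList;
  FileName: string;
begin
  FileName := 'played.txt'; // hypothetical; use a per-user path in a real app
  Played := TStringList.Create;
  try
    if FileExists(FileName) then
      Played.LoadFromFile(FileName); // one word per line
    AddPlayed(Played, 'example');
    Played.SaveToFile(FileName);
    WriteLn('Remembered words: ', Played.Count);
  finally
    Played.Free;
  end;
end.
```

Do not set Sorted on this list, or the play order (and with it the "oldest first" trimming) is lost.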
When you load the data, you may have to re-arrange it.
Note that if you sort "played", you may need to make a copy first, so you can also keep the original order for limiting the total amount remembered.
Once the "words" and the "played" lists are loaded, you need to combine the info, so you can mark/remove already played items.
With a large number of words (and lots already played), how you do that becomes crucial.
Say you have 1000 words, 500 played.
If you did
  for i := 0 to words.count - 1 do
    for j := 0 to played.count - 1 do
      if words[i].text = played[j].text then MarkPlayed;
that makes 500 thousand runs of the inner loop.
Make it 2000 words and 1000 played => 2 million runs.
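If you did
  played.sort;
  for i := 0 to words.count - 1 do
    if played.BinarySearch(words[i].text) then MarkPlayed;
BinarySearch is O(log n). (It stands for whatever binary search your list offers; a sorted TStringList has Find.) So 500 played entries means a loop of about 9 iterations per lookup. For 1000 words = 9000 iterations total.
For 1000 played out of 2000 words it is about 10 per lookup = 20000 iterations total.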
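A small runnable sketch of the binary search variant, assuming both lists are plain TStringLists. The lookup is done on a sorted copy so the original play order survives, and TStringList.Find does the binary search. The sample words are made up.

```pascal
program MarkPlayedFind;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;
var
  Words, Played, Lookup: TStringList;
  i, Idx: Integer;
begin
  Words := TStringList.Create;
  Played := TStringList.Create;
  Lookup := TStringList.Create;
  try
    // Sample data only.
    Words.Add('apple'); Words.Add('banana');
    Words.Add('cherry'); Words.Add('date');
    Played.Add('date'); Played.Add('banana'); // in play order

    // Sorted copy for searching; Played itself keeps its order.
    Lookup.Assign(Played);
    Lookup.CaseSensitive := True;
    Lookup.Sorted := True; // sorts now, and is required by Find

    for i := 0 to Words.Count - 1 do
      if Lookup.Find(Words[i], Idx) then
        WriteLn(Words[i], ' was already played');
  finally
    Lookup.Free;
    Played.Free;
    Words.Free;
  end;
end.
```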
And if you sort both lists
  played.sort; words.sort;
  p := 0;
  for i := 0 to words.count - 1 do begin
    while (p < played.count) and (played[p].text < words[i].text) do inc(p);
    if (p < played.count) and (words[i].text = played[p].text) then MarkPlayed;
  end;
you only need about words.count + played.count steps at most, because p never moves backwards.
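Here is one way the two-sorted-lists walk could look as a complete program, again assuming plain TStringLists and made-up sample words. One detail I am adding: the sort and the comparison in the loop must use the same ordering. TStringList.Sort compares locale-aware, which can differ from the plain string operators, so this sketch sorts with CustomSort and CompareStr and uses CompareStr in the loop too. OrdinalCompare is my own helper name.

```pascal
program MarkPlayedMerge;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

// Same ordering for sorting and for the walk below.
function OrdinalCompare(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareStr(List[Index1], List[Index2]);
end;

var
  Words, Played: TStringList;
  i, p: Integer;
begin
  Words := TStringList.Create;
  Played := TStringList.Create;
  try
    // Sample data only.
    Words.Add('date'); Words.Add('apple');
    Words.Add('cherry'); Words.Add('banana');
    Played.Add('date'); Played.Add('banana');

    // Sort a copy of Played instead if you still need the play order.
    Words.CustomSort(@OrdinalCompare);
    Played.CustomSort(@OrdinalCompare);

    p := 0;
    for i := 0 to Words.Count - 1 do
    begin
      // Skip played entries that sort before the current word.
      while (p < Played.Count) and (CompareStr(Played[p], Words[i]) < 0) do
        Inc(p);
      // The bounds check matters: p may have run past the end.
      if (p < Played.Count) and (Played[p] = Words[i]) then
        WriteLn(Words[i], ' was already played');
    end;
  finally
    Played.Free;
    Words.Free;
  end;
end.
```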
As for the data format in memory: create a record (or class) and specialize a list for it.
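A possible shape for that, as a sketch only. I picked a class plus TFPGObjectList from the fgl unit here, because items in the list can then be changed in place (with a record, Words[i] gives you a copy). The type names and the fields Text, Category and Played are my guesses at what you need.

```pascal
program WordListDemo;
{$mode objfpc}{$H+}
uses
  fgl;

type
  // Hypothetical item type; adjust the fields to your data.
  TWordItem = class
  public
    Text: string;
    Category: string;
    Played: Boolean;
    constructor Create(const AText, ACategory: string);
  end;

  // A typed list that holds TWordItem objects.
  TWordList = specialize TFPGObjectList<TWordItem>;

constructor TWordItem.Create(const AText, ACategory: string);
begin
  inherited Create;
  Text := AText;
  Category := ACategory;
  Played := False;
end;

var
  Words: TWordList;
  i: Integer;
begin
  Words := TWordList.Create(True); // True: the list frees its items
  try
    Words.Add(TWordItem.Create('apple', 'fruit'));
    Words.Add(TWordItem.Create('oak', 'trees'));
    Words[0].Played := True; // works in place because items are objects
    for i := 0 to Words.Count - 1 do
      WriteLn(Words[i].Text, ' (', Words[i].Category, ') played: ',
        Words[i].Played);
  finally
    Words.Free;
  end;
end.
```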