I imagine a 95MB text file might take some time to read in - not that I'll have that many words but anyway...
Well, you may not need to read it all, though you may not get "perfect randomness".
If you have a line-based file of 95MB with an average word length of 10 bytes:
1=> compute a random byte position between 0 and 95 million (the file size).
2=> seek to that byte position in the file and read 1KB.
3=> search for the next newline; the word after it is your candidate.
4=> check whether that word is marked as "used"; if it is, repeat from step 3 with the following word (if you reach the end of the 1KB block, load the next 1KB; at the end of the file, wrap around to its start).
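Steps 1 to 3 can be sketched like this (a minimal Python sketch; the function name is my own, and `readline` does the read-a-chunk-and-scan-for-a-newline work internally):

```python
import os
import random

def random_word(path):
    """Pick a roughly random word from a newline-separated word file by
    seeking to a random byte offset and reading the next full line.
    Not perfectly uniform: a word's chance of being picked grows with
    the length of the word stored before it."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(random.randrange(size))
        f.readline()               # skip the partial line we landed in
        word = f.readline()        # the next complete line is the candidate
        if not word:               # landed in the last line: wrap to the start
            f.seek(0)
            word = f.readline()
    return word.rstrip(b"\n").decode()
```

The "used" check from step 4 would then go around this: redraw (or scan forward) until the returned word is not in the used set.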
However, this only works well while a reasonable share of the words is still unused; otherwise you will search through more and more of the file.
Yet, if you keep the used words in a separate file (as "byte pos, word len") and load that file completely into memory, you can check whether the next candidate word is unused without touching the main words file.
Of course, that again only works if you can keep the "already used" file reasonably small. So keep only the last 100,000 used words and then recycle the oldest ones. Store the used info in binary, as
"record filepos: integer; wordlen: word end;" - at 6 bytes per record that is 600KB, which you can still load in a timely manner.
If you keep the entries in order, so that the first entry in the file is the one used longest ago and the last entry is the most recently used, then you need no timestamp. But you must sort the entries when you read them (or put them into a tree) to check membership quickly, so in memory you need the list twice (with some tricks you can get away with one copy).
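That 6-byte record ("filepos: integer; wordlen: word") maps directly onto Python's `struct` module; a sketch assuming a little-endian 4-byte position and 2-byte length (function names are my own):

```python
import struct
from collections import deque

RECORD = struct.Struct("<IH")   # 4-byte file position + 2-byte word length = 6 bytes

def load_used(path, limit=100_000):
    """Load used-word records oldest-first; deque(maxlen=...) recycles
    the oldest entries automatically once the limit is reached."""
    used = deque(maxlen=limit)
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return used
    data = data[: len(data) - len(data) % RECORD.size]  # drop a torn tail record
    for rec in RECORD.iter_unpack(data):
        used.append(rec)          # rec is (file_pos, word_len)
    return used

def save_used(path, used):
    """Write the records back, oldest first, 6 bytes each."""
    with open(path, "wb") as f:
        for rec in used:
            f.write(RECORD.pack(*rec))
```

100,000 records at 6 bytes each is the 600KB figure above; for fast membership tests, build `positions = {pos for pos, _ in used}` once after loading.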
If you do not want to store the used words by byte position, you can build an index into the words file that stores the byte position of every word, and then record only the ordinal position (1st, 2nd, ...) of each used word.
If the words do not differ too much in length, you can pad them all so that each is stored in exactly 20 bytes. Then you can store just a word number and calculate its byte position from it.
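With a fixed 20-byte record, word number n starts at byte n * 20, so no index file is needed at all. A hypothetical sketch (assumes no word is longer than 19 bytes):

```python
WIDTH = 20  # fixed record size; an assumption from the text above

def build_padded(words, path):
    """Write each word as exactly WIDTH bytes: the word, space padding,
    and a trailing newline so the file still looks line-based."""
    with open(path, "wb") as f:
        for w in words:
            f.write(w.encode().ljust(WIDTH - 1)[: WIDTH - 1] + b"\n")

def word_at(path, number):
    """Fetch word `number` directly; its byte position is number * WIDTH."""
    with open(path, "rb") as f:
        f.seek(number * WIDTH)
        return f.read(WIDTH).rstrip(b" \n").decode()
```

Picking a random word then reduces to `word_at(path, random.randrange(file_size // WIDTH))`, with no newline scanning and no bias toward words that follow long words.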
You can also group the words into many files, one file per word length, and store for each file the number of its first word, chosen so that the numbering has no overlaps.
The easy option is a database, which will do all that kind of work for you...
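For instance, with SQLite the whole pick-an-unused-word problem collapses into a couple of queries (table and column names here are my own; `ORDER BY RANDOM()` scans the candidate rows, which is acceptable at this scale):

```python
import sqlite3

def make_db(words):
    """Create an in-memory word table; a file path works the same way."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT, used INTEGER DEFAULT 0)"
    )
    conn.executemany("INSERT INTO words (word) VALUES (?)", ((w,) for w in words))
    return conn

def pick_unused(conn):
    """Return a random unused word and mark it used; recycle once all are used."""
    row = conn.execute(
        "SELECT id, word FROM words WHERE used = 0 ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    if row is None:
        conn.execute("UPDATE words SET used = 0")   # everything used: recycle
        row = conn.execute(
            "SELECT id, word FROM words ORDER BY RANDOM() LIMIT 1"
        ).fetchone()
    conn.execute("UPDATE words SET used = 1 WHERE id = ?", (row[0],))
    return row[1]
```

The database handles the bookkeeping (used flags, recycling, lookups) that the file-based schemes above do by hand.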