In the meantime I found the original TP Lex/Yacc archive, which contains several examples, including magic.l, which demonstrates the use of start states. Based on that, I now have this code, which seems to work:
%{
{$mode objfpc}{$H+}
uses LexLib;
var
  strval: string;
%}
%start initial str
%%
<initial>[0-9]+         begin
                          writeln('integer: ' + yytext);
                        end;
<initial>do             begin
                          writeln('keyword - do');
                        end;
<initial>loop           begin
                          writeln('keyword - loop');
                        end;
<initial>\"             begin
                          strval := '';
                          start(str);
                        end;
<str>\"                 begin
                          writeln('string: ' + strval);
                          start(initial);
                        end;
<str>\\\"               begin
                          strval := strval + '\"';
                        end;
<str>[^\"]              begin
                          strval := strval + yytext;
                        end;
.                       writeln('Caracter desperado!');
%%
begin
  start(initial);
  if yylex=0 then ;
end.
In the main block I begin in the initial state with start(initial). When the lexer encounters a " it switches to the str state and keeps appending characters to strval until it reaches the closing ", at which point it prints the string and switches back to initial. The <str>\\\" rule handles escaped " characters within the string. Here is an example session:
[jon@test test2]$ ./lexer
123
integer: 123
123 "abc 456" 789
integer: 123
string: abc 456
integer: 789
string: "\t line \n line \n\r"
string: string: \t line \n line \n\r
"\""
string: \"
"\nstring\t\"line"
string: \nstring\t\"line
This approach should let me parse quoted strings, comments, etc.
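For anyone wondering what the start states amount to, here is a rough hand-written equivalent of the state switching in plain Free Pascal. It is only a sketch and not what TP Lex actually generates. The names TState, stInitial, stStr and ScanLine are made up for illustration, and it handles only the integer and string rules. It skips all other characters silently, so keywords are ignored and nothing is reported for unmatched input.

```pascal
program StartStateSketch;
{$mode objfpc}{$H+}

type
  { Hypothetical names mirroring the "initial" and "str" start states }
  TState = (stInitial, stStr);

{ Scans one line and prints the integers and quoted strings it finds. }
procedure ScanLine(const Line: string);
var
  State: TState;
  StrVal, Num: string;
  i: Integer;
begin
  State := stInitial;
  StrVal := '';
  i := 1;
  while i <= Length(Line) do
  begin
    case State of
      stInitial:
        if Line[i] = '"' then
        begin
          { opening quote: like start(str) }
          StrVal := '';
          State := stStr;
          Inc(i);
        end
        else if Line[i] in ['0'..'9'] then
        begin
          { collect a run of digits, like [0-9]+ }
          Num := '';
          while (i <= Length(Line)) and (Line[i] in ['0'..'9']) do
          begin
            Num := Num + Line[i];
            Inc(i);
          end;
          WriteLn('integer: ', Num);
        end
        else
          Inc(i);  { everything else is skipped in this sketch }
      stStr:
        if (Line[i] = '\') and (i < Length(Line)) and (Line[i + 1] = '"') then
        begin
          { escaped quote: keep both characters, as the <str>\\\" rule does }
          StrVal := StrVal + '\"';
          Inc(i, 2);
        end
        else if Line[i] = '"' then
        begin
          { closing quote: like start(initial) }
          WriteLn('string: ', StrVal);
          State := stInitial;
          Inc(i);
        end
        else
        begin
          StrVal := StrVal + Line[i];
          Inc(i);
        end;
    end;
  end;
end;

begin
  ScanLine('123 "abc 456" 789');
  ScanLine('"\nstring\t\"line"');
end.
```

The first call prints integer: 123, string: abc 456 and integer: 789, and the second prints string: \nstring\t\"line, the same as the corresponding lines in the session above. The escaped-quote check has to come before the plain-quote check. In the lexer the same effect comes from the longest match winning, since <str>\\\" matches two characters and <str>\" only one.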