I want to change the case of Cyrillic text using a regex. IIUC, according to the info on
this page, Lazarus uses TRegExpr by Sorokin, which supports Unicode categories such as "\p{L}".
However, when I use "\p{L}*" in my expression, I get the following error:
TRegExpr compile: incorrect {} braces (pos 4).
If I change "\p{L}*" to "\w*", the search fails. If I set WordChars to 'абвгдежзиклмнопрстуфхцчшщьыъэюя', the search succeeds but the case change fails.
So the following code, which uses Latin text, works fine:
procedure TForm1.Button2Click(Sender: TObject);
var
  r: TRegExpr;
  s: string;
begin
  r := TRegExpr.Create();
  try
    s := 'hello';
    r.InputString := s;
    r.Expression := '(\w*)';
    if r.Exec() then
    begin
      ShowMessage(r.Match[0]);
      s := r.Replace(s, '\u$1', true);
      ShowMessage(s);
    end;
  finally
    r.Free();
  end;
end;
The following code, which uses Cyrillic text, doesn't work as expected: the match is found, but the case is not changed.
procedure TForm1.Button3Click(Sender: TObject);
var
  r: TRegExpr;
  s: String;
begin
  r := TRegExpr.Create();
  try
    s := 'привет';
    r.InputString := s;
    r.Expression := '(\w*)';
    // r.Expression := '(\p{L}*)'; // TRegExpr compile: incorrect {} braces (pos 4)
    r.WordChars := 'абвгдежзиклмнопрстуфхцчшщьыъэюя';
    if r.Exec() then
    begin
      ShowMessage(r.Match[0]);
      s := r.Replace(s, '\u$1', true);
      ShowMessage(s);
    end;
  finally
    r.Free();
  end;
end;
What is the proper way to change the case of Cyrillic text with TRegExpr?
Environment: Lazarus 2.2.6 x64 on Windows 11.