
Topic: Hyphenation routine to Spanish  (Read 18532 times)

typo

  • Hero Member
  • *****
  • Posts: 3051
Hyphenation routine to Spanish
« on: March 08, 2010, 11:48:28 am »
Does anyone know of a hyphenation routine for Spanish, Portuguese, or even English?
« Last Edit: March 08, 2010, 02:30:56 pm by typo »

typo

Re: Hyphenation routine to Spanish
« Reply #1 on: March 10, 2010, 04:37:27 pm »
Please answer.

jesusr

  • Sr. Member
  • ****
  • Posts: 484
Re: Hyphenation routine to Spanish
« Reply #2 on: March 10, 2010, 09:09:21 pm »
If you find one implemented in Pascal, let me know.

If nobody answers, it is probably because nobody knows.

... the problem is who nobody is  :D


typo

Re: Hyphenation routine to Spanish
« Reply #3 on: March 10, 2010, 09:45:20 pm »
I would be satisfied with a routine that recognizes 90% of syllables without using dictionaries.

jesusr

Re: Hyphenation routine to Spanish
« Reply #4 on: March 10, 2010, 10:00:43 pm »
There is a rudimentary hyphenation routine in LazReport that uses undocumented rules. Check the BreakWord routine in Lazarus/components/lazreport/source/lr_class.pas:2458

José Mejuto

  • Full Member
  • ***
  • Posts: 136
Re: Hyphenation routine to Spanish
« Reply #5 on: March 10, 2010, 10:11:00 pm »
Quote
I would be satisfied with a routine that recognizes 90% of syllables without using dictionaries.

Hello,

Here is one for Spanish, written in C, which should be quite easy to port to Pascal: http://tip.dis.ulpgc.es/es/silabas/descargar
Note that it uses ANSI characters and you may need UTF-8; in that case the UTF8Tools package will help you.

And here is a description of the basic rules for splitting Spanish words into syllables: http://f01.middlebury.edu/SP210A/gramatica/acentu-index.html
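For readers who want a starting point for the dictionary-free approach discussed above, here is a rough sketch of a rule-based Spanish syllable splitter. It is not a port of the linked C program and is not taken from the linked rules page; it applies the commonly taught rules (a single consonant between vowels goes with the following vowel, groups such as "tr", "gr", "pl", "ch", "ll" and "rr" stay together, and two strong vowels a/e/o are split). It assumes lowercase ASCII input, so accented vowels, "ü", and accent-induced hiatus are deliberately not handled, and it marks syllable boundaries rather than typographically acceptable hyphenation points. All names (Syllabify, IsOnsetPair, and so on) are made up for this example.

```pascal
program SilabasDemo;
{$mode objfpc}{$H+}

function IsVowel(c: Char): Boolean;
begin
  Result := c in ['a', 'e', 'i', 'o', 'u'];
end;

// "Strong" vowels; two of them in a row belong to different syllables.
function IsStrong(c: Char): Boolean;
begin
  Result := c in ['a', 'e', 'o'];
end;

// Consonant pairs that are kept together at the start of a syllable.
function IsOnsetPair(a, b: Char): Boolean;
begin
  Result := ((b = 'r') and (a in ['b', 'c', 'd', 'f', 'g', 'k', 'p', 't', 'r']))
    or ((b = 'l') and (a in ['b', 'c', 'f', 'g', 'k', 'p', 'l']))
    or ((a = 'c') and (b = 'h'));
end;

// Returns W with '-' inserted at the syllable boundaries it finds.
function Syllabify(const W: string): string;
var
  i, j, n, Onset: Integer;
  BreakBefore: array of Boolean;
begin
  n := Length(W);
  SetLength(BreakBefore, n + 1); // new elements start out False
  i := 2;
  while i <= n do
  begin
    if IsVowel(W[i]) then
    begin
      // Hiatus: two strong vowels are split; other pairs stay together.
      if IsStrong(W[i - 1]) and IsStrong(W[i]) then
        BreakBefore[i] := True;
      Inc(i);
    end
    else if IsVowel(W[i - 1]) then
    begin
      // W[i..j-1] is a run of consonants that follows a vowel.
      j := i;
      while (j <= n) and not IsVowel(W[j]) do
        Inc(j);
      if j <= n then
      begin
        // The next syllable takes one consonant, or two if they form a pair.
        if (j - i >= 2) and IsOnsetPair(W[j - 2], W[j - 1]) then
          Onset := 2
        else
          Onset := 1;
        BreakBefore[j - Onset] := True;
      end;
      i := j;
    end
    else
      Inc(i); // consonants at the very start of the word
  end;

  Result := '';
  for i := 1 to n do
  begin
    if BreakBefore[i] then
      Result := Result + '-';
    Result := Result + W[i];
  end;
end;

begin
  WriteLn(Syllabify('electroencefalografistas')); // e-lec-tro-en-ce-fa-lo-gra-fis-tas
  WriteLn(Syllabify('guerra'));                   // gue-rra
  WriteLn(Syllabify('poeta'));                    // po-e-ta
end.
```

Because this splits off every syllable, including a single leading vowel, a real hyphenator would still need to filter the result, for example by refusing to leave one letter alone at the start or end of a line.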

theo

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1927
Re: Hyphenation routine to Spanish
« Reply #6 on: March 11, 2010, 11:05:29 am »
I once wrote a wrapper class for the hyphen library.
http://sourceforge.net/projects/hunspell/files/ see hyphen 2.5
There are many dictionaries for it:
http://wiki.services.openoffice.org/wiki/Dictionaries#Spanish_.28Spain.2C_....29
See "Hyphenation" under "Spanish"

This is probably the most professional solution you can get.

There are different ways to use the wrapper. For example:

Code:
procedure TForm1.Button1Click(Sender: TObject);
var
  hList: TList;
  hyp: THyphen;
  i, Start: integer;
  AWord: UTF8String;
begin
  hList := TList.Create;
  try
    // Arguments: the hyphenation dictionary, then the hyphen shared library.
    hyp := THyphen.Create('/home/theo/install/hyphen-2.5/hyph_es_ES.dic',
      '/home/theo/install/hyphen-2.5/.libs/libhyphen.so');
    try
      AWord := 'electroencefalografistas';
      Start := 1;
      // hList receives the break positions, stored as integers cast to pointers.
      hyp.Hyphenate(UTF8LowerCase(AWord), hList);
      for i := 0 to hList.Count - 1 do
      begin
        Memo1.Lines.Add(Copy(AWord, Start, PtrUInt(hList[i]) - Start + 1));
        Start := PtrUInt(hList[i]) + 1;
      end;
      // Whatever remains after the last break position.
      Memo1.Lines.Add(Copy(AWord, Start, Length(AWord) - Start + 1));
    finally
      hyp.Free;
    end;
  finally
    hList.Free;
  end;
end;

Result:

Quote
elec-
tro-
en-
ce-
fa-
lo-
gra-
fis-
tas

If you need the wrapper, tell me.

typo

Re: Hyphenation routine to Spanish
« Reply #7 on: March 15, 2010, 03:38:55 pm »
Yes, I want it.

BTW, when I try to compile uHunSpellLib.pas, I get this error message:

uHunSpellLib.pas(64,28) Error: Illegal type conversion: "ShortString" to "^Char"

in this line:

DLLHandle := LoadLibrary(PAnsiChar(libraryName));

Any suggestions?
« Last Edit: March 15, 2010, 03:57:24 pm by typo »
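The thread never answers this question directly. A likely cause, stated here as an assumption because uHunSpellLib.pas itself is not shown: the unit is compiled without long strings enabled, so "string" means ShortString, and Free Pascal does not allow a ShortString to be typecast to PAnsiChar. Adding {$H+} (or {$mode delphi}) at the top of the unit, or declaring libraryName as AnsiString, normally removes the error. A minimal sketch using the dynlibs unit that ships with Free Pascal; the library file name is only a placeholder:

```pascal
program LoadLibDemo;
{$mode objfpc}{$H+}  // {$H+} makes "string" an AnsiString, not a ShortString

uses
  dynlibs;

var
  LibraryName: AnsiString;
  AsPChar: PAnsiChar;
  Handle: TLibHandle;
begin
  // Placeholder name; substitute the library you actually ship.
  LibraryName := 'libhyphen.so';

  // This cast is legal for an AnsiString. With a ShortString it produces
  // the "Illegal type conversion" error quoted above.
  AsPChar := PAnsiChar(LibraryName);
  WriteLn('Trying to load ', AsPChar);

  // dynlibs.LoadLibrary accepts the string itself, so no cast is needed here.
  Handle := LoadLibrary(LibraryName);
  if Handle = NilHandle then
    WriteLn('Could not load ', LibraryName)
  else
  begin
    WriteLn('Loaded ', LibraryName);
    UnloadLibrary(Handle);
  end;
end.
```

If the original unit must keep calling a LoadLibrary variant that takes a PChar, changing only the declaration of libraryName to AnsiString should be enough for the existing cast to compile.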

theo

Re: Hyphenation routine to Spanish
« Reply #8 on: March 15, 2010, 04:04:55 pm »
Quote
Yes, I want it.

OK, http://www.theo.ch/lazarus/hyphen.zip

You do not need Hunspell for Hyphenation.

typo

Re: Hyphenation routine to Spanish
« Reply #9 on: March 15, 2010, 04:10:19 pm »
Thanks.

typo

Re: Hyphenation routine to Spanish
« Reply #10 on: March 15, 2010, 04:15:50 pm »
Is the Hyphen.Dll necessary? How can I get it? Is it possible to get the source code of it?
« Last Edit: March 15, 2010, 05:33:46 pm by typo »

theo

Re: Hyphenation routine to Spanish
« Reply #11 on: March 15, 2010, 05:57:05 pm »
Quote
How can I get it? Is it possible to get the source code of it?

It is all written above. Direct download: http://sourceforge.net/projects/hunspell/files/Hyphen/2.5/hyphen-2.5.tar.gz/download

It is easy to build on Linux. I don't know about Windows; it probably needs MinGW to build.
http://www.mingw.org/
or
http://www.bloodshed.net/devcpp.html
« Last Edit: March 15, 2010, 05:58:51 pm by theo »

theo

Re: Hyphenation routine to Spanish
« Reply #12 on: March 15, 2010, 06:56:27 pm »
Here we go: http://www.theo.ch/lazarus/hyphen.zip
The dll is now included. I don't know exactly what I did, but I built it somehow with Dev C++ and it seems to work :-)

typo

Re: Hyphenation routine to Spanish
« Reply #13 on: March 15, 2010, 07:40:35 pm »
OK, the project is now compiling and running. Thank you very much.

theo

Re: Hyphenation routine to Spanish
« Reply #14 on: March 15, 2010, 09:12:52 pm »
Quote
OK, the project is now compiling and running. Thank you very much.

Good. You should probably convert the *.dic files to UTF-8 so that they work properly.
You can see in the first line of the file hyph_pt_PT.dic that it is in ISO8859-1.
Convert the file and change the header (that first line) to UTF-8.
This could also be solved in code, but that would be more complicated.
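The conversion described above can be done with any text editor or converter. As an illustration only, here is a small Free Pascal sketch that does both steps on raw bytes. It assumes the input really is ISO-8859-1 and that the first line reads exactly ISO8859-1, as theo describes for hyph_pt_PT.dic. The program and function names are made up for this example.

```pascal
program DicToUtf8;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Re-encode ISO-8859-1 bytes as UTF-8: bytes below $80 are unchanged,
// every other byte becomes a two-byte sequence.
function Latin1ToUtf8(const S: AnsiString): AnsiString;
var
  i, n: Integer;
  b: Byte;
begin
  Result := '';
  SetLength(Result, 2 * Length(S)); // worst case: every byte doubles
  n := 0;
  for i := 1 to Length(S) do
  begin
    b := Ord(S[i]);
    if b >= $80 then
    begin
      Inc(n);
      Result[n] := AnsiChar($C0 or (b shr 6));
      b := $80 or (b and $3F);
    end;
    Inc(n);
    Result[n] := AnsiChar(b);
  end;
  SetLength(Result, n);
end;

// Replace a first line of "ISO8859-1" with "UTF-8"; otherwise return S as is.
function FixHeader(const S: AnsiString): AnsiString;
var
  p: Integer;
begin
  p := 1;
  while (p <= Length(S)) and not (S[p] in [#10, #13]) do
    Inc(p);
  if Trim(Copy(S, 1, p - 1)) = 'ISO8859-1' then
    Result := 'UTF-8' + Copy(S, p, Length(S))
  else
    Result := S;
end;

function ReadAll(const FileName: string): AnsiString;
var
  fs: TFileStream;
  n: Integer;
begin
  Result := '';
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    n := fs.Size;
    SetLength(Result, n);
    if n > 0 then
      fs.ReadBuffer(Result[1], n);
  finally
    fs.Free;
  end;
end;

procedure WriteAll(const FileName: string; const Data: AnsiString);
var
  fs: TFileStream;
begin
  fs := TFileStream.Create(FileName, fmCreate);
  try
    if Length(Data) > 0 then
      fs.WriteBuffer(Data[1], Length(Data));
  finally
    fs.Free;
  end;
end;

begin
  if ParamCount <> 2 then
  begin
    WriteLn('Usage: dictoutf8 <input.dic> <output.dic>');
    Halt(1);
  end;
  WriteAll(ParamStr(2), FixHeader(Latin1ToUtf8(ReadAll(ParamStr(1)))));
end.
```

Run it once per file and write to a new file name: applying it to a file that is already UTF-8 would encode the non-ASCII characters a second time.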

 
