Lazarus
Programming => General => Topic started by: wytwyt02 on November 14, 2019, 09:46:07 pm
-
I want to parse some HTML files, and I do not want to use regex.
-
Maybe FastHTMLParser? I posted some examples here in the forum.
-
The chm compiler and compilelatexchm.pp in the fpcdoc/ repo use fasthtml and fcl-xml's sax_html, respectively.
-
I want to parse some HTML files, and I do not want to use regex.
FPC includes fcl-xml/src/sax_html.pp. It has THTMLReader and convenience routines like ReadHTMLFile (should've been called ReadHTMLDocument) and ReadHTMLFragment.
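
A minimal sketch of the ReadHTMLFile route, assuming the DOM, DOM_HTML and SAX_HTML units from fcl-xml are on the unit path. The file name page.html and the helper DumpNode are made up for illustration; the program loads the file into a THTMLDocument and prints the element names as an indented tree.

```pascal
program ReadHtmlDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, DOM_HTML, SAX_HTML;

// Hypothetical helper: prints the name of every element node,
// indented by two spaces per nesting level.
procedure DumpNode(Node: TDOMNode; Depth: Integer);
var
  Child: TDOMNode;
begin
  if Node.NodeType = ELEMENT_NODE then
    WriteLn(StringOfChar(' ', Depth * 2), Node.NodeName);
  Child := Node.FirstChild;
  while Child <> nil do
  begin
    DumpNode(Child, Depth + 1);
    Child := Child.NextSibling;
  end;
end;

var
  Doc: THTMLDocument;
begin
  // 'page.html' is a placeholder; ReadHTMLFile creates the document.
  ReadHTMLFile(Doc, 'page.html');
  try
    DumpNode(Doc, 0);
  finally
    Doc.Free;
  end;
end.
```

The tree is walked with FirstChild/NextSibling rather than looked up by tag name, so the sketch does not depend on how the reader normalizes the case of tag names.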
-
The chm compiler and compilelatexchm.pp in the fpcdoc/ repo use fasthtml and fcl-xml's sax_html, respectively.
How do I use fasthtml? I cannot see it in the package list or in the Online Package Manager.
-
The chm compiler and compilelatexchm.pp in the fpcdoc/ repo use fasthtml and fcl-xml's sax_html, respectively.
How do I use fasthtml? I cannot see it in the package list or in the Online Package Manager.
That's because it's not a package on its own... it's a unit. See fpc/packages/chm/src/fasthtmlparser.pas. Example usage is in fpc/packages/chm/src/htmlindexer.pas.
It's not a DOM parser though, as you requested. It signals events for tags and text. It does not build a DOM tree.
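
A rough sketch of the event-style usage, assuming the fasthtmlparser unit from fpc/packages/chm is on the unit path and that THTMLParser exposes OnFoundTag and OnFoundText as in that unit (check the signatures against your FPC version). THandler and the sample HTML string are made up for illustration.

```pascal
program FastHtmlDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, fasthtmlparser;

type
  // Hypothetical handler class; the parser events are "of object",
  // so the callbacks have to be methods.
  THandler = class
    procedure FoundTag(NoCaseTag, ActualTag: string);
    procedure FoundText(Text: string);
  end;

procedure THandler.FoundTag(NoCaseTag, ActualTag: string);
begin
  WriteLn('tag:  ', ActualTag);
end;

procedure THandler.FoundText(Text: string);
begin
  // Skip text that is only whitespace between tags.
  if Trim(Text) <> '' then
    WriteLn('text: ', Text);
end;

var
  Parser: THTMLParser;
  Handler: THandler;
begin
  Handler := THandler.Create;
  Parser := THTMLParser.Create('<p>Hello <b>world</b></p>');
  try
    Parser.OnFoundTag := @Handler.FoundTag;
    Parser.OnFoundText := @Handler.FoundText;
    Parser.Exec; // fires the events while scanning the string
  finally
    Parser.Free;
    Handler.Free;
  end;
end.
```

Nothing is retained after Exec returns; any structure you need has to be built inside the callbacks.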
-
This is also an option: https://benibela.de/sources_en.html#internettools
-
It's not a DOM parser though, as you requested. It signals events for tags and text. It does not build a DOM tree.
That's sax_html. Despite the name, AFAIK the SAX parser feeds the DOM by default.
-
This is also an option: https://benibela.de/sources_en.html#internettools
Only if your application is GPL.
-
It's not a DOM parser though, as you requested. It signals events for tags and text. It does not build a DOM tree.
That's sax_html. Despite the name, AFAIK the SAX parser feeds the DOM by default.
Yes, sax_html has THTMLToDOMConverter.
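
A sketch of wiring the two together by hand, assuming THTMLToDOMConverter.Create takes the reader, the target node and a fragment-mode flag, which is how I remember sax_html.pp; verify against your FPC sources. The sample HTML string is made up.

```pascal
program SaxToDomDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, DOM_HTML, SAX_HTML;

var
  Source: TStringStream;
  Reader: THTMLReader;
  Converter: THTMLToDOMConverter;
  Doc: THTMLDocument;
begin
  Source := TStringStream.Create('<html><body><p>Hello</p></body></html>');
  Doc := THTMLDocument.Create;
  Reader := THTMLReader.Create;
  try
    // The converter hooks into the reader's SAX events and
    // appends the resulting nodes to Doc.
    Converter := THTMLToDOMConverter.Create(Reader, Doc, False);
    try
      Reader.ParseStream(Source);
    finally
      Converter.Free;
    end;
    if Doc.DocumentElement <> nil then
      WriteLn('root element: ', Doc.DocumentElement.NodeName);
  finally
    Reader.Free;
    Doc.Free;
    Source.Free;
  end;
end.
```

As far as I can tell this is roughly what ReadHTMLFile does internally, so the manual version is mainly useful when the input is already in a stream or string.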