
Author Topic: thtmldocument get text nodes  (Read 366 times)

BubikolRamios

thtmldocument get text nodes
« on: June 13, 2019, 06:40:06 am »
Code: HTML
  ...
  <a>foo</a><a>foo1</a>
  ...

I'm looking for something like this (pseudocode):
Code: Pascal
  // GetTextNodes is hypothetical; it is the method being asked about
  domNodeList := thtmldocument.GetTextNodes();

which would return the list of text nodes: foo, foo1

Is there something like that?
« Last Edit: June 13, 2019, 06:42:13 am by BubikolRamios »
lazarus-2.0.2-fpc-3.0.4-win32

wp

Re: thtmldocument get text nodes
« Reply #1 on: June 13, 2019, 07:27:38 am »
Lazarus trunk / fpc 3.0.4 / all 32-bit on Win-10

marcov

Re: thtmldocument get text nodes
« Reply #2 on: June 13, 2019, 02:31:08 pm »
AFAIK this is done by walking the tree and looking for DOM nodes named #text (that is, text nodes). Code in compilelatexchm.pp in the "fpcdocs" repository does something like that.
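
A minimal sketch of the tree walk described above, not the code from compilelatexchm.pp. It assumes the FPC units DOM, DOM_HTML and SAX_HTML; the names CollectText, Nodes and Src are made up for illustration. The DOM unit has no ready-made "get text nodes" call that I know of, so the walk checks NodeType against TEXT_NODE (the node type whose NodeName is #text).

Code: Pascal
```pascal
program CollectTextNodes;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, DOM_HTML, SAX_HTML;

// Hypothetical helper: recursively appends every text node below ANode
// to AList. The list only holds references; the nodes stay owned by
// the document.
procedure CollectText(ANode: TDOMNode; AList: TFPList);
var
  Child: TDOMNode;
begin
  Child := ANode.FirstChild;
  while Assigned(Child) do
  begin
    if Child.NodeType = TEXT_NODE then
      AList.Add(Child)
    else
      CollectText(Child, AList);
    Child := Child.NextSibling;
  end;
end;

var
  Doc: THTMLDocument;
  Src: TStringStream;
  Nodes: TFPList;
  i: Integer;
begin
  // Sample input taken from the first post, wrapped in html/body.
  Src := TStringStream.Create('<html><body><a>foo</a><a>foo1</a></body></html>');
  Nodes := TFPList.Create;
  try
    ReadHTMLFile(Doc, Src);
    try
      CollectText(Doc, Nodes);
      // Print the value of each collected text node.
      for i := 0 to Nodes.Count - 1 do
        WriteLn(TDOMNode(Nodes[i]).NodeValue);
    finally
      Doc.Free;
    end;
  finally
    Nodes.Free;
    Src.Free;
  end;
end.
```

For the sample input this should list the two text nodes foo and foo1. Real pages will usually also yield whitespace-only text nodes, which may need to be filtered out.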

BubikolRamios

Re: thtmldocument get text nodes
« Reply #3 on: June 14, 2019, 11:40:08 am »
To clarify: I don't need the complete text; thtmldocument.TextContent already gives me that.


Example: get a list of elements and modify them inside thtmldocument
Code: Pascal
  domNodeList := thtmldocument.GetElementsByTagName('img');
  for i := 0 to domNodeList.Count - 1 do
  begin
    // set the src attribute of every img element in place
    TDOMElement(domNodeList[i]).SetAttribute('src', someString);
  end;

So I need something to use instead of

Code: Pascal
  thtmldocument.GetElementsByTagName('img');

that gives me the text nodes. I need to modify them one by one, by regex replace in place, inside thtmldocument.
« Last Edit: June 14, 2019, 11:42:35 am by BubikolRamios »
lazarus-2.0.2-fpc-3.0.4-win32
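
One way the in-place regex replacement asked about above could be sketched, under some assumptions: it uses ReplaceRegExpr from FPC's RegExpr unit, converts each node value to UTF-8 and back (the right conversion may depend on the FPC version and on how RegExprString is defined), and the helper name ReplaceInText and the pattern foo(\d*) are invented for the example.

Code: Pascal
```pascal
program ReplaceInTextNodes;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, DOM_HTML, SAX_HTML, RegExpr;

// Hypothetical helper: walks the subtree below ANode and rewrites the
// value of each text node in place, leaving elements untouched.
procedure ReplaceInText(ANode: TDOMNode; const APattern, AReplacement: String);
var
  Child: TDOMNode;
  OldText, NewText: String;
begin
  Child := ANode.FirstChild;
  while Assigned(Child) do
  begin
    if Child.NodeType = TEXT_NODE then
    begin
      // Assumption: RegExprString is a byte string, so convert via UTF-8.
      OldText := UTF8Encode(Child.NodeValue);
      // True enables $1-style substitution in the replacement string.
      NewText := ReplaceRegExpr(APattern, OldText, AReplacement, True);
      if NewText <> OldText then
        Child.NodeValue := UTF8Decode(NewText);
    end
    else
      ReplaceInText(Child, APattern, AReplacement);
    Child := Child.NextSibling;
  end;
end;

var
  Doc: THTMLDocument;
  Src: TStringStream;
begin
  Src := TStringStream.Create('<html><body><a>foo</a><a>foo1</a></body></html>');
  try
    ReadHTMLFile(Doc, Src);
    try
      // Rename "foo<digits>" to "bar<digits>" in every text node.
      ReplaceInText(Doc, 'foo(\d*)', 'bar$1');
      // Show the text of the modified document.
      if Assigned(Doc.DocumentElement) then
        WriteLn(Doc.DocumentElement.TextContent);
    finally
      Doc.Free;
    end;
  finally
    Src.Free;
  end;
end.
```

Because only the node values change, the surrounding elements and their attributes stay as they were, so this can be combined with the GetElementsByTagName loop above on the same document.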