Is there any library that would make the process of making a website scraper easier?
Rave:
Basically, I want to make a web scraper for one of the sites I love, which has an awful UI, so that I can interact with it better. Is there any Lazarus/Free Pascal library that would make this process easier? I'd rather not parse HTML by hand.
Thaddy:
Our member benibela has a good library for that:
https://www.benibela.de/sources_en.html#internettools
Simple example that extracts all hrefs from a page:
--- Code: Pascal ---
uses
  simpleinternet, xquery;
var
  a: IXQValue;
begin
  // Fetch the page and print the value of every href attribute on an <a> element
  for a in process('https://freepascal.org', '//a/@href') do
    writeln(a.toString);
end.
--- End code ---
You need to undefine USE_PASDBLSTRUTILS_FOR_JSON in internettoolsconfig.inc.