Author Topic: Get HTML page  (Read 14746 times)

xinyiman

  • Hero Member
  • *****
  • Posts: 2211
    • Lazarus and Free Pascal italian community
Get HTML page
« on: March 23, 2010, 09:32:09 am »
I have a problem: I would like to retrieve the HTML code of a specific page on the Internet. Say I want the HTML of www.liuheschool.com (which is my site): how do I do that? I would like to load it into a string and then analyze it. Any ideas or suggestions?
Win10, Ubuntu and Mac
Lazarus: 2.1.0
FPC: 3.3.1

Peter_Vadasz

  • New Member
  • *
  • Posts: 35
Re: Get HTML page
« Reply #1 on: March 23, 2010, 11:56:54 am »
Try the HTTPSend unit from Ararat Synapse.
Here is an example (sorry, the texts in the example are in Hungarian):
Code: [Select]
{$MODE objfpc}{$H+}
program getsite;

// Requires Ararat Synapse (unit httpsend) on the unit search path.
uses httpsend,Classes,SysUtils;

var HTTP : THTTPSend;
    page : TStringList;
    i : longint;
    talalt : boolean;   // "talalt" = "found"
    mp : word;          // wait time in seconds
    c: integer;         // error position returned by val

// Prints the usage text (in Hungarian) and exits. In English:
// ./getsite url searched_word wait_time_in_seconds
procedure help;
begin
  writeln('A program hasznalata :');
  writeln('   ./getsite url keresett_szo varakozasi_ido_masodpercben');
  writeln('    ahol az ');
  writeln('       url a weblap cime,');
  Writeln('       keresett_szo a szo amit a lapon keresunk es a');
  writeln('       varakozasi_ido_masodpercben az ido amig a program var');
  writeln('       mielott ujra keresni kezd, ha elsore megtalalta a keresett');
  writeln('       szot az oldalon');
  writeln;
  halt(1);
end;

begin
  if (paramcount<3) or (pos('--HELP',UpCase(ParamStr(1)))<>0) then
    help;

  // Parse the wait time once; c<>0 means the third parameter is not a number.
  val(ParamStr(3),mp,c);
  if c<>0 then
    help;

  HTTP:=THTTPSend.Create;
  page:=TStringList.Create;
  try
    repeat
      if not HTTP.HTTPMethod('GET',ParamStr(1)) then
        begin
          writeln('Hiba tortent!!!');      // "An error occurred!!!"
          writeln('A program kilep...');   // "The program exits..."
          ExitCode:=1;
          break;
        end;
      page.LoadFromStream(HTTP.Document);
      // Case-insensitive search, line by line; stop at the first hit.
      talalt:=false;
      for i:=0 to page.Count-1 do
        if pos(UpCase(ParamStr(2)),UpCase(page[i]))<>0 then
          begin
            talalt:=true;
            break;
          end;
      HTTP.Clear;
      page.Clear;
      if talalt then
        begin
          writeln('Találat');   // "Found"
          flush(Output);
          Sleep(mp*1000);       // SysUtils.Sleep takes milliseconds
        end;
    until not talalt;   // check again for as long as the word is found
  finally
    HTTP.Free;
    page.Free;
  end;
end.
This code tries to find a given text in the source of the downloaded web page.
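For the original question (getting the page source into a single string), a shorter sketch may be enough. It assumes Ararat Synapse is installed and uses the HttpGetText helper from its httpsend unit. The program name pagetostring, the variable names and the plain http:// prefix on the URL are only illustrative assumptions.

```pascal
{$MODE objfpc}{$H+}
program pagetostring;

// Assumes Ararat Synapse (unit httpsend) is on the unit search path.
uses httpsend, Classes, SysUtils;

var
  lines: TStringList;
  html: string;
begin
  lines := TStringList.Create;
  try
    // HttpGetText performs a GET and stores the response body line by line.
    if HttpGetText('http://www.liuheschool.com', lines) then
    begin
      html := lines.Text;   // the whole page as one string
      writeln('Downloaded ', Length(html), ' characters');
    end
    else
      writeln('Download failed');
  finally
    lines.Free;
  end;
end.
```

Once the page is in html, it can be searched with Pos or handed to whatever parser is used for the analysis.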
OS: Ubuntu 12.04.2 32 bit
Lazarus: 1.0.8
FPC: 2.6.2

xinyiman

  • Hero Member
  • *****
  • Posts: 2211
    • Lazarus and Free Pascal italian community
Re: Get HTML page
« Reply #2 on: March 23, 2010, 12:59:41 pm »
OK, is this method cross-platform?

Peter_Vadasz

  • New Member
  • *
  • Posts: 35
Re: Get HTML page
« Reply #3 on: March 23, 2010, 02:00:06 pm »
Yes, Ararat Synapse is cross-platform (I have used it under Windows and Linux).

xinyiman

  • Hero Member
  • *****
  • Posts: 2211
    • Lazarus and Free Pascal italian community
Re: Get HTML page
« Reply #4 on: March 23, 2010, 04:24:12 pm »
Thanks, but what are the parameters of the program above? I cannot understand what language its texts are in, so I have no idea how to test it!

Peter_Vadasz

  • New Member
  • *
  • Posts: 35
Re: Get HTML page
« Reply #5 on: March 23, 2010, 06:03:05 pm »
First, sorry for my bad English.

The program that I posted is a simple Free Pascal program. You run it with three parameters on the command line. The first parameter is the URL of the site that you want to check. The second parameter is a word (short text) that the program tries to find in the source code of that web page. The third parameter is a number of seconds. If the program finds the given text in the page, it waits that many seconds and checks the URL again, and it keeps doing so for as long as the text is found on the page.
As I wrote in my first post, the program contains Hungarian texts.
Here is the English version of the program:
Code: [Select]
{$MODE objfpc}{$H+}
program getsite;

// Requires Ararat Synapse (unit httpsend) on the unit search path.
uses httpsend,Classes,SysUtils;

var HTTP : THTTPSend;
    page : TStringList;
    i : longint;
    talalt : boolean;   // "talalt" = "found"
    mp : word;          // wait time in seconds
    c: integer;         // error position returned by val

procedure help;
begin
  writeln('Usage :');
  writeln('   ./getsite url word second');
  writeln('    where ');
  writeln('       url is the URL of a webpage,');
  Writeln('       word is a short text (only 1 word) that we want to find in the site');
  writeln('       second is the time in seconds that the program waits before the next search if it found the word in the first/previous search');
  writeln;
  halt(1);
end;

begin
  if (paramcount<3) or (pos('--HELP',UpCase(ParamStr(1)))<>0) then
    help;

  // Parse the wait time once; c<>0 means the third parameter is not a number.
  val(ParamStr(3),mp,c);
  if c<>0 then
    help;

  HTTP:=THTTPSend.Create;
  page:=TStringList.Create;
  try
    repeat
      if not HTTP.HTTPMethod('GET',ParamStr(1)) then
        begin
          writeln('Error!!!');
          writeln('Program halted');
          ExitCode:=1;
          break;
        end;
      page.LoadFromStream(HTTP.Document);
      // Case-insensitive search, line by line; stop at the first hit.
      talalt:=false;
      for i:=0 to page.Count-1 do
        if pos(UpCase(ParamStr(2)),UpCase(page[i]))<>0 then
          begin
            talalt:=true;
            break;
          end;
      HTTP.Clear;
      page.Clear;
      if talalt then
        begin
          writeln('Found');
          flush(Output);
          Sleep(mp*1000);   // SysUtils.Sleep takes milliseconds
        end;
    until not talalt;   // check again for as long as the word is found
  finally
    HTTP.Free;
    page.Free;
  end;
end.
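The search step can also be tried on its own, without any network access. The following is only a sketch: the function name ContainsWord, the program name findword and the sample lines are made up for illustration and are not part of Synapse or of the program above.

```pascal
{$MODE objfpc}{$H+}
program findword;

uses Classes, SysUtils;

// Returns True if any line of page contains word, ignoring case.
function ContainsWord(page: TStrings; const word: string): boolean;
var
  i: integer;
  needle: string;
begin
  Result := False;
  needle := UpperCase(word);
  for i := 0 to page.Count - 1 do
    if Pos(needle, UpperCase(page[i])) <> 0 then
      Exit(True);
end;

var
  page: TStringList;
begin
  page := TStringList.Create;
  try
    // Hypothetical page source standing in for a downloaded document.
    page.Add('<html><body>');
    page.Add('<h1>Welcome to my School</h1>');
    page.Add('</body></html>');
    writeln(ContainsWord(page, 'school'));    // word is present (case differs)
    writeln(ContainsWord(page, 'missing'));   // word is absent
  finally
    page.Free;
  end;
end.
```

In the real program the string list would be filled from the downloaded document instead of the hard-coded lines.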

xinyiman

  • Hero Member
  • *****
  • Posts: 2211
    • Lazarus and Free Pascal italian community
Re: Get HTML page
« Reply #6 on: March 24, 2010, 08:02:55 am »
Thank you!
