Author Topic: Redirect problems downloading web file.  (Read 4245 times)

bobonwhidbey

  • Hero Member
  • *****
  • Posts: 592
    • Double Dummy Solver - free download
Redirect problems downloading web file.
« on: December 15, 2018, 08:57:49 pm »
With the code below I'm getting an empty file.
Code: Pascal
uses httpsend, ssl_openssl;
procedure SaveURL(URL, aFile: string);
var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    if HttpGetText(URL, SL) then
      SL.SaveToFile(aFile);
  finally
    SL.Free;
  end;
end;

When I used
Code: Pascal
var
  HTTPSender: THTTPSend;
  URL: string;
  k: integer;
begin
  URL := 'http://www.bridgebase.com/myhands/hands.php?traveller=5532-1544797794-37619840&username=jec';
  HTTPSender := THTTPSend.Create;
  try
    HTTPSender.HTTPMethod('GET', URL);
    k := HTTPSender.ResultCode;
  finally
    HTTPSender.Free;
  end;
end;

k is equal to 302...redirect, and I've been unable to read the web page.

When I use something like
 
Code: Pascal
  OpenURL(PChar(URL));
from LCLIntf, the web page shows up as expected in my default browser.

How can I get this page into a stringlist?
Lazarus 3.0RC2, FPC 3.2.2 x86_64-win64-win32/win64

sash

  • Sr. Member
  • ****
  • Posts: 366
Re: Redirect problems downloading web file.
« Reply #1 on: December 15, 2018, 09:04:08 pm »
You should repeat the request to the URI given in the "Location" response header.

https://en.wikipedia.org/wiki/HTTP_302
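
To make that step concrete, here is a minimal sketch of pulling the Location value out of a list of response header lines. It assumes the headers arrive as one "Name: value" string per entry in a TStrings (as later replies say Synapse's Headers list does after a request); GetLocationHeader and the sample header values are made up for illustration and are not part of any library.

```pascal
program FindLocation;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

{ Hypothetical helper: returns the value of the first "Location:" header
  in Headers, or '' if there is none. Header names are case-insensitive,
  so the comparison ignores case. }
function GetLocationHeader(Headers: TStrings): string;
const
  Prefix = 'location:';
var
  i: Integer;
  Line: string;
begin
  Result := '';
  for i := 0 to Headers.Count - 1 do
  begin
    Line := Headers[i];
    if SameText(Copy(Line, 1, Length(Prefix)), Prefix) then
    begin
      // Everything after the colon, without surrounding blanks
      Result := Trim(Copy(Line, Length(Prefix) + 1, MaxInt));
      Exit;
    end;
  end;
end;

var
  Hdrs: TStringList;
begin
  Hdrs := TStringList.Create;
  try
    // Made-up response headers, for illustration only
    Hdrs.Add('HTTP/1.1 302 Found');
    Hdrs.Add('Content-Type: text/html');
    Hdrs.Add('Location: /example/login.php');
    WriteLn(GetLocationHeader(Hdrs));
  finally
    Hdrs.Free;
  end;
end.
```

If the returned value is not empty, the request is repeated against it. Note that the value may be relative to the original URL, in which case it has to be completed first.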
Lazarus 2.0.10 FPC 3.2.0 x86_64-linux-gtk2 @ Ubuntu 20.04 XFCE

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: Redirect problems downloading web file.
« Reply #2 on: December 15, 2018, 09:07:00 pm »
[...] k is equal to 302...redirect, and I've been unable to read the web page.

When I use something like
 
Code: Pascal
  OpenURL(PChar(URL));
from LCLIntf, the web page shows up as expected in my default browser.

How can I get this page into a stringlist?

Follow the redirection. That's what the browser does and how it shows you the page.

Edit: Yeah, what Sash says

ETA: Also see this page in the wiki: download a file over http. It doesn't use Synapse, but it should work.
« Last Edit: December 15, 2018, 09:29:03 pm by lucamar »
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: Redirect problems downloading web file.
« Reply #3 on: December 15, 2018, 10:20:34 pm »
Indeed, I wrote an overview of the different methods in the wiki. You should look at the last example or the second last one (c and/or d).
It uses fcl-web, because that is a standard package in FPC and Synapse is third-party. Parts of fcl-web are based on Synapse, though. With permission. Like the ssl code.
« Last Edit: December 15, 2018, 10:23:17 pm by Thaddy »
Specialize a type, not a var.

bobonwhidbey

  • Hero Member
  • *****
  • Posts: 592
    • Double Dummy Solver - free download
Re: Redirect problems downloading web file.
« Reply #4 on: December 15, 2018, 10:36:37 pm »
I tried the approach at http://wiki.freepascal.org/download_a_file_over_http

Code: Pascal
procedure SaveURL(URL, aFile: string);
var
  Client: TFPHttpClient;
  FS: TStream;
  SL: TStringList;
begin
  InitSSLInterface; // SSL initialization has to be done by hand here
  Client := TFPHttpClient.Create(nil);
  FS := TFileStream.Create(aFile, fmCreate or fmOpenWrite);
  try
    try
      Client.AllowRedirect := True;  // Allow redirections
      Client.Get(PChar(URL), FS);
    except
      on E: EHttpClient do
        writeln(E.Message)
      else
        raise;
    end;
  finally
    FS.Free;
    Client.Free;
  end;

  { Test our file }
  if FileExists(aFile) then
  begin
    SL := TStringList.Create; // create before "try" so Free never sees an uninitialized SL
    try
      SL.LoadFromFile(aFile);
      writeln(SL.Text);
    finally
      SL.Free;
    end;
  end;
end;

This bombed at the line
      Client.Get(PChar(URL), FS);

with a message ...Invalid protocol: ""

Same error when I tried:
      Client.Get(URL, FS);

URL is a valid address.

Any idea what went wrong?

This worked fine with
  URL := 'https://google.com'; 

but not with
  URL := 'http://www.bridgebase.com/myhands/hands.php?traveller=5532-1544797794-37619840&username=jec';

« Last Edit: December 15, 2018, 10:51:49 pm by bobonwhidbey »
Lazarus 3.0RC2, FPC 3.2.2 x86_64-win64-win32/win64

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: Redirect problems downloading web file.
« Reply #5 on: December 15, 2018, 10:50:54 pm »
Strange. But the wiki examples are all console programs, so it may be a Lazarus UTF8 issue.
I just ran it and it works as a console program:
Code: Pascal
{$mode delphi}{$ifdef windows}{$apptype console}{$endif}
uses
  classes, fphttpclient, fpopenssl, openssl;
var
  Client: TFPHttpClient;
begin
  { SSL initialization has to be done by hand here }
  InitSSLInterface; // this is fixed in trunk
  Client := TFPHttpClient.Create(nil);
  try
    { Allow redirections }
    Client.AllowRedirect := true;
    writeln(Client.Get('http://www.bridgebase.com/myhands/hands.php?traveller=5532-1544797794-37619840&username=jec'));
  finally
    Client.Free;
  end;
end.

You probably forgot to include the two ssl units. In that case it can't handle https.....
« Last Edit: December 15, 2018, 10:53:51 pm by Thaddy »
Specialize a type, not a var.

bobonwhidbey

  • Hero Member
  • *****
  • Posts: 592
    • Double Dummy Solver - free download
Re: Redirect problems downloading web file.
« Reply #6 on: December 15, 2018, 11:22:34 pm »
I'm sure part of the problem is that this web site looks for a password stored in the browser's cookies. As mentioned, this works with
     URL := 'https://google.com';

This procedure from LCLIntf works fine, displaying in my browser.
  OpenURL(PChar(URL));

Is there a way to redirect the output from the browser to a file?
Lazarus 3.0RC2, FPC 3.2.2 x86_64-win64-win32/win64

Leledumbo

  • Hero Member
  • *****
  • Posts: 8746
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Redirect problems downloading web file.
« Reply #7 on: December 16, 2018, 07:17:35 am »
I'm sure part of the problem is that this web site looks for a password stored in the browser's cookies. As mentioned, this works with
If you're VERY sure about it (but I'm not), then do it in the browser (I assume you use Chrome) with developer tools open. Navigate to the Network tab, find the request and click it; the cookies should be listed below the request headers. Copy them to TFPHTTPClient.Cookies, then call TFPHTTPClient.Get as usual.

sash

  • Sr. Member
  • ****
  • Posts: 366
Re: Redirect problems downloading web file.
« Reply #8 on: December 16, 2018, 01:09:04 pm »
Quote from: bobonwhidbey
I'm sure part of the problem is that this web site looks for a password stored in the browser's cookies.

A test with disabled cookies shows that it is not: redirected page has generated html content.

Quote from: bobonwhidbey
This procedure from LCLIntf works fine, displaying in my browser.
  OpenURL(PChar(URL));
Is there a way to redirect the output from the browser to a file?

This procedure simply launches another process with "currently registered default browser" application.
To save requested content to a file you'll need to tweak (browser specific) commandline options.

The problem is that TFPHTTPClient has a parsing bug (it raises SErrInvalidProtocol even though the site is served as plain-text HTTP on port 80), and Synapse's THTTPSend does not return the response headers, so you cannot get the redirect location with the stock HTTPMethod.

Suggestions: use Indy, or write your own version of HTTPMethod to access the response headers.
Lazarus 2.0.10 FPC 3.2.0 x86_64-linux-gtk2 @ Ubuntu 20.04 XFCE

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: Redirect problems downloading web file.
« Reply #9 on: December 16, 2018, 01:26:18 pm »
1. If it relies on already set cookies you need either to know the cookie or use the default browser indeed. The cookie belongs to the application that set it.
2. The code I wrote for the wiki also contains file storage for https example
3. Synapse can return the response headers, but not with the simple calls. The same as Indy... so that advice does not hold at all.
The wiki entry is not finished, but I will give examples for synapse like for fcl-web.
Specialize a type, not a var.

wp

  • Hero Member
  • *****
  • Posts: 11858
Re: Redirect problems downloading web file.
« Reply #10 on: December 16, 2018, 01:49:33 pm »
http://wiki.freepascal.org/download_a_file_over_http is a nice write-up of web-access by means of fcl routines and will get a prominent place in my bookmark collection.

The https sections do not mention, however, that the SSL DLLs (libeay32.dll and ssleay32.dll) must be somewhere the program can find them, either in the correct Windows system directory or directly in the application's exe folder. This matters more than explicitly adding fpopenssl and openssl to "uses", which can be omitted altogether: when I have the DLLs in the exe directory of dl_fphttp_b and remove fpopenssl and openssl, the program runs without issues. A note on the correct download locations of the DLLs, and on the problem that the 32-bit and 64-bit DLLs share the same names (even the 64-bit DLLs have "32" in their name), would certainly be very helpful to novice users, too.
« Last Edit: December 16, 2018, 01:58:31 pm by wp »

bobonwhidbey

  • Hero Member
  • *****
  • Posts: 592
    • Double Dummy Solver - free download
Re: Redirect problems downloading web file.
« Reply #11 on: December 16, 2018, 06:17:59 pm »
I've used many of the suggested approaches, but they all seem to be variants of the same theme - having my Pascal app act like a browser. And they all work except for this particular web site.

For this web site it's necessary to utilize the user's default browser - specifically because of the password/cookie issue. I was hoping there might be some way to PIPE the output from the user's monitor to a file. Then my app would open the file and parse the HTML.
Lazarus 3.0RC2, FPC 3.2.2 x86_64-win64-win32/win64

sash

  • Sr. Member
  • ****
  • Posts: 366
Re: Redirect problems downloading web file.
« Reply #12 on: December 16, 2018, 07:24:39 pm »
In the https sections
The site being discussed is not encrypted; it is plain-text http.

specifically because of the password/cookie issue.

As I said before, there's no cookie issue; there's a redirect location issue.
Specifically, this site uses a relative redirect URL (which is not common, but has been valid since RFC 7231 in 2014).
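
As a rough sketch of what completing such a relative Location involves, the snippet below glues an absolute-path value (one starting with "/") onto the scheme and host of the original URL. ResolveLocation is a hypothetical helper and the Location value is made up. The sketch deliberately covers only this one case plus already-absolute values; other relative forms ("page.php", "../x", "//host/x") are passed through unchanged, and it assumes the original URL contains a "/" after the host whenever it has a query string.

```pascal
program ResolveRedirect;
{$mode objfpc}{$H+}
uses
  SysUtils, StrUtils;

{ Hypothetical helper: turns a Location value into an absolute URL.
  Only an absolute-path reference ("/path...") is resolved against
  BaseURL; anything else is returned as it came in. }
function ResolveLocation(const BaseURL, Location: string): string;
var
  SchemeEnd, HostEnd: Integer;
begin
  Result := Location;
  if (Length(Location) > 0) and (Location[1] = '/') and
     (Copy(Location, 1, 2) <> '//') then
  begin
    SchemeEnd := Pos('://', BaseURL);
    if SchemeEnd = 0 then
      Exit;                                  // base has no scheme: give up
    // First '/' after "scheme://" marks the end of the host part
    HostEnd := PosEx('/', BaseURL, SchemeEnd + 3);
    if HostEnd = 0 then
      Result := BaseURL + Location           // base is just "scheme://host"
    else
      Result := Copy(BaseURL, 1, HostEnd - 1) + Location;
  end;
end;

begin
  // The Location value here is invented for the example
  WriteLn(ResolveLocation(
    'http://www.bridgebase.com/myhands/hands.php?traveller=5532-1544797794-37619840&username=jec',
    '/example/login.php'));
end.
```

The resolved URL is what the repeated GET request should be sent to.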

My bad, I overlooked comments in Synapse and said
Quote
Synapse's one does not return Response Headers

which is wrong: it sends Headers with the request and then replaces the contents of the same StringList with the response values. So Location is available.

Quote
I was hoping there might be some way to PIPE the output from the user's monitor to a file.
Forget it (OpenURL). The best you can do with this approach: TProcess + specific commandline.

UPD: The only foreseeable problem is that you're (probably) requesting a URL that is normally accessible only to logged-in users (logins are normally done by POST requests carrying HTML form variables), and it seems you've missed that step, so you're being redirected to a login form.

But you should always handle the "login required" situation (e.g. when the session has expired, or for whatever other reason) in the middle of your data exchange process.

Summary: try logging in first, and chances are you won't get any 302s at all (for the first time :-)).
« Last Edit: December 16, 2018, 07:52:28 pm by sash »
Lazarus 2.0.10 FPC 3.2.0 x86_64-linux-gtk2 @ Ubuntu 20.04 XFCE

wittbo

  • Full Member
  • ***
  • Posts: 150
Re: Redirect problems downloading web file.
« Reply #13 on: October 10, 2019, 06:55:15 am »
For me there was a similar problem with redirection and error "Invalid protocol ''".

After some tests, I found a solution. The problem is that some servers don't send a full URI in the redirection response when the redirect only points to another page on the same host. In this case the "Location" header contains the new page without any protocol or host part.
To handle those cases you can use the TFPHTTPClient event property OnRedirect. Supply a small check routine that completes the URI. What you have to do is:

  try
    { Allow redirections }
    Client.AllowRedirect := true;
    Client.OnRedirect    := @CheckURI;   // this tells the Client how to handle redirection;
    writeln(Client.Get('http://www.bridgebase.com/myhands/hands.php?traveller=5532-1544797794-37619840&username=jec'));
  finally
    Client.Free;
  end;


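And provide this check routine. OnRedirect is a method-pointer event ("of object"), so CheckURI has to be a method of a class; TMyClass below is only a placeholder for whichever class (your form, for instance) owns it:

procedure TMyClass.CheckURI (Sender: TObject; const ASrc: String; var ADest: String);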
var newURI     : TURI;
    OriginalURI: TURI;
begin
   newURI := ParseURI (ADest, False);
   if (newURI.Host = '') then begin                         // NewURI does not contain protocol or host
      OriginalURI          := ParseURI (ASrc, False);       // use the original URI...
      OriginalURI.Path     := newURI.Path;                  // ... with the new subpage (path)...
      OriginalURI.Document := newURI.Document;              // ... and the new document info...
      ADest                := EncodeURI (OriginalURI)       // ... and return the complete redirected URI
   end
end;

Last but not least, don't forget to add the unit URIParser to your uses clause.

I tried it and it works well, with my URI and your example as well.

-wittbo-
MBAir with MacOS 10.14.6 / Lazarus 2.2.4
MacStudio with MacOS 13.0.1 / Lazarus 2.2.4

 
