Author Topic: [Solved] Workaround for HTTPSender.ResultCode=403  (Read 1073 times)

bobonwhidbey

  • Sr. Member
  • ****
  • Posts: 376
    • Double Dummy Solver - free download
[Solved] Workaround for HTTPSender.ResultCode=403
« on: June 09, 2019, 11:38:14 pm »
When I call the LCLIntf function
Code: Pascal
OpenDocument(Url);

the correct web page appears in my default browser, as expected. But I want the HTML contents of the web page saved to a file. When I use
Code: Pascal
function DownloadHTTP(URL, TargetFile: string): boolean;
var
  HTTPSender: THTTPSend;
  k: integer;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  try
    HTTPSender.HTTPMethod('GET', URL);
    k := HTTPSender.ResultCode;
    if (k >= 100) and (k <= 299) then
    begin
      HTTPSender.Document.SaveToFile(TargetFile);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;

I get ResultCode = 403 for the same URL. Is there a way to use the default browser but have the result written to a file, which could then be read and parsed?
« Last Edit: June 16, 2019, 02:07:43 am by bobonwhidbey »
Win10 64-bit / Lazarus 32-bit 2.0.6 / FPC 3.0.4

Leledumbo

  • Hero Member
  • *****
  • Posts: 8114
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Workaround for HTTPSender.ResultCode=403
« Reply #1 on: June 10, 2019, 07:02:15 am »
The possibilities are theoretically infinite, but I would guess the server expects a well-known User-Agent header. Try passing Chrome's.

bobonwhidbey

Re: Workaround for HTTPSender.ResultCode=403
« Reply #2 on: June 10, 2019, 05:49:39 pm »
The OpenDocument approach uses my default browser (Chrome) and works, but it sends the output to my screen. It would seem that I have two roads to a solution:
  • Somehow have OpenDocument put the output in a file, perhaps with something along the lines of piping, OR
  • Figure out how to make the web site happy with the DownloadHTTP approach
As you say, there are an unlimited number of potential problems with the 2nd approach. Is there some way to pipe the screen output to a file, perhaps via ShellExecute?

Thaddy

  • Hero Member
  • *****
  • Posts: 9292
Re: Workaround for HTTPSender.ResultCode=403
« Reply #3 on: June 11, 2019, 08:01:55 am »
I would first try Leledumbo's suggestion, because the default useragent in Synapse is rather old. That was also my first guess.
also related to equus asinus.

bobonwhidbey

Re: Workaround for HTTPSender.ResultCode=403
« Reply #4 on: June 11, 2019, 08:33:06 am »
I have no idea how to follow through on that suggestion. Can you point me to an article or give me an idea of how to do it?

sstvmaster

  • Full Member
  • ***
  • Posts: 131
Re: Workaround for HTTPSender.ResultCode=403
« Reply #5 on: June 11, 2019, 08:05:34 pm »
Code: Pascal
function DownloadHTTP(URL, TargetFile: string): boolean;
var
  HTTPSender: THTTPSend;
  k: integer;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  // This is the UserAgent! The string below is only an example.
  HTTPSender.UserAgent := 'Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/20100101 Firefox/5.0';
  try
    HTTPSender.HTTPMethod('GET', URL);
    k := HTTPSender.ResultCode;
    if (k >= 100) and (k <= 299) then
    begin
      HTTPSender.Document.SaveToFile(TargetFile);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;

More information:
- https://en.wikipedia.org/wiki/User_agent
- https://developers.whatismybrowser.com/useragents/explore/
« Last Edit: June 11, 2019, 08:08:43 pm by sstvmaster »
Lazarus 2.0.4 x32
Lazarus 2.1.0 Trunk x32
OS Win 7 32bit

sash

  • Sr. Member
  • ****
  • Posts: 289
Re: Workaround for HTTPSender.ResultCode=403
« Reply #6 on: June 12, 2019, 09:09:39 am »
I am almost absolutely sure (I don't have access to the server's code, after all) that this has nothing to do with the UserAgent.

If you're getting a 403, the most typical cause is that you're visiting a non-public page that you previously logged in to, so your browser holds a session cookie while your HTTP client lacks one.
Lazarus 2.0.6 FPC 3.0.4 x86_64-linux-gtk2 -- Ubuntu 19.10 XFCE
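
For the session-cookie case described above, a minimal sketch of how a cookie could be sent with Synapse. It assumes the cookie value has already been copied from the browser by hand. The helper name DownloadWithCookie, the URL, the file name, and the cookie are all hypothetical and only illustrate the idea.

```pascal
program CookieSketch;

{$mode objfpc}{$H+}

uses
  Classes,
  httpsend; // Synapse; for https:// URLs also add an SSL plugin unit such as ssl_openssl

// Hypothetical helper, not from the thread: GET a page while sending one cookie.
function DownloadWithCookie(const URL, TargetFile, Cookie: string): boolean;
var
  HTTPSender: THTTPSend;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  try
    // Each entry in Cookies is a 'name=value' pair; Synapse joins the
    // entries into a single Cookie request header.
    HTTPSender.Cookies.Add(Cookie);
    if HTTPSender.HTTPMethod('GET', URL) and
       (HTTPSender.ResultCode >= 200) and (HTTPSender.ResultCode <= 299) then
    begin
      HTTPSender.Document.SaveToFile(TargetFile);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;

begin
  // Placeholder URL, file name and cookie; substitute real values.
  if DownloadWithCookie('http://example.com/members.html', 'page.html',
                        'PHPSESSID=0123456789abcdef') then
    WriteLn('Saved')
  else
    WriteLn('Download failed');
end.
```

In this thread the cookie turned out not to be the cause, so the sketch only matters for sites that really do require a login.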

bobonwhidbey

Re: Workaround for HTTPSender.ResultCode=403
« Reply #7 on: June 16, 2019, 02:07:21 am »
Thank you SSTVMASTER. Your idea worked perfectly. I merely added this:

Code: Pascal
HTTPSender.UserAgent := 'Mozilla/5.0';

after the create, and all went smoothly.

Sorry I didn't get back to you earlier, but I was away from my PC.

Leledumbo

Re: Workaround for HTTPSender.ResultCode=403
« Reply #8 on: June 17, 2019, 07:48:22 am »
Quote from: sash
I am almost absolutely sure (I don't have access to the server's code, after all) that this has nothing to do with the UserAgent.

If you're getting a 403, the most typical cause is that you're visiting a non-public page that you previously logged in to, so your browser holds a session cookie while your HTTP client lacks one.

Seems like it does ;)
Some sites optimize their view depending on the browser requesting the page (or simply always want to know what they are being requested with), and when it is unknown, rather than risk a possibly broken render, they send permission denied instead. GitHub does this as well, so it's a fairly common (ab?)use of such a response code.

Thaddy

Re: Workaround for HTTPSender.ResultCode=403
« Reply #9 on: June 17, 2019, 09:04:36 am »
Quote from: Leledumbo
so it's kind of common (ab?)use of such a response code.

No, a page may not render correctly in older browsers, so the 403 is correct. It is not abuse. Hence, simply upgrade (actually spoof!) your useragent to something more recent.
The useragent identifies the feature set of the client to the server. Even if you do not use a browser, the server expects support for a baseline feature set before it gives a valid response.
This is also called "content negotiation". See https://en.wikipedia.org/wiki/User_agent
« Last Edit: June 17, 2019, 09:11:33 am by Thaddy »

trev

  • Sr. Member
  • ****
  • Posts: 254
  • Former Delphi 7 and Delphi 10.2 User
Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #10 on: June 17, 2019, 09:44:30 am »
Quote
6.5.3.  403 Forbidden

   The 403 (Forbidden) status code indicates that the server understood
   the request but refuses to authorize it.  A server that wishes to
   make public why the request has been forbidden can describe that
   reason in the response payload (if any).
Source: RFC 7231

So, yes, not abuse, but also not very friendly when the reason for the response could be included in the payload. microchip.com is one not-very-friendly site I encountered using fphttpclient.
o Lazarus v2.1.0 r61775, FPC v3.3.1 r42640, macOS 10.14.6 (with sup update), Xcode 10.3
o Lazarus v2.1.0 r61574, FPC v3.3.1 r42318, FreeBSD 12.0 (Parallels VM)
o Lazarus v2.1.0 r61574, FPC v3.0.4, Ubuntu 18.04 (Parallels VM)
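
Since a server may explain a 403 in the response payload, it can help to look at the body before guessing at the cause. The following is a minimal sketch, assuming Synapse's THTTPSend as used earlier in the thread. The helper name FetchPage and the URL are hypothetical.

```pascal
program ShowErrorBody;

{$mode objfpc}{$H+}

uses
  Classes,
  httpsend; // Synapse; for https:// URLs also add an SSL plugin unit such as ssl_openssl

// Hypothetical helper: returns the status code and copies the response body,
// whatever the status was, into Body.
function FetchPage(const URL: string; out Body: string): integer;
var
  HTTPSender: THTTPSend;
begin
  Body := '';
  HTTPSender := THTTPSend.Create;
  try
    HTTPSender.HTTPMethod('GET', URL);
    Result := HTTPSender.ResultCode;
    // Document is a TMemoryStream holding the raw response body.
    if HTTPSender.Document.Size > 0 then
    begin
      SetLength(Body, HTTPSender.Document.Size);
      HTTPSender.Document.Position := 0;
      HTTPSender.Document.ReadBuffer(Body[1], Length(Body));
    end;
  finally
    HTTPSender.Free;
  end;
end;

var
  Code: integer;
  Body: string;
begin
  Code := FetchPage('http://example.com/page.html', Body); // placeholder URL
  if (Code < 200) or (Code > 299) then
  begin
    WriteLn('Request refused with status ', Code);
    WriteLn(Body); // may or may not contain the server's explanation
  end;
end.
```

Whether the body says anything useful depends entirely on the server; many send only a generic error page.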

sash

Re: Workaround for HTTPSender.ResultCode=403
« Reply #11 on: June 17, 2019, 09:53:58 am »
Quote from: Thaddy
No, a page may not be rendered correctly for older browsers, so the 403 is correct.

How do they know it is "older"? They simply don't recognize this useragent string and don't bother with the actual feature set, which should really be "content-negotiated" with Accept* headers.

The problem is that 403 is a very generic status. It would probably be OK if there were a meaningful description in the body of the 403 response.
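
For reference, a minimal sketch of how Accept* request headers could be sent with Synapse, assuming the same THTTPSend class used earlier in the thread. The URL and the header values are placeholders, and nothing in this thread shows that the site in question pays attention to them.

```pascal
program AcceptHeaderSketch;

{$mode objfpc}{$H+}

uses
  Classes,
  httpsend; // Synapse; for https:// URLs also add an SSL plugin unit such as ssl_openssl

var
  HTTPSender: THTTPSend;
begin
  HTTPSender := THTTPSend.Create;
  try
    // Lines added to Headers before the call are sent as extra request headers.
    HTTPSender.Headers.Add('Accept: text/html,application/xhtml+xml;q=0.9,*/*;q=0.8');
    HTTPSender.Headers.Add('Accept-Language: en');
    HTTPSender.HTTPMethod('GET', 'http://example.com/page.html'); // placeholder URL
    WriteLn('Status: ', HTTPSender.ResultCode);
    // After the call, Headers holds the response headers instead.
    WriteLn(HTTPSender.Headers.Text);
  finally
    HTTPSender.Free;
  end;
end.
```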

Thaddy

Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #12 on: June 17, 2019, 09:58:11 am »
No, they know the useragent string, but their site is not capable of rendering for old browsers. Mozilla/4.0 is really old, so they throw an error page :o 8-) O:-)
I will submit a patch to Synapse to change the useragent string to something that is 10 years old rather than 20. Such patches usually still get applied.
Actually this is useragent spoofing, since no browser is used. See the link to Wikipedia.
Anyway, this way you can obtain a fully rendered page.
« Last Edit: June 17, 2019, 10:02:30 am by Thaddy »

bobonwhidbey

Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #13 on: June 17, 2019, 05:56:39 pm »
This is a really old website. The page in question was last updated in 2001.

Thaddy

Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #14 on: June 17, 2019, 06:27:19 pm »
It is not a question of the page but of the last server update. It may be hosted, in which case the hoster will keep the server software up to date. In fact, everybody keeps their servers up to date because of security.