Author Topic: [Solved] Workaround for HTTPSender.ResultCode=403  (Read 3287 times)

bobonwhidbey

  • Hero Member
  • *****
  • Posts: 586
    • Double Dummy Solver - free download
[Solved] Workaround for HTTPSender.ResultCode=403
« on: June 09, 2019, 11:38:14 pm »
When I call LCLIntf's
Code: Pascal
OpenDocument(Url)

The correct web page appears in my default browser, as expected. But I want the html contents of the web page sent to a file. When I use
Code: Pascal
function DownloadHTTP(URL, TargetFile: string): boolean;
var
  HTTPSender: THTTPSend;
  k: integer;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  try
    HTTPSender.HTTPMethod('GET', URL);
    k := HTTPSender.ResultCode;

    if (k >= 100) and (k <= 299) then
    begin
      HTTPSender.Document.SaveToFile(TargetFile);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;

I get ResultCode = 403 for the same URL. Is there a way to use the default browser but have the result written to a file, which could then be read and parsed?
« Last Edit: June 16, 2019, 02:07:43 am by bobonwhidbey »
Lazarus 3.0RC2, FPC 3.2.2 x86_64-win64-win32/win64

Leledumbo

  • Hero Member
  • *****
  • Posts: 8744
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Workaround for HTTPSender.ResultCode=403
« Reply #1 on: June 10, 2019, 07:02:15 am »
The possibilities are theoretically infinite, but my guess is that the server expects a well-known User-Agent header. Try passing Chrome's.

bobonwhidbey

Re: Workaround for HTTPSender.ResultCode=403
« Reply #2 on: June 10, 2019, 05:49:39 pm »
The OpenDocument approach utilizes my default browser (Chrome) and works - but puts the output to my monitor. It would seem that I have two roads to a solution:
  • Somehow have OpenDocument put output in a file. Perhaps something along the lines of piping, OR
  • Figure out how to make the web site happy with the DownloadHTTP approach
As you say, there are an unlimited number of potential problems with the 2nd approach. Is there some way to pipe monitor output to a file, perhaps via a ShellExecute?

Thaddy

  • Hero Member
  • *****
  • Posts: 14163
  • Probably until I exterminate Putin.
Re: Workaround for HTTPSender.ResultCode=403
« Reply #3 on: June 11, 2019, 08:01:55 am »
I would first try Leledumbo's suggestion, because Synapse's default user agent is rather old. It was also my first guess.
Specialize a type, not a var.

bobonwhidbey

Re: Workaround for HTTPSender.ResultCode=403
« Reply #4 on: June 11, 2019, 08:33:06 am »
I have no idea how to follow through on that suggestion. Can you point me to an article or give me an idea?

sstvmaster

  • Sr. Member
  • ****
  • Posts: 299
Re: Workaround for HTTPSender.ResultCode=403
« Reply #5 on: June 11, 2019, 08:05:34 pm »
Code: Pascal
function DownloadHTTP(URL, TargetFile: string): boolean;
var
  HTTPSender: THTTPSend;
  k: integer;
begin
  Result := False;
  HTTPSender := THTTPSend.Create;
  // Set the User-Agent header here. This particular string is only an example.
  HTTPSender.UserAgent := 'Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/20100101 Firefox/5.0';
  try
    HTTPSender.HTTPMethod('GET', URL);
    k := HTTPSender.ResultCode;

    if (k >= 100) and (k <= 299) then
    begin
      HTTPSender.Document.SaveToFile(TargetFile);
      Result := True;
    end;
  finally
    HTTPSender.Free;
  end;
end;

More information:
- https://en.wikipedia.org/wiki/User_agent
- https://developers.whatismybrowser.com/useragents/explore/
« Last Edit: June 11, 2019, 08:08:43 pm by sstvmaster »
greetings Maik

Windows 10,
- Lazarus 2.2.6 (stable) + fpc 3.2.2 (stable)
- Lazarus 2.2.7 (fixes) + fpc 3.3.1 (main/trunk)

sash

  • Sr. Member
  • ****
  • Posts: 366
Re: Workaround for HTTPSender.ResultCode=403
« Reply #6 on: June 12, 2019, 09:09:39 am »
I am almost absolutely sure (I don't have access to the server's code, after all) that this has nothing to do with the UserAgent.

If you're getting a 403, the most typical cause is that you're visiting a non-public page you previously logged in to, so your browser holds a session cookie while your HTTP client lacks one.
Lazarus 2.0.10 FPC 3.2.0 x86_64-linux-gtk2 @ Ubuntu 20.04 XFCE
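
For the session-cookie scenario described in this reply, a minimal sketch with Synapse's THTTPSend might look like the following. The URL and the cookie name and value are placeholders invented for illustration, not taken from the thread; a real value would have to come from a logged-in session. The `ssl_openssl` unit is only needed for https:// URLs and assumes the OpenSSL libraries are available at runtime.

```pascal
program CookieSketch;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils,
  httpsend,      // Synapse: THTTPSend
  ssl_openssl;   // Synapse: TLS support for https:// URLs

const
  // Placeholder values, for illustration only.
  DemoUrl    = 'https://example.com/members/page.html';
  DemoCookie = 'PHPSESSID=abc123';

var
  Http: THTTPSend;
begin
  Http := THTTPSend.Create;
  try
    // Cookies is a string list of 'name=value' entries sent with the request.
    Http.Cookies.Add(DemoCookie);
    if Http.HTTPMethod('GET', DemoUrl) then
      WriteLn('Status: ', Http.ResultCode, ' ', Http.ResultString)
    else
      WriteLn('Request failed (no HTTP response received)');
  finally
    Http.Free;
  end;
end.
```

If the status is still 403 with a valid cookie, the cookie theory can be ruled out for that site.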

bobonwhidbey

Re: Workaround for HTTPSender.ResultCode=403
« Reply #7 on: June 16, 2019, 02:07:21 am »
Thank you SSTVMASTER. Your idea worked perfectly. I merely added this:

Code: Pascal
HTTPSender.UserAgent := 'Mozilla/5.0';

after the create, and all went smoothly.

Sorry I didn't get back to you earlier, but I was away from my PC.

Leledumbo

Re: Workaround for HTTPSender.ResultCode=403
« Reply #8 on: June 17, 2019, 07:48:22 am »
Quote from sash:
"I am almost absolutely sure (I don't have access to the server's code, after all) that this has nothing to do with the UserAgent. If you're getting a 403, the most typical cause is that you're visiting a non-public page you previously logged in to, so your browser holds a session cookie while your HTTP client lacks one."

Seems like it does ;)
Some sites optimize their view depending on the browser requesting the page (or simply always want to know what is making the request), and when the client is unknown they send permission denied rather than risk serving a broken render. GitHub does this as well, so it's a fairly common (ab?)use of this response code.

Thaddy

Re: Workaround for HTTPSender.ResultCode=403
« Reply #9 on: June 17, 2019, 09:04:36 am »
Quote from Leledumbo:
"so it's kind of common (ab?)use of such a response code."

No, a page may not render correctly in older browsers, so the 403 is correct. It is not abuse. The fix is simply to upgrade (actually, spoof!) your user agent to something more recent.
The user agent identifies the client's feature set to the server. Even if you are not using a browser, the server expects support for a baseline feature set before it gives a valid response.
This is also called "content negotiation". See https://en.wikipedia.org/wiki/User_agent
« Last Edit: June 17, 2019, 09:11:33 am by Thaddy »

trev

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2020
  • Former Delphi 1-7, 10.2 user
Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #10 on: June 17, 2019, 09:44:30 am »
Quote
6.5.3.  403 Forbidden

   The 403 (Forbidden) status code indicates that the server understood
   the request but refuses to authorize it.  A server that wishes to
   make public why the request has been forbidden can describe that
   reason in the response payload (if any).
Source: RFC 7231

So, yes, not abuse, but also not very friendly when the reason for the response can be included in the payload. microchip.com is a not very friendly one I encountered using fphttpclient.
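
To check whether a server explains its refusal, the response body can be inspected even when the status is 403. The following is a minimal sketch, assuming Synapse's THTTPSend as used earlier in the thread; the URL is a placeholder and `StreamToText` is a hypothetical helper written for this example, not part of Synapse.

```pascal
program ShowErrorBody;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils,
  httpsend,      // Synapse: THTTPSend
  ssl_openssl;   // Synapse: TLS support for https:// URLs

// Hypothetical helper: copies the whole stream into a string.
function StreamToText(S: TStream): string;
var
  Len: Integer;
begin
  Result := '';
  Len := Integer(S.Size);
  SetLength(Result, Len);
  S.Position := 0;
  if Len > 0 then
    S.ReadBuffer(Result[1], Len);
end;

const
  DemoUrl = 'https://example.com/';  // placeholder URL

var
  Http: THTTPSend;
begin
  Http := THTTPSend.Create;
  try
    if not Http.HTTPMethod('GET', DemoUrl) then
      WriteLn('Request failed (no HTTP response received)')
    else if (Http.ResultCode >= 200) and (Http.ResultCode <= 299) then
      WriteLn('OK, ', Http.Document.Size, ' bytes received')
    else
    begin
      // Print the status line text and whatever body the server sent along.
      WriteLn('Status: ', Http.ResultCode, ' ', Http.ResultString);
      WriteLn(StreamToText(Http.Document));
    end;
  finally
    Http.Free;
  end;
end.
```

Unlike the DownloadHTTP function in this thread, the sketch treats only 2xx codes as success and checks the return value of HTTPMethod, so a connection failure is reported separately from an HTTP error.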

sash

Re: Workaround for HTTPSender.ResultCode=403
« Reply #11 on: June 17, 2019, 09:53:58 am »
Quote from Thaddy:
"No, a page may not be rendered correctly for older browsers, so the 403 is correct."

How do they know it is "older"? They simply don't recognize this user agent string and don't bother checking the actual feature set, which should really be "content-negotiated" through the Accept* headers.

The problem is that 403 is a very generic status. It would probably be fine if the body of the 403 response carried a meaningful description.
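
Client capabilities can be stated explicitly through Accept* request headers in addition to the User-Agent. A minimal sketch, assuming Synapse's THTTPSend; the URL and header values are illustrative, and whether a particular server honours them is not something this thread establishes.

```pascal
program AcceptHeadersSketch;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils,
  httpsend,      // Synapse: THTTPSend
  ssl_openssl;   // Synapse: TLS support for https:// URLs

const
  DemoUrl = 'https://example.com/';  // placeholder URL

var
  Http: THTTPSend;
begin
  Http := THTTPSend.Create;
  try
    Http.UserAgent := 'Mozilla/5.0';
    // Lines added to Headers before the call are sent as extra request headers.
    Http.Headers.Add('Accept: text/html,application/xhtml+xml;q=0.9,*/*;q=0.8');
    Http.Headers.Add('Accept-Language: en');
    if Http.HTTPMethod('GET', DemoUrl) then
      WriteLn('Status: ', Http.ResultCode, ' ', Http.ResultString)
    else
      WriteLn('Request failed (no HTTP response received)');
  finally
    Http.Free;
  end;
end.
```

Accept-Encoding is deliberately left out so the body arrives uncompressed and can be saved as-is.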

Thaddy

Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #12 on: June 17, 2019, 09:58:11 am »
No, they know the user agent string, but their site is not capable of rendering for old browsers. Mozilla4 is really old, so they throw an error page :o 8-) O:-).
I will submit a patch to Synapse to change the user agent string to something that is 10 years old rather than 20. Such patches usually still get applied.
Strictly speaking this is user agent spoofing, since no browser is used. See the link to Wikipedia.
Anyway, this way you can obtain a fully rendered page.
« Last Edit: June 17, 2019, 10:02:30 am by Thaddy »

bobonwhidbey

Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #13 on: June 17, 2019, 05:56:39 pm »
This is a really old website. The page in question was last updated in 2001.

Thaddy

Re: [Solved] Workaround for HTTPSender.ResultCode=403
« Reply #14 on: June 17, 2019, 06:27:19 pm »
It is not a question of the page but of the last server update. The site may be hosted, in which case the hosting provider will keep the server software up to date. In fact, everybody keeps their servers up to date for security reasons.

 
