
Author Topic: How determine slash/no-slash end of URL  (Read 1825 times)

Michael Collier

  • Sr. Member
  • ****
  • Posts: 329
How determine slash/no-slash end of URL
« on: July 01, 2025, 05:07:03 pm »
I'm looking for a way to determine if a URL has a trailing slash or not.

An example of URL without trailing slash..
https://www.gov.uk/browse/driving

An example of URL with trailing slash..
https://www.gov.uk/
But is displayed in browser only as..
https://www.gov.uk

Trailing slashes seem to be hidden in the Chrome address bar.
So I can't rely on how the URL is displayed as a means to determine its actual value.
So I need a way to confirm the URL before storing it in the database, as the application that uses these URLs needs them to be `correct`.

I've tried checking for a redirect but that didn't work. Any ideas what else to try?

Thanks,
Mike

anse

  • New Member
  • *
  • Posts: 31
  • Bugmonkey
    • HeidiSQL
Re: How determine slash/no-slash end of URL
« Reply #1 on: July 01, 2025, 05:20:28 pm »
> But is displayed in browser only as...

You cannot rely on the displayed URL being valid; it's just for the user to see where they are. Firefox even hides the "https://" from it once the page is loaded.

On the server side, you might be able to access the REQUEST_URI environment variable, which should be always valid.

Michael Collier

  • Sr. Member
  • ****
  • Posts: 329
Re: How determine slash/no-slash end of URL
« Reply #2 on: July 01, 2025, 05:24:57 pm »
> On the server side, you might be able to access the REQUEST_URI environment variable, which should be always valid.

Thanks but I only have access to the client side of things though... I'm logging the URL of multiple websites that I have no control over..

dsiders

  • Hero Member
  • *****
  • Posts: 1461
Re: How determine slash/no-slash end of URL
« Reply #3 on: July 01, 2025, 06:32:30 pm »
> > On the server side, you might be able to access the REQUEST_URI environment variable, which should be always valid.
>
> Thanks but I only have access to the client side of things though... I'm logging the URL of multiple websites that I have no control over..

Most responses include link metadata with the canonical or alternate URIs. Beyond that... you should not care.
Preview the next Lazarus documentation release at: https://dsiders.gitlab.io/lazdocsnext
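
As a rough illustration of the "link metadata" idea above, here is a minimal sketch that pulls the canonical URL out of a page's HTML once it has been downloaded. The helper name ExtractCanonical is hypothetical, and the regular expression is a simplification: it assumes double-quoted attributes with rel appearing before href, and it does not cover canonical URLs sent in an HTTP Link header. Not every site publishes a canonical link, so an empty result is possible.

```pascal
program FindCanonical;
{$mode objfpc}{$H+}
uses
  SysUtils, RegExpr;

// Hypothetical helper: returns the href of the first <link rel="canonical">
// tag found in Html, or '' if there is none. Assumes double-quoted
// attributes and rel before href; a real HTML parser would be more robust.
function ExtractCanonical(const Html: string): string;
var
  Re: TRegExpr;
begin
  Result := '';
  Re := TRegExpr.Create('<link[^>]*rel="canonical"[^>]*href="([^"]*)"');
  try
    Re.ModifierI := True;  // tag and attribute names are case-insensitive
    if Re.Exec(Html) then
      Result := Re.Match[1];
  finally
    Re.Free;
  end;
end;

begin
  // Made-up sample markup, for illustration only.
  WriteLn(ExtractCanonical(
    '<head><link rel="canonical" href="https://example.com/page/"></head>'));
end.
```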

Michael Collier

  • Sr. Member
  • ****
  • Posts: 329
Re: How determine slash/no-slash end of URL
« Reply #4 on: July 01, 2025, 06:47:12 pm »
> Beyond that... you should not care.

The problem though is that the application I feed the URLs into does care (it seems to be able to determine the actual URL) and throws error if they don't match.

anse

  • New Member
  • *
  • Posts: 31
  • Bugmonkey
    • HeidiSQL
Re: How determine slash/no-slash end of URL
« Reply #5 on: July 01, 2025, 07:31:29 pm »
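You could use a regular expression to check whether the URL has no path, and if so make sure it ends with a slash:

Code: Pascal
  program EnsureRootSlash;
  {$mode objfpc}{$H+}
  uses RegExpr;
  var
    r: TRegExpr;
    url: string;
  begin
    // Matches scheme + host with no path at all, e.g. http://test.com
    r := TRegExpr.Create('^https?://[^/]+$');
    try
      url := 'http://test.com';
      if r.Exec(url) then
        url := url + '/';  // empty path: append the root slash
    finally
      r.Free;
    end;
  end.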


Remy Lebeau

  • Hero Member
  • *****
  • Posts: 1565
    • Lebeau Software
Re: How determine slash/no-slash end of URL
« Reply #6 on: July 01, 2025, 09:27:26 pm »
> The problem though is that the application I feed the URLs into does care (it seems to be able to determine the actual URL) and throws error if they don't match.

Sounds like the app is just doing plain string comparisons instead of using URL normalization & comparison rules (see RFC 3986 Section 6).  For example, when comparing HTTP/S URLs, an empty path is supposed to compare equivalent to "/", so it shouldn't matter if a trailing "/" is actually present or not.
Remy Lebeau
Lebeau Software - Owner, Developer
Internet Direct (Indy) - Admin, Developer (Support forum)
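
To make the comparison rule above concrete, here is a minimal sketch of two of the RFC 3986 Section 6 normalizations for HTTP/S URLs: lowercasing the scheme and host, and treating an empty path as "/". The function name NormalizeHttpUrl is hypothetical. It is deliberately incomplete (no percent-encoding or default-port handling), and it lowercases the whole authority part, which would also alter any user-info, so treat it as an illustration rather than a full normalizer.

```pascal
program CompareUrls;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: lowercases scheme and authority and turns an empty
// path into "/". Anything without "://" is returned unchanged.
function NormalizeHttpUrl(const Url: string): string;
var
  SchemeEnd, PathStart, I: Integer;
  Authority, Rest: string;
begin
  Result := Url;
  SchemeEnd := Pos('://', Url);
  if SchemeEnd = 0 then
    Exit;
  // The authority ends at the first '/', '?' or '#' after "://".
  PathStart := 0;
  for I := SchemeEnd + 3 to Length(Url) do
    if Url[I] in ['/', '?', '#'] then
    begin
      PathStart := I;
      Break;
    end;
  if PathStart = 0 then
  begin
    Authority := Copy(Url, SchemeEnd + 3, MaxInt);
    Rest := '/';                      // no path at all: use the root path
  end
  else
  begin
    Authority := Copy(Url, SchemeEnd + 3, PathStart - SchemeEnd - 3);
    Rest := Copy(Url, PathStart, MaxInt);
    if Rest[1] <> '/' then
      Rest := '/' + Rest;             // query or fragment directly after host
  end;
  Result := LowerCase(Copy(Url, 1, SchemeEnd - 1)) + '://' +
            LowerCase(Authority) + Rest;
end;

begin
  // Empty path and "/" normalize to the same string.
  WriteLn(NormalizeHttpUrl('https://www.gov.uk') =
          NormalizeHttpUrl('https://www.gov.uk/'));
  // A trailing slash on a non-empty path is NOT normalized away.
  WriteLn(NormalizeHttpUrl('https://www.gov.uk/browse/driving') =
          NormalizeHttpUrl('https://www.gov.uk/browse/driving/'));
end.
```

Only the empty-path case is equivalent under these rules; for any deeper path the server decides whether the slash matters, which is why the two driving URLs stay distinct here.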

dbannon

  • Hero Member
  • *****
  • Posts: 3556
    • tomboy-ng, a rewrite of the classic Tomboy
Re: How determine slash/no-slash end of URL
« Reply #7 on: July 02, 2025, 01:14:26 am »
Michael, is this what you are looking for ?

Code: Pascal
  // EndsWith is a string helper from SysUtils
  if not URL.EndsWith('/') then URL := URL + '/';

Davo

Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

Thaddy

  • Hero Member
  • *****
  • Posts: 18305
  • Here stood a man who saw the Elbe and jumped it.
Re: How determine slash/no-slash end of URL
« Reply #8 on: July 02, 2025, 02:12:48 pm »
I believe IncludeTrailingPathDelimiter will also work on URLs.

Correction: Linux only. On Windows it will add '\'.
« Last Edit: July 02, 2025, 02:21:22 pm by Thaddy »
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

cdbc

  • Hero Member
  • *****
  • Posts: 2462
    • http://www.cdbc.dk
Re: How determine slash/no-slash end of URL
« Reply #9 on: July 02, 2025, 02:15:13 pm »
Hi
Yup, me too Thaddy, at least on *nixes...
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE6 -> FPC 3.2.2 -> Lazarus 4.0 up until Jan 2025 from then on it's both above &: KDE6/QT6 -> FPC 3.3.1 -> Lazarus 4.99

Michael Collier

  • Sr. Member
  • ****
  • Posts: 329
Re: How determine slash/no-slash end of URL
« Reply #10 on: July 02, 2025, 05:26:16 pm »
The problem I'm having is more to do with how the server/browser handles slashes (adds or removes them without me seeing) and being able to determine whether a change has happened so I'm using the `correct` value in my database.

I've used TFPHttpClient to visit the URL and track any redirects.

Here is an example of it working ok for me..

Code:
  ==========================================
  Added Un-Needed /
  https://www.gov.uk/browse/driving/
  ==========================================
  --redirects--
  src =https://www.gov.uk/browse/driving/
  dest=http://www.gov.uk/browse/driving
  src =http://www.gov.uk/browse/driving
  dest=https://www.gov.uk/browse/driving <<==`correct`
  ==========================================

  ==========================================
  Removed Needed /
  https://examples.eze2e.com/command
  ==========================================
  --redirects--
  src =https://examples.eze2e.com/command
  dest=https://examples.eze2e.com/command/ <<==`correct`
  ==========================================

But this method doesn't work on my tests for the host main page URL..
If I remove slash and visit https://www.gov.uk I don't get redirects to https://www.gov.uk/

The application I feed these URLs into acts like a browser (in fact it is a form of browser) and it waits for internal redirects to happen e.g. (slashes added/removed), so it needs to be using `correct` values for further tests to work e.g. confirming database URL is `correct`.

So I guess I'm looking for component that tracks internal redirects?

Edit: The image shows what happens when dragging the URL into Mousepad
« Last Edit: July 02, 2025, 05:27:55 pm by Michael Collier »

Thaddy

  • Hero Member
  • *****
  • Posts: 18305
  • Here stood a man who saw the Elbe and jumped it.
Re: How determine slash/no-slash end of URL
« Reply #11 on: July 02, 2025, 06:22:28 pm »
Yes, you will with these links, but you can't automate that in this case because user input is required.
That is a sane - but small - security precaution. Especially for government sites.
Once the cookie is set, you never experience it again.
Bots usually can not set cookies like this.
« Last Edit: July 02, 2025, 06:29:04 pm by Thaddy »

Remy Lebeau

  • Hero Member
  • *****
  • Posts: 1565
    • Lebeau Software
Re: How determine slash/no-slash end of URL
« Reply #12 on: July 02, 2025, 06:28:09 pm »
> The problem I'm having is more to do with how the server/browser handles slashes (adds or removes them without me seeing) and being able to determine whether a change has happened so I'm using the `correct` value in my database.

The browser doesn't care, it will happily request a URL any way it is given.  It is the server that decides what the URL needs to look like.  So, I agree that you should query the URL live and if the server redirects it then update your database accordingly (only if it is a permanent redirect, not a temporary redirect).
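
A minimal sketch of that check, assuming FPC's fphttpclient with the opensslsockets unit for HTTPS: request the URL once without following redirects, and only adopt the Location target when the status code is a permanent redirect (301 or 308). The helper names IsPermanentRedirect and ResolveOneHop are hypothetical. The sketch handles a single hop and assumes an absolute Location value, so a chain like the https to http to https one logged earlier would need it called in a loop.

```pascal
program CheckRedirect;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils, fphttpclient, opensslsockets;

// 301 and 308 are permanent redirects; 302, 303 and 307 are temporary.
function IsPermanentRedirect(StatusCode: Integer): Boolean;
begin
  Result := (StatusCode = 301) or (StatusCode = 308);
end;

// Hypothetical helper: requests Url once without following redirects.
// Returns the Location target for a permanent redirect, otherwise Url.
function ResolveOneHop(const Url: string): string;
var
  Client: TFPHTTPClient;
  Body: TMemoryStream;
  Target: string;
begin
  Result := Url;
  Client := TFPHTTPClient.Create(nil);
  Body := TMemoryStream.Create;
  try
    Client.AllowRedirect := False;  // we want the redirect response itself
    // Any status outside this list raises an exception (e.g. 404).
    Client.HTTPMethod('GET', Url, Body, [200, 301, 302, 303, 307, 308]);
    if IsPermanentRedirect(Client.ResponseStatusCode) then
    begin
      // Assumption: Location holds an absolute URL; relative values
      // would need resolving against Url first.
      Target := Trim(Client.ResponseHeaders.Values['Location']);
      if Target <> '' then
        Result := Target;
    end;
  finally
    Body.Free;
    Client.Free;
  end;
end;

begin
  WriteLn(ResolveOneHop('https://www.gov.uk/browse/driving/'));
end.
```

If the returned value equals the input, the server either answered with success or with a temporary redirect, and the stored URL can be left as it is.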

> But this method doesn't work on my tests for the host main page URL..
> If I remove slash and visit https://www.gov.uk I don't get redirects to https://www.gov.uk/

Because there is no need for such a redirect, as they are logically the same URL.  This goes back to my earlier comment about needing to follow the normalization and comparison rules defined in RFC 3986.  For instance, you should consider normalizing the URLs that you store in your database.

> The application I feed these URLs into acts like a browser (in fact it is a form of browser) and it waits for internal redirects to happen e.g. (slashes added/removed), so it needs to be using `correct` values for further tests to work e.g. confirming database URL is `correct`.

Then it is using broken logic, as there is no guarantee that a redirect will occur on any given URL.  That is entirely up to the server.  If you request a URL, you will get either a success, an error, or a redirect.  That is all you have to go on, so act accordingly.  If you request a URL and get a success, then the URL is 'correct' regardless of how the server decides to process it.

> So I guess I'm looking for component that tracks internal redirects?

There is no such thing as an "internal redirect".  Either the server redirects or it does not.  That is an explicit response.

When you drag the URL from your browser address bar to your Mousepad, you are being given a normalized URL, not the exact text typed into the address bar.
« Last Edit: July 02, 2025, 06:33:41 pm by Remy Lebeau »

Thaddy

  • Hero Member
  • *****
  • Posts: 18305
  • Here stood a man who saw the Elbe and jumped it.
Re: How determine slash/no-slash end of URL
« Reply #13 on: July 02, 2025, 06:31:02 pm »
> There is no such thing as an "internal redirect".  Either the server redirects or it does not.  That is an explicit response.

But there is URL validation and that can work in a similar way.
In this case that happens.
(I had to delete my cache to replay this)
« Last Edit: July 02, 2025, 06:34:33 pm by Thaddy »

 
