Sorting special characters

IM314:
Very occasional, complete amateur hobbyist here, so please be patient with my ignorance. I've been struggling to get names with special characters to sort 'right'. The following is a *very* rough and simplified example of what I am dealing with.

someObject is a class with property name : String.


--- Code: Pascal ---
constructor someObject.create(s : string);
begin
  name := s;
end;

// Comparison callback passed to TFPList.Sort.
function compare1(s1, s2 : pointer) : integer;
begin
  result := CompareText(someObject(s1).name, someObject(s2).name);
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  i : integer;
  list : TFPList;
begin
  list := TFPList.Create;
  try
    list.Add(someObject.create('Rêd'));
    list.Add(someObject.create('Rad'));
    list.Add(someObject.create('Rod'));
    list.Add(someObject.create('Rêzd'));
    list.Add(someObject.create('Rêad'));
    list.Sort(@compare1);
    Memo1.Clear;
    for i := 0 to list.Count - 1 do
      Memo1.Append(someObject(list[i]).name);
  finally
    // TFPList does not own its items, so free them before freeing the list.
    for i := 0 to list.Count - 1 do
      someObject(list[i]).Free;
    list.Free;
  end;
end;
--- End code ---
In this example (and in my real-life project, where I read names with special characters from a file), the code above always sorts the special characters as if they came after z, so the result is always:
Rad
Rod
Rêad
Rêd
Rêzd

How do I cast and/or compare these strings to get a more 'natural' sort order, like the one every other piece of software I've tried (such as Excel) seems to produce for the same list? For example:
Rad
Rêad
Rêd
Rêzd
Rod

Any advice much appreciated.

dseligo:
Maybe you can use this function to convert accented letters to non-accented ones and then sort: https://forum.lazarus.freepascal.org/index.php/topic,46804.msg334219.html#msg334219
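
A minimal sketch of that idea, not the linked function itself: fold a handful of accented letters to their base letters and compare the folded copies. The names FoldAccents and CompareFolded and the short replacement table are hypothetical and would need extending for real data. It assumes the strings are UTF-8 (the Lazarus default) and that the source has no codepage directive, which is why the accented letters are written as byte escapes.

```pascal
program FoldedSort;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes;

const
  // UTF-8 byte sequences for a few accented letters (illustrative, not complete).
  Accented: array[0..5] of string = (
    #$C3#$AA,   // e with circumflex, U+00EA
    #$C3#$A9,   // e with acute,      U+00E9
    #$C3#$A8,   // e with grave,      U+00E8
    #$C3#$A0,   // a with grave,      U+00E0
    #$C3#$8A,   // E with circumflex, U+00CA
    #$C3#$80);  // A with grave,      U+00C0
  Plain: array[0..5] of string = ('e', 'e', 'e', 'a', 'E', 'A');

// Hypothetical helper: returns S with the listed accented letters replaced.
function FoldAccents(const S: string): string;
var
  k: Integer;
begin
  Result := S;
  for k := Low(Accented) to High(Accented) do
    Result := StringReplace(Result, Accented[k], Plain[k], [rfReplaceAll]);
end;

// Sort callback for TStringList.CustomSort: compares the folded copies
// case-insensitively and falls back to the original bytes on a tie.
function CompareFolded(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareText(FoldAccents(List[Index1]), FoldAccents(List[Index2]));
  if Result = 0 then
    Result := CompareStr(List[Index1], List[Index2]);
end;

var
  Names: TStringList;
  i: Integer;
begin
  Names := TStringList.Create;
  try
    Names.Add('R'#$C3#$AA'd');
    Names.Add('Rad');
    Names.Add('Rod');
    Names.Add('R'#$C3#$AA'zd');
    Names.Add('R'#$C3#$AA'ad');
    Names.CustomSort(@CompareFolded);
    for i := 0 to Names.Count - 1 do
      WriteLn(Names[i]);
  finally
    Names.Free;
  end;
end.
```

For the five names in the question this gives the order Rad, Rêad, Rêd, Rêzd, Rod. Whether the accented letters display correctly in a console depends on the terminal's encoding; in a Lazarus form the same comparison can be used with a memo instead of WriteLn.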

IM314:
Thanks! This certainly works, but it feels like an incredibly roundabout (if clever) way of doing this. From my admittedly amateur reading and understanding of the documentation, something like
--- Quote ---ansiCompareText(string1,string2)
--- End quote ---
should do the same, but it does not appear to have any effect.

Zvoni:
Sounds like Collation
Maybe this: https://www.freepascal.org/docs-html/rtl/unicodedata/incrementalcomparestring.html
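
A rough sketch of how that function might be used as a sort callback. It rests on several assumptions that should be checked against the linked documentation for your FPC version: that the unit unicodeducet registers a root collation under the name 'DUCET', that FindCollation returns nil when the name is unknown, and that IncrementalCompareString has an overload taking two UnicodeString values and a PUCA_DataBook. CompareCollated is a hypothetical name, and the strings are assumed to be UTF-8.

```pascal
program CollatedSort;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes,
  unicodedata,   // FindCollation, IncrementalCompareString, PUCA_DataBook
  unicodeducet;  // assumed to register the default collation table 'DUCET'

var
  Collation: PUCA_DataBook;

// Sort callback for TStringList.CustomSort: converts the UTF-8 strings to
// UnicodeString and compares them using the selected collation.
function CompareCollated(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := IncrementalCompareString(
    UTF8Decode(List[Index1]), UTF8Decode(List[Index2]), Collation);
end;

var
  Names: TStringList;
  i: Integer;
begin
  Collation := FindCollation('DUCET');
  if Collation = nil then
  begin
    WriteLn('Collation DUCET is not registered');
    Halt(1);
  end;

  Names := TStringList.Create;
  try
    // Accented letters are written as UTF-8 byte escapes (C3 AA = U+00EA).
    Names.Add('R'#$C3#$AA'd');
    Names.Add('Rad');
    Names.Add('Rod');
    Names.Add('R'#$C3#$AA'zd');
    Names.Add('R'#$C3#$AA'ad');
    Names.CustomSort(@CompareCollated);
    for i := 0 to Names.Count - 1 do
      WriteLn(Names[i]);
  finally
    Names.Free;
  end;
end.
```

Under the Unicode root collation an accented letter differs from its base letter only at a secondary level, so names such as Rêad should land between Rad and Rod. A language-specific collation may order them differently.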

IM314:
Ah, I should have known there would be a word for it: collation. Interesting, and it does look like it addresses the issue. I knew from the start that different languages sort special characters differently based on pronunciation, so it makes sense that you can define different sort schemes, as it were. I was just being lazy and hoping for a magical shortcut that takes the nearest* equivalent from the Latin alphabet and sorts ê like e and à like A, for instance. The function dseligo suggested does exactly that, so at least my project now works.

*Poorly-defined, I know. Hence collation being a thing.
