Forum > General

Removing duplicates from an array

(1/7) > >>

BillMcEnaney:
Hi, everyone,

Here's my Pascal function to remove duplicates from a list stored in an array. Please forgive me for any FPC-related mistakes because I'm used to ANSI Standard Pascal.

Here are the type and constant declarations for the program.

const
   Capacity = 80;

type
   Index_Range = 0..Capacity;
   Element_Tupe = integer;
   List = array[Index_Range] of Element_Type;


--- Code: ---function Remove_Duplicates(Table: List; Table_Length: Index_Range): List;
   var
      Result: List;
      Here, There: Index_Range;
      Duplicates: set of Element_Type;
   begin
      There := 1;
      for Here ::= 1 to Table_Length do
         if Table[Here] not in Duplicates
         then
            begin
               include(Duplicates, Table[Here]);
               Result[There] := Table[Here];
               There := succ(There)
            end;
         if Duplicates = []
            then Result := Table;
        Remove_Duplicates := Result
   end;
--- End code ---

The compiler made my laptop complain about the if-statement until I changed Element_Type to 0..5. So I wonder whether the set of integers between 1 and MaxInt is too big to represent with a Pascal set. Am I right about that?

I could overwrite duplicates in the original list. But I use the set and the result list to speed up the function.

howardpc:

--- Quote from: BillMcEnaney on May 19, 2022, 07:07:24 pm ---So I wonder whether the set of integers between 1 and MaxInt is too big to represent with a Pascal set. Am I right about that?

--- End quote ---
You wonder correctly. FPC sets are limited to Byte range ordinals (0..255).
There are generics libraries that offer sets with greater ranges and more functionalities (e.g. a set Count property) than the built-in FPC set implementation.
For instance see here .

BobDog:

Is this it in fp maybe?

--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} --- const   Capacity = 255; type   Index_Range = 0..Capacity;   Element_Type = integer;   List = array[Index_Range] of Element_Type;   aod= array of Element_Type;      function Remove_Duplicates(Table: List; Table_Length: Index_Range):aod;   var      Result:array of Element_type;//            Here, There: Index_Range;      Duplicates:set of index_range;   begin    setlength(result,capacity);    Duplicates:=[0];      There := 1;      for Here := 1 to Table_Length do         if (Table[Here] in Duplicates) =false then                     begin               include(Duplicates, Table[Here]);               Result[There] := Table[Here];               There := succ(There)            end;         if Duplicates = []            then Result := Table;            setlength(result,There);        Remove_Duplicates := Result   end;        function range(mymin:int32;mymax:int32):int32;   begin   range:=trunc(int((Random*(Mymax-mymin+1)))+MyMin);  end;           var   i:integer;   table:List;   answer:aod;   begin   for i:=0 to high(table) do table[i]:=range(100,110);   writeln('Before:');   for i:=0 to high(table) do write(table[i],' ');   answer:=Remove_Duplicates(table,high(table));   writeln;   writeln('After:');   for i:=1 to high(answer) do write(answer[i],' ');   writeln;   writeln('Press return to end');   readln;   end.  
And my attempt, keeping the order and using dynamic arrays and template (generics), but not sets.


--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} --- generic function CleanUp<T>(a:T):T;var n1,n2,flag:int32; count:int32=1; ret:T;begin  setlength(ret,high(a));  ret[0]:=a[0];  For n1 :=1 To high(a) do  begin   flag:=0;   For n2 :=0 to  n1-1 do    begin    If (a[n1]=a[n2]) Then      begin      flag:=1;      break;      end;     end; If (flag=0) Then    begin     ret[count]:=a[n1];     count:=count+1;    end; end; setlength(ret,count); exit(ret);end;  function range(mymin:int32;mymax:int32):int32;   begin   range:=trunc(int((Random*(Mymax-mymin+1)))+MyMin);  end;   type aoi=array of int32;type aos=array of char;var ans:aoi=nil;var ret:aos=nil;var j:int32;var i:aoi;var s:aos; beginsetlength(i,255);for j:=0 to high(i) do i[j]:=range(100,110);writeln('Before:');for j:=low(i) to high(i) do write(i[j],' ');writeln;writeln('After:');ans:=specialize CleanUp<aoi>(i);for j:=low(ans) to high(ans) do write(ans[j],' ');writeln;writeln;writeln;setlength(s,50);for j:=0 to high(s) do if range(1,10)>5 then s[j]:=chr(range(65,90)) else s[j]:=chr(range(97,122));//s[j]:=rangewriteln('Before:');for j:=low(s) to high(s) do write(s[j],' ');writeln;writeln('After:');;ret:=specialize CleanUp<aos>(s);for j:=low(ret) to high(ret) do write(ret[j],' ');writeln;writeln('Press return to end');readln; end.   

440bx:
Just a suggestion, while sets in FPC are limited to 256 elements, bitpacked arrays of boolean are not.

That would allow creating a bitmap of duplicates to be used to remove them in a subsequent pass.

avk:
The bitmap for the above range (1..MaxInt) must be at least 228 bytes in size and must be initialized.
It doesn't look very practical, does it?

Navigation

[0] Message Index

[#] Next page

Go to full version