Okay, I am rather baffled, so I am assuming that I have written this program completely wrong in FPC and that there is a bottleneck somewhere, as I would expect compiled machine code to be at least 2x faster than Python bytecode. Can someone explain why the FPC code is 1.5x slower, and where I could optimize it? Here are both programs, starting with the FPC one:
Program ChkBlock;
{$MODE OBJFPC}
Uses classes, sysutils;
Var
  blocking: TStringList;
  ips: TStringList;
  matches: TStringList;
  i: longint;
  st: TDateTime;
Begin
  st := Time;
  blocking := TStringList.Create;
  ips := TStringList.Create;
  matches := TStringList.Create;
  blocking.LoadFromFile('blocking.txt');
  ips.LoadFromFile('ips.txt');
  WriteLn(ips.Count);
  for i := 0 to ips.Count-1 do
  begin
    if (blocking.IndexOf(ips[i]) > -1) then
    begin
      Write('.');
      matches.Append(ips[i]);
    end;
  end;
  WriteLn(#10'Total matches: ', matches.Count);
  matches.SaveToFile('matches.txt');
  matches.Free;
  ips.Free;
  blocking.Free;
  WriteLn('Started at ', TimeToStr(st));
  WriteLn('Took ',Time-st);
End.
I also don't really know how to properly time code in Pascal. In Python there's a timedelta object, as you'll see in the Python code example:
import datetime, sys
st=datetime.datetime.now()
blocking = [line.split(',')[1] for line in open('BIG_blocktable_mapi-Updated.csv','r').readlines()]
ips = open('ips.txt','r').readlines()
matches=[]
for ip in ips:
    if ip.replace('\n','') in blocking:
        sys.stdout.write('.')
        sys.stdout.flush()
        matches.append(ip)
print "\nTotal matches: %s" % len(matches)
open('matches.txt', 'w').writelines(matches)
d=datetime.datetime.now()-st
print "Took %s seconds." % d.seconds
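A side note on the timing itself: `d.seconds` only holds the whole-seconds component of the timedelta, so short runs get rounded down. Here is a minimal sketch of a finer-grained measurement, assuming `timedelta.total_seconds()` is available (Python 2.7 or later). The helper name `elapsed_seconds` and the stand-in workload are made up for illustration and are not part of my real script:

```python
import datetime

def elapsed_seconds(start, end):
    """Return the full elapsed time between two datetimes as a float."""
    # total_seconds() includes days and microseconds; .seconds alone does not.
    return (end - start).total_seconds()

st = datetime.datetime.now()
total = sum(range(100000))  # stand-in workload, not the real matching loop
d = elapsed_seconds(st, datetime.datetime.now())
print("Took %.6f seconds." % d)
```

The printed figure varies from run to run, so I would compare several runs rather than trust a single one.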
Would the standard output be slowing things down? I plan on doing another benchmark in a bit without any output, besides the line that displays how long the operation took.
Are there additional options I could pass to the FPC compiler that would optimize code like this? I would really love to know what everybody else does with regard to Pascal/Delphi code optimizations.
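For that second benchmark, this is roughly how I would separate the matching loop from the progress output on the Python side. It is only a sketch: the function name `find_matches`, the `show_progress` flag and the sample data are placeholders I invented, not my real files.

```python
import sys

def find_matches(ips, blocking, show_progress=False):
    """Return the lines of ips whose value (minus newline) appears in blocking."""
    matches = []
    for ip in ips:
        if ip.replace('\n', '') in blocking:
            if show_progress:
                # Same per-match output as the original loop.
                sys.stdout.write('.')
                sys.stdout.flush()
            matches.append(ip)
    return matches

# Tiny made-up sample instead of the real input files.
sample_blocking = ['10.0.0.1', '10.0.0.3']
sample_ips = ['10.0.0.1\n', '10.0.0.2\n', '10.0.0.3\n']
quiet = find_matches(sample_ips, sample_blocking)
print("Total matches: %s" % len(quiet))
```

Timing the same call once with `show_progress=True` and once without should show how much of the run time is spent writing the dots.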
Should I be creating an array instead of using a TStringList class for this? Would that make the code run 5x quicker like compiled code should?
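To get a feel for whether the container matters more than the language, I may try something like the following on the Python side. It is a hedged sketch using synthetic data, and `count_matches` is a hypothetical helper; it compares membership tests against a list (a linear scan, like the `in blocking` check in my script) with the same tests against a set (a hash lookup).

```python
import datetime

def count_matches(ips, blocking):
    """Count how many entries of ips are present in the blocking container."""
    return sum(1 for ip in ips if ip in blocking)

# Synthetic data: 5000 blocked values, 2000 candidates, half of which are blocked.
blocking_list = [str(n) for n in range(5000)]
blocking_set = set(blocking_list)
ips = [str(n) for n in range(0, 10000, 5)]

for label, container in (("list", blocking_list), ("set", blocking_set)):
    st = datetime.datetime.now()
    found = count_matches(ips, container)
    d = (datetime.datetime.now() - st).total_seconds()
    print("%s: %d matches in %.6f seconds" % (label, found, d))
```

Both containers must report the same number of matches; only the timings should differ.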
Also, oddly, running the same code using Cython (a Python module compiled into machine code) runs the same loop at basically the same speed...
This speed concern won't deter me from using Pascal/Delphi/Lazarus for future projects, as I see benefits other than sheer speed, such as being compiled (fewer dependencies on targets) and being able to target more platforms than Python currently can. I personally use a Linux PC for everything, and when it comes to distributing Python apps to other platforms... well, I'm sure anybody who's used a scripting language knows the ordeals here. Python is much larger, and shipping more than one app potentially means multiple Python installs on the target OS, whereas Pascal just depends on the core OS libraries and is thus easier to ship. Anyway, any insight on this issue would be great!