
Author Topic: Still hitting head against wall  (Read 18100 times)

liewald

  • Full Member
  • ***
  • Posts: 142
Still hitting head against wall
« on: July 02, 2012, 03:25:08 pm »
Does anyone have any idea whether it is possible, in any way, to create an in-memory database with a working index? NOTHING I try seems to work.

ZMSQL is so full of bugs I can't get it to work.
TBufDataset seems to have been fixed in FPC 2.7.1, but I can't get it to install.
SQLite works but is incredibly slow.

I'm dying here folks!!!!

Blestan

  • Sr. Member
  • ****
  • Posts: 461
Re: Still hitting head against wall
« Reply #1 on: July 02, 2012, 03:41:04 pm »
How many rows do you have in this dataset? I cannot imagine that SQLite is slow...
Give some hints about your goals.
Speak postscript or die!
Translate to pdf and live!

BigChimp

  • Hero Member
  • *****
  • Posts: 5740
  • Add to the wiki - it's free ;)
    • FPCUp, PaperTiger scanning and other open source projects
Re: Still hitting head against wall
« Reply #2 on: July 02, 2012, 03:42:29 pm »
Do you think Bufdataset has been fixed in 2.7 or are you sure? Going from the rest of your post, I suppose you mean you saw it's been fixed in the bugtracker but couldn't test because you couldn't install fpc 2.7.

Recent versions of FPC trunk (2.7.1) failed to compile Lazarus trunk... you may have better luck with a slightly older revision.

You could use e.g. ludob's and my fpcup tool (see 3rd party announcements for details) to do a parallel installation of FPC trunk+Lazarus in c:\development\ (on Windows) or ~/lazarus and ~/fpc.
Note: you'll have to specify the --fpcurl parameter as fpcup defaults to fixes 2.6/2.6.1; see fpcup --help for details.
Works on Linux and Windows.

As for ZMSQL - sorry to be a bit crude here, but if there are bugs there, can't you fix them and submit the fixes to tatamata? Of course I realize fixing other people's code wasn't what you set out to do, but perhaps it turns out to be the quickest way. Just my 2 cents...
Want quicker answers to your questions? Read http://wiki.lazarus.freepascal.org/Lazarus_Faq#What_is_the_correct_way_to_ask_questions_in_the_forum.3F

Open source including papertiger OCR/PDF scanning:
https://bitbucket.org/reiniero

Lazarus trunk+FPC trunk x86, Windows x64 unless otherwise specified

BigChimp

  • Hero Member
  • *****
  • Posts: 5740
  • Add to the wiki - it's free ;)
    • FPCUp, PaperTiger scanning and other open source projects
Re: Still hitting head against wall
« Reply #3 on: July 02, 2012, 03:43:22 pm »
@Blestan: you might want to search on Liewald's posts re sqlite in memory... I think they were large datasets... and IIRC nobody could figure out why sqlite was slow...

liewald

  • Full Member
  • ***
  • Posts: 142
Re: Still hitting head against wall
« Reply #4 on: July 02, 2012, 04:01:05 pm »
Lots of rows, > 3Mb in some cases. Human genomes are big!

The new datasets are getting bigger; the latest are 3.5 GB per individual.


Blestan

  • Sr. Member
  • ****
  • Posts: 461
Re: Still hitting head against wall
« Reply #5 on: July 02, 2012, 04:04:12 pm »
For 3.5 GB maybe you should use MySQL with a memory DB on a dedicated machine...

liewald

  • Full Member
  • ***
  • Posts: 142
Re: Still hitting head against wall
« Reply #6 on: July 02, 2012, 04:05:22 pm »
Looks like they found the bug and fixed it in 2.7.1, but I just can't get 2.7.1 to work and no 2.6 patch was released. I just waited 3 days for a short program to do 400,000 lines out of 2.5 million, with each line being processed with just 3 lookups.

Hell!!!!

Blestan

  • Sr. Member
  • ****
  • Posts: 461
Re: Still hitting head against wall
« Reply #7 on: July 02, 2012, 04:08:13 pm »
Did you optimize the SQL?

liewald

  • Full Member
  • ***
  • Posts: 142
Re: Still hitting head against wall
« Reply #8 on: July 02, 2012, 04:09:26 pm »
I'm not at 3.5 GB yet, but will be soon. MySQL is an option, but I still need to connect to it!


liewald

  • Full Member
  • ***
  • Posts: 142
Re: Still hitting head against wall
« Reply #9 on: July 02, 2012, 04:11:53 pm »
No SQL with TBufDataset, just Locate. Worth pointing out that this worked perfectly in older FPC but then collapsed and died on one of the upgrades. A 20-minute job turned into 4 days due to indexing problems. I will install a local copy of MySQL and test, but I'm not hopeful: passing commands to an engine is not as quick as TBufDataset was, and passing millions of SQL statements rather than working on a local buffered dataset is SLOOOOOOOOOOOW!
« Last Edit: July 02, 2012, 04:14:24 pm by liewald »

Blestan

  • Sr. Member
  • ****
  • Posts: 461
Re: Still hitting head against wall
« Reply #10 on: July 02, 2012, 04:16:28 pm »
Massive DB servers are optimized for this kind of stuff... but avoid sending individual commands to the server. Write SQL that will do the whole job, and make use of views and temp tables when using aggregate data. If you give me an idea of what kind of lookups you do, I can help you with the SQL.
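To illustrate the "one statement does the whole job" idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a database server. The table and column names (results, ref1, ref2, key, score, allele, chrom) and the sample rows are made up for the example; they are not the actual data layout from this thread.

```python
import sqlite3

# In-memory database standing in for a real server (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (key TEXT, score REAL);
CREATE TABLE ref1 (key TEXT PRIMARY KEY, allele TEXT);
CREATE TABLE ref2 (key TEXT PRIMARY KEY, chrom TEXT);
INSERT INTO results VALUES ('rs1', 0.5), ('rs3', 0.1);
INSERT INTO ref1 VALUES ('rs1', 'A'), ('rs2', 'C');
INSERT INTO ref2 VALUES ('rs1', 'chr1'), ('rs3', 'chr2');
""")

# One set-based query replaces one lookup statement per row per reference set.
# LEFT JOIN keeps result rows that have no match in a reference table.
rows = conn.execute("""
SELECT r.key, r.score, a.allele, b.chrom
FROM results AS r
LEFT JOIN ref1 AS a ON a.key = r.key
LEFT JOIN ref2 AS b ON b.key = r.key
ORDER BY r.key
""").fetchall()

for row in rows:
    print(row)
```

Whether this beats a local buffered dataset depends on how expensive it is to load the result files into the database first, which the sketch does not measure.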

BigChimp

  • Hero Member
  • *****
  • Posts: 5740
  • Add to the wiki - it's free ;)
    • FPCUp, PaperTiger scanning and other open source projects
Re: Still hitting head against wall
« Reply #11 on: July 02, 2012, 04:23:23 pm »
Quote from liewald: "Will install a local copy of MySQL and test, but not hopeful"
I'd also try getting an FPC 2.7.1 version in (much easier if you don't need Lazarus - FPC revision 21757 compiles fine... it's just Lazarus that won't work well... at least in my setup).

Either use fpcup or get it using subversion yourself... fpcup is the easiest way though.

eny

  • Hero Member
  • *****
  • Posts: 1634
Re: Still hitting head against wall
« Reply #12 on: July 02, 2012, 04:32:17 pm »
Quote from liewald: "SQLite works but is incredibly slow"
Got SQLite working in memory and it is extremely fast (all actions) with SQL.
Tried with 3.1M records and results are there instantly.
I'm not using a wrapper class but direct access to SQLite functions (own layer for the sqlite ddl).
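For anyone who wants to reproduce an indexed in-memory SQLite lookup, here is a rough sketch using Python's built-in sqlite3 module rather than a direct Pascal binding. The table name ref, the index name idx_ref_key and the generated keys are assumptions for the demo, and the row count is kept small.

```python
import sqlite3

def build_db(n_rows):
    """Create an in-memory table with n_rows rows and an index on the key column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ref (key TEXT, value TEXT)")
    with conn:  # a single transaction for the bulk insert
        conn.executemany(
            "INSERT INTO ref VALUES (?, ?)",
            ((f"k{i}", f"v{i}") for i in range(n_rows)),
        )
    # Building the index after the bulk load avoids updating it on every insert.
    conn.execute("CREATE INDEX idx_ref_key ON ref (key)")
    return conn

def lookup(conn, key):
    """Return the value stored for key, or None if the key is absent."""
    row = conn.execute("SELECT value FROM ref WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None

conn = build_db(100_000)

# Ask SQLite how it plans to run the lookup; the last column holds the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM ref WHERE key = ?", ("k42",)
).fetchall()
uses_index = any("idx_ref_key" in row[-1] for row in plan)

print(lookup(conn, "k42"), uses_index)
```

Checking the query plan is a cheap way to confirm the index is actually being used before blaming the engine for slow lookups.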
All posts based on: Win10 (Win64); Lazarus 2.0.10 'stable' (x64) unless specified otherwise...

liewald

  • Full Member
  • ***
  • Posts: 142
Re: Still hitting head against wall
« Reply #13 on: July 02, 2012, 04:32:39 pm »
Nah, it doesn't work like that.

I wish it did. Here's pseudocode of what needs to be done.

1. Read in and store reference set one (500,000 lines);
2. Read in and store reference set two (400,000 lines);
3. Read in and store reference set three (200,000 lines);
4. Process results file:
   readln from results file;
   parse and extract values;
   do lookups and extract additional values from 3 lookups;
   assemble result string;
   write out.


You could say read it all into a database and do it there, but the analysis files (22 of them) are all in excess of 2,000,000 lines and it's quicker to read, parse, look up and write. Believe me, I've tried everything.

I got this process down to 3 msec and it worked fine; using SQL it took 40 msec. That's OK if you are doing one or two, but not for gigs of transactions!

One text file I had to process was 900 GB...

What I need is very quick in-memory lookups, nothing more than that. Every millisecond added adds hours or days to the processing.
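The pseudocode above can be sketched with plain hash tables instead of a dataset component. This is only an illustration in Python, not the TBufDataset code discussed in this thread; the tab-separated layout, the key in the first column, the "NA" placeholder and the sample keys are all assumptions.

```python
import csv
import io

def load_reference(lines, key_col=0):
    """Steps 1-3: read a reference set into a dict mapping key -> row."""
    index = {}
    for row in csv.reader(lines, delimiter="\t"):
        if row:
            index[row[key_col]] = row
    return index

def process(results, refs, out):
    """Step 4: parse each results line, do one lookup per reference set, write out."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in csv.reader(results, delimiter="\t"):
        if not row:
            continue
        key = row[0]
        extra = []
        for ref in refs:
            hit = ref.get(key)  # average constant-time hash lookup
            extra.extend(hit[1:] if hit else ["NA"])
        writer.writerow(row + extra)

# Tiny in-memory stand-ins for the three reference files and the results file.
ref1 = load_reference(io.StringIO("rs1\tA\nrs2\tC\n"))
ref2 = load_reference(io.StringIO("rs1\tchr1\n"))
ref3 = load_reference(io.StringIO("rs1\tgeneX\nrs3\tgeneY\n"))
out = io.StringIO()
process(io.StringIO("rs1\t0.5\nrs3\t0.1\n"), [ref1, ref2, ref3], out)
print(out.getvalue(), end="")
```

The results file is streamed line by line, so only the three reference sets have to fit in memory.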


liewald

  • Full Member
  • ***
  • Posts: 142
Re: Still hitting head against wall
« Reply #14 on: July 02, 2012, 04:34:23 pm »
It's not the size of the database, it's the number of transactions. Try hitting SQLite with 3 GB of transactions and see what happens  ;D

 
