Hi gli,
Let me rant a bit about what I'm understanding about your predicament.
From what I can gather your specs are:
- Upon a determined date/time interval
- Select a chunk of data
- From stock market companies
So:
- Step 1: Filter universe data by date
- Step 2: retrieve subset of universe data
What Warfley is proposing is what I was about to propose about 10 minutes ago, and solves Step 2 quite well.
What made me rethink this is the fact that in order to have the fastest Step 1 you would need a balanced binary tree, you know, like the ones MySQL implements for it's indexes.
Doing that yourself is gonna be a real pain in the arse and (I don't believe I'm about to say this) why re-invent the wheel here?
So, even if you implement a blazing fast solution for Step 2, you're still threading water in Step 1.
Now, you're proposing splitting your data into smaller chunks.
Well that's the best way to go, absolutely and I would advise doing that with a system that can solve Step 1 with balanced binary trees(AVL trees if I'm not mistaken).
My proposal is the following:
- For all the companies crate a table with the name of the company(Apple) and the name of a table prefix(apple_)
- For each company create a set of tables for each graph time: apple_daily, apple_30min, etc with only 2 columns: date/time, pointer to points value table, where the primary key is the date/time column
- For each graph time create a table to hold the points value data: apple_daily_data, apple_30m_data that contains an index that goes into apple_daily and the other data points
Using this method, you are slicing the data in many smaller chunks, just like what a binary tree does when searching it.
You'll then have the same amount of data, in total, but now a bit more manageable for any database engine.
And you want that under a piece of software that has been tested for these types of use.
Anything you do is only going back to methods discarded in favour of databases.
Cheers,
Gus