Forum > Databases

[SOLVED] Mind bending SQL problem with UPDATE function

(1/3) > >>

Gizmo:
UPDATE : DSELIGO HAS HELPED ME SOLVE THIS ONE. WHAT A STAR!

I am really struggling with something that I hope is easy for someone to spot.

I have a program that compares folders of files by hash.

It finds all the files in FolderA and adds those names to a stringlist.
It finds all the files in FolderB and adds those names to a second stringlist.

It then hashes all the files in FolderA, using the first stringlist as a direction, and adds the filename and hash value to Columns 1 and 2 of my database using the UPDATE SQL syntax.
It then hashes all the files in FolderB, using the second stringlist as direction, and adds the filename and hash value to Columns 3 and 4 of my database also using the UPDATE SQL syntax.

If all the files match, and critically if the count matches, all is good. And even the files do not match but their count does match, all is good too.

But imagine where the file that was found in FolderA was NOT in FolderB at all, or visa versa. No entry gets put into Columns 3 and 4 for that missing file. So there are two empty cells; one in Col3 and one in Col4 (or rather, there needs to be two empty cells to represent the fact it was not there).

So, my problem is that when the next file is found in FolderB, UPDATE adds the details of that file to Col 3 and Col 4 but using the last empty cell so it goes into the empty cell of the preceeding entry where no file was found for its corresponding FolderA file, instead of adding the data to the next row. And then they all get out of line like a waterfall effect.

It is hard to explain and I hope the screenshot attached illustrates? You'll see the few at the top were all matching, then one is missing, and from there they all go out of whack. 

As you will see, 29.raw starts with a hash of 9B31...7349 and its copy was found in FolderB. As did all the files before match. But then comes 3-Copy.raw which IS NOT in FolderB (I created these folders by copying and pasting and then I removed one from the 2nd folder and I also edited one file to make its hash differ). But instead of it being left empty, the details for 3.raw have been inserted into Column 3 and 4. Then 3.raw is listed again on the next row, , and the whole mis-alignment trickles down. So from that point on, it's all wrong.

So, the code I have that populates the database and the 4 columns is as follows. Basically, first it works out what the highest number of rows are that are needed, and inserts those rows. Then it uses two procedures that call UPDATE to populate those rows with the data.

Can anyone happen to see how I can correct this?


--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} --- //Populate database with all rows required for the biggest of the two folders        // If they match, just use the FolderA file count        if FolderAFileCount > FolderBFileCount then        begin        frmSQLiteDBases.Write_INSERT_All_Rows_Required(FolderAFileCount);        end          else if FolderBFileCount > FolderAFileCount then          begin          frmSQLiteDBases.Write_INSERT_All_Rows_Required(FolderBFileCount);          end            else if FolderAFileCount = FolderBFileCount then            begin            frmSQLiteDBases.Write_INSERT_All_Rows_Required(FolderAFileCount);            end;     ... // Populate the database table with empty rows first, to then be populated using UPDATEprocedure TfrmSQLiteDBases.Write_INSERT_All_Rows_Required(RowCount : integer);var  i : integer;begin  for i:= 0 to RowCount -1 do    begin      try        sqlCOMPARETWOFOLDERS.Close;        sqlCOMPARETWOFOLDERS.SQL.Text := 'INSERT into TBL_COMPARE_TWO_FOLDERS (FolderAndFileNameA, FolderAndFileNameAHash, FolderAndFileNameB, FolderAndFileNameBHash) values (:FolderAFilename,:FolderAFileHash,:FolderBFilename,:FolderBFileHash)';        SQLTransaction1.Active := True;        sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderAFilename').AsString := '';        sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderAFileHash').AsString := '';        sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderBFilename').AsString := '';        sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderBFileHash').AsString := '';        sqlCOMPARETWOFOLDERS.ExecSQL;      except          on E: EDatabaseError do          begin            MessageDlg('Error','Unable to create all rows required for comparison. Technical error message: ' + E.Message,mtError,[mbOK],0);          end;      end;    end;end;          procedure TfrmSQLiteDBases.Write_COMPARE_TWO_FOLDERS_FolderA(Col1, Col2 : string; Counter : integer);{ Col1 : Filename from Folder A  Col2 : Hashvalue from FileA in FolderA}begin  try    sqlCOMPARETWOFOLDERS.Close;           sqlCOMPARETWOFOLDERS.SQL.Text := 'UPDATE TBL_COMPARE_TWO_FOLDERS SET FolderAndFileNameA =:FolderAFilename, FolderAndFileNameAHash =:FolderAFileHash WHERE Id=:Counter';    SQLTransaction1.Active := True;    sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderAFilename').AsString := Col1;    sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderAFileHash').AsString := Col2;    sqlCOMPARETWOFOLDERS.Params.ParamByName('Counter').AsString := IntToStr(Counter);    sqlCOMPARETWOFOLDERS.ExecSQL;   except      on E: EDatabaseError do      begin        MessageDlg('Error','A database error has occurred. Technical error message: ' + E.Message,mtError,[mbOK],0);      end;  end;end; procedure TfrmSQLiteDBases.Write_COMPARE_TWO_FOLDERS_FolderB(Col3, Col4 : string; Counter : integer);{ Col3 : Filename from Folder B  Col4 : Hashvalue from FileB in FolderB}begin  try    sqlCOMPARETWOFOLDERS.Close;          sqlCOMPARETWOFOLDERS.SQL.Text := 'UPDATE TBL_COMPARE_TWO_FOLDERS SET FolderAndFileNameB =:FolderBFilename, FolderAndFileNameBHash =:FolderBFileHash WHERE Id=:Counter';    SQLTransaction1.Active := True;    sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderBFilename').AsString := Col3;    sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderBFileHash').AsString := Col4;    sqlCOMPARETWOFOLDERS.Params.ParamByName('Counter').AsString         := IntToStr(Counter);    sqlCOMPARETWOFOLDERS.ExecSQL;   except      on E: EDatabaseError do      begin        MessageDlg('Error','A database error has occurred. Technical error message: ' + E.Message,mtError,[mbOK],0);      end;  end;end;    

engkin:
I think the table should have three fields only: NameA, NameB, and Hash
when two files have identical hash, you record the three values.
when one file is not matched, you record its name and hash and leave the other name empty.

You build the two lists first with the hashes, then you write the results. When you write an entry from the first list, you search the second list for a matching hash. If you found a matching hash* you take the names and the hash as values for the table fields, and you delete both entries from the two lists.

If no matching entry was found, take one name and the hash and record it in its appropriate fields, and delete the entry from the list

It does not matter which list you begin with, when done with the first list you switch to the second list. All its entries, at this point, are files with no match.

When filling the table, you just need one go using INSERT.

* You should try to match the names AND the hashes first, if failed, then try the hash alone. This is to account for more than one copy of the same file in the same folder.

dseligo:
You didn't say what database you use, I will write you for MySQL (MariaDB) (edit: I saw later that you use SQlite).
I would organize database table little different: you don't need two columns for FolderAndFileName (since you want the same filename in one row), so the table would look like:

--- Code: MySQL  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---CREATE TABLE `tbl_compare_two_folders` (        ID INT(11) NOT NULL AUTO_INCREMENT,        FolderAndFileName VARCHAR(500) NOT NULL,        FolderAndFileNameAHash VARCHAR(36) NULL,        FolderAndFileNameBHash VARCHAR(36) NULL,        PRIMARY KEY (ID),        UNIQUE INDEX FolderAndFileName (FolderAndFileName));
If you will insert names and hashes into database, I don't see point into filling stringlist.

Now you go through folder 'A', read files calculate hash and call InsertHash procedure:

--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---  ... FindFirst(...)...  repeat    sHash:=YourCalculateHashFunction(...);    InsertHash(FileName,sHash,'FolderAndFileNameAHash'); // for 'B' folder use FolderAndFileNameBHash column name  until FindNext(...)<>0;  FindClose(...);
You repeat this for folder 'B', just use different column name.


--- Code: Pascal  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---procedure InsertHash(sFileName,sHash,sColumnName:String);begin  sqlCOMPARETWOFOLDERS.Close;  sqlCOMPARETWOFOLDERS.SQL.Text := 'INSERT INTO tbl_compare_two_folders (FolderAndFileName, '+sColumnName+') VALUES (:FolderAndFileName, :hash) ON CONFLICT(FolderAndFileName) DO UPDATE SET '+sColumnName+' = :hash;';  SQLTransaction1.Active := True;  sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderAndFileName').AsString := sFileName;  sqlCOMPARETWOFOLDERS.Params.ParamByName('FolderAFileHash').AsString := sHash;  sqlCOMPARETWOFOLDERS.ExecSQL;//   SQLTransaction1.Active := False; // don't you need this? I am not sure, I don't work with SQlite muchend;
If you want to check files that exists in both folders you use:

--- Code: MySQL  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---select * from tbl_compare_two_folders where FolderAndFileNameAHash is not null and FolderAndFileNameBHash is not null

If you want to check files that exists in both folders but not the same you use:

--- Code: MySQL  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---select * from tbl_compare_two_folders where FolderAndFileNameAHash<>FolderAndFileNameBHash
If you want to check files that don't exists in folder A you use:

--- Code: MySQL  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---select * from tbl_compare_two_folders where FolderAndFileNameAHash is null

And if you want to check files that don't exists in folder B you use:

--- Code: MySQL  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---select * from tbl_compare_two_folders where FolderAndFileNameBHash is null

dseligo:
Oh, and if you wan't to check if there exists different file names with same hash you could use something like this:

--- Code: MySQL  [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---select *from tbl_compare_two_folders a, tbl_compare_two_folders bwhere a.FolderAndFileName<>b.FolderAndFileNameand (  a.FolderAndFileNameAHash=b.FolderAndFileNameAHashor a.FolderAndFileNameAHash=b.FolderAndFileNameBHashor a.FolderAndFileNameBHash=b.FolderAndFileNameAHashor a.FolderAndFileNameBHash=b.FolderAndFileNameBHash)

engkin:
@dseligo, beautiful approach. How do you go about filling dbgrid[?] as in the attached image in the first post. I think that's his goal.

Navigation

[0] Message Index

[#] Next page

Go to full version