FPC Neural Networks + Places2: 1.8 million training images - TESTING HAS STARTED
schuler:
:) Hello everyone :)
I think that the importance of the moment deserves a new topic.
Until now, the API's methods loaded the full set of training images into memory. This was fine, or at least workable at a stretch, with:
* CIFAR-10 and CIFAR-100 (60k images): https://www.cs.toronto.edu/~kriz/cifar.html
* Tiny ImageNet 200 (100k images): http://cs231n.stanford.edu/tiny-imagenet-200.zip
* Plant Village (54k images): https://data.mendeley.com/datasets/tywbtsjrjv/1/files/d5652a28-c1d8-4b76-97f3-72fb80f94efc
For a dataset with 1.8 million images, keeping everything in RAM wouldn't work. Places2 standard has 1.8 million training images:
http://places2.csail.mit.edu/download.html
For my own tests, I'm downloading "Small images (256 * 256) with easy directory structure with 21GB".
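To put a number on why RAM is the limit here, the small program below is a back-of-the-envelope estimate only. It assumes every image is decoded to 256 x 256 RGB and, as a second assumption, that each channel value is then stored as a 4-byte Single; neither figure is measured from the API. Under those assumptions it works out to roughly 354 GB as raw bytes and about four times that as floats.

```pascal
program MemoryEstimate;
{$mode objfpc}{$H+}

const
  ImageCount    = 1800000; // training images quoted for Places2 standard
  SizeX         = 256;
  SizeY         = 256;
  Channels      = 3;       // RGB
  BytesPerFloat = 4;       // assumption: one Single per channel value

var
  BytesPerImage, TotalRaw, TotalFloat: Int64;
begin
  // Int64 avoids overflowing 32-bit integers on the totals.
  BytesPerImage := Int64(SizeX) * SizeY * Channels;
  TotalRaw := BytesPerImage * ImageCount;
  TotalFloat := TotalRaw * BytesPerFloat;

  WriteLn('Bytes per decoded image: ', BytesPerImage);
  WriteLn('All images as bytes (GB): ', TotalRaw / 1e9 :0:1);
  WriteLn('All images as Single (GB): ', TotalFloat / 1e9 :0:1);
end.
```

The 21GB download is much smaller because the files on disk are compressed; the estimate is for decoded pixels.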
A solution for this is currently in testing. Only the folder structure (the file names), not the images themselves, is loaded into RAM with:
--- Code: Pascal ---
FTrainingFileNames, FValidationFileNames, FTestFileNames: TFileNameList;
...
ProportionToLoad := 1;
CreateFileNameListsFromImagesFromFolder(
  FTrainingFileNames, FValidationFileNames, FTestFileNames,
  {FolderName=}'places_folder/train', {pImageSubFolder=}'',
  {TrainingProp=}0.9*ProportionToLoad,
  {ValidationProp=}0.05*ProportionToLoad,
  {TestProp=}0.05*ProportionToLoad
);
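As a rough illustration of what splitting by proportions can look like, here is a minimal sketch using only the standard Classes unit. It is not the library's implementation: SplitFileNames is a hypothetical helper, the file names are made up, and the random per-file assignment is an assumption. It also shows one plausible reading of ProportionToLoad: when the three proportions sum to less than 1, the remaining files are simply left out.

```pascal
program SplitSketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper, NOT part of neural-api: assigns each file name to
// one of three lists according to the given proportions.
procedure SplitFileNames(All, Train, Validation, Test: TStrings;
  TrainProp, ValidationProp, TestProp: Double);
var
  I: Integer;
  R: Double;
begin
  for I := 0 to All.Count - 1 do
  begin
    R := Random; // uniform in [0, 1)
    if R < TrainProp then
      Train.Add(All[I])
    else if R < TrainProp + ValidationProp then
      Validation.Add(All[I])
    else if R < TrainProp + ValidationProp + TestProp then
      Test.Add(All[I]);
    // Otherwise the file is skipped (proportions sum to less than 1).
  end;
end;

var
  All, Train, Validation, Test: TStringList;
  I: Integer;
  ProportionToLoad: Double;
begin
  RandSeed := 42;          // fixed seed so repeated runs give the same split
  ProportionToLoad := 0.5; // load about half of the files in this example

  All := TStringList.Create;
  Train := TStringList.Create;
  Validation := TStringList.Create;
  Test := TStringList.Create;
  try
    // Made-up file names standing in for a real folder scan.
    for I := 1 to 1000 do
      All.Add(Format('class_a/img_%d.jpg', [I]));

    SplitFileNames(All, Train, Validation, Test,
      0.9 * ProportionToLoad, 0.05 * ProportionToLoad, 0.05 * ProportionToLoad);

    WriteLn('Training:   ', Train.Count);
    WriteLn('Validation: ', Validation.Count);
    WriteLn('Test:       ', Test.Count);
    WriteLn('Skipped:    ', All.Count - Train.Count - Validation.Count - Test.Count);
  finally
    Test.Free;
    Validation.Free;
    Train.Free;
    All.Free;
  end;
end.
```

Only strings are kept in memory here, so the cost grows with the number of file names rather than with the number of pixels.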
There is a new fitting class capable of working with the above:
--- Code: Pascal ---
NeuralFit := TNeuralImageLoadingFit.Create;
...
NeuralFit.FitLoading({NeuralNetworkModel}NN, {ImageSizeX}256, {ImageSizeY}256,
  FTrainingFileNames, FValidationFileNames, FTestFileNames,
  {BatchSize}256, {Epochs}100);
Is it bug free? I don't know. It's currently in testing. My current test case takes source code written for the Plant Village dataset and just switches the folder name:
https://github.com/joaopauloschuler/neural-api/blob/master/examples/SimplePlantLeafDisease/SimplePlantLeafDiseaseLoadingAPI.pas
When creating the model (the neural network), the number of classes comes straight from the dataset:
--- Code: Pascal ---
NN.AddLayer([
  TNNetInput.Create(FSizeX, FSizeY, 3),
  ...
  TNNetFullConnectLinear.Create(FTrainingFileNames.ClassCount),
  TNNetSoftMax.Create()
]);
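How ClassCount is computed isn't shown here. A minimal sketch of the general idea, assuming one subfolder per class (as the "easy directory structure" download suggests), is to treat the parent folder name of each file as its label and count the distinct labels. ClassNameFromPath and the example paths are hypothetical.

```pascal
program ClassCountSketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper, NOT part of neural-api: the class label is taken
// to be the name of the folder that directly contains the file.
function ClassNameFromPath(const FileName: string): string;
begin
  Result := ExtractFileName(ExtractFileDir(FileName));
end;

var
  Paths, ClassNames: TStringList;
  I: Integer;
begin
  Paths := TStringList.Create;
  ClassNames := TStringList.Create;
  try
    // dupIgnore only takes effect on a sorted list.
    ClassNames.Sorted := True;
    ClassNames.Duplicates := dupIgnore;

    // Made-up paths standing in for the real file-name list.
    Paths.Add('places_folder/train/airfield/00000001.jpg');
    Paths.Add('places_folder/train/airfield/00000002.jpg');
    Paths.Add('places_folder/train/alley/00000001.jpg');

    for I := 0 to Paths.Count - 1 do
      ClassNames.Add(ClassNameFromPath(Paths[I]));

    WriteLn('Distinct classes: ', ClassNames.Count);
  finally
    ClassNames.Free;
    Paths.Free;
  end;
end.
```

With the count taken from the folder layout, pointing the same program at a different dataset changes the size of the final layer without touching the model code.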
:) Wishing everyone happy Pascal coding. May the source be with you. :)
Editor:
Hi. It looks fantastic. Could you tell us more about the whole code?
Laksen:
Really cool library, and excellent information on GitHub.
I'm curious whether you have support for fixed-point data types of any kind?