Author Topic: Pointers, do I really need them?  (Read 2925 times)

Hendrikus

  • New Member
  • *
  • Posts: 10
Pointers, do I really need them?
« on: August 10, 2020, 06:26:10 pm »
Hello all,

I'm writing a program to extract data from digital images. JPEGs work very well, and now I'm starting to process raw files. Most camera brands (Nikon, Canon etc.) use a TIFF-based header for their files, but each brand is slightly different. What I don't want is to maintain a type like
  TTIFFHeader = packed record
    ByteOrder     : Word;
    TiffMagicValue: Word;
    TiffOffset    : LongWord;
  end;
for each brand. Is there a way to define a sort of common header and add members at runtime (depending on the camera brand), or do I need to define a type for each brand and use pointers to those structures?

I hope someone can point me in the right direction.

Best regards,
Hendrikus.

simone

  • Hero Member
  • *****
  • Posts: 571
Re: Pointers, do I really need them?
« Reply #1 on: August 10, 2020, 06:44:28 pm »
I'm not sure I understand your program goals. If you need to extract the exif metadata of the images, the Dexif library might be useful to you:

https://github.com/cutec-chris/dexif
Microsoft Windows 10 64 bit - Lazarus 3.0 FPC 3.2.2 x86_64-win64-win32/win64

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: Pointers, do I really need them?
« Reply #2 on: August 10, 2020, 08:06:57 pm »
If they have a common prefix you could use variant records: https://www.freepascal.org/docs-html/ref/refsu15.html

To give you an example:
Code: Pascal
type
  TVendor1Specifics = packed record
    vendorField: Word;
    EndOfHeader: Byte;
  end;
  TVendor2Specifics = packed record
    vendorField: Cardinal;
    EndOfHeader: Byte;
  end;

  TVendorType = (vendor1, vendor2);

  TCommonHeader = packed record
    CommonField1: Word;
    CommonField2: Word;
    case TVendorType of
      vendor1: (vendor1Specifics: TVendor1Specifics);
      vendor2: (vendor2Specifics: TVendor2Specifics);
  end;

Via @commonHeader.vendor1Specifics.EndOfHeader you get the address of the first byte after that vendor's header.
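A minimal sketch of how the variant record above behaves, assuming the field names from the example (the program name and the test value are made up for illustration). It shows that both vendor parts overlay the same bytes after the common fields, and where each EndOfHeader marker ends up:

```pascal
program VariantHeaderDemo;
{$mode objfpc}{$H+}

type
  TVendor1Specifics = packed record
    vendorField: Word;
    EndOfHeader: Byte;
  end;
  TVendor2Specifics = packed record
    vendorField: Cardinal;
    EndOfHeader: Byte;
  end;

  TVendorType = (vendor1, vendor2);

  TCommonHeader = packed record
    CommonField1: Word;
    CommonField2: Word;
    case TVendorType of
      vendor1: (vendor1Specifics: TVendor1Specifics);
      vendor2: (vendor2Specifics: TVendor2Specifics);
  end;

var
  Header: TCommonHeader;
begin
  FillChar(Header, SizeOf(Header), 0);
  // Both variants share the bytes directly after the two common fields.
  Header.vendor2Specifics.vendorField := $11223344;
  // The record is as large as its largest variant: 4 common bytes + 5.
  WriteLn('SizeOf(TCommonHeader)      = ', SizeOf(TCommonHeader));
  // Offsets of the two EndOfHeader markers relative to the record start.
  WriteLn('vendor1 EndOfHeader offset = ',
    PtrUInt(@Header.vendor1Specifics.EndOfHeader) - PtrUInt(@Header));
  WriteLn('vendor2 EndOfHeader offset = ',
    PtrUInt(@Header.vendor2Specifics.EndOfHeader) - PtrUInt(@Header));
end.
```

Because the records are packed, the sizes and offsets follow directly from the field sizes. Note that nothing in the record itself says which variant is valid; the program has to know the vendor from somewhere else.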

MarkMLl

  • Hero Member
  • *****
  • Posts: 6646
Re: Pointers, do I really need them?
« Reply #3 on: August 10, 2020, 09:09:40 pm »
I wonder how this can best be handled without reading excessive file content to memory and possibly having to fudge things with a pointer into that memory at some point?

I can't remember how well the traditional (pre-streams) "file of" types handled (tagged or untagged) variant records.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: Pointers, do I really need them?
« Reply #4 on: August 10, 2020, 11:03:50 pm »
I wonder how this can best be handled without reading excessive file content to memory and possibly having to fudge things with a pointer into that memory at some point?

I can't remember how well the traditional (pre-streams) "file of" types handled (tagged or untagged) variant records.

MarkMLl

I would simply read the whole file into memory. I don't know that much about the raw image format, but I guess it's something like 32 bits per pixel, so in 4K this would be 300 megs. That's easily manageable, especially if you want to do work on the image that has to be done anyway.

That said, depending on when the model dispatching is done, you can read just enough bytes for the header straight into the variant record:
Code: Pascal
size := GetFormatHeaderSize(Model);
MyStream.ReadBuffer(commonHeader, size);

Regarding the file of: this would just read the size of the union, meaning that if the vendor parts have different lengths you might read too much, so you need to seek back by that amount. That said, for such purposes reading bytewise is probably the better idea.
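A rough, self-contained sketch of the two-line snippet above. GetFormatHeaderSize is only named in the post, so the implementation here is an assumption, and a TMemoryStream filled with dummy bytes stands in for the image file:

```pascal
program ReadHeaderDemo;
{$mode objfpc}{$H+}

uses
  Classes;

type
  TVendor1Specifics = packed record
    vendorField: Word;
    EndOfHeader: Byte;
  end;
  TVendor2Specifics = packed record
    vendorField: Cardinal;
    EndOfHeader: Byte;
  end;
  TVendorType = (vendor1, vendor2);
  TCommonHeader = packed record
    CommonField1: Word;
    CommonField2: Word;
    case TVendorType of
      vendor1: (vendor1Specifics: TVendor1Specifics);
      vendor2: (vendor2Specifics: TVendor2Specifics);
  end;

// Assumed helper: size of the common fields plus one vendor's part.
function GetFormatHeaderSize(Model: TVendorType): Integer;
const
  CommonSize = 2 * SizeOf(Word);
begin
  Result := CommonSize;
  case Model of
    vendor1: Result := CommonSize + SizeOf(TVendor1Specifics);
    vendor2: Result := CommonSize + SizeOf(TVendor2Specifics);
  end;
end;

var
  MyStream: TMemoryStream;
  commonHeader: TCommonHeader;
  Dummy: array[0..15] of Byte;
  i: Integer;
begin
  // Stand-in for a file: 16 bytes holding 0, 1, 2, ...
  for i := Low(Dummy) to High(Dummy) do
    Dummy[i] := i;
  MyStream := TMemoryStream.Create;
  try
    MyStream.WriteBuffer(Dummy, SizeOf(Dummy));
    MyStream.Position := 0;

    FillChar(commonHeader, SizeOf(commonHeader), 0);
    // Read exactly the bytes vendor1 uses, so nothing has to be sought back.
    MyStream.ReadBuffer(commonHeader, GetFormatHeaderSize(vendor1));
    WriteLn('Stream position after read: ', MyStream.Position);
  finally
    MyStream.Free;
  end;
end.
```

Reading only the vendor's own size leaves the stream positioned right after that header, which avoids the seek-back that reading SizeOf(TCommonHeader) would need for the shorter variant.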

MarkMLl

  • Hero Member
  • *****
  • Posts: 6646
Re: Pointers, do I really need them?
« Reply #5 on: August 10, 2020, 11:49:06 pm »
I for one, and irrespective of available computational resources, would never advocate a whole-file read unless I were absolutely confident that the user intended to process the whole file. It would be ludicrous to read 300Mb if all the user were trying to do were identify the files with matching metadata, particularly since archives and galleries inevitably expand with advancing time.

Regarding the file of: this would just read the size of the union, meaning that if the vendor parts have different lengths you might read too much, so you need to seek back by that amount.

Yes, that's what I was thinking. Noting (and not necessarily disagreeing with) what you said about reading bytewise, I suspect that what I'd do would be read a variant record to determine the format and then reset all the way to the beginning before starting over.
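The "peek, then start over" idea might look roughly like this with a stream instead of a "file of" type. This is only a sketch: the TPeek record and the program name are invented for illustration, and the format-specific reader is left out:

```pascal
program PeekThenRewind;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  // Hypothetical record covering only the bytes needed to identify the format.
  TPeek = packed record
    ByteOrder     : Word;
    TiffMagicValue: Word;
  end;

var
  F: TFileStream;
  Peek: TPeek;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: PeekThenRewind <file>');
    Halt(1);
  end;
  F := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyWrite);
  try
    // First pass: read only the few bytes needed to identify the format.
    // ReadBuffer raises an exception if the file is shorter than that.
    F.ReadBuffer(Peek, SizeOf(Peek));
    WriteLn('First word: $', IntToHex(Peek.ByteOrder, 4));
    // Rewind so a format-specific reader starts from offset 0 again.
    F.Position := 0;
    // ... the format-specific reader (not shown) would take over here ...
  finally
    F.Free;
  end;
end.
```

Only four bytes are read before the decision is made, so scanning a large archive for matching files does not pull whole images into memory.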

MarkMLl

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: Pointers, do I really need them?
« Reply #6 on: August 11, 2020, 12:09:11 am »
I for one, and irrespective of available computational resources, would never advocate a whole-file read unless I were absolutely confident that the user intended to process the whole file. It would be ludicrous to read 300Mb if all the user were trying to do were identify the files with matching metadata, particularly since archives and galleries inevitably expand with advancing time.

To be honest I would use mmap to map the file into memory. That way I can access it as an array and the Linux kernel does the reading internally. After the call to mmap I can pretend the file has been loaded into memory and access it through a pointer/array, while in reality it is still on disk and it is up to the kernel to load the data when required. Frequently used regions end up in physical memory, while rarely used ones are evicted when memory is needed.

This lets the OS do all the complicated stuff. On the bright side, it can even be faster than ordinary reading, because no context switches are required and, if the file is cached, the data is not copied: you simply get (read) access to the kernel's file cache directly.
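A minimal sketch of the mmap approach with Free Pascal's BaseUnix unit, assuming a Unix-like target (this unit is not available on Windows). The program name is made up, and error handling is reduced to Halt:

```pascal
program MmapDemo;
{$mode objfpc}{$H+}

uses
  BaseUnix;

var
  fd: LongInt;
  info: TStat;
  data: PByte;
begin
  if ParamCount < 1 then
    Halt(1);
  fd := FpOpen(ParamStr(1), O_RDONLY);
  if fd < 0 then
    Halt(1);
  // The file size is needed for the length of the mapping.
  if (FpFStat(fd, info) <> 0) or (info.st_size < 2) then
  begin
    FpClose(fd);
    Halt(1);
  end;
  // Map the whole file read-only; pages are loaded by the kernel on access.
  data := PByte(Fpmmap(nil, info.st_size, PROT_READ, MAP_PRIVATE, fd, 0));
  FpClose(fd);  // the mapping stays valid after the descriptor is closed
  if Pointer(data) = Pointer(-1) then  // Pointer(-1) is mmap's failure value
    Halt(1);
  // The file can now be indexed like an array in memory.
  WriteLn('First two bytes: ', Chr(data[0]), Chr(data[1]));
  Fpmunmap(data, info.st_size);
end.
```

For a TIFF-based file the two bytes printed should be II or MM. Only the pages that are actually touched need to be read from disk.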

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: Pointers, do I really need them?
« Reply #7 on: August 11, 2020, 12:14:26 am »
Hi!

The TIFF format in particular is not suited to variant records.

The internal format is tricky and dynamic:

The first two bytes are II or MM, meaning Intel or Motorola.
If your machine has the byte order of the specified vendor you are lucky.
If not, you have to swap the byte order of every integer.
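The byte-order handling could be sketched like this, using the TTIFFHeader record from the first post and the LEtoN/BEtoN routines from the System unit. The function and constant names are invented for illustration, and the check against 42 assumes a plain TIFF header; vendor raw formats may differ here:

```pascal
program TiffHeaderDemo;
{$mode objfpc}{$H+}

type
  TTIFFHeader = packed record
    ByteOrder     : Word;
    TiffMagicValue: Word;
    TiffOffset    : LongWord;
  end;

const
  // 'II' and 'MM' read the same in either byte order.
  TIFF_LITTLE_ENDIAN = $4949;  // 'II'
  TIFF_BIG_ENDIAN    = $4D4D;  // 'MM'

// Convert the raw header fields to the byte order of the running machine.
// Returns False when the byte-order mark or the magic value 42 is wrong.
function NormalizeHeader(var H: TTIFFHeader): Boolean;
begin
  Result := True;
  case H.ByteOrder of
    TIFF_LITTLE_ENDIAN:
      begin
        H.TiffMagicValue := LEtoN(H.TiffMagicValue);
        H.TiffOffset     := LEtoN(H.TiffOffset);
      end;
    TIFF_BIG_ENDIAN:
      begin
        H.TiffMagicValue := BEtoN(H.TiffMagicValue);
        H.TiffOffset     := BEtoN(H.TiffOffset);
      end;
  else
    Result := False;
  end;
  if Result then
    Result := H.TiffMagicValue = 42;
end;

const
  // A big-endian ('MM') header whose first directory starts at offset 8.
  Sample: array[0..7] of Byte = ($4D, $4D, $00, $2A, $00, $00, $00, $08);
var
  H: TTIFFHeader;
begin
  Move(Sample, H, SizeOf(H));
  if NormalizeHeader(H) then
    WriteLn('Valid TIFF header, first IFD at offset ', H.TiffOffset)
  else
    WriteLn('Not a TIFF header');
end.
```

LEtoN and BEtoN do nothing when the file's byte order already matches the machine, so the same code works on both kinds of CPU.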

Second: The first chunk is the only one with a constant length.
The second already contains the number of tags and the pointers to the tags.

And TIFF is very flexible.
You can set point (0,0) to bottom right, if you like.
You can set the BitsPerPixel between 1 and 32.
You can save the CMYK (printer) format with 4 planes, if you need to.
And you can save more than 1 image in one file.

You jump from tag to tag and get more detailed information.
Once you have all the necessary details you come to the raw data, which is also stored in tags. And then the next tag ...
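Jumping from tag to tag could be sketched as follows. The layout used here (a 2-byte entry count, 12-byte entries, then a 4-byte offset to the next directory) is the one from the TIFF 6.0 specification; the sample bytes and all identifiers are made up for illustration, and a real reader would also have to guard against malformed offsets:

```pascal
program IfdWalkDemo;
{$mode objfpc}{$H+}

uses
  Classes;

const
  // Hand-made little-endian sample: 8-byte header, one directory, two entries.
  Sample: array[0..37] of Byte = (
    $49, $49, $2A, $00, $08, $00, $00, $00,  // 'II', 42, first IFD at offset 8
    $02, $00,                                // the directory holds two entries
    $02, $01, $03, $00, $01, $00, $00, $00, $08, $00, $00, $00,  // tag 258
    $03, $01, $03, $00, $01, $00, $00, $00, $05, $00, $00, $00,  // tag 259
    $00, $00, $00, $00);                     // offset 0: no further directory

var
  S: TMemoryStream;
  BigEndian: Boolean = False;

// Read a 16-bit value and convert it from the file's byte order.
function ReadWord: Word;
begin
  Result := 0;
  S.ReadBuffer(Result, SizeOf(Result));
  if BigEndian then Result := BEtoN(Result) else Result := LEtoN(Result);
end;

// Read a 32-bit value and convert it from the file's byte order.
function ReadLong: LongWord;
begin
  Result := 0;
  S.ReadBuffer(Result, SizeOf(Result));
  if BigEndian then Result := BEtoN(Result) else Result := LEtoN(Result);
end;

var
  IfdOffset, ValueCount, ValueOrOffset: LongWord;
  EntryCount, Tag, FieldType: Word;
  i: Integer;
begin
  S := TMemoryStream.Create;
  try
    S.WriteBuffer(Sample, SizeOf(Sample));
    S.Position := 0;

    BigEndian := ReadWord = $4D4D;  // 'MM'; the mark reads the same either way
    if ReadWord <> 42 then
    begin
      WriteLn('Not a TIFF header');
      Exit;
    end;
    IfdOffset := ReadLong;

    // Follow the chain of directories until the next-offset field is 0.
    while IfdOffset <> 0 do
    begin
      S.Position := IfdOffset;
      EntryCount := ReadWord;
      for i := 1 to EntryCount do
      begin
        Tag := ReadWord;
        FieldType := ReadWord;
        ValueCount := ReadLong;
        // Holds the value itself if it fits in 4 bytes, otherwise an offset.
        ValueOrOffset := ReadLong;
        WriteLn('tag ', Tag, ' type ', FieldType, ' count ', ValueCount,
          ' value/offset ', ValueOrOffset);
      end;
      IfdOffset := ReadLong;
    end;
  finally
    S.Free;
  end;
end.
```

The number of entries and their order are only known at run time, which is why a fixed or variant record cannot describe more than the first eight bytes.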

I made a TIFF reader in the late 80s, but it vanished with the broken hard disk of my Atari ST.

Try to take an existing reader.
Someone already did the job!

Winni




winni

  • Hero Member
  • *****
  • Posts: 3197
Re: Pointers, do I really need them?
« Reply #8 on: August 11, 2020, 12:40:28 am »
Hi!

Detailed info about the TIFF format:
http://paulbourke.net/dataformats/tiff/tiff_summary.pdf

A short excerpt from the PDF, only A-C :

Code: Text
Tag Name                     Tag ID   Tag Type
Artist                       315      ASCII
BadFaxLines[1]               326      SHORT or LONG
BitsPerSample                258      SHORT
CellLength                   265      SHORT
CellWidth                    264      SHORT
CleanFaxData[1]              327      SHORT
ColorMap                     320      SHORT
ColorResponseCurve           301      SHORT
ColorResponseUnit            300      SHORT
Compression                  259      SHORT
  CCITT 1D                   2
  CCITT Group 3              3
  CCITT Group 4              4
  LZW                        5        -
  JPEG                       6        -
  Uncompressed               32771
  Packbits                   32773
ConsecutiveBadFaxLines[1]    328      LONG or SHORT
Copyright                    33432    ASCII
A lot of work!!!

If you want to do the world a favour:
There are a lot of programs which don't understand the TIFF fax format (CCITT 1..4).


Winni


TRon

  • Hero Member
  • *****
  • Posts: 2400
Re: Pointers, do I really need them?
« Reply #9 on: August 11, 2020, 12:57:39 am »
The TIFF format in particular is not suited to variant records.

The internal format is tricky and dynamic:
I fully agree with that.

Besides supporting both byte orders, it is dynamic to such an extent that it allows you to store several kinds of data in a single file. So it can, for instance, contain a colormap, a JPEG and a bitmap/fax image, and at the same time an MP3 and a WAV file, or several of each.

In that respect it is more like a container format, which requires use of a good parser.

Hendrikus

  • New Member
  • *
  • Posts: 10
Re: Pointers, do I really need them?
« Reply #10 on: August 11, 2020, 07:51:42 am »
I read a lot of reactions about things I never knew, so I have a lot to learn. Thank you for that; it's always good to learn.
Best regards,

Hendrikus

Hendrikus

  • New Member
  • *
  • Posts: 10
Re: Pointers, do I really need them?
« Reply #11 on: August 11, 2020, 07:53:48 am »
If they have a common prefix you could use variant records: https://www.freepascal.org/docs-html/ref/refsu15.html

To give you an example:
Code: Pascal
type
  TVendor1Specifics = packed record
    vendorField: Word;
    EndOfHeader: Byte;
  end;
  TVendor2Specifics = packed record
    vendorField: Cardinal;
    EndOfHeader: Byte;
  end;

  TVendorType = (vendor1, vendor2);

  TCommonHeader = packed record
    CommonField1: Word;
    CommonField2: Word;
    case TVendorType of
      vendor1: (vendor1Specifics: TVendor1Specifics);
      vendor2: (vendor2Specifics: TVendor2Specifics);
  end;

Via @commonHeader.vendor1Specifics.EndOfHeader you get the address of the first byte after that vendor's header.

This is new to me and very nice. I'm going to experiment with this one. Thank you.

Hendrikus

  • New Member
  • *
  • Posts: 10
Re: Pointers, do I really need them?
« Reply #12 on: August 11, 2020, 07:59:27 am »


I made a TIFF reader in the late 80s, but it vanished with the broken hard disk of my Atari ST.

Try to take an existing reader.
Someone already did the job!

Winni

Maybe you're right and I should go back to ExifTool.exe. Thanks anyway.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6646
Re: Pointers, do I really need them?
« Reply #13 on: August 11, 2020, 08:24:30 am »
To be honest I would use mmap to map the file into memory. That way I can access it as an array and the Linux kernel does the reading internally. After the call to mmap I can pretend the file has been loaded into memory and access it through a pointer/array, while in reality it is still on disk and it is up to the kernel to load the data when required. Frequently used regions end up in physical memory, while rarely used ones are evicted when memory is needed.

I can't remember the extent to which mmap is on-demand, or if it pre-reads the entire space.

In any event, I'd still suggest that an initial read to determine the format is in order, irrespective of the way it's handled during the major processing phase.

MarkMLl

PascalDragon

  • Hero Member
  • *****
  • Posts: 5444
  • Compiler Developer
Re: Pointers, do I really need them?
« Reply #14 on: August 11, 2020, 09:14:06 am »


I made a TIFF reader in the late 80s, but it vanished with the broken hard disk of my Atari ST.

Try to take an existing reader.
Someone already did the job!

Winni

Maybe you're right and I should go back to ExifTool.exe. Thanks anyway.

Depending on how "standard compliant" the RAW images are, you could take a look at fcl-image and its TIFF reader. You don't need to use it as is, but you could use it as a base (maybe together with the infrastructure provided by fcl-image).
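A quick way to find out whether fcl-image copes with a given file is a sketch like the one below, using TFPReaderTiff from the FPReadTiff unit. The program name is made up, and whether a particular vendor's raw file decodes at all is an open question that this test is meant to answer:

```pascal
program FclTiffDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, FPImage, FPReadTiff;

var
  Img: TFPMemoryImage;
  Reader: TFPReaderTiff;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: FclTiffDemo <file.tif>');
    Halt(1);
  end;
  Img := TFPMemoryImage.Create(0, 0);
  Reader := TFPReaderTiff.Create;
  try
    try
      // Let the fcl-image TIFF reader parse the header, tags and pixel data.
      Img.LoadFromFile(ParamStr(1), Reader);
      WriteLn('Decoded ', Img.Width, ' x ', Img.Height, ' pixels');
    except
      on E: Exception do
        WriteLn('fcl-image could not read this file: ', E.Message);
    end;
  finally
    Reader.Free;
    Img.Free;
  end;
end.
```

If the reader rejects a raw file, its source can still serve as a starting point for the tag parsing, as suggested above.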

 
