Author Topic: Camera OCR / Sorting / AI learning  (Read 1204 times)

Petrus Vorster

  • Jr. Member
  • **
  • Posts: 82
Camera OCR / Sorting / AI learning
« on: November 05, 2024, 01:16:32 pm »
Greetings Team

The more I work with LPC, the more I am astounded by its capabilities.

This is probably way above my head, but it is very interesting and could keep me busy until I retire one day.

In mail processing, the OCR reads the zip/postal code from visual text within a certain readable area on the envelope in a millisecond. This tech has been available since the mid-1980s.
How did they do that back then? And can one do something similar on a PC with a webcam now?

There is a need for an AI tool that learns to determine the TO and FROM addresses on mail pieces, regardless of where they are printed, and then uses an inline printer to do returns, numbering etc.
Processing equipment like that costs millions. Few countries can afford that.

I will be happy if I can learn how to swipe an envelope past a webcam, get it to find the zip code, and print it for me!

In such a wilderness, where would you wizards start to even remotely look into something like that?
Some broad ideas would be interesting to hear.

-Peter
« Last Edit: November 05, 2024, 01:19:31 pm by Petrus Vorster »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11947
  • FPC developer.
Re: Camera OCR / Sorting / AI learning
« Reply #1 on: November 05, 2024, 01:44:06 pm »
The speedier versions of that tech probably used FPGAs. Also not all OCR is AI, and AI is generally computationally heavier.

I'm not sure 1 ms for generic detection using generic tools is feasible/easy, but the default open source solution is Tesseract, with or without OpenCV. ( https://www.geeksforgeeks.org/text-detection-and-extraction-using-opencv-and-ocr/ and https://en.delphipraxis.net/topic/8694-ann-new-opencv-v-46-c-api-wrapper/)

(OpenCV is a general machine vision library, and Delphi header files for it do exist; I'm not sure if they cover the OCR part though.)

I'm also planning to do something this winter, but I only have to read something similar to LCD segmented digits (like on eighties equipment). I decided to first try without AI, to see if I can find a speedier solution with fewer dependencies.
« Last Edit: November 05, 2024, 02:00:32 pm by marcov »

MarkMLl

  • Hero Member
  • *****
  • Posts: 8039
Re: Camera OCR / Sorting / AI learning
« Reply #2 on: November 05, 2024, 01:56:32 pm »
There's a series of articles at https://hackaday.com/2023/10/19/youve-got-mail-grilled-scrambled-and-other-delicious-stamps/ which might be useful as backgrounders.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11947
  • FPC developer.
Re: Camera OCR / Sorting / AI learning
« Reply #3 on: November 05, 2024, 02:05:37 pm »
You mean like https://hackaday.com/2023/09/20/youve-got-mail-reading-addresses-with-ocr/

It actually suggests they do something that I also plan to do: using normal vision tricks to narrow the region of interest and get a rough idea of orientation. Since my subject is glass or ceramic (embossed mould numbers), optical artefacts can confuse standard preprocessing in vision libs.

But since all my info is either horizontal or vertical lines, some projections will probably be sufficient.
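
A minimal sketch of the projection idea, in plain Python rather than Pascal so it runs without any library. It assumes an already binarised image stored as rows of 0/1; the names `projections` and `runs_above` are made up for this illustration. Summing pixels per row and per column gives two 1-D profiles, and the stretches where a profile rises above a threshold mark where the horizontal and vertical features sit.

```python
def projections(img):
    """Return (row_sums, col_sums) for a binary image given as rows of 0/1."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    return rows, cols


def runs_above(profile, threshold):
    """Return (start, end) pairs, end exclusive, where profile > threshold."""
    spans, start = [], None
    for i, v in enumerate(profile):
        if v > threshold and start is None:
            start = i                      # a feature begins here
        elif v <= threshold and start is not None:
            spans.append((start, i))       # the feature ended just before i
            start = None
    if start is not None:                  # feature touches the border
        spans.append((start, len(profile)))
    return spans


# Tiny test image: two 2x2 blobs side by side on one "text line".
img = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
rows, cols = projections(img)
line_spans = runs_above(rows, 0)    # vertical extent of the line
glyph_spans = runs_above(cols, 0)   # horizontal extent of each blob
```

On a real camera image the threshold would have to be well above zero to survive noise, and a rotated subject needs its orientation estimated first.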

added later: ah sorry for the noise, there is a list with all the articles near the bottom
« Last Edit: November 05, 2024, 02:12:33 pm by marcov »

Petrus Vorster

  • Jr. Member
  • **
  • Posts: 82
Re: Camera OCR / Sorting / AI learning
« Reply #4 on: November 05, 2024, 02:11:18 pm »
Interesting topics!

Will read!

Thanks everyone.

-Peter

gidesa

  • Full Member
  • ***
  • Posts: 145
Re: Camera OCR / Sorting / AI learning
« Reply #5 on: November 05, 2024, 02:29:00 pm »
I'm not sure 1 ms for generic detection using generic tools is feasible/easy, but the default open source solution is Tesseract, with or without OpenCV. ( https://www.geeksforgeeks.org/text-detection-and-extraction-using-opencv-and-ocr/ and https://en.delphipraxis.net/topic/8694-ann-new-opencv-v-46-c-api-wrapper/)
(OpenCV is a general machine vision library, and Delphi header files for it do exist; I'm not sure if they cover the OCR part though.)

Opencv doesn't have a direct module for OCR. It can be used in conjunction with Tesseract. Or with Opencv you can load and use OCR deep neural networks.
(Note: Opencv cannot train a deep neural network).
Another technique uses the kNN statistical algorithm. That nevertheless requires a training step, but it can be done using Opencv functions.
See https://docs.opencv.org/4.x/d8/d4b/tutorial_py_knn_opencv.html.
I will add this interesting example, translated for Delphi/FPC, to the OpenCV 4.6 wrapper repository https://github.com/gidesa/ocvWrapper46
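
To show what the kNN approach boils down to, here is a minimal sketch in plain Python. It is not the code from the linked tutorial, which uses OpenCV's own kNN on real handwritten digits; the 3x3 "glyphs", the labels and the function name `knn_classify` are invented for this illustration. The "training" step is nothing more than storing labelled examples; classification takes a majority vote among the k stored examples closest to the unknown sample.

```python
from collections import Counter


def knn_classify(sample, training, k=3):
    """Return the majority label among the k training vectors nearest to sample.

    training is a list of (feature_vector, label) pairs; distance is the
    squared Euclidean distance between flattened pixel vectors.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(training, key=lambda t: dist(sample, t[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set: flattened 3x3 bitmaps of a bar ("1") and a ring ("0").
TRAINING = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], "1"),
    ([0, 1, 0, 1, 1, 0, 0, 1, 0], "1"),
    ([0, 0, 1, 0, 0, 1, 0, 0, 1], "1"),
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], "0"),
    ([0, 1, 1, 1, 0, 1, 1, 1, 1], "0"),
    ([1, 1, 1, 1, 0, 1, 1, 1, 0], "0"),
]

noisy_one = [0, 1, 0, 0, 1, 0, 0, 1, 1]    # bar with one stray pixel
noisy_zero = [1, 1, 1, 1, 0, 1, 0, 1, 1]   # ring with one missing pixel
guess_one = knn_classify(noisy_one, TRAINING)
guess_zero = knn_classify(noisy_zero, TRAINING)
```

Real use needs glyphs that are segmented and scaled to a common size first, and far more training samples per class.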

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11947
  • FPC developer.
Re: Camera OCR / Sorting / AI learning
« Reply #6 on: November 05, 2024, 02:58:39 pm »
gidesa: thanks. I was hoping it had one, to be used as fallback if I failed. 

I liked the main video (in the link that I posted). It is roughly a similar approach, but since my features are simpler, simply putting a grid over the image and sampling a few points should be enough, and without dependencies. (I also have my own blob/feature library if necessary.)
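
A minimal sketch of grid sampling for segmented digits, in plain Python. Everything here is an assumption for illustration: the digit is already cropped to its bounding box, upright and binarised, and the sample positions, the `DIGITS` table and the name `read_digit` are made up. One pixel is sampled per segment, and the resulting 7-bit pattern is looked up in a fixed table.

```python
# Sample points as (x, y) fractions of the digit's bounding box,
# in the usual seven-segment order a, b, c, d, e, f, g.
SAMPLE_POINTS = [
    (0.5, 0.0),   # a: top
    (1.0, 0.25),  # b: upper right
    (1.0, 0.75),  # c: lower right
    (0.5, 1.0),   # d: bottom
    (0.0, 0.75),  # e: lower left
    (0.0, 0.25),  # f: upper left
    (0.5, 0.5),   # g: middle
]

# Lit/unlit pattern (a..g) for each digit of a standard seven-segment display.
DIGITS = {
    (1, 1, 1, 1, 1, 1, 0): 0, (0, 1, 1, 0, 0, 0, 0): 1,
    (1, 1, 0, 1, 1, 0, 1): 2, (1, 1, 1, 1, 0, 0, 1): 3,
    (0, 1, 1, 0, 0, 1, 1): 4, (1, 0, 1, 1, 0, 1, 1): 5,
    (1, 0, 1, 1, 1, 1, 1): 6, (1, 1, 1, 0, 0, 0, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8, (1, 1, 1, 1, 0, 1, 1): 9,
}


def read_digit(box):
    """box: list of equal-length strings, '#' = lit pixel. Digit or None."""
    h, w = len(box), len(box[0])
    pattern = tuple(
        1 if box[round(y * (h - 1))][round(x * (w - 1))] == "#" else 0
        for x, y in SAMPLE_POINTS
    )
    return DIGITS.get(pattern)   # None means "pattern not recognised"


FOUR = [
    ".....",
    "#...#",
    "#...#",
    "#...#",
    ".###.",
    "....#",
    "....#",
    "....#",
    ".....",
]
digit = read_digit(FOUR)
blank = read_digit(["....."] * 9)
```

Sampling a small neighbourhood per point instead of a single pixel would make this more tolerant of noise; returning `None` for unknown patterns gives a crude reliability signal.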

My case however might be randomly orientated, while the video talks about business mail that was offered to the postal service in a manner suitable for OCR reading in those early days (address in a fixed location, upright orientation, printed letters etc.). Also, results are checked against a (rotating drum memory :-)) database; solutions with a checking mechanism are always more reliable than those without.
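
A minimal sketch of such a checking step, in plain Python. The confusion table, the example postal codes and the name `check_code` are all invented for illustration; a real system would check against the actual postal database. The raw OCR string is normalised by mapping look-alike letters to digits, and the result is accepted only if it appears in the set of known codes.

```python
# Characters that OCR commonly confuses with digits (illustrative, not exhaustive).
CONFUSIONS = {"O": "0", "I": "1", "l": "1", "S": "5", "B": "8", "Z": "2"}


def check_code(raw, valid_codes):
    """Return the corrected code if it is a known one, else None."""
    candidate = "".join(CONFUSIONS.get(ch, ch) for ch in raw.strip())
    return candidate if candidate in valid_codes else None


VALID = {"7441", "8001", "2196"}            # hypothetical known postal codes
accepted = check_code("744I", VALID)         # 'I' is read as '1'
rejected = check_code("1234", VALID)         # not in the database
```

A rejected read can then go to a retry with different preprocessing, or to a human, instead of ending up as a wrong label on the envelope.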

It is strange that so many of the machine vision tricks that I use daily already existed back then (e.g. maximising resolution on the core part, first determining outer dimensions and orientation, converting from bitmap into runlists etc.). It is not just about algorithms; it is also about optimising the incoming data, both in the equipment line-up and with preprocessing, and also postprocessing (the already mentioned checking solution).
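
For readers who have not met runlists: a minimal sketch in plain Python, assuming a binarised scanline of 0/1 values (the name `to_runs` is made up here). Each row is stored as (start, length) pairs for its stretches of set pixels, which is far more compact than a bitmap for sparse images and makes measurements such as area a simple sum.

```python
def to_runs(row):
    """Encode one binary scanline as a list of (start, length) runs of set pixels."""
    runs, start = [], None
    for x, px in enumerate(row):
        if px and start is None:
            start = x                          # run begins
        elif not px and start is not None:
            runs.append((start, x - start))    # run ended at x - 1
            start = None
    if start is not None:                      # run reaches the right edge
        runs.append((start, len(row) - start))
    return runs


bitmap = [
    [0, 1, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
]
runlist = [to_runs(r) for r in bitmap]
# Area in pixels is just the sum of all run lengths.
area = sum(length for runs in runlist for _, length in runs)
```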

Some of our algorithms spend more time generating a number for reliability than on the actual analysis.

But my main problems will be lens artefacts in the glass and worn moulds that make the embossing disappear in some cases, not the algorithm.
« Last Edit: November 05, 2024, 05:15:55 pm by marcov »

MarkMLl

  • Hero Member
  • *****
  • Posts: 8039
Re: Camera OCR / Sorting / AI learning
« Reply #7 on: November 05, 2024, 04:38:34 pm »
added later: ah sorry for the noise, there is a list with all the articles near the bottom

The apology should be from me. I was fairly sure that series of articles contained relevant stuff, but I was under pressure to get a couple of emails out in a hurry so didn't try to refine the target.

There's also a Youtube video from the always-interesting Tom Scott https://www.youtube.com/watch?v=XxCha4Kez9c , although right at the start he emphasises that manual keying is very much a minority requirement. However this has to be read in the context of a substantial proportion of mail being "quasi-junk", with a machine-readable code generated at-source in order to qualify for maximum discount.

MarkMLl

jollytall

  • Sr. Member
  • ****
  • Posts: 366
Re: Camera OCR / Sorting / AI learning
« Reply #8 on: November 05, 2024, 08:39:26 pm »
Also not all OCR is AI

Well, this is an interesting statement.
In my view every OCR is Artificial Intelligence. There is no clear definition of AI, but after many iterations my definition is: it is AI when a machine can perform a task that was earlier thought to be linked to human intelligence. It also often means that a normal person cannot imagine how a program could do it. Therefore, converting printed or even handwritten writing into digitized text was definitely something intelligent. And as such systems are designed by humans and not by Mother Nature, they are called artificial (in the classical usage, artificial means man-made).
There are many examples of AI systems like this, e.g. face recognition, SIFT, recommendation systems, classifiers and also OCR.
Obviously, as we get more and more used to what programs can do, the bar is constantly being raised for what can really be called AI and what is only a standard algorithm.

Nowadays AI is often used as a synonym for Neural Networks. Those definitely also create intelligence and as such can be considered a subset of AI. Some people reduce AI even further, to very complex systems, LLMs and the like.

Although they have Intelligence (as specified above), in my terminology I do not even use Artificial Intelligence for some of the neural network solutions, as the intelligence in them is not (directly) created by humans, hence not "artificial" in that sense. Unlike in SIFT, no human created the knowledge; a machine developed it in an evolutionary training process within a very general framework. SIFT was man-made (by David Lowe) to be a good image recognition algorithm, but only that. A large Neural Network is only a framework, and the knowledge in it is not designed by man but generated through learning.
Hence I prefer to use Machine Intelligence or Machine Learning for the larger neural network models.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11947
  • FPC developer.
Re: Camera OCR / Sorting / AI learning
« Reply #9 on: November 05, 2024, 09:20:40 pm »
Also not all OCR is AI

Well, this is an interesting statement.
In my view every OCR is Artificial Intelligence. There is no clear definition of AI, but after many iterations my definition is: it is AI when a machine can perform a task that was earlier thought to be linked to human intelligence.

Well, that is very, very broad, IMHO so broad it is unworkable and too subjective.

I would at least have some learning step in it, and some dynamism in the algorithm (i.e. the learning step should not be e.g. a fixed adding of a mask image or so).

In this case I referred to the printed letter analysis in the letter. It simply makes a matrix of pixels and then does some arithmetic to analyse that. IOW a hardcoded algorithm with hardcoded weight factors etc. There is no self-learning part to it. That is IMHO not AI.
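
To make "matrix of pixels plus arithmetic" concrete, here is a minimal sketch in plain Python. It is not the method from the video; the 3x3 glyph size, the hand-picked weights and the name `classify` are assumptions for illustration. Each class has a fixed weight per pixel, the score is a plain weighted sum, and the highest score wins. Nothing is learned: the weights are written down by the designer.

```python
# Hand-written weights per class for a flattened 3x3 glyph:
# positive where the glyph should have ink, negative where it should not.
WEIGHTS = {
    "1": [-1, 2, -1,
          -1, 2, -1,
          -1, 2, -1],
    "0": [1, 1, 1,
          1, -4, 1,
          1, 1, 1],
}


def classify(pixels):
    """Return (label, score) of the class with the highest weighted sum."""
    scores = {
        label: sum(w * p for w, p in zip(weights, pixels))
        for label, weights in WEIGHTS.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


bar = [0, 1, 0, 0, 1, 0, 0, 1, 0]
ring = [1, 1, 1, 1, 0, 1, 1, 1, 1]
bar_result = classify(bar)
ring_result = classify(ring)
```

Replace the hand-written weights with values fitted from examples and the very same arithmetic becomes a single-layer network, which is roughly where the terminology argument in this thread starts.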


« Last Edit: November 05, 2024, 09:58:45 pm by marcov »

jollytall

  • Sr. Member
  • ****
  • Posts: 366
Re: Camera OCR / Sorting / AI learning
« Reply #10 on: November 05, 2024, 09:43:07 pm »
Well, yes, you are right. There are some really old OCRs, if we can call them that, that only check fixed locations. I remember that in the Soviet Union they had a system where, if you bought an envelope, the zip code segments were indicated on it with light dashed lines, and you had to make the given lines stronger with a pen to form the numbers. If we call that OCR, then I agree it really should not be called AI.
https://i.ebayimg.com/images/g/t48AAOSw3hBepeEp/s-l1600.jpg

While we are on terminology, there is also the question of what we call "learning". I can make an image recognition system based on SIFT. I can show it thousands of photos and store the feature points. Then an unknown picture can be processed and compared to the stored data. Now the question is whether the storage of the feature points is learning or not.
If we call it "learning" then it meets your definition of AI (although there is no dynamism in it), but in my view it is not even learning, since no y=f(x) function is learnt; only a database is built (even if in the end it gives a value for an input). Although I would not call it learning, I would definitely call a system that tells me what is on a photo an AI.

MarkMLl

  • Hero Member
  • *****
  • Posts: 8039
Re: Camera OCR / Sorting / AI learning
« Reply #11 on: November 05, 2024, 10:12:03 pm »
In my view every OCR is Artificial Intelligence.

No, definitely not. I read some design notes from Redactron corp. in the early 1980s and their OCR was all about eigenvalues and eigenvectors... frankly, made my head spin. And /that/ was in the era when AI meant IKBS and /way/ before GPUs etc.

MarkMLl

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11947
  • FPC developer.
Re: Camera OCR / Sorting / AI learning
« Reply #12 on: November 05, 2024, 10:15:03 pm »
While we are on terminology, there is also the question of what we call "learning". I can make an image recognition system based on SIFT. I can show it thousands of photos and store the feature points. Then an unknown picture can be processed and compared to the stored data. Now the question is whether the storage of the feature points is learning or not.
If we call it "learning" then it meets your definition of AI (although there is no dynamism in it), but in my view it is not

This is exactly why I added the requirement for dynamism in the algorithm, not just the complexity of the learned pattern. Of course it also depends on how the stored data is further processed. You can feed the count of each type of feature as columns into a neural network, but you could also simply determine the minimal count of each feature in a good item and use that for basic sifting. The first is dynamic, the second is not.

But what I say is not a hard truth. I'm just trying to put into words how I feel about it (and not paint everything as "AI").

In general I don't use SIFT, though I experimented with Hough. Simple blobs and edges and some pretty simple high school math go a long way.

« Last Edit: November 05, 2024, 10:36:36 pm by marcov »

alpine

  • Hero Member
  • *****
  • Posts: 1303
Re: Camera OCR / Sorting / AI learning
« Reply #13 on: November 06, 2024, 09:39:52 am »
In general I don't use SIFT, though I experimented with Hough. Simple blobs and edges and some pretty simple high school math go a long way.
I'd be glad to hear a bit more detail about those alternative (to NN) OCR methods. I have been using Tesseract OCR for a long time and, while I'm generally happy with it, some faults occur that are specific to (so-called) AI, mainly hallucinations and overfitting. Thus I'm trying to learn about different methods for doing it, preferably with a lesser I in the name.

AFAIK there is a fuzzy logic approach to recognition, but the appnotes I have read are of limited application, mainly for OCR-tailored fonts.
The eigenvalues mentioned by Mark are also an interesting approach, but I suspect there is a "self-learning" part in that too, so the big I is included as well.

Thanks in advance for any little idea you are willing to share.



"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

gidesa

  • Full Member
  • ***
  • Posts: 145
Re: Camera OCR / Sorting / AI learning
« Reply #14 on: November 06, 2024, 01:26:50 pm »
I'd be glad to hear a bit more detail about those alternative (to NN) OCR methods. I have been using Tesseract OCR for a long time and, while I'm generally happy with it, some faults occur that are specific to (so-called) AI, mainly hallucinations and overfitting. Thus I'm trying to learn about different methods for doing it, preferably with a lesser I in the name.

Within so-called AI there are many algorithms other than Neural Networks, for example K Nearest Neighbors, Support Vector Machines, Random Trees, Boost.
The set of these algorithms, plus NN, is also called Machine Learning, and all of them are statistical algorithms in principle, like NN.
Deterministic algorithms could be better in more limited cases. They do not require training, yes. But YOU require training to design those algorithms! :-)
That is, you must have deep knowledge to design a good algorithm.
Note that some of the other ML algorithms, different from NN, require much less data for good training. So in some cases they could be a better choice than NN.

 

 
