grantful

optical character recognition


I used this one on a project a few years back to replace an old version of the ABBYY DLL (V1.0).

 

https://transym.com

 

It's a component written in Delphi and runs incredibly fast. NOT A DLL!

 

If you value your time at all, this thing is cheaper than most programmers charge for an hour of their time. It will take you more than an hour just to get a Tesseract demo working in Delphi.

 

I'll give you a tip about OCR scanning ...

 

If you're processing any kind of form, first look for anything on the form that lets you determine its orientation. Some OCR engines do that for you, but they can be really slow because they'll scan in all four orientations and pick the one with the best results. Instead, pick a few "landmarks", such as a few words, and isolate the areas where you'd expect them to be on the page. Scan those spots and see if you get a match. If not, rotate the page 180 degrees and try again. If it still doesn't match, chances are either (1) you're looking at the wrong side of the page (ie, it's blank) or (2) it's not the form with those landmarks on it.
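Here is a minimal sketch of that landmark check, written in Python rather than Delphi so that it does not depend on any particular OCR library. Everything in it is an assumption made for illustration: the page is a plain list of pixel rows, `ocr` is a hypothetical callable you would wire up to whatever engine you use, and the landmark boxes are made-up coordinates.

```python
from typing import Callable, Optional, Sequence

Image = list[list[int]]          # rows of pixel values; stand-in for a real bitmap
Box = tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive
Landmark = tuple[Box, str]       # where to look, and the text expected there


def crop(image: Image, box: Box) -> Image:
    """Cut out just the region of interest so the OCR call stays small."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def rotate_180(image: Image) -> Image:
    """Turn the page upside down: reverse the row order and each row."""
    return [row[::-1] for row in image[::-1]]


def landmarks_match(image: Image, landmarks: Sequence[Landmark],
                    ocr: Callable[[Image], str]) -> bool:
    """True if every landmark's expected text is found in its box."""
    return all(expected.lower() in ocr(crop(image, box)).lower()
               for box, expected in landmarks)


def detect_orientation(image: Image, landmarks: Sequence[Landmark],
                       ocr: Callable[[Image], str]) -> Optional[int]:
    """Return 0 or 180 (degrees) if the landmarks match, else None.

    None means a blank page or a different form, as described above.
    """
    if landmarks_match(image, landmarks, ocr):
        return 0
    if landmarks_match(rotate_180(image), landmarks, ocr):
        return 180
    return None
```

The point of the design is that only the small landmark boxes ever reach the OCR engine, at most twice per page, instead of the full page in all four orientations.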

 

Identify forms and their orientation by landmarks first. Then extract the data you need as follows:

 

Break the page into boxes that are big enough to contain the text you're looking for. Don't scan the whole page if you're only looking for, say, an invoice number in the top-right corner, a date, and an ID#. Just look in the places where you expect those things to be and ignore everything else.
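A rough sketch of that box-by-box extraction, again in Python and again with assumptions spelled out: `ocr` is a hypothetical callable standing in for your OCR engine, the page is a list of pixel rows, and the field names and coordinates in `INVOICE_FIELDS` are invented for a form that is 100 pixels wide.

```python
from typing import Callable

Image = list[list[int]]          # rows of pixel values; stand-in for a real bitmap
Box = tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive

# Hypothetical layout for one form type: only the areas we care about.
INVOICE_FIELDS: dict[str, Box] = {
    "invoice_number": (60, 0, 100, 10),   # top-right corner
    "date": (0, 10, 40, 20),
    "customer_id": (0, 20, 40, 30),
}


def crop(image: Image, box: Box) -> Image:
    """Cut out one box; everything outside it is never sent to OCR."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def extract_fields(image: Image, fields: dict[str, Box],
                   ocr: Callable[[Image], str]) -> dict[str, str]:
    """Run OCR on each named box only and return the trimmed text per field."""
    return {name: ocr(crop(image, box)).strip() for name, box in fields.items()}
```

You would keep one such table of boxes per form type and pick the right table once the landmarks have told you which form you're looking at.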

 

If the forms are being scanned and sent by FAX, look for "noise" (lots of random dots). If you find a high noise level, run something to get rid of the noise first, as it will improve both the recognition rate and the accuracy tremendously.
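One simple way to measure and remove that kind of noise is sketched below in Python. It is not what any particular OCR product does: it assumes a black-and-white page stored as rows of 0 (paper) and 1 (ink), treats an ink pixel with no ink among its eight neighbours as a speck, and uses an arbitrary 20% threshold that you would have to tune on real faxes.

```python
Image = list[list[int]]  # rows of pixels: 1 = ink, 0 = paper


def ink_neighbours(image: Image, y: int, x: int) -> int:
    """Count ink pixels among the (up to) eight neighbours of (y, x)."""
    height, width = len(image), len(image[0])
    count = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if (dy or dx) and 0 <= ny < height and 0 <= nx < width:
                count += image[ny][nx]
    return count


def noise_level(image: Image) -> float:
    """Fraction of ink pixels that are isolated specks (0.0 for a blank page)."""
    ink = isolated = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                ink += 1
                if ink_neighbours(image, y, x) == 0:
                    isolated += 1
    return isolated / ink if ink else 0.0


def despeckle(image: Image) -> Image:
    """Return a copy of the page with isolated ink pixels erased."""
    return [[1 if pixel and ink_neighbours(image, y, x) > 0 else 0
             for x, pixel in enumerate(row)]
            for y, row in enumerate(image)]


def clean_if_noisy(image: Image, threshold: float = 0.2) -> Image:
    """Despeckle only when the noise level exceeds the (assumed) threshold."""
    return despeckle(image) if noise_level(image) > threshold else image
```

Real fax noise often comes in clumps bigger than one pixel, so in practice you'd probably want a median filter or a minimum blob size instead, but the measure-first, clean-only-if-needed flow stays the same.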

21 hours ago, David Schwartz said:

It's a component written in Delphi and runs incredibly fast. NOT A DLL!

As far as I can tell from looking at the website, this is not the case anymore for newer versions of TOCR (currently at version 5).

The last version that unofficially supported Delphi was version 4, and even version 4 is not a Delphi component (judging by the source code examples).

Edited by Fons N


Well, the website seems just as vague today as it was when I worked with it (mid-2020). It was an incredibly fast library that was very easy to use in Delphi and cost under $100 USD for a license that had no limits on its use (ie, as many products as you wanted), no royalties, and no DLL. It was at work and I don't have a copy of it, so I can't give you any more details.

 

But if whatever you've seen makes you think you'll be better off with OCR libs that are set up as DLLs where they restrict your usage, charge you royalties, or won't let you use it in a situation where you're getting data fed to you by clients through a web site, then by all means open your wallet and hand them your credit card. They'll ping it monthly for quite a bit.

 

People do not write software that does OCR processing to deal with 50 pages of material. They're dealing with thousands to millions, usually being sent over the internet, and they are trying to replace workers with faster automated processing to boost their profits. The OCR software vendors know this, and they do all they can to ensure you're paying for every scan you process with their software.

 

Again, I replaced an old ABBYY DLL with this and the processing time for the workload went down from 6-10 hours to 30-50 minutes. The number of rejected documents before was about 30% (mostly because of a poor software design) and after it was under 1%. The accuracy rates were 25% higher with this library. The ABBYY people refused to sell us a newer DLL, and after our IT guy explained what we were wanting to do, the ONLY option they gave us was to connect our software with their server, upload each scan and have it processed by them for something like a nickel a page. The estimated cost they were going to charge us far exceeded what we were already charging the customer for everything. I was told this part was just a courtesy we were extending to the customer, and that there's no way they'd stick with us if we raised their price by even 5%.

 

I still got dinged on my annual review because Mgt believed it was supposed to be a "one-week effort to migrate this from Win XP to Win 10" and it ended up taking nearly 3 months while I had to research OCR libs and find one to replace the DLL that just did not work in Win 10. (That analysis was done before I was even hired and they didn't seem to care what I had to say about it.) And they grumbled that it cost them $85 no less. 

6 hours ago, David Schwartz said:

But if whatever you've seen makes you think you'll be better off with OCR libs that are set up as DLLs

On the contrary, I would take a Delphi component over a DLL anytime. My point is, transym.com does not sell a Delphi component anymore (at least not with its latest version, which is version 5).

 

https://transym.com/download/

 


 

If version 5 were available as a Delphi component, then there would be Delphi samples. There are none.

 


 

Version 4 has Delphi samples, but only for 32 bit. I downloaded them, but unless I am missing something, it is still not a Delphi component.

 

Maybe version 3 is, but the website states that these Delphi samples were last updated in December 2011. That is a long time ago. I know code doesn't rust, but as it is clear there won't be any updates that specifically support Delphi, I would think twice before using it.

 

A shame, because given the price, its unrestricted use, and what it can do, it would be a top choice if you need OCR, if only it were an actual Delphi component.

 

 
