Post to Database Programming Question?

by emplife21. Posted on Sep 11, 2020    0    3


Hi everyone, I'm an entrepreneur with only basic knowledge of programming, and I have a question that hopefully someone can answer, or at least provide a lead that gets me closer to an answer for my research.

I have seen a lot of technology floating around that involves taking a picture of an object (e.g. a receipt from a purchase), after which some basic information from the receipt is gathered and posted on a webpage (e.g. https://myblackreceipt.com/upload-receipt/).

I was wondering what the correct terminology for this technology is, and whether it would be possible to build something similar that follows the pathway below.

Person takes picture of receipt > Name of business is found within said database > The count/tally for the respective business increases by 1 (+1) > Counts are displayed on a webpage in numerical order from most to least.

If someone could answer this question that would surely be helpful :). Thank you all and have a wonderful day.


Comments

le_crosst 3

The term you might be looking for is OCR (Optical Character Recognition).
It is definitely possible and should be pretty straightforward. You could, for example, check out AWS Textract, which is a managed OCR service. The rest of the pipeline you are describing is basic business logic.
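
To give a rough idea of what that "basic business logic" could look like, here is a minimal sketch in Python using the built-in sqlite3 module. It assumes the OCR step has already handed you a business name as text. The table name, column names, and function names (tallies, open_db, record_receipt, leaderboard) are made up for illustration, not part of any particular service.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) tallies table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tallies ("
        "business TEXT PRIMARY KEY, "
        "receipts INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_business(conn, name):
    """Register a business with a starting tally of 0 (no-op if it already exists)."""
    conn.execute(
        "INSERT OR IGNORE INTO tallies (business, receipts) VALUES (?, 0)",
        (name,),
    )
    conn.commit()


def record_receipt(conn, name):
    """Add 1 to the tally for a known business. Returns False if the name is unknown."""
    cur = conn.execute(
        "UPDATE tallies SET receipts = receipts + 1 WHERE business = ?",
        (name,),
    )
    conn.commit()
    return cur.rowcount == 1


def leaderboard(conn):
    """Return (business, receipts) rows ordered from most to least receipts."""
    return conn.execute(
        "SELECT business, receipts FROM tallies "
        "ORDER BY receipts DESC, business ASC"
    ).fetchall()
```

A webpage would then just render whatever leaderboard() returns. Any database would do here; SQLite is used only because it needs no setup.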

  emplife21 1

Thank you!

RecursiveBob 3

Seconding what /u/le_crosst said, this is mostly straightforward stuff, with the exception of the OCR. A couple of things to bear in mind:

  • Do NOT do OCR yourself. Use an existing library or service, either the AWS one or one of countless others. OCR takes a lot of time to get right, so you don't want to reinvent the wheel.
  • Make sure the rest of your process is tolerant of errors. OCR is never perfect; there are always going to be receipts where the company name isn't scanned properly and you get a misspelling.
  • If you're making a mobile app, it might be best to use an OCR library that's part of your app rather than part of your site. That way you save on bandwidth and your site is more scalable.
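
On the error-tolerance point, one possible approach (just a sketch, not the only way) is to fuzzy-match the OCR text against the business names you already know instead of requiring an exact match. The example below uses get_close_matches from Python's standard difflib module. The function name match_business and the 0.8 cutoff are assumptions for illustration; you would need to tune the cutoff against real receipts.

```python
import difflib


def match_business(ocr_text, known_names, cutoff=0.8):
    """Return the known business name closest to the OCR text, or None.

    Comparison is case-insensitive. `cutoff` (0 to 1) is an assumed
    similarity threshold: higher means stricter matching.
    """
    # Map lowercase names back to their original spelling.
    lookup = {name.lower(): name for name in known_names}
    hits = difflib.get_close_matches(
        ocr_text.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None


known = ["Acme Bakery", "Blue Cafe"]
print(match_business("Acrne Bakery", known))  # a typical OCR misread: "m" read as "rn"
print(match_business("Zzz Hardware", known))  # nothing similar enough in the list
```

Receipts that come back with no match could go into a queue for manual review rather than being dropped or counted against the wrong business.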