Metspitzer
I take screenshots that contain a lot of text. Is there a built in
program (Win7) that will convert the image to text?
Metspitzer said: I take screenshots that contain a lot of text. Is there a built in
program (Win7) that will convert the image to text?
Perhaps if you load it into AcrobatX and OCR it?
Metspitzer said: I take screenshots that contain a lot of text. Is there a built in
program (Win7) that will convert the image to text?
Anything that says "OCR" or "Optical Character Recognition" should work.
Something called "ABBYY FineReader" came with one of my scanners.
I can't find anything inbuilt into Win7 but this:
http://www.sevenforums.com/software/217440-victory-irfanview-ocr.html
might be useful.
J.
Metspitzer said: I take screenshots that contain a lot of text. Is there a built in
program (Win7) that will convert the image to text?
James Silverton said: PureText may do what you want. It's free and I use it a lot.
He did say built in - or is AcrobatX part of 7?
(As others have said, what you need is OCR: screenshots of plain text
should give near 100% accuracy. There are a few free ones, or if you
have a scanner, I'd be slightly surprised if it didn't come with some.)
I use the ABBYY OCR program that came with my scanner.
I don't know the answer (though I suspect not), but if you have Office,
that has some OCR ability.
What are you going to do with the non-text parts?
How are you going to handle overlapping window parts?
Is there a reason you can't just use highlight-and-copy anyway?
I bookmarked this. I have used IrfanView before. It seemed useful.
Or you could have a read of:
http://answers.microsoft.com/en-us/...ionality/0c90f381-40cb-41ad-8e5e-25831dd8989f
which is very authoritative.
Sort of.
You're looking for OCR. (That's a general function,
to go from a pixmap, to a string of text, perhaps
output in Word format.)
And generally that's something you pay for. I don't
know if any of the free ones are "worthy" or not.
http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software
*******
But another area that tries to do things like that
is "screen readers" or text-to-voice functions. They
need to vocalize the text they see on the screen,
for the visually impaired. This doesn't immediately
solve your problem, but the article shows there are
other "hooks" in the system, that can help acquire
the text strings you want.
http://en.wikipedia.org/wiki/Screen_reader
You would need a screen reader, that happens to keep a
text copy of "what it saw". That then, would be a
"poor man's OCR", relying on messages from the system
for the details. That is better than starting from
scratch, picking apart pixmaps.
Paul
Metspitzer said: OCR. Got it.
Thanks
I did a test, and you can see a "partial" result here.
http://imageshack.us/a/img849/3530/mak3.png
There is a problem with your idea. The problem with screen
captures is things like ClearType. If your OS has
ClearType enabled, it puts "color fringes" around
the letters.
http://en.wikipedia.org/wiki/Cleartype
*******
For my experiment, I chose to view some text in a web browser
(rather than some dialog box).
I chose a couple ways to capture the web page. One was "Export to PDF",
which avoids ClearType and renders the web page into a PDF. That
gives a clean copy of the screen. I converted the PDF to an image, so
I could pretend that the test file came from a paper scanner.
For the second method, I used "screen capture" of the web page.
Doing a screen capture also captures the
effects of ClearType.
In my Imageshack screenshot, the upper left is an "Export To PDF"
method, while the lower left is via screen capture. You can see
the color fringes around the text in the lower left.
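For anyone who wants to check a capture for those fringes without zooming in, here is a rough sketch in plain Python. It assumes the screenshot has already been decoded into a list of (r, g, b) tuples by some image library, and that the text is black on white, so any strongly tinted pixel is suspect. The function name and the tolerance value are made up for illustration.

```python
def fringe_fraction(pixels, tolerance=16):
    """Return the fraction of pixels that are noticeably non-gray.

    pixels: iterable of (r, g, b) tuples with values 0-255.
    tolerance: hypothetical cutoff for how far apart the three channels
    may be before a pixel counts as tinted.
    """
    total = 0
    tinted = 0
    for r, g, b in pixels:
        total += 1
        # A pure gray pixel has r == g == b; a color fringe does not.
        if max(r, g, b) - min(r, g, b) > tolerance:
            tinted += 1
    return tinted / total if total else 0.0


# Plain black-on-white text: every pixel is a pure gray level.
plain = [(0, 0, 0), (255, 255, 255), (128, 128, 128), (255, 255, 255)]
# Subpixel-rendered text: edge pixels pick up orange or blue tints.
fringed = [(0, 0, 0), (255, 180, 120), (120, 170, 255), (255, 255, 255)]

print(fringe_fraction(plain))    # 0.0
print(fringe_fraction(fringed))  # 0.5
```

A capture that contains colored links or pictures will also score above zero, so this only means something for plain text regions.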
When I ran OCR on the image in the lower left (with the color
fringes), the recognition rate was 0%. Nothing got captured.
There was no text to wipe over and copy/paste.
For the view in the upper right, I took a picture copy
of the PDF (so the OCR could work on it), and brought it over
to my OCR tool. You can see in the upper right "results",
I managed to wipe over some selections. In Acrobat Paper Capture,
if you can wipe the text cursor over the surface of the document,
and things highlight, that means the OCR step worked properly.
Since Adobe Paper Capture (in Acrobat), layers the text strings
on top of the original image, you can check for proper character
recognition, by looking for differences between the string
on top of the image, and the image itself underneath. In my upper-right
example, you can see there are no differences, or 100% recognition
in the sample area. (I zoomed in, to make those examples easier
to see, but the whole document on the upper right, was clean like that.)
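If the OCR output can be copied out as text, the same check can be put into numbers instead of done by eye. This is only a sketch: it assumes you already have the text you expected (typed or pasted from the original page) and the text the OCR tool produced, and it uses Python's difflib similarity ratio as a stand-in for a "recognition rate". The function name is made up.

```python
from difflib import SequenceMatcher


def recognition_rate(expected, recognized):
    """Return a similarity ratio between 0.0 and 1.0 for two strings.

    This is difflib's ratio (twice the matched characters divided by the
    total length of both strings), not a formal OCR accuracy metric.
    """
    matcher = SequenceMatcher(None, expected, recognized, autojunk=False)
    return matcher.ratio()


expected = "Screen capture of plain text"

# Clean source: the OCR text layer matches the image exactly.
print(recognition_rate(expected, expected))  # 1.0
# Fringed source: nothing was recognized at all.
print(recognition_rate(expected, ""))        # 0.0
```

Anything between the two extremes points at individual misread characters, which is what the layered text in Acrobat lets you spot visually.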
Summary: Screen capture sucks as an information source, unless you're
very careful to turn off any screen anti-aliasing method.
Paul
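If turning off the anti-aliasing is not an option, one thing that could be tried is stripping the color out of the capture before handing it to the OCR tool. The sketch below is an assumption, not something tested in this thread: it converts each (r, g, b) pixel to a gray level with the standard BT.601 luma weights and then forces it to pure black or white. The function names and the threshold of 128 are made up, and whether a given OCR engine then reads the result is not guaranteed.

```python
def to_gray(pixel):
    """Convert one (r, g, b) pixel to a 0-255 gray level (BT.601 luma)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(pixels, threshold=128):
    """Map each pixel to 0 (black) or 255 (white).

    threshold is a hypothetical cutoff; real captures may need tuning.
    """
    return [0 if to_gray(p) < threshold else 255 for p in pixels]


# One black text pixel, two tinted fringe pixels, one white background pixel.
fringed = [(0, 0, 0), (255, 180, 120), (120, 170, 255), (255, 255, 255)]

print(binarize(fringed))  # [0, 255, 255, 255]
```

With this threshold the light fringe pixels fall to white, so the tint is gone, at the cost of slightly thinner letter shapes.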
Thanks for that info. It is really a shame. I would have thought
that a computer would be pretty good at recognizing typed text.
Gene said: And you would have been right.