released in TLY2016.
created by this new feature using other fonts. I usually use the Siddhanta and
Sanskrit2003 fonts.
can provide a few more sample PDFs in Devanagari for testing.
Post by ShreeDevi Kumar
Testing dev-actualtext.pdf sent by JK
* Adobe Acrobat Reader XI on Windows 10
o Does not highlight text fully
o SEARCH finds words and word parts correctly but usually
highlights only beginning of the word containing the letter
o COPY paste to NOTEPAD++, OPENOFFICE WRITER works correctly,
o Save as TXT file does not work correctly - only saves ... in it,
not the actual unicode text which can be copied
So it looks like Acrobat makes use of the ActualText for Search and Copy,
but sadly its "Save as Text" doesn't support Unicode.
I'm pleasantly surprised to see the Gmail previewer also handles it.
The others (Foxit, Edge) sound like they're just working from the glyph
stream, which is basically doomed to failure.
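
To make the difference concrete, here is a minimal Python sketch of what an ActualText span looks like inside a page content stream. It assumes the usual PDF convention of a hex string holding UTF-16BE text with a leading FEFF byte-order mark. The helper names `actualtext_hex` and `actualtext_span` and the glyph IDs are made up for illustration, and the output is only a fragment (no BT/ET or font selection); it is not necessarily byte-for-byte what xdvipdfmx writes.

```python
def actualtext_hex(text: str) -> str:
    """Encode text as a PDF hex string body: UTF-16BE preceded by a FEFF BOM."""
    return "FEFF" + text.encode("utf-16-be").hex().upper()


def actualtext_span(text: str, glyph_hex: str) -> str:
    """Wrap a glyph-showing operator in a marked-content span with ActualText.

    glyph_hex stands in for font-specific glyph IDs; the value used below
    is a placeholder, not taken from any real font.
    """
    return (
        f"/Span << /ActualText <{actualtext_hex(text)}> >> BDC\n"
        f"  <{glyph_hex}> Tj\n"
        "EMC"
    )


if __name__ == "__main__":
    # Hypothetical glyph IDs; only the ActualText part is meaningful here.
    print(actualtext_span("क्या", "01A2003F"))
```

A viewer that honours ActualText returns the string in the span for Copy and Search; a viewer that ignores it has only the glyph IDs in the `Tj` operand to go on, which for reordered or ligated Devanagari glyphs do not map back cleanly to the original characters.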
For a further data point, I tried Evince (Document Viewer) on Ubuntu
15.10, and found that Copy and Search work well; it looks like it is using
the ActualText correctly. This is thanks to the poppler library, I believe.
The (poppler-based) "pdftotext" tool was also able to extract the Unicode
text correctly from the PDF, although "pdftohtml" didn't do so well.
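
For anyone who wants to repeat the pdftotext check on other sample files, a rough Python sketch follows. It assumes poppler's `pdftotext` binary is on the PATH; the function names `extract_text` and `devanagari_fraction` are my own, and the "fraction of Devanagari characters" measure is only a crude sanity check, not a proof that the text came out in the right order.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Run poppler's pdftotext and return the extracted text.

    Assumes pdftotext is installed and on PATH; "-" sends output to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8")


def devanagari_fraction(text: str) -> float:
    """Fraction of non-whitespace characters in the Devanagari block (U+0900-U+097F)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum("\u0900" <= c <= "\u097f" for c in chars) / len(chars)
```

Calling `devanagari_fraction(extract_text("dev-actualtext.pdf"))` on a Devanagari-only sample should give a value close to 1.0 if the Unicode text is being extracted, and a much lower value if the tool is falling back to glyph codes.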
One issue with Evince is that drag-selecting text to highlight it (as for
Copy/Paste) looks bad: the highlighting completely obscures the selected
text, although it will end up being copied correctly. Interestingly, its
highlighting of search results doesn't suffer from this problem, and it
even makes a fair attempt (not completely accurate) at highlighting
specific letters within a word, not just entire words.
JK
Post by ShreeDevi Kumar
* Foxit Reader 7.3 on Windows 10
o Highlights text fully,
o smallest highlight unit is word,
o COPY paste to notepad++ as well as SEARCH does NOT work
correctly as Unicode text is not fully correct.
ूय
िनकोड क्या ह ? ै
o Save as TXT file does not work correctly - saves the Unicode
text with the same problems as in copy and paste
* Microsoft Edge Viewer on Windows 10
o Highlights text fully,
o COPY paste to notepad++ as well as SEARCH does NOT work
correctly as Unicode text is not fully correct.
य ूिनकोड क्या है?
* Previewing from within Gmail in Chrome on Windows 10 -
o Highlights text fully,
o smallest highlight unit is word,
o COPY paste to NOTEPAD++, OPENOFFICE WRITER works correctly,
o (highlights only first letter of first word in
paragraph यू rather than full word यूनिकोड)
o there is NO SEARCH feature
o there is no save as TXT file feature
* Same as above while previewing from within Gmail in Internet
Explorer on Windows 10
ShreeDevi
____________________________________________________________
Using Akira-san's "actest.pdf" as sample, Adobe Acrobat Pro 7.1 allows
me to select only half of the text whereas Adobe Reader DC allows me to
select it all; neither allows me to select individual kanji.
Ah, right... as there are no spaces between the kanji, they'll end
up in the same text object. That's a shortcoming of how the current
implementation works, for scripts that don't use inter-word spaces.
In either case, copy&paste actually gives you the whole text, even
though AAPro only highlights half of it, I guess?
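
A toy Python illustration of the inter-word-space point, assuming the text is simply chunked at whitespace (my reading of the description above, not a copy of the real code; `text_units` is a made-up name and both sample strings are just examples):

```python
def text_units(text: str) -> list[str]:
    """Split text into space-delimited units, one per 'word'.

    This mimics chunking at inter-word spaces only; it is an assumption
    about the behaviour, not the actual implementation.
    """
    return text.split()


if __name__ == "__main__":
    # Devanagari uses spaces between words, so each word is its own unit.
    print(text_units("यूनिकोड क्या है?"))
    # A run of kanji/kana has no spaces, so it stays one single unit.
    print(text_units("日本語の文章"))
```

With chunking like this, a viewer that selects per unit can pick out individual Devanagari words but can only select the unspaced Japanese run as a whole.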
JK
--------------------------------------------------
http://tug.org/mailman/listinfo/xetex