Font Embedding in PDF object - Essential Objects, Inc. Support Forum

Font Embedding in PDF object

Thomas D. Greer

Posted: Thursday, November 10, 2011 9:12:54 AM


I'm testing the PDF .NET product for HTML to PDF conversion. In terms of layout, the resultant PDFs look fine, but I need to post-process the PDFs (text extraction for Postal Processing).

I'm concerned that the fonts are CID with Identity-H encoding.

Also, the "strings" are fragmented. If I select an address block and copy/paste it into Notepad, for example, I get one line per word instead of the original discrete lines.

Can font type and encoding be specified / controlled?
Can tolerances for whitespace, vertical / horizontal offsets, font sizes, etc. be adjusted to "keep strings together"?

eo_support

Posted: Thursday, November 10, 2011 5:27:22 PM


Hi,

Sorry about the delay. We use Identity-H encoding together with a ToUnicode map. The values in the content stream are CID values; you can use the ToUnicode map to translate each CID value to its Unicode value. You cannot control the font type or encoding.
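
To illustrate the ToUnicode lookup described above, here is a minimal Python sketch. It is not part of the PDF product; the names `parse_tounicode` and `decode_cids` and the sample CMap are made up for illustration. It assumes 2-byte CIDs (as with Identity-H), a well-formed CMap, and only the simple `bfchar` and `bfrange` forms (array-form `bfrange` entries are skipped). The CMap text and the hex strings would first have to be pulled out of the PDF, and decompressed, with a PDF library of your choice.

```python
import re

HEX = r"<([0-9A-Fa-f]+)>"

def parse_tounicode(cmap_text):
    """Build a {CID: unicode string} dict from the text of a ToUnicode CMap."""
    mapping = {}
    # bfchar entries: <srcCID> <dstUTF16BE>
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(HEX + r"\s*" + HEX, section):
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    # bfrange entries: <loCID> <hiCID> <dstStartUTF16BE>
    for section in re.findall(r"beginbfrange(.*?)endbfrange", cmap_text, re.S):
        # Drop array-form entries, which this sketch does not handle.
        section = re.sub(HEX + r"\s*" + HEX + r"\s*\[.*?\]", "", section, flags=re.S)
        for lo, hi, dst in re.findall(HEX + r"\s*" + HEX + r"\s*" + HEX, section):
            base = bytes.fromhex(dst).decode("utf-16-be")
            for offset, cid in enumerate(range(int(lo, 16), int(hi, 16) + 1)):
                # Successive CIDs map to successive code points.
                mapping[cid] = base[:-1] + chr(ord(base[-1]) + offset)
    return mapping

def decode_cids(hex_string, mapping):
    """Translate a content-stream hex string of 2-byte CIDs into text."""
    data = bytes.fromhex(hex_string)
    cids = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    # Unmapped CIDs become U+FFFD (the replacement character).
    return "".join(mapping.get(cid, "\ufffd") for cid in cids)

# Hypothetical CMap fragment, for illustration only.
SAMPLE_CMAP = """
2 beginbfchar
<0001> <0048>
<0002> <0069>
endbfchar
1 beginbfrange
<0003> <0005> <0041>
endbfrange
"""

mapping = parse_tounicode(SAMPLE_CMAP)
print(decode_cids("00010002", mapping))      # prints "Hi"
print(decode_cids("000300040005", mapping))  # prints "ABC"
```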

As for the string fragmentation issue, we do not have much control over it. Usually, if your text uses the same font within the same element, it will come out "together" from Adobe Reader. However, there is no firm rule for how Adobe Reader decides which piece of text comes after which. What we do is render the text at the right location; how to connect text segments at different locations into a single sentence is a totally different matter, and sometimes it is not possible. For example, if your HTML has three words, "this", "is", and "great", artistically arranged in different fonts at different locations, as in print ads, then all we do is render three text segments precisely where they are supposed to be. There is no way for us or Adobe Reader to figure out that this is actually a single sentence.
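
Since the segments are rendered at the right locations, one possible workaround is to regroup them on the extraction side using their positions. The sketch below is only an illustration of that idea, not a feature of the PDF product. It assumes a text extraction tool that reports each fragment as an `(x, y, text)` tuple in PDF coordinates (y increasing upward); the function name `group_into_lines` and the `y_tolerance` parameter are hypothetical.

```python
def group_into_lines(fragments, y_tolerance=2.0):
    """Join positioned text fragments into lines.

    fragments: iterable of (x, y, text) tuples in PDF coordinates.
    Fragments whose baseline is within y_tolerance of the first
    fragment of the current line are treated as the same line.
    """
    lines = []  # each entry: [baseline_y, [(x, text), ...]]
    # Sort top to bottom (larger y is higher on the page).
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0] - y) <= y_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    # Within each line, order fragments left to right and join with spaces.
    return [" ".join(t for _, t in sorted(words)) for _, words in lines]

# Made-up fragments resembling a two-line address block.
fragments = [
    (72, 700.0, "John"), (100, 700.4, "Smith"),
    (72, 686.0, "123"), (95, 686.0, "Main"), (125, 685.8, "St"),
]
print(group_into_lines(fragments))  # prints ['John Smith', '123 Main St']
```

A fixed tolerance like this only works for plain, horizontally set text such as an address block; it cannot recover reading order for freely arranged text like the print-ad example above.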

Hope this helps. Please feel free to let us know if you have any more questions.

Thanks!
