
Fully Searchable PDF output How To Fix. Text Width Factor, Oblique Angle, Mtext


THansen

I have been working to correct issues we did not realize we had with PDF output from AutoCAD DWG and Inventor IDW files. I have seen several posts related to the same thing so I wanted to share what I have discovered and the fix for it to date.

 

If you are finding that your DWG or IDW files are not FULLY searchable when output to PDF by just about any method, you're not alone!

 

ROOT CAUSE

1.) Any Text Style using a SHX font will not convert to searchable text. Instead it is output as an image.

 

WHY - SHX fonts are unique to Autodesk products and were developed for the pen plotter days. In today's world there is no reason to be using them. Since they are unique to Autodesk and "old", they are not a native font to your modern OS. As a result, when you try to process the file with a PDF print driver, pc3 file, etc., the program does not recognize the font and thus outputs it as an image. Images are not searchable text!

 

FIX - Change the font used in all Text Styles to a TrueType font (TTF) or an OpenType font with TrueType Outlines. To see what fonts are on your system, go to START>SEARCH>FONTS and double-click a font. The top left corner indicates what kind of font it is. For Arial Black (OpenType) the information given is "OpenType Layout, Digitally Signed, TrueType Outlines".

 

Try to "standardize" on fonts with these features:

A.) A font family with a lot of choices for different "widths" and "styles" like condensed, narrow, Bold, Italic, Narrow Italic, etc.

 

B.) A font that is designed for different languages to cover your localization needs. This makes translating drawings much easier at a later date. Arial is a good choice since it is designed for Latin, Greek, Cyrillic, Hebrew, and Arabic.

 

When applying the new font to existing Text Styles use the existing Text Style's Text Width Factor (TWF) setting to clue in on what font to use. If the TWF is 0.85 then maybe Arial Narrow is a good fit.

 

2.) Any TTF that has been altered in any way by AutoCAD will not be recognized as a standard font and will convert as an image.

 

A.) The TWF MUST be 1.0

Instead of using TWF to condense or expand text, use a style from the font family: Arial Narrow, Arial Narrow Bold, etc. You may have to use other fonts to accomplish this, as Arial may not have enough choices in the font family.

 

FIX - Use lisp to change all EXISTING text to a TWF of 1.0. This includes single line text, Mtext, text in blocks, text in dynamic blocks, text in attributes, etc.

 

B.) The Text Oblique Angle (TOA) must be 0.0

Instead of using the TOA use different font styles like Italic for example.

 

FIX - Use lisp to change all EXISTING text to a TOA of 0.0. This includes single line text, Mtext, text in blocks, text in dynamic blocks, text in attributes, etc.

 

C.) Any text inside a block or object that has not been scaled equally in X and Y will not be recognized as "text" and will be converted as an image. You can scale a block; the scaling just has to be symmetrical. Scaling to x=1.5, y=1.5 works. However, x=1.5, y=2 does not. Again, the program cannot match the shape of the font to a "standard" font.

 

FIX - Use lisp to re-scale all blocks to be symmetrical or explode the blocks.
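
The root-cause rules above (1, 2A, 2B, 2C) can be written down as a small pre-flight check. This is only a sketch in Python, not the LISP used on real drawings: the `TextInfo` record and its field names are hypothetical, and you would have to fill them from your own drawing data. It just shows the four conditions side by side.

```python
import math
from dataclasses import dataclass


@dataclass
class TextInfo:
    """Hypothetical record describing one text element (not an AutoCAD API)."""
    font_file: str              # e.g. "romans.shx" or "arial.ttf"
    width_factor: float = 1.0   # TWF
    oblique_deg: float = 0.0    # TOA
    scale_x: float = 1.0        # scale of the enclosing block, 1.0 if none
    scale_y: float = 1.0


def searchability_issues(t):
    """List the reasons, per the rules above, why this text may not be searchable."""
    issues = []
    if t.font_file.lower().endswith(".shx"):
        issues.append("SHX font")
    if not math.isclose(t.width_factor, 1.0, abs_tol=1e-6):
        issues.append("width factor is not 1.0")
    if not math.isclose(t.oblique_deg, 0.0, abs_tol=1e-6):
        issues.append("oblique angle is not 0.0")
    if not math.isclose(t.scale_x, t.scale_y, rel_tol=1e-6):
        issues.append("non-uniform block scale")
    return issues


# A title-block style text with TWF 0.85 inside a block scaled x=1.5, y=2
print(searchability_issues(TextInfo("arial.ttf", 0.85, 0.0, 1.5, 2.0)))
```

An empty list means none of the four conditions was hit; it does not guarantee that a given PDF driver will produce searchable text.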

 

In the process of doing this you can also change text height and other attributes of text elements.

 

WHAT'S BEEN TRIED

Since we have 90,000 plus drawings with the above issues, I have tried several programs to get a fully searchable PDF, with limited success. In all cases except for Acrobat Pro the result is some unsearchable text. This includes the following: Adobe PDF, CutePDF, ClarityPDF, PrimoPDF, DWG to PDF.pc3, etc. I even tried exporting to DWF or DWFx first and then converting to PDF, and got the same results.

 

OCR

Typically our drawings are a mixed bag: some have ALL SHX fonts, some have altered TTF, and some have all unaltered TTF. 90% have SHX and altered TTF, since our title blocks use TTF but with a TWF from 0.8 to 0.95.

 

The PDF output of these yields some searchable text and some that is not, with most of it being non-searchable.

 

We tried processing with OCR but the results are not very accurate AND most pages cannot be OCR processed anyway. Text elements that are close to lines and other objects confuse the OCR process. This was most notable in item balloons, revision triangles, and BOM tables. Having a searchable BOM is one of the key needs for us.

 

The OCR process also converts the ENTIRE drawing to an image. It then "scans" for "shapes" and tries to match them to existing fonts on your system. It puts this OCR output on a hidden layer directly under the text. This is why when you try to highlight text the area selected is not directly under the actual text.

 

With Acrobat Pro......if the page contains a single character of "rendered" text, the program assumes the entire page has already been OCRed and will not allow you to process the rest of the page. Adobe.....??? So any page that has unaltered TTF will come over as rendered text and OCR will not work!

 

ACROBAT PRO

Acrobat Pro works provided you process the DWG file directly and you have to go into the Acrobat settings and set the PATH statement for the location of the SHX fonts, plotter, and plot config. files on your system. Acrobat then substitutes a TTF font for each SHX font found in the file when producing the output. It does the same for fonts that have been altered by comparing the "shape" to its database of fonts and substituting one that is close. It can even create a new font on the fly if needed. This is not a fast process as it takes some processing HP to do this.

 

FIX THE ROOT CAUSE

I have read that AutoCAD 2016 has some better tools for producing searchable PDF, but our workflow creates the PDF from a DWF file. We use Vault and it creates DWF or DWFx visualization files. It will not create a PDF w/o purchasing an add-on. So for us the only way to correct this is to fix the root cause.

 

LISP PROGRAMS

I found a lisp by Lee Mac called FixAllText that comes close to doing what is needed. I modified the lisp to include code that changes the existing text styles to use the Arial font and sets all Text Style attributes as needed, i.e. TWF=1.0 and TOA=0.0. I did not change text height, color, layer, etc.

 

What is missing is code that will look at existing Text Style "TWF" settings and automatically choose a font family style based on existing TWF setting.

 

I would like to see variables at the top of the code for something like the following:

 

If TWF is in the range 0.95 to 1.05 use ARIAL REGULAR

If TWF is in the range 0.95 to 1.05 AND the TOA is not 0.0 use ARIAL REGULAR ITALIC

If TWF is in the range 0.85 to 0.94 use ARIAL NARROW BOLD

If TWF is in the range 0.75 to 0.84 use ARIAL NARROW

If TWF is in the range 1.06 to 1.15 use ARIAL BOLD

If TWF is in the range 1.16 to 1.50 use ARIAL BLACK REGULAR

etc.
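
To make the ranges above concrete, here is one way the lookup could be laid out, with the table kept at the top so it is easy to edit. It is a Python sketch rather than LISP, and `FONT_BY_TWF`, `ITALIC_FONT` and `pick_font` are made-up names. The font names are placeholders for whatever your system actually reports. TWF is rounded to two decimals first, so values such as 0.945 do not fall between the ranges.

```python
# (low, high, font), both ends inclusive, compared after rounding TWF to 2 decimals.
# The ranges are the ones proposed in the post; adjust them to suit your styles.
FONT_BY_TWF = [
    (0.75, 0.84, "Arial Narrow"),
    (0.85, 0.94, "Arial Narrow Bold"),
    (0.95, 1.05, "Arial"),
    (1.06, 1.15, "Arial Bold"),
    (1.16, 1.50, "Arial Black"),
]
ITALIC_FONT = "Arial Italic"  # used when TWF is near 1.0 and the text was obliqued


def pick_font(twf, toa=0.0):
    """Return a replacement font name for a style, or None if no range matches."""
    twf = round(twf, 2)
    for low, high, font in FONT_BY_TWF:
        if low <= twf <= high:
            # Only the "regular" range has an italic rule in the proposal above.
            if font == "Arial" and toa != 0.0:
                return ITALIC_FONT
            return font
    return None


for twf, toa in [(1.0, 0.0), (1.0, 15.0), (0.9, 0.0), (1.3, 0.0)]:
    print(twf, toa, pick_font(twf, toa))
```

A `None` result means the style falls outside every range and needs a manual decision. After the font is swapped, the style's TWF and TOA would still have to be reset to 1.0 and 0.0 as described earlier.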

 

Or, if there is a way with lisp to determine how close the start and stop points of text are to other elements, then base the font selection on that??

 

Code I have to date is attached.....Any help is appreciated

FixAllText.lsp

ttray33y

That process is tedious, in my opinion.

There's an easy solution for this using just Adobe Acrobat, as long as the print definition was set during the drafting cycle.

The benefit of the method I know is that you don't need TT fonts, you don't change anything inside your drawings, and you don't need OCR (which is next to useless). Just set the configuration once and press print (even batch print is possible).

 

-NO TT fonts

-NO OCR

-NO LISP

 

just pure Adobe Acrobat utilization.

see my sample attached PDF.

Font type is just romans.shx

File from AutoCAD P&ID 2014 sample

Software used to print: Adobe Acrobat Pro XI

12.pdf

SLW210

I haven't checked, but I believe starting with AutoCAD 2016 DWG to PDF, SHX Fonts were searchable.

THansen

I fully agree Acrobat Pro will do the job. But with our current workflow that does not help. Maybe someone can recommend a different workflow or script that will give us a workaround.

 

For each truck we produce we generate a truck manual for archival purposes and use by our service/parts and dealer personnel. We produce about 35 trucks a month. We build highly customized trucks so BOM is different for each one and thus manual content is different for each.

 

We produce the manual using CURRENT documents from the Vault file store, since new drawings are created daily and others are revised daily. We need to make sure manual content is as up to date as possible.

 

Current workflow is as follows:

 

1.) All of our DWG files are stored in Autodesk Vault Pro 2015.

 

2.) Folder structure in Vault is several folders and sub folders deep.

$/Engineering/600/692/692-005000/692-5124.dwg

$/Engineering/711/711-004000/711-4000.dwg

$/Engineering/711/711-036000/711-36201.dwg

$/Engineering/800/800-006000/800-6507.dwg

etc.

 

3.) Vault will automatically generate a visualization file upon CHECK-IN. The only file formats it will create are DWF and DWFx. We currently use DWFx since it has enhanced design review capabilities. Vault will also place a copy of the DWFx in a folder outside of Vault for access by non-Vault users. We use this feature so we have access to the DWFx files for multi-page PDF generation. The file folder structure replicates that of the Vault, so it is multi-level folders.

 

4.) We have a separate script that produces a list of DWFx file names with the full path to each DWFx file in the network folder. The script copies all the needed DWFx files to a folder on the client machine. All DWFx files end up in one folder, i.e. a flat folder structure.

 

5.) To generate the multi-page PDF we use Design Review's Batch plot feature to batch convert the DWFx to PDF. When doing this the client machine's default printer is set to Adobe PDF.

 

6.) After all the separate PDF files are created, we use Adobe Acrobat to combine them all into a multi-page PDF file and then QA the manual, rotating pages, etc.
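
For anyone wanting to reproduce step 4, the copy-to-one-folder part could look roughly like the sketch below. It is not our actual script. `flatten_copy` is a made-up name, the paths in the example are placeholders, and how the list of file names gets built is left out. The same idea would apply to DWG files if they could be pulled out of Vault.

```python
import shutil
from pathlib import Path


def flatten_copy(source_paths, dest_dir):
    """Copy each listed file into dest_dir, dropping its folder structure."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    seen = set()
    copied = []
    for src in map(Path, source_paths):
        # Two different folders could hold the same file name; in a flat
        # folder one would silently overwrite the other, so stop instead.
        if src.name in seen:
            raise ValueError(f"duplicate file name in list: {src.name}")
        seen.add(src.name)
        target = dest / src.name
        shutil.copy2(src, target)  # copy2 also keeps the file's timestamps
        copied.append(target)
    return copied


# Example call (placeholder paths, not real locations):
# flatten_copy([r"N:\DWFx\Engineering\711\711-004000\711-4000.dwfx"], r"C:\ManualBuild")
```

Files already sitting in the destination folder from an earlier run are overwritten, which suits a workflow that always wants the current revision.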

 

 

DESIRED WORKFLOW

OPTION 1

If we could get the Vault to copy the needed DWG files out of Vault and into a single file folder on the client machine then we could use Acrobat Pro to process the DWG files directly which works. Acrobat will also combine all the individual PDF files into a single multi-page document.

 

OPTION 2

Our service, parts, sales, etc. would prefer there be a PDF version of all the released documents in Vault so they could use that format in emails instead of the DWFx format. Dealers, vendors, and customers don't know what a DWFx is and don't want to mess with it. They want PDF. We also want to eliminate the need for having Vault copy all visualization files to a network folder since all internal personnel now have Vault Client and can access the files directly in the vault.

 

But, we still need to feed Vault a list of file names and have it either copy the PDF files to a client folder or have it build the PDF internally on server side. The server side is desired so that client machine is not burdened with the process.

 

We are hoping there is a way to do this with a custom script but don't have internal personnel to do this.

 

Thus, my search for help on these forums

ttray33y
SLW210 wrote: "I haven't checked, but I believe starting with AutoCAD 2016 DWG to PDF, SHX Fonts were searchable."

We tried 2016 and yes it made it searchable but my drafter said my way was easier than browsing google. :)
