searchable text in PDF

Hi,

I'm trying to export a DWG to a PDF with searchable text, but all my text is
blurred when I open the PDF.

Can you help me with that?

Comments

  • Danijal,

    The reason your text looks strange is that your text is no longer text.

    I don't have Bricscad V11, so I can't work out how you created the PDF, but if you look at the attached PDF, all the letters are made up of short line segments, with even shorter segments used to approximate curves.

    I can't say whether the PDF creator in Bricscad V11 could handle text, or whether you exploded the text before creating the PDF.

    I know this doesn't help much, but maybe it enables you to understand what is actually going on.

    David

    Text.pdf

  •  PDFs created by BricsCAD can definitely have problems, and can vary, depending on how they are created.

    In general, Mtext is the best choice for searchable text.  Any text in a viewport that is set to anything but "Legacy 2D Wireframe" or "Legacy Hidden" will become a raster image. So, any text or objects are converted to a series of dots.  This means it is not searchable as text, and also tends to make the PDF very large.  In some cases, the pixelation is quite visible.

    -Joe
  • @ Danijal:
    The blurred text is mainly caused by the lineweights for SHX fonts being somehow scaled in the PDF exported by BricsCAD V11, so that SHX fonts appear ultra thin in the PDF. You can do two things: either increase the lineweight for SHX fonts (this will probably involve using a unique color and a dedicated .ctb file), or switch to a TTF font. I think the latter is the better choice. Especially if you consider that later versions of BricsCAD can no longer create searchable PDF text from SHX fonts, as mentioned by Anthony.
    I know that there are companies that swear by the isocp font. If that is the case for you, you can try to find a (free) TTF lookalike font.
  • Anthony,

    I didn't even think to try and search in the PDF!

    You are correct: you can find anything you type in, even the words with accents. I can search in both Acrobat Reader and PDF-XChange Viewer.

    The fonts seem to be embedded, but the view I get isn't great in Acrobat Reader if I zoom in to single words (see attached PDF).

    Conversely, the view I get of the same text in PDF-XChange Viewer is much more linear, with all parts of the font the same thickness, although the curves are still a collection of straight lines in both viewers. (A PDF created from a screen grab doesn't do the view justice, as it becomes pixelated.)

    That is as far as my knowledge takes me, sorry.

    Text2.pdf

  • I, too, see straight line approximations of curved lines when I zoom in on Danijal's text. But I think that's just the nature of SHX fonts.
    They look that way in Bricscad too, i.e. no real curves. TrueType fonts have real curves both in Bricscad and in any PDF file produced from it.

    Is that what everyone means by blurred text? That would explain why everyone else sees blurring and I don't.

    An interesting point in regard to searching:
    In the files David posted, and in a PDF print I made from Danijal's file, the text is not searchable.
    I assume that's because the PDF printer drivers are refusing to use SHX fonts.
  • I think when the OP says 'blurred' he is referring to the greyed-out look of the text in Acrobat Reader.

    I recently came across a PDF of a low-res scanned document that nevertheless had searchable text. I do not know how this works, but I believe the V11 PDF export uses something similar: the SHX fonts do end up as loose line segments in the PDF, but these are somehow recognised as text.
  • The scanner might have used optical character recognition, to produce the text without knowing it was generated by an SHX font.

  • Especially if you consider that later versions of BricsCAD can no longer create searchable PDF text from SHX fonts, as mentioned by Anthony.


    This isn't my experience. BricsCAD V15 will create searchable text for both SHX and TTF. If anything, the feature has improved with more recent versions. The key to creating a searchable PDF is to use the built-in PDF creator, which is available using EXPORT. Unfortunately you can't access this from PRINT, and I've had no success using 3rd party PDF printers. In V15 the easiest way to access the built-in PDF option is via the PUBLISH command. From here you can select PDF from the 'Publish To' dropdown. The built-in PDF option really improved from V13 on, so if you're using an older version I wouldn't expect this to work.

    Another consideration is your PDF reader. I've found some don't recognise that searchable text is available. In terms of creating searchable text, my understanding is that on creation the application generates a word list with location details and includes it with the PDF. You can see this at work with scanned documents in some PDF applications like Foxit and Nitro. Both provide a feature that uses OCR to generate a searchable word list; no visible change to the PDF is made.
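
A rough way to check whether a given PDF contains real text objects at all (as opposed to only line segments or raster images) is to look for text-showing operators in its content streams. The sketch below is only a heuristic using the Python standard library. The function name `has_text_operators` is made up for illustration, and it only understands uncompressed or Flate-compressed streams, so a negative result is not proof that the PDF has no text.

```python
import re
import zlib

# A PDF stream body sits between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Inside a text object (BT ... ET), Tj and TJ are the usual text-showing operators.
TEXT_OBJECT_RE = re.compile(rb"\bBT\b.*?\b(?:Tj|TJ)\b", re.DOTALL)


def has_text_operators(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to contain a text-showing operator."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj() tolerates trailing bytes after the compressed data.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate-compressed (or another filter is used): inspect as-is.
            data = raw
        if TEXT_OBJECT_RE.search(data):
            return True
    return False
```

For a file on disk, something like `has_text_operators(open("drawing.pdf", "rb").read())` would be the intended use, where `drawing.pdf` is a placeholder name.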

    I've attached two examples generated using V15. One includes a mix of SHX and TTF fonts. In the other I've used font mapping, by creating a default.fmp file that substitutes TTF variants for the SHX fonts. To access the most common SHX fonts as TTF you can install DWG Trueview, which includes them.
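
For anyone who wants to build such a mapping file, here is a minimal sketch that formats one. It assumes the file holds one `original;substitute` pair per line, which is the layout AutoCAD-style .fmp files use; check the default.fmp shipped with your version before relying on it. The helper name `format_font_map` and the font file names are hypothetical and only there for illustration.

```python
def format_font_map(mapping):
    """Return font-mapping text with one 'original;substitute' pair per line."""
    return "".join(
        f"{original};{substitute}\n" for original, substitute in mapping.items()
    )


# Hypothetical font names; use the ones actually installed on your system.
example_mapping = {
    "romans.shx": "romans__.ttf",
    "isocp.shx": "isocp.ttf",
}

fmp_text = format_font_map(example_mapping)
print(fmp_text, end="")
```

The resulting text could then be saved as default.fmp in whatever folder your installation reads font mappings from.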

    Regards, Jason Bourhill CAD Concepts 



    CCL-SearchableText-SHX.pdf
    CCL-SearchableText-TTF.pdf

  •  Perhaps we should ask Bricsys to provide a detailed description of all the issues involved in creating PDFs.  It is frustrating to have to do all the trial and error testing to try to discern the rules of how various entities get converted.

    -Joe
  • @ Jason: In my tests I have compared V11 PDF export and V14 PDF export.
  • I've attached my test drawing in case you want to try it yourself. This is a 2D-only drawing with objects modified in ways that I know have caused issues in other CAD applications, such as:
    • Applying width factor to TTF text styles.
    • Using non-zero elevation.
    • Clipping text in some way (Mask, VCLIP, XCLIP...).
    • Mirroring text.
    My understanding is that in these situations the text, particularly TTF, is rendered as a bitmap approximation of the original text, which can cause issues with generating searchable text. With BricsCAD, however, I had no issues.

    The only difference between the two attached files is the default.fmp font mapping file. Otherwise the drawings are identical.

    I did find that there are differences between SHX files of the same name in BricsCAD and AutoCAD. In recent versions of AutoCAD the vertical text option has been dropped. I think this was done because they also provide TTF versions of the same styles, and vertical text isn't possible with TTF.

    Regarding the confusion about PDFs, I think there are three factors to consider:
    • Comparing different versions of BricsCAD. Certainly there are differences between the versions. In fact, at times there can be differences between updates of the same version. I would expect Bricsys would only respond to queries or issues related to the latest version.
    • Using different PDF printers. Per my previous post I've only really had success by using EXPORT to access the built-in PDF creator. Using PRINT will give a different result.
    • Using different PDF readers. Again different readers will give different results. The same is true if you're trying to view 3D PDFs.

    I think a lot of confusion would be removed if Bricsys provided access to the built-in PDF creator from PRINT, i.e. provided a DWG to PDF.pc3. This would make it a lot more obvious to use, working in a way that people are familiar with. Perhaps it will come with V16!!

    Regards, Jason Bourhill CAD Concepts 

    CCL-SearchableText-TTF.zip
    CCL-SearchableText-SHX.zip

  • I've attached another test example based around a 3D drawing with different visual styles applied to the vports. Here EXPORT to PDF doesn't create searchable text in the vports with a rendered visual style. To see what I mean, try doing a search for "MODEL SPACE TEXT", which has been created using SHX and TTF styles. The text in paperspace, and in non-rendered vports, remains searchable. The quality of the render is really quite good.

    Regards, Jason Bourhill CAD Concepts 

    MixedPrintingStyles-Layout.pdf
    MixedPrintingStyles.zip

  • Thank you for those files Jason.

    Here is my first test with your files from post #13. Note: I am still using V14.
    1. Unpack all files and folders from CCL-SearchableText-SHX.zip. But don't install fonts or plot config files.
    2. Open CCL-SearchableText.dwg by double-clicking.
    3. Export to PDF.
    4. Result: The PDF file is searchable. File size: 336 kB.
    5. Exit BC without saving.
    6. Install CCL-BW.ctb.
    7. Repeat steps 2 and 3.
    8. Result: The PDF file is NOT searchable. File size: 1,199 kB.

    I have to admit that I am stumped. This is bizarre.



  • OK, now I understand what is going on. If the .ctb file is missing the PDF export does not use plot styles (as if PdfUsePlotStyles=0). And in V14 only with PdfUsePlotStyles=0 can you get a searchable PDF. But that is a big limitation since most people would want to create PDFs *with* plot styles.

    Another strange thing: If I create a PDF export from CCL-SearchableText.dwg with PdfUsePlotStyles=1 and the .ctb file installed, the magenta lines appear much thinner than the red lines, even though the lineweight for magenta is 0.70mm and that for red is 0.18mm.
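
The reasoning above can be written down as a tiny truth table. This is only a model of the V14 observations reported in this thread, not BricsCAD's actual logic; the function name is made up, and it assumes PdfUsePlotStyles was set to 1 in both of the earlier test runs.

```python
def v14_export_is_searchable(pdf_use_plot_styles: int, ctb_available: bool) -> bool:
    """Model of the V14 behaviour observed in this thread (an assumption, not an API)."""
    # Plot styles only take effect when requested AND the .ctb file can be found.
    plot_styles_applied = pdf_use_plot_styles == 1 and ctb_available
    # Observed: the exported text is searchable only when plot styles are not applied.
    return not plot_styles_applied


# First test run: .ctb not installed, so plot styles are silently skipped.
print(v14_export_is_searchable(1, ctb_available=False))
# Second test run: .ctb installed, so plot styles are applied.
print(v14_export_is_searchable(1, ctb_available=True))
```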
  • ..... objects modified in ways that I know have caused issues in other CAD applications such as:

    • Applying width factor to TTF text styles   ......
    My understanding is that in these situations the text, particularly TTF is rendered as a bitmap approximation of the original text.
    Because of this it can cause issues with generating searchable text. With BricsCAD however I had no issues.
    .....


    That may explain why I was surprised that Danijal's SHX-generated text is searchable in the PDF file, though others weren't surprised at that.
    The people I work with all use Autocad and an SHX font with a width factor, and I can never search for or select text in their PDF files.
    I use Bricscad and a TTF font with the same width factor, and the text in my PDF files is always selectable and searchable.
    I've always assumed that the difference was in the type of font, but perhaps it's just another example of the superiority of Bricscad.

  • OK, now I understand what is going on. If the .ctb file is missing the PDF export does not use plot styles (as if PdfUsePlotStyles=0). And in V14 only with PdfUsePlotStyles=0 can you get a searchable PDF. But that is a big limitation since most people would want to create PDFs *with* plot styles.

    Another strange thing: If I create a PDF export from CCL-SearchableText.dwg with PdfUsePlotStyles=1 and the .ctb file installed, the magenta lines appear much thinner than the red lines, even though the lineweight for magenta is 0.70mm and that for red is 0.18mm.


    Hi Roy, you're right. I checked your observations with 14.2.17 and got the same results. Indeed this is a severe limitation.

    Regarding line weights, for me it was white lines that appeared much thinner, but not all white lines. Comparing to the PDF generated with V15.3.05, this issue seems to have been resolved. However, I recently observed issues with V15 where it seemed to ignore that line weight scaling had been checked for the page setup.

    I guess at this point, if you want a searchable PDF, then PDF export is the hands-down winner. If, however, you need to maintain visual integrity, you will need to forgo searchable text and use a third-party printer. This is a pity, as BricsCAD almost has a feature that other CAD apps can't achieve, even with third-party add-ons.

    Regards, Jason Bourhill CAD Concepts 