ocr - Is there a way to have Microsoft Office Document Imaging capture the document's layout?

2014-07
  • oscilatingcretin

    I am trying to scan a packet of information into a series of Word docs. The first few pages are part of an index. It's laid out like this:

    Article VIII                                     13
        Section 1 .... Notice of Association         13
        Section 2 .... Notice of Unpaid Assessments  13
        Section 3 .... Notice of Other Notices       13
    Article IX                                       14
        Section 1 .... Conflict of Interest          14
        Section 2 .... Blah blah                     15
    

    When I open the scanned .TIF in MODI and copy/paste it into Word, it looks like this:

    ARTICLE I.
    Sect ion
    Section
    Sect ion
    Sect ion
    Sec;ion
    Section
    Section
    Section
    Section
    Section
    Sect ion
    Section
    ARTICLE II
    Section 1.
    Section 2.
    Section 3.
    Section 4.
    

    Basically, it seems to convert whitespace and consecutive periods into carriage returns. If it could at least maintain the position of sections of text by using tabs or spaces then that would be at least somewhat awesome.

  • Answers
  • matan129

    As far as I know, Microsoft Office Document Imaging cannot capture the layout of a document, but these products can:

  • Atari911

    I know this sounds like a strange way to do it, but if you have a copy of Adobe Acrobat you could scan it as a PDF and then save the PDF as a Word doc. I have found this to be an effective way to convert scanned docs to Word.
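
    If the OCR tool you end up with can export each word together with its position on the page, the index layout from the question can be approximated with plain spaces. The following is only a minimal sketch under that assumption: the `(text, x, y)` input format, the `rebuild_layout` helper, and the `char_width` and `line_tolerance` values are all hypothetical, and nothing here is taken from MODI or Acrobat.

```python
from typing import List, Tuple

# (text, x, y): x and y are the word's left and top position in pixels.
# This input format is an assumption, not the output of any specific OCR tool.
Word = Tuple[str, int, int]


def rebuild_layout(words: List[Word], char_width: int = 10, line_tolerance: int = 5) -> str:
    """Place each word at a text column derived from its x position."""
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: (w[2], w[1])):
        # Words whose y is close to the first word of the current row share that row.
        if rows and abs(word[2] - rows[-1][0][2]) <= line_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])

    lines = []
    for row in rows:
        line = ""
        for text, x, _ in sorted(row, key=lambda w: w[1]):
            column = x // char_width
            if line:
                # Pad to the target column, keeping at least one space between words.
                line = line.ljust(max(column, len(line) + 1))
            else:
                line = " " * column
            line += text
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        ("Article", 0, 100), ("VIII", 80, 100), ("13", 400, 102),
        ("Section", 40, 130), ("1", 120, 131), ("13", 400, 130),
    ]
    print(rebuild_layout(sample))
```

    Pasting the result into Word in a monospaced font keeps the page numbers lined up in one column. With a proportional font the spacing drifts, so tabs with a fixed tab stop would be the better choice there.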


  • Related Question

    Use Outlook 2003 with Word editor, but have Office 2007 installed
  • Jon

    Is it possible to use Outlook 2003 and enable "Use Microsoft Office Word 2003 to edit e-mail messages" (in the Mail Format options), but also have Office 2007 installed?

    It appears that Word 2003 and Word 2007 duel over which one is the default. Using Word as the email editor only seems to work when Word 2003 is the default, but when you launch Word 2007 it reruns setup and makes itself the default.

    Does anybody know of any workarounds or ideas? Thanks!


  • Related Answers
  • Kez

    There is an easy fix for this; see below. It is taken from the Microsoft KB; just scroll down to the section called Multiple versions of Word. I have used it before when dealing with an Act! problem that needed Word 2003 to be (and stay) the default instead of Word 2007 (both installed at the same time), and it worked a treat.

    1. Exit Word 2007.
    2. Start Registry Editor.
      • In Windows Vista, click Start, type regedit in the Start Search box, and then press ENTER. If you are prompted for an administrator password or for a confirmation, type the password, or click Continue.
      • In Windows XP, click Start, click Run, type regedit in the Open box, and then click OK.
    3. Locate and then click to select the following registry subkey: HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Word\Options
    4. After you select the subkey that is specified in step 3, point to New on the Edit menu, and then click DWORD Value.
    5. Type NoReReg, and then press ENTER.
    6. Right-click NoReReg, and then click Modify.
    7. In the Value data box, type 1, and then click OK.
    8. On the File menu, click Exit to close Registry Editor.

    Once you have followed the steps above, when you open Word 2007 it will stop trying to set itself as the default version. Simply set 2003 to be the default again and the setting will stick for good.
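
    If you need to apply the same change on several machines, the manual steps above can be captured in a .reg file instead. The following is a minimal sketch, not something taken from the KB article: the `build_reg_file` helper is a hypothetical name, and the script only produces the file text for the subkey and the NoReReg value named in steps 3 to 7.

```python
# Subkey and value name come from steps 3 and 5 above.
SUBKEY = r"HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Word\Options"


def build_reg_file(subkey: str = SUBKEY, name: str = "NoReReg", value: int = 1) -> str:
    """Return .reg file text that sets a DWORD value under the given subkey."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("DWORD value must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{subkey}]",
        # .reg files store DWORDs as eight hexadecimal digits.
        f'"{name}"=dword:{value:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the text so it can be saved to a .reg file of your choosing.
    print(build_reg_file())
```

    Save the printed text to a file with a .reg extension, exit Word 2007, and double-click the file (or run reg import on it) to merge it into the registry. As with the manual steps, set Word 2003 as the default again afterwards.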

  • Pär Björklund

    It's certainly possible to have both the 2003 and 2007 versions of Office installed without the issues you're describing. What you're seeing is often the result of registry problems during install or configuration that leave Office incorrectly installed.

    The easiest way to resolve this is to uninstall both versions, install Office 2003, and then install Office 2007 (make sure to use the customize option and select which 2003 apps to leave on the system).

    If you're still having issues, they might be related to your user account. If possible, I would suggest trying a new user account to see if there's a difference.