Contact

Dr Hartmut Janczikowski

Key Account Manager

Library Digitization

visualized conversion process

Digitization and Conversion Technology for Monographs and Serials

Making collections, or even entire libraries, digitally available involves many separate tasks. Books and newspapers are turned into metadata-enhanced digital surrogates that comply with the specifications set by the library and meet the expectations of its users. Because not every processing step is needed in every project, docWORKS is modular.
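The modular design can be pictured as a fixed sequence of steps from which a project enables only the ones it needs. The following is a minimal sketch of that idea, not docWORKS code: the step names are taken from the section headings on this page, while `make_step`, `ALL_STEPS` and `run_pipeline` are hypothetical names, and each step is a placeholder that only records that it ran.

```python
from typing import Callable, Dict, List

Document = dict
Step = Callable[[Document], Document]


def make_step(name: str) -> Step:
    """Return a placeholder step that only records that it ran."""
    def step(doc: Document) -> Document:
        doc.setdefault("history", []).append(name)
        return doc
    return step


# Step names follow the section headings; the functions are stand-ins
# (assumption: the real modules are not described on this page).
ALL_STEPS: Dict[str, Step] = {
    name: make_step(name)
    for name in [
        "import", "image_enhancement", "layout_analysis", "text_recognition",
        "structure_analysis", "metadata", "quality_control", "export",
    ]
}


def run_pipeline(doc: Document, enabled: List[str]) -> Document:
    """Run only the enabled steps, always in the canonical order."""
    for name, step in ALL_STEPS.items():
        if name in enabled:
            doc = step(doc)
    return doc


result = run_pipeline({"id": "book-001"}, ["import", "text_recognition", "export"])
print(result["history"])  # ['import', 'text_recognition', 'export']
```

Keeping the order in one place means a project can switch steps on or off without being able to run them out of sequence.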

INPUT / IMPORT BIBLIODATA

Scanned documents and digital images are imported, and pre-existing bibliographic metadata are added. Through communication with the library's catalogue system, the process is monitored continuously by digital tracking, so the physical location and status can be retrieved at any time.

DIGITAL IMAGE ENHANCEMENT

Scanned pages are deskewed and despeckled where necessary, as well as cropped or resized. Scanned double pages are automatically split.
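To illustrate what splitting a double page can involve, here is a small sketch under stated assumptions: the scan is a grid of grey values (0 = black, 255 = white), and the gutter is taken to be the darkest column near the centre. This is one common heuristic, not necessarily how docWORKS does it; `find_gutter` and `split_double_page` are hypothetical names.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grey values, 0 = black, 255 = white


def find_gutter(image: Image, band: float = 0.2) -> int:
    """Return the darkest column within a central band of the image.

    Assumption: the book gutter shows up as a dark vertical shadow
    close to the horizontal centre of the scan.
    """
    width = len(image[0])
    centre = width // 2
    half_band = max(1, int(width * band / 2))
    start = max(0, centre - half_band)
    stop = min(width, centre + half_band + 1)
    # The column with the smallest sum of grey values is the darkest one.
    return min(range(start, stop), key=lambda x: sum(row[x] for row in image))


def split_double_page(image: Image) -> Tuple[Image, Image]:
    """Split a double-page scan into left and right pages at the gutter."""
    gutter = find_gutter(image)
    left = [row[:gutter] for row in image]
    right = [row[gutter + 1:] for row in image]  # the gutter column is dropped
    return left, right


# A 3 x 9 toy scan: white pages with a dark gutter in column 4.
spread = [[255] * 4 + [40] + [255] * 4 for _ in range(3)]
left_page, right_page = split_double_page(spread)
print(len(left_page[0]), len(right_page[0]))  # 4 4
```

Restricting the search to a central band keeps dark illustrations near the page edges from being mistaken for the gutter.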

LAYOUT ANALYSIS

Page zones such as headings, text blocks, charts, advertisements and illustrations are separated, highlighted and classified.

TEXT RECOGNITION

The zones identified during layout analysis are converted into digital full text. docWORKS draws on a wide range of modern and historical fonts, languages and dictionaries, achieving a recognition accuracy of up to 99%.

In addition, docWORKS uses historical and specialist dictionaries and algorithms to improve recognition of mixed-font and “Fraktur” (Gothic print) texts.
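How a dictionary can help recognition is easiest to see with a small example. The sketch below replaces a recognized token with its closest dictionary entry when the two are similar enough, using Python's standard `difflib` module. It is an illustration only: the page does not describe the algorithms docWORKS actually uses, and `correct_token`, the toy lexicon and the cutoff value are assumptions.

```python
import difflib
from typing import Iterable


def correct_token(token: str, lexicon: Iterable[str], cutoff: float = 0.8) -> str:
    """Return the closest lexicon entry, or the token itself if none is close.

    Assumption: similarity is measured with difflib's ratio; a production
    system would likely use font-specific confusion models instead.
    """
    words = list(lexicon)
    if token in words:
        return token
    matches = difflib.get_close_matches(token, words, n=1, cutoff=cutoff)
    return matches[0] if matches else token


# In Fraktur the long s is easily misread as "f".
lexicon = ["Wissenschaft", "Wasser", "Haus"]
print(correct_token("Wiffenschaft", lexicon))  # Wissenschaft
print(correct_token("xyz", lexicon))           # xyz (no close match, unchanged)
```

A fairly high cutoff keeps the correction conservative, so unfamiliar words such as names are left alone rather than forced onto a dictionary entry.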

STRUCTURE ANALYSIS

Intelligent Structure Recognition (ISR) automatically marks:

  • lead-in, main body and final section

  • chapter, subchapter, section, article, etc.

  • captions for images, tables and charts; authors; footnotes

METADATA

Physical and logical structures are converted into XML data formats. The imported bibliographic metadata are integrated as well.
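As a rough picture of what converting physical structure into XML means, the sketch below builds a small element tree for one page with Python's standard `xml.etree.ElementTree`. The element names (`Layout`, `Page`, `PrintSpace`, `TextBlock`, `TextLine`, `String`) are borrowed from ALTO, but the output is deliberately simplified: it has no namespace and no coordinates and is not schema-valid. `build_page_xml` is a hypothetical helper, not part of docWORKS.

```python
import xml.etree.ElementTree as ET
from typing import List


def build_page_xml(page_id: str, blocks: List[List[str]]) -> ET.Element:
    """Build a simplified, ALTO-like tree for one page.

    `blocks` is a list of text blocks, each given as a list of text lines.
    Assumption: real ALTO also carries a namespace, positions and sizes,
    all of which are omitted here.
    """
    root = ET.Element("alto")
    layout = ET.SubElement(root, "Layout")
    page = ET.SubElement(layout, "Page", ID=page_id)
    print_space = ET.SubElement(page, "PrintSpace")
    for i, lines in enumerate(blocks, start=1):
        block = ET.SubElement(print_space, "TextBlock", ID=f"{page_id}_TB{i}")
        for line in lines:
            text_line = ET.SubElement(block, "TextLine")
            for word in line.split():
                # Each recognized word becomes one String element.
                ET.SubElement(text_line, "String", CONTENT=word)
    return root


page = build_page_xml("P1", [["Digitization and", "Conversion"], ["Figure 1"]])
print(ET.tostring(page, encoding="unicode"))
```

Nesting blocks, lines and words mirrors the physical layout of the page; logical structure such as chapters and articles would typically be described in a separate container format such as METS.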

CORRECTION/QUALITY CONTROL

Interactive quality assurance, which service providers can also perform online, ensures that the specifications are met to a high standard.

EXPORT

Exporting takes place in accordance with library standards.

Migratable, OS-independent XML data such as METS/ALTO, image formats such as TIFF, JPEG and JPEG 2000, and PDF variants such as PDF, structured PDF and PDF/A can all be exported.

 
© 2010 • CCS Content Conversion Specialists GmbH • Weidestrasse 134 • 22083 Hamburg • T +49 40 227 130 0 • F +49 40 227 130 11 • info@content-conversion.com