Digitization errors in Hungarian documents
Pataki, Máté and Füzessy, Tamás (2007) Digitization errors in Hungarian documents. In: AACS '07. Proceedings of the automation and applied computer science workshop. Budapest, 2007.
          
Abstract
Our task was to analyze a particular digitizing system, determine what types of errors emerge during the process, and assess how these errors affect the searchability of the digitized documents. We have set up a testbed suitable for the automatic, large-scale processing of digitized texts. In this paper we briefly introduce the methodology of document digitization, emphasizing the sources of error in the process, and sketch the results obtained from our test system, especially the Hungarian-language-dependent characteristics of the emerging errors.
| Item Type: | Conference or Workshop Item (Paper) | 
|---|---|
| Uncontrolled Keywords: | character recognition, text processing, search, error, OCR | 
| Subjects: | Q Science > QA Mathematics and Computer Science > QA75 Electronic computers. Computer science / számítástechnika, számítógéptudomány | 
| Divisions: | Department of Distributed Systems | 
| Depositing User: | Eszter Nagy | 
| Date Deposited: | 11 Dec 2012 15:26 | 
| Last Modified: | 11 Dec 2012 15:26 | 
| URI: | https://eprints.sztaki.hu/id/eprint/4402 | 


