Guest guest
Posted July 6, 2003

"krishna kashyap" <kkalale1 >

> How can anyone scan in old texts. That way we can store the info.
> If you can find out what tools are needed and how one can do that it
> will be great

The process of bringing out a text electronically involves the following steps.

Note: A variant of this process was used for the Sri bhasya.

1. A copyright clearance process.
Depending on the text in question and the need for distribution, this would be one of the following:
a) All texts "first" printed pre-1923 are free of copyright, as a general rule of thumb.
b) For those printed later, permission is needed from the author, the publisher, or the estate.

2. Scanning in
All this requires is a $49-$69 scanner for most works; once it is attached to a computer, the pages can be easily scanned in. The files are typically saved in .png format to save space.

3. OCR (Optical Character Recognition)
Typically, the scanner comes with OCR software. For English texts the scanned image can be OCR'd to produce raw digital text output. This will typically have numerous errors and will need proofing. A Sanskrit OCR has recently been announced by Dr. Venugopal (http://www.cedar.buffalo.edu/ILT), but its efficacy is yet to be determined. For Tamil texts none exists, although I have been working with Project Madurai, who are now testing out the Distributed Proofreading software.
URL: http://www.tamil.net/projectmadurai

4. Proofread the OCR'd text.
This can be done manually or through the use of a Distributed Proofreading process, e.g. http://www.pgdp.net

5. Release the e-text
If it is copyright free, typically at the start of the process one would use the legal services of a group such as Project Gutenberg to ensure that it is free of copyright, and subsequently Project Gutenberg would e-release it in their collection.

Since images etc. take up significant web space (in the gigabytes) and also need wide exposure, sripedia.org has gained official status with ibiblio, which gives it wide exposure, mirroring at multiple locations worldwide, and access to as much web space and bandwidth as needed. Note that archive.org is well beyond terabytes and is now reaching petabytes in storage.

Please feel free to ask me any further questions that you may have.

Thanks
Srinivasan Sriram
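For anyone sorting a pile of candidate texts, the copyright rule of thumb from step 1 of the post above can be written down as a tiny check. This is only a sketch: the function name `clearance_path` and the labels it returns are made up for illustration, and the pre-1923 cutoff is the post's general rule of thumb, not legal advice.

```python
# Rule-of-thumb cutoff taken from the post: texts first printed
# before this year are treated as free of copyright.
PUBLIC_DOMAIN_CUTOFF_YEAR = 1923


def clearance_path(first_printed_year: int) -> str:
    """Return which branch of step 1 applies (hypothetical helper)."""
    if first_printed_year < PUBLIC_DOMAIN_CUTOFF_YEAR:
        # Case (a): first printed pre-1923.
        return "public-domain"
    # Case (b): ask the author, publisher, or estate for permission.
    return "needs-permission"


if __name__ == "__main__":
    for year in (1890, 1923, 1975):
        print(year, clearance_path(year))
```

Anything that comes back as "needs-permission" still has to go through the author, publisher, or estate before scanning for distribution.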