Lossy data and incorrect data
Posted Wednesday, August 21, 2013 9:26 AM


This is quite horrifying. Projects I have been involved in that had a document management component, scanning originals and then destroying them, never came across this issue, yet I would be exceptionally surprised if all of them were unaffected.

Fortunately for me I was never working on this part of the systems...phew!!!


Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!
Post #1486772
Posted Wednesday, August 21, 2013 10:00 AM
Thanks, Phil, for taking the time to pen this piece. When I first started reading I was wondering how it would tie in, and boom, there it is. This has some very interesting ramifications in Public Disclosure cases and for retention of the Copy of Record for legal purposes.

M...


Not all gray hairs are Dinosaurs!
Post #1486789
Posted Wednesday, August 21, 2013 10:04 AM


marlon.seton (8/21/2013)
You had to sit in name ascending rank arrangement? I cannot imagine something so militaristic.

I imagine it was strictly so the teachers could learn the names more easily. This was mostly in grades 1 to 6, I think, but I seem to remember a couple of times in secondary school.


<><
Livin' down on the cube farm. Left, left, then a right.
Post #1486793
Posted Wednesday, August 21, 2013 4:50 PM


Anyone who did what is suggested in the last sentence of the editorial ought to be given a good thrashing with a clue stick. (My comments here are relevant to nothing other than that last sentence.)

Numerics are far better compressed by OCR than by image compression, and the rule should be to use OCR to extract the numeric (and alphabetic) components BEFORE any compression is done (since those areas of the image can then usually be thrown away without losing anything useful).

Note the positions and sizes of the alphanumeric chunks in the image. Replace them with neutral background in the image, compress the revised image, then store the positions, sizes and text along with it. This gives better compression than compressing the image with the text/numeric data still in it, and it gives better accuracy (to the limits of the OCR plus whatever checking is done on the OCR output) than compressing the text/numeric data as part of the image. So you get better compression and better accuracy at the same time, in all cases where the background of the text/numeric data is not important, which means in just about every case where the text/numeric content has any legal implication.

Of course "in just about every case" doesn't mean always; there are cases where this technique is not good enough. Even then, wherever the accuracy of the numeric/text data matters, there is something better than compressing the image and running OCR on the result: compress the image with the original text/numeric zones left in, and also keep a record of the text/numeric data and its positions in the image. That costs only a small compression penalty in exchange for an enormous improvement in the accuracy of the alphanumeric data compared to compressing before applying OCR.
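
A minimal sketch of the order of operations described above, for illustration only. It assumes the OCR step has already run and produced a list of text regions; no real OCR engine is called. The names `TextRegion`, `blank_regions` and `archive` are hypothetical, the image is a plain grid of 8-bit grayscale values, and `zlib` merely stands in for whatever image codec would really be used.

```python
import json
import zlib
from dataclasses import dataclass, asdict


@dataclass
class TextRegion:
    """One (assumed) OCR result: the recognised text and its bounding box in pixels."""
    text: str
    x: int
    y: int
    width: int
    height: int


def blank_regions(pixels, regions, background=255):
    """Return a copy of the image with every text region painted over."""
    cleaned = [row[:] for row in pixels]
    for r in regions:
        for row in cleaned[r.y:r.y + r.height]:
            # Slice length handles regions that run past the right edge.
            span = len(row[r.x:r.x + r.width])
            row[r.x:r.x + r.width] = [background] * span
    return cleaned


def archive(pixels, regions):
    """Blank the text zones, compress what is left, keep the text separately.

    The text and its positions are stored exactly, so they never pass
    through the image compressor (zlib here is only a stand-in codec).
    """
    cleaned = blank_regions(pixels, regions)
    raw = bytes(value for row in cleaned for value in row)
    return {
        "width": len(pixels[0]),
        "height": len(pixels),
        "image": zlib.compress(raw),
        "text": json.dumps([asdict(r) for r in regions]),
    }
```

For the fallback case in the second paragraph, the same record would be built without calling `blank_regions`, so the text zones stay in the compressed image and the stored text acts as the authoritative copy.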


Tom
Post #1486994
Posted Thursday, August 22, 2013 9:22 AM


Xerox have found the bug, which affects all compression levels, and are issuing a patch which seems to work. Hundreds of thousands of devices are affected.
http://realbusinessatxerox.blogs.xerox.com/2013/08/07/update-on-scanning-issue-software-patches-to-come/#.UhYrMz_px8E
It could be more than just a Xerox problem. For the latest news see
http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_are_switching_written_numbers_when_scanning#might_this_be_more_than_a_xerox_problem



Best wishes,

Phil Factor
Simple Talk
Post #1487345