
diy:projets:paperscanner

Differences

Below are the differences between two revisions of the page.

diy:projets:paperscanner [2018/04/25 21:05] gbouyjou
diy:projets:paperscanner [2018/04/25 21:33] (current version) – [Crop and rotate target paper] gbouyjou
Line 25: Line 25:
 ----
 ===== Explanation of source program =====
-First, we have a picture of a page of text like this:
 
-{{:diy:projets:out.jpg?400|}}
+----
+==== Crop and rotate target paper ====
+
+First, we have a picture of a page of text like this:\\
+{{ :diy:projets:out.jpg?direct&400 |}}
+\\
 and we have to retrieve every sentence of this text.\\
+\\
 So, before applying any image-processing operations, I want to crop only the body text and align it.\\
-I need to find the four corners of the page before using the warpPerspective function. To eliminate the colors that differ from the white page, I use a histogram to exclude them.
-On the histogram, there are two peaks: the one on the left is the yellow chair, and the other is the page, which is a whiter yellow than the chair rather than a perfect white.
-{{:diy:projets:screen.png?200|}}
+I need to find the four corners of the page before using the warpPerspective function.\\
+To eliminate the colors that differ from the white page, I use a histogram to exclude them.\\
+{{ :diy:projets:screen.png?direct&200 |}}
+On the histogram, there are two peaks: the one on the left is the yellow chair,\\
+and the other is the page, which is a whiter yellow than the chair\\
+rather than a perfect white.\\
+\\
 I find the two boundaries with this code:
-
 <code python>
     img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Line 78: Line 86:
 </code>
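The diff hides most of the original code, so here is a minimal sketch of the histogram idea, not the author's implementation. It assumes the page is the brighter of the two peaks and that the two boundaries are the gray levels where the page peak drops below a fraction of its height. The helper names (gray_histogram, two_highest_peaks, peak_bounds), the parameters min_gap and fraction, and the sample pixel values are all hypothetical. Plain Python lists are used so the sketch runs without OpenCV; on a real image the counts would more likely come from cv2.calcHist.

<code python>
def gray_histogram(pixels, bins=256):
    """Count how many pixels fall on each gray level (0 .. bins-1)."""
    counts = [0] * bins
    for value in pixels:
        counts[value] += 1
    return counts


def two_highest_peaks(counts, min_gap=20):
    """Return the two tallest bins that are at least min_gap levels apart,
    ordered from darker to brighter."""
    levels = range(len(counts))
    first = max(levels, key=lambda i: counts[i])
    far_levels = [i for i in levels if abs(i - first) >= min_gap]
    second = max(far_levels, key=lambda i: counts[i])
    return tuple(sorted((first, second)))


def peak_bounds(counts, peak, fraction=0.1):
    """Walk left and right from a peak while the neighbouring bins stay
    at or above fraction * peak height (an assumed stopping rule)."""
    limit = counts[peak] * fraction
    low = high = peak
    while low > 0 and counts[low - 1] >= limit:
        low -= 1
    while high < len(counts) - 1 and counts[high + 1] >= limit:
        high += 1
    return low, high


# Made-up gray levels: a darker "chair" cluster and a brighter "page" cluster.
pixels = ([99] * 30 + [100] * 50 + [101] * 30
          + [198] * 10 + [199] * 40 + [200] * 80 + [201] * 40 + [202] * 5)
counts = gray_histogram(pixels)
chair_peak, page_peak = two_highest_peaks(counts)
low, high = peak_bounds(counts, page_peak)

# Keep only the pixels whose gray level lies inside the page band.
page_mask = [low <= value <= high for value in pixels]
print(chair_peak, page_peak, low, high)
</code>

The resulting mask plays the role of the "exclude other colors" step: everything outside the band around the page peak is dropped before looking for contours.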
 After that, I extract the contours; the quadrilateral is clearly visible, and I find its corners.
-{{:diy:projets:screen2.png?400|}}
+{{ :diy:projets:screen2.png?direct&400 |}}
  
 <code python>
Line 103: Line 111:
 </code>
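The corner-finding code is cut by the diff. One step it probably needs is putting the four corners in a fixed order, because cv2.getPerspectiveTransform and cv2.warpPerspective expect source and destination points to correspond one to one. The sketch below is an assumption about how that could be done, not the author's code: order_corners and target_size are hypothetical helpers, the corner coordinates are invented, and image coordinates are assumed to have y growing downward.

<code python>
import math


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    The top-left corner has the smallest x + y and the bottom-right the largest;
    the top-right corner has the largest x - y and the bottom-left the smallest.
    """
    by_sum = sorted(points, key=lambda p: p[0] + p[1])
    top_left, bottom_right = by_sum[0], by_sum[-1]
    by_diff = sorted(points, key=lambda p: p[0] - p[1])
    bottom_left, top_right = by_diff[0], by_diff[-1]
    return [top_left, top_right, bottom_right, bottom_left]


def target_size(ordered):
    """Width and height of the straightened page, taken from the longest edges."""
    tl, tr, br, bl = ordered
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


# Invented corners of a slightly skewed page, in arbitrary order.
corners = [(310, 40), (30, 420), (20, 50), (330, 400)]
ordered = order_corners(corners)
size = target_size(ordered)
print(ordered, size)
</code>

With OpenCV, the ordered corners and the rectangle (0, 0), (w, 0), (w, h), (0, h) would then be passed as float32 arrays to cv2.getPerspectiveTransform, and the resulting matrix to cv2.warpPerspective.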
  
-The result:
-{{:diy:projets:screen3.png?400|}}
+The result:\\
+{{ :diy:projets:screen3.png?direct&400 |}}
+
+
+----
+==== Treatment (thresholding, blurring, etc.) ====
  
 Now we must clean up the text characters before running Tesseract recognition.
Line 120: Line 132:
     img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
 </code>
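With the cv2.THRESH_OTSU flag, cv2.threshold ignores the threshold passed in (the 0 above) and chooses one itself from the image histogram. The pure-Python sketch below illustrates the principle behind Otsu's method, which is to pick the gray level that maximizes the between-class variance. It is an illustration only: otsu_threshold is a hypothetical helper, the pixel values are invented, and OpenCV's own implementation may settle on a different level when several levels tie.

<code python>
def otsu_threshold(counts):
    """Return the gray level that maximizes the between-class variance.

    Pixels <= level form the dark class, pixels > level the bright class.
    On ties, the lowest such level is returned.
    """
    total = sum(counts)
    weighted_total = sum(level * n for level, n in enumerate(counts))
    best_level, best_variance = 0, -1.0
    weight_dark = 0
    sum_dark = 0
    for level, n in enumerate(counts):
        weight_dark += n
        sum_dark += level * n
        weight_bright = total - weight_dark
        if weight_dark == 0 or weight_bright == 0:
            continue  # one class is empty, the variance is undefined
        mean_dark = sum_dark / weight_dark
        mean_bright = (weighted_total - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


# Invented data: dark "ink" pixels and bright "paper" pixels.
pixels = [40] * 10 + [50] * 30 + [60] * 10 + [190] * 20 + [200] * 60 + [210] * 20
counts = [0] * 256
for value in pixels:
    counts[value] += 1

threshold = otsu_threshold(counts)
# Same rule as cv2.THRESH_BINARY: above the threshold becomes 255, the rest 0.
binary = [255 if value > threshold else 0 for value in pixels]
print(threshold)
</code>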
+
+
+----
+==== Character Recognition ====
  
 Finally, I use Tesseract and check whether each word exists in a dictionary of the language.
Line 139: Line 155:
     print("\naccuracy = ", cpt, '/', nbWord, ' ', "%.1f" % (cpt *100 /nbWord), "%")
 </code>
+After ten seconds, I find that 34.4% of the words in the text are correct French words.
+{{:diy:projets:screen4.png?direct&400|}}
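The dictionary check itself is not visible in the diff, so the following is only a sketch of how such an accuracy figure could be computed. The names cpt and nbWord come from the print statement above; word_accuracy, the tiny word set, and the sample OCR string are hypothetical, and the real program may split words differently.

<code python>
import re


def word_accuracy(text, dictionary):
    """Count how many words of text appear in dictionary.

    Words are runs of letters (accented letters included), compared in lower case.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    known = sum(1 for word in words if word in dictionary)
    return known, len(words)


# Hypothetical stand-ins for a French word list and for the OCR output.
french_words = {"le", "chat", "mange", "la", "souris"}
ocr_text = "Le chat mange la sourls"

cpt, nbWord = word_accuracy(ocr_text, french_words)
accuracy = cpt * 100 / nbWord if nbWord else 0.0  # avoid dividing by zero
print("accuracy = %d/%d %.1f%%" % (cpt, nbWord, accuracy))
</code>

A misread word such as "sourls" is counted as wrong, which is how OCR errors lower the percentage.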
  
diy/projets/paperscanner.1524690320.txt.gz · Last modified: 2018/04/25 21:05 by gbouyjou