March 17, 2022 06:56 pm GMT

The Complete Guide to Converting Image to Text and Text to Speech with JavaScript

Are you looking for a way to convert images to text?
Do you want to take a picture of some text and have it converted to text for you?
And then have that text read aloud by a JavaScript application?

Today I am going to fulfill that long-awaited wish by taking a picture of some text and converting it to text. In addition, I will convert the text to speech.

I'm going to create a simple application that converts an image (from a URL or a file) to text and then converts that text to speech.

Before we begin, I want to explain a few things.

OCR (Optical Character Recognition)

It is a technology that recognizes the text in an image. It's commonly used in applications such as document scanning and handwriting recognition.

JavaScript does not have a built-in OCR library, but we can use tesseract.js to do the OCR for us. Check out the tesseract.js library for more information.

SpeechSynthesis

SpeechSynthesis is a technology that can convert text to speech.

"The SpeechSynthesis interface of the Web Speech API is the controller interface for the speech service; this can be used to retrieve information about the synthesis voices available on the device, start and pause speech, and other commands besides." (Source: MDN)

I'm very excited to show you how to use tesseract.js to convert an image to text. I will show you how to do this in the following steps.

Part 1: Convert an image to text

I'll show two ways to supply the image: first from an image URL, and second from an image file.

Step 1: Create a simple HTML page with the following code.

index.html

<html>
  <body>
    Progress: <span id="progress">0</span>
    <div class="container">
      <input
        id="url"
        value="https://tesseract.projectnaptha.com/img/eng_bw.png"
      />
      <button onclick="onCovert()">Convert URL Image</button>
    </div>
    <div class="container">
      <img id="output" src="" width="100" height="100" />
      <input
        name="photo"
        type="file"
        accept="image/*"
        onchange="onImageChange(this.files[0])"
      />
    </div>
    <div class="container">
      <p id="text"></p>
      <button onclick="read()">Read</button>
    </div>
    <script src="script.js"></script>
  </body>
</html>

Step 2: Add Tesseract.js to the HTML page. The easiest way to include Tesseract.js in your HTML5 page is to use a CDN. Add the following tag to your page so that it loads before script.js (for example, in a <head> element).

<script src="https://unpkg.com/[email protected]/dist/tesseract.min.js"></script>

Step 3: Initialize And Run Tesseract OCR

script.js

const textEle = document.getElementById('text');
const imgEle = document.getElementById('output');
const progressEle = document.getElementById('progress');

// Show the OCR progress as a percentage.
const logger = ({ progress }) =>
  (progressEle.innerHTML = `${(progress * 100).toFixed(2)}%`);

// Image to text using the main Tesseract.recognize helper
const startConversion = async (url) => {
  try {
    const result = await Tesseract.recognize(url, 'eng', { logger });
    const {
      data: { text },
    } = result;
    return text;
  } catch (e) {
    console.error(e);
  }
};

const onCovert = async () => {
  const urlEle = document.getElementById('url');
  const text = await startConversion(urlEle.value);
  textEle.innerHTML = text;
};

// Image to text using a worker (the better way)
const imageToText = async (url) => {
  // Create a fresh worker for each conversion: a worker cannot be
  // reused once it has been terminated.
  const worker = Tesseract.createWorker({
    logger,
  });
  try {
    await worker.load();
    await worker.loadLanguage('eng');
    await worker.initialize('eng');
    const {
      data: { text },
    } = await worker.recognize(url);
    textEle.innerHTML = text;
  } catch (error) {
    // Report failures instead of silently swallowing them.
    console.error(error);
  } finally {
    await worker.terminate();
  }
};

// Read the selected file as a data URL, preview it, and run OCR on it.
const onImageChange = (file) => {
  if (file) {
    let reader = new FileReader();
    reader.readAsDataURL(file);
    reader.onload = function () {
      let url = reader.result;
      imgEle.src = url;
      imageToText(url);
    };
  }
};
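
The logger above shows only a percentage, which starts over for each phase of the job. Tesseract.js passes a status string alongside progress in every logger message, so the label can also say which phase is running. A minimal sketch, where formatProgress is a hypothetical helper name and not part of the library:

```javascript
// Hypothetical helper: turn a Tesseract.js logger message into a label.
// Assumes the message carries a `status` string and a `progress` number
// between 0 and 1, as passed to the `logger` option.
function formatProgress({ status = 'working', progress = 0 } = {}) {
  // Clamp so that an unexpected value never renders as e.g. "120.00%".
  const clamped = Math.min(1, Math.max(0, Number(progress) || 0));
  return `${status}: ${(clamped * 100).toFixed(2)}%`;
}

// In the browser this could replace the logger in script.js:
// const logger = (m) => (progressEle.textContent = formatProgress(m));
```

Keeping the formatting in a pure function means it can be checked without loading the OCR engine.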

Tesseract.js API response

blocks: [{}]
box: null
confidence: 90
hocr: "<div class='ocr_page' id='page_1' title='image \"\"; bbox 0 0 1486 668; ppageno 0'>
 <div class='ocr_carea' id='block_1_1' title=\"bbox 28 34 1454 640\">
  <p class='ocr_par' id='par_1_1' lang='eng' title=\"bbox 28 34 1454 640\">
"
lines: (8) [{}, {}, {}, {}, {}, {}, {}, {}]
oem: "DEFAULT"
osd: null
paragraphs: [{}]
psm: "SINGLE_BLOCK"
symbols: (295) [{}, {}, {}, {}, {}, {}, ]
text: "Mild Splendour of the various-vested Night!
Mother of wildly-working visions! haill
I watch thy gliding, while with watery light
Thy weak eye glimmers through a fleecy veil;
And when thou lovest thy pale orb to shroud
Behind the gatherd blackness lost on high;
And when thou dartest from the wind-rent cloud
Thy placid lightning oer the awakend sky.
"
tsv: "4 1 1 1 7 0 28 487 1400 61 -1
5 1 1 1 7 1 28 487 116 50 87 And
5 1 1 1 7 2 170 488 150 51 87 when
5 1 1 1 7 3 345 490 123 51 92 thou
5 1 1 1 7 4 497 492 188 51 91 dartest
5 1 1 1 7 5 711 493 128 51 91 from
5 1 1 1 7 6 866 494 87 52 92 the
5 1 1 1 7 7 978 495 272 52 92 wind-rent
5 1 1 1 7 8 1275 494 153 54 92 cloud
4 1 1 1 8 0 96 563 1228 77 -1
5 1 1 1 8 1 96 563 112 69 92 Thy
5 1 1 1 8 2 231 564 172 70 91 placid
5 1 1 1 8 3 427 566 248 73 92 lightning
5 1 1 1 8 4 700 568 100 53 89 oer
5 1 1 1 8 5 824 569 87 69 92 the
5 1 1 1 8 6 935 569 260 54 82 awakend
5 1 1 1 8 7 1218 569 106 71 92 sky.
"
unlv: null
version: "4.1.1-56-gbe45"
words: (58) [{}, {}, {}]
[[Prototype]]: Object

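The tsv field holds the layout data in tabular form, one row per detected element, with the columns separated by tab characters in the raw string. A minimal sketch that parses it, assuming Tesseract's standard TSV column order (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text); parseTsv and wordsOnly are hypothetical helper names:

```javascript
// Column order assumed from Tesseract's standard TSV output.
const TSV_COLUMNS = [
  'level', 'page_num', 'block_num', 'par_num', 'line_num', 'word_num',
  'left', 'top', 'width', 'height', 'conf', 'text',
];

// Hypothetical helper: parse the `tsv` string into an array of row objects.
function parseTsv(tsv) {
  return String(tsv)
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const cells = line.split('\t');
      const row = {};
      TSV_COLUMNS.forEach((name, i) => {
        // Every column except `text` is numeric; `text` may be missing
        // on rows that describe a line or block rather than a word.
        row[name] = name === 'text' ? (cells[i] ?? '') : Number(cells[i]);
      });
      return row;
    })
    // Skip anything non-numeric, such as a header row if one is present.
    .filter((row) => Number.isFinite(row.level));
}

// Level 5 rows are words; lower levels are pages, blocks, paragraphs, lines.
const wordsOnly = (rows) => rows.filter((row) => row.level === 5);
```

For example, wordsOnly(parseTsv(data.tsv)) would give each word together with its bounding box and confidence.
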
Let's look at the structure of the data.

text: All of the recognized text as a single string.
lines: An array of every recognized line of text.
words: An array of every recognized word.
symbols: An array of every recognized character.
paragraphs: An array of every recognized paragraph.

We now have the text as a string, so we can use it for reading aloud.
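
The entries in words are objects rather than plain strings. A minimal sketch that lists the words the engine was least sure about, assuming each word object carries a text string and a confidence score on the same 0 to 100 scale as the top-level confidence value; lowConfidenceWords and the threshold of 80 are illustrative choices:

```javascript
// Hypothetical helper: return the text of every word whose confidence
// falls below `threshold`. Assumes `data` is the `data` object of a
// Tesseract.js result and that each word has `text` and `confidence`.
function lowConfidenceWords(data, threshold = 80) {
  const words = Array.isArray(data?.words) ? data.words : [];
  return words
    .filter((word) => word.confidence < threshold)
    .map((word) => word.text);
}

// Browser usage:
// const { data } = await Tesseract.recognize(url, 'eng');
// console.log(lowConfidenceWords(data));
```

A list like this is a cheap way to flag text that may need a manual check before it is read aloud.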

Part 2: Convert text to speech

For text to speech, we will use the browser's built-in text-to-speech API.

speak: This method adds an utterance to a queue called the utterance queue. The utterance is spoken after all the utterances queued before it have been spoken. The method takes a SpeechSynthesisUtterance object as an argument. This object has a property called text, which holds the text that we want to convert to speech.

NOTE: SpeechSynthesisUtterance has other properties that control how the speech is produced. Check the SpeechSynthesisUtterance documentation for more information.

const read = () => {
  const msg = new SpeechSynthesisUtterance();
  msg.text = textEle.innerText;
  window.speechSynthesis.speak(msg);
};
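
Because speak only appends to the queue, a long OCR result can also be split into several utterances that play back in order. A minimal sketch that queues one utterance per non-empty line and sets the lang, rate, and pitch properties of SpeechSynthesisUtterance; speakLines is a hypothetical helper, and its env parameter exists only so that the function can be exercised outside a browser:

```javascript
// Hypothetical helper: queue one utterance per non-empty line of `text`.
// `env` defaults to the global object and is expected to expose
// `speechSynthesis` and `SpeechSynthesisUtterance`, as a browser does.
function speakLines(text, { lang = 'en-US', rate = 1, pitch = 1 } = {}, env = globalThis) {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  if (!synth || !Utterance) return []; // Speech is not available here.

  const lines = String(text)
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);

  return lines.map((line) => {
    const msg = new Utterance(line);
    msg.lang = lang;   // BCP 47 language tag
    msg.rate = rate;   // 0.1 to 10, default 1
    msg.pitch = pitch; // 0 to 2, default 1
    synth.speak(msg);  // Appends to the utterance queue.
    return msg;
  });
}

// Browser usage: speakLines(textEle.innerText, { rate: 0.9 });
```

Queuing line by line keeps each utterance short, and a single call to cancel still clears everything that has not been spoken yet.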

cancel: Removes all utterances from the utterance queue.

getVoices: Returns a list of SpeechSynthesisVoice objects representing all the available voices on the current device.

pause: Puts the SpeechSynthesis object into a paused state.

resume: Puts the SpeechSynthesis object into a non-paused state: resumes it if it was already paused.
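
These methods combine naturally into playback controls. A minimal sketch with hypothetical helper names: pickVoice chooses a voice from the getVoices list by language tag, and togglePause switches between pause and resume based on the synthesizer's paused and speaking flags. In some browsers getVoices returns an empty list until the voiceschanged event has fired, so the list may need to be read again after that event.

```javascript
// Hypothetical helper: pick a voice whose language tag starts with `lang`,
// preferring the one marked as default. Returns null when nothing matches.
function pickVoice(voices, lang = 'en') {
  const wanted = lang.toLowerCase();
  const matches = Array.from(voices).filter((voice) =>
    voice.lang.toLowerCase().startsWith(wanted)
  );
  return matches.find((voice) => voice.default) || matches[0] || null;
}

// Hypothetical helper: pause if speaking, resume if paused.
// `paused` is checked first because `speaking` stays true while paused.
function togglePause(synth) {
  if (synth.paused) {
    synth.resume();
    return 'resumed';
  }
  if (synth.speaking) {
    synth.pause();
    return 'paused';
  }
  return 'idle';
}

// Browser usage:
// msg.voice = pickVoice(window.speechSynthesis.getVoices(), 'en');
// togglePause(window.speechSynthesis);
```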

Browser Compatibility

The SpeechSynthesis API is available in all modern browsers: Firefox, Chrome, Edge, and Safari.
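
It is still cheap to check for the API before wiring up the Read button. A minimal sketch; supportsSpeech is a hypothetical helper, and env defaults to the global object so that the check can also run outside a browser:

```javascript
// Hypothetical helper: report whether the speech synthesis API is usable.
function supportsSpeech(env = globalThis) {
  return (
    typeof env.speechSynthesis === 'object' &&
    env.speechSynthesis !== null &&
    typeof env.speechSynthesis.speak === 'function' &&
    typeof env.SpeechSynthesisUtterance === 'function'
  );
}

// Browser usage: hide or disable the Read button when speech is missing.
// if (!supportsSpeech()) { /* disable the button */ }
```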

Got any questions or additions? Please leave a comment.

Thank you for reading

Original Link: https://dev.to/devsmitra/the-complete-guide-to-covert-image-to-text-and-text-to-speech-with-javascript-15gp
