June 14, 2022 12:35 pm GMT

AWS Textract and React Native

Amazon Textract is a machine learning (ML) service from Amazon that extracts text, handwriting, and data from scanned documents, PDFs, and images. The extracted output can then be stored in a storage service such as DynamoDB or S3.

Today I will walk through capturing or selecting an image in a React Native mobile application and uploading it to S3. A Lambda function, triggered through API Gateway, then extracts the data from the image, and the processed data is inserted into DynamoDB as a record.

[Figure: architecture diagram (Textract LucidChart)]

Prerequisites

  • npm or yarn
  • react-native > 0.59
  • aws-amplify
  • nodejs
  • aws-sdk

In this blog I will assume you have some knowledge of developing and deploying Lambda functions and API Gateway.

Let us split our blog into 2 main parts:

  1. Frontend Part
  2. Backend Part

1. Frontend Part
In this section we will handle the upload of images we captured in our mobile application into S3 in order for our backend server to extract the data from these images.

First of all we will start by installing:

  • aws-amplify for React Native: npm install aws-amplify, or npm install @aws-amplify/api @aws-amplify/core @aws-amplify/storage, since we don't need all the aws-amplify libraries
  • react-native-image-picker to select a photo from the device library or camera: npm install react-native-image-picker

We will start by implementing two functions, one for when the user chooses an image from the library and one for the camera:

    import {launchCamera, launchImageLibrary} from 'react-native-image-picker';

    // Inside your component
    const options = {
      mediaType: 'photo',
      quality: 0.5,
      includeBase64: true,
    };

    const libraryPickerHandler = () => {
      launchImageLibrary(options, async (response) => {
        if (response.didCancel) {
          console.log('User cancelled image picker');
        } else if (response.errorMessage) {
          console.log('ImagePicker Error: ', response.errorMessage);
        } else {
          // assets can be undefined, so guard before reading the first uri
          const uri = response.assets?.[0]?.uri;
          if (uri) {
            await onImageSelect(uri);
          }
        }
      });
    };

    const cameraPickerHandler = () => {
      launchCamera(options, async (response) => {
        if (response.didCancel) {
          console.log('User cancelled image picker');
        } else if (response.errorMessage) {
          console.log('ImagePicker Error: ', response.errorMessage);
        } else {
          // same guard as in the library handler
          const uri = response.assets?.[0]?.uri;
          if (uri) {
            await onImageSelect(uri);
          }
        }
      });
    };
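
Both handlers repeat the same checks on the picker response. One way to avoid the duplication is a small pure helper, sketched below. The name firstAssetUri and the two local interfaces are hypothetical, and the interfaces model only the response fields used above, not the library's full types.

```typescript
// Simplified local view of the picker response (only the fields used in this post).
interface PickerAsset {
  uri?: string;
}

interface PickerResponse {
  didCancel?: boolean;
  errorMessage?: string;
  assets?: PickerAsset[];
}

// Returns the uri of the first selected asset, or undefined when the user
// cancelled, the picker reported an error, or no asset came back.
function firstAssetUri(response: PickerResponse): string | undefined {
  if (response.didCancel || response.errorMessage) {
    return undefined;
  }
  return response.assets?.[0]?.uri;
}
```

Each handler could then call firstAssetUri(response) and pass the result to onImageSelect only when it is defined.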

The onImageSelect function will handle the upload of the image to S3, and will send the S3 key to our API endpoint /textract-scan that we will develop in our backend part:

    import Storage from '@aws-amplify/storage';
    import API from '@aws-amplify/api';
    // or
    // import { Storage, API } from 'aws-amplify';

    const onImageSelect = async (uri: string) => {
      const imageResponse = await fetch(uri);
      const blob = await imageResponse.blob();
      // timestamp-based name so each upload gets a distinct key
      const naming = `${new Date().getTime()}.jpeg`;
      const s3Response = await Storage.put(naming, blob, {
        contentType: 'image/jpeg',
        level: 'protected',
      });
      // the backend reads the S3 key from the `key` attribute of the body
      await API.post('your-endpoint-name', '/main/textract-scan', {
        body: {
          key: s3Response.key,
        },
      });
    };
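
One detail worth keeping in mind: Storage.put returns the key relative to the access level, while Textract needs the full object name in the bucket. Assuming Amplify's default prefixes (no customPrefix configured), objects uploaded with level 'protected' are stored under protected/<identityId>/. The helper below is a sketch of that mapping; toS3ObjectKey and the sample identity ID are hypothetical.

```typescript
type AccessLevel = 'public' | 'protected' | 'private';

// Maps an Amplify Storage key to the full S3 object key, assuming the
// default prefixes: public/, protected/<identityId>/, private/<identityId>/.
function toS3ObjectKey(level: AccessLevel, key: string, identityId?: string): string {
  if (level === 'public') {
    return `public/${key}`;
  }
  if (!identityId) {
    throw new Error('identityId is required for protected and private objects');
  }
  return `${level}/${identityId}/${key}`;
}

// Made-up identity ID, for illustration only.
console.log(toS3ObjectKey('protected', '1655210100000.jpeg', 'us-east-1:example-id'));
```

Either the app can send this full key, or the backend can rebuild it from the caller's identity before calling Textract.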

2. Backend Part

In this section we will handle the extraction of data from the images. The backend will be written in Node.js.

We will start by installing:

  • aws-sdk for javascript which enables you to easily work with Amazon Web Services.

npm install aws-sdk
or
yarn add aws-sdk

We will create a file called textract.ts which will include the lambda function called textractScan.
textractScan will handle the extraction of the data from the image.

The function will handle a POST request whose body contains a key attribute. This key represents the S3 object key in the specified bucket.
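
Since the body arrives as a raw string (and can be missing), it can help to validate it before calling Textract. The following is a minimal sketch of such a check; parseScanRequest, the ScanRequest interface, and the error codes are hypothetical names, not part of any AWS API.

```typescript
// Shape of the request body the Lambda expects.
interface ScanRequest {
  key: string;
}

// Parses the raw request body and returns the S3 key,
// throwing a descriptive error when the body is unusable.
function parseScanRequest(rawBody: string | null): ScanRequest {
  if (!rawBody) {
    throw new Error('MISSING_BODY');
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    throw new Error('INVALID_JSON');
  }
  const key = (parsed as { key?: unknown } | null)?.key;
  if (typeof key !== 'string' || key.length === 0) {
    throw new Error('MISSING_KEY');
  }
  return { key };
}
```

A handler could catch these errors and answer with a 400 status instead of letting a malformed request reach Textract.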

You will need to add this to your serverless.yml file inside the functions block:

    TextractScanLambda:
      handler: path-to-your-file/textract.textractScan
      events:
        - http:
            method: post
            path: main/textract-scan
            authorizer: aws_iam

Now let's go to the textract.ts file and start implementing our Lambda function.

Let's develop our Textract service function analyzeText, which we will use in our Lambda function:

    import { Textract } from 'aws-sdk';

    const analyzeText = async (key: string) => {
      const payload = {
        Document: {
          S3Object: {
            // the bucket where you uploaded your images
            Bucket: 'BUCKET_NAME',
            Name: key,
          },
        },
      };
      // .promise() turns the SDK request into a promise that resolves with the response
      return new Textract().detectDocumentText(payload).promise();
    };

Now we start developing our Lambda function textractScan:

    // exported so the handler path in serverless.yml can resolve it
    export const textractScan = async (event: AWSLambda.APIGatewayProxyEvent) => {
      try {
        console.log(event);
        // event.body can be null, so fall back to an empty object
        const body = JSON.parse(event.body ?? '{}');
        const { key } = body;
        const analyzeTextResult = await analyzeText(key);
      } catch (e) {
        console.log(e);
        return {
          statusCode: 500,
          body: JSON.stringify({ message: 'ERROR_ANALYZING_DOCUMENT' }),
        };
      }
    };
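
The Textract response for detectDocumentText has a Blocks array, where each block carries a BlockType (PAGE, LINE, or WORD), the detected Text, and a Confidence score. As an illustration of reading it by hand, here is a minimal sketch that keeps only the LINE blocks at or above a confidence threshold. The TextBlock interface is a simplified local stand-in for the SDK type, extractLines is a hypothetical helper, and the sample data is made up.

```typescript
// Simplified local view of a Textract block; the SDK type has more fields.
interface TextBlock {
  BlockType?: string;
  Text?: string;
  Confidence?: number;
}

// Returns the text of every LINE block whose confidence meets the threshold.
function extractLines(blocks: TextBlock[] = [], minConfidence = 0): string[] {
  return blocks
    .filter((b) => b.BlockType === 'LINE' && (b.Confidence ?? 0) >= minConfidence)
    .map((b) => b.Text ?? '');
}

// Made-up sample shaped like a detectDocumentText result.
const sampleBlocks: TextBlock[] = [
  { BlockType: 'PAGE' },
  { BlockType: 'LINE', Text: 'Invoice 123', Confidence: 99.1 },
  { BlockType: 'WORD', Text: 'Invoice', Confidence: 99.3 },
  { BlockType: 'LINE', Text: 'Total: 40', Confidence: 62.5 },
];

console.log(extractLines(sampleBlocks, 90)); // [ 'Invoice 123' ]
```

This works for plain lines of text, but anything more structured quickly needs extra bookkeeping across blocks.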

Now that we have completed the function, we can use it to extract text from images. The result in analyzeTextResult contains an array of Block objects holding the text detected in the document, but extracting the actual data we need from this object by hand is time consuming.
That is what aws-textract-json-parser is for: this library parses the JSON response from AWS Textract into a more usable format, which you can then insert into DynamoDB:

    import { DynamoDB } from 'aws-sdk';
    // assumption: the parser is the package's default export; check its README
    import AWSJsonParser from 'aws-textract-json-parser';

    export const textractScan = async (event: AWSLambda.APIGatewayProxyEvent) => {
      try {
        console.log(event);
        const body = JSON.parse(event.body ?? '{}');
        const { key } = body;
        const analyzeTextResult = await analyzeText(key);
        const parsedData = await AWSJsonParser(analyzeTextResult);
        console.info(parsedData);
        const rawData = parsedData.getRawData();
        console.info(rawData);
        if (rawData.length === 0) {
          console.error('no text detected');
          return {
            statusCode: 400,
            body: JSON.stringify({ message: 'INVALID_DOCUMENT' }),
          };
        }
        // someData stands for the other attributes of your record
        const payload = {
          ...someData,
          textractData: rawData,
        };
        // TABLE_NAME is a placeholder for your DynamoDB table
        await new DynamoDB.DocumentClient()
          .put({ TableName: 'TABLE_NAME', Item: payload })
          .promise();
        // ....
      } catch (e) {
        console.log(e);
        return {
          statusCode: 500,
          body: JSON.stringify({ message: 'ERROR_ANALYZING_DOCUMENT' }),
        };
      }
    };
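
DynamoDB.DocumentClient's put expects an object with TableName and Item. To make the shape of the stored record concrete, here is a hedged sketch of building those parameters as a pure function. The ScanRecord fields (id, imageKey, createdAt) and the buildPutParams helper are assumptions for illustration; the original post does not say which attributes someData holds.

```typescript
// Hypothetical record layout; adapt the attributes to your own table.
interface ScanRecord {
  id: string;
  imageKey: string;
  textractData: unknown;
  createdAt: string;
}

// Builds the { TableName, Item } object that DocumentClient.put expects.
function buildPutParams(
  tableName: string,
  imageKey: string,
  textractData: unknown,
  now: Date = new Date(),
): { TableName: string; Item: ScanRecord } {
  const item: ScanRecord = {
    id: imageKey, // assumption: the S3 key doubles as the partition key
    imageKey,
    textractData,
    createdAt: now.toISOString(),
  };
  return { TableName: tableName, Item: item };
}
```

Keeping this step free of SDK calls makes it easy to unit test; the handler would pass the result to put(...).promise().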

So now you can implement many scenarios that require the user to take a photo, extract its data, and associate that data with their profile in a few simple steps.


Original Link: https://dev.to/hadinasser24/aws-textract-and-react-native-m7j
