December 2, 2020 05:41 pm GMT

I used Cypress as an Xbox web scraper and I regret nothing

Like many people, I would like to get my hands on the new Xbox. And like everyone but the most diligent online shoppers, I have so far failed in my efforts to do so, and have instead been relentlessly greeted by images such as this one:

[Image: costco]

So what does an enterprising/desperate web developer do? Build their own alert system, of course!

Now, a web scraper is a pretty simple application and generally the ideal use case for this sort of thing. But I wanted to add a visual element to it, to make sure I wasn't getting false positives, and because I tend to prefer user interfaces over bare code (I do work at Stackery, after all). Also, I've been playing with the Cypress test suite for the past month or so, and absolutely love it for frontend testing, so I've been looking for more ways to implement it in my projects.

Now, I should say: I'm guessing this is not exactly the use case the devs at Cypress.io had in mind when they built the browser-based testing library, but as the famous saying goes, "You can invent a hammer, but you can't stop the first user from using it to hit themselves in the head" [1].

So without further ado, let's hit ourselves in the proverbial head and get that Xbox!

Setup: get yourself a Cypress account

Cypress has a very neat feature that allows you to view videos from your automated test runs in their web app. In order to do so, you'll need a free developer account:

  1. Go to the Cypress sign-up page and create an account
  2. Once you're in their dashboard, go ahead and create a new project. Name it "Xbox stock scraper", "testing abomination", or whatever you'd like. I generally name my projects the same as my repo, because that's how my brain works
  3. Now, you'll want to take note of the projectId as well as the record key, as you'll need both later

Create a serverless stack for your scraper

Because store inventories change frequently, we'll want to run our scraper regularly - every hour to start, though it's easy to adjust that up or down as you see fit. Of course, we want to automate these runs, because the whole point is that you have a life and are trying to avoid refreshing web pages on the reg. Is it me, or is this starting to sound like an ideal serverless use case? Not just me? Thought so!

I originally wanted to run the whole thing in a Lambda, but after an hours-long rabbit-hole, I found out that's really, really hard, and ultimately not worth it when a CodeBuild job will do the trick just fine.

I'll be using Stackery to build my stack, so these instructions go through that workflow. This part is optional, as you can also do this in the AWS Console, but I like doing things the easy way, and Stackery is serverless on easy mode [2].

  1. If you don't already have one, create a free Stackery account
  2. Navigate to /stacks, and click the Add a Stack dropdown arrow to select With a new repo. Here's what that looks like for me:
    [Image: xbox-1]

  3. Normally, you'd add resources one by one in the Design Canvas, but as this stack is mainly based on a CodeBuild job and related roles, it's easier to copy-pasta an AWS SAM template like so:

[Image: xbox-compressed]

Under Edit Mode, click Template, clear out the existing template, and paste the following:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  SendMessage:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: !Sub ${AWS::StackName}-SendMessage
      Description: !Sub
        - Stack ${StackTagName} Environment ${EnvironmentTagName} Function ${ResourceName}
        - ResourceName: SendMessage
      CodeUri: src/SendMessage
      Handler: index.handler
      Runtime: nodejs12.x
      MemorySize: 3008
      Timeout: 30
      Tracing: Active
      Policies:
        - AWSXrayWriteOnlyAccess
        - SNSPublishMessagePolicy:
            TopicName: !GetAtt XboxAlert.TopicName
      Events:
        EventRule:
          Type: EventBridgeRule
          Properties:
            Pattern:
              source:
                - aws.codebuild
              detail-type:
                - CodeBuild Build State Change
              detail:
                build-status:
                  - SUCCEEDED
                  - FAILED
                project-name:
                  - cypress-xbox-scraper
          Metadata:
            StackeryName: TriggerMessage
      Environment:
        Variables:
          TOPIC_NAME: !GetAtt XboxAlert.TopicName
          TOPIC_ARN: !Ref XboxAlert
  CodeBuildIAMRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          Effect: Allow
          Principal:
            Service: codebuild.amazonaws.com
          Action: sts:AssumeRole
      RoleName: !Sub ${AWS::StackName}-CodeBuildIAMRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AdministratorAccess
  CypressScraper:
    Type: AWS::CodeBuild::Project
    Properties:
      Artifacts:
        Type: NO_ARTIFACTS
      Description: Cypress Xbox Scraper
      Environment:
        ComputeType: BUILD_GENERAL1_SMALL
        Image: aws/codebuild/standard:2.0
        Type: LINUX_CONTAINER
        PrivilegedMode: true
      Name: cypress-xbox-scraper
      ServiceRole: !Ref CodeBuildIAMRole
      Source:
        BuildSpec: buildspec.yml
        Location: https://github.com/<github-user>/<repo-name>.git
        SourceIdentifier: BUILD_SCRIPTS_SRC
        Type: GITHUB
        Auth:
          Type: OAUTH
  CypressScraperTriggerIAMRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          Effect: Allow
          Principal:
            Service:
              - events.amazonaws.com
          Action: sts:AssumeRole
      Policies:
        - PolicyName: TriggerCypressScraperCodeBuild
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - codebuild:StartBuild
                  - codebuild:BatchGetBuilds
                Resource:
                  - !GetAtt CypressScraper.Arn
      RoleName: !Sub ${AWS::StackName}-CypressScraperTriggerRole
  TriggerScraper:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: rate(1 hour)
      State: ENABLED
      RoleArn: !GetAtt CypressScraperTriggerIAMRole.Arn
      Targets:
        - Arn: !GetAtt CypressScraper.Arn
          Id: cypress-xbox-scraper
          RoleArn: !GetAtt CypressScraperTriggerIAMRole.Arn
  XboxAlert:
    Type: AWS::SNS::Topic
    Properties:
      TopicName: !Sub ${AWS::StackName}-XboxAlert
Parameters:
  StackTagName:
    Type: String
    Description: Stack Name (injected by Stackery at deployment time)
  EnvironmentTagName:
    Type: String
    Description: Environment Name (injected by Stackery at deployment time)

Let's break this down a bit. For those new to serverless, this is an AWS SAM template. While using Stackery means you generally can avoid writing template files, there are a few things worth noting, and one line you'll need to input your own data into.

We'll start with lines 55-74:

  CypressScraper:
    Type: AWS::CodeBuild::Project
    Properties:
      Artifacts:
        Type: NO_ARTIFACTS
      Description: Cypress Xbox Scraper
      Environment:
        ComputeType: BUILD_GENERAL1_SMALL
        Image: aws/codebuild/standard:2.0
        Type: LINUX_CONTAINER
        PrivilegedMode: true
      Name: cypress-xbox-scraper
      ServiceRole: !Ref CodeBuildIAMRole
      Source:
        BuildSpec: buildspec.yml
        Location: https://github.com/<github-user>/<repo-name>.git
        SourceIdentifier: BUILD_SCRIPTS_SRC
        Type: GITHUB
        Auth:
          Type: OAUTH

This is the CodeBuild project that will be created to run Cypress in a Linux container in one of AWS's magical server estates. You'll need to replace the Location value on line 70 with the URL of the Git repo you just created. This also means you may need to authenticate your Git provider with AWS, but I'll walk you through that a bit later.

Line 101 is where you can change how often the scraper runs, and therefore how often you can be alerted. See the AWS documentation on schedule expressions to learn more.
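To make the schedule syntax a bit more concrete, here is a small JavaScript helper that builds a rate() expression for the ScheduleExpression property. It is purely illustrative: rateExpression is a made-up name and nothing in the stack depends on it. The one rule it encodes is that AWS expects a singular unit when the value is 1 and a plural unit otherwise.

```javascript
// Hypothetical helper, not part of the original stack: builds a
// rate expression suitable for the ScheduleExpression property.
const UNITS = ['minute', 'hour', 'day'];

function rateExpression(value, unit) {
  if (!Number.isInteger(value) || value < 1) {
    throw new Error('value must be a positive integer');
  }
  if (!UNITS.includes(unit)) {
    throw new Error(`unit must be one of: ${UNITS.join(', ')}`);
  }
  // Singular unit for 1, plural unit for anything larger
  const suffix = value === 1 ? '' : 's';
  return `rate(${value} ${unit}${suffix})`;
}

console.log(rateExpression(1, 'hour'));    // the value used in the template
console.log(rateExpression(30, 'minute')); // a more impatient alternative
```

If you'd rather run at fixed clock times, schedule expressions also accept a cron(...) form with six fields, for example cron(0 * * * ? *) for the top of every hour.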

Now, if you switch back to Visual mode, you'll see that several resources were just auto-magically populated from the template:

[Image: xbox-3]

They include:

  • TriggerScraper: The CloudWatch event rule that triggers the Cypress CodeBuild job every hour
  • TriggerMessage: The EventBridge Rule that triggers the SendMessage function once the CodeBuild job succeeds or fails
  • SendMessage: The Lambda function that sends the SNS message if Xboxes are back in stock
  • XboxAlert: The SNS topic for sending SMS messages

You can double-click each resource to see its individual settings.

Look at that: a whole backend, and you didn't even have to open the AWS Console!

  4. Hit the Commit... button to commit this to your Git repo, then follow the link below the stack name to your new repo URL, clone the stack locally, and open it in your favorite VSCode (or other text editor, if you must)

[Image: xbox-2]

To the code!

As you can see, Stackery created some directories for your function, as well as an AWS SAM template you'll be able to deploy. Thanks, Stackery!

First we'll want to add Cypress:

  1. From the root of your repo, run npm install cypress --save
  2. Once it's installed, run ./node_modules/.bin/cypress open.

Cypress will create its own directory, with a bunch of example code. You can go ahead and delete cypress/integration/examples and create cypress/integration/scraper.spec.js. Here's what will go in there:

// xbox-stock-alert/cypress/integration/scraper.spec.js
describe('Xbox out-of-stock scraper', () => {
  it('Checks to see if Xboxes are out of stock at Microsoft', () => {
    cy.visit('https://www.xbox.com/en-us/configure/8WJ714N3RBTL', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      }
    });
    cy.wait(5000);
    cy.get('[aria-label="Checkout bundle"]')
      .should('be.disabled')
  });
});

Let's break that down:

  1. Cypress will visit a specific URL - in this case, it's the product page of the Xbox Series X console
  2. The added headers allow the page to actually load without the dreaded ESOCKETTIMEDOUT error (I found this out the hard way, so you don't have to!)
  3. Cypress looks for an element with the aria-label "Checkout bundle" and checks if it's disabled. If it is, the test ends and it is considered successful. If it isn't, the test ends as a failure (but we all know it tried really, really hard)

Now, why the specific "Checkout bundle" element? Well, if you go to the Xbox page in your browser and inspect it, you'll see that it's actually the checkout button that would be enabled were the Xbox in stock:

[Image: Checkout button]

Let's automate this sh*t!

Ok, we've got our test, and we've got a cron timer set to run once an hour. Now we need to add the CodeBuild job that actually runs this test. We also need to add code to our SendMessage function that notifies us if the test failed, meaning the checkout button is enabled and we're one step closer to new Xbox bliss.

Remember that Cypress projectId and record key you noted forever ago? Here's where those come in.

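Create a new file in the root directory called buildspec.yml, add the following, and save it [3]. The build command calls npm run cypress, which assumes your package.json has a "cypress" script that runs cypress run (npm install does not add one for you):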

version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 10
  build:
    commands:
      - npm install && npm run cypress -- --headless --browser electron --record --key <your-record-key>

Open up cypress.json, replace its contents with the following, and save:

{
  "baseUrl": "https://www.xbox.com/en-us/configure/8WJ714N3RBTL",
  "defaultCommandTimeout": 30000,
  "chromeWebSecurity": false,
  "projectId": "<your-projectId>"
}

Next, we'll add the function code that sends an alert should the test fail. Open up src/SendMessage/index.js and replace it with the following:

// xbox-stock-alert/src/SendMessage/index.js
const AWS = require('aws-sdk');
const sns = new AWS.SNS({region: 'us-west-2'});
const message = 'Xbox alert! Click me now: https://www.xbox.com/en-us/configure/8WJ714N3RBTL';
const defaultMessage = 'No Xboxes available, try again later';
exports.handler = async (event) => {
  // Log the event argument for debugging and for use in local development
  console.log(JSON.stringify(event, undefined, 2));
  // If the CodeBuild job was successful, that means Xboxes are not in stock and no message needs to be sent
  if (event.detail['build-status'] === 'SUCCEEDED') {
    console.log(defaultMessage)
    return {
      statusCode: 200,
      body: defaultMessage
    };
  } else if (event.detail['build-status'] === 'FAILED') {
    // If the CodeBuild job failed, that means Xboxes are back in stock!
    console.log('Sending message: ', message);
    // Create SNS parameters
    const params = {
      Message: message, /* required */
      TopicArn: process.env.TOPIC_ARN,
      MessageAttributes: {
        'AWS.SNS.SMS.SMSType': {
          DataType: 'String',
          StringValue: 'Promotional'
        },
        'AWS.SNS.SMS.SenderID': {
          DataType: 'String',
          StringValue: 'XboxAlert'
        },
      },
    };
    try {
      let data = await sns.publish(params).promise();
      console.log('Message sent! Xbox purchase, commence!');
      return {
        statusCode: 200,
        body: data
      };
    } catch (err) {
      console.log('Sending failed', err);
      throw err;
    }
  }
  return {};
};
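The inverted logic here (a failed build is the good news) is easy to get backwards, so it can help to try it out without touching AWS. The sketch below pulls the decision into a pure function and feeds it a trimmed-down sample event. The names decideAction and sampleEvent are my own and are not part of the function above, and the event only contains the fields the handler actually reads.

```javascript
// Hypothetical refactor sketch: these names are illustrative and do
// not appear in the SendMessage function.
const PRODUCT_URL = 'https://www.xbox.com/en-us/configure/8WJ714N3RBTL';

// Decide what to do for a CodeBuild "Build State Change" event.
// SUCCEEDED -> the checkout button was disabled, so nothing is in stock.
// FAILED    -> the Cypress assertion failed, which this project reads as "in stock".
function decideAction(event) {
  const status = event && event.detail && event.detail['build-status'];
  if (status === 'FAILED') {
    return { notify: true, message: `Xbox alert! Click me now: ${PRODUCT_URL}` };
  }
  if (status === 'SUCCEEDED') {
    return { notify: false, message: 'No Xboxes available, try again later' };
  }
  // Any other status (for example IN_PROGRESS) is ignored
  return { notify: false, message: null };
}

// Trimmed-down sample event with only the fields the handler reads
const sampleEvent = {
  source: 'aws.codebuild',
  'detail-type': 'CodeBuild Build State Change',
  detail: { 'build-status': 'FAILED', 'project-name': 'cypress-xbox-scraper' },
};

console.log(decideAction(sampleEvent));
```

One caveat worth keeping in mind: a build can also fail for reasons that have nothing to do with stock, such as a page timeout or a failed npm install, so a FAILED status is a strong hint rather than proof.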

Oh, and while you're at it, you may want to add node_modules and package-lock.json to your .gitignore, unless polluting Git repos is your thing.

Time to deploy this bad boy

Be sure to git add, commit, and push your changes. When deploying, AWS will need access to your Git provider. Follow these instructions to set up access tokens in your account if you've never done that before. (This doc might also come in handy for noobs like me).

If you're using Stackery to deploy, like the smart and also good-looking developer you are, all you need to do is run the following command in the root of your repo:

stackery deploy

This will take a few minutes, during which time you can daydream about how awesome that new Xbox is going to be once it's hooked up to your 4K TV.

[Image: waiting gif]

Done? Ok! Next step: adding your phone number for text alerts.

Can I get your digits?

As I mentioned above, one of the resources created in your stack was the XboxAlert SNS topic. It was created during the deployment, but right now it's not doing anything. Let's change that.

  1. Open the AWS Console, and navigate to the SNS Dashboard
  2. Under Topics, you should see your freshly-minted topic, called something like xbox-stock-alert-<env>-XboxAlert. Click its name
  3. Click the big orange Create subscription button
  4. Fill out the form like so with your mobile number, and click Create subscription again:

[Image: subscription]

You'll need to verify your phone number if you haven't used it with SNS before, and then you're good to go!
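If you'd rather script the subscription than click through the console, the same form maps onto three parameters of the SNS subscribe call. The sketch below only builds and validates those parameters so it runs without AWS credentials. buildSmsSubscription is a hypothetical helper name, and the ARN and phone number are placeholders to swap for your own.

```javascript
// Sketch only: builds the parameters for an SNS SMS subscription.
// buildSmsSubscription is a hypothetical helper name.
function buildSmsSubscription(topicArn, phoneNumber) {
  // SNS expects phone numbers in E.164 format: "+", country code, then digits
  if (!/^\+[1-9]\d{1,14}$/.test(phoneNumber)) {
    throw new Error('Phone number must be in E.164 format, e.g. +15555550123');
  }
  return { Protocol: 'sms', TopicArn: topicArn, Endpoint: phoneNumber };
}

// Placeholder ARN and number; substitute your own values
const params = buildSmsSubscription(
  'arn:aws:sns:us-west-2:123456789012:xbox-stock-alert-dev-XboxAlert',
  '+15555550123'
);
console.log(params);

// With the aws-sdk v2 client used in SendMessage, the call would be:
//   await new AWS.SNS({ region: 'us-west-2' }).subscribe(params).promise();
```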

Testing time

Still in AWS, you should now be able to open up the CodeBuild console and see a new project in there:

[Image: xbox-4]

You'll want to run it manually to make sure everything works before setting and forgetting it, so go ahead and select your project and hit that Start build button. This will take some time as well, but you can tail the CloudWatch logs by clicking the project name and selecting the most recent build run.

Vids or it didn't happen

Hopefully, your build was a success (and if it wasn't, hit me up - I think I hit all the errors while building this out and may be able to help).

But how can you make sure? Well, you can go back to your project in Cypress.io, and see if there's anything in your latest runs. If all went well, you'll be able to watch a video of the headless browser running your spec!

[Image: xbox-5]

And, should that test one day fail, you'll get a notification straight to your phone letting you know that Xbox is right there for the taking. Good luck!

Notes

[1] I actually just made that up, but I imagine the inventor of the hammer said that at some point.
[2] I also just made that up, but that doesn't make it any less true.
[3] A much better way to do this is to use environment parameters stored in AWS Systems Manager Parameter Store to store your record key, but for the sake of brevity my example hard-codes the key. Just make sure your repo is private if you follow my bad example.

Postscript

It's possible to extend the scraper spec to add more retailers, though I ran into issues with a few, such as Walmart's bot detector:

[Image: walmart]

I wasn't able to get these running without errors, but maybe someone else will have more luck and can comment with their solutions:

// xbox-stock-alert/cypress/integration/scraper.spec.js
describe('Xbox out-of-stock scraper - more retailers', () => {
  it('Checks to see if Xboxes are out of stock at Best Buy', () => {
    cy.visit('https://www.bestbuy.com/site/microsoft-xbox-series-x-1tb-console-black/6428324.p?skuId=6428324', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      }
    });
    cy.wait(5000);
    cy.get('[data-sku-id="6428324"]')
      .should('be.disabled')
  });
  it('Checks to see if Xboxes are out of stock at Walmart', () => {
    cy.visit('https://www.walmart.com/ip/Xbox-Series-X/443574645', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      }
    });
    cy.wait(5000);
    cy.get('.spin-button-children')
      .contains('Get in-stock alert');
  });
  it('Checks to see if Xboxes are out of stock at Costco', () => {
    cy.visit('https://www.costco.com/xbox-series-x-1tb-console-with-additional-controller.product.100691493.html', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      },
      pageLoadTimeout: 60000
    });
    cy.wait(5000);
    cy.get('.oos-overlay')
  });
});

Original Link: https://dev.to/annaspies/i-used-cypress-as-an-xbox-web-scraper-and-i-regret-nothing-1bn4
