October 26, 2020 10:35 pm GMT

Ruby CLI application: scraping, object relationships and single source of truth

There's an exciting aspect to building a CLI application, or rather to the CLI in general. For an average user like myself, who hasn't a clue about the inner workings of a computer and has been confined to the comfort of GUIs, even just typing commands into the Terminal can make you feel like you've become some sort of notorious hacker in a spy movie. So successfully building an entire CLI-based application from scratch was a truly rewarding experience.

The Goal

I decided to build a very basic application for Premier League football (or soccer, in some countries) that shows the current league standings, all the clubs in the league, information for each individual club, and perhaps even stats for individual players (although I never got that far). It started off as a breeze. I thought I had a pretty good grasp of the concepts involved, how I was going to obtain the data by scraping, and so on. To be fair, I had a great guide to follow: a video demo by my Flatiron bootcamp instructor. All of it made so much sense while I was watching the demo, but as with most things in life, it's one thing to watch someone do it and another thing entirely to do it yourself.

My first despair

Scraping consumed the bulk of my time. Not fully understanding how to use Nokogiri's methods effectively was my downfall. I was fixated on chaining .css calls when, as I later discovered, I could have grabbed the same data much more easily with other methods such as .search (which accepts both CSS and XPath expressions), especially when combined with a single selector that joins ids and classes directly to tags. For instance, a line in my scraper class that grabs a piece of data like so:

.css('.tableBodyContainer.isPL').css('tr:not(.expandable)').css('.long').text

could have just as easily accomplished the same thing using:

.search('span.long').text

My second despair

I knew two important rules about building relationships going into this project. First, the objects need to maintain a single source of truth across classes, which is done by making the object that belongs to another object accountable for holding the relationship. Second, once that's done, the remaining relationships should be established only through methods. Simple enough, right? The only problem was that this seemed much easier in my head when the relationship was A -< B >- C (B belongs to both A and C) as opposed to what I actually had, which was A -< B -< C (A has many Bs, and each B has many Cs). So instead of B keeping track of both A and C, I needed B to be accountable for A and C to be accountable for B, and then somehow build methods that let A interact with C and C interact with A. After building and rebuilding my classes over and over, and hours of rubber duck debugging, I got it done.

league = League.find_or_create_by_name(league_name)
new_club = Club.new(name, league, position, matches_played, matches_won, matches_drawn, matches_lost, goals_for, goals_against, goal_diff, points)
Player.new(new_club, player_number, player_name, player_position)

My Club class was keeping track of my League class and my Player class was keeping track of my Club class.
Then I went on to build methods in my League class that could communicate with my Player class, like so:

def clubs
  Club.all.select { |club| club.league == self }
end

def players
  Player.all.select { |player| clubs.include?(player.club) }
end

And then finally an instance method inside my Player class to access my League class:

def league
  # find returns the single matching League; select would return an array
  League.all.find { |league| league.players.include?(self) }
end
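
The snippets above leave out the class scaffolding, so here is a minimal, self-contained sketch of the whole A -< B -< C chain. It rests on assumptions: the constructors take far fewer arguments than the calls shown earlier, and the @@all registries and the body of find_or_create_by_name are illustrative guesses at how such helpers are commonly written, not the project's actual code. The club and player names are placeholders.

```ruby
class League
  attr_reader :name

  @@all = []

  def initialize(name)
    @name = name
    @@all << self
  end

  def self.all
    @@all
  end

  # Assumed implementation: reuse an existing league or build a new one.
  def self.find_or_create_by_name(name)
    all.find { |league| league.name == name } || new(name)
  end

  # A league stores no clubs; it asks Club, which owns the relationship.
  def clubs
    Club.all.select { |club| club.league == self }
  end

  # Reaches players only through the clubs that belong to this league.
  def players
    Player.all.select { |player| clubs.include?(player.club) }
  end
end

class Club
  attr_reader :name, :league

  @@all = []

  def initialize(name, league)
    @name = name
    @league = league # single source of truth for club <-> league
    @@all << self
  end

  def self.all
    @@all
  end

  def players
    Player.all.select { |player| player.club == self }
  end
end

class Player
  attr_reader :name, :club

  @@all = []

  def initialize(club, name)
    @club = club # single source of truth for player <-> club
    @name = name
    @@all << self
  end

  def self.all
    @@all
  end

  # Walks up the chain instead of scanning every league.
  def league
    club.league
  end
end

premier = League.find_or_create_by_name('Premier League')
example_club = Club.new('Example FC', premier)
example_player = Player.new(example_club, 'Player One')

puts premier.players.map(&:name).inspect
puts example_player.league.name
```

Note that Player#league here goes through club.league rather than searching League.all. That is a swapped-in alternative to the method shown above; both should give the same league as long as every player belongs to exactly one club.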

Final thoughts

There's something I have yet to figure out, and that's a way to delay my deeper-level scrapes until they're needed, instead of scraping all of my data in advance when the application first runs. The scraping simply takes way too long at the moment. Ideally I would store the URLs for my deeper scrapes as instance variables and pass them into a scraper method as needed, but this is proving a lot more difficult than I had anticipated, primarily because of the way my second scrape is designed and the way my logic is currently built in the CLI class. Hopefully, as I dive deeper into programming and become more skillful, I will find a more elegant solution.
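
One possible direction, sketched under assumptions rather than taken from the project: keep the detail-page URL on each club and run the second scrape only the first time its result is requested, caching it with ||=. LazyClub, FakeScraper, and scrape_club_details are hypothetical names, the URL is a placeholder, and the fake scraper returns canned data so the sketch runs without network access.

```ruby
# Stand-in for a real scraper; it counts calls so the laziness is visible.
class FakeScraper
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # A real version would fetch and parse the page at `url`.
  def scrape_club_details(url)
    @calls += 1
    { url: url, stadium: 'Example Stadium' }
  end
end

class LazyClub
  attr_reader :name, :url

  def initialize(name, url, scraper)
    @name = name
    @url = url # stored up front; nothing is scraped yet
    @scraper = scraper
  end

  # Runs the deeper scrape on first access, then reuses the cached hash.
  def details
    @details ||= @scraper.scrape_club_details(url)
  end
end

scraper = FakeScraper.new
club = LazyClub.new('Example FC', 'https://example.com/clubs/example-fc', scraper)

calls_before = scraper.calls # no scrape has happened at construction time
first = club.details         # triggers the one and only scrape
second = club.details        # served from the cached value

puts "scrapes before access: #{calls_before}"
puts "scrapes after two reads: #{scraper.calls}"
```

With this shape the CLI would call club.details only when the user picks a club, so startup pays for the first-level scrape alone. Whether it fits depends on how the second scrape and the CLI class are currently coupled.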


Original Link: https://dev.to/hiddencilantro/ruby-cli-application-scraping-object-relationships-and-single-source-of-truth-2ni4
