Building a CLI Data Gem App and Scraping with Nokogiri

Can I say wow??!!! I just built my first project, a command line interface app that scraped data from a website and provided a list of key information on the page.

Slow and steady wins the race, right? I hope so because I really took my time to build this project.

So why did I choose to scrape a website that listed the best ice cream parlors?

Ice cream is amazing!!! There are literally tons of options, from salted caramel and matcha green tea to the old faithfuls, vanilla and chocolate.

[Image: Mikey Likes It]

Naturally, I wanted to know where to find the best ice cream parlors in NY and the world.

Check the website before scraping

Before I could start scraping websites, I used curl to check them for excessive JavaScript. Nokogiri only parses the raw HTML a server returns; it doesn't execute JavaScript. So if a page builds most of its content with scripts, the CSS selectors I needed for my code might not be in the source at all, or might be buried in a lot of noise.
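Beyond curl, a quick sanity check is to parse the page with Nokogiri and see whether a selector you care about actually matches anything in the raw HTML. This is just a sketch with a placeholder URL and selectors, not the actual site from my project:

```ruby
require 'open-uri'
require 'nokogiri'

# Fetch the raw HTML. No JavaScript runs here, so this is exactly
# what Nokogiri will see when it scrapes.
doc = Nokogiri::HTML(URI.open("https://example.com/best-ice-cream"))

# If the page builds its content with JavaScript, selectors like this
# come back empty even though the content shows up in a browser.
puts doc.css("h2").map(&:text)
puts "script tags found: #{doc.css('script').size}"
```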

Local environment vs. Learn IDE

I chose to work in my local environment rather than the Learn IDE. I was a bit confused setting up the app via the IDE, so I decided this was the perfect time to get acquainted with my terminal, Git/GitHub, etc.

It was definitely challenging working in the terminal, but once I learned a few things, I was on my way to coding my app.

Since I’m a Mac user, I used the following steps to make sure my computer was set up properly. Please note that Macs come with a system version of Ruby (and other developer tools) pre-installed, and you shouldn’t modify that system install directly, which is exactly why a version manager comes in handy.

Step 1: Download Homebrew

Mac computers are great, but they don’t come with all the goodies you find in Homebrew, “the missing package manager for macOS.” You can install it with the one-line command on brew.sh.

Step 2: Install packages

On the Homebrew site, scroll down to Homebrew packages, then click “browse all formulae.”

CTRL-F for rbenv, a Ruby version manager, and install it with brew install rbenv. Once that is installed, you can install whichever version of Ruby you’d like (for example, rbenv install 2.3.0, then rbenv global 2.3.0 to make it the default). I installed Ruby 2.3.0.

Install bundler-completion so that bash tab-completion in the terminal recognizes Bundler commands.

Once I had these installed, I was able to use Bundler to install the other gems I needed, like Nokogiri and Pry.

Install Bundler and Build Your Gem

Bundler provides simple documentation for installing it on your Mac (gem install bundler).

I also used a video walkthrough to see how it was done and to navigate potential hiccups in the terminal. When I ran bundle gem best_icecream --test --coc --mit, all the files I needed were created and the repository was set up: --test scaffolds a test framework, --coc adds a CODE_OF_CONDUCT.md, and --mit adds an MIT license file.

Once I built my gem, I was ready to link my gem to my GitHub repository and begin building my CLI app.

Building the CLI Class

I followed Avi’s Daily Deals CLI Gem walkthrough video and made adjustments to my code accordingly.

What I like about Avi’s walkthrough video is that he explains the thought process behind the steps he took to build the CLI app. I could have started by scraping the website, but if the command line interface isn’t working first, I won’t know whether a bug lives in the scraper or in the CLI, or where to begin fixing it. Building the interface against fake data first narrows down where problems can occur.

I built the CLI class and hooked it up by adding the line BestIcecream::CLI.new.call to ./bin/best-icecream. I stubbed out the CLI class with some fake data to make sure the interface was working. I added four methods: call, list_parlors, menu, and goodbye. The call method collaborated with BestIcecream::Parlor.scrape_parlors and then invoked the other three methods.
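For reference, the executable itself can stay tiny. Here’s a sketch of what ./bin/best-icecream might look like; the require lines are my assumption based on Bundler’s standard gem layout:

```ruby
#!/usr/bin/env ruby

require "bundler/setup"    # assumes the standard Bundler-generated setup
require "best_icecream"    # loads lib/best_icecream.rb, which loads the CLI

BestIcecream::CLI.new.call
```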

The first level of the CLI was list_parlors; the second level asked the user to enter a number from the list, or to type “list” or “exit.” Any other input re-prompted the user with those same options, and, of course, typing “exit” would exit the program.
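Putting that together, here’s a rough sketch of how a CLI class like this might look. The method names come straight from the project; the bodies are my own reconstruction, so the real prompts and formatting surely differ:

```ruby
module BestIcecream
  class CLI
    def call
      Parlor.scrape_parlors   # collaborate with the Parlor class to load data
      list_parlors
      menu
      goodbye
    end

    def list_parlors
      Parlor.all.each.with_index(1) do |parlor, i|
        puts "#{i}. #{parlor.name} - #{parlor.location}"
      end
    end

    def menu
      input = nil
      until input == "exit"
        puts "Enter a number from the list, type 'list', or 'exit':"
        input = gets.strip.downcase
        if input.to_i.between?(1, Parlor.all.length)
          parlor = Parlor.all[input.to_i - 1]
          puts "#{parlor.name} | #{parlor.location} | #{parlor.phone} | #{parlor.url}"
        elsif input == "list"
          list_parlors
        elsif input != "exit"
          puts "Please enter a number on the list, type 'list', or 'exit'."
        end
      end
    end

    def goodbye
      puts "Enjoy the ice cream!"
    end
  end
end
```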

The Parlor Class

The Parlor class has two jobs: instantiating new objects and scraping the website with Nokogiri. I built the self.scrape_parlors method first.

As I stated previously, when I curled the website a lot of JavaScript came back along with the markup. Nokogiri doesn’t run any of it, but I still had to crawl through the noise to find the CSS selectors I would use in the project.

I used binding.pry to find the selectors needed for the name, location, phone, and url attributes.

Through iteration and the Pry console, each selector was located. I worked with a tech coach to further organize the code, adding an initialize method to instantiate new objects with name, location, phone, and url; self.all and save methods; and the class variable @@all to hold all the instances of the class.
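Here’s a sketch of the whole Parlor class along those lines. The attribute names, save, self.all, and the @@all class variable come from the project; the URL and CSS selectors are placeholders, since the real ones came out of Pry sessions against the live page:

```ruby
require 'open-uri'
require 'nokogiri'

module BestIcecream
  class Parlor
    attr_accessor :name, :location, :phone, :url

    @@all = []

    def initialize(name, location, phone, url)
      @name     = name
      @location = location
      @phone    = phone
      @url      = url
      save
    end

    def save
      @@all << self
    end

    def self.all
      @@all
    end

    def self.scrape_parlors
      # Placeholder URL: the post names ny.eater.com but not the exact page.
      doc = Nokogiri::HTML(URI.open("https://ny.eater.com/..."))

      # A binding.pry here is how the real selectors were hunted down;
      # the ones below are placeholders.
      doc.css(".card").each do |card|
        new(
          card.css(".card__name").text.strip,
          card.css(".card__location").text.strip,
          card.css(".card__phone").text.strip,
          card.css("a").attribute("href")&.value
        )
      end
    end
  end
end
```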

Conclusion

This was definitely a challenging project. The hardest part was finding the best website to scrape.

After I finished the CLI class and began to build the Parlor class (previously called the Icecream class), I had some difficulty scraping my original website. I realized a lot of the websites I had previously reviewed listed the name and location on the same line, so I found ny.eater.com, which had the name and location on different lines with different selectors.

This website was also challenging to scrape because I couldn’t retrieve the phone selector, and I wasn’t able to retrieve all 16 ice cream parlors listed. The tech coach I worked with (David K.) suggested we find where the markup was broken or malformed so we could separate out the “cards” we needed. Once we did that, we sliced off the last card in the iteration method.
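In code, that fix can look something like this sketch (the selector is a placeholder, and the real card-splitting logic may have been more involved):

```ruby
def self.scrape_parlors
  doc = Nokogiri::HTML(URI.open("https://ny.eater.com/..."))  # placeholder URL

  # Drop the final, malformed "card" so the iteration only sees clean markup.
  doc.css(".card")[0..-2].each do |card|
    # build a Parlor from each well-formed card here
  end
end
```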

A possible refactor for this CLI app would be to create a separate Scraper class and feed that information into the Parlor class, which would then find and instantiate the new objects.
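That refactor might look something like the sketch below, where a hypothetical Scraper class only fetches and parses, returning plain hashes for Parlor to turn into objects:

```ruby
require 'open-uri'
require 'nokogiri'

module BestIcecream
  # Hypothetical Scraper class: it only fetches and parses, and knows
  # nothing about Parlor objects. The URL and selectors are placeholders.
  class Scraper
    def self.scrape_parlors(url)
      Nokogiri::HTML(URI.open(url)).css(".card").map do |card|
        {
          name:     card.css(".card__name").text.strip,
          location: card.css(".card__location").text.strip,
          phone:    card.css(".card__phone").text.strip,
          url:      card.css("a").attribute("href")&.value
        }
      end
    end
  end
end

# Parlor would then just turn each hash into an object:
#   BestIcecream::Scraper.scrape_parlors(url).each do |attrs|
#     BestIcecream::Parlor.new(attrs[:name], attrs[:location],
#                              attrs[:phone], attrs[:url])
#   end
```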