Open Source + Free Tier Services: A Cornucopia for Hobbyists

I am constantly amazed by the amount, variety, and quality of open source software that is available, as well as the free (for small volume users) commercial Information and Communications Technology (ICT) services that exist. Open source has, in many domains, proven itself to be a reliable and cost-effective means of developing and improving software. I think an economic surplus for commercial companies has also contributed to this movement, as it has contributed to the free services that are available. (Although no doubt the free services are also intended as a gateway to the companies’ paid services.)

A little application I wrote the other evening provides an excellent example of this in action. Once a day, the program checks inventory at our closest liquor stores for a certain hard-to-find, rarely in-stock bourbon. If that particular bourbon is in stock at either store, it sends out a text alert. I wrote the application in Python, an open source programming language. It’s among the most popular programming languages in the world (and almost all its competitors are also open source). It used to be that companies paid large sums of money for commercial language compilers, while companies like Microsoft and Borland sold programming languages for PCs at hundreds of dollars per copy. Now programming language software is available for free, almost all of it open source, and many software development environments are also free, with many of those open source as well.

The total volume of code involved in this application to open a web page, render it, scrape it for content, and send a text message with the result is huge, yet I only needed to write 34 lines of code! How? By using open source libraries that others had developed. These libraries (reusable modules with well-defined interfaces) did all the heavy lifting; I just needed to write some code to glue them together in a certain way and provide the key parameters. There are currently over 100,000 third-party libraries available for free as open source in the Python Package Index repository. Cost to use any of these packages? $0.00. Similar libraries exist for other programming languages, such as PHP, C, and JavaScript.

As mentioned above, the program sends out a text message if the inventory is greater than zero at one of the local stores. How does it do this? Well, it uses the free twilio.rest library provided by Twilio. Twilio provides automated SMS (text), voice call, and other telecommunications services that you access through your own software programs. They are a commercial company that charges for their services, but you can test it out for free, and according to some folks, very low volume users don’t seem to ever exhaust their free account. This is an example where hobbyists like myself can, at no cost, take advantage of services developed for larger commercial customers.

At first, I ran this application on my PC at home, using the Windows Task Scheduler to have it automatically run each morning. But a) the Windows scheduler is both a pain and a bit flaky and b) it won’t run if my PC is powered off. So, I ported my application over to run “in the cloud” on Amazon Web Services, using what Amazon calls their Lambda service. As Amazon describes it, “AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume – there is no charge when your code is not running… upload your code and Lambda takes care of everything required to run and scale your code with high availability.” As with Twilio, this is a commercial fee-for-service offering, except that you can use it for free if you call your services fewer than one million times per month and use less than 3.2 million seconds of compute time per month (at the smallest memory setting).
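
To put those limits in perspective, here is a rough back-of-the-envelope check. It assumes the free compute allowance is 400,000 GB-seconds per month, which is where the 3.2 million seconds figure comes from at a 128 MB memory setting. The 60-second run time and 512 MB memory size for a once-a-day job are illustrative guesses, not measurements.

```python
FREE_REQUESTS_PER_MONTH = 1_000_000
FREE_GB_SECONDS_PER_MONTH = 400_000  # assumed free-tier compute allowance


def free_seconds(memory_mb):
    """Seconds of free compute per month at a given memory size."""
    return FREE_GB_SECONDS_PER_MONTH / (memory_mb / 1024)


def monthly_usage(runs_per_day, seconds_per_run, memory_mb, days=31):
    """Return (requests, GB-seconds) used by a scheduled job in one month."""
    requests = runs_per_day * days
    gb_seconds = requests * seconds_per_run * memory_mb / 1024
    return requests, gb_seconds


# Hypothetical job: one run a day, 60 seconds per run, 512 MB of memory.
requests, gb_seconds = monthly_usage(runs_per_day=1, seconds_per_run=60, memory_mb=512)
print(free_seconds(128))     # free seconds per month at 128 MB
print(requests, gb_seconds)  # what the daily job would consume
```

Under those assumptions a daily job uses 31 requests and 930 GB-seconds a month, a tiny fraction of either allowance.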

Finally, all those servers at Amazon need to run an operating system. What operating system do they use? Mostly Linux. While Windows and macOS may dominate the PC world, the overwhelming majority of servers in the world run Linux. And like Python, Linux is free and open source. Linux is also at the heart of the Android operating system used on many phones and can be found running smart TVs and the infotainment systems in many cars.

Software is fundamentally different from hardware, so you can’t fully duplicate this environment for physical things (although there’s also a much smaller Open Hardware movement that open sources the design of hardware, rather than patenting it or keeping it a trade secret).  Nevertheless, it’s amazing how much can be done solely with the combination of open source software and free services!

Using Geek Power for Good: Better Living Through Code Edition

Buffalo Trace bourbon

Bourbon has become a hot commodity, and as it takes years to make, the supply can’t quickly ramp up to match demand. Buffalo Trace is one of many brands that have become quite popular, making it hard to find. The clerk at a local Virginia ABC store told my wife that they get a small shipment in and it “flows out like a river”, and is gone in a day. My wife found that you could check inventory of local stores online, and asked me to write a script to check it. Starting tomorrow morning, the “Buffalo Hunter” script will run once a day, check the inventory at our two closest stores, and send her a text if there are any bottles in stock.

Version 1 was a fun 1-day project. I had to learn some new tricks, as the page uses client-side JavaScript and I hadn’t used Twilio to send texts before, but I got it all working well. Some time later I decided to port it to the cloud, using the AWS Lambda service, which had a short but steep learning curve.

The website uses JavaScript to generate a dynamic page, so I couldn’t simply use something like Beautiful Soup to parse the HTML. Instead I used Selenium, driving headless Chrome on my local version and switching over to PhantomJS on AWS. I switched to PhantomJS because any executables you upload have to be compiled to run in the AWS environment, and I found a precompiled version of PhantomJS on the web but didn’t find the same for Chrome.
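
A small sketch shows why a static parser isn't enough. The markup below is hypothetical (it is not the real page), and it uses Python's built-in HTML parser rather than Beautiful Soup to stay self-contained. The point is only that the inventory cell is empty in the HTML the server sends, because a script fills it in later.

```python
from html.parser import HTMLParser

# Hypothetical markup, as a server might send it before any script runs.
STATIC_HTML = """
<table>
  <tr><td data-title="Inventory"></td></tr>
</table>
<script>/* client-side code fills in the Inventory cell later */</script>
"""


class InventoryCellParser(HTMLParser):
    """Collects the text inside <td data-title="Inventory"> cells."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("data-title") == "Inventory":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.text.append(data)


parser = InventoryCellParser()
parser.feed(STATIC_HTML)
inventory_text = "".join(parser.text).strip()
print(repr(inventory_text))  # empty string: no number until the script has run
```

A real browser engine, which is what Selenium drives, runs the script first, so the cell has a value by the time it is read.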

There was one other “gotcha” I ran into. I use Windows. While I had found a correctly compiled PhantomJS executable, when I zipped it along with the other files to upload, it lost its permissions settings. I could have booted up in Linux, but instead I installed the Linux subsystem that’s available for Windows 10 and used bash to zip the files up. That ended up working fine. You also need to change the directory for the PhantomJS log to the /tmp/ folder that AWS gives you write access to.
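
An alternative to zipping under Linux is to build the archive with Python's zipfile module and set the Unix permission bits explicitly. This is only a sketch of that idea, not what I did, and it has not been verified against Lambda. The file name and placeholder contents are stand-ins for the real binary.

```python
import io
import stat
import zipfile


def add_executable(archive, arcname, data, mode=0o755):
    """Add `data` to an open ZipFile as a regular file with Unix permission bits."""
    info = zipfile.ZipInfo(arcname)
    info.create_system = 3  # 3 = Unix, so unzip tools honor the mode bits
    info.external_attr = (stat.S_IFREG | mode) << 16  # mode lives in the high 16 bits
    info.compress_type = zipfile.ZIP_DEFLATED
    archive.writestr(info, data)


# Build a small archive in memory with a placeholder payload.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    add_executable(archive, "phantomjs", b"placeholder for the real binary")

# Read the archive back and recover the permission bits that were stored.
with zipfile.ZipFile(buffer) as archive:
    stored_mode = stat.S_IMODE(archive.getinfo("phantomjs").external_attr >> 16)
print(oct(stored_mode))
```

Because the mode is written into the archive itself, it does not matter that the Windows file system has no executable bit to copy.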

In version 1, I handed off the final processed web page to Beautiful Soup because I had used Beautiful Soup’s parsing before and hadn’t used Selenium’s. You can easily hand off the processed web page from Selenium to Beautiful Soup (see the commented out line that starts page2Soup in the code below). When I moved to Amazon, I also figured out how to do the page scraping in Selenium, so that I didn’t need Beautiful Soup any more. The concept’s simple, but I didn’t find a good reference for find_element_by_css_selector in Python, so it took a little trial and error. If you’re interested, here’s the version of the code that runs on AWS:

Buffalo Hunter.py


import logging
import datetime
import time
# from bs4 import BeautifulSoup
from selenium import webdriver  # written against Selenium 3.x; PhantomJS support was removed in Selenium 4
from twilio.rest import Client

accountSID = 'SID Here'
authToken = 'token here'
# Store numbers mapped to the names used in the text message
stores = {'219': 'Old Courthouse', '231': 'Maple Ave.'}

def myhandler(event, context):
	driver = None
	try:
		# The local Windows version used Chrome instead of PhantomJS:
		# options = webdriver.ChromeOptions()
		# options.add_argument('headless')
		# driver = webdriver.Chrome('c:/program files (x86)/chromedriver.exe')
		# Start the browser inside the handler so a reused container never gets a driver that has already quit.
		# The PhantomJS log goes to /tmp, the folder AWS gives you write access to.
		driver = webdriver.PhantomJS(executable_path="/var/task/phantomjs", service_log_path='/tmp/ghostdriver.log')
		results = ''
		success = 0
		for store in stores:
			# Make this store "my store", then load the product page
			driver.get('https://www.abc.virginia.gov/stores/'+store)
			make_my_store = driver.find_element_by_id('make-this-my-store')
			make_my_store.click()
			time.sleep(5)  # give the client-side JavaScript time to finish
			driver.get('https://www.abc.virginia.gov/products/bourbon/buffalo-trace-bourbon#/product?productSize=0')
			time.sleep(5)
			element = driver.find_element_by_css_selector('td[data-title="Inventory"]')
			# Beautiful Soup alternative from version 1:
			# page2Soup = BeautifulSoup(driver.page_source, 'lxml')
			# element = page2Soup.find("td", {"data-title": "Inventory"})
			inventory_value = element.text
			# != works in both Python 2 and 3; the old <> operator is Python 2 only
			if inventory_value != '0': success = 1
			results = results+stores[store]+' has '+inventory_value+' bottles of Buffalo Trace. '
		# Send results if inventory is not 0 at either store
		if success == 1:
			results = 'Success! ' + results
			twilioCli = Client(accountSID, authToken)
			myTwilioNumber = 'myPhoneNumberHere'
			destinationCellNumber = 'destinationCellNumberHere'
			message = twilioCli.messages.create(body=results, from_=myTwilioNumber, to=destinationCellNumber)
	except Exception:
		# Log the failure with a timestamp and the full traceback
		logging.error('%s Buffalo Hunter run failed', datetime.datetime.now(), exc_info=True)
	finally:
		# Always shut the browser down, even if scraping or texting failed
		if driver is not None:
			driver.quit()
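
One possible refinement, sketched here rather than taken from the running script, is to pull the "should we send a text?" decision out of the handler into a small function that can be tested without a browser or a Twilio account. The function name build_alert is hypothetical, and treating a blank inventory cell as zero is an assumption about how the page might behave.

```python
def build_alert(inventory_by_store, product="Buffalo Trace"):
    """Return the alert text, or None when every store reports zero."""
    parts = []
    in_stock = False
    for store_name, raw_value in inventory_by_store.items():
        value = raw_value.strip()
        # Assumption: a blank cell means nothing in stock, same as '0'.
        if value not in ("", "0"):
            in_stock = True
        parts.append("%s has %s bottles of %s. " % (store_name, value or "0", product))
    if not in_stock:
        return None
    return "Success! " + "".join(parts)


print(build_alert({"Old Courthouse": "0", "Maple Ave.": "3"}))
print(build_alert({"Old Courthouse": "0", "Maple Ave.": "0"}))
```

The handler would then only scrape the values, call the function, and send a text when the result is not None.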

Yorick 2.0: The Personality Split

Introduction

When Yorick was first brought to life, he had Alexa’s voice. A lot of his charm was the incongruity between his appearance and his voice. At the same time, a number of folks asked about having a creepier voice and I wanted to try to do that for this Halloween.  An update to the AlexaPi project added support for the SoX audio playback handler as an alternative to VLC. SoX has support for audio effects, so it became possible to change Yorick’s output voice. I didn’t want to lose Alexa’s voice, so I edited the AlexaPi code so that it would recognize both “Alexa” and “Yorick” as trigger words, with the output sound depending on which trigger word you used. As a result, Yorick now responds either as Alexa or with his own voice.

Just like Elliot on Mr. Robot, Yorick now has a split personality.

Conversations

I talked to Yorick, aka Alexa, a bit about Halloween:

It turns out that Yorick is a baseball fan and was rather disappointed that the Washington Nationals aren’t in the World Series. A while back, I asked him about going to one of the playoff games:

Technical Notes

AlexaPi uses PocketSphinx for recognizing the trigger word. The original code is set up to recognize a single trigger word or phrase, which you can easily change in a YAML configuration file. However, PocketSphinx can recognize multiple keywords or phrases selected from a Python list. Some editing of the AlexaPi source code was needed in order to change the trigger from a single variable to a list. Similarly, the code was modified slightly so that once a trigger word is recognized, it checks which word was used. If the trigger word is “Yorick”, it changes the pitch and speed of the audio output.
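
The dispatch idea can be sketched roughly as follows. This is not the AlexaPi code: the names VOICE_EFFECTS and playback_command are hypothetical, and the pitch and speed values are placeholders rather than the settings Yorick uses. It only builds the command line for SoX's play program and does not run it.

```python
# Hypothetical effect settings per trigger word; the values are placeholders.
VOICE_EFFECTS = {
    "alexa": [],                                  # unmodified voice
    "yorick": ["pitch", "-400", "speed", "0.9"],  # lower and slower
}


def playback_command(trigger, audio_path):
    """Build a SoX `play` command line for the trigger word that was heard."""
    effects = VOICE_EFFECTS.get(trigger.lower(), [])
    return ["play", "-q", audio_path] + effects


print(playback_command("Yorick", "/tmp/response.mp3"))
# → ['play', '-q', '/tmp/response.mp3', 'pitch', '-400', 'speed', '0.9']
```

Keeping the effects in a dictionary keyed by trigger word means a third personality would only need one more entry.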

I used version 1.5 of AlexaPi. This and previous versions had a problem in that the temporary file names were taken from the response code that the Alexa voice service returned. These codes sometimes included characters that are illegal in file names, or were too long to be used as a file name. I patched these problems myself (later versions of AlexaPi have fixed them).
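
One general way to handle this kind of problem is sketched below. It is neither my patch nor the fix that later AlexaPi versions adopted; the function name, the 64-character limit, and the .mp3 suffix are all assumptions made for illustration.

```python
import hashlib
import re

MAX_NAME_LENGTH = 64  # assumed limit, well under common filesystem maximums


def safe_temp_name(response_id, suffix=".mp3"):
    """Turn an arbitrary identifier into a short, filesystem-safe file name."""
    # Replace anything other than letters, digits, underscore, and hyphen.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", response_id)
    if cleaned == response_id and len(cleaned) <= MAX_NAME_LENGTH:
        return cleaned + suffix
    # Append a short hash so identifiers that clean up to the same text stay distinct.
    digest = hashlib.sha1(response_id.encode("utf-8")).hexdigest()[:12]
    return cleaned[:MAX_NAME_LENGTH - 13] + "_" + digest + suffix


print(safe_temp_name("abc"))
print(safe_temp_name("<urn:example/response?id=1>"))
print(len(safe_temp_name("x" * 500)))
```

Identifiers that are already safe pass through unchanged, while anything altered or truncated gets a hash suffix to avoid collisions.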

In addition, the servo motion routines had to be modified slightly, as versions 1.5 and later of AlexaPi begin streaming questions to the Alexa voice service as they are asked, rather than waiting until the question is finished. This results in a faster response time.