Will it Dockerize?

you have been spared from an outdated reference
My view of what virtual machines could be was a bit outdated...


Perhaps....?

Getting one of the previous pull requests in the Code Talker repo running on my machine got me thinking about how to make it easier for people to use code that pulls in a variety of dependencies (from packages to full-on programs) and have it work on their machine without a ton of mucking about installing things manually. So far, getting a speech-to-text extension for VS Code to work across multiple platforms (since VS Code itself runs on multiple platforms) was proving to be a logistical challenge. Different platforms needed different dependencies installed in order to pull audio from a machine's microphone, and on top of the simple NodeJS packages that could be pulled in with "npm install", some Python dependencies were needed too.

Musings on cross-platform compatibility started originally when I was poking around in the code for a NodeJS package called mic-stream. The npm summary of the package alleged at least some cross-platform compatibility with different software (sox, or alsa-utils) needing to be installed to support it. I wondered how the code would detect which of these dedicated bits of software it thought it should use. It turned out that NodeJS has a process.platform property you can interrogate to see what system is running your script.
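As a rough illustration of that kind of check, here is a minimal sketch of branching on process.platform. The helper name audioToolFor and the exact platform-to-tool mapping are my own assumptions for illustration, not code taken from mic-stream.

```javascript
// Hypothetical helper: maps a process.platform value to the audio tool that
// would need to be installed. The mapping is an assumption for illustration.
function audioToolFor(platform) {
  switch (platform) {
    case 'linux':
      return 'alsa-utils';
    case 'darwin':
    case 'win32':
      return 'sox';
    default:
      return null; // platform not covered by this sketch
  }
}

console.log(`platform: ${process.platform}, audio tool: ${audioToolFor(process.platform)}`);
```

Note that process.platform is fairly coarse: every Linux distribution reports 'linux', so it says nothing about which distro is running.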

Neat. I figured I might be able to run some bash commands through Node's child_process functionality to install the various things that couldn't be pulled in via "npm install": things like a particular version of Python, or the Python SpeechRecognition library, which needs to be installed with Python's package manager, pip. It seems that the operating system type can also be detected by <a href="https://stackoverflow.com/questions/394230/how-to-detect-the-os-from-a-bash-script">plain old bash scripts</a>, and that you can do <a href="https://stackoverflow.com/questions/17510688/single-script-to-run-in-both-windows-batch-and-linux-bash">a bit of a hack with ":" characters</a> to get POSIX shells to ignore individual lines that a Windows shell won't, so that one file can be run appropriately by both Windows and Unix/POSIX systems. All of that would be a colossal pain, however, because among the various flavours of Unix and Linux distros there are (seemingly) a fair number of different package managers (yum, apt, dpkg, etc.). The script would have to detect each type of OS and hope that the value reported by process.platform consistently corresponds to one package manager present on all such systems, even though process.platform only reports a coarse value such as 'linux' and not the distro. Manually installing all of the required dependencies from a NodeJS or shell script started to seem like a terrible idea.

It was at about this point that my instructor suggested I give Docker a try. When I was first introduced to Docker, I watched a friend use it to serve a web application. You could write some code on your host, then have it built or mounted into a web server inside a container that would run the web app on a certain port number. Since I had only seen Docker containers used to serve web apps, I was unsure how to get one to run a NodeJS app that would take microphone audio from the host machine, transcribe it, and send something usable back to a VS Code extension. Existing code in the Code Talker project relied on NodeJS' child_process.spawn() method to run a Python script that used a speech-to-text library and wrote the transcribed speech to stdout. Since child_process.spawn() can pass the child process' stdout back to the calling script, you can pretty much run whatever code you want as a child process to turn speech into text, then feed the result into your VS Code extension. With this in mind, if a Docker container could return its stdout to the host, that output could in turn be returned through child_process.spawn().
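To make the stdout idea concrete, here is a minimal sketch of collecting a child process' output with child_process.spawn(). The runAndCollect helper is hypothetical, and a tiny Node one-liner stands in for the Python speech-to-text script (or the Docker container) so that the example runs anywhere Node is installed.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: runs a command and resolves with everything it wrote
// to stdout. The child's stdin is ignored and its stderr is passed through.
function runAndCollect(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'inherit'] });
    let output = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject); // e.g. the command could not be started
    child.on('close', (code) => {
      if (code === 0) {
        resolve(output);
      } else {
        reject(new Error(`${command} exited with code ${code}`));
      }
    });
  });
}

// Stand-in child: a Node one-liner instead of the real transcription script.
runAndCollect(process.execPath, ['-e', "console.log('hello from the child')"])
  .then((text) => console.log(`transcript: ${text.trim()}`))
  .catch((err) => console.error(err.message));
```

Swapping the stand-in for a real command only changes the arguments; the calling script still just reads whatever the child prints.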

Not really knowing how to use Docker, I started with a <a href="https://training.play-with-docker.com/#ops">series of interactive labs</a> meant to get beginners acquainted with Docker. The tutorials explain how Docker containers are built from hosted images: pre-configured, lightweight operating system environments on top of which you can run additional commands as you build your own container image using a <a href="https://docs.docker.com/engine/reference/builder/#entrypoint">Dockerfile</a>. In this file, you specify the base OS image to use, run shell commands to install any needed dependencies, and copy over any required files from the host machine. One of the examples I came across was a container that ran a simple program that printed an ASCII art Docker whale to stdout, so this was a proof of concept that a container could write to the host's stdout.

cowsay curse you!
A demo Docker container that will print some ASCII art to the host's stdout. Also, on Ubuntu hosts you may need to modify the file permissions on Docker's config.json file to keep it from complaining when you add new Docker images.


Another handy feature of Docker is that you can mount directories from the host machine when you execute docker run. You can also see further reference on mounting volumes here. Mounting volumes is useful because it allowed me to place the script files for the speech-to-text functionality in a dedicated directory in our code repo. Contributors can edit the code in that directory, and it is then mounted into the Docker container when the VS Code extension launches it.

Of particular use was the ability to mount the host machine's microphone into the Docker container. Here's the line from the NodeJS script that runs the shell command to launch the Docker container while mounting the host's microphone and the directory that holds the speech-to-text scripts:

  spawn.exec('docker run -v /dev/snd:/dev/snd -v ' + mountDir + ' --privileged mic-container npm start', (error, stdout, stderr) => {

The "-v /dev/snd:/dev/snd" part of the command mounts the host's /dev/snd directory to the Docker container's /dev/snd directory. The reason this works is that in Unix systems, hardware devices are treated as part of the directory tree, and the sound devices, including the microphone, are exposed under the /dev/snd directory. Since Docker lets users mount directories into a container to share them with the container's operating system, you can also share the directory that represents your host machine's microphone. This convenient feature is also a problem, however, because I'm not clear on whether you can mount hardware from a Windows host into a Docker container. This article shows the amount of effort that may be required to get a Docker container to grab a USB device from a Windows host, which requires help from VirtualBox to make things work.
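One way to keep a command like this readable is to assemble it as an argument list instead of one concatenated string. The sketch below only builds and prints the arguments; it does not invoke Docker. The buildDockerRunArgs helper, the scriptsMount parameter (standing in for mountDir), and the example paths are all hypothetical.

```javascript
// Sketch only: builds the argument list for `docker run` without running it.
// `scriptsMount` is a hypothetical "hostDir:containerDir" string standing in
// for the mountDir value used in the command above.
function buildDockerRunArgs(scriptsMount) {
  return [
    'run',
    '-v', '/dev/snd:/dev/snd', // share the host's sound devices
    '-v', scriptsMount,        // share the speech-to-text scripts
    '--privileged',
    'mic-container',
    'npm', 'start',
  ];
}

// Example paths are made up for illustration.
const dockerArgs = buildDockerRunArgs('/home/me/code-talker/scripts:/app/scripts');
console.log(['docker', ...dockerArgs].join(' '));

// To launch it for real (requires Docker and the mic-container image):
//   require('child_process').spawn('docker', dockerArgs);
```

Passing the list to child_process.spawn() also avoids shell quoting problems if the mounted path contains spaces.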

Ideally, the implementation can be improved so that it also works on systems whose microphones aren't accessible via the /dev/snd directory. Either the Docker container could be run with alternate options, or different logic that doesn't use a Docker container could be used. The choice of logic in the VS Code extension (whether and how to launch a Docker container) might be informed by Node's process.platform property.

Even so, I managed to package the speech-to-text functionality for the project in a Docker container, so users no longer have to install the correct version of Python, pip, the SpeechRecognition library for Python, and either alsa-utils or sox before installing the npm packages. Aside from running the "npm i" command, all users have to do to get the dependencies is run the command:

>  docker build -t mic-container .

  Mounting a directory of scripts to the container when it runs also helped to create a convenient way of passing code changes to the container without the inconvenience of having to rebuild it.
