Sunday, February 25, 2018

Turning stuff on and off using a Raspberry Pi: Froggy resurrected


I recently listened to an interview of Eben Upton on The Life Scientific podcast, where he talked about what led to his development of the Raspberry Pi (RPi), a fully functioning computer that you can buy for as little as $5.  The concept of an essentially disposable computer fascinated me, and I decided to get one to play with.  Originally, I was going to get the cheapest model, but using it would have required a wireless keyboard (which I didn't have).  Since I was hoping to run the computer entirely with junk I had lying around the house, I opted instead for the $35 model 3B, which has 4 USB 2 ports, a standard-size HDMI connector, and an Ethernet connector in addition to built-in WiFi and Bluetooth.  I was able to use a micro SD card, a mouse, and a keyboard that I already had, although I did have to pay about $15 more at WalMart for an HDMI-to-VGA adapter in order to use an old monitor I had stored down in the basement.



I was able to use an old iPad USB power supply to power the RPi, but it barely puts out the minimum required current, so I frequently see the little lightning bolt in the upper right of the screen indicating that the computer is being underpowered.  At one point, I also used a junky USB cable, which caused the computer to reboot endlessly until I replaced it with a better quality one.

One of the features that was very attractive to me was the apparent ease with which one could interface the RPi with external devices.  The RPi 3B is about the size of a credit card (although much thicker due to the various ports sticking up out of the circuit board), and it has 40 pins sticking up on one side that serve as the general purpose input/output interface (GPIO).  Many of those pins can be used either to send output to a device being controlled by the RPi, or to receive input from some kind of sensor.  Other pins serve as ground connections or provide 3.3V or 5V power.

After the initial euphoria of successfully downloading the Linux OS (known as Raspbian) onto the micro SD card and booting the computer, I realized that I didn't have the stuff that I needed to actually do the interfacing.  In the past, I would have just made a trip to Radio Shack to pick up the items I needed, but since there is no longer any store in Nashville that sells electronics components to consumers, I had to make a careful assessment of what I needed so that I could make a minimal number of purchases online and save on shipping.  In the following section, I'll list the items that I decided I needed.

Useful stuff for interfacing the Raspberry Pi

One of the most basic things that anyone who wants to play with interfacing the RPi should have is a solderless breadboard.  I already had one, so I didn't need to order one, but if you don't have one, you need to get it.  The size isn't that important because we aren't going to be hooking up a lot of things.  In this post, I'm going to assume that the reader knows how to use a breadboard.  If not, just read about it online.  It's not complicated. 

Another item that would be difficult to do without is a set of jumper wires.  For $7, I found a set that had all three combinations of sockets and prongs.  (All prices are in US dollars.)  The ones I've used the most have a socket on one end (to fit over the GPIO pins) and a prong on the other end (to fit in a hole in the breadboard).  For making shorter connections within the breadboard, I used short pieces of insulated solid wire.  Wire cutters/strippers are very useful for preparing those wires.

One thing that you will hear repeatedly as you read about using the GPIO connections of the RPi is that you should never expose the pins to voltages over 3.3 volts, nor draw too much current from the power outputs or pins.  The GPIO pins can output enough current to light an LED, or to turn a transistor on, but they can't handle outputting larger amounts of current, such as would be necessary to drive a motor or the coil of a relay.  Because the pins shouldn't be exposed to voltages over 3.3 volts, they can't accept input from things like TTL chips that work at 5 volts.  The solution in both of these cases is to use optoisolator (or optocoupler) chips.  An optoisolator consists of an LED pointing at a phototransistor (effectively acting as a switch) inside an opaque package.  The operative principle is that when the LED turns on, the transistor gets turned on, but without any direct electrical connection between the two circuits.

For output, the optoisolator LED is turned on by the GPIO interface and the controlled device is turned on by the phototransistor.  For input, the optoisolator LED is turned on by the external sensor and the transistor is used to change the voltage present on the GPIO pin.  This means that all kinds of bad things (such as over-voltaging or excessive current) can happen on the external side without having any effect on the Raspberry Pi.  The worst-case scenario is that the optoisolator will get fried.  Since they only cost me 35 cents each (a bag of 20 for $7), that's no big loss.  The part number that I bought was AE1143, but there are probably others that are equivalent.  However, if you want to use them in a breadboard, be sure that the ones you buy have a normal DIP package (see picture above) that will fit in the breadboard holes.

In order to actually turn stuff on and off, you need to have relays that can be turned on and off by the optoisolator.  You could buy individual relays and the various electronic parts that need to go between them and the optoisolators, but I decided it would be simplest to just buy a board that had 8 single-pole, double-throw (SPDT) relays with most of the necessary circuitry already on board, including built-in optoisolators.  You can also buy modules that have 4 or fewer relays, but they aren't much cheaper than the $10 I paid for this one.  There are a bunch of places online that sell them and put their brand name on them, but it appears that they are all the same and made by the same manufacturer. 

If the only thing you want to do is output, you don't need to buy the discrete optoisolators I described above since they are already included on this board.  But if you also want to do input from sensors to the RPi, you should buy the separate optoisolators.  (I'm not going to describe how to do input in this post, but it isn't very complicated.)

You can actually connect the relay module directly to the RPi GPIO pins via jumper wires, but for reasons that I'll get into later, it is probably better to buy a chip like the ULN2803APG, which contains an array of eight Darlington transistor pairs, and use it between the GPIO interface and the relay board.  I bought a pack of two ULN2803APG chips for $5 - each chip can control 8 relays, so one chip is all you need to drive the relay module.

So if you already have a monitor, keyboard, USB power supply, etc. to hook up the computer, and if you already have a breadboard, your total cost to get off the ground with the RPi computer, relay module, and parts necessary to connect them is about $60 (plus whatever you have to pay for shipping). 

Important issues relating to interfacing the Raspberry Pi

Although the Raspberry Pi provides a great opportunity for learning electronics, I already had enough experience with electronics that I wasn't really looking at this project as a means to increase my knowledge in that area.  Mostly, I just wanted to turn things on and off with the minimal amount of effort.  So of course, I started by googling topics related to interfacing an RPi. 

Unfortunately, most of the results fell into two categories: questions asked by people who knew little or nothing about electronics that were answered by people who also didn't really know much about electronics, or highly technical questions asked by people who knew a lot about electronics that resulted in technical answers given by electrical engineers.  Neither of these kinds of sources of information really told me what I wanted to know: the most straightforward way to safely turn things on and off using the RPi GPIO pins.

After consulting a number of online sources, I reached some conclusions, which I will summarize below.  I should also say that I found the book "Exploring Raspberry Pi: Interfacing to the Real World with Embedded Linux" by Derek Molloy (Wiley, 2016) very useful as a comprehensive reference.  The book went far beyond where I was interested in going, but there were two sections that were particularly helpful.  The general introduction to the GPIO (pgs. 220-223) and the introduction to digital input and output to powered circuits (pgs. 224-229) provided pretty much all of the technical details I needed to safely start interfacing without having to worry about frying the RPi.

Pins on the GPIO

One of the most important details is knowing the purpose of the 40 pins of the GPIO.  Fig. 6-1 (p. 221) of the Molloy book is probably the best diagram I've seen, but since I don't have permission to post it here, I'll instead include a diagram from https://pinout.xyz/

The orientation of the pins in this diagram corresponds to the orientation shown in the close-up image of the RPi shown earlier in this post.  There are two numbering systems for referring to the pins.  One refers to the physical position of the pin: that system starts with 1 at the lower left, 2 at the upper left, 3 at the lower 2nd column, 4 at the upper 2nd column, 5 at the lower 3rd column, and so on, to pin 40 at the upper right.  The other numbering system, which is probably more commonly used, is the "GPIO" number.  I believe that this numbering system is consistent with earlier models of RPi that had fewer than 40 pins.  In the diagram above, the GPIO numbers are shown above and below the pins (e.g. GPIO14 and GPIO15 in the upper row of pins, 4th and 5th pins from the left).  In the default operating mode, any of these numbered GPIO pins can be used for input or output, with the exception of the ID_SD and ID_SC pins numbered 0 and 1 in this diagram.  You should not connect anything to those two pins unless you do further research into their function.  Various pins can serve purposes (indicated by the color highlighting) other than general input and output when the GPIO is put into other modes, but that is beyond the scope of this post.
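To keep the two numbering systems straight in my own code, I find it helpful to write down the physical positions of the handful of pins I actually use.  Here's a minimal sketch (a partial map only - see pinout.xyz for the full header; GPIO18, used later in this post, sits at physical pin 12):

```python
# Two ways to refer to the same header: physical position (1-40)
# vs. "GPIO" (BCM) number.  A few positions used in this post:
PHYSICAL_TO_ROLE = {
    1: "3V3 power",   2: "5V power",
    4: "5V power",    6: "ground",
    12: "GPIO18",     14: "ground",
    17: "3V3 power",  20: "ground",
}

def role(physical_pin):
    """Look up what a physical header position does (partial map only)."""
    return PHYSICAL_TO_ROLE.get(physical_pin, "see full pinout at pinout.xyz")

print(role(12))  # -> GPIO18, the pin used in the LED example later
```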

Power from the GPIO

In the diagram above, there are 12 pins that don't have GPIO numbers.  The 8 black-colored pins are ground pins.  They are all equivalent.  The two red-colored pins labeled "5V" can provide power at 5 volts, and the two tan-colored pins labeled "3V3" can provide power at 3.3 volts.  These pins provide a convenient source of power for things that you've connected to the GPIO pins, but they have a limited power output.  Your circuit should not draw more than 200-300 mA from the 5 V source and should draw no more than 50 mA from the 3.3 V source.  

You should make sure that when the RPi is turned off, there is no power being applied to the GPIO pins. That's not a problem if you are using only the built-in power supply from the 5.0 and 3.3 V pins because they'll power down when the RPi powers down.  If your circuit needs more current than what the built-in power supply can provide, you should provide an external power supply.  But it's best that such externally-powered circuits be electrically isolated on the other side of optoisolators anyway so that you don't have to worry about them accidentally applying power to the GPIO pins of the turned-off RPi.  It is good to use the internal GPIO power pins only for parts of the circuit on the computer side of the optoisolators.


Turning on an LED with a GPIO pin

There are abundant examples on the web showing how to turn an LED on and off using one of the GPIO pins set in output mode.  Essentially, when the GPIO pin is set to "on", it outputs 3.3 volts and when it is set to "off", it is at ground.  To turn on and off an LED, you simply place an LED in series with a resistor and connect the ends to the GPIO pin and one of the ground pins.  (If you put the LED in backwards, nothing bad happens - just turn it around and try again.)  The resistance of the resistor should be low enough that the LED lights up enough to see, but not so low that the circuit draws more than 2-3 mA from the 3.3V output of the GPIO pin.  A 1 k ohm resistor should be OK for that purpose.  (If you are only turning on a single LED, you can make the LED brighter by using a smaller resistor and draw more than 3 mA from a GPIO pin. But you don't want to do that with multiple pins at once.)  I chose to use GPIO pin 18 because it was conveniently located next to a ground pin.
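As a quick sanity check of that 1 k ohm figure, here's the arithmetic as a sketch (the ~2.0 V LED forward drop is a typical value I'm assuming for a red LED, not something from a datasheet):

```python
# Current drawn from a GPIO pin through an LED plus series resistor.
GPIO_VOLTS = 3.3
LED_FORWARD_VOLTS = 2.0  # assumed typical forward drop for a red LED

def led_current_ma(resistor_ohms):
    """Current (mA) through the LED for a given series resistor."""
    return (GPIO_VOLTS - LED_FORWARD_VOLTS) / resistor_ohms * 1000

print(round(led_current_ma(1000), 2))  # -> 1.3, comfortably under 3 mA
```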

There are a number of ways to use software to turn a GPIO pin on and off.  Since I planned to use Python to write the controlling software, I imported a module called "gpiozero" that has a simple function for turning a GPIO pin on and off.  (There are other more sophisticated Python modules for interacting with the GPIO, but that's beyond the scope of this post.)  Here's the code:

from gpiozero import LED
from time import sleep

led18 = LED(18)  # GPIO18 (physical pin 12), configured as an output

led18.on()   # drive GPIO18 to 3.3 V - LED lights
sleep(5)     # wait 5 seconds
led18.off()  # drive GPIO18 to 0 V - LED goes dark

The program makes GPIO18 go to 3.3 volts (turning the LED on), waits for 5 seconds, then makes GPIO18 go to 0 volts (turning the LED off).  This was pretty exciting for about the first minute or so after I got it to work, but it wasn't really what I was trying to accomplish: turning any device on and off.


Wikimedia Commons, Optoisolator_Pinout.svg

However, if you look at the circuit diagram of an optoisolator, you can see that the left side is simply an LED.  So the simple task of turning on an LED is really useful if that LED is inside an optoisolator.  Following the example of Fig. 6-7 (p. 228) of the Molloy book, I connected pin 2 of the optoisolator to one of the GPIO ground pins, connected a resistor of about 2 k ohm to pin 1 of the optoisolator, and connected the other end of the resistor to the GPIO pin that I wanted to use to control the circuit (e.g. GPIO18). (Note: pin 1 is designated by a small dot on the top of the optoisolator DIP.)  Based on Fig. 6-7, that should result in drawing a safe current of about 1 mA from the GPIO pin.

Turning something on and off with an optoisolator

The phototransistor side of the optoisolator is essentially a switch.  When sufficient light comes from the LED inside the optoisolator, current flows from pin 3 to pin 4.  When the LED is dark, no current flows.  However, the amount of current that flows through the phototransistor is pretty small when the LED is lit with only 1 mA of current.  So the phototransistor can be used to turn on another transistor in one of two ways:

Wikimedia Commons, left: Darlington_pair_diagram.svg CC BY-SA by user Michael9422, right: Compound_trans.svg

The transistor pair on the left is called a Darlington pair and the pair on the right is called a Sziklai pair.  In both pairs, the transistor on the left (Q1) would represent the phototransistor inside the optoisolator.  Instead of being controlled by current flowing into its base (B), Q1 is controlled by the light from the LED striking it.  In the Darlington pair, Q2 is an NPN transistor, which is turned on by current flowing into its base.  So when the phototransistor Q1 turns on, the small current flowing from it into the base of transistor Q2 is enough to saturate Q2, causing a lot of current to flow through Q2 from its collector (C) to its emitter (E).  In the Sziklai pair, Q2 is a PNP transistor, which is turned on by current flowing out of its base.  So when the phototransistor Q1 turns on, the small current flowing through it pulls enough current from the base of transistor Q2 to saturate Q2, causing a lot of current to flow through Q2 from its collector (C) to its emitter (E).

Either of these two configurations produces the same result: turning on the phototransistor in the optoisolator turns on a second transistor that can sink a lot more current.  The current sunk by the second transistor is enough to turn on a small light, or to energize the coil of a relay.  If the thing that you want to turn on and off is something that draws a lot of current, uses a voltage higher than about 5 volts, or uses alternating current, then you will need to use the second transistor to turn on a relay.  A relay is a mechanical switch that is closed when its coil is energized.  Since the switch is mechanical, it doesn't care about the nature of the electricity passing through it as long as the voltage and current don't exceed the maximum for which it is rated.  So for example, if you want to turn the lights of your house on and off, you'll need a relay since the voltage is over 100 volts and is alternating current.  (Note: I do NOT advise that you try this unless you are familiar with the safety hazards associated with household wiring.  You can electrocute yourself if you make a mistake.)  You also should use a relay if you want to control any kind of motor.

Turning the 8-relay module on and off

If you don't care about how the electronics work and just want to put the relay circuit together, skip this section.

This kind of setup is exactly what is built into the 8-relay module that I bought online.  Sunfounder has a useful page that provides a helpful circuit diagram of the 8-relay module.  From that page, you can download a large scale circuit diagram as well as a wiring diagram of the module.  I've pulled out the circuit diagram for one of the relay modules:


If you compare this diagram to the ones above, you'll see that the central part of the circuit is a Darlington pair.  The load that is turned on by the NPN transistor T5 is a relay, shown in the upper right of the diagram.  When the coil is not energized, pin 1 of the relay is connected to pin 2.  When the coil is energized, pin 1 is connected to pin 3 of the relay.  Thus, the relay is a single pole, double throw (SPDT) switch.  The diode D5 is there because when a coil is de-energized, its collapsing magnetic field can generate a surge of current that can damage the transistors, so the diode allows that current to safely dissipate.

According to what I've read online, you should be able to connect the input of the relay module directly to one of the GPIO pins and use it to turn the relay on and off.  The "testing experiment" for Raspberry Pi on the Sunfounder page shows how to do this.  But I would NOT recommend that you try that experiment for several reasons.  The most obvious reason is that the photos on the web page do not show clearly how to connect the wires (some wires hide others, making it hard to tell what's going on).  The other reason why following the pattern in that example is a bad idea is because it uses the Raspberry Pi's power supply to run everything.  In their example, they get away with it, but if you are really planning to use the relay module to power 8 devices, you need to have a better understanding of how to make the connections in a safe way that doesn't risk over-voltaging or drawing too much current from the GPIO connections.

The first issue with the Sunfounder example circuit involves using the RPi's 5 volt power pin to run everything, including the LEDs shown on the circuit board.  In real use, the load driven through the relays would be powered separately, using any possible voltage and current that the relays are rated for.  The particular relays in the module say that they can handle up to 10 A, AC voltages up to 250 V, and DC voltages up to 30 V.  For my testing, I connected a little battery-powered motor to pins 1 and 3 so that the motor would turn on and off.  That circuit should have no connection to anything else on the circuit board.

The other issue is whether the voltage supply pins labeled "VCC" and "JD-VCC" should be tied together and supplied with a single power supply, or if they should be supplied with power separately.  When the relay unit ships, it comes with a jumper that connects VCC and JD-VCC pins:

You should pull this jumper off the board and leave the two pins disconnected.  If you want to connect them in the future, it should be a conscious decision on your part after considering the implications, but it should not happen by default.  If you look at the circuit diagram above, you'll see that connecting VCC with JD-VCC defeats the purpose of even having the optoisolator in the circuit, since it makes an electrical connection between the circuits on its two sides.

The circuit diagram shows that JD-VCC supplies 5 volt power to the transistors and coils of the relays.  When the relays aren't energized, the current is minimal, but when a single coil is energized, it draws about 65 mA.  So if you were only going to use one of the 8 relays, you could easily use the 5 V power pin from the RPi GPIO pin set to supply the power.  However, if you used three relays and they were routinely energized at the same time, you would be approaching the 200 mA limit of output for the 5 V GPIO power pin.  Using all 8 relays would draw over 500 mA, which would probably either fry or at least crash the RPi.  In addition, if like me you are running the RPi off of an iPad charger/power supply that barely provides enough current to run the RPi even without interfacing, you might crash the RPi with even fewer than 3 relays connected.  A relatively simple solution is to just create your own battery-operated power supply using 4 D cells and a voltage regulator.  That could easily run the relay unit for a pretty long time, and by being battery powered would enable you to use it in a robot without having to have a cord plugged into a wall receptacle.  See the Appendix at the end for more on this.
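The arithmetic behind that warning can be sketched as follows (65 mA per energized coil and the 200 mA budget are the figures from the paragraph above):

```python
# Rough current budget for running relay coils from the GPIO 5 V pin.
COIL_MA = 65             # approximate draw of one energized relay coil
GPIO_5V_BUDGET_MA = 200  # conservative limit for the GPIO 5 V power pin

def coils_draw_ma(n_energized):
    """Total coil current (mA) with n relays energized at once."""
    return n_energized * COIL_MA

for n in (1, 3, 8):
    draw = coils_draw_ma(n)
    verdict = "OK" if draw <= GPIO_5V_BUDGET_MA else "over budget"
    print(f"{n} coil(s): {draw} mA -> {verdict}")
# 1 coil is fine, 3 coils (195 mA) is borderline, 8 coils (520 mA) is way over
```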

The other problem with the relay board is that the inputs are "active low".  That means that a relay is turned on when the GPIO pin controlling it is "off" (at ground = 0 V).  To turn the relays off, the GPIO pins controlling them need to be "on" (3.3 V).  Since the GPIO pins' starting state is "off", that isn't really a good thing, because it means that the coils on the relay board would be energized as soon as the RPi is turned on - before you even start running the software to control it.  It would be better to make the inputs "active high" so that the coils would only be energized when you deliberately send a signal (i.e. turn the GPIO state to "on").  I became aware of this issue while reading a thread on the raspberrypi.org forum.  I don't recommend reading the thread unless you are really hard-core, because it gets out into the weeds and the suggested solution involves hooking up a bunch of discrete transistors and resistors to solve the problem.

There is actually a much simpler solution that has been mentioned in several other places: using a ULN2803APG chip (the last item on the list of supplies that I bought for this project).  You can view the data sheet for the ULN2803, but I'll cut to the chase by inserting the circuit diagram here:


You can ignore all of the resistors and diodes and just focus on the two transistors.  If you compare this diagram with my earlier diagrams, you'll see that the ULN2803APG  contains a Darlington pair.  When the input of the ULN2803APG goes "high", current flows into the base of the transistor on the left, turning it on.  Current flows from its emitter into the base of the second transistor, turning it on - effectively closing a switch that connects the output to ground, i.e. making the output "low".  If the output is connected to the input of one of the relay controllers, it will ground the relay input, causing both the LED inside the optoisolator to turn on and the indicator LED on the relay circuit board to turn on.

When the input of the ULN2803APG goes low, both transistors turn off, disconnecting the output from the ground and allowing its voltage to float.  If the output is connected to the input of one of the relay controllers, it will be at the VCC voltage and the optoisolator won't be turned on.

The combination of resistors inside the ULN2803APG was chosen so that the output goes low when the input exceeds 2.5 volts (just right for the GPIO output voltage of 3.3 V).

So essentially, the ULN2803APG chip inverts the signals going to the relay board, so that from the GPIO's point of view the inputs behave as active high rather than active low.  That function is shown symbolically in the pinout diagram for the ULN2803:

where the circuitry is summarized as "not" gates (changing low to high and high to low).
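The net effect of chaining the ULN2803's inverter in front of the board's active-low input can be written out as two logical negations that cancel.  A sketch, modeling each stage as a boolean function:

```python
def uln2803_output_low(gpio_high: bool) -> bool:
    """The Darlington pair sinks the output to ground when the input is high."""
    return gpio_high  # output is pulled low exactly when the input is high

def relay_energized(input_low: bool) -> bool:
    """The relay board's inputs are active low."""
    return input_low

def relay_state(gpio_high: bool) -> bool:
    """Relay state as a function of the GPIO pin, through the ULN2803."""
    return relay_energized(uln2803_output_low(gpio_high))

# The two inversions cancel: the GPIO pin now behaves as active HIGH.
for gpio in (False, True):
    print(gpio, "->", relay_state(gpio))
# False -> False (relay off at boot), True -> True (relay on when commanded)
```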

The wiring configuration is super simple.  For each relay that you want to control, connect the output of the GPIO to one of the bottom pins on the chip (1 through 8), then connect the pin on the opposite side (11 through 18) to the input on the relay board that you want to control.  The GND pin needs to be tied to one of the ground pins on the RPi and to the ground pin on the relay board.  It does not seem to me that it should be necessary to connect the "COMMON" pin to anything, although in the examples I've seen it's been connected to VCC of the relay board.  I don't think it matters, since under normal operation the diode on the common connection will block any current from flowing anyway.

The only remaining question is what power source to use for the VCC connection on the relay board.  If one created an external 5 V power supply (such as the one I suggested using D batteries), that supply could be used.  However, in the spirit of keeping the RPi completely isolated electrically from external circuits, it would probably be best to connect the VCC connection on the relay board to one of the 5 V power pins on the GPIO since the VCC connection supplies the computer side of the optoisolators.  In my test circuit, I found that the ULN2803APG chip drew a negligible amount of current when connected by itself to the 5 V pin (as expected given the diode in the "common" connection inside the chip).  When the VCC pin of the relay board was connected to the 5 V pin of the GPIO, it only drew about 1.3 mA per relay control circuit.  So even if all 8 relays were in use, it would draw only about 10 mA from the 5 V pin - way below the 200 mA maximum "safe" output for that pin.  I didn't actually measure the current being drawn from the GPIO control output pin, but I would imagine that it would be at a safe, low value since it is only turning on the transistors on the ULN2803APG, and not actually driving the optoisolator and status LEDs on the relay board as it would have been if there were a direct connection from the GPIO to the relay board.

After all of the stress I encountered trying to figure out a "safe" way to run the 8-relay module from the RPi GPIO, I'm pretty satisfied because this setup is both really simple to wire and also keeps the currents and voltages on the GPIO pins far below their safety limits.  If a separate 5V supply is used to power the relay coils via JD-VCC (rather than using a 5 V power pin from the GPIO), the RPi is also completely isolated electrically from external circuits on the far side of the optoisolator.


Quick and dirty instructions for connecting the 8-relay board to the Raspberry Pi

It is best to make the initial connections with the RPi turned OFF in case you plug a wire in the wrong place during the setup.

1. Remove the jumper connecting the VCC and JD-VCC pins on the 8-relay board and leave it off.

2. Insert the ULN2803APG chip into your solderless breadboard.

3. Use a jumper wire to connect one of the ground pins from the GPIO (it doesn't matter which one) to the GND pin of the ULN2803APG chip.

4. Use a jumper wire to connect the GND pin of the ULN2803APG chip to the GND pin of the 8-relay board (it doesn't matter which GND pin).

5. Use a jumper wire to connect one of the 5 V pins on the Raspberry Pi's GPIO (it doesn't matter which of the 5 V pins) to the common pin of the ULN2803APG chip.

6. Use a jumper wire to connect the common pin of the ULN2803APG chip to a VCC pin on the 8-relay module (it doesn't matter which VCC pin).

7.  Connect a jumper wire from the GPIO output pin that you want to use to one of the input pins on the ULN2803APG chip (pins 1 through 8).  If you want to use the code in my example, use GPIO18 (pin 12).  Using input pin 1 on the ULN2803APG chip would be sensible.

8. Connect a jumper wire from the corresponding output pin of the ULN2803APG chip (pins 11 through 18) to the input of the relay that you want to use on the 8-relay board.  If you used ULN2803APG input pin 1 in the last step, the corresponding output is pin 18, directly opposite it.

9. Connect the JD-VCC pin on the 8-relay board to a 5 V source of power.  If you just want to test the system with a single relay, you could connect it by a jumper to a 5 V power pin of the RPi's GPIO (or to the common pin of the ULN2803APG chip, which is itself connected to the 5 V pin).  But don't do this if you are going to use more than 2 or 3 of the relays (see details above for the reason).  In that case, buy or make a separate 5 V power supply to supply JD-VCC (see appendix).

10. Turn on the Raspberry Pi and your external 5 V power supply (if you used one).

11.  Run the code snippet that I gave above if you are using Python (you must have the gpiozero module installed).  For other programming languages, look up appropriate code on the web.  If everything is working, you should see the indicator LED for your chosen relay turn on for 5 seconds, then turn off.  If you listen carefully, you should also be able to hear a quiet clicking sound as the relay closes and opens.

12. If everything has worked up to this point, connect something that you want to turn on to the relays.  You'll need a jeweler's screwdriver or some other small screwdriver to open the screw that clamps down on the output wires from the relay.  A small, battery-powered motor is good for a test.  Here's how it worked for me:  https://twitter.com/baskaufs/status/963243621012123648
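The input/output pairing used in steps 7 and 8 follows a simple rule on the ULN2803's 18-pin DIP: channel n's input is pin n (1 through 8) and its output is the pin directly opposite, i.e. pin 19 minus n (18 down to 11).  A sketch:

```python
def uln2803_output_pin(input_pin: int) -> int:
    """Output pin directly opposite a given input pin on the ULN2803 DIP-18."""
    if not 1 <= input_pin <= 8:
        raise ValueError("inputs are pins 1 through 8")
    return 19 - input_pin

print(uln2803_output_pin(1))  # -> 18, the pairing used in steps 7 and 8
print(uln2803_output_pin(8))  # -> 11
```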

Froggy the Robot, take 1

Ever since I was a kid and read Andy Buckram's Tin Men (Carol Ryrie Brink, 1966), I always thought it would be really cool to build a robot.  When I was in college, I had the opportunity to take a course that focused on digital electronics and we had fun in the class building burglar alarms and other cool stuff with TTL logic chips.  So over the years, I accumulated various power supplies, chips, and other miscellaneous junk with the intention of actually building a robot some day.

About ten years ago when my two daughters were in middle school, I decided that the time was ripe for actually building the robot as a father-daughter project.  Somewhere along the line, I acquired the plans for building an RS232 interface based on an AY-3-1015D Universal Asynchronous Receiver/Transmitter (UART) and the UART chip itself.  The plan was to use the UART to communicate between a laptop's serial port and the robot, and use the UART data bit outputs to control relays on the robot.  So after some soldering lessons for the girls, we started putting it together.  I think I underestimated the patience of pre-teens for hours of soldering and ended up doing most of it myself, but I think they understood the basic principle of what we were building.


In the end, we had a little Visual Basic program with buttons that sent a number whose bits determined which relays should be turned on and off.  Each bit of the output of the UART went through a 7404 TTL NOT chip (to protect the UART chip from back-current and to invert the signal), and the output of the 7404 drove a PNP transistor in a manner very analogous to the Sziklai pair discussed earlier in this post.

One difference between the relays that we used in our project and the relays that come on the 8-relay board is that the relays in our robot project were double pole, double throw (DPDT) rather than SPDT.  The reason this was important to us was that we wanted to use the relays to be able to reverse the direction of the robot motors.  See this diagram that I borrowed from quora.com:

When the switch contacts are thrown up, the positive side of the battery is connected to the + end of the motor and the motor rotates one way.  When the switch contacts are thrown down, the positive side of the battery is connected to the - end of the motor and the motor runs the other way.  This kind of reversing action could be mimicked by the SPDT relays of the 8-relay unit, but it would require using two of the relays in tandem (i.e. 2 relays to reverse one motor).  It would also be tricky to avoid shorting the battery if the two switches didn't throw at exactly the same time. 

Our robot was not very sophisticated - it just had 2 wheels whose direction could be controlled independently.  When both wheels went forward, the robot went forward.  When both went backwards, the robot went backwards.  When one wheel went forwards and the other went backwards, the robot rotated in an appropriate direction (we had a third unpowered caster wheel to support the back side of the robot platform).

The most exciting feature of the robot was an old drawer from a CD drive (back in the days when they were motorized).  With two more relays, we could power the drawer motor and control its direction (in or out).  The out-and-in movement of the drawer reminded my daughters of a frog's tongue, so that's how the robot got the name "Froggy".  In the end, my daughters created a game where a magnet hung from the end of the "tongue" and they could drive the robot around picking up small iron BBs scattered on the floor.

After the novelty wore off, Froggy was put away in a box.  Between then and now, serial ports have virtually disappeared from computers, although I was able to get my old Dell laptop running long enough to make this video of the old Froggy in operation:


Froggy the Robot, take 2

When I decided to buy a Raspberry Pi, I knew immediately that one of my first projects would be to try to rebuild Froggy to be controlled directly by the RPi GPIO interface.  All of the parts of Froggy's brain that ran the UART could be lobotomized, leaving the board with the relays and all of their connections to the motors.  In the part of this diagram:



where the NOT gate was, I replaced it with the collector side (pin 3) of the phototransistor in one of the discrete optoisolators.  Pin 4 was connected to a common ground with the coil.  I was a bit concerned about whether the phototransistor could sink enough current to light up the indicator LED as well as turn on the PNP transistor that drove the relay.  But it had no problem with that, so I was rather quickly able to run the five control wires for the relays into the outputs of 5 optoisolators.

The most difficult part of making the conversion was to make a cable to connect Froggy to the RPi.  When I ran Froggy with the RS232 interface, it only required two wires (a ground wire and the signal wire).  I was able to splice together several old telephone cords to make Froggy's tether quite long.  However, when controlling Froggy using the RPi, I needed a separate control wire for each of the five relays, plus a ground wire.  Luckily, I was able to find an ancient ribbon cable that had been spliced to a multi-wire cable, which I had salvaged from some old piece of junk.  It had only narrowly escaped being thrown out last summer when I cleaned out the basement.  Unfortunately, there were more wires in the multi-wire cable than in the ribbon cable, and apparently some of the ribbon cable wires weren't actually connected to anything.  So I had to spend over an hour with my ohmmeter trying to figure out which of the wires at the two ends of the cable were actually connected to each other.  Eventually, I had something like 10 usable wires in the cable - 6 for running Froggy now and some others for future expansion.

Here is the end result:


You can see the Python code that runs the Froggy controller in this gist.
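As a sketch of the idea (the GPIO pin numbers and relay assignments below are made-up stand-ins for illustration, not the actual wiring; the real code is in the gist), the control logic boils down to mapping drive commands onto relay states:

```python
# Hypothetical sketch of Froggy-style relay control logic.  Pin numbers
# and relay roles are assumptions, not the actual wiring from this post.

# Each motor gets a power relay and a direction relay; the fifth relay
# drives the "tongue" (CD drawer) motor.
PINS = {
    "left_power": 17,
    "left_direction": 27,
    "right_power": 22,
    "right_direction": 23,
    "tongue": 24,
}

# Which relays should be energized for each drive command.
COMMANDS = {
    "forward":   {"left_power", "right_power"},
    "backward":  {"left_power", "left_direction",
                  "right_power", "right_direction"},
    "spin_left": {"left_power", "left_direction", "right_power"},
    "stop":      set(),
}

def relay_states(command):
    """Return a dict mapping each relay name to True (energized) or False."""
    energized = COMMANDS[command]
    return {name: (name in energized) for name in PINS}

# On the Pi itself, these states would then be written to the pins with
# something like RPi.GPIO:
#   GPIO.output(PINS[name], GPIO.HIGH if state else GPIO.LOW)
```

Keeping the command-to-relay mapping separate from the hardware calls makes it easy to test the logic off the robot.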

Future projects

I probably won't devote a lot of energy to embellishments for Froggy.  I may add one or more sensor buttons to the end of the tongue that will detect if the robot has run into something.  In this post, I didn't go into how to accept input through the GPIO interface.  It is much simpler than output and only requires 5 V power, an optoisolator, a resistor, and a switch.  I may write another post if I get sensor buttons working.

What I really want to do is to figure out how to set up a web server on the RPi so that I can communicate with it through WiFi and the Internet.  If I manage to do that, the RPi will just ride on Froggy with a portable power supply - no monitor, keyboard, or mouse required.  If I get that to work, I could control the robot through a remote computer or perhaps my phone.  We'll see if I ever get around to doing that!

Appendix

To run Froggy's onboard 5 V electronics, I just bought a little MC78LXXA 5 volt, 0.1 A positive voltage regulator (TO-92 package).  It was super-simple to hook it up.  Here's a diagram of a different 5 V voltage regulator, but the concept is the same.

I bought a D cell holder that would hold 4 cells in series.  At 1.5 V per cell, that's 6 volts.  I connected the negative end of the cell holder to the ground pin of the voltage regulator and the positive end to the "+5.5V ... 16V" connection at the left side of the diagram.  The "+5V" connection at the right serves as the regulated 5 V supply (with a common ground to the negative end of the battery holder).  My data sheet says "Bypass Capacitors are recommended for optimum stability and transient response and should be located as close as possible."  I actually just left them out and got away with it, although it probably would have been better to have put them in.

The MC78LXXA is rated for an output of 100 mA.  I suspect that when I was driving all 5 of the relays, I might have gone over that, but it was always able to run the circuitry anyway.  When the robot is at rest with no energized relays, I don't think that it is drawing more than about 10 mA.  If you wanted to use the 4 D cell system to provide power to the 8 relay module, I think you could just use a 5 volt regulator with a higher current output rating.  For example, I googled and found a μA7805CKC regulator in a TO-220 package that is rated at 1.5 A.  That would easily provide the 500 mA that I estimated was required when all 8 relays on the module were energized, with 1000 mA to spare.

Sunday, July 2, 2017

How (and why) we set up a SPARQL endpoint



Recently I've been kept busy trying to finish up the task of getting the Vanderbilt Semantic Web Working Group's SPARQL endpoint up and running.  I had intended to write a post about how we did it so that others could replicate the effort, but as I was pondering what to write, I realized that it is probably important for me to begin by explaining why we wanted to set one up in the first place.

Why we wanted to set up a SPARQL endpoint

It is tempting to give a simple rationale for our interest in a SPARQL endpoint: Linked Data and the Semantic Web are cool and awesome, and if you are going to dive into them, you need to have a SPARQL endpoint to expose your RDF data.  If you have read my previous blog posts, you will realize that I've only been taking little sips of the Semantic Web Kool-Aid, and have been reluctant to drink the whole glass.  There have been some serious criticisms of RDF and the Semantic Web, with claims that it's too complicated and too hard to implement.  I think that many of those criticisms are well-founded and that the burden is on advocates to show that they can do useful things that can't be accomplished with simpler technologies.  The same test should be applied to SPARQL: what can you do with a triple store and SPARQL endpoint that you can't do with a conventional server, or a different non-RDF graph database system like Neo4J?

It seems to me that there are two useful things that you can get from a triplestore/SPARQL endpoint that you won't necessarily get elsewhere.  The first is the ability to dump data from two different sources into the same graph database and immediately be able to query the combined graph without any further work or data munging.  I don't think that is straightforward with alternatives like Neo4J.  This is the proverbial "breaking down of silos" that Linked Data is supposed to accomplish.  Successfully merging the graphs of two different providers and deriving some useful benefit from doing so depends critically on those two graphs sharing URI identifiers that allow useful links to be made between the two graphs.  Providers have been notoriously bad about re-using other providers' URIs, but given the emergence of useful, stable, and authoritative sources (like the Library of Congress, GeoNames, the Getty Thesauri, ORCID, and others) of RDF data about URI-identified resources, this problem has been getting better.

The second useful thing that you can get from a triplestore/SPARQL endpoint is essentially a programmable API.  Conventional APIs abound, but generally they support only particular search methods with a fixed set of parameters that can be used to restrict the search.  SPARQL queries are generic: if you can think of a way that you want to search the data and know how to construct a SPARQL query to do it, you can do any kind of search that you can imagine.  In a conventional API, if the method or parameter that you need to screen the results doesn't exist, you have to get a programmer to add it to the API for you.  With SPARQL, you do your own "programming".
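For instance, here is a hypothetical ad-hoc query (the property and language filter are my own illustration, not tied to any particular dataset) of the sort that no fixed API method would be likely to anticipate:

```sparql
# Find every resource in the store that has a German preferred label --
# nobody had to build a special "API method" to support this question.
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?resource ?label
WHERE {
  ?resource skos:prefLabel ?label .
  FILTER(lang(?label) = "de")
}
LIMIT 10
```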

These two functions have influenced the approach that we've taken with our SPARQL endpoint.  We wanted to create a single Vanderbilt triplestore because we wanted to be able to merge into it shared information resources to which diverse users around campus might want to link.  It would not make sense to create multiple SPARQL endpoints on campus, which could result in duplicated data aggregation efforts.  We also want the endpoint to have a stable URI that is independent of department or unit within the institution.  If people build applications to use that endpoint as an API, we don't want to break those applications by changing the subdomain or subpath of the URI, or to have the endpoint disappear once users finished a project or got bored with their own endpoint.


Step 1: Figuring out where to host the server

For several years, we've had an implementation of the Callimachus triplestore/SPARQL endpoint set up on a server operated by the Jean and Alexander Heard Library at Vanderbilt (http://rdf.library.vanderbilt.edu).  As I've noted in earlier posts, Callimachus has some serious deficiencies, but since that SPARQL endpoint was working and the technical support we've gotten from the library has been great, we weren't in too big of a hurry to move it somewhere else.  Based on a number of factors, we decided that we would rather have an installation of Blazegraph, and since we wanted this to be a campus-wide resource, we began discussions with Vanderbilt IT services about how to get Blazegraph installed on a server that they supported.  There was no real opposition to doing that, but it became apparent that the greatest barrier to making it happen was to get people at ITS to understand what a triplestore and SPARQL endpoint were, why we wanted one, and how we planned to use it.  Vanderbilt's ITS operates on a model where support is provided to administrative entities who bear the cost of that support, and where individuals are accountable to defined administrators and units.  Our nebulous campus-wide resource didn't really fit well in that model and it wasn't clear who exactly should be responsible to help us.  Emails went unanswered and actions weren't taken.  Eventually, it became apparent that if we wanted to get the resource up in a finite amount of time, our working group would have to make it happen ourselves.

Fortunately, we had a grant from the Vanderbilt Institute for Digital Learning (VIDL) that provided us with some money that we could use to set our system up on a commercial cloud server.  We received permission to use the subdomain sparql.vanderbilt.edu, so we started the project with the assumption that we could get a redirect from that subdomain to our cloud server when the time was right.  We decided to go with a Jelastic-based cloud service, since that system provided a GUI interface for managing the server.  There were cheaper systems that required doing all of the server configuration via the command line, but since we were novices, we felt that paying a little more for the GUI was worth it.  We ended up using ServInt as a provider for no particularly compelling reason other than group members had used it before.  There are many other options available.

Step 2: Setting up the cloud server

Since ServInt offered a 14 day free trial account, there was little risk in us playing around with the system at the start.  Eventually, we set up a paid account.  Since the cost was being covered by an institutional account, we did not want the credit card to be automatically charged when our initial $50 ran out.  Unfortunately, the account portal was a bit confusing - it turns out that turning off AutoPay does not actually prevent an automatic charge to the credit card.  So when our account got to an alert level of $15, we got hit with another $50 charge.  It turns out that the real way to turn off automatic charges is to set the autofunding level to $0.  Lesson learned - read the details.  Fortunately, we were going to buy more server time anyway.

The server charges are based on units called "cloudlets".  Cloudlet usage is based on a combination of the amount of space taken up by the installation and the amount of traffic handled by the server.  One cloudlet is 128 MiB of RAM and 400 MHz of CPU.  The minimum number of reserved cloudlets per application server is one, and you can increase the upper scaling limits to whatever you want.  A higher limit means you could pay at a greater rate if there is heavy usage.  Here's what the Jelastic Environment Topology GUI looks like:


The system that we eventually settled on uses an Nginx front-end server to handle authentication and load balancing, and a Tomcat back-end server to actually run the Blazegraph application.  The primary means of adjusting for heavy usage is to increase the "vertical" scaling limit (maximum resources allocated to a particular Tomcat instance).  If usage were really high, I think you could increase the horizontal scaling by creating more than one Tomcat instance.  I believe that in that case the Nginx server would balance the load among the multiple Tomcat servers.  However, since our traffic is generally nearly zero, we haven't really had to mess with that.  The only time that system resources have gotten tight was when we were loading files with over a million triples.  But that was only a short-term issue that lasted a few minutes.  At our current usage, the cost to operate the two-server combination is about $2 per day.


The initial setup of Blazegraph was really easy.  In the GUI control panel (see above), you just click "Create environment", then click the Create button to create the Tomcat server instance.  When you create the environment, you have to select a name for it that is unique within the jelastic.servint.net subdomain.  The name you choose will be the subdomain of your server.  We chose "vuswwg", so the whole domain name of our server was vuswwg.jelastic.servint.net .  What we chose wasn't really that important, since we were planning eventually to redirect to the server from sparql.vanderbilt.edu .

To load Blazegraph, go to the Blazegraph releases page and copy the WAR Application download link, e.g. https://github.com/blazegraph/database/releases/download/BLAZEGRAPH_RELEASE_2_1_4/blazegraph.war .  On the control panel, click on Upload and paste in the link you copied.  On the Deployment manager list, we selected "vuswwg" from the Deploy to... dropdown next to the blazegraph.war name.  On the subsequent popup, you will create a "context name".  The context name serves as the subpath that will refer to the Blazegraph application.  We chose "bg", so the complete path for the Blazegraph web application was: http://vuswwg.jelastic.servint.net/bg/ .  Once the context was created, Blazegraph was live and online.  Putting the application URL into a browser brought up the generic Blazegraph web GUI.  Hooray!

The process seemed too easy, and unfortunately, it was.  We had Blazegraph online, but there was no security and the GUI web interface as installed would have allowed anyone with Internet access to load their data into the triplestore, or to issue a "DROP ALL" SPARQL Update command to delete all of the triples in the store.  If one were only interested in testing, that would be fine, but for our purposes it was not.

Step 3: Securing the server

It became clear to us that we did not have the technical skills necessary to bring the cloud server up to the necessary level of security.  Fortunately, we were able to acquire the help of developer Ken Polzin, with whom I'd worked on a previous Bioimages project.  Here is an outline of how Ken set up our system (more detailed instructions are here).

We installed an Nginx server in a manner similar to what was described above for the Tomcat installation.  Since the Nginx server was going to be the outward-facing server, Public IPv4 needed to be turned on for it, and we turned it off on the Tomcat server.

There were several aspects of the default Nginx server configuration that needed to be changed in order for the server to work in the way that we want.  The details are on this page.  One change redirected from the root of the URI to the /bg/ subpath where Blazegraph lives.  That allows users to enter https://sparql.vanderbilt.edu/ and be redirected to https://sparql.vanderbilt.edu/bg/ or https://sparql.vanderbilt.edu/sparql and be redirected to https://sparql.vanderbilt.edu/bg/sparql .  We wanted this behavior so that we would have a "cool URI" that was simple and did not include implementation-specific information (i.e. "bg" for Blazegraph).  Another change in the configuration facilitated remote calls to the endpoint by enabling cross-origin resource sharing (CORS).

The other major change to the Nginx configuration was related to controlling write access to Blazegraph.  We accomplished that by restricting unauthenticated users to HTTP GET access.  Methods of writing to the server, such as SPARQL Update commands, require HTTP POST.  Changes we made to the configuration file required authentication for any non-GET calls.  Unfortunately, that also meant that regular SPARQL queries could not be requested by unauthenticated users using POST, but that is only an issue if the queries are very long and exceed the character limit for GET URIs.
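In Nginx, this kind of method restriction can be expressed with a `limit_except` block.  The fragment below is only a sketch under assumptions - the location path, upstream name, and password file path are hypothetical, not taken from our actual configuration:

```nginx
# Hypothetical sketch: unauthenticated clients may only use GET (and HEAD,
# which limit_except allows along with GET); all other methods, including
# the POST used by SPARQL Update, require Basic Authentication.
location /bg/ {
    limit_except GET {
        auth_basic           "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
    proxy_pass http://tomcat_backend;
}
```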

Authentication for administrative users was accomplished using passwords encrypted using OpenSSL.  Directions for generating and storing passwords are here.  The authentication requirements were specified in the Nginx configuration file as discussed above.  Once authentication was in place, usernames and passwords could be sent as part of the POST dialog.  Programming languages and HTTP GUIs such as Postman have built-in mechanisms to support Basic Authentication.  Here is an example using Postman:



Related to the issue of restricting write access was modification of the default Blazegraph web GUI.  Out of the box, the GUI had a tab for Updates (which we had disabled for unauthenticated users) and for some other advanced features that we didn't want the public to see.  Those tabs can be hidden by modifying the /opt/tomcat/webapps/bg/html/index.html file using the server control panel (details here).  We also were able to style the query GUI page to comply with Vanderbilt's branding standards.  You can see the final appearance of the GUI at https://sparql.vanderbilt.edu .

The final step in securing access to the server was to set up HTTPS.  The Jelastic system provides a simple method to set up HTTPS using a free Let's Encrypt SSL certificate.  Before we could enable HTTPS, we had to get the redirect set up from the sparql.vanderbilt.edu subdomain to the vuswwg.jelastic.servint.net subdomain.  This was accomplished by creating a DNS "A" record pointing to the public IP address of the Nginx instance.  To the best of my knowledge, in the Jelastic system, the Nginx IP address is stable as long as the Nginx server is not turned off.  (Restarting the server is fine.)  If the server were turned off and then back on, the IP address would change, and a new A record would have to be set up for sparql.vanderbilt.edu .  Getting the A record set up required several email exchanges with Vanderbilt ITS before everything worked correctly.  Once the record propagated throughout the DNS system, we could follow the directions to install Let's Encrypt and make the final changes to the Nginx configuration file (see this page for details).

Step 4: Loading data into the triplestore

One consequence of the method that we chose for restricting write access to the server was that it was no longer possible to use the Blazegraph web GUI to load RDF files directly from a hard drive into the triplestore.  Fortunately, files could be loaded using the SPARQL Update protocol or the more generic data loading commands that are part of the Blazegraph application, both via HTTP (see examples here).

One issue with using the generic loading commands is that I'm not aware of any way to specify that the triples in the RDF file be added to a particular named graph in the triple store.  If one's plan for managing the triple store involved deleting the entire store and replacing it, then that wouldn't be important.  However, we plan to compartmentalize the data of various users by associating those data with particular named graphs that can be dropped or replaced independently.  So our options were limited to what we could do with SPARQL Update.

The two most relevant Update commands were LOAD and DROP, for loading data into a graph and deleting graphs, respectively.  Both commands must be executed through HTTP POST.

There are actually two ways to accomplish a LOAD command: by sending the request as URL-encoded text or by sending it as plain text.  I couldn't see any advantage to the URL-encoded method, so I used the plain text method.  In that method, the body of the POST request is simply the Update command.  However, the server will not understand the command unless it is accompanied by a Content-Type request header of application/sparql-update.  Since the server is set up to require authorization for POST requests, an Authorization header is also required, although Postman handles that for us automatically when the Basic Auth option is chosen.
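For example, in Python one might assemble such a request like this.  This is only a sketch: the endpoint URL, credentials, and graph URIs are placeholders, and an HTTP library (requests, urllib, etc.) would do the actual sending:

```python
import base64

def build_load_request(endpoint, update_command, username, password):
    """Assemble the pieces of an authenticated SPARQL Update POST request.

    Returns a dict describing the request rather than sending it, so the
    header construction can be seen (and tested) in isolation.
    """
    credentials = base64.b64encode(
        "{}:{}".format(username, password).encode("utf-8")).decode("ascii")
    return {
        "method": "POST",
        "url": endpoint,
        "headers": {
            # Tells Blazegraph that the body is a SPARQL Update command
            "Content-Type": "application/sparql-update",
            # Basic Authentication, as required by the Nginx configuration
            "Authorization": "Basic " + credentials,
        },
        # The body of the POST is simply the Update command itself
        "body": update_command,
    }

# Placeholder endpoint, graph URIs, and credentials for illustration only
request = build_load_request(
    "https://sparql.vanderbilt.edu/sparql",
    "LOAD <https://example.org/data.ttl> INTO GRAPH <https://example.org/g>",
    "user", "secret")
```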

The basic format of the SPARQL Update LOAD command is:

LOAD <sourceFileURL> INTO GRAPH <namedGraphUri>

where sourceFileURL is a dereferenceable URL from which the file can be retrieved and namedGraphUri is the URI of the named graph in which the loaded triples should be included.  The named graph URI is simply an identifier and does not have to represent any real location on the web.

The sourceFileURL can be a web location (such as GitHub) or a local file on the server if the file: URI type is used (e.g. file:///opt/tomcat/temp/upload/data.rdf).  Unfortunately, the file cannot be loaded directly from your local drive.  Rather, it first must be uploaded to the server, then loaded from its new location on the server using the LOAD command.  To upload a file, click on the Tomcat Config icon (wrench/spanner) that appears when you mouse over the area to the right of the Tomcat entry.  A directory tree will appear in the pane below.  You can then navigate to the place where you want to upload the file.  For the URL I listed above, here's what the GUI looks like:


Select the Upload option and a popup will allow you to browse to the location of the file on your local drive.  Once the file is in place on the server, you can use your favorite HTTP tool to POST the SPARQL Update command and load the file into the triplestore.

This particular method is a bit of a hassle, and is not amenable to automation.  If you are managing files using GitHub, it's a lot easier to load the file directly from there using a variation on the Github raw file URL.  For example, if I want to load the file https://github.com/baskaufs/cv/blob/master/occurrenceStatus/occurrenceStatus.ttl into the triplestore, I would need to load the raw version at https://raw.githubusercontent.com/baskaufs/cv/master/occurrenceStatus/occurrenceStatus.ttl .  However, it is not possible to successfully load that file into Blazegraph directly from Github using the LOAD command.  The reason is that when a file is loaded from a remote URL using the SPARQL update command, Blazegraph apparently depends on the Content-Type header from the source to know that the file is some serialization of RDF.  Github and Github Gist always report the media type of raw files as text/plain regardless of the file extension, and Blazegraph takes that to mean that the file does not contain RDF triples.  If one uses the raw file URL in a SPARQL Update LOAD command, Blazegraph will issue a 200 (OK) HTTP code, but won't actually load any triples.

The solution to this problem is to use a redirect that specifies the correct media type.  The caching proxy service RawGit (https://rawgit.com/) interprets the file extension of a Github raw file and relays the requested file with the correct Content-Type header.  The example file above would be retrieved using the RawGit development URL https://rawgit.com/baskaufs/cv/master/occurrenceStatus/occurrenceStatus.ttl . RawGit will add the Content-Type header text/turtle as it redirects the file.  (Read the RawGit home page at https://rawgit.com/ for an explanation of the distinction between RawGit development URLs and production URLs.)

The SPARQL Update DROP command has this format:

DROP GRAPH <namedGraphUri>

Executing the command removes from the store all of the triples that had been assigned to the graph identified by the URI in the namedGraphUri position.

If the graphs are small (a few hundred or thousand triples), both loading and dropping them takes a trivial amount of time.  However, when the graph size is significant (i.e. a million triples or more), then a non-trivial amount of time is required either to load or to drop the graph.  I think the reason is the indexing that Blazegraph does as it loads the triples; that indexing is what makes Blazegraph's querying so efficient.  The transfer of the file itself can be sped up by compressing it.  Blazegraph supports gzip (.gz) file compression.  However, compressing the file doesn't seem to speed up the actual time required to load the triples into the store.  I haven't done a lot of experimenting with this, but I have one anecdotal experience loading a gzip compressed file containing about 1.7 million triples.  I uploaded the file to the server, then used the file: URI version of SPARQL Update to load it into the store.  Normally, the server sends an HTTP 200 code and a response body indicating the number of "mutations" (triples modified) after the load command is successfully completed.  However, in the case of the 1.7 million triple file, the server timed out and sent an error code.  But when I checked the status of the graph a little later on, all of the triples seemed to have loaded successfully.  So the timeout seems to have been a limit on the communication between the server and client, but not necessarily a limit on the time necessary to carry out actions that are happening internally in the server.

I was a bit surprised to discover that dropping a large graph took about as long as loading it.  In retrospect, I probably shouldn't have been surprised.  Removing a bunch of triples involves removing them from the indexing system, not just deleting some file location entry as would be the case for deleting a file on a hard drive.  So it makes sense that the removal activity should take about as long as the adding activity.

These speed issues suggest some considerations for graph management in a production environment.  If one wanted to replace a large graph (i.e. over a million triples), dropping the graph and then reloading it probably would not be the best option, since both actions would be time consuming and the data would probably be essentially "off line" during the process.  It might work better to load the new data into a differently named graph, then use the SPARQL Update COPY or MOVE functions to replace the existing graph.  I haven't actually tried this yet, so it may not work any better than dropping and reloading.
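As a sketch, the staged replacement I have in mind would look something like this (the graph URIs are placeholders, and again, I haven't tested whether this is actually faster):

```sparql
# 1. Load the new data into a staging graph while the old graph stays live:
LOAD <https://example.org/new-data.ttl> INTO GRAPH <https://example.org/staging> ;

# 2. Swap: MOVE drops the target graph and renames the staging graph to it,
#    so the production graph is only unavailable during the swap itself:
MOVE <https://example.org/staging> TO <https://example.org/production>
```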

Step 5: Documenting the graphs in the triplestore

One problem with the concept of a SPARQL endpoint as a programmable API is that users need to understand the graphs in the triplestore in order to know how to "program" their queries.  So our SPARQL endpoint wasn't really "usable" until we provided a description of the graphs included in the triple store.  On the working group Github site, we have created a "user guide" with some general instructions about using the endpoint and a page for each project whose data are included in the triplestore.  The project pages describe the named graphs associated with the project, including a graphical representation of the graph model and sample queries (an example is here).  With a little experimentation, users should be able to construct their own queries to retrieve data associated with the project.

Step 6: Using the SPARQL endpoint

I've written some previous posts about using our old Callimachus endpoint as a source of XML data to run web applications.  Since Blazegraph supports JSON query results, I was keen to try writing some new Javascript to take advantage of that.  I have a new little demo page at http://bioimages.vanderbilt.edu/lang-labels.html that consumes JSON from our new endpoint.  The underlying Javascript that makes the page work is at http://bioimages.vanderbilt.edu/lang-labels.js .  The script sends a very simple query to the endpoint, e.g.:

SELECT DISTINCT ?label WHERE {
<http://rs.tdwg.org/cv/status/extant> <http://www.w3.org/2004/02/skos/core#prefLabel> ?label.
}

when the query is generated using the page's default values.  Here's what that query looks like when it's URL encoded and ready to be sent by HTTP GET:

https://sparql.vanderbilt.edu/sparql?query=SELECT%20DISTINCT%20%3Flabel%20WHERE%20%7B%3Chttp%3A%2F%2Frs.tdwg.org%2Fcv%2Fstatus%2Fextant%3E%20%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23prefLabel%3E%20%3Flabel.%7D
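The encoding itself can be produced with, for example, Python's standard library urllib (using the compact single-line form of the query):

```python
from urllib.parse import quote

# Compact single-line form of the query shown above
query = ("SELECT DISTINCT ?label WHERE {"
         "<http://rs.tdwg.org/cv/status/extant> "
         "<http://www.w3.org/2004/02/skos/core#prefLabel> ?label.}")

# safe="" forces '/' and ':' to be percent-encoded as well,
# which is what the URL above shows
encoded = quote(query, safe="")
url = "https://sparql.vanderbilt.edu/sparql?query=" + encoded
```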

On Chrome, you can use the Developer Tools to track the interactions between the browser and the server as you load the page and click the button.

The jQuery code that makes the AJAX call looks like this:

$.ajax({
    type: 'GET',
    url: 'https://sparql.vanderbilt.edu/sparql?query=' + encoded,
    headers: {
        Accept: 'application/sparql-results+json'
    },
    success: function(returnedJson) {
        [handler function goes here]
    }
});

Since the HTTP GET request includes an Accept: header of application/sparql-results+json, the server response (the Javascript object returnedJson) looks like this:

{
  "head" : {
    "vars" : [ "label" ]
  },
  "results" : {
    "bindings" : [ {
      "label" : {
        "xml:lang" : "en",
        "type" : "literal",
        "value" : "extant"
      }
    }, {
      "label" : {
        "xml:lang" : "de",
        "type" : "literal",
        "value" : "vorhanden "
      }
    }, {
      "label" : {
        "xml:lang" : "es",
        "type" : "literal",
        "value" : "existente"
      }
    }, {
      "label" : {
        "xml:lang" : "pt",
        "type" : "literal",
        "value" : "presente"
      }
    }, {
      "label" : {
        "xml:lang" : "zh-hans",
        "type" : "literal",
        "value" : "现存"
      }
    }, {
      "label" : {
        "xml:lang" : "zh-hant",
        "type" : "literal",
        "value" : "現存"
      }
    } ]
  }
}
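
A handler can navigate this structure like any other Javascript object.  For example, here's a hedged sketch of a helper that pulls out the label for a single language tag (labelForLanguage is my own name, not part of lang-labels.js):

```javascript
// Given a SPARQL JSON results object, return the ?label value whose
// language tag matches lang, or null if there is no match.
function labelForLanguage(json, lang) {
    var bindings = json.results.bindings;
    for (var i = 0; i < bindings.length; i++) {
        if (bindings[i].label['xml:lang'] === lang) {
            return bindings[i].label.value;
        }
    }
    return null;
}

// With the response shown above:
// labelForLanguage(returnedJson, 'es') returns "existente"
```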

It then becomes a simple matter to pull the label values from the JSON array using this Javascript loop found in the handler function:

var value = "";
for (var i = 0; i < returnedJson.results.bindings.length; i++) {
    value = value + "<p>"
        + returnedJson.results.bindings[i].label["xml:lang"] + " "
        + returnedJson.results.bindings[i].label.value + "</p>";
}

The handler then displays value on the web page.  One issue with the translation from the JSON results to Javascript object references can be seen in the code snippet above.  The JSON key "xml:lang" is not a valid Javascript identifier due to the presence of the colon, so "bracket notation" must be used in the Javascript reference instead of "dot notation" to refer to it.
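
A small illustration of the difference (the object here is just the first binding from the response above):

```javascript
var binding = { "xml:lang": "en", "type": "literal", "value": "extant" };

var lang = binding["xml:lang"];   // bracket notation: works; lang is "en"
// var lang = binding.xml:lang;   // dot notation: a syntax error
var value = binding.value;        // keys without a colon work fine with dot notation
```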

Conclusion

I am quite excited that our endpoint is now fully operational and that we can build applications around it.  One disappointing discovery that I made recently is that, as currently configured, our Blazegraph instance is not correctly handling URL-encoded literals included in query strings.  It works fine with literals that contain only ASCII characters, but including a string like "現存" (URL encoded as "%E7%8F%BE%E5%AD%98") in a query fails to produce any result.  This problem doesn't happen when the same query is made to the Callimachus endpoint.  That is a major blow, since several datasets that we have loaded or intend to load into the triplestore include UTF-8 encoded strings representing literals in a number of languages.  I sent a post about this problem to the Bigdata developers' email list, but have not yet gotten any response.  If anyone has any ideas about why we are having this problem, or how to solve it, I'd be keen to hear from you.

Aside from that little snafu, we have achieved one of the "useful things" that SPARQL endpoints allow: making it possible for users to "program the API" to get any kind of information they want from the datasets included in the triplestore.  It remains for us to explore the second "useful thing" that I mentioned at the start of this post: merging RDF datasets from multiple sources and accomplishing something useful by doing so.  Stay tuned as we try to learn effective ways to do that in the future.

We also hope that at some point in the future we will have demonstrated that there is some value to having a campus-wide SPARQL endpoint, and that once we can clearly show how we set it up and what it can be used for, we will be able to move it from the commercial cloud server to a server maintained by Vanderbilt ITS.