Notes
One of the nicest things about HTTP APIs is how easy they are to interact with and debug. If you're not familiar with tools like netcat or curl, or you've never used tcpflow or ethereal (now called Wireshark) to watch an HTTP conversation, it's worth spending some time on them. You can debug a lot of API issues using these simple tools. If you must use Windows, you can get them through Cygwin.
$ curl -s http://www2.warwick.ac.uk/sitebuilder2/api/rss/siteChangesRss.htm?page=/services/its | xmlstarlet sel -t -m '//item' -v "concat(title,',',pubDate)" -n
MATLAB Site licence,Wed, 12 Nov 2008 10:44:01 GMT
Latest Availability,Wed, 12 Nov 2008 10:33:18 GMT
Feedback,Tue, 11 Nov 2008 16:59:09 GMT
Software and applications,Tue, 11 Nov 2008 15:36:10 GMT
On-line SiteBuilder2 Training,Tue, 11 Nov 2008 14:48:14 GMT
SPSS 16 - Elementary Statistical Methods,Tue, 11 Nov 2008 14:29:19 GMT
How do I create a Pidgin account?,Tue, 11 Nov 2008 11:57:36 GMT
$ sudo tcpflow -cs dst 137.205.195.44 or src 137.205.195.44 | grep 'HTTP'
tcpflow[1276]: listening on eth0
137.205.195.044.00080-137.205.194.211.55480: HTTP/1.1 200 OK
137.205.194.211.55479-137.205.195.044.00080: GET /services/its/intranet/people/leaversjoiners/ HTTP/1.1
137.205.195.044.00080-137.205.194.211.55479: HTTP/1.1 200 OK
137.205.194.211.55479-137.205.195.044.00080: GET /services/its/intranet/ HTTP/1.1
137.205.195.044.00080-137.205.194.211.55479: HTTP/1.1 200 OK
137.205.194.211.55479-137.205.195.044.00080: GET /services/its/intranet/contactbutton.gif HTTP/1.1
137.205.195.044.00080-137.205.194.211.55479: HTTP/1.1 304 Not Modified
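If you would rather do the same extraction from a script than from a shell pipeline, something like the following might work. It is only a sketch: the embedded feed is a trimmed-down sample that reuses two titles and dates from the output above, the function name titles_and_dates is made up for illustration, and it assumes the feed is plain RSS 2.0 with title and pubDate elements inside each item.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample feed (an assumption about the feed's shape). The two
# items reuse titles and dates from the curl output above; a real script
# would fetch the feed over HTTP instead of embedding it.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample</title>
    <item>
      <title>MATLAB Site licence</title>
      <pubDate>Wed, 12 Nov 2008 10:44:01 GMT</pubDate>
    </item>
    <item>
      <title>Latest Availability</title>
      <pubDate>Wed, 12 Nov 2008 10:33:18 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def titles_and_dates(rss_text):
    """Return one 'title,pubDate' string per <item>, like the xmlstarlet call."""
    root = ET.fromstring(rss_text)
    return [
        f"{item.findtext('title', default='')},{item.findtext('pubDate', default='')}"
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for line in titles_and_dates(SAMPLE_RSS):
        print(line)
```

Because the feed is a proper API, the parsing code only depends on element names, not on page layout.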
The fact that we use HTTP for our interfaces has one drawback worth pointing out: there is sometimes a blurring between what's an API (i.e. it is stable, easy to parse, and has reasonably self-evident semantics) and what's part of the UI (none of the above). Often we end up using parts of the UI like an API. This is OK, but the resulting code tends to be more fragile. Where people are building up a substantial dependency on UI-as-API, we'd like to know about it, because it shows up places where we'd get value by building a proper API.
When we say 'be nice' we mean, primarily, 'don't do parallel requests without permission'. There are a few reasons for this:
- If you make large numbers of parallel requests, you're placing a higher load on the system. We don't currently have separate resource controls for API access vs end-user access, or for separate API users, so you'll end up breaking it for other people.
- Since we use Apache front ends, which have limited numbers of connections, you may end up simply consuming all of the available connections, thus DoS-ing the server.
- Parallel code is easy to get wrong, and if you make a mistake you may end up hammering the server very hard indeed. If you mess up a single-threaded app, you won't cause as much damage.
- There are edge cases (multiple parallel permissions updates being one) where you can deadlock two requests, and in doing so take your site entirely offline.
- We throttle some kinds of parallel requests, so your application will need to be smart enough to detect when an operation has been throttled and retry appropriately.
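
One way to follow the points above is to issue requests strictly one at a time and back off when throttled. The sketch below is illustrative rather than a description of our servers: these notes don't say how a throttled request is signalled, so the code assumes an HTTP 429 or 503 status, and the names http_get, fetch_politely and fetch_all are hypothetical.

```python
import time
import urllib.error
import urllib.request

# Assumption: throttling shows up as HTTP 429 or 503. The notes above do not
# say how a throttled request is signalled, so adjust this set to match.
THROTTLE_STATUSES = {429, 503}


def http_get(url):
    """Fetch url and return (status, body); HTTP errors become a status code."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def fetch_politely(url, fetch=http_get, max_attempts=5, base_delay=1.0,
                   sleep=time.sleep):
    """Make one request at a time, backing off and retrying when throttled."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status not in THROTTLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"still throttled after {max_attempts} attempts: {url}")


def fetch_all(urls, **kwargs):
    """Process URLs strictly in sequence: no threads, no parallel requests."""
    return [fetch_politely(url, **kwargs) for url in urls]
```

The fetch and sleep parameters exist only so the retry logic can be exercised without touching the network; a bug in a loop like this sends at most one request at a time, which limits the damage it can do.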