Getting Started with the Sitebuilder API
Accessing web pages with cURL
cURL is a command-line HTTP client, much like a web browser but driven with text commands. For example, to get the University's home page, I could use the command:
mat@augustus:~$ curl "https://warwick.ac.uk"
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://warwick.ac.uk/insite/">here</a>.</p>
</body></html>
Note that because I am on campus, the server returns an instruction to send me directly to insite.
By default, cURL returns the content of a page from a GET request; this is what happens when you type an address into a web browser. If I pass the -i flag to cURL, it will also show me the server headers, and passing -I sends a HEAD request instead of a GET, which tells the server I don't care about the contents of the page.
mat@augustus:~$ curl -i -I "https://warwick.ac.uk"
HTTP/1.1 302 Found
Date: Mon, 21 Mar 2011 17:34:51 GMT
Server: Apache/2.2.9 (Unix) mod_ssl/2.2.9 OpenSSL/0.9.7d
Location: https://warwick.ac.uk/insite/
Vary: Accept-Encoding
Content-Type: text/html; charset=iso-8859-1
A slightly more interesting URL to fetch is one of the API endpoints. Sitebuilder provides a JSON (JavaScript Object Notation) API to get information about a page at https://sitebuilder.warwick.ac.uk/sitebuilder2/api/page.json?page=..., where the page parameter is the part of your Sitebuilder page's URL that comes after warwick.ac.uk. So to get information about the IT Services home page at http://warwick.ac.uk/services/its, we send a GET request as follows:
mat@augustus:~$ curl "https://sitebuilder.warwick.ac.uk/sitebuilder2/api/page.json?page=/services/its"
{
"url": "https://warwick.ac.uk/services/its",
"pageType": "html",
"linkCaption": "IT Services",
"shortTitle": "IT Services - Warwick University",
"pageHeading": "Welcome to IT Services",
"deleted": false,
"path": "/services/its",
"publiclyVisible": true,
"mimeType": "text/html",
"properties": {
"pageOrder": 0,
"supportsPagesToGo": false,
"showInLocalNavigation": false,
"hasThumbnail": false,
"deferJs": false,
"widePage": false,
"allowSearchEngines": true,
"spanRhs": true,
"escapeHtml": false
},
"keywords": [
"computing",
"computers",
"it services",
"information technology"
],
"siteRoot": "/services/its"
}
Note that in this example the JSON has been simplified and formatted to make it easier to read. The response describes the page in a form that a computer program can read, but it only includes publicly visible information. If we try to get information about a protected page, the request will not succeed, so we need to authenticate to the server first.
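As a minimal sketch of how the page parameter is derived, the shell function below takes either a full warwick.ac.uk address or a bare path and prints the matching API URL. The function name page_api_url and the API_BASE variable are my own choices for illustration and are not part of Sitebuilder.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sitebuilder): builds the page.json URL
# for a page, given its full warwick.ac.uk address or just its path.
API_BASE="https://sitebuilder.warwick.ac.uk/sitebuilder2/api/page.json"

page_api_url() {
    # Strip the scheme and host if present, leaving only the path
    # that comes after warwick.ac.uk. A bare path is left unchanged.
    page_path="${1#*://warwick.ac.uk}"
    printf '%s?page=%s\n' "$API_BASE" "$page_path"
}

# Prints the API URL for the IT Services home page.
page_api_url "http://warwick.ac.uk/services/its"

# To fetch the JSON itself (needs network access):
#   curl "$(page_api_url /services/its)"
```

The curl call is left as a comment so that the script can be run without touching the network.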
Using HTTP Basic authentication
Adding &forcebasic=true to the end of any Sitebuilder URL accessed over HTTPS tells Sitebuilder to ask for HTTP Basic authentication. This is a lightweight authentication protocol that requires a username and password to be sent with each request.
In cURL, this is achieved by using the -u parameter to specify a username - cURL then prompts the user for a password.
The username is your usual ITS username, with the password being your ITS account password with a valid two-step authentication code added on the end.
In this example, note how accessing a protected page without these parameters redirects the user to Web sign-on to log in, but the request succeeds when a username and password are correctly specified.
mat@augustus:~$ curl -i "https://sitebuilder.warwick.ac.uk/sitebuilder2/api/page.json?page=/services/its/intranet"
HTTP/1.1 302 Moved Temporarily
Date: Mon, 21 Mar 2011 17:48:26 GMT
Server: Penny
Location: https://websignon.warwick.ac.uk/origin/slogin?shire=https%3A%2F%2...
Content-Length: 0
Vary: User-Agent
Content-Type: application/json
mat@augustus:~$ curl -i -u cuscav "https://sitebuilder.warwick.ac.uk/sitebuilder2/api/page.json?page=/services/its/intranet&forcebasic=true"
Enter host password for user 'cuscav':
HTTP/1.1 200 OK
Date: Mon, 21 Mar 2011 17:49:19 GMT
Server: Penny
Content-Type: application/json;charset=ISO-8859-1
Content-Language: en-GB
Content-Length: 1552
Vary: User-Agent
{"editedUpdated":{"date":1297850802000,"user":"Emily Harding"},"contentUpdated":...
Important: Because HTTP Basic authentication sends an encoded (not encrypted) version of the username and password with every request, it is less secure than standard authentication through web sign-on. In addition, automated programs by their nature require usernames and passwords to be stored on a computer, and storing your actual ITS usercode and password in this way would be a breach of the Terms and Conditions of IT Services accounts.
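To see why the credentials need protecting, the sketch below builds the Authorization header that an HTTP client sends for Basic authentication and then reverses it. The header is only Base64-encoded, so anyone who can read it can recover the password without a key. The username and password are the example pair from RFC 7617, not a real account, and I am assuming a base64 command that accepts -d for decoding (as GNU coreutils does).

```shell
#!/bin/sh
# Example credentials taken from RFC 7617; not a real account.
credentials='Aladdin:open sesame'

# What an HTTP client puts in the Authorization header for Basic auth:
# the string "username:password", Base64-encoded.
encoded=$(printf '%s' "$credentials" | base64)
echo "Authorization: Basic $encoded"

# Anyone who sees the header can reverse the encoding; no key is needed.
# Assumes a base64 that accepts -d (GNU coreutils and recent macOS).
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "Decoded: $decoded"
```

This is why Basic authentication should only ever be used over HTTPS, where the whole request, including this header, is encrypted in transit.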
For automated systems that access Sitebuilder APIs, you must use an external user account for API access. You should not share an API account between multiple systems, and you should ensure that each account is granted the minimal permissions it needs to perform its tasks, and only on the pages where it needs to perform these tasks.
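One way an automated script might keep the API account's credentials out of its own source is to read them from environment variables, as in the sketch below. The variable names SITEBUILDER_API_USER and SITEBUILDER_API_PASSWORD and the function name fetch_page_info are assumptions made up for this example; only the URL, the forcebasic parameter and cURL's own options come from the tutorial or from cURL itself.

```shell
#!/bin/sh
# Hypothetical sketch: the variable and function names here are
# illustrative and not defined by Sitebuilder.
# Set SITEBUILDER_API_USER and SITEBUILDER_API_PASSWORD to the
# external API account's credentials before calling fetch_page_info.

fetch_page_info() {
    # $1 is the page path, e.g. /services/its/intranet
    if [ -z "${SITEBUILDER_API_USER:-}" ] || [ -z "${SITEBUILDER_API_PASSWORD:-}" ]; then
        echo "Set SITEBUILDER_API_USER and SITEBUILDER_API_PASSWORD first" >&2
        return 1
    fi
    # --fail makes curl exit non-zero on HTTP errors;
    # --user supplies the Basic authentication credentials.
    curl --silent --fail \
        --user "$SITEBUILDER_API_USER:$SITEBUILDER_API_PASSWORD" \
        "https://sitebuilder.warwick.ac.uk/sitebuilder2/api/page.json?page=$1&forcebasic=true"
}

# Example call (needs network access and a valid API account):
#   fetch_page_info /services/its/intranet
```

Passing the password with --user makes it visible to other users of the same machine in the process list while cURL runs, so on a shared host a cURL config file readable only by the script's owner may be a better fit.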
Getting an external user account for API access
In order to obtain an external user account for API access, you will need to request to become an external user creator by filling in the form here.