Cascot: Computer Assisted Structured Coding Tool
Home | Details | Online version | Further Information | Purchase | Cascot International |
Cascot is a computer program designed to make the coding of text information to standard classifications simpler, quicker and more reliable.
The software is capable of occupational coding and industrial coding to the UK standards developed by the UK Office for National Statistics. These are the Standard Occupational Classification (SOC) and the Standard Industrial Classification (SIC). Cascot currently supports SOC 2010, SOC 2000, SOC 90, SIC 2007, SIC 2003, SIC 92 and SIC 80. For more information on these classifications please refer to the ONS information on SOC 2010, SOC 2000, SIC 2007, SIC 2003 and SIC 92.
Occupational coding and industrial coding arise in a number of situations. Examples include job titles which may be entered as free text on surveys or administrative databases. A job title is indicative of the kind of work people do or would like to do, or the sorts of jobs in which employers want people to work. Information like this is collected routinely in a wide range of settings such as job vacancy advertising, careers guidance or official statistical enquiries. Coding is the process of categorising the huge range of all possible answers into a predefined set of categories (each category having a unique code).
A desktop version of the Cascot software suitable for processing high volumes of data is available for purchase. For more information about the software see further information.
A multilingual ISCO-08 version of Cascot is available. For more information see Cascot International.
Cascot International
The work to develop a nine-language version of Cascot International was funded under the European Union FP7 project: Data Services Infrastructure for the Social Sciences and Humanities - DASISH (completion date: December 2014). The nine languages included were Dutch, English, Finnish, French, German, Italian, Portuguese, Slovak and Spanish. The upgraded software facilitated future extension to incorporate additional languages as and when relevant index materials became available.
In February 2016, a project began to extend Cascot to a further five languages (Arabic, Chinese, Hindi, Indonesian and Russian). This has been funded by Synergies for Europe's Research Infrastructures in the Social Sciences (SERISS) via the European Social Survey European Research Infrastructure Consortium (ESS ERIC). A beta version of Cascot with all fourteen languages is now available for users to test. It is still undergoing further fine-tuning, especially for Hindi and Chinese, and updated versions will be made available as testing is completed.
In any of the language versions, coding is facilitated to both ISCO-08 and, where available, a relevant national occupational classification. Users have access to all language interfaces and classification files. The Cascot International package consists of three pieces of software: Cascot (the coding tool); Cascot Editor (the tool for creating and modifying classifications and fine-tuning the coding process); Cascot Performance Tool (for evaluating coding performance).
The available languages are:
Arabic; Chinese; Dutch; English; Finnish; French; German; Hindi; Indonesian; Italian; Portuguese; Romanian; Russian; Slovak; Spanish
To purchase Cascot International (beta version) please complete the Purchase Order Form.
Participants in the DASISH and SERISS projects who wish to obtain Cascot International should contact IER in the first instance.
Cascot is designed to assign a code to a piece of text. In the case of the Standard Occupational Classification (SOC) this piece of text is typically a job title. For the Standard Industrial Classification (SIC) the text is a description of the main products or services provided by an employing establishment. The quality of coding performed by Cascot depends on the quality of the input text.
Ideally the text should contain sufficient information to distinguish it from alternative text descriptions which may be coded to other categories within the classification, but it should not contain superfluous words. This ideal will not always be met but Cascot has been designed to perform a complicated analysis of the words in the text, comparing them to the words in the classification, in order to provide a list of recommendations. If the input text is not sufficiently distinctive it may not be the topmost recommendation that is the correct code.
When Cascot assigns a code to a piece of text it also calculates a score from 1 to 100 which represents the degree of certainty that the given code is the correct one. When Cascot encounters a word or phrase that is descriptive of an occupation or industry but lacks sufficient information to distinguish it from other categories (i.e. without any further qualifying terms), such as 'Teacher' or 'Engineer', it will attempt to suggest a code, but the score is limited to below 40 to indicate the uncertainty associated with the suggestion.
For SOC specific information including examples of problematic input text please read these further details.
The performance of Cascot has been compared to a selection of high quality manually coded data. The overall results show that 80% of records receive a score greater than 40 and of these 80% are matched to manually coded data. When using Cascot you can expect this level of performance with similar data, but be aware that the performance is dependent on the quality of your input data.
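The two percentages quoted above combine multiplicatively. As a minimal sketch (the variable names are illustrative, and the figures are simply those quoted above, not a guarantee for other data), the share of all records that both score above 40 and agree with the manual coding can be worked out as follows:

```python
from fractions import Fraction

# Figures quoted in the text above; results on other data will differ.
share_scoring_above_40 = Fraction(80, 100)
share_of_those_matching_manual = Fraction(80, 100)

# Share of ALL records that score above 40 and agree with the manual code.
share_confident_and_correct = share_scoring_above_40 * share_of_those_matching_manual

print(f"{float(share_confident_and_correct):.0%}")  # prints 64%
```

In other words, on comparable data roughly two records in three would be expected to receive a confident code that agrees with manual coding; the remainder need checking.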
The SOC classification is designed to code job titles. Captured data often amounts instead to a job description or an industry area, or is too verbose, providing additional information beyond the job title.
Good quality input should be a job title, as would be used inside a place of work, on a business card or employment contract. Where this would give a title that would be ambiguous in meaning, the title may be qualified with one or more terms in brackets. Such qualification may be used to indicate:
* Type of industry, e.g. Chemist (pharmaceutical); Chemist (retail trade); Administrator (insurance); Administrator (government); Administrator (local government); Administrator (charitable organisation); Administrator (trade union)
* Type of work, e.g. Advisory Officer (welfare); Advisory Officer (housing)
* Level of work, e.g. Mechanical Engineer; Mechanical Engineer (professional)
* Place of work, e.g. Teacher (primary school); Teacher (university)
Such bracketed expressions may be combined, e.g. Engineer (professional, structural); Engineer (professional, water); Engineer (professional, highways).
Sometimes it is possible to express a job title several ways, e.g. Police inspector; Inspector (police service). In these cases either/any format would be as good as the next.
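The bracketed-qualifier convention above is regular enough to handle programmatically when preparing input text. The following is a minimal sketch only: `split_job_title` and `format_job_title` are hypothetical helpers written for illustration and are not part of Cascot.

```python
import re

# Hypothetical helpers, not part of Cascot: they only illustrate the
# "Title (qualifier, qualifier)" convention described in the text.
_TITLE_RE = re.compile(r"^\s*(?P<title>[^()]+?)\s*(?:\((?P<quals>[^()]*)\))?\s*$")

def split_job_title(text):
    """Return (title, [qualifiers]) for text such as 'Engineer (professional, water)'."""
    match = _TITLE_RE.match(text)
    if match is None:
        raise ValueError(f"not in 'Title (qualifier, ...)' form: {text!r}")
    quals = match.group("quals") or ""
    return match.group("title"), [q.strip() for q in quals.split(",") if q.strip()]

def format_job_title(title, qualifiers=()):
    """Build the bracketed form from a title and optional qualifiers."""
    return f"{title} ({', '.join(qualifiers)})" if qualifiers else title
```

For example, `split_job_title("Engineer (professional, water)")` separates the title from its two qualifiers, and `format_job_title("Teacher", ["primary school"])` rebuilds the bracketed form.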
Examples: "casual work"; "temp"
Comment: This should be whatever job they are doing 'casually'.

Example: "adult & continuing education"
Comment: This is where they are working (or spending time). It is not a job title or even a job description. Are they a teacher, a student or even a cleaner?

Examples: "advertising & marketing"; "advertising"; "assurance & business advisory services"
Comment: Similarly, these are areas of work. Are they an advertising executive, an advert designer or an advertising salesman? Try to find out what someone does in preference to where or what type of industry and, if possible, their actual job title.

Examples: "allocating stock to stores, supplier communication"; "arranging flights"; "arranging loans"; "assembling kits for car parts"
Comment: These are descriptions of duties, not job titles.

Example: "analyst, meeting corporate clients"
Comment: Be brief and concise.
Alternative: "business analyst"

Example: "answering phone enquiries"
Comment: Again this is a job description, not a title. There could be enquiries in all sorts of different jobs. If you know more about the nature of the job, use this information.
Alternative: "call centre operative"

Example: "application developer - developing lotus notes databases for clients"
Comment: This has too much detail. Again be brief and concise. Try to get the job title and qualify if ambiguous.
Alternative: "application developer (software)"

Examples: "assistant correspondent for Japanese broadsheet"; "behavioural scientist - interactivity with digital technology"; "applications chemist - provide technical information, product training, and troubleshooting"
Comment: Again, far too much detail for all of these.
Alternatives: "assistant correspondent (newspaper)"; "behavioural scientist"; "applications chemist"
FAQ
Questions about CASCOT software
- 5 digit codes
- Adding or modifying occupations
- Automatic coding
- Coding per hour
- Licensing
- Number of records
- Output files
- SIC 2003
- Socio-economic classification
- System requirements
Can the online version produce the SOC(DLHE) 5 digit codes?
The answer to this is no. The stand-alone version for high volume that we sell does not produce these codes either. However, we have produced a version which does do this. It is distributed free of charge by HESA. Please contact Rachel Hewitt at HESA: 01242 211122, for further details.
Is it possible to add additional occupations to the software?
If you are using the Standard Occupational Classification (SOC) index it is not possible to add occupations.
We sell CASCOT software with an editor package that allows you to modify the standard classifications or to introduce your own.
Can you tell me what percentage of occupations it will be able to code automatically?
You can set the automation threshold in CASCOT. If you set the threshold to e.g. 56, it means that if the score is 56-100 CASCOT will code the item automatically, and if the score is 0-55 it will stop and expect the user to code manually. At the extremes, you can code everything automatically (resulting in lower accuracy) or code everything manually, which takes more time but usually results in better accuracy.
According to tests that have been made the optimal automation level is around 64. Please bear in mind that the automatically created codes are not always correct and will need checking if you want to have a better result.
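The percentage coded automatically therefore depends on both the threshold and the scores your own data receives. A minimal sketch of that relationship is shown below; the function name and the example scores are made up for illustration and do not come from Cascot.

```python
def share_coded_automatically(scores, threshold):
    """Fraction of records whose score is at or above the threshold."""
    if not scores:
        return 0.0
    # Per the FAQ answer above, a threshold of 56 means scores 56-100 are automatic.
    automatic = sum(1 for score in scores if score >= threshold)
    return automatic / len(scores)

# Made-up scores, for illustration only.
example_scores = [12, 38, 55, 56, 64, 71, 90, 100]
for threshold in (40, 56, 64):
    print(threshold, share_coded_automatically(example_scores, threshold))
```

Raising the threshold lowers the share coded automatically and leaves more records for manual coding.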
Approximately how many occupations can be manually coded per hour?
We use 100 occupations/hour as a guideline for an experienced user. It varies a lot though depending on the material you are coding. It is slow to code "difficult" occupational titles like Change Manager, New Business Coordinator or Knowledge Transfer Manager or ambiguous ones like Project Manager or Senior Supervisor where you would need more information about the occupation. "Easier" occupational titles like Primary School Teacher, Carpenter, Hotel Manager or Nurse (especially the occupations that are included in the SOC index) are much faster to do.
Licensing
We license CASCOT on an organisational and site basis. In other words, once an organisation has purchased a copy, it is free to make multiple copies as long as use is restricted to employees of that organisation. For example, a university may buy a copy, and different research groups who are employees of that university may then use the software - but students may not, given that they are not employees. If an organisation is located at different sites, each site requires a licence. We monitor requests for technical help to ensure that the person requesting help is an employee at a licensed organisation.
Number of records
There is no limit to the number of records that can be processed in one session. If set to code without intervention, the speed of coding depends upon the processor power you have and the hard disk read/write speed. A single processor running at (say) 1.6 GHz with a typical disk setup will process 10,000 cases in about 30 minutes. Much faster speeds can be obtained with shorter text and faster or multiple processors.
What form is the output file with all the coding in it and how can I open it?
By default, the output file from CASCOT is a simple TAB-delimited text file (ASCII file) which can be opened with Windows Notepad or WordPad. If you do not enter a file extension when you open a new output file in CASCOT, Windows does not recognise the file type later and cannot derive with which application it should be opened. It is useful to define the output file as something like 'outputfile.txt' where the .txt extension will tell Windows it is a text file.
You can open the output file for example with Excel or SPSS as follows.
In Excel:
- select File>Open
- select Files of type 'All files' or 'Text files'
- select the output file from the file list, click Open
- select Original data type 'Delimited', click Next
- select 'Tab' as Delimiter, click Next
- click Finish
In SPSS:
- select File>Open>Data
- select Files of type 'Text (*.txt)'
- select the output file from the file list, click Open
- in Text Import Wizard Step 1 select 'No' to Does your text file match...
- in Step 2 select 'Delimited', the answer to the second question depends on whether you chose to have titles on the first row of the output file in CASCOT
- in Step 3 click Next
- in Step 4 select only 'Tab' as Delimiter
- in Step 5 you might need to change the Data format for any string fields
- in Step 6 click Finish
Please note that you can change the delimiter of the output file in CASCOT in Options>Output>Fields separated by, which will affect how the file should be opened in Excel or SPSS.
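The same output file can also be read programmatically. The sketch below uses Python's standard `csv` module and assumes a TAB-delimited text file as described above; the function name is hypothetical, the column names depend entirely on the fields you chose to write out in CASCOT, and UTF-8 is assumed as the encoding (it is a superset of ASCII).

```python
import csv

def read_cascot_output(path, delimiter="\t", has_header=True):
    """Read a delimited output file into a list of rows.

    With has_header=True the first row is treated as field titles and each
    record is returned as a dict; otherwise each record is a list of strings.
    """
    # Encoding is an assumption; adjust it if your file uses another one.
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle, delimiter=delimiter))
    if has_header and rows:
        header, data = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in data]
    return rows
```

A call such as `read_cascot_output("outputfile.txt")` would load the file named in the example above; pass a different `delimiter` if you changed the field separator in the CASCOT options, and `has_header=False` if you did not write titles on the first row.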
Is there a lookup file of SOC 2010 classification occupation codes to a measure of social class or socio-economic classification?
The ONS website has a lookup table which shows how to derive the NS-SEC from SOC2010 4-digit codes. Besides SOC2010 codes, the additional information needed for this purpose is establishment size and status in employment. If such additional information is not available, the website also has details of how a lower quality version of the NS-SEC can be produced. The website has an online coding tool.
System requirements
Any version of Windows from Windows 98 onwards will be OK. Cascot is also available for other operating systems. The faster the processor, the better. When running in fully automatic mode you can expect to process about 100,000 job titles per hour. You will need to have the Java Runtime Environment on machines that run CASCOT. This will be checked during installation and, if you do not have the required version (or no version) on the machine, you will be directed to the Sun Microsystems site where you can download and install the Java Runtime Environment.
SIC 2003
If you are having problems seeing SIC 2003 as a coding option you can download the file SIC2003.classification (2 MB). Place the file in the same folder from which you run CASCOT, then use the 'Open classification' option on the 'File' menu. Navigate to this folder and click on the classification.
A stand-alone version of Cascot has been developed for use with high volumes of data. It can be used by typing in each piece of text that is to be coded. Alternatively it can operate by reading records from an input file and writing resulting codes, scores, etc. to a separate output file optionally with the input data or components thereof.
A PowerPoint presentation showing the advanced features of this desktop version is available for download, as is another presentation using SOC2010.
It supports three modes of operation.
No automation:
Each piece of text is coded individually. The results are displayed to the user who must accept or change the result before moving on to the next piece of text. The text may be typed in or read from a file.
Fully automatic:
The entire input file is read and processed into an output file. A certainty score threshold is set below which no code is assigned to the input text.
Semi-automatic:
A certainty score threshold is set above which the input file is processed automatically but below which the user is prompted for a decision.
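The three modes differ only in what happens to a suggested code once its score is known. The sketch below is an illustration of that decision logic, not Cascot's implementation: the `Mode` enum, the `decide` function and the returned labels are all hypothetical names.

```python
from enum import Enum

class Mode(Enum):
    NO_AUTOMATION = "no automation"
    SEMI_AUTOMATIC = "semi-automatic"
    FULLY_AUTOMATIC = "fully automatic"

def decide(mode, score, threshold):
    """Return 'accept', 'ask_user' or 'leave_uncoded' for one suggested code."""
    if mode is Mode.NO_AUTOMATION:
        # Every result is shown to the user, whatever the score.
        return "ask_user"
    # Assumption: a score equal to the threshold counts as automatic.
    if score >= threshold:
        return "accept"
    # Below the threshold: prompt in semi-automatic mode, assign no code otherwise.
    return "ask_user" if mode is Mode.SEMI_AUTOMATIC else "leave_uncoded"
```

For instance, with a threshold of 40 a score of 35 would lead to a prompt in semi-automatic mode and to no code being assigned in fully automatic mode.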
You can also run the stand-alone version of Cascot as a command-line executable, which can be invoked with various options. To start, run from the command line
clcascot.exe -h
or
cascot.exe -h
To purchase the stand-alone version of Cascot please fill in a purchase order form.
If you are interested in the Cascot API please contact IER.
Cascot Editor
With Cascot Editor users can create and modify their own classifications for Cascot. Editor is included in the desktop version on request. Please note that the use of Editor is not supported. A presentation of Cascot Editor is available.
Cascot requires a Java 2 Standard Edition (J2SE) Runtime Environment (JRE) version 1.8 or higher.
The latest version can be downloaded for free - Oracle free Java download.
For more information about Java see Java Technology.
You will need authority to install software on your machine.
Pricing
Cascot costs £291.67 (excluding V.A.T.) + £58.33 V.A.T. (20%) = £350.00
We charge VAT unless a VAT exemption certificate is presented or the customer is outside the UK with a valid VAT number.
Purchase Order
If you wish to purchase Cascot you will need to fill in & submit the purchase order form below. You will then be sent an email detailing how to download the software, together with an electronic key to unlock the software after it is installed. An invoice will be sent via email or post to the contact details you have provided. If you wish to pay by credit card, continue to payment page.
Credit card payment
We accept the following cards for payment:
The email address that you supply will receive a receipt for the licence fee payment and your log in details, once payment has been confirmed. If you require a VAT receipt for your payment, contact l dot marston at warwick dot ac dot uk.
CASCOT payment confirmation
Your payment has been successful. You will receive a separate email payment receipt from WorldPay. Please keep this safe as proof of payment.
Once the purchase order form has been completed, you will be sent an email detailing how to download the software, together with an electronic key to unlock the software after it is installed.
By logging into the CASCOT software, it is assumed that you have accepted the University of Warwick's Acceptable Use Policy governing access to the module, detailing issues of copyright and the non-disclosure of your login details.
Cascot requires a Java 2 Standard Edition (J2SE) Runtime Environment (JRE) version 1.4 or higher.
The latest version can be downloaded for free - SUN Microsystems Java install.
For more information about Java see Java Technology.
You will need authority to install software on your machine.
Pricing
Cascot costs £255.32 (excluding V.A.T.) + £44.68 V.A.T. = £300
Purchase Order
If you wish to purchase Cascot you will need to fill in & submit the purchase order form below.
You will then be sent an email detailing how to download the software, together with an electronic key to unlock the software after it is installed. An invoice will be sent via email and post to the contact details you have provided.
Thank you for purchasing CASCOT
We will send you an email with instructions for download and installation.
The web-based version of Cascot previously available on this page has been removed because it was created using Java applets. These are no longer supported by most browsers. This does not affect the desktop version. For further information about the desktop version of Cascot please see this presentation which shows the main features and functionality of the software.