The Warwick RSE Group
What is RSE?
Research Software Engineering is the application of software engineering techniques derived from industrial best practice to research software. This involves concepts such as version control, formal development methodologies and software testing. The aim is to ensure that research software is trustworthy, reliable and citable.
Who are we and what do we do?
The RSE group is part of the Research Technology Platform in Scientific Computing. We currently have 2 staff members whose duties include writing code for specific projects, supporting the Tier-2 MidPlus HPC network and users of SCRTP resources, and providing training in programming and software development.
General Software Questions
In no particular order, good places to look for help are:
- The documentation for the code, package etc
- Your favourite search engine. Remove irrelevant specifics, such as the current date or your machine name, from any error messages before searching for them
- Any forums or issue trackers
- Any other users or developers, for example in your department
If you need our help as general programmers with a wide breadth of knowledge and experience, you can contact us via this form or by email, post on our forum, or come along to our RSE surgeries (when/where coming soon...!).
First, make sure you need to file a report. Search any relevant fora or bug trackers to see whether this is a known problem. There may be a solution, or you may be better off adding to an existing report.
Next, gather information about your problem or bug. Remove details that are specific to your own run and that nobody else needs: your machine name, the current date, the name of the file you're working with. Include relevant general information, for example whether you're working with a very large file, a large number of files, or over a remote link. Also include the specifics that do matter: the program version, your operating system, your environment and any special configuration.
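As a rough illustration of stripping run-specific details from an error message before searching for it or pasting it into a report, here is a minimal Python sketch. The function name `scrub_error_message`, the placeholder tokens and the example message are all hypothetical, and the patterns only cover ISO-style dates, `hh:mm:ss` times and Unix-style paths, so treat it as a starting point and always check the result by eye.

```python
import re

def scrub_error_message(message, machine_name=None):
    """Return a copy of an error message with run-specific details replaced.

    Illustrative only: the patterns below are deliberately simple and
    will miss other date, time and path formats.
    """
    scrubbed = message
    # ISO-style dates (2024-03-01) and clock times (14:22:07)
    scrubbed = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", scrubbed)
    scrubbed = re.sub(r"\d{2}:\d{2}:\d{2}", "<time>", scrubbed)
    # Unix-style paths such as /home/user/run7/input.dat
    scrubbed = re.sub(r"(?:/[\w.\-]+)+", "<file>", scrubbed)
    # The machine name, if the caller supplies one
    if machine_name:
        scrubbed = scrubbed.replace(machine_name, "<host>")
    return scrubbed

# Made-up error message, for demonstration only
example = ("2024-03-01 14:22:07 node042: cannot open "
           "/home/u1234/run7/input.dat: No such file")
print(scrub_error_message(example, machine_name="node042"))
```

What remains ("cannot open", "No such file") is the part other people are likely to have seen too, which is what makes it useful as a search term.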
Now construct your report:
- Look for an FAQ or a sticky link where you are submitting and read and follow it. For example, the SCRTP's bugzilla guidelines are here.
- A short, descriptive title. Other users might recognise the problem, and be able to help.
- Version information for your program, operating system and any other relevant things. For example, the compiler version for compiled code.
- If possible, a minimum working example. This is the smallest piece of code, script, or action which reproduces your problem.
- An explanation of what happens.
- The circumstances under which it happens, for example, whether it only happens sometimes.
- If you get an error, include the text using the suggestions under Error Messages.
- If your bug pertains to an unexpected result or behaviour, give a brief description of what you expected would happen instead.
- What have you tried doing to solve the problem so far?
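For the version information in the list above, a small script can gather the basics for you. The following is a minimal sketch for a Python program, using only the standard `platform` module; the function name `environment_summary` and the choice of fields are illustrative assumptions, and for compiled code you would record the compiler version instead.

```python
import platform

def environment_summary():
    """Collect basic version details worth pasting into a bug report."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "architecture": platform.machine(),
    }

# Print one "name: value" line per item, ready to paste into a report
for name, value in environment_summary().items():
    print(f"{name}: {value}")
```

The machine name is deliberately left out, in keeping with the advice to remove details that are specific to you.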
Which language should I learn or use?
Choosing the "right" language can seem crucial, but in the long run it matters far less than you might think. With a well-suited language you may be able to do more useful work early on, but the core principles are common to nearly every language, and learning those is the goal. The following are suggested, in order of priority:
- Whatever your supervisor uses/wants you to use
- The "one true language" of your field or research group
- Whatever your friends/colleagues use, so you can get help easily if needed
- Whichever has the best library support for your task or field (in most cases this is the "one true language" of your field, as above)
- Whatever appeals to you
So are there any "wrong" languages to learn?
Short answer - no. You can gain things of value from any language. Longer answer - maybe. Languages are created by humans, and some humans do things just because they can. There are many "joke" or "example" languages, called "esoteric" languages or "esolangs", ranging from rewordings (e.g. Shakespeare), through silly (e.g. Whitespace) to the downright absurd (Malbolge - a deliberately difficult language where it took 2 years to write the first useful program).
Esolangs aside, there are two broad categories of language, namely Functional and Procedural. The latter is probably more familiar - roughly, you do actions and things change. The program has some kind of state: values of variables, things on a screen etc. Your program makes a series of changes to that state. (Object Oriented Programming is a kind of Procedural programming where the state is contained in named, defined Things). Functional programming is all about functions which return values, and these values are passed to other functions etc. There's no series of step-wise changes, just a cascade of function calls. [Note: you can do functional programming in languages not designed for it, but lose some of the benefits]
Functional languages are used a lot in Computer Science and Mathematics. In a lot of scientific endeavours, there is some real state we want our program to be mimicking - we might be evaluating flows of a liquid, forces on an object, uses of words in text etc. Since the state, and its evolution, is what we're interested in, stateful programming makes sense. Conversely, if all we are interested in is finding some single state, such as finding the forces on a static structure, or a lot of statistical things, Functional styles can work. This is not exact, but is a good first description.
So, there are unpromising and suboptimal languages for a given task, and there are poor choices for a wide range of reasons, but any language you have a compiler/interpreter/runtime for can be useful. Mostly, the only poor choice of language to learn is the one you won't use, whether because you don't like it, nobody in your field likes it, or for some other reason. Don't go mad trying to learn all of them, but another language in your toolkit is rarely a bad thing.
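To make the distinction a little more concrete, here is a small sketch of the same calculation written in both styles. It uses Python, which is not designed as a functional language, so the second version only imitates the functional style; the function names and the data are made up for illustration.

```python
from functools import reduce

def sum_even_squares_procedural(values):
    """Procedural style: a running total (state) is changed step by step."""
    total = 0
    for value in values:
        if value % 2 == 0:
            total += value * value
    return total

def sum_even_squares_functional(values):
    """Functional style: values flow through a cascade of function calls."""
    evens = filter(lambda v: v % 2 == 0, values)
    squares = map(lambda v: v * v, evens)
    return reduce(lambda a, b: a + b, squares, 0)

data = [1, 2, 3, 4, 5, 6]
print(sum_even_squares_procedural(data))  # 4 + 16 + 36 = 56
print(sum_even_squares_functional(data))  # 56
```

In the first version the variable `total` is the state and the loop is the series of changes to it. In the second, no variable is ever updated; each function simply hands its result to the next.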
I know how to program but how do I learn language X?
You can often find language "cheat-sheets" to pick up syntax, and this site has nice step-by-step examples for a huge range of languages. But syntax is not programming, so you'll need to already know the base concepts. To make a bigger step, for example from an interpreted to a compiled language (e.g. Python to C), the best approach is simply to find a beginner's guide and go through it at speed, or find a tutorial for something relevant and work through that. A very large change, such as your first database or starting a 'functional' language (such as Lisp or Haskell), is much like starting from the beginning all over again.
Coding or programming vs software development or software engineering: all words used to describe writing code to perform a task, so what's the difference? In practice, the first two are mostly used to describe writing a piece of code as a tool for a specific task. The task may be complex, but the code is the product. The latter two encompass the entire lifecycle of a piece of software, from deciding what it should do, through designing the structure and algorithms it uses, testing its function, documenting its usage and offering it to people to use, and may also include supporting users and maintaining the software too. Our programming resources focus on learning how to write in specific languages; use specific methods, such as parallel programming; or use specific libraries or tools. In contrast, our software development resources teach skills and principles needed for writing software in any language, and include topics like structuring your programs; providing your programs to your users; and testing software for correctness and safety.
When faced with a new project, there is no substitute for experience in determining how long you will need. Many project management books discuss techniques based on Function Point analysis. In practice you will probably use these methods only as a guide for more intuitive estimates. The estimator sheet here (with worked example) or here (no example) is a good start.
High Performance Computing
What is HPC?
High Performance Computing is generally considered to be anything that uses "custom bought (or built) computers for doing 'hard work'". We consider "hard work" to be either anything that slows your desktop to a crawl for an hour or more; or anything that runs across multiple compute nodes, or uses custom hardware such as graphics cards. All of these tasks indicate you may want to look into using HPC facilities, either locally at Warwick, or further afield.
How do I get started with the HPC facilities?
If you're new to HPC in general, or to the facilities at Warwick, we have a short course that may help you. We try to run this at least once per academic year, or you can work through it online. Watch the calendar, or see the materials here.
I want or need to write parallel code. Where do I start?
Writing code to run on more than one processor (parallel code) isn't hard but it is different. For research purposes, there are three main disciplines you may need.
Firstly, a lot of codes can benefit from basic threading, where all you as the programmer need to do is to get access to multiple processors and set a few parameters for a library or tool. This is particularly the case in data processing sorts of work. For this, the best place to start is with the manual for your tool.
Secondly, there is threading that you yourself write - usually either directly or using a library such as OpenMP (Open MultiProcessing). "Threading" here roughly translates to using multiple work-streams on multiple processors, but all of them having access to the same memory. We suggest looking here, and turning to our forum if you want some specific guidance.
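The shared-memory idea can be sketched in a few lines. The example below is not OpenMP: it uses Python's standard `concurrent.futures` thread pool as a stand-in, and the function names and chunking scheme are illustrative assumptions. It shows several work-streams writing into different parts of the same list. Note that in the standard CPython interpreter, threads running pure-Python arithmetic like this generally do not execute simultaneously, so this demonstrates the programming model rather than a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

def square_chunk(shared_input, shared_output, start, stop):
    """Each thread handles one index range of the SAME two lists."""
    for i in range(start, stop):
        shared_output[i] = shared_input[i] ** 2

def threaded_squares(values, n_workers=4):
    """Square every element, splitting the index range between threads."""
    results = [None] * len(values)  # visible to every thread
    # Ceiling division, with a floor of 1 so an empty input is handled
    chunk = max(1, (len(values) + n_workers - 1) // n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(square_chunk, values, results,
                        start, min(start + chunk, len(values)))
            for start in range(0, len(values), chunk)
        ]
        for future in futures:
            future.result()  # re-raises any exception from a worker
    return results

print(threaded_squares(list(range(10))))
```

Because every thread can see `results`, nothing has to be sent between them. The price is that the programmer must make sure no two threads write to the same element, which here is guaranteed by giving each thread a separate index range.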
Thirdly, there is parallel code using explicit sharing of data, usually via the MPI (Message Passing Interface) library. Here you run multiple copies of your code, which each work on their own sub-set of the problem, and explicitly talk to each other to share information. We currently run Introductory and Intermediate MPI courses in the Autumn and Spring terms respectively - watch our calendar or access the materials for self-study on our training pages.