As a student of Earth Sciences, you will at some point have to analyse data or evaluate mathematical formulas. Eventually you will come across tasks you would like to automate. For example, you may need to perform operations or statistical calculations on a cell-based dataset that represents either a map or a data series. Alternatively, you may want to simulate physical relationships in a model. In these cases, programming is a very useful way to perform simple operations or build small models.
Programming is the process of creating instructions that tell your computer how to perform a task. The exact sequence of instructions needed to perform the task is called an algorithm. Code consists of commands that tell the computer what to do (i.e. how to carry out the algorithm). As in other languages, commands in a programming language have a specific syntax, i.e. a set of fundamental rules that the commands must comply with for the computer to interpret and execute them correctly.
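To make these terms concrete, here is a minimal sketch in Python. The temperature values and the function name `mean_temperature` are invented for illustration. The algorithm is "add up all values, then divide the sum by the number of values"; the code is that algorithm written in Python's syntax.

```python
# Hypothetical example data: four air temperatures in degrees Celsius.
temperatures = [12.0, 14.0, 13.0, 17.0]


def mean_temperature(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0.0
    for value in values:          # step 1: add up all values
        total = total + value
    return total / len(values)    # step 2: divide the sum by the count


print(mean_temperature(temperatures))  # prints 14.0
```

Syntax matters here: leaving out a colon or changing the indentation would stop Python from interpreting these commands, even though the algorithm itself would be unchanged.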
Every Earth scientist has to work with data at some point, because data analysis is a typical step in the research process.
So why not just use Excel? This common Microsoft application is very useful, but it will eventually slow you down when you work with larger datasets. Doing anything slightly outside its scope becomes very difficult, and you will need additional plug-ins to continue editing your data. A programming language offers a broad range of simple operations that can be performed quickly and logically on data structures. Moreover, writing a script makes your analysis faster and reproducible, and thereby more reliable. You can run a given analysis multiple times, e.g. for different datasets, simply by pressing ‘Enter’ to rerun your code. In contrast, if you use Excel, your analysis is embedded in the dataset you originally used, and you will need to repeat the complete analysis for every new dataset. This also makes it more difficult to document your steps clearly. Additionally, programming languages such as Python and particularly R have much more advanced statistical and data visualisation capabilities than Excel.
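The reuse of one analysis across several datasets can be sketched in Python as follows. This is only an illustration: the function name `summarise`, the site names and the values are all hypothetical, and in practice the data would usually be read from files rather than typed into the script.

```python
import statistics


def summarise(values):
    """Return the mean, sample standard deviation and range of a data series."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "range": max(values) - min(values),
    }


# Hypothetical datasets, one list of measurements per site.
datasets = {
    "site_A": [2.0, 4.0, 6.0],
    "site_B": [10.0, 10.0, 13.0],
}

# The same analysis is applied to every dataset without being rewritten.
for name, values in datasets.items():
    print(name, summarise(values))
```

Adding a third dataset only requires one more entry in `datasets`; the analysis itself stays in one place, which also serves as a record of the steps taken.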
There are many resources available online for different programming languages. In the Earth Sciences, R, Python and MATLAB are used most often. Here are a few good beginner and intermediate resources:
- R for Data Science – Hadley Wickham & Garrett Grolemund
- R Graphics Cookbook – Winston Chang – Specifically for creating high-quality graphs in R
- De Programmeursleerling – Leren coderen met Python 3 – Dutch beginner’s guide to Python
- Introduction to Python in Earth Science Data Analysis – Maurizio Petrelli – Log in using Solis-ID
- Python Recipes for Earth Sciences – Martin Trauth – Log in using Solis-ID
- MATLAB Recipes for Earth Sciences – Martin Trauth – Log in using Solis-ID