Modelling language evolution with automated speech analysis and machine learning
Funded PhD Project (UK and EU students only)
School of Mathematics and Physics
4 May 2021
Candidates applying for this project may be eligible to compete for one of a small number of bursaries available; these cover tuition fees at the UK rate for three years and a stipend in line with the UKRI rate (£15,609 for 2021/22). Bursary recipients will also receive £1,500 p.a. for project costs/consumables.
The work on this project could involve:
- Application of machine learning techniques, such as boosted trees and neural networks, to deconstruct speech
- Stochastic (Markovian and non-Markovian) modelling of language acquisition and evolution
- Automated collection of speech data via Web Apps powered by Flask
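To give a flavour of the second item above, here is a minimal sketch of a Markovian model of language change. It is an illustration only, not the project's actual method: it assumes a simple copying model in which, at each step, one randomly chosen speaker adopts the variant used by another. The function name `simulate_variant_spread`, its parameters and the variant labels "A" and "B" are all hypothetical.

```python
import random


def simulate_variant_spread(n_speakers=50, n_steps=2000, initial_share=0.5, seed=0):
    """Simulate two competing speech variants, "A" and "B", in a population.

    Assumed toy model: at each step one speaker (the learner) copies the
    variant of another randomly chosen speaker (the model). The next state
    depends only on the current state, so the process is a Markov chain.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    n_a = round(n_speakers * initial_share)
    speakers = ["A"] * n_a + ["B"] * (n_speakers - n_a)

    # Record the share of variant A after every copying event.
    history = [n_a / n_speakers]
    for _ in range(n_steps):
        learner, model = rng.sample(range(n_speakers), 2)  # two distinct speakers
        speakers[learner] = speakers[model]
        history.append(speakers.count("A") / n_speakers)
    return history


history = simulate_variant_spread()
print(f"Share of variant A: start {history[0]:.2f}, end {history[-1]:.2f}")
```

In a sketch like this, the share of a variant drifts at random until one variant may take over the whole population. Models used in research typically add social network structure, biases in who is copied, and memory effects (which make the process non-Markovian).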
Languages are complex structures built from simple units of sound. They exhibit patterns on many time scales, from rules for arranging sounds within words to principles of sentence construction and timing. Each person uses a different set of sounds and patterns, and because these are learned by copying, our speech can reveal a lot about us: where we grew up, our education, ethnicity, how we wish to be seen, as well as physical and mental attributes.
In the past, we learned how languages work by painstaking data collection, analysing voices, writing, and the workings of the vocal tract. This is now changing: speech recording and recognition based on modern machine learning (typically Hidden Markov Models) is ubiquitous. However, these powerful methods are only beginning to be applied to understand how the components of speech differ between people, how these evolve over time, and what people’s voices reveal about them. Automatic speech deconstruction opens up the possibility of understanding human language in unprecedented detail and at large scale. It will also help reveal what “black-box” algorithms can learn about us.
In this project you will develop machine learning methods to deconstruct speech into units, and to analyse these units and the sentences they form. You will learn how people use sound to communicate, and how to analyse audio signals and train machines to recognise, deconstruct and measure them. You will build mathematical models of language change, using models of social behaviour and networks, to yield predictions about the future. The project brings together two important research themes at Portsmouth: “Future and emerging technologies” and “Democratic Citizenship”.
The cross-disciplinary team have a track record of novel work in mathematical language models, linguistic theory and large-scale data collection. The PhD will equip you with valuable skills in data science, modelling, machine learning, automatic speech processing and linguistics.
You'll need a good first degree from an internationally recognised university (minimum upper second class or equivalent, depending on your chosen course) or a Master’s degree in an appropriate subject. In exceptional cases, we may consider equivalent professional experience and/or qualifications. English language proficiency is required at a minimum of IELTS band 6.5 with no component score below 6.0.
You should have an interest in, and aptitude for, programming and statistical and probabilistic modelling.
How to apply
When you are ready to apply, you can use our online application form. Make sure you submit a personal statement, proof of your degrees and grades, details of two referees, proof of your English language proficiency and an up-to-date CV. Our ‘How to Apply’ page offers further guidance on the PhD application process.
If you want to be considered for this funded PhD opportunity, you must quote project code SMAP5960521 when applying.
Our community of independent researchers can help you through every aspect of your research degree.