PhD Study: Contextual Probability-Based Approach to Reinforcement Learning

Summary

Reinforcement learning is a machine learning paradigm based on the Markov decision process, in which an agent tries to maximise the accumulated reward it receives while interacting with a dynamic and uncertain environment. The agent is not told how to behave; instead it must learn, by trial and error through these interactions, the policy (rule of actions) that yields the most reward. Reinforcement learning has various applications, including games, robotics, computer vision, algorithmic trading and portfolio management. The last few years have seen a renaissance of reinforcement learning, and its combination with deep neural networks is behind many breakthrough technologies. One striking example is AlphaGo, the first computer Go program to defeat human Go masters. Nevertheless, there is still room for improvement when applying this paradigm to problems where, for example, the environment behaves like Brownian motion and the reward, a key component of reinforcement learning, is not easy to define. One example of such an environment is automated (financial) trading, which in 2014 generated orders for more than 75 percent of the stock shares traded on United States exchanges (Wikipedia).
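As an illustration of learning a policy by trial and error, the following is a minimal sketch of tabular Q-learning on a toy five-state chain. The environment, the function name q_learning_chain and all parameter values are assumptions chosen for illustration; they are not part of the project description and this is not the method the project proposes.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain (hypothetical environment).

    The agent starts in state 0 and receives reward 1 only on reaching
    the last state. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy: explore occasionally, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Move the estimate towards reward plus discounted future value.
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
    return q

q = q_learning_chain()
# Greedy policy for every non-terminal state.
policy = ["left" if left > right else "right" for left, right in q[:-1]]
print(policy)
```

The agent is never told that moving right is correct; the preference emerges only from the rewards it observes.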

An automated trading system (ATS) is a computer program that automatically creates orders based on predefined rules and submits them to an exchange. The rules determine when to enter a position, when to exit a position and how much to trade. ATSs can be designed to trade stocks, options, futures and foreign exchange products, and they can execute repetitive tasks at speeds orders of magnitude greater than any human.
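To make the idea of predefined entry, exit and sizing rules concrete, here is a minimal sketch of a rule-based order generator. The moving-average crossover rule, the Order class, the function names and the price series are all hypothetical choices for illustration; a real ATS would also handle order submission, costs and risk limits.

```python
from dataclasses import dataclass

@dataclass
class Order:
    step: int      # index of the price that triggered the order
    side: str      # "buy" or "sell"
    quantity: int  # fixed size: the "how much" rule

def moving_average(prices, window, end):
    """Mean of the `window` prices ending at index `end` (inclusive)."""
    return sum(prices[end - window + 1 : end + 1]) / window

def generate_orders(prices, short=2, long=4, quantity=10):
    """Enter when the short average rises above the long one; exit on the reverse."""
    orders = []
    in_position = False
    for t in range(long - 1, len(prices)):
        fast = moving_average(prices, short, t)
        slow = moving_average(prices, long, t)
        if not in_position and fast > slow:
            orders.append(Order(t, "buy", quantity))   # entry rule
            in_position = True
        elif in_position and fast < slow:
            orders.append(Order(t, "sell", quantity))  # exit rule
            in_position = False
    return orders

prices = [10, 10, 10, 10, 11, 12, 13, 12, 10, 9]  # made-up price series
orders = generate_orders(prices)
for order in orders:
    print(order)
```

On this made-up series the rules produce one buy at step 4 and one sell at step 8.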

This project will study policy optimisation and reward formulation in Brownian motion environments from the contextual probability perspective. Contextual probability (Wang and Murtagh, 2008) is a secondary probability defined systematically in terms of a given primary probability; the two have a simple linear relationship. Under the principle of indifference, contextual probability can be estimated from a data sample through neighbourhood counting, which is a kernel function. The interaction between the primary and secondary probabilities can continue until equilibrium is reached. The project will formulate policy optimisation as the interaction between primary and secondary probabilities under various constraints in order to optimise the accumulated reward. The project will evaluate the findings in a prototype automated trading system.
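The idea behind neighbourhood counting can be sketched as follows, under the simplifying assumption that attributes are integers on a bounded range and that a neighbourhood is an axis-aligned box of integer intervals. The similarity of two points is then the number of such boxes that contain both. The function names and the example data are hypothetical, and the exact definitions in the cited paper may differ.

```python
from itertools import product

def interval_count(x, y, lo, hi):
    """Number of integer intervals [a, b] within [lo, hi] containing both x and y."""
    return (min(x, y) - lo + 1) * (hi - max(x, y) + 1)

def neighbourhood_count(p, q, bounds):
    """Number of axis-aligned integer boxes that cover both points p and q."""
    total = 1
    for x, y, (lo, hi) in zip(p, q, bounds):
        total *= interval_count(x, y, lo, hi)
    return total

def brute_force_count(p, q, bounds):
    """Enumerate every box explicitly; only feasible for tiny domains."""
    per_dim = [
        [(a, b) for a in range(lo, hi + 1) for b in range(a, hi + 1)]
        for lo, hi in bounds
    ]
    count = 0
    for box in product(*per_dim):
        if all(a <= min(x, y) and max(x, y) <= b
               for (a, b), x, y in zip(box, p, q)):
            count += 1
    return count

bounds = [(0, 4), (0, 4)]  # made-up attribute ranges
query = (2, 2)
near, far = (2, 3), (0, 4)
print(neighbourhood_count(query, near, bounds))
print(neighbourhood_count(query, far, bounds))
```

Points that are close together share more neighbourhoods than points that are far apart, which is what lets the count act as a similarity measure. The brute-force version is included only to check the closed-form count on small domains.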

Hui Wang, Fionn Murtagh (2008) A Study of the Neighborhood Counting Similarity, IEEE Transactions on Knowledge and Data Engineering, 449-461.

Essential criteria

Applicants should hold, or expect to obtain, a First or Upper Second Class Honours Degree in a subject relevant to the proposed area of study.

We may also consider applications from those who hold equivalent qualifications, for example, a Lower Second Class Honours Degree plus a Master’s Degree with Distinction.

In exceptional circumstances, the University may consider a portfolio of evidence from applicants who have appropriate professional experience which is equivalent to the learning outcomes of an Honours degree in lieu of academic qualifications.

Desirable Criteria

If the University receives a large number of applicants for the project, the following desirable criteria may be applied to shortlist applicants for interview.

  • First Class Honours (1st) Degree
  • Master’s Degree at 70%

Funding and eligibility

The University offers the following levels of support:

Vice Chancellor’s Research Studentship (VCRS)

The following scholarship options are available to applicants worldwide:

  • Full Award: (full-time tuition fees + £19,000 (tbc))
  • Part Award: (full-time tuition fees + £9,500)
  • Fees Only Award: (full-time tuition fees)

These scholarships will cover full-time PhD tuition fees for three years (subject to satisfactory academic performance) and will provide a £900 per annum research training support grant (RTSG) to help support the PhD researcher.

Applicants who already hold a doctoral degree or who have been registered on a programme of research leading to the award of a doctoral degree on a full-time basis for more than one year (or part-time equivalent) are NOT eligible to apply for an award.

Please note: you will automatically be entered into the competition for the Full Award, unless you state otherwise in your application.

Department for the Economy (DFE)

The scholarship will cover tuition fees at the Home rate and a maintenance allowance of £19,000 (tbc) per annum for three years (subject to satisfactory academic performance).

This scholarship also comes with £900 per annum for three years as a research training support grant (RTSG) allocation to help support the PhD researcher.

  • Candidates with pre-settled or settled status under the EU Settlement Scheme who also satisfy a three-year residency requirement in the UK prior to the start of the course for which a Studentship is held MAY receive a Studentship covering fees and maintenance.
  • Republic of Ireland (ROI) nationals who satisfy three years’ residency in the UK prior to the start of the course MAY receive a Studentship covering fees and maintenance (ROI nationals do not need to have pre-settled or settled status under the EU Settlement Scheme to qualify).
  • Other (non-ROI) EU applicants are classed as ‘International’ and are not eligible for this source of funding.
  • Applicants who already hold a doctoral degree or who have been registered on a programme of research leading to the award of a doctoral degree on a full-time basis for more than one year (or part-time equivalent) are NOT eligible to apply for an award.

Due consideration should be given to financing your studies. Further information on the cost of living is available.

The Doctoral College at Ulster University

Key dates

Submission deadline
Monday 19 February 2018
12:00AM

Interview Date
9 to 23 March 2018

Preferred student start date
Mid September 2018

Applying

Apply Online  