Extracting Innovation Quality Signals from Patent Text Using Transformer-Based Language Models: Causal Evidence on Market Valuation in Peripheral Economies

This project is funded by:

    • Department for the Economy (DfE)

Summary

Financial markets must assess how valuable a company's innovations are, but this is difficult. Patents contain rich information about innovation quality, but extracting meaningful signals from complex technical documents remains challenging.

Recent advances in artificial intelligence—particularly transformer-based language models like BERT—enable us to analyse patent text in new ways.

This PhD will use state-of-the-art natural language processing (NLP) methods to extract quality signals from European patent documents and examine how financial markets incorporate this information into company valuations.

You'll apply cutting-edge machine learning techniques (transformer models, causal forests, double machine learning) to understand which aspects of patent language predict valuable innovation and whether markets efficiently process these signals.

A key focus is regional differences: do innovation-valuation relationships differ in peripheral economies like Northern Ireland compared to major innovation hubs?

This question has direct policy relevance for regional innovation support programmes.

What you'll do: Fine-tune PatentBERT on European patent data; extract interpretable linguistic features; build predictive models; apply causal machine learning methods to test market efficiency; analyse regional heterogeneity using advanced econometric techniques.

What you'll gain: Expertise in transformer-based NLP, causal machine learning, financial econometrics, and large-scale data integration (PATSTAT patent database + Bloomberg financial data). Skills applicable to careers in data science, financial technology, policy analysis, or academia.

Ideal candidate: Background in computer science, data science, finance, economics, or related quantitative fields. Strong programming skills (Python/R). Interest in interdisciplinary research combining technical methods with real-world economic questions.

Essential:

  • Strong quantitative background (computer science, data science, statistics, economics, finance, or related field)
  • Programming proficiency in Python and/or R
  • Experience with statistical analysis and data manipulation
  • Ability to work with large datasets

Highly Desirable:

  • Experience with machine learning frameworks (PyTorch, TensorFlow, scikit-learn)
  • Natural language processing knowledge (transformer models, BERT)
  • Financial data analysis or econometrics background
  • SQL or database management experience

Training will be provided in:

  • Transformer-based NLP and PatentBERT fine-tuning
  • Causal machine learning (causal forests, double ML)
  • Financial econometrics and instrumental variables
  • PATSTAT and Bloomberg data access and processing
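
As a flavour of the causal machine learning training listed above, the following is a minimal sketch of double machine learning for a partially linear model, using cross-fitting and partialling-out. It is illustrative only and is not the project's actual pipeline. The data are synthetic, the names `double_ml_plr` and `_fit_predict` are hypothetical, and a simple ridge regression on quadratic features stands in for whatever learner (for example, a random forest) the project might use.

```python
import numpy as np


def _fit_predict(X_train, y_train, X_test, ridge=1e-3):
    """Ridge regression on [1, x, x^2] features (stand-in for any ML learner)."""
    def feats(X):
        return np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])

    A = feats(X_train)
    coef = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y_train)
    return feats(X_test) @ coef


def double_ml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted estimate of theta in y = theta * d + g(X) + noise."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Residualise outcome and treatment on controls, out of fold.
        y_res[test_idx] = y[test_idx] - _fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - _fit_predict(X[train], d[train], X[test_idx])
    # Regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Standard error from the orthogonal score.
    psi = (y_res - theta * d_res) * d_res
    se = np.sqrt(np.mean(psi ** 2) / n) / np.mean(d_res ** 2)
    return theta, se


# Synthetic example (assumed data-generating process, true theta = 0.5):
# d plays the role of a text-derived quality score, y a valuation outcome,
# and X a set of firm-level controls.
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 3))
d = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)
y = 0.5 * d + X[:, 0] - X[:, 2] ** 2 + rng.normal(size=n)

theta_hat, se_hat = double_ml_plr(y, d, X)
print(f"theta_hat = {theta_hat:.3f} (se = {se_hat:.3f})")
```

The cross-fitting step matters because the nuisance models are always evaluated on observations they were not trained on, which limits the overfitting bias that would otherwise leak into the estimate of the effect.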

Essential criteria

Applicants should hold, or expect to obtain, a First or Upper Second Class Honours Degree in a subject relevant to the proposed area of study.

We may also consider applications from those who hold equivalent qualifications, for example, a Lower Second Class Honours Degree plus a Master’s Degree with Distinction.

In exceptional circumstances, the University may consider a portfolio of evidence from applicants who have appropriate professional experience which is equivalent to the learning outcomes of an Honours degree in lieu of academic qualifications.

  • A comprehensive and articulate personal statement
  • Research proposal of 2000 words detailing aims, objectives, milestones and methodology of the project

Desirable Criteria

If the University receives a large number of applicants for the project, the following desirable criteria may be applied to shortlist applicants for interview.

  • First Class Honours (1st) Degree
  • Master’s Degree at 70%

Equal Opportunities

The University is an equal opportunities employer and welcomes applicants from all sections of the community, particularly from those with disabilities.

Appointment will be made on merit.

Funding and eligibility

Our fully funded PhD scholarships cover tuition fees and provide a maintenance allowance of approximately £21,000 per annum for three years* (subject to satisfactory academic performance). A Research Training Support Grant (RTSG) of £900 per annum is also available.

These scholarships, funded via the Department for the Economy (DfE), are open to applicants worldwide, regardless of residency or domicile.

Applicants who already hold a doctoral degree or who have been registered on a programme of research leading to the award of a doctoral degree on a full-time basis for more than one year (or part-time equivalent) are NOT eligible to apply for an award.

*Part-time PhD scholarships may be available to home candidates, based on 0.5 of the full-time rate, and will require a six-year registration period.

Due consideration should be given to financing your studies.

Recommended reading

1. Farre-Mensa, J., Hegde, D., & Ljungqvist, A. (2020). The bright side of patents. *Journal of Financial Economics*, 137(1), 43–79.

2. Wager, S., & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. *Journal of the American Statistical Association*, 113(523), 1228–1242.

3. Hsu, P.-H., Lee, D., Tambe, P., & Hsu, D. H. (2022). Deep learning, text, and patent valuation. *Journal of Financial Economics*, 143(3), 1043–1069.

4. Lee, J.-S., & Hsiang, J. (2020). Patent classification by fine-tuning BERT language model. *World Patent Information*, 61, 101965.

5. Rodríguez-Pose, A., & Wilkie, C. (2019). Innovating in less developed regions: What drives patenting in the lagging regions of Europe and North America. *Growth and Change*, 50(1), 4–37.

6. Arora, A., Belenzon, S., & Sheer, L. (2021). Matching patents to Compustat firms, 1980–2015: Dynamic reassignment, name changes, and ownership structures. *Research Policy*, 50(5), 104217.

The Doctoral College at Ulster University

Key dates

Submission deadline
Friday 27 February 2026
4:00 PM

Interview Date
TBC

Preferred student start date
14 September 2026

Applying

Apply Online  

Contact supervisor

Professor Barry Quinn

Other supervisors