About
Researching a result is not nearly as valuable without appropriately communicating it…
Activity
-
I enjoy finding statues of Nikola Tesla, and luckily, there are many around the world. One was at the Croatian Machine Learning Workshop (CMLW)…
Liked by Petar Veličković
-
I've just been promoted to Senior Staff Research Scientist at Google DeepMind! 🚀 This energises me to push even stronger on exciting foundational…
Posted by Petar Veličković
-
I love a good leaderboard... or several! ⏫ TGR 🐅 is our graph rewiring method for temporal graphs leveraging expander graph propagation…
Shared by Petar Veličković
Experience
Education
-
University of Cambridge
-
Activities and Societies: Computer Science Tripos Supervisor (Computer Laboratory, Colleges: Trinity, Jesus, Wolfson, Girton, Churchill, Sidney Sussex, Homerton, Magdalene, Hughes Hall, Newnham, Queens')
Thesis title: “The resurgence of structure in deep neural networks” (Accepted without corrections; thesis committee: Prof John Daugman, Prof Jure Leskovec)
Worked within the Artificial Intelligence Group of the Department of Computer Science and Technology.
Supervised by Prof Pietro Liò.
-
Activities and Societies: Computer Laboratory Open Days Demonstrator
Final year dissertation: "Molecular multiplex network inference", supervised by Dr Pietro Liò.
-
-
Final year project: "Development of a software emulator for the GameBoy console"
Publications
-
Pointer Graph Networks
Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS 2020)
Graph neural networks (GNNs) are typically applied to static graphs that are assumed to be known upfront. This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving. In the absence of reliable domain expertise, one might resort to inferring the latent graph structure, which is often difficult due to the vast search space of possible graphs. Here we introduce Pointer Graph Networks (PGNs) which augment sets or graphs with additional inferred edges for improved model expressivity. PGNs allow each node to dynamically point to another node, followed by message passing over these pointers. The sparsity of this adaptable graph structure makes learning tractable while still being sufficiently expressive to simulate complex algorithms. Critically, the pointing mechanism is directly supervised to model long-term sequences of operations on classical data structures, incorporating useful structural inductive biases from theoretical computer science. Qualitatively, we demonstrate that PGNs can learn parallelisable variants of pointer-based data structures, namely disjoint set unions and link/cut trees. PGNs generalise out-of-distribution to 5x larger test inputs on dynamic graph connectivity tasks, outperforming unrestricted GNNs and Deep Sets.
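A minimal NumPy sketch of the pointer mechanism described in the abstract may help: each node scores all others, points at its argmax, and the resulting sparse pointer edges augment the graph before one round of message passing. All weights here are random stand-ins, and the actual PGN supervises the pointers against ground-truth data-structure operations rather than choosing them unsupervised:

```python
import numpy as np

def pgn_step(h, adj, W_q, W_k, W_msg):
    scores = (h @ W_q) @ (h @ W_k).T          # pointer logits over all pairs
    np.fill_diagonal(scores, -np.inf)         # disallow self-pointers
    ptr = scores.argmax(axis=1)               # node i points at ptr[i]
    P = np.zeros_like(adj)
    P[np.arange(len(h)), ptr] = 1.0           # one inferred edge per node
    G = np.clip(adj + P + P.T, 0.0, 1.0)      # symmetrised augmented graph
    deg = G.sum(axis=1, keepdims=True)
    return np.maximum(0, (G / deg) @ h @ W_msg)  # message passing over G

rng = np.random.default_rng(0)
N, d = 6, 8
h = rng.normal(size=(N, d))
out = pgn_step(h, np.eye(N),                  # start from a set (self-loops only)
               *(rng.normal(size=(d, d)) for _ in range(3)))
```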
-
Neural Execution of Graph Algorithms
The 8th International Conference on Learning Representations (ICLR 2020)
Graph Neural Networks (GNNs) are a powerful representational tool for solving problems on graph-structured inputs. In almost all cases so far, however, they have been applied to directly recovering a final solution from raw inputs, without explicit guidance on how to structure their problem-solving. Here, instead, we focus on learning in the space of algorithms: we train several state-of-the-art GNN architectures to imitate individual steps of classical graph algorithms, parallel (breadth-first search, Bellman-Ford) as well as sequential (Prim's algorithm). As graph algorithms usually rely on making discrete decisions within neighbourhoods, we hypothesise that maximisation-based message passing neural networks are best-suited for such objectives, and validate this claim empirically. We also demonstrate how learning in the space of algorithms can yield new opportunities for positive transfer between tasks---showing how learning a shortest-path algorithm can be substantially improved when simultaneously learning a reachability algorithm.
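To make the alignment argument concrete, here is a hedged NumPy sketch (not the paper's code) contrasting one Bellman-Ford relaxation step with one step of a max-aggregation message-passing layer; weight matrices are illustrative stand-ins for learned parameters, and self-loops are assumed so every node receives at least one message:

```python
import numpy as np

def bellman_ford_step(dist, w):
    """One relaxation step: the 'ground truth' the GNN is trained to imitate.
    dist: (N,) current distance estimates;  w: (N, N) weights, np.inf = no edge."""
    return np.minimum(dist, (dist[:, None] + w).min(axis=0))

def max_mpnn_step(h, adj, W_msg, W_upd):
    """Each node aggregates neighbour messages with an elementwise max,
    mirroring the discrete min/max choice Bellman-Ford makes per neighbourhood."""
    msgs = h @ W_msg                                       # (N, d) messages
    m = np.where(adj[:, :, None] > 0,                      # mask non-neighbours
                 msgs[None, :, :], -np.inf).max(axis=1)    # max aggregation
    return np.maximum(0, np.concatenate([h, m], axis=-1) @ W_upd)

# tiny demo of the target computation on a 3-node path graph
w = np.array([[0, 1, np.inf],
              [np.inf, 0, 2],
              [np.inf, np.inf, 0]], dtype=float)
dist = np.array([0, np.inf, np.inf])
print(bellman_ford_step(bellman_ford_step(dist, w), w))    # [0. 1. 3.]
```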
-
Deep Graph Infomax
The 7th International Conference on Learning Representations (ICLR 2019)
We present Deep Graph Infomax (DGI), a general approach for learning node representations within graph-structured data in an unsupervised manner. DGI relies on maximizing mutual information between patch representations and corresponding high-level summaries of graphs---both derived using established graph convolutional network architectures. The learnt patch representations summarize subgraphs centered around nodes of interest, and can thus be reused for downstream node-wise learning tasks. In contrast to most prior approaches to unsupervised learning with GCNs, DGI does not rely on random walk objectives, and is readily applicable to both transductive and inductive learning setups. We demonstrate competitive performance on a variety of node classification benchmarks, which at times even exceeds the performance of supervised learning.
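As a rough illustration of the objective (a toy sketch under simplifying assumptions, not the released implementation): a one-layer GCN produces patch representations, a mean readout gives the graph summary, and a bilinear discriminator separates real patches from ones computed on corrupted (row-shuffled) features:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dgi_objective(X, A_hat, W_enc, W_disc, rng):
    """X: (N, F) features; A_hat: (N, N) normalised adjacency with self-loops;
    W_enc: (F, d) encoder weights; W_disc: (d, d) bilinear discriminator."""
    H = np.maximum(0, A_hat @ X @ W_enc)           # patch (node) representations
    s = sigmoid(H.mean(axis=0))                    # readout -> graph summary
    H_neg = np.maximum(0, A_hat @ X[rng.permutation(len(X))] @ W_enc)  # corrupted
    pos = sigmoid(H @ W_disc @ s)                  # discriminator on real patches
    neg = sigmoid(H_neg @ W_disc @ s)              # ... and on corrupted ones
    eps = 1e-9                                     # objective to be maximised
    return np.mean(np.log(pos + eps)) + np.mean(np.log(1.0 - neg + eps))
```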
-
The resurgence of structure in deep neural networks
University of Cambridge
Machine learning with deep neural networks ("deep learning") allows for learning complex features directly from raw input data, completely eliminating hand-crafted, "hard-coded" feature extraction from the learning pipeline. This has led to state-of-the-art performance being achieved across several---previously disconnected---problem domains, including computer vision, natural language processing, reinforcement learning and generative modelling. These success stories nearly universally go hand-in-hand with the availability of immense quantities of labelled training examples ("big data") exhibiting simple grid-like structure (e.g. text or images), exploitable through convolutional or recurrent layers. This is due to the extremely large number of degrees-of-freedom in neural networks, leaving their generalisation ability vulnerable to effects such as overfitting. However, there remain many domains where extensive data gathering is not always appropriate, affordable, or even feasible. Furthermore, data is generally organised in more complicated kinds of structure---which most existing approaches would simply discard. Examples of such tasks are abundant in the biomedical space, with e.g. small numbers of subjects available for any given clinical study, or relationships between proteins specified via interaction networks. I hypothesise that, if deep learning is to reach its full potential in such environments, we need to reconsider "hard-coded" approaches---integrating assumptions about inherent structure in the input data directly into our architectures and learning algorithms, through structural inductive biases. In this dissertation, I directly validate this hypothesis by developing three structure-infused neural network architectures (operating on sparse multimodal and graph-structured data), and a structure-informed learning algorithm for graph neural networks, demonstrating significant outperformance of conventional baseline models and algorithms.
-
Cross-modal recurrent models for weight objective prediction from multimodal time-series data
The 12th EAI International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth 2018)
We analyse multimodal time-series data corresponding to weight, sleep and steps measurements. We focus on predicting whether a user will successfully achieve his/her weight objective. For this, we design several deep long short-term memory (LSTM) architectures, including a novel cross-modal LSTM (X-LSTM), and demonstrate their superiority over baseline approaches. The X-LSTM improves parameter efficiency by processing each modality separately and allowing for information flow between them by way of recurrent cross-connections. We present a general hyperparameter optimisation technique for X-LSTMs, which allows us to significantly improve on the LSTM and a prior state-of-the-art cross-modal approach, using a comparable number of parameters. Finally, we visualise the model’s predictions, revealing implications about latent variables in this task.
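A simplified sketch of the cross-connection wiring, with a plain tanh RNN cell standing in for the LSTM cell (an illustrative simplification; all weights are random stand-ins for learned parameters):

```python
import numpy as np

def xrnn_step(h, x, W_in, W_rec, W_cross):
    """One time step for two modalities: each hidden state is updated from its
    own input and recurrence, plus a recurrent cross-connection from the other
    modality's previous hidden state."""
    new_h = []
    for m in range(2):
        other = 1 - m
        pre = x[m] @ W_in[m] + h[m] @ W_rec[m] + h[other] @ W_cross[m]
        new_h.append(np.tanh(pre))
    return new_h

rng = np.random.default_rng(0)
d_in, d_h = 3, 5
W_in    = [rng.normal(size=(d_in, d_h)) for _ in range(2)]
W_rec   = [rng.normal(size=(d_h, d_h)) for _ in range(2)]
W_cross = [rng.normal(size=(d_h, d_h)) for _ in range(2)]
h = [np.zeros(d_h), np.zeros(d_h)]
for t in range(4):   # e.g. one stream per modality (weight, sleep, ...)
    h = xrnn_step(h, [rng.normal(size=d_in), rng.normal(size=d_in)],
                  W_in, W_rec, W_cross)
```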
-
Graph Attention Networks
The 6th International Conference on Learning Representations (ICLR 2018)
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of computationally intensive matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).
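For readers who want the mechanism concretely, here is a minimal single-head NumPy sketch of the layer described above; dimensions and names are illustrative, and the published model additionally uses multi-head attention, dropout and output nonlinearities:

```python
import numpy as np

def gat_layer(h, adj, W, a, alpha=0.2):
    """h: (N, F) features; adj: (N, N) adjacency with self-loops;
    W: (F, F') shared transform; a: (2F',) attention vector."""
    z = h @ W                                      # transformed features
    N = z.shape[0]
    e = np.zeros((N, N))
    for i in range(N):                             # e_ij = LeakyReLU(a^T [z_i || z_j])
        for j in range(N):
            s = np.concatenate([z[i], z[j]]) @ a
            e[i, j] = s if s > 0 else alpha * s
    e = np.where(adj > 0, e, -np.inf)              # mask: attend only over neighbours
    e -= e.max(axis=1, keepdims=True)              # stable softmax
    att = np.where(adj > 0, np.exp(e), 0.0)
    att /= att.sum(axis=1, keepdims=True)
    return att @ z                                 # attention-weighted combination

rng = np.random.default_rng(0)
N, F_in, F_out = 5, 8, 4
adj = ((rng.random((N, N)) < 0.3) | np.eye(N, dtype=bool)).astype(float)
out = gat_layer(rng.normal(size=(N, F_in)), adj,
                rng.normal(size=(F_in, F_out)), rng.normal(size=(2 * F_out,)))
```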
-
X-CNN: Cross-modal convolutional neural networks for sparse datasets
The 7th IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2016)
In this paper we propose cross-modal convolutional neural networks (X-CNNs), a novel, biologically inspired type of CNN architecture, treating gradient descent-specialised CNNs as individual units of processing in a larger-scale network topology, while allowing for unconstrained information flow and/or weight sharing between analogous hidden layers of the network---thus generalising the already well-established concept of neural network ensembles (where information typically may flow only between the output layers of the individual networks). The constituent networks are individually designed to learn the output function on their own subset of the input data, after which cross-connections between them are introduced after each pooling operation to periodically allow for information exchange between them. This injection of knowledge into a model (by prior partition of the input data through domain knowledge or unsupervised methods) is expected to yield greatest returns in sparse data environments, which are typically less suitable for training CNNs. For evaluation purposes, we have compared a standard four-layer CNN as well as a sophisticated FitNet4 architecture against their cross-modal variants on the CIFAR-10 and CIFAR-100 datasets with differing percentages of the training data being removed, and find that at lower levels of data availability, the X-CNNs significantly outperform their baselines (typically providing a 2--6% benefit, depending on the dataset size and whether data augmentation is used), while still maintaining an edge on all of the full dataset tests.
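A structural NumPy sketch of the cross-connection idea (illustrative only, far simpler than the evaluated architectures): after each pooling stage, every stream concatenates a projected copy of the other stream's feature maps. The luma/chroma split below is one plausible modality partition, not necessarily the paper's exact setup:

```python
import numpy as np

def pool(f):
    # 2x mean-pooling along the spatial axis of a (channels, length) map
    return f.reshape(f.shape[0], -1, 2).mean(axis=2)

def cross_connect(fa, fb, P_ab, P_ba):
    # each stream concatenates a projected copy of the other's feature maps
    return (np.concatenate([fa, P_ba @ fb], axis=0),
            np.concatenate([fb, P_ab @ fa], axis=0))

rng = np.random.default_rng(0)
fa = rng.normal(size=(4, 8))    # e.g. luma-stream feature maps
fb = rng.normal(size=(2, 8))    # e.g. chroma-stream feature maps
fa, fb = cross_connect(pool(fa), pool(fb),
                       P_ab=rng.normal(size=(2, 4)),
                       P_ba=rng.normal(size=(4, 2)))
```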
Patents
-
Generating Prediction Outputs Using Dynamic Graphs
Filed 17/338,974
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating prediction outputs characterizing a set of entities. In one aspect, a method comprises: obtaining data defining a graph, comprising: (i) a set of nodes, wherein each node represents a respective entity from the set of entities, (ii) a current set of edges, wherein each edge connects a pair of nodes, and (iii) a respective current embedding of each node; at each of a plurality of time steps: updating the respective current embedding of each node, comprising processing data defining the graph using a graph neural network; and updating the current set of edges based at least in part on the updated embeddings of the nodes; and at one or more of the plurality of time steps: generating a prediction output characterizing the set of entities based on the current embeddings of the nodes.
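A toy sketch of the claimed loop, with k-nearest-neighbour rewiring in embedding space as an illustrative choice of edge update (not necessarily the filed mechanism):

```python
import numpy as np

def dynamic_graph_rollout(h, adj, W, k, steps, predict):
    """At each time step: (i) update node embeddings with a GNN pass,
    (ii) update the edge set from the new embeddings, (iii) read out a
    prediction over the set of entities."""
    preds = []
    for _ in range(steps):
        deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
        h = np.maximum(0, (adj / deg) @ h @ W)        # (i) GNN embedding update
        sims = h @ h.T
        np.fill_diagonal(sims, -np.inf)
        nbrs = np.argsort(-sims, axis=1)[:, :k]       # (ii) top-k neighbours
        adj = np.zeros_like(adj)
        np.put_along_axis(adj, nbrs, 1.0, axis=1)     # updated edge set
        preds.append(predict(h))                      # (iii) per-step prediction
    return preds

rng = np.random.default_rng(0)
N, d = 8, 4
preds = dynamic_graph_rollout(rng.normal(size=(N, d)), np.eye(N),
                              rng.normal(size=(d, d)), k=3, steps=5,
                              predict=lambda h: h.mean())
```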
-
User Behaviour Monitoring
Filed EU EP3483792A1
A method and apparatus is described comprising: receiving first user data of a first user from a plurality of data sources; inferring at least some missing data points in said first user data using a model to predict the missing data points, wherein the model is trained using selected data extracted from stored data relating to a plurality of users; generating modified user data from said first user data and at least some of said inferred missing data points; and converting said modified user data into a plurality of representations having defined length sequences for each data source.
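An illustrative sketch of the claimed pipeline for a single data source, with linear interpolation standing in for the trained imputation model (a simplifying assumption; the filing describes a model trained on data from many users):

```python
import numpy as np

def impute_and_resample(t, v, length):
    """t: (n,) increasing observation times; v: (n,) values, np.nan = missing.
    Infers missing points, then converts the series into a defined-length
    representation by resampling on a uniform grid."""
    ok = ~np.isnan(v)
    filled = np.interp(t, t[ok], v[ok])          # infer the missing points
    grid = np.linspace(t[0], t[-1], length)      # defined-length time grid
    return np.interp(grid, t, filled)

t = np.arange(10.0)                              # observation times, one source
v = np.sin(t)
v[[2, 5, 6]] = np.nan                            # some missing measurements
fixed = impute_and_resample(t, v, length=32)     # 32-step representation
```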
Honors & Awards
-
Outstanding Reviewer
Ninth International Conference on Learning Representations
Awarded to the top reviewers for ICLR 2021.
-
Best Short Paper Runner-Up
AAAI-21 International Workshop on Health Intelligence (W3PHIAI 2021)
Awarded for "Predicting Patient Outcomes with Graph Representation Learning", co-authored with Emma Rocheteau, Catherine Tong, Nicholas Lane and Pietro Liò
-
Top Reviewer
Thirty-seventh International Conference on Machine Learning
Awarded to the top 33% of reviewers for ICML 2020.
-
Best Reviewer Award
Thirty-third Conference on Neural Information Processing Systems
Awarded to the top 400 reviewers for NeurIPS 2019.
-
ICLR 2019 Travel Award
The 7th International Conference on Learning Representations
-
TeleSign Datathon Winner
TeleSign
-
Bronze Medal
Hack Cambridge 4D
Awarded the 3rd best project award (out of ~70 participating teams) at the 24h Hack Cambridge 4D hackathon (a Major League Hacking event), as a member of team N-hance (pls).
-
ICLR 2018 Travel Award
The 6th International Conference on Learning Representations
-
NIPS TSW 2017 Travel Award
NIPS 2017 Time Series Workshop
-
ACM-ICPC Northwestern Europe Regional Contest 2017 - Gold Medal (Coach)
Association for Computing Machinery
-
Computer Laboratory Wiseman Award
Department of Computer Science and Technology, University of Cambridge
Awarded for making a commendable contribution to the work of the Department in the 2016/17 academic year.
-
Finalist
Google Hash Code 2017
Ranked 5th in the Final Round (out of 50 participating teams)
Ranked 53rd in the Qualification Round (out of ~3000 participating teams)
-
Bronze Medal
Hack Cambridge Recurse
Awarded the 3rd best project award (out of ~100 participating teams) at the 24h Hack Cambridge Recurse hackathon (a Major League Hacking event), as a member of team facejack.
-
ACM-ICPC Northwestern Europe Regional Contest 2016 - Silver Medal
Association for Computing Machinery
-
Finalist
Hack Cambridge
Nominated as one of the top 7 projects (out of ~100 participating teams) at the 24h Hack Cambridge hackathon (a Major League Hacking event), as a member of team Viral.
-
ACM-ICPC Northwestern Europe Regional Contest 2014 - Bronze Medal
Association for Computing Machinery
-
Senior Scholar
Trinity College, University of Cambridge
Elected due to high performance in the examinations for Part IB of the Computer Science Tripos.
Re-elected for another year (in July 2015), due to high performance in the examinations for Part II of the Computer Science Tripos.
-
Junior Scholar
Trinity College, University of Cambridge
Elected due to high performance in the examinations for Part IA of the Computer Science Tripos.
-
Dositeja Award
Fund for Young Talents of Serbia
Awarded to high-school students for strong performance at recognised national and international competitions during the 2011/12 school year.
-
Best Final Year Project in the area of Informatics
Matematička gimnazija
-
Cambridge Overseas Trust Scholarship
Cambridge Commonwealth, European & International Trust
-
Trinity Overseas Bursary
Trinity College, University of Cambridge
-
Serbian High School Competition Awards
Mathematical and Physical Societies of Serbia
– Serbian Olympiad in Informatics - Third Prize (May 2012)
– National Competition in Informatics - Second Prize (April 2012)
– National Competition in Physics - Honourable Mention (April 2011)
– National Competition in Mathematics - Honourable Mention (March 2010)
– National Competition in Physics - Honourable Mention (April 2009)
Languages
-
English
Full professional proficiency
-
Serbian
Native or bilingual proficiency
-
French
Elementary proficiency
-
Romanian
Limited working proficiency
More activity by Petar
-
The first day of DND-24 was awesome 🔥 Today we go all in on STEM topics. My talks will be: * 7 min plenary in the second AI session (starting…
Shared by Petar Veličković
-
I'm attending DND-24 at the Rectorate of the University of Montenegro in Podgorica 🇲🇪 for the next two days; looking forward to meeting the most…
Shared by Petar Veličković
-
Round and Round we Go! 🔄 Rotary Positional Encodings (RoPE) are a common staple of frontier LLMs. _Why_ do they work so well, and _how_ do LLMs…
Shared by Petar Veličković
-
I did say Google DeepMind would one day win the Nobel Prize, I just didn't expect it to come so fast! 🧬 Heartfelt congratulations to the entire…
Shared by Petar Veličković
-
Next stop: Podgorica 🇲🇪 See you soon at DND-24 🚀
Shared by Petar Veličković
-
Expander Graph Propagation has levelled up! ⏫ Introducing Cayley Graph Propagation 🕸 https://lnkd.in/edVnQpQY While EGP truncates Cayley graphs…
Shared by Petar Veličković
-
I recently made my very first visit to Skopje 🇲🇰 for the AI Tech Summit! I had ample and inspiring interactions with the local AI community, and…
Shared by Petar Veličković