Anxhelo Diko

PhD Student in Computer Science

VisionLab Research Group, Sapienza University of Rome

Biography

A highly motivated and results-oriented Computer Vision Ph.D. student with a deep passion for advancing the field of artificial intelligence. My research focuses on building multimodal representations and understanding human activities from ego/exocentric perspectives, addressing key challenges for autonomous agents and AI in general. I have extensive experience with multimodal large language models for video captioning and question answering and a keen interest in view-invariant video representation learning. I’m particularly committed to exploring how to effectively bridge the gap between representations of different modalities while preserving their unique characteristics.

In addition to my research expertise, I possess a strong engineering foundation honed through academic and industry experiences. Proficient in Python, C++, and CUDA, I excel at rapidly prototyping and implementing innovative ideas. I’m eager to leverage my skills and knowledge to contribute to cutting-edge research and development in this dynamic field.

Interests
  • Artificial Intelligence
  • Computer Vision
  • Machine Learning
  • Deep Learning
  • Human Activity Understanding
  • Representation Learning
  • AI for Medicine
Education
  • PhD in Computer Science, 2021 - ongoing

    Sapienza University

  • MSc in Computer Science, 2018 - 2020

    Sapienza University

  • BSc in Business Computer Science (a.k.a. Data Science), 2015 - 2018

    University of Tirana

Skills

Machine Learning

5+ Years

Deep Learning

5+ Years

Computer Vision

4+ Years

Research

3+ Years

Programming

5+ Years

Experience

Huawei Research Center Helsinki
Computer Vision Research Scientist
March 2024 – Present Helsinki, Finland

Conducting research on the following areas of computer vision:

  • Multimodal Large Language Models
  • Long-term Video Understanding
  • Dense video captioning
  • Temporal Event Localization
  • Question Answering from Videos

Sapienza University of Rome
Research Fellow
March 2021 – Present Rome, Italy

Main responsibilities include:

  • Designing and implementing computer vision solutions for gait analysis through RGB videos that assist physicians in diagnosing patients with mobility disorders.
  • Designing and implementing machine learning approaches for medical image processing.
  • Designing and implementing machine learning approaches for resource optimization for post-intervention patients.

MedLea srls
Machine Learning Specialist
March 2020 – July 2021 Rome, Italy

Main responsibilities include:

  • Designing and implementing machine learning algorithms for respiratory diagnosis and prognosis by applying classification, regression, and segmentation techniques on CT images and patient medical history. The implemented solutions would provide MedLea with a suite of algorithms that could analyze patient data for different respiratory problems.
  • Deploying machine learning models.
  • Designing and implementing a parallel and scalable GPU ray-tracing algorithm for discretizing 3D mesh representations of geometries into volumetric representations. The implemented algorithm would cut the computational costs of the services offered by MedLea by 30% in the preparation phase.
  • Managing MedLea's computing infrastructure.

PaperClicks
Machine Learning Intern
January 2018 – July 2018 Rome, Italy

Main responsibilities include:

  • Designing and implementing a machine learning algorithm for optimizing affiliate marketing campaigns, enabling automated profits.

Accomplishments

Advanced C++ developer
See certificate

Publications

(2024). S-GEAR: Semantically Guided Representation Learning for Action Anticipation (ECCV 2024).

PDF Code

(2021). MS-Faster R-CNN: Multi-stream backbone for improved Faster R-CNN object detection and aerial tracking from UAV images.

PDF

(2024). Faster Than Lies: Real-time Deepfake Detection using Binary Neural Networks (CVPRW).

PDF Code

Collaboration (inter-disciplinary)

(2023). COVID-19 therapy optimization by AI-driven biomechanical simulations.

PDF

(2022). A novel GAN-based anomaly detection and localization method for aerial video surveillance at low altitude.

PDF

(2021). Low-altitude aerial video surveillance via one-class SVM anomaly detection from textural features in UAV images.

PDF

(2021). In-silico analysis of airflow dynamics and particle transport within a human nasal cavity.

PDF
