is a software engineer,
researcher, and


I currently work at Instagram, the social media platform for sane humans. I'm also building entendr. in my spare time.

I seek to develop better interactive experiences for people to cultivate and share ideas so they can meaningfully express what brings them joy.

I am a generalist toward this goal and care deeply about systems, which form the fundamental building blocks of our applications; interaction design, which governs how we use and live with computers all around us; and ambient intelligence, which enables us to interact with our environment in more natural ways.


Are you excited about any of these things?
Do reach out!



Participated in a 10-week summer fellowship organized by Y Combinator that engages current and aspiring startup founders with experienced founders and investors to cultivate a strong acumen for entrepreneurship.

Work Experience


Joining the Instagram team starting July 2024 to work on infrastructure supporting 1.4 billion Instagram users.


Worked with Professor Alberto Quattrini Li and graduate student Mingi Jeong at the Dartmouth Reality and Robotics Lab.

Research work focused on applications of modern deep learning and computer vision techniques in robotics for more robust waterline detection.


At the time, Meta was pivoting its machine learning strategy to be more efficient with limited data in light of cross-site tracking restrictions introduced by Apple in iOS 14.5.

To facilitate this transition:

  • I prototyped a new concept for a closed-loop ML pipeline from Facebook and Instagram apps to internal data-stores, to machine learning workflows, and eventually relaying feedback to the apps.
  • I developed a command-line and web utility (in C++ and PHP) for monitoring and controlling the data throughput of the ML pipeline.

The tool enabled machine learning engineers to fine-tune the data rates they wanted for their models, depending on the scale of experimentation.


Copia re-imagines e-commerce for developing communities in East Africa by providing solutions for the restrictions of the region, such as ordering products via text message and paying via mobile money.

During my time there:

  • I developed a new widget for the Copia website that simplified the order-tracking process for orders made via text message.
  • I also worked with the engineering team that maintained Copia's database and internal APIs. Technologies used included Python, Django, and PostgreSQL.

It was estimated that more than 80 percent of Copia’s two million customers ordered via text message at least once a month.


Compsight specializes in building tech solutions for SMEs and other small players in Kenya that otherwise would not have leeway to build entire engineering departments.

Featured Projects


Optical Flow

Experimentation with different techniques for tracking optical flow in a video sequence. Optical flow is the apparent motion of objects, either due to the motion of the objects themselves or the motion of the camera.
Techniques implemented include:

You can view the report here.


Augmented Reality with Planar Homographies

Experimentation with using planar homographies to overlay images on top of other images. The goal is to create an augmented reality effect by transforming the perspective of the overlay image to match the perspective of the background image.
Techniques implemented include:

  • Image feature descriptors, such as BRIEF and ORB, for detecting and describing keypoints in images.
  • RANSAC algorithm for estimating the homography between two images.
  • Perspective Transform for transforming the perspective of an image to match the perspective of another image.

You can view the report here.
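
As a rough illustration of the last two steps, here is a minimal NumPy sketch that estimates a homography from matched points with a basic RANSAC loop and then applies it. It is not the project's code: the function names, the direct linear transform fit, the iteration count, and the error threshold are all assumptions made for this example, and feature matching (BRIEF/ORB) is assumed to have happened already.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct linear transform: fit H (3x3) so that dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, points):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, iterations=200, threshold=1e-6, seed=0):
    """Hypothetical RANSAC loop: keep the 4-point fit that explains the most matches."""
    rng = np.random.default_rng(seed)
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    best_inliers = np.zeros(len(src), dtype=bool)
    # Degenerate samples can divide by zero; their fits are simply never selected.
    with np.errstate(divide="ignore", invalid="ignore"):
        for _ in range(iterations):
            sample = rng.choice(len(src), size=4, replace=False)
            h = estimate_homography(src[sample], dst[sample])
            errors = np.linalg.norm(apply_homography(h, src) - dst, axis=1)
            inliers = errors < threshold
            if inliers.sum() > best_inliers.sum():
                best_inliers = inliers
    # Refit on every inlier of the best hypothesis.
    return estimate_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

With real, noisy keypoint matches the threshold would be a few pixels rather than `1e-6`; the tight value here only suits synthetic, noise-free points.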


Societal Attitudes Toward AI

Artificial Intelligence (AI) is a hot topic, especially in light of recent improvements in its capabilities. This research project studies societal attitudes toward AI, both as they stand today and as they have evolved over the years, to understand how different events have shaped the public's perception of AI. We use topic modeling, sentiment analysis, and Procrustes analysis to analyze relationships across time periods and extract insight into the changing story of artificial intelligence. Here's the studied dataset and the project report.

Collaborative project with Aimen Abdulaziz and Angelic McPherson.


AI / Tech Dataset

I wrote a high-performance web scraper in Haskell to scrape 17,000+ articles from online technology websites such as DeepMind, MIT Tech Review, OpenAI, Singularity Hub, and TechCrunch.

The scraper uses Arrows and other functional programming patterns to ensure concurrency and efficiency. The dataset is open-source and available on HuggingFace.

Collaborative project with Aimen Abdulaziz and Angelic McPherson.


Generative Pre-trained Transformer

For fun, I implemented the Generative Pre-trained Transformer (GPT) architecture in PyTorch. GPT is a language model that uses a transformer architecture to generate text. I trained the model on the tiny Shakespeare dataset and used it to generate text in the style of Shakespeare.

Transformers are a type of neural network architecture that employs techniques such as:

  • Self-attention, which allows the model to learn the relationships between different parts of the input data.
  • Positional encoding, which allows the model to learn the relative positions of the input data.
  • Multi-head attention, which allows the model to learn different relationships between different parts of the input data.
  • Residual connections, which allow the model to learn the difference between the input data and the output data.
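
To make the first bullet concrete, here is a minimal sketch of single-head, causally masked self-attention. It uses NumPy rather than the PyTorch used in the project, and the function names and weight shapes are assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a (seq_len, d_model) input.

    Each position attends only to itself and earlier positions, as in GPT-style models.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Mask out future positions (strict upper triangle) before the softmax.
    future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores)
    return weights @ v, weights
```

Multi-head attention runs several such heads in parallel on smaller projections and concatenates their outputs; a residual connection then adds the block's input `x` back to that result.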

Adversarial Training for Neural Networks

Experimentation with various adversarial training techniques for neural networks. Adversarial training improves the robustness of neural networks to adversarial attacks, which often appear as noise in the input data. Techniques explored include:

  • Data augmentation, which gives the model more data to learn from. New samples are generated by randomly cropping and/or flipping some images. We also add perturbations to a random subset of images, which helps the model learn to be robust to noise.
  • Dropout, which entails randomly blocking a subset of the neurons in the network from transmitting information during training. This helps huge models avoid overfitting and thus generalize better.
  • Ensemble learning, which entails training multiple models on the same dataset and then averaging their predictions. This helps the model generalize better by mitigating the effects of a single model overfitting.
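
The dropout and ensemble bullets can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the project's code: the function names are hypothetical, and the "inverted dropout" scaling shown here is one common convention that I am assuming.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training.

    Surviving units are scaled by 1 / (1 - rate) so the expected activation is unchanged,
    which lets inference use the activations as-is.
    """
    if not training or rate == 0.0:
        return activations
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

def ensemble_average(predictions):
    """Average class-probability arrays produced by several independently trained models."""
    return np.mean(np.stack(predictions), axis=0)
```

For example, `ensemble_average([np.array([0.2, 0.8]), np.array([0.6, 0.4])])` gives approximately `[0.4, 0.6]`.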

Relational Database App

A relational database with a Python frontend and a MySQL backend, including various triggers and SQL automations.

Collaborative project with Ke Lou.


Logisim Processor

A fully functional 16-bit CPU implemented in Logisim. All CPU components including the ALU, registers, control unit, program counter, RAM, micro-sequencer, finite-state machine, and IO are implemented from basic gates.


Intelligent Chess Bot

A chess bot that uses various strategies including minimax, alpha-beta pruning, iterative deepening, transposition tables, move ordering, null-move pruning, aspiration windows, and quiescence search to maximize outcome against an opponent in chess.
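
The core of such a search, minimax with alpha-beta pruning, can be sketched on a toy game tree. This is a simplified illustration, not the bot's code: the tree is a nested Python list whose leaves are static evaluations, an assumption that stands in for real chess move generation and board evaluation.

```python
import math

def minimax(node, maximizing):
    """Plain minimax over a nested-list game tree; leaves are numeric evaluations."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning: same result, but skips branches that cannot matter."""
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizing player would never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizing player already has a better option
            break
    return best
```

Move ordering matters because pruning triggers sooner when strong moves are searched first; iterative deepening and transposition tables are common ways to supply that ordering.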


Tiny Search Engine

A hyper-efficient search engine that crawls webpages (whose domain can be restricted to a given subset) and indexes them, then handles user queries on the contents of the collection of pages, with results ranked by frequency. It also supports query modifiers such as AND, OR, and NOT.
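
The indexing and querying steps can be illustrated with a small inverted index. This sketch is not the project's implementation: the function names are hypothetical, queries are evaluated strictly left to right with no operator precedence, adjacent words are treated as an implicit AND, and a query that starts with NOT returns nothing. All of these are simplifying assumptions.

```python
from collections import Counter, defaultdict

def build_index(pages):
    """Map each word to a Counter of {url: occurrences} over a {url: text} collection."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index

def search(index, query):
    """Return matching urls, ranked by the summed frequency of the matched words."""
    scores = None  # url -> accumulated frequency; None until the first word is seen
    op = "AND"
    for token in query.split():
        if token in ("AND", "OR", "NOT"):
            op = token
            continue
        hits = index.get(token.lower(), Counter())
        if scores is None:
            scores = Counter() if op == "NOT" else Counter(hits)
        elif op == "AND":
            scores = Counter({u: scores[u] + hits[u] for u in scores if u in hits})
        elif op == "OR":
            scores = scores + hits
        else:  # NOT: drop pages that contain the word
            scores = Counter({u: s for u, s in scores.items() if u not in hits})
        op = "AND"  # words without an explicit operator are ANDed
    return [url for url, _ in (scores or Counter()).most_common()]
```

For instance, with pages `a`, `b`, and `c` indexed, `search(index, "cats AND dogs")` returns only the pages that contain both words.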


siavava [at] outlook [dot] com