
Amittai Siavava



My primary interests are in

, and , and a little bit of . However, I love to problem-solve across a wide range of domains.

I love spending time crafting things—



Participated in a 10-week summer fellowship organized by

that engages current and aspiring startup founders with experienced founders and investors to cultivate a strong acumen for entrepreneurship.

Work Experience


Worked with Professor

and graduate student at the Dartmouth Reality and Robotics Lab.

Research work focused on applications of modern

and techniques in for more robust .


Assisted in the instruction and facilitation of various courses:


At the time, Meta was pivoting its machine learning strategy to be more efficient with limited data in light of cross-site tracking restrictions introduced by Apple in iOS 14.5.

To facilitate this transition:

  • I prototyped a new concept for a closed-loop ML pipeline that runs from the Facebook and Instagram apps to internal data stores, on to machine learning workflows, and eventually relays feedback to the apps.
  • I developed a command-line and web utility (in C++ and PHP) for monitoring and controlling the data throughput of the ML pipeline.

The tool enabled machine learning engineers to fine-tune the data rates for their models, depending on the scale of experimentation.


Copia re-imagines e-commerce for developing communities in East Africa by providing solutions for the restrictions of the region, such as ordering products via text message and paying via mobile money.

During my time there:

  • I developed a new widget for the Copia website that simplified the order-tracking process for orders made via text message.
  • I also worked with the engineering team that maintained Copia's database and internal APIs. Technologies used included Python, Django, and PostgreSQL.

It was estimated that more than 80 percent of Copia’s two million customers ordered via text message at least once a month.


Compsight specializes in building tech solutions for SMEs and other small players in Kenya that otherwise would not have the leeway to build entire engineering departments.

Featured Projects


Built a landing page and documentation website for my Computer Science senior design challenge project.


Wrote a functional program in Haskell to scrape 3000+ articles from online technology websites such as

The scraper uses

and , among other functional programming patterns, to ensure concurrency and efficiency. The articles were used for a subsequent study on the changing attitudes of society toward technology.

The dataset is open-source and available on


Collaborative project with

and .


Generative Pre-trained Transformer

For fun, I implemented the

(GPT) architecture in PyTorch. GPT is a language model that uses a transformer architecture to generate text. I trained the model on the and used it to generate text in the style of Shakespeare.

Transformers are a type of neural network architecture that employs techniques such as:

  • , which allows the model to learn the relationships between different parts of the input data.
  • , which allows the model to learn the relative positions of the input data.
  • , which allows the model to learn different relationships between different parts of the input data.
  • , which allows the model to learn the difference between the input data and the output data.
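As a rough illustration of the first item in the list above, here is a minimal sketch of what is usually called scaled dot-product self-attention, written in plain Python. It is not taken from the project, which used PyTorch. The function names `softmax` and `self_attention` are hypothetical, and for brevity the queries, keys, and values are simply the raw input vectors, with no learned projection matrices.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over equal-length vectors.

    Assumption: queries, keys, and values are the inputs themselves
    (a real transformer applies learned linear projections first).
    """
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all value vectors.
        outputs.append([
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ])
    return outputs
```

Each output vector is a mixture of all input vectors, weighted by how similar they are to the querying token, which is how the model relates different parts of the input to one another.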

Adversarial Training for Neural Networks

Experimentation with various adversarial training techniques for neural networks. Adversarial training improves the robustness of neural networks to adversarial attacks, which often take the form of noise in the input data. Techniques explored include:

  • , which helps the model have more data to learn from. New samples are generated by randomly cropping and/or flipping some images. We also add , to a random subset of images, which helps the model learn to be robust to noise.
  • , which entails randomly blocking a subset of the neurons in the network from transmitting information during training. This helps large models avoid overfitting and thus generalize better.
  • , which entails training multiple models on the same dataset and then averaging their predictions. This helps the ensemble generalize better by mitigating the effect of any single model overfitting.
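The last two items above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the project's code: the names `dropout` and `ensemble_average` are hypothetical, and the rescaling of surviving activations (the "inverted" variant) is an assumption about how the neuron blocking was done.

```python
import random


def dropout(activations, drop_prob, rng):
    """Zero each activation with probability drop_prob (training only).

    Assumption: survivors are scaled by 1 / (1 - drop_prob) so that the
    expected activation stays the same.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - drop_prob)
    return [
        0.0 if rng.random() < drop_prob else a * keep_scale
        for a in activations
    ]


def ensemble_average(predictions):
    """Average per-class scores from several models, class by class."""
    count = len(predictions)
    return [sum(column) / count for column in zip(*predictions)]


rng = random.Random(0)  # seeded so repeated runs drop the same units
hidden = dropout([1.0, 2.0, 3.0, 4.0], drop_prob=0.5, rng=rng)
combined = ensemble_average([[0.2, 0.8], [0.6, 0.4]])
```

Passing the random generator in explicitly keeps the sketch reproducible. At inference time, dropout would be skipped entirely.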

Relational Database App


with a frontend and a backend, including various triggers and automations.

Collaborative project with



Logisim Processor

A fully functional 16-bit

implemented in . All CPU components including the , , , , , , , and are implemented from basic gates.
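To give a sense of what building components from basic gates involves, here is a small sketch in Python rather than Logisim. It composes a 16-bit adder from AND, OR, and XOR gates. Whether the project's adder uses this ripple-carry design is an assumption, and all function names are hypothetical.

```python
def and_gate(a, b):
    return a & b


def or_gate(a, b):
    return a | b


def xor_gate(a, b):
    return a ^ b


def full_adder(a, b, carry_in):
    """Add three single bits; return (sum_bit, carry_out)."""
    partial = xor_gate(a, b)
    sum_bit = xor_gate(partial, carry_in)
    carry_out = or_gate(and_gate(a, b), and_gate(partial, carry_in))
    return sum_bit, carry_out


def add16(x, y):
    """Add two 16-bit unsigned integers; return (result, overflow_carry)."""
    result, carry = 0, 0
    for bit in range(16):
        # Feed each bit pair through a full adder, lowest bit first,
        # passing the carry along to the next position.
        a = (x >> bit) & 1
        b = (y >> bit) & 1
        sum_bit, carry = full_adder(a, b, carry)
        result |= sum_bit << bit
    return result, carry
```

The carry out of the final bit signals unsigned overflow, so adding 1 to 0xFFFF wraps around to 0 with the carry set.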


Intelligent Chess Bot

A chess bot that uses various strategies including

, , , , , , , and to maximize outcome against an opponent in chess.


Tiny Search Engine

A hyper-efficient

that crawls webpages (whose domain can be restricted to a given subset) and indexes them, then handles user queries on the contents of the collection of pages, with results ranked by frequency. It also supports query modifiers such as AND, OR, and NOT.
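The indexing and querying steps can be sketched in Python as follows. This is a simplified illustration, not the project's implementation: the names `build_index` and `search` are hypothetical, and the query semantics (strict left-to-right evaluation, implicit AND between adjacent words, summed counts as the ranking score) are assumptions.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Map each word to a Counter of {page_id: occurrences}."""
    index = defaultdict(Counter)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word][page_id] += 1
    return index


def search(index, query):
    """Evaluate a flat query left to right; rank pages by summed counts.

    Assumed semantics: AND keeps pages matching both sides, OR merges
    both sides, NOT removes pages containing the next word. There is
    no operator precedence and no parentheses.
    """
    scores = None  # stays None until the first word is seen
    operator = "AND"
    for token in query.split():
        if token in ("AND", "OR", "NOT"):
            operator = token
            continue
        hits = index.get(token.lower(), Counter())
        if scores is None:
            # The first word seeds the results (nothing to negate yet).
            scores = Counter() if operator == "NOT" else Counter(hits)
        elif operator == "AND":
            scores = Counter(
                {p: scores[p] + hits[p] for p in scores if p in hits}
            )
        elif operator == "NOT":
            scores = Counter(
                {p: s for p, s in scores.items() if p not in hits}
            )
        else:  # OR
            scores = scores + hits
        operator = "AND"  # adjacent words are treated as AND
    return [page for page, _ in (scores or Counter()).most_common()]


pages = {
    "a": "cats chase mice and cats nap",
    "b": "dogs chase cats",
    "c": "dogs nap",
}
index = build_index(pages)
results = search(index, "cats NOT dogs")
```

Operators are recognized only in upper case, so a lowercase "and" inside a page or query is treated as an ordinary word.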


siavava [at] outlook [dot] com