
Amittai Siavava

amittai.studio

About

I am majoring in computer science and mathematics at Dartmouth College, with particular interests in deep learning, category theory, functional programming, and design.

I carry a keen sense of responsibility for my work, across the entire data stack:

  • Ethical and responsible data collection and warehousing.

    Knowing what to collect, how, and why matters, and most important of all is respecting user privacy and copyright where applicable. I have relevant experience and coursework in data mining and the ethics therein.

    Proper warehousing, be it in data lakes or SQL/NoSQL databases, is also critical. I have experience working with both SQL and NoSQL databases, and I am working to better understand the underlying architectures and implementations of SQL databases such as MySQL and PostgreSQL.

    I am also curious about vector databases and the nuances they introduce to the data stack.

  • Ethical analysis, interpretation and usage.

    We are in the age of AI, indubitably.
    I am interested in deep learning and the applications of novel neural network architectures, especially transformers, to real-world problems. I have experience working with neural networks (and other less interesting ML models!) in various application contexts, including computer vision, language understanding, robotics, and reinforcement learning.

  • Presentation and use in production.

    Almost anything becomes more useful once it can be presented to an end user in a system designed and customized for their needs. I have experience and interest in building both front-end, user-facing applications and the back-ends that support them. I am experienced with both React and Vue, and the frameworks built on top of them (e.g. Next and Nuxt).

    I also have experience building systems with a focus on efficiency and high performance (C, C++, Rust) and reliability (Rust, Haskell).

I am currently working on some stuff I am excited about over at entendr and completing my final year of college. I also recently started learning Racket because of its language-oriented development features. You can find quick links below, and social links at the bottom of this page.

blog | art | photography | presentations | entendr

Education

06/22 - 09/22
Fellow at Y Combinator (Startup School)

Participated in a 10-week summer fellowship organized by Y Combinator that engages current and aspiring startup founders with experienced founders and investors to cultivate a strong acumen for entrepreneurship.

Work Experience

06/23 - 12/23
Researcher at D-RLab

Worked with Professor Alberto Quattrini Li and graduate student Mingi Jeong at the Dartmouth Reality and Robotics Lab.

Research work focused on applications of modern deep learning and computer vision techniques in robotics for more robust waterline detection.

06/22 - 09/22
Software Engineering Intern at Facebook

At the time, Meta was pivoting its machine learning strategy to be more efficient with limited data in light of cross-site tracking restrictions introduced by Apple in iOS 14.5.

To facilitate this transition:

  • I prototyped a new concept for a closed-loop ML pipeline from Facebook and Instagram apps to internal data-stores, to machine learning workflows, and eventually relaying feedback to the apps.
  • I developed a command-line and web utility (in C++ and PHP) for monitoring and controlling the data throughput of the ML pipeline.

The tool enabled machine learning engineers to fine-tune the data rates they wanted for their models, depending on the scale of experimentation.

06/21 - 08/21
Fullstack Engineering Intern at Copia

Copia re-imagines e-commerce for developing communities in East Africa by providing solutions for the restrictions of the region, such as ordering products via text message and paying via mobile money.

During my time there:

  • I developed a new widget for the Copia website that simplified the order-tracking process for orders made via text message.
  • I also worked with the engineering team that maintained Copia's database and internal APIs. Technologies used included Python, Django, and PostgreSQL.

It was estimated that more than 80 percent of Copia’s two million customers ordered via text message at least once a month.

01/18 - 11/18
Web Development Intern at Compsight

Compsight specializes in building tech solutions for SMEs and other small players in Kenya that otherwise would not have leeway to build entire engineering departments.

Featured Projects


archive
11/2023
Societal Attitudes Toward AI

Artificial Intelligence (AI) is a hot topic, especially in light of recent improvements in its capabilities. This research project studies societal attitudes toward AI, both as they stand today and as they have evolved over the years, to understand how different events have shaped the public's perception of AI. We use topic modeling, sentiment analysis, and Procrustes analysis to analyze relationships across time periods and extract insight into the changing story of artificial intelligence. Here are the studied dataset and the project report.

Collaborative project with Aimen Abdulaziz and Angelic McPherson.

11/2023
AI / Tech Dataset

I wrote a high-performance web scraper in Haskell to scrape 17,000+ articles from online technology websites such as DeepMind, MIT Tech Review, OpenAI, Singularity Hub, and TechCrunch.

The scraper uses Arrows and other functional programming patterns to ensure concurrency and efficiency. The dataset is open-source and available on HuggingFace.

Collaborative project with Aimen Abdulaziz and Angelic McPherson.

9/2023

Generative Pre-trained Transformer

For fun, I implemented the Generative Pre-trained Transformer (GPT) architecture in PyTorch. GPT is a language model that uses a transformer architecture to generate text. I trained the model on the tiny Shakespeare dataset and used it to generate text in the style of Shakespeare.

Transformers are a type of neural network architecture that employs techniques such as:

  • Self-attention, which allows the model to learn the relationships between different parts of the input data.
  • Positional encoding, which gives the model information about the order of the input tokens, since attention on its own is order-agnostic.
  • Multi-head attention, which runs several attention operations in parallel so the model can learn different kinds of relationships between parts of the input data.
  • Residual connections, which let each layer learn a correction to its input rather than a full transformation, making deep networks easier to train.
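As a rough illustration of the self-attention idea above, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the PyTorch code from the project: the function names are hypothetical, and the queries, keys, and values are passed in directly, whereas a real model would compute them with learned linear projections. The `causal` flag is an assumption that mimics the left-to-right masking GPT-style models use.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values, causal=True):
    """Scaled dot-product attention over lists of equal-length vectors.

    With causal=True, position i only attends to positions 0..i,
    as in a decoder-only (GPT-style) model.
    """
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        limit = i + 1 if causal else len(keys)
        # Similarity of this query to each visible key, scaled by sqrt(d_k).
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k)
            for k in keys[:limit]
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible values.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values[:limit]))
            for j in range(len(values[0]))
        ])
    return outputs

# Toy input: three 2-dimensional "token" vectors (illustrative only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```

Multi-head attention would run several such attention operations on different projections of the input and concatenate the results.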
5/2023

Adversarial Training for Neural Networks

Experimentation with various adversarial training techniques for neural networks. Adversarial training is useful for improving the robustness of neural networks to adversarial attacks, which often appear as noise-like perturbations in the input data. Techniques explored include:

  • Data augmentation, which gives the model more data to learn from. New samples are generated by randomly cropping and/or flipping some images. We also add perturbations to a random subset of images, which helps the model learn to be robust to noise.
  • Dropout, which entails randomly blocking a subset of the neurons in the network from transmitting information during training. This helps huge models avoid overfitting, and thus generalize better.
  • Ensemble learning, which entails training multiple models on the same dataset and then averaging their predictions. This helps the ensemble generalize better by mitigating the effects of a single model overfitting.
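The three techniques above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions, not the project's code: images are nested lists of pixel values in [0, 1], and the helper names (`hflip`, `perturb`, `dropout`, `ensemble_average`) are hypothetical.

```python
import random

def hflip(image):
    """Data augmentation: mirror an image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def perturb(image, epsilon, rng):
    """Data augmentation: add uniform noise in [-epsilon, epsilon], clamped to [0, 1]."""
    return [
        [min(1.0, max(0.0, px + rng.uniform(-epsilon, epsilon))) for px in row]
        for row in image
    ]

def dropout(activations, p, rng):
    """Inverted dropout: zero each activation with probability p during training,
    and rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

def ensemble_average(predictions):
    """Average per-class predictions from several models."""
    count = len(predictions)
    return [sum(column) / count for column in zip(*predictions)]

rng = random.Random(0)  # seeded so the sketch is reproducible
image = [[0.0, 0.5], [1.0, 0.25]]
print(hflip(image))
print(perturb(image, 0.1, rng))
print(dropout([1.0, 2.0, 3.0, 4.0], 0.5, rng))
print(ensemble_average([[0.2, 0.8], [0.6, 0.4]]))
```

In practice a deep learning framework provides these operations; the sketch only shows what each one does to the data.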
10/2022

Relational Database App

A relational database with a Python frontend and a MySQL backend, including various triggers and SQL automations.

Collaborative project with Ke Lou.

11/2021

Logisim Processor

A fully functional 16-bit CPU implemented in Logisim. All CPU components including the ALU, registers, control unit, program counter, RAM, micro-sequencer, finite-state machine, and IO are implemented from basic gates.

10/2021

Intelligent Chess Bot

A chess bot that uses various strategies including minimax, alpha-beta pruning, iterative deepening, transposition tables, move ordering, null-move pruning, aspiration windows, and quiescence search to maximize its outcome against an opponent in chess.

5/2021

Tiny Search Engine

A hyper-efficient search engine that crawls webpages (whose domain can be restricted to a given subset) and indexes them, then handles user queries on the contents of the collection of pages, with results ranked by frequency. It also supports query modifiers such as AND, OR, and NOT.
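The indexing and querying steps can be sketched with an inverted index, shown here in Python purely for illustration. The scoring and operator semantics are assumptions rather than details of the project: AND binds tighter than OR, an AND group scores a page by its minimum term count, OR sums the group scores, and NOT excludes pages containing the term within its group. The page names and helper functions are hypothetical.

```python
from collections import Counter, defaultdict

def build_index(pages):
    """Map each word to a Counter of {page: occurrences}."""
    index = defaultdict(Counter)
    for page, text in pages.items():
        for word in text.lower().split():
            index[word][page] += 1
    return index

def search(index, query):
    """Rank pages for a query such as 'cats AND chase OR mice'.

    Assumed semantics: AND groups are separated by OR; a group scores a page
    by its minimum term count; 'NOT term' removes pages containing the term.
    A group made only of NOT terms matches nothing.
    """
    totals = Counter()
    for clause in query.lower().split(" or "):
        scores = None       # running AND result: page -> score
        excluded = set()    # pages ruled out by NOT
        negate = False
        for token in clause.split():
            if token == "and":
                continue
            if token == "not":
                negate = True
                continue
            postings = index.get(token, Counter())
            if negate:
                excluded |= set(postings)
                negate = False
            elif scores is None:
                scores = Counter(postings)
            else:
                scores = Counter({
                    page: min(score, postings[page])
                    for page, score in scores.items()
                    if page in postings
                })
        for page, score in (scores or {}).items():
            if page not in excluded:
                totals[page] += score
    # Highest score first; ties broken alphabetically by page name.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

pages = {
    "a.html": "cats chase mice cats nap",
    "b.html": "dogs chase cats",
    "c.html": "mice eat cheese",
}
index = build_index(pages)
print(search(index, "cats AND NOT dogs"))
print(search(index, "cats OR mice"))
```

A crawler would populate `pages` by fetching and parsing documents; here the contents are hard-coded to keep the sketch self-contained.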

Contact

linkedin
github
huggingface
email
siavava [at] outlook [dot] com
instagram
leetcode
stackoverflow