Rough AI Blog

Local Memory for Autoregressive Language Models

Research · 28 Aug 2024

Small autoregressive language models like GPT do not always produce the desired output. To make the model remember a specific pattern of desired output, we use Bounded Memory to localize the context and move the hidden representations towards the desired output tokens (or words). We walk through the series of steps needed to memorize a fact in the language model.

Dynamic 1D Piecewise Linear Spline Function Approx.

Research, Algorithm · 01 May 2024

The difficulty of understanding a 1D ReLU-based piecewise MLP leads us to work with the piecewise linear spline instead, which is easier to interpret and to control. This experimental research starts by defining the linear spline and deriving its gradient. Finally, we create an algorithm that dynamically adjusts the pieces of the linear spline to approximate a target function.

Perceptron to Deep-Neural-Network

Algorithm · 06 Jun 2020

A step-by-step journey from the Perceptron to Deep Neural Networks. Start with the Perceptron, then move to Logistic Regression, the Single Layer Neural Network, the Multilayer Perceptron (1 hidden layer), and finally the Deep Neural Network. Understand each algorithm in sequence, with visualizations and math.

Artificial Neural Network Back Then

Algorithm · 24 May 2020

The Artificial Neural Network (ANN) is one of the most popular Machine Learning algorithms. As the name suggests, the algorithm tries to mimic the Biological Neural Network, i.e. the brain. In this post, we explore the development of the algorithm from the very beginning up to the development of the Multilayer Perceptron.

Exploring Polynomial Regression

Algorithm · 04 Nov 2019

Polynomial Regression is a generalization of Linear Regression. It is simple to understand but can do a lot. It can approximate a wide range of non-linear functions, where it usually fits better than plain Linear Regression. Here, we extend the idea of curve fitting and learn about its capacity, its problems, and its limitations.
