{\Large\bfseries A Recurrent Neural Network on chip\par}
\vspace{10pt}
\textbf{Authors:}\\
David Lanzendörfer
%David Lanzendörfer, Hernani Marques
\vspace{10pt}
%\textbf{Contributors:}\\
%Crow
\vspace{10pt}
\textbf{Date:}\\
\today
\vspace{10pt}
%\includegraphics[width=300pt]{img/sydney1.jpeg}
\vspace{10pt}
\textbf{Institution or Organization:}\\
Freedom Club
\vspace{10pt}
\textbf{Contact Information:}\\
leviathan@libresilicon.com
\end{titlepage}
\begin{abstract}
\setlength{\parindent}{0pt}
In the past few years, generative AI has received a great deal of attention and is gaining traction in various fields, where it is used to generate audio, video, images and text.
However, current AI solutions are centrally hosted, and generation is performed by transformer models distributed across huge arrays of GPUs.
This approach has many drawbacks, chiefly the power consumption of such a large set of GPUs, but it is also problematic in terms of privacy.
All inference requests have to be forwarded to these server farms, and it is not certain what the operator will do with the request logs.
Since OpenAI is an American company and as such subject to FISA, public administration offices such as European government agencies cannot use ChatGPT, despite the efficiency gains it could offer to notoriously inefficient administrative processes.
Personal GPT addresses these problems by packing a Large Language Model onto a chip that can be attached to a computer over USB and generates text without requiring an internet connection.
All the data, confidential or not, stays within your own four walls.
\end{abstract}
\section{Introduction}
The Personal GPT architecture consists of three types of layers:
\begin{enumerate}
\item The Input layer, also known as the encoder
\item The Hidden layers, where all the parameters holding the knowledge are stored
\item The Output layer, also known as the decoder, which maps the positional vector provided by the hidden network back to a token.
\end{enumerate}
\begin{figure}[h]
\begin{tcolorbox}
\centering
\includegraphics[width=400pt]{diagram1.png}
\caption{General diagram}
\end{tcolorbox}
\end{figure}
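To illustrate how these three stages fit together, the following Python sketch traces a single generation step. The dimensions, activation functions and random weights are illustrative assumptions and do not reflect the actual on-chip parameters; the sketch only shows how a token and the previous positional vector flow through encoder, hidden layers and decoder.
\begin{verbatim}
# Illustrative sketch only: dimensions, activations and weights are
# assumptions, not the actual on-chip parameters.
import numpy as np

TOKEN_BITS = 16     # token IDs as 16-bit vectors (GPT-2 token space)
HIDDEN_DIM = 64     # assumed width of the positional vector

rng = np.random.default_rng(0)
W_enc = rng.standard_normal((TOKEN_BITS + HIDDEN_DIM, HIDDEN_DIM)) * 0.1
W_hid = rng.standard_normal((HIDDEN_DIM, HIDDEN_DIM)) * 0.1
W_dec = rng.standard_normal((HIDDEN_DIM, TOKEN_BITS)) * 0.1

def token_to_bits(token_id):
    """Binary-encode a token ID into a 16-element bit vector."""
    return np.array([(token_id >> i) & 1 for i in range(TOKEN_BITS)],
                    dtype=float)

def bits_to_token(bits):
    """Threshold the decoder output back into a token ID."""
    return int(sum(1 << i for i, b in enumerate(bits) if b > 0.5))

def step(token_id, state):
    # Encoder: token bits plus the previous positional vector
    x = np.concatenate([token_to_bits(token_id), state])
    pos = np.tanh(x @ W_enc)
    # Hidden layer(s): transform the positional vector
    h = np.tanh(pos @ W_hid)
    # Decoder: map the hidden state back to token bits
    out = 1.0 / (1.0 + np.exp(-(h @ W_dec)))
    return bits_to_token(out), pos

state = np.zeros(HIDDEN_DIM)
next_token, state = step(50256, state)  # one generation step
\end{verbatim}
In the actual architecture the hidden layers hold the trained parameters in on-chip storage; the random weights above merely stand in for them.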
\subsection{Encoder Layer}
The encoder layer is composed of a configurable number of input perceptrons, which are wired in a recurrent neural network configuration.
In this RNN configuration, each neuron contains not only input synapses for the 16 token input bits (covering the token space of the GPT-2 tokenizer) but also an input bit array carrying the previously encoded positional vector.
This way, the positional encoding becomes a continuous time series.
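Written as a recurrence, and using generic notation rather than the chip's actual implementation, the encoder step can be sketched as
\begin{equation}
\mathbf{p}_t = f\!\left(W_x\,\mathbf{x}_t + W_p\,\mathbf{p}_{t-1} + \mathbf{b}\right),
\end{equation}
where $\mathbf{x}_t$ is the 16-bit token input at step $t$, $\mathbf{p}_{t-1}$ is the positional vector encoded in the previous step and fed back through the additional input synapses, $W_x$, $W_p$ and $\mathbf{b}$ are the perceptrons' weights and biases, and $f$ is the perceptron activation function.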