Research Works

This page contains the abstracts of some of the research projects I’ve worked on recently.

On the Continued Fraction Expansion of Almost All Real Numbers

It is well known that every irrational number x has a unique expansion as a continued fraction x = [a_0; a_1, a_2,\dots], where a_0 = a_0(x) = [x] is the integer part of x and the numbers a_i = a_i(x), i = 1,2,\dots, are positive integers, called the continued fraction digits of x. By a classical result of Gauss and Kuzmin, the continued fraction expansion of a random real number x contains, with probability 1, each digit a\in\NN with asymptotic frequency P(a) = \log_2(1 + 1/(a(a + 2))); that is, almost all real numbers x satisfy

    \begin{equation*} \lim_{n\to\infty}\frac1n\#\{1\le i\le n: a_i(x) = a\} = P(a),\quad a = 1, 2, \dots \end{equation*}
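As a rough numerical illustration of this limit (not part of the paper), the following Python sketch compares the digit frequencies of a pseudo-random number with P(a). It assumes that a random real in (0, 1) can be modeled by a rational p/2^bits with many random bits, and that its leading digits can be read off with the Euclidean algorithm. The function names, the seed, and the parameter values are illustrative choices.

```python
import math
import random
from collections import Counter


def gauss_kuzmin(a):
    """Limiting frequency P(a) = log2(1 + 1/(a(a + 2))) of the digit a."""
    return math.log2(1 + 1 / (a * (a + 2)))


def random_cf_digits(n, bits=20000, seed=0):
    """First n continued fraction digits a_1, a_2, ... of a pseudo-random x in (0, 1).

    Assumption: x is modeled by the rational p / 2**bits. `bits` is taken
    large enough that, heuristically, the first n digits are not affected
    by the truncation to finitely many bits.
    """
    rng = random.Random(seed)
    p = rng.getrandbits(bits) | 1  # force p to be odd and nonzero
    q = 1 << bits
    digits = []
    while p and len(digits) < n:
        a, r = divmod(q, p)  # 1/x = q/p = a + r/p
        digits.append(a)
        q, p = p, r          # continue with the fractional part r/p
    return digits


digits = random_cf_digits(5000)
counts = Counter(digits)
for a in range(1, 6):
    print(a, counts[a] / len(digits), round(gauss_kuzmin(a), 5))
```

With only a few thousand digits the empirical frequencies fluctuate around P(a) at the level of about one percent, so agreement to two decimal places is the most one should expect from this sketch.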

In this paper we consider two related questions. First, for certain infinite subsets A\subset \NN, we establish simple closed formulas for the frequency with which the continued fraction digits of almost all real numbers belong to the set A. For example, we show that for almost all real numbers x, a proportion \log_2(\pi^2/6) = 0.71803\dots of the continued fraction digits of x are of the form p - 1, where p is a prime.
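The prime example can be checked numerically. Since P(p - 1) = -\log_2(1 - p^{-2}), summing over all primes and applying the Euler product for \zeta(2) = \pi^2/6 gives the stated constant. The Python sketch below is only a sanity check under that reading, not the argument used in the paper; the helper names and the cutoff 10^6 are illustrative choices.

```python
import math


def gauss_kuzmin(a):
    """P(a) = log2(1 + 1/(a(a + 2))), computed with log1p for accuracy."""
    return math.log1p(1 / (a * (a + 2))) / math.log(2)


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


# Partial sum of P(p - 1) over the primes below an arbitrary cutoff.
partial = sum(gauss_kuzmin(p - 1) for p in primes_up_to(10**6))
closed_form = math.log2(math.pi**2 / 6)
print(partial, closed_form)
```

The partial sum is slightly below the closed form because the primes above the cutoff are omitted.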

Second, we determine explicitly the frequency with which a string of k consecutive digits a appears in the continued fraction expansion of almost all real numbers. For example, we show that for almost all real numbers x, a string of k consecutive digits 1 appears in the continued fraction expansion of x with frequency |\log_2(1 + (-1)^k/F_{k + 2}^2)|, where F_n is the nth Fibonacci number (with F_1 = F_2 = 1).
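The formula for strings of 1s can be cross-checked against the Gauss measure \mu([u, v]) = \log_2((1 + v)/(1 + u)). By the ergodicity of the Gauss map, the frequency of a digit string equals the Gauss measure of the set of x in (0, 1) whose expansion starts with that string, and for k ones this set is the interval with endpoints F_k/F_{k+1} and F_{k+1}/F_{k+2}. The Python sketch below, which assumes the indexing F_1 = F_2 = 1 and uses illustrative function names, compares the two expressions; it is a consistency check rather than the derivation used in the paper.

```python
import math
from fractions import Fraction


def fib(n):
    """nth Fibonacci number with F_1 = F_2 = 1 (and F_0 = 0)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def ones_string_frequency(k):
    """Frequency of k consecutive 1s according to the closed formula."""
    return abs(math.log2(1 + (-1) ** k / fib(k + 2) ** 2))


def ones_cylinder_measure(k):
    """Gauss measure of {x in (0, 1) : a_1(x) = ... = a_k(x) = 1}."""
    u = Fraction(fib(k), fib(k + 1))
    v = Fraction(fib(k + 1), fib(k + 2))
    lo, hi = min(u, v), max(u, v)
    return math.log2(float((1 + hi) / (1 + lo)))


for k in range(1, 8):
    print(k, ones_string_frequency(k), ones_cylinder_measure(k))
```

For k = 1 both expressions reduce to P(1) = \log_2(4/3), and the agreement for general k follows from Cassini's identity F_{k+1}F_{k+3} - F_{k+2}^2 = (-1)^k.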

Finally, we compare the frequencies predicted by these results with actual frequencies found among the first 300 million continued fraction digits of \pi. Our results show that the latter frequencies are statistically indistinguishable from those of a random real number.

Predicting Flight Delays Caused By Weather

This project is an attempt to predict the amount by which a given flight is delayed due to weather-related causes. We used one dataset on flight delays and their causes and another on weather conditions in the US, and determined, for each flight, whether its takeoff and/or landing time fell during a severe weather event. If so, we used the time between the end of the event and the departure/arrival, as well as the type of weather event at each airport, as features to try to predict the actual delay time due to weather. We tested two different models, k-nearest neighbors (KNN) and a support vector machine (SVM), to see which gave the better accuracy. Neither model was very successful, owing to a large number of confounding variables.
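To make the feature construction concrete, here is a minimal Python sketch of how the weather features for one end of a flight might be computed. It is not the project's actual code: the WeatherEvent record, the weather_features function, the airport code, and the sample times are all hypothetical, and the gap is assumed to be measured in minutes from the departure/arrival time to the end of the event.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class WeatherEvent:
    """Hypothetical record for one severe weather event at an airport."""
    airport: str
    kind: str        # type of event, e.g. "Snow" or "Fog"
    start: datetime
    end: datetime


def weather_features(airport, flight_time, events):
    """Return (event type, minutes until the event ends) if flight_time
    falls inside an event at the given airport, otherwise None."""
    for ev in events:
        if ev.airport == airport and ev.start <= flight_time <= ev.end:
            minutes_left = (ev.end - flight_time).total_seconds() / 60
            return ev.kind, minutes_left
    return None


# Made-up example: a snow event and a departure that falls inside it.
events = [
    WeatherEvent("ORD", "Snow",
                 datetime(2019, 1, 12, 14, 0),
                 datetime(2019, 1, 12, 18, 30)),
]
features = weather_features("ORD", datetime(2019, 1, 12, 17, 0), events)
print(features)
```

In this sketch the same function would be called once for the departure airport and once for the arrival airport, and flights that return None at both ends would be left out of the training data.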

Link to our code:
https://gist.github.com/shreyas-s125/f42d9c435786aaa472e73b81c7d12eb8