Summer Research School.

Published 22 August 2021 at Yours, Kewbish. 2,061 words. Subscribe via RSS.

This post is unlisted and has been archived. This doesn't represent my best work; please check out the posts listed here instead.

Introduction

I’ve mentioned it a couple times in the footnotes and conclusions of previous posts, but I recently wrapped up a wonderful time at the Summer Research School. Hosted by the Bulgarian High School Student Institute for Mathematics and Informatics, the program takes in about forty Bulgarian and international students to pursue an intensive research program over summer break. For three weeks, each student works with their mentor on a topic of inquiry of their choosing, and end up both writing a paper and creating a presentation to defend against mentors’ questions. In past years, all students, including international ones, gathered in Bulgaria for the camp, but COVID restrictions meant that only the Bulgarian students were able to do so this year. Even though I participated virtually with a ten-hour time difference (I’ve always been an early bird, but I will confess I dropped a couple of the 6AM lectures for the sake of my sanity), I had an amazing time, and I’ve gained many valuable experiences: learning to write academically, creating my own research program, and hanging out with other like-minded youth.

I’d heard about SRS from a friend a couple years ago (thanks Umang!) over the summer, just when the program was starting up. I thought it sounded enticing, what with the opportunity to be mentored through a proper research article, and getting exposed to an abridged and compressed version of the typical academic. To be honest, I didn’t think my limited research experience - restricted to a very scuffed audio-visual random number generator at a district science fair and maybe playing up whatever I did during Google Code-in - was sufficient to be accepted. But when I got the acceptance email, I was super excited: I’d been hoping to participate for a while, so it was a pleasant suprise.

SRS was a great opportunity for me, both to explore the fields of theoretical mathematics and computer science, and to meet other students passionate about their interests. I’ve made fantastic memories here, and I’m very happy I attended.

Probabilistic Primality Testing

On my application, I listed cryptography, random number generation, and programming language theory as my main interests - three things that I have very insubstantial experience with. Going into SRS, I had no concrete project ideas, so when the coordinators sent out the list of proposed topics, I was happy to find something that sort of fell into the crypto bucket and seemed to combine CS with maths. That something was a project on primality testing, and all the interesting theory and algorithms behind it. While I had absolutely no idea about the field before (except perhaps a cursory glance at the Sieve of Eratosthenes), I’m pretty proud of what I was able to do in my three weeks here.

My project specifically centered around probabilistic primality testing. In layman’s terms, they’re tests that differentiate prime numbers from non-prime, or composite, ones and that do this differentiating with some element of randomness. These primality tests are essential in modern cryptography, ensuring the security and reversibility of encryption and decryption in most systems. For the number-theory inclined, I analyzed the Fermat, Euler (Solovay-Strassen), and Miller-Rabin tests, specifically. I looked at the accuracy, or the number of composite numbers marked as prime (in essence, wrong results), and the efficiency, or the wall time elapsed to test the given numbers, for each test.

There also exist deterministic primality tests, or tests that verify the primality of a number with none of the randomness, and with 100% accuracy. For example, the Agarwal-Kayal-Saxena (AKS) test is notable for being a deterministic primality test, and one that runs in polynomial-bounded (more efficient than other options) time, at that. However, though deterministic tests are perfectly accurate and return no incorrect results, they’re very slow compared to the probabilistic tests. This is why probabilistic primality tests are usually used instead - they’re just more practical.

My research also included a proposal for a new variant of AKS, which I called probabilistic AKS. It replaced the computationally expensive and extensive checks needed to verify primality with fewer ones, at the cost of returning some incorrect results (as with all probabilistic tests). This test ran significantly faster than the deterministic algorithm, reducing the bounds from soft-O((log n)^15/2) to soft-O(k(log n)^5), further dropping to soft-O(k(log n)^4) with the best case scenario.

All this analysis was done through Python (more on the technical implementations later) and a bunch of data conversion scripts that pulled the results out to a CSV file. I then collected all the data into a couple central Excel sheets for graphing. All the pretty figures and results are available in the actual paper and in the project repository.

I chose this project as I wanted to work on something related to my existing interests of CS and crypto, but I also wished to explore something new. That something new ended up being a whole bunch of modular arithmetic and the world of number theory - a decent challenge, given that I’d never really looked at these concepts systematically before. I was also looking forward to having a mentor with me, which made me feel more confident to step outside my knowledge comfort zone and engage in something that’d really let me learn something new.

My Saviour, Numba

Even with my mentor, Pressiana Marinova, and her consistent support and guidance, there were a couple obstacles throughout this whole experience. One of which, and also the title of this section (you’ll see why in a bit), was optimization. I initially had written my primality test implementations in Sage, a Python-like mathematics system that I’d heard of previously. I knew its more advanced capabilities for maths were quite powerful, so I thought I’d try that. I also thought it’d be faster than raw Python - but I was incredibly wrong. Once I got to writing the Miller-Rabin tests (read, the primality tests with more computations), testing even 10000 integers was becoming frustratingly slow. I tried switching to pure Python, which delivered a slight speed increase, but it was my friends (thanks Pranav and Sayam!) who suggested I take a look at Numpy and, the saving grave of my entire project, Numba. These compile Python code down to lower-level languages, which, after a brief but difficult struggle with decorators and external functions, sped up my code from an average runtime of five minutes to five seconds. Pretty significant, and incredibly liberating (especially when I started implementing deterministic AKS - I don’t think I could have gotten reasonable data for even a hundred integers without Numba).

Learning the background material itself was also a challenge. I had a vague understanding of some of the modular arithmetic and group theory from my past forays into crypto with Cryptohack. However, trying to wrap my head around all this new notation in the context of primality testing and the papers I had to read was a challenge. The first couple days of SRS were spent speedrunning modular arithmetic on Khan Academy and binge-watching Fermat’s Little Theorem proofs off YouTube. Still, trying to comprehend all the theory was pretty difficult, especially since primality testing is a complex topic with a decent amount of prerequisite knowledge, so there wasn’t a lot of easily accessible explanations of proofs or the tests. Then again, the project was an amazing crash course in number theory and modular arithmetic, and I’d like to extend endless thanks to my amazing mentor and the resources she provided - without them, I couldn’t have cracked even the first couple days’ worth of material.

The last major struggle I had was dealing with the program deadlines. I kept up a pretty reasonable working pace throughout the entire camp, but I still ended up speedrunning the paper and presentation last minute. I wrote a good chunk (pretty much the entire analysis of results and discussion) on the day before the deadline day. To my credit, it was still a decent amount of time in advance, since I was behind Bulgarian time by ten hours. I also started the presentation completely from scratch (and with absolutely no knowledge of Beamer) the day before I was scheduled to present my paper - not something I’m most proud of. I ended up not having a ton of prep time, but I managed to muddle my way through satisfactorily. I remember planning to have finished my paper by the beginning of the last week, and having an entire week to prep my presentation, but that was certainly not the case. I don’t even think it was due to time management - the program just crams a lot of material into a short period of time.

The Cult of Doner

But it wasn’t all work - there was an appropriate amount of fun and games as well, even if I was an entire 9271km away from the Bulgarian students. There were a couple social nights and weekend activities over the course of the program, especially in the first week. Our online quiz night and extraordinarily cursed Cards Against Humanity game stand out as my favourite memories. I also enjoyed receiving a very thorough Bulgarian education on everything from Bulgarian music, food, innovations, and culture in our SRS Discord, which I unfortunately started.

Speaking of starting the Discord, I’m quite glad we had that community as well, since I would have never been graced with the Bulgarian sensation of doner (or duner, apparently). I don’t think it even originated in Bulgaria (at least, the places I found in Vancouver were either advertising ‘Berlin-style’ or ‘authentic Middle-Eastern’, so take that as you will), but regardless: doner is a pita-style wrap-sandwich-burrito-esque concoction. And it’s apparently really popular among the Bulgarian students of SRS: the literal second day the server had been running, I woke up to a deluge of doner pictures and discussion. Over the course of the week, I’d come to learn what doner was, as well as agree to trying it. I eventually stopped by this little place in Vancouver to try it, thereby fulfilling my SRS doner bet duties.

Conclusion

Overall, I really, really liked SRS. As my first research experience, I gained valuable skills and insights regarding the whole academic procedure, things I’m sure to apply in my future studies. However, I wish the program were longer, since it felt a bit like I was cramming an entire thesis process into three weeks. I didn’t find that as enough time for the theory to really sink in, as well as not enough time to finish up my analysis to my standards. I also wish I’d have been able to see what SRS was like in person - the photos and updates the Bulgarian students were posting were very enticing. Unfortunately, COVID was a major obstacle (and apparently they had a slightly scuffed location), but it would have been nice to get to know fellow participants more. As well, I would have been able to have closer feedback loops with my mentor - but that’s besides the point. All in all, it was still an amazing experience, and has left me with incredible memories.

I’d like to thank my mentor for her hard work and support throughout this entire rollercoaster ride of a program. Her clear explanations and daily checkins helped me get through even the worst obstacles. As well, much gratitude to the HSSIMI and SRS coordination team - I really appreciate all the work the team’s put in to make the experience enriching and fulfilling, even for international students half a world away.

In the future, I plan to continue tinkering with my analysis, and finish editing my paper for potential publishing - I’ve been looking at a couple youth journals in Canada. I’d like to adjust my testing bounds, as well as iron out a couple inconsistencies in the amount of data collected due to time constraints during SRS. I also plan on diving deeper into the theory and underlying system of the AKS test to see if I can improve the probabilistic version more. I’m decently proud of what I’ve managed to produce given three weeks’ time, but I can’t help but focus on the tiny things I’d like to fix. SRS has inspired me to put more time into considering research opportunities like this, but we’ll see how that all pans out as school starts.

‹ go back