At the beginning of my freshman year at Stanford, I set out to shake a thousand hands. In January, I ended the challenge and stopped counting handshakes.

Here’s a look into what I realized along the way, where I went wrong when embarking on this challenge in September, and some other updates from my freshman year thus far –

A Principle from Reinforcement Learning

I begin with a brief scenario that will seem unrelated to handshakes at first but, in fact, illustrates a key underlying principle.

Imagine the pit boss of a casino has given you a total of 1000 free plays to spread among the casino’s 1000 slot machines, and you’re completely unfamiliar with the casino. Your goal is to make the most money from the 1000 slot machines using your 1000 plays. Suppose that you use your first 100 plays on Slot Machine 1, at which point you hit the jackpot and win $100k from Slot Machine 1. Extrapolating from that rate of $100k per 100 plays, you can estimate that, if you used your remaining 900 plays on Slot Machine 1 as well, you’d make $1M in total.

However, now, you have a dilemma. You haven’t tried playing the other 999 slot machines, and they could be better or worse than Slot Machine 1. Maybe Slot Machine 257 gives you $100k every time you play, but Slot Machine 926 loses every time you play; at this point, you have no idea. You’re now left with the choice of deciding if you want to “exploit” Slot Machine 1 and use your remaining 900 plays there, or if you want to “explore” other slot machines systematically to see if you can identify others that are even better than Slot Machine 1.

This “explore” vs “exploit” tradeoff is at the heart of a branch of machine learning called reinforcement learning (RL), which tries to optimally solve problems similar to the slot machine problem above. In RL, an agent finds itself in an unknown world. It can observe the state of the world, take actions based on the states it observes, and receive rewards in return; it then uses this experience to construct a policy that maximizes the reward it receives.

In the above example, an RL agent would run through the scenario thousands of times, developing an optimal strategy (maybe, for instance, once it finds Slot Machine 257, it realizes the best strategy is to use all 1000 pulls there). While, in principle, the agent could try every possible combination of using its 1000 pulls on the 1000 slot machines, that would take a very long time to compute. The “explore” vs “exploit” tradeoff is important in RL to speed up the learning process — the more efficiently the agent can try out a variety of different strategies, the more quickly it can arrive at the optimal strategy. In other words, it is best at first for the agent to prioritize exploring to learn about the slot machines, then once it is confident that it has sufficiently explored, it can “exploit” the slot machines that pay the most on average.
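To make the “explore first, then exploit” idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how a full RL system is built: the function name `run_bandit`, the payout probabilities, and the simplification that each machine pays either 1 or 0 are all hypothetical and chosen for brevity.

```python
import random


def run_bandit(payout_probs, plays, explore_plays, seed=0):
    """Explore-then-exploit on simulated slot machines (illustrative only).

    Assumption: machine i pays 1 with probability payout_probs[i], else 0.
    For the first `explore_plays` plays, a machine is chosen uniformly at
    random ("explore"). After that, the machine with the best observed
    average payout is always chosen ("exploit").
    Returns (total_winnings, pulls_per_machine).
    """
    rng = random.Random(seed)
    n = len(payout_probs)
    pulls = [0] * n    # how many times each machine has been played
    wins = [0.0] * n   # total payout observed from each machine
    total = 0.0

    for t in range(plays):
        if t < explore_plays:
            arm = rng.randrange(n)  # explore: try any machine
        else:
            # exploit: best observed average; unplayed machines count as 0,
            # and ties go to the lowest-numbered machine
            arm = max(range(n),
                      key=lambda i: wins[i] / pulls[i] if pulls[i] else 0.0)
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward
    return total, pulls


if __name__ == "__main__":
    probs = [0.1, 0.2, 0.05, 0.8, 0.3]  # hypothetical payout probabilities
    for explore_plays in (0, 100, 1000):
        total, pulls = run_bandit(probs, plays=1000, explore_plays=explore_plays)
        print(f"explore {explore_plays:4d} plays -> winnings {total:.0f}, pulls {pulls}")
```

With `explore_plays=0`, this sketch never leaves the first machine, much like committing to Slot Machine 1 without trying the others. With `explore_plays=1000`, it never uses what it has learned. A value in between lets it sample the machines and then concentrate on the one that looks best.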

How “Explore” vs “Exploit” Relates to the Handshake Challenge

Now, back to the handshake challenge. Coming to Stanford as a freshman, I found myself in an unknown environment, and like everyone else, my goal was to socialize “well.” The idea of socializing “well,” however, is not as clearly defined as maximizing the payoff from playing slots, and certainly varies from person to person. Roughly, though, this involves a golden mean on the axes of quality and quantity. I sought to find high-quality people with similar interests whom I gained value from talking to, and simultaneously to find the right amount of my time to spend socializing.

In the slot machine example, the optimal strategy was for the agent to spend its initial trials efficiently exploring the slot machines; then, once it has found the slot machines with the highest payoff, it can exploit only those. Clearly, there must be a balance between “exploring” and “exploiting.” In the social context, you have a limited amount of social time to spend: “exploring” means spending that time meeting unfamiliar people, while “exploiting” means engaging with your existing friends.

This leads me to the shortcoming of the handshake challenge. I feel as though I generally came in with the right goals, but the handshake challenge was problematic because it encouraged me only to “explore,” never to “exploit.” Initially upon entering Stanford, shaking hands with a lot of people made a lot of sense. After all, it was aligned with what truly is optimal early on (exploring a wide variety of new people). As time went on, though, I felt that I should “exploit,” or spend some of my social time going deeper with some of the interesting people that I met during my period of exploration. The handshake challenge, however, kept rewarding only new handshakes.

Ultimately, this was the reason that I quit the handshake challenge – I wanted to let this “explore” vs “exploit” balance take its natural course, rather than artificially forcing myself to “explore.”

I want to mention a key caveat that differentiates socializing at Stanford from playing slot machines. When playing slot machines, the agent is best off “exploring” for some time, then exploiting once it finds the high-payoff slot machines. It never has a reason to return to exploration, because it would be less profitable to explore low-payoff slot machines once it knows how to optimally “exploit” the best slot machines.

However, in the social example, this same approach does not quite work, which goes back to the point that “socializing well” is not as clearly defined. If, after an initial period of meeting people, I decided to “exploit” the 5 most interesting people that I had met by spending 100% of my time with them, that would be quite troublesome. In real life, there always is value in meeting new people, even if one has already “explored” a fair amount. This was one of the positive aspects of the handshake challenge – it encouraged me not to get complacent with just interacting with a small set of people. In this social context, I believe that the optimal end result is not to just “exploit” by having a small set of friends, as it was in the slot machine example, but rather to arrive at a balance between socially “exploring” and “exploiting” that leans towards “exploiting.” For instance, maybe it’s best for 20% of your time to be spent with unfamiliar people and 80% of your time to be spent with your close friends; these optimal percentages likely vary between people. Either way, always exploring or always exploiting is troublesome, so the handshake challenge was slightly misguided.
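In slot machine terms, a fixed 20/80 split like this resembles what is commonly called an epsilon-greedy strategy: on each play, explore with some small probability and otherwise exploit. The sketch below is a rough illustration only; the name `epsilon_greedy`, the default of 0.2, and the 1-or-0 payouts are assumptions made for the example, not a claim about what the right split is.

```python
import random


def epsilon_greedy(payout_probs, plays, epsilon=0.2, seed=0):
    """Keep exploring a fixed fraction of the time (illustrative only).

    Assumption: machine i pays 1 with probability payout_probs[i], else 0.
    On each play, with probability `epsilon` a machine is chosen at random
    ("explore"); otherwise the machine with the best observed average is
    chosen ("exploit").
    Returns (total_winnings, pulls_per_machine, explore_count).
    """
    rng = random.Random(seed)
    n = len(payout_probs)
    pulls = [0] * n
    wins = [0.0] * n
    explored = 0
    total = 0.0

    for _ in range(plays):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: an unfamiliar choice
            explored += 1
        else:
            # exploit: best observed average; unplayed machines count as 0
            arm = max(range(n),
                      key=lambda i: wins[i] / pulls[i] if pulls[i] else 0.0)
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward
    return total, pulls, explored


if __name__ == "__main__":
    probs = [0.1, 0.2, 0.05, 0.8, 0.3]  # hypothetical payout probabilities
    total, pulls, explored = epsilon_greedy(probs, plays=1000, epsilon=0.2)
    print(f"explored on {explored} of 1000 plays, winnings {total:.0f}, pulls {pulls}")
```

Setting `epsilon=0.0` corresponds to only ever exploiting, and `epsilon=1.0` to only ever exploring, which is roughly what a pure handshake count rewards.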

Funnily enough, I think you can roughly see these natural periods of “exploring” and “exploiting,” when I was meeting more or fewer people, on my progress tracker below –

[Graph: handshake challenge progress tracker]

Analyzing my Initial Motivations for the Handshake Challenge

After applying the “explore” vs “exploit” principle from RL to the handshake challenge, you may think it was foolish because it created a suboptimal artificial incentive around socializing. Honestly, after writing the above, I started to convince myself of the same, and wondered, “Why did I do this in the first place?”

As it turns out, I answered this exact question in a section of my original blog post about the handshake challenge. Re-reading it, I don’t think it was quite so foolish, and I recall why it was a compelling challenge to undertake. Overall, I think my intentions were valid; I just misperceived the actual effects of the challenge at the time of writing. Here are my specific reflections on the reasons why I began the challenge, taken directly from the previous blog post –

1] Making the World Better and Diversity – My mission is to make the world a better place, and in the original blog post I argued that the only way to benefit those around me is by meeting them first. Additionally, there indeed is intrinsic value in meeting a diverse set of people and learning about their different perspectives. However, being willing to meet new people and having a challenge that forces you to meet new people are two very different things. Again, I’m still quite open to meeting new people, but there definitely needs to be a balance between spending time with familiar and unfamiliar people.

2] Helping Each Other – I still stand by everything that I said in the original blog post on this point. While the goals of helping people when they are in need and getting help from others when I am in need are valid, I now believe that they are better accomplished by a balance of spending time with familiar and unfamiliar people.

3] Build Personal Relationships – In the original blog post, my claim was that, in order to identify the people you want to get closer to in college, you first should meet a wide variety of people. This is true – an initial period of “exploration” is indeed necessary in order to be able to “exploit” later. However, in the initial blog post, I mistakenly extrapolated this fact to conclude that “exploring” would be the best use of my time for the entire year. In reality, though, there is a time and place for “exploiting” – “exploring” is not always the best option.

That’s all I’ve got about officially closing the handshake challenge. I hope that you can gain some insight from seeing the progression of my thoughts on this topic from both the beginning and end of my first year of college. 

It was fun while it lasted though; here’s a photo of handshake #191 with John Doerr –

[Photo: Raj Pabari with John Doerr at Stanford]

Three Other Updates from My Freshman Year So Far

It’s been pretty quiet on my end regarding updates for the past few months, both on my social media and on my blog. There’s way more that’s happened since my last update, other than ending the handshake challenge. Here are three notable updates –

1] Major – I switched my intended major from Management Science and Engineering (MS&E) + Computer Science to Math + Computer Science. Because my long-term goal is to found an impactful tech startup, when I came to Stanford, I thought that studying MS&E + CS would be a good way to prepare me for both business and tech. However, after taking an MBA course and a challenging master’s-level MS&E course, I felt like I wanted to try something new. The business principles I learned were already fairly intuitive to me given my previous experience operating a startup and teaching a business course.

After Fall quarter, I decided that I wanted to explore a different discipline. I ventured out and tried math for the first time in Winter quarter. I conjectured that proof-based math would push me to think in a new way. My homework is a collection of new theorems, and I am tasked with providing a proof for each of the theorems. When solving these problems, I combine the seemingly disparate propositions that my professor mentioned with my own intuition and argumentation to arrive at a cogent proof. Notably, I believe this is quite distinct from the other ways in which I am familiar with solving problems, for instance when writing software algorithms or iterating my startup’s minimum viable product.

In Winter quarter, I took two proof-based math classes – Linear Algebra and Graph Theory. I went from never having written a proof to having written hundreds by the end of the quarter. I believe learning the aforementioned problem-solving approach has opened me up to a new way of thinking. I truly do consider problems in a slightly different way, which is an invaluable takeaway to leave a class with. The skills that one learns in MS&E are valuable, but I’ve been able to learn them outside of the classroom through my startup and clubs, while I feel that the problem-solving techniques I’m learning in math are best learned inside the classroom.

This quarter, I’m taking another intense proof-based math course that is a successor to real analysis. I’ve coupled this with a computer science course on Deep Reinforcement Learning and a philosophy course on Ethical Theory, which has created a well-rounded quarter that balances my newfound interest in math with my longtime pursuit of AI and passion for philosophy. Overall, I’m learning a lot and now really enjoying my classes 😀

2] Startup – I built and launched a startup called SalesBoom.AI, which harnesses generative AI to conduct personalized outreach at scale. Building SalesBoom was a good exercise in building a scalable platform that applies modern AI to solve a real problem. Additionally, because I launched SalesBoom in Winter quarter, it helped me simultaneously exercise the “move fast and break things” approach to problem solving alongside rigorously proving theorems in my math classes. It’s been an exciting journey so far building the product, launching it on social media, and gaining initial customers. Check out my product’s demo video below!

3] Running – Recently, I ran the 4x4x48 challenge created by David Goggins, where I ran 4 miles, every 4 hours (on the hour), for 48 hours straight. It was quite an insightful experience, and I learned quite a lot about myself and the nature of conquering seemingly impossible challenges. The next blog post will be dedicated to my takeaways from the challenge, but if you’re curious in the meantime, check out my Instagram story highlight where I filmed a short update live after each leg of the challenge.

[Photo: Raj Pabari, David Goggins 4x4x48 challenge]

It won’t be as long before the next blog post about the 4x4x48 challenge 🙂

Until then…

Thanks for reading this blog post! Feel free to contact me or leave a comment.


Raj Pabari

Raj Pabari is a driven, inquisitive, outgoing self-starter with a passion for learning and inventing. A student at Stanford University, he sees his future innovating at the intersection of technology, business, and impact. Learn more about Raj: rajpabari.com

