Text Analysis of The Good Place

Exploring NBC’s The Good Place with Text Analysis Methods in R

Dara Tan
Oct 19, 2020

This past summer, I spent my time in between summer classes and extracurriculars re-watching NBC’s The Good Place from start to finish and learning to use R for data and text analysis with the help of R for Data Science and Text Mining with R. I wanted to combine these two activities, so I transcribed the show with reference to Netflix subtitles, then applied the techniques I learned to explore the resulting data. I know that “pobody’s nerfect”, but in the spirit of the show, I decided to try. The results of my attempt have been organized into the analysis below.

Emotional Journeys

Despite being a comedy, The Good Place takes viewers on an emotional roller coaster; the show is insightful, witty, profound, heartwarming, melancholic, cheery and suspenseful all at once. Diving into this project, I wondered how these qualities played out in the lines spoken by the show’s characters. To investigate this, I conducted a sentiment analysis of the lines of the six main characters: Chidi, Eleanor, Janet, Jason, Michael and Tahani, collectively known as the Soul Squad.

Group Overview

I began by looking at the sentiments expressed by the group as a whole to get an overview of their emotional journey over the course of the series. To do so, I obtained the sentiment values for the six characters using the AFINN lexicon, then summed them by chapter and weighted them to account for differences in chapter lengths. The results are shown in Figure 1 below.
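The scoring and weighting step can be sketched roughly as follows. This is a minimal illustration, not the code behind Figure 1: the `transcript` table, its column names and the `toy_afinn` values are hypothetical stand-ins, and dividing by each chapter's total word count is only one plausible way to account for chapter length. With the tidytext package, the real lexicon would come from `get_sentiments("afinn")`, which uses the same `word` and `value` columns.

```r
library(dplyr)
library(tidytext)

# Hypothetical transcript: one row per spoken line.
transcript <- tibble(
  chapter   = c(1, 1, 2, 2, 2),
  character = c("Eleanor", "Chidi", "Eleanor", "Michael", "Shawn"),
  text      = c("This place is amazing and I love it",
                "I am so happy to help you",
                "This is terrible and I hate it here",
                "What a horrible awful disaster",
                "You are all terrible")
)

soul_squad <- c("Chidi", "Eleanor", "Janet", "Jason", "Michael", "Tahani")

# Toy stand-in for the AFINN lexicon (illustrative values, same columns).
toy_afinn <- tibble(
  word  = c("amazing", "love", "happy", "help",
            "terrible", "hate", "horrible", "awful", "disaster"),
  value = c(4, 3, 3, 2, -3, -3, -3, -3, -2)
)

chapter_sentiment <- transcript %>%
  filter(character %in% soul_squad) %>%     # keep only the six main characters
  unnest_tokens(word, text) %>%             # one lowercase word per row
  left_join(toy_afinn, by = "word") %>%     # words not in the lexicon get NA
  group_by(chapter) %>%
  summarise(
    total_words = n(),
    raw_score   = sum(value, na.rm = TRUE),
    weighted    = raw_score / total_words   # assumed length adjustment
  )

print(chapter_sentiment)
```

Plotting `weighted` against `chapter` as a bar chart, colored by sign, would give a chart in the spirit of Figure 1.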

Figure 1: Soul Squad’s Sentiments by Chapter

In Figure 1, the clusters of red bars, particularly the one between Chapters 16 and 26, instantly stood out. This long stretch of negative sentiments spanned most of Season 2, during which the dead humans were psychologically tortured, Michael struggled to learn ethics after Vicky usurped his leadership position and the Soul Squad as a whole had to walk through the Bad Place in plain sight in an attempt to escape to the Good Place. Given the numerous stressful turns that their journey took, it is unsurprising that the Soul Squad, then known as Team Cockroach, expressed rather negative sentiments during this period.

Of the green bars, my attention was drawn to the cluster in the beginning of the series, from Chapter 1 to 6. These bars correspond to the period during which the dead humans were still under the illusion that they were in the Good Place, despite actually being subjects in Michael’s psychological torture experiment at the time. The positive sentiments found in their lines during this period no doubt aided in misleading the audience as to the show’s true setting, which then contributed to the effectiveness of the iconic reveal in the Season 1 finale.

Highlighting Specific Group Members

I followed my exploration of the experiences of the group as a whole with a closer look at the sentiments of each Soul Squad member individually. For this, I used the NRC lexicon to categorize words into eight sentiments: anger, disgust, fear and sadness, the four negative sentiments, as well as the four positive sentiments of anticipation, joy, surprise and trust. Again, I aggregated the sentiments by chapter and weighted them to account for differences in chapter lengths. The analysis that follows walks through the three sets of results I found most interesting.
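A per-character version of the same idea might look like the sketch below. The `transcript` rows and the `toy_nrc` table are hypothetical stand-ins (the real lexicon is available through `get_sentiments("nrc")` with the same `word` and `sentiment` columns), and the weighting, here a share of the character's words in that chapter, is an assumption rather than the post's exact formula. Because one word can carry several NRC sentiments, the join is declared many-to-many, which needs dplyr 1.1.0 or later.

```r
library(dplyr)
library(tidytext)

# Hypothetical lines for a single character.
transcript <- tibble(
  chapter   = c(7, 7, 48),
  character = c("Chidi", "Chidi", "Chidi"),
  text      = c("I am afraid this lie is a disaster",
                "My stomach hurts and I am afraid",
                "I finally found hope and peace")
)

# Toy stand-in for the NRC lexicon; a word may map to several sentiments.
toy_nrc <- tibble(
  word      = c("afraid", "lie", "lie", "disaster", "disaster",
                "hope", "hope", "peace"),
  sentiment = c("fear", "anger", "disgust", "fear", "sadness",
                "anticipation", "joy", "trust")
)

words <- transcript %>% unnest_tokens(word, text)

# Words spoken by each character in each chapter, used as the weight.
chapter_lengths <- words %>%
  count(character, chapter, name = "total_words")

character_sentiment <- words %>%
  inner_join(toy_nrc, by = "word", relationship = "many-to-many") %>%
  count(character, chapter, sentiment, name = "n") %>%
  left_join(chapter_lengths, by = c("character", "chapter")) %>%
  mutate(share = n / total_words)   # assumed length adjustment

print(character_sentiment)
```

Filtering `character_sentiment` to one character and drawing one line per sentiment across chapters would produce a chart along the lines of Figures 2 to 4.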

Figure 2: Chidi’s Sentiments by Chapter

Figure 2 shows the graph for Chidi, in which two chapters immediately caught my eye. The first was Chapter 7, The Eternal Shriek, which rated the highest levels of all four negative sentiments. In this chapter, Chidi was faced with a massive dilemma: he could either help Eleanor remain in the Good Place by keeping her secret, or he could reveal that she did not belong to save Michael from an excruciating retirement. As Eleanor put it in a later episode, decisions were “Chidi kryptonite”, but this situation was particularly stressful for Chidi because he had to choose between honoring an earlier promise to Eleanor and knowingly supporting a lie with dire consequences. For Chidi, a staunch Kantian, neither option was even remotely permissible, leaving him between a rock and a hard place, to put it lightly. These impossible circumstances provide context for the high levels of anger, disgust, fear and sadness that he expressed in this chapter.

Also of note in Chidi’s graph is Chapter 48, The Answer, where the peaks of all four positive sentiments coincided. By presenting defining moments from Chidi’s lives and afterlives, this chapter mapped out his long-standing obsession with finding ‘the answer’, beginning with a ‘marriage-saving’ lecture he gave as a child which convinced him that every question had an answer, through relationships that suffered because he was “incapable of making a single decision”, to advice he received from other Soul Squad members and ending with the serenity he achieved right before he was rebooted at the end of Season 3. Although this chapter revisited some unpleasant moments from his past, the enlightenment that Chidi eventually achieved, which enabled him to overcome his lifelong Achilles’ heel, explains his positive tone in this chapter overall. Furthermore, the remarkably high level of anticipation detected in his lines mirrored my experience as an audience member watching with bated breath each time Michael’s drawn-out snap was used to transition between scenes.

Figure 3: Janet’s Sentiments by Chapter

Another character whose graph intrigued me was Janet. Relative to other Soul Squad members, she had consistently low ratings in every sentiment for most of the chapters, probably because, as a non-human “anthropomorphized vessel of knowledge”, emotions did not come as naturally to her. Her lines in Chapter 20, Janet and Michael, however, caused substantial peaks in all four negative sentiments as well as all four positive ones. Reviewing the plot of this chapter, one can see why. Not only did Janet have to help Jason, whom she was in love with, improve his relationship with Tahani, but she also caused seemingly unexplained glitches in the neighborhood, creating trouble which she thought could only be resolved by activating her self-destruct mechanism. Michael’s refusal to do so then led to an emotionally charged discussion in which she questioned his motives, ultimately driving him to admit, “The reason is friends!” In a nutshell, this chapter placed Janet in many arguably human situations that exposed her to a gamut of human emotions. The uncharacteristically intense sentiments she expressed in this chapter were likely due to this myriad of emotions, which ranged from the devastation she felt watching the man she loved pursue a relationship with someone else to the warm friendship extended to her, both from Michael in his reluctance to lose her and from Eleanor who offered her advice on heartbreak.

Figure 4: Tahani’s Sentiments by Chapter

The final graph which piqued my interest was Tahani’s. At first glance, I noticed that, contrary to the findings from my analysis of the Soul Squad as a whole, she appeared to express more positive sentiments than negative ones. This could indicate that she dealt with her emotions in “the British way”, which she once described as smiling bravely, burying one’s feelings and allowing a steady drizzle to slowly wash away one’s sadness.

For me, this general preference for positive words served to highlight the peaks in her negative sentiments. In Chapter 33, A Fractured Inheritance, which produced the highest ratings of fear and sadness, Tahani tried to mend her relationship with her sister, Kamilah, whom she had always regarded with jealousy. Eventually, she realized that although her sister had constantly received the approval from their parents which she craved but never got, her sister was just as much a victim of their parents being “wankers” as she was: Kamilah felt “just as alone” as she did. This recognition in turn facilitated their reconciliation. In Chapter 42, Chillaxing, which produced the highest ratings of anger and disgust, Tahani attempted to connect with John by having Janet recreate exclusive and luxurious experiences for him. However, when she later brought up his need for self-improvement, he took it poorly, lashing out and labeling their week spent together as a “wind-up to a sucker punch”. Finally, it struck her: John’s unhappiness stemmed from his preoccupation with the velvet rope, a flaw that she too had while she was alive. This understanding then helped her to craft a suitable course of study for him.

Both of these chapters marked significant turning points for Tahani. In each, she explored a long-held and defining personal flaw, worked through painful memories and ultimately drew on her own experiences to connect with and help someone else. While both journeys were far from smooth, as reflected in the high ratings of the negative sentiments, they each had uplifting conclusions, as is clear from the concurrently high positive sentiment ratings, particularly in joy and trust.

Characteristic Words and Phrases

Having analyzed the emotional journeys of the six main characters, I started exploring a different aspect of their characterization: their speech patterns. For this, I employed TF-IDF, a statistic that quantifies the importance of a word to a document in a collection, to identify frequently used unigrams, or single words, and bigrams, or two-word phrases.
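To make the statistic concrete, here is a small worked sketch that computes TF-IDF by hand with dplyr. The word counts are invented for illustration, and the formula shown (term frequency times the natural log of the inverse document frequency) is the common definition also used by tidytext's `bind_tf_idf()`; it is an assumption that the post used exactly this variant.

```r
library(dplyr)

# Hypothetical word counts, with each character treated as one document.
word_counts <- tibble(
  character = c("Good Janet", "Good Janet",
                "Bad Janet", "Bad Janet",
                "Neutral Janet", "Neutral Janet"),
  word      = c("void", "hi", "ugh", "hi", "accountant", "hi"),
  n         = c(6, 4, 5, 5, 3, 7)
)

n_docs <- n_distinct(word_counts$character)

tf_idf <- word_counts %>%
  group_by(character) %>%
  mutate(tf = n / sum(n)) %>%                           # share of the document
  group_by(word) %>%
  mutate(idf = log(n_docs / n_distinct(character))) %>% # rarity across documents
  ungroup() %>%
  mutate(tf_idf = tf * idf) %>%
  arrange(desc(tf_idf))

print(tf_idf)
```

In this toy data, 'hi' is spoken by all three documents, so its inverse document frequency is log(3/3) = 0 and it drops to the bottom no matter how often it is said, while a word confined to one document rises to the top.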

Differentiating Characters

One unique aspect of The Good Place that I noticed as a viewer and especially while creating the transcripts I analyzed in this project was the significant number of actors who played multiple distinct characters within the series. Many of these were actors that portrayed demons masquerading as Good Place neighborhood residents. Arguably the most outstanding example, however, was D’Arcy Carden, who breathed not-a-human life into every Janet-looking character in the show. Not only did she expertly mimic Chidi, Eleanor, Jason and Tahani’s speech patterns and mannerisms in the highly-acclaimed episode Janet(s), she also played each of the four different Janet types: Bad Janet, Disco Janet, Good Janet and Neutral Janet. Figure 5 below shows the words that were most frequently used by each Janet type, as determined by TF-IDF values. I omitted Disco Janet due to the brief nature of her appearance on the show.

Figure 5: Unigrams by Janet Type

Clearly, the Janet types in Figure 5 are distinguishable not only by their differently-colored signature outfits which inspired the color palette for the graph, but also by their unique word choices. For instance, five of Bad Janet’s top 10 words were words she commonly used in insults, emphasizing the sizable role that insults played in her speech patterns. In contrast, three accounting-related words, ‘accountant’, ‘calculations’ and ‘accounting’, were found to be key to Neutral Janet’s vocabulary, likely because most of her scenes took place in the Accounting Department. Lastly, although there is no discernible theme in Good Janet’s top 10 words, it is worth noting that ‘void’ was the word with the highest TF-IDF value for her. While all Janets have voids, only Good Janet frequently mentions them. Having ‘void’ top her list in TF-IDF values is thus reflective of the significance of voids for Janets, particularly those of the Good variety.

Identifying Unique Characters

When seeking out the most frequently used phrases, I looked specifically at the bigrams spoken by Soul Squad members. Figure 6 shows the most common of such phrases as ranked by TF-IDF values.

Figure 6: Soul Squad Bigrams by Chapter

As can be seen in Figure 6, the most common bigrams were spoken by only three of the six Soul Squad members, with one member in particular, Jason, accounting for seven of the top 10 spots. This initially took me by surprise, as I had expected the phrases to be more evenly spread among the six characters. However, on closer examination, this finding makes sense.

Although some two-word phrases spoken by other Soul Squad members might have been used more frequently based solely on raw counts, TF-IDF balances the sheer number of appearances of a phrase with how rarely it is used across ‘documents’, or characters in this case. To receive high TF-IDF values then, phrases must be both unique to a character and spoken frequently. Jason’s rather limited yet eccentric vocabulary generated phrases that fit this bill well. In fact, some of the phrases featured in Figure 6, such as ‘Blake Bortles’ and ‘Molotov cocktail’ are so specific to his speech patterns and repeated so often that seeing or hearing them, even in other contexts, reminds me of him. Clearly, these iconic phrases set him apart from other Soul Squad members.
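The same balance can be seen with bigrams. The sketch below uses tidytext's `unnest_tokens()` with `token = "ngrams"` and `bind_tf_idf()`; the example lines are made up for illustration and are not taken from the transcripts, and whether the post used these exact functions is an assumption.

```r
library(dplyr)
library(tidytext)

# Hypothetical lines; each character is treated as one document.
lines <- tibble(
  character = c("Jason", "Jason", "Eleanor", "Chidi"),
  text      = c("Blake Bortles is the best",
                "I threw a Molotov cocktail for Blake Bortles",
                "Oh my god what the fork",
                "Oh my god my stomach hurts")
)

bigram_tf_idf <- lines %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%  # two-word phrases per line
  count(character, bigram) %>%
  bind_tf_idf(bigram, character, n) %>%                     # term, document, count
  arrange(desc(tf_idf))

# Jason's phrases, highest TF-IDF first.
jason <- bigram_tf_idf %>% filter(character == "Jason")
print(jason)
```

Here 'blake bortles' is both repeated and exclusive to Jason, so it leads his list, whereas 'my god', which two of the three characters say, receives a lower inverse document frequency.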

Character Relationships

On top of examining the characters of the show individually, I was also curious about the relationships between them. In particular, I wanted to find out the most common scene partnerships among the 15 characters with the most lines. To do so, I used an estimate of 40 lines to divide the show into scenes, then calculated the pairwise correlation of the characters who spoke in each scene. The results are shown in Figure 7 below, with the darkness of the edges representing the strength of the correlations and the colors of the labels signifying the characters’ affiliations: for example, red characters are from the Bad Place, while blue indicates the members of the Soul Squad.
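One way to sketch this step in base R is shown below. The speaker sequence is invented, and the scene length is shortened from 40 lines to 6 so that the toy data stays small. The correlation is the Pearson correlation of appear/does-not-appear indicators, also known as the phi coefficient, which is the measure Text Mining with R describes for pairwise correlation; that the post computed it in exactly this way is an assumption.

```r
# Hypothetical speaker sequence, one entry per spoken line, in show order.
speakers <- c(
  rep(c("Eleanor", "Chidi"), 3),          # lines 1-6
  rep(c("Brent", "John"), 3),             # lines 7-12
  rep(c("Eleanor", "Chidi"), 3),          # lines 13-18
  rep(c("Tahani", "Brent", "John"), 2)    # lines 19-24
)

lines_per_scene <- 6  # the post uses 40; shortened here for the toy data

# Assign every block of consecutive lines to a numbered scene.
scene <- (seq_along(speakers) - 1) %/% lines_per_scene + 1

# Scene-by-character indicator matrix: 1 if the character speaks in the scene.
appears <- (unclass(table(scene, speakers)) > 0) * 1

# Pairwise correlation between the character columns.
scene_cor <- cor(appears)

print(round(scene_cor, 2))
```

Pairs that always share scenes, such as Brent and John in this toy data, score 1, while characters who never overlap score negatively. An edge list for a network graph like Figure 7 could then be built from the upper triangle of `scene_cor`.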

Figure 7: Character Pairwise Correlation

As shown in Figure 7, the strongest pairwise correlations exist between Brent, John and Simone. This makes sense: other than Simone’s brief arc as part of the ‘Brainy Bunch’ in Season 3 and short moments in the final chapter, all three characters were nearly always engaged in the same activities; by extension, they were most often scene partners. Their relationships with other characters differ slightly, however. For Brent and John, their next most frequent scene partner was Bad Janet, who had posed as the Soul Squad’s Janet for an extended period during the Afterlife Improvement Experiment. Lazy and entitled, Brent constantly ordered ‘Janet’ to do his bidding, while John shared numerous scenes with Bad Janet because Tahani often sought Bad Janet’s assistance in creating experiences that she and John could bond over. In comparison, Simone’s next most common scene partner was Chidi. This was to be expected, given that they were romantic partners both in Australia and during the Afterlife Improvement Experiment.

A group of characters whose relationships I was especially interested in was the Soul Squad. I hypothesized that, having shared the spotlight among themselves for most of the show, they would rate highly as scene partners. Yet Figure 7 suggests that, aside from the Eleanor-Chidi and Tahani-Jason pairings, my theory was largely inaccurate. One possible explanation is that, while the members of the Soul Squad worked together on a regular basis, as the main characters their names also occurred with the greatest frequency overall in the show. Since the pairwise correlation evaluated how often two characters appeared in the same scene relative to how often they appeared separately, it can be reasoned that the values obtained for scene partnership among the group were lowered by the large number of scenes each character had. But this caveat adds to the intrigue surrounding the high correlation values for both the Eleanor-Chidi and Tahani-Jason pairings, which coincidentally were the soulmate pairings in Michael’s first version of Neighborhood 12358W. Personally, I like to think that the exceptionally strong correlations found in these two pairings are a testament to the bonds that the characters formed throughout their many lifetimes, by, as Eleanor put it, finding each other “hundreds of times”, over the course of the series.

Unsupervised Classification

Earlier, I analyzed the characters and their relationships, using my knowledge of the show as a viewer to guide my exploration. However, I wondered if a model fit by an unsupervised classification method would be able to pick up on the similarities and differences between the characters as well. To answer this question, I grouped the lines into 10 documents according to the characters that spoke them; each of the Soul Squad members was allotted one document, while the remaining characters were divided into four documents, namely Bad Place, Good Place, Human and Other Immortal. I then used Latent Dirichlet Allocation (LDA) to fit a 10-topic model to the scripts. Figure 8 shows the most common words found in each topic of the model.
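A scaled-down sketch of this workflow is given below, using the topicmodels package together with tidytext. The three documents and their word counts are invented, the model has three topics rather than ten, and the seed is arbitrary; the exact preprocessing and settings used in the post are not stated, so everything here is an assumption for illustration.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Hypothetical word counts, one document per character.
word_counts <- tibble(
  document = c(rep("Chidi", 3), rep("Jason", 3), rep("Janet", 3)),
  word     = c("ethics", "moral", "philosophy",
               "jacksonville", "dope", "jaguars",
               "void", "knowledge", "cactus"),
  n        = c(12, 9, 7, 11, 8, 6, 10, 5, 4)
)

# Cast the tidy counts into a document-term matrix.
dtm <- word_counts %>% cast_dtm(document, word, n)

# Fit the topic model; the seed makes the fit reproducible.
lda_model <- LDA(dtm, k = 3, control = list(seed = 1234))

# Per-topic word probabilities (beta): the top words in each topic.
top_terms <- tidy(lda_model, matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 3) %>%
  ungroup()

# Per-document topic proportions (gamma): which topics each document draws on.
doc_topics <- tidy(lda_model, matrix = "gamma")

print(top_terms)
print(doc_topics)
```

The beta table is what a chart of most common words per topic would be built from, while the gamma table shows how strongly each document is associated with each topic.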

Figure 8: Most Common Words by Topic

Looking at Figure 8, the map from document to topic is immediately apparent for some of the topics. For instance, the words ‘moral’, ‘ethics’ and ‘philosophy’ in Topic 5 are a clear nod to Chidi, while Topic 10, with words such as ‘Jacksonville’ and ‘dope’, has Jason written all over it. Additionally, Topic 4, which features the words ‘party’ and ‘darling’, calls Tahani to mind, while the word ‘void’ in Topic 9 makes reference to Janet. To ascertain the documents that contributed to some of the other topics, however, a key is needed. Figure 9 provides such a guide.

Figure 9: Document-Topic Matches with Large Gamma Values

As can be seen, Eleanor’s lines were split between Topics 3 and 8, with each taking a significant fraction. On the other hand, Topic 2 included lines spoken by characters from both the Good Place and Other Immortal groups. I examine each of these topics further below.

Figure 10: Eleanor’s Words in Topic 3 and 8

Figure 10 above is a comparison word cloud, showing the 50 words each from Topic 3 and 8 that Eleanor used most often. While both topics contain words such as ‘philosophy’, ‘train’, ‘ethics’ and ‘demon’ which are related to Eleanor’s afterlife experiences, Topic 3 seems to make more references to the time she spent being tortured in Michael’s fake Good Place. For example, Topic 3 includes the word ‘version’ which brings to mind the 803 different simulations she lived through, as well as ‘fork’, the automatic substitute invoked whenever she tried to curse in the fake Good Place. On the other hand, Topic 8 appears to focus more on her two real lifetimes, with words like ‘job’ and ‘bunch’, which could refer to the Brainy Bunch from Season 3.

Figure 11: Words by Good Place and Other Immortal Characters in Topic 2

Just as I visualized the two topics containing lines spoken by Eleanor, I also took a closer look at the words in Topic 2 that were contributed by the two ‘documents’, Good Place and Other Immortal. The 50 words from each document that appeared most often are shown in Figure 11 above. As can be seen from the comparison word cloud, prominent words spoken by the Good Place characters include ‘milkshake’, ‘hear’ and ‘compromise’, while characters grouped under Other Immortal, most notably the Judge, contributed words such as ‘earth’, ‘judge’ and ‘human’. Given the distinct words spoken by the characters from these two groups, I did not expect the model to place their lines in the same topic. However, this classification makes sense: most of the characters from the Good Place sat on the Good Place Committee, which often consulted on matters relating to the afterlife points system just as the Judge did.

To quote one of my favorite lines from the show, “Simply put, we are not in this alone.” My analysis has come to an end, but I would love to connect with others who, like me, are fans of The Good Place or are learning about data analysis! Thank you for reading and as Michael would say, “Take it sleazy.”
