TL;DR is, now you get to see numbers and leaderboards associated with your WAYR posts, and you're entered into a raffle drawing for prizes if you post three times in one month. So go and frolic and bash on worst girl and argue whose taste is the most shit and complain about which subreddit you think is better subtly between the lines of a well thought out and introspective analysis of Boob Wars 2.
Large Post Warning - Take your time reading this one. There's a lot here, I'm not expecting you to go through it all in one sitting. You essentially have a month to go through this before you start missing out on anything, worst case you have until the first WAYR thread of the month. But hey, I'm not a cop, I just like to make long posts. Enjoy.
What's up everyone! Name's Arcanus, I've been around for a while; if you recognize the name great to talk to you again, if not then it's nice to meet you. Now if you have been around for a while, then doubtless this screenshot is something you remember from a long time ago. If that's Greek to you, the short and sweet of it is, back in the day this subreddit was managed much more by hand and as a result we had a fair bit more extra features and add-ons. Custom CSS banners for specific VN discussions, weekly threads not even being handled by AutoModerator, and a hand-made WAYR archive, complete with a user leaderboard hosted on the subreddit's wiki.
That screenshot was taken from the only internet archive snapshot that exists of it, circa Winter 2015. Back then, I loved that leaderboard. It was this tiny little corner of the internet that even plenty of subreddit regulars at the time didn't care too much about, but in some sense, it was validation; it was motivation to keep posting in the WAYR threads. More importantly, it gamified the WAYR threads. Gamification is what the word implies: the process of turning any given action, habit, or task into a game with various degrees of interactivity. Something as simple as a leaderboard with one or two numbers representing a "score" can have powerful motivating effects on people, and as a result can increase activity, engagement, and popularity. Over time, however, this leaderboard was abandoned for a number of reasons, the simplest being that it became too arduous to maintain manually, and demand was not high enough to either tackle that issue head on or find automated alternatives.
Nowadays, also for a number of reasons, the WAYR threads aren't pushing the level of popularity and activity that could be seen in "the good ol' days" as it were. Simple as it was, that motivation to get onto and climb the leaderboard is gone; posting in the WAYR thread nowadays is an exercise in self-satisfaction. There's only a small handful of users posting consistently in there - write-ups get fewer upvotes overall, there are fewer comment replies, and AutoMod-chan is the most active user in the threads. Now it's certain that this can grow organically; a lot has happened since the start of the year and especially with things "starting fresh" in a couple senses these weekly threads still need to get their feet under them again. Lately I've been mulling this over, of what there may be that I can do to boost activity in these threads, and one of the things I always found myself going back to was that leaderboard. So, I decided why not make my own?
It is with this thread that I announce the spiritual successor to the classic WAYR leaderboard. But this one will be better. This one will be automatic, this one will have more features, track more, be more public, and incentivize different things. This will be more than what it's based on. Ladies and (mostly) Gentlemen, I present to you a personal project I've been working on: WAYRStats.
The Monthly WAYRStats, Leaderboard, and Competition
What is WAYRStats?
I'm so glad you asked in such an incredibly convenient and properly formatted manner! WAYRStats is a data science script written in Python that passes a search query to the Reddit API and processes the results. In plain English, my script passes a string to Reddit to run as a search on /r/visualnovels; this search string isolates and returns nothing but the WAYR threads. Using this fucking mountain of data I get as a result I can go through ALL that information, parse it in various ways, and set up basically whatever leaderboards/metrics/analytics I (or you!) could want. Later on in this post I'm gonna be breaking down one of the main modules of the script in more detail than you could ever want, so stay tuned for more details. Before that though, I'm excited to announce a new monthly competition revolving around the WAYR posts!
The WAYR Monthly Competition
I say competition but really everyone wins man. The idea behind this is to boost activity in the WAYR threads by providing tangible and intangible incentives to do so. This could work amazingly, this could be downvoted to oblivion, who knows. This competition as it were was the original idea when I came up with this whole project. At the start I asked "what would motivate [user] to participate more in the WAYR threads?" And the answer I came up with is what we all love to see: free shit! The system works like this:
- Post in the WAYR threads three times in one month, and you're eligible for a raffle drawing at the end of said month. Your post must be a parent comment, presumably talking about, well, uhh, What Are You Reading. Subtle, I know.
- Raffle prizes are currently limited to sidebar suggestions and custom text or image flairs. If you have an idea for a prize that seems reasonable, let me or better yet one of the mods know and they will come to a decision about it. The prize selection is small for now, but we plan on expanding the prize pool if this new system gains traction.
- There is no minimum character requirement, no extras for posting essays; post literally one sentence three times in one month and you're golden, but we encourage you to discuss!
- Every month I will be posting a huge thread going over the stats and leaderboards for the month; that thread will also be used to announce the winner of the previous month. I will try to post these threads in the first week of every month, most likely the day of the first new WAYR thread so the data is the most accurate.
That's the super basics of the competition. Talk about what you're reading, get cool stuff. Very simple, no? It took me maybe a day to write the code that determines contest eligibility. I looked at things and just thought "yeah nah there's more I can do here" and so I got to work. Over the past month I've been communicating with the mods about how they want this to go and I'm very pleased with the modules that are in the script now. So now that you know about the competition, let's go on to the next section, breaking down that module.
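For the curious, the eligibility rule above can be sketched in a few lines. This is not the actual WAYRStats code, just a minimal illustration; the function names `eligible_users` and `draw_winner` and the sample usernames are made up, and it assumes you already have one author name per parent comment for the month.

```python
import random
from collections import Counter


def eligible_users(parent_comment_authors, min_posts=3):
    """Return users with at least `min_posts` parent comments this month."""
    counts = Counter(parent_comment_authors)
    return sorted(user for user, n in counts.items() if n >= min_posts)


def draw_winner(users, seed=None):
    """Pick one eligible user uniformly at random; None if nobody qualifies."""
    if not users:
        return None
    return random.Random(seed).choice(users)


# Hypothetical month: one entry per parent comment, across all WAYR threads.
authors = ["alice", "bob", "alice", "carol", "alice", "bob", "bob", "carol"]
pool = eligible_users(authors)
print(pool)               # ['alice', 'bob']
print(draw_winner(pool))  # one of the eligible users
```

Post length never enters into it, which matches the "no minimum character requirement" rule: only the count of parent comments matters.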
WAYRStats
So this is literally the debut so nomenclature is still relatively WIP, but WAYRStats is basically half of what I've put together so far (I'll tell you about the other half in the next section of this post). As I said above I send a search query to isolate the WAYR posts; WAYRStats specifically singles out every WAYR post for the entire month and slowly parses through all the data, building data structures as it goes. One term I'll use consistently throughout here is the word Dictionary, which, if you're unfamiliar with it, is a data structure that stores data in key : value pairs; keys are used to access values. It's a really flexible way of storing data that has a lot of options with it. As a special surprise, y'all will get to see (most of) April's full suite of WAYR analytics as a part of me explaining what all this stuff does and how it does it. I say most of because there's still a few more days for other users to comment in the most recent thread that may alter the data. Once the first WAYR in May goes up I'll prolly run the script again and update it or create another thread. I won't drone on with the opening paragraph, lemme show you what this thing can do.
You know what, fuck it. Have the largest data set first: Average User Data.
Average User Data
Average user data: [user] - [total char. count / num. of posts = avg character count]
[PHNX_Arcanus]----------[25794/6 = 4299] [deathjohnson1]---------[21969/7 = 3138]
[UnknownNinja]----------[19217/5 = 3843] [Some_Guy_87]-----------[19192/2 = 9596]
[alwayslonesome]--------[16856/2 = 8428] [Alexfang452]-----------[15636/5 = 3127]
[KaveAhangar]-----------[13366/2 = 6683] [GitahMuttan]-----------[12837/2 = 6418]
[superange128]-----------[8328/6 = 1388] [greenhillmario]---------[7605/2 = 3802]
[Betteroni]--------------[7554/1 = 7554] [fallenguru]-------------[7410/3 = 2470]
[eiruki]-----------------[6983/2 = 3491] [JayOutslee]-------------[6750/2 = 3375]
[RisingChaos]------------[6616/2 = 3308] [nwl123]-----------------[6377/2 = 3188]
[GorbyVodka]-------------[6063/1 = 6063] [Kiesuu]-----------------[6013/3 = 2004]
[DarknessInferno7]-------[5273/2 = 2636] [SSparks31]--------------[4182/1 = 4182]
[MidgetPanda3031]--------[3946/1 = 3946] [SignificantMaybe]-------[3684/1 = 3684]
[GeneralGom]-------------[3558/3 = 1186] [Stefan474]--------------[3390/1 = 3390]
[Worluvus]---------------[3259/1 = 3259] [SpectrumDT]-------------[3200/1 = 3200]
[iT__jUsT__WoRks]--------[3072/1 = 3072] [Eterna1Ice]-------------[2768/1 = 2768]
[therumisallgone]--------[2745/1 = 2745] [faiiper]----------------[2610/1 = 2610]
[Lastshade01]-------------[2381/3 = 793] [tintintinintin]---------[2333/1 = 2333]
[sorathecrow_]-----------[2117/1 = 2117] [a_pale_horse]-----------[1983/1 = 1983]
[KnightLunaaire]---------[1966/1 = 1966] [caspar57]----------------[1947/3 = 649]
[Inara_Seraph]-----------[1841/1 = 1841] [Zagorz]-----------------[1557/1 = 1557]
[SortaWeeb]--------------[1548/1 = 1548] [drinkyourmilk94]--------[1506/1 = 1506]
[tauros113]---------------[1322/2 = 661] [OdaNova]-----------------[1226/3 = 408]
[OhLookAtMeImSpecial]----[1179/1 = 1179] [AngristIron-Cleaver]----[1173/1 = 1173]
[yolo1234123]------------[1154/1 = 1154] [AssembledVoid]----------[1136/1 = 1136]
[sfisher923]--------------[1108/2 = 554] [ShinKozato]-------------[1050/1 = 1050]
[sirflimflam]------------[1019/1 = 1019] [tostitosruler]------------[888/1 = 888]
[nanogenesis]--------------[878/1 = 878] [WalriderCosplay]----------[855/1 = 855]
[Deost8003]----------------[677/1 = 677] [Adan181]------------------[616/1 = 616]
[YossaRedMage]-------------[590/1 = 590] [Hikagura]-----------------[577/1 = 577]
[Codex28]------------------[515/1 = 515] [August_Hail]--------------[482/1 = 482]
[Oglifatum]----------------[465/1 = 465] [totallyhuman939]----------[415/1 = 415]
[sultonydp]----------------[402/1 = 402] [PlasmaLeaderN]------------[339/1 = 339]
[VeriDF]-------------------[332/1 = 332] [Jazz_Musician]------------[325/1 = 325]
[morphogenic96]------------[313/1 = 313] [davisjryoung]-------------[264/1 = 264]
[Cenriqu3]-----------------[236/1 = 236] [iHicham]------------------[216/1 = 216]
[metroman1]----------------[172/1 = 172] [Koyomi-senpai]------------[166/1 = 166]
[ShoujoKakumeiLea]---------[139/1 = 139] [Nirvash78]----------------[129/1 = 129]
[cerek17]--------------------[70/1 = 70] [chrispy4627]----------------[35/1 = 35]
The number you're wondering about is 74, by the way. This one was one of the earlier modules I put in and the formatting used to be horrible. Here's a sample:
OdaNova...............[1] Total Posts, [528] Total Post Length, [528] Average Post Length.
stealthswor...........[1] Total Posts, [755] Total Post Length, [755] Average Post Length.
deathjohnson1.........[4] Total Posts, [8270] Total Post Length, [2067] Average Post Length.
That's just 3 lines. It did that for every user. Look, we learn from our mistakes. Anyways, another little module attached to it:
Averages: Top 5 for the month
#1: /u/Some_Guy_87[9596]
#2: /u/alwayslonesome[8428]
#3: /u/KaveAhangar[6683]
#4: /u/GitahMuttan[6418]
#5: /u/PHNX_Arcanus[4299]
#6: /u/UnknownNinja[3843]
#7: /u/greenhillmario[3802]
#8: /u/eiruki-----[3491]
#9: /u/JayOutslee-[3375]
#10: /u/RisingChaos[3308]
It says top 5 because I intend to do only top 5s for monthly leaderboards, however this is the debut so I'll give y'all a little extra. This one was fun to do because it pushed me to use dictionaries in a creative way. I created my own data structure here and stored that as the value in the dictionary, so one key could access more than one piece of data associated with it. In this case, your Reddit username serves as the key (which it's gonna do for almost every single module), which accesses both the total character count and the total number of your posts, both of which it adds up as the script goes through the threads one by one. Afterwards, do a bit of math and print it out nice and neat.
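Here's roughly what that "custom structure as the dictionary value" idea could look like. It's a sketch rather than the real module: `UserStats`, `tally`, and `top_averages` are hypothetical names, and the input is assumed to be simple (author, body) pairs. Integer division is used because it matches the truncated averages in the table above (for example 6983/2 = 3491).

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    """Per-user running totals, stored as the dictionary value."""
    total_chars: int = 0
    post_count: int = 0

    @property
    def average(self):
        # Integer division, guarded against users with no posts.
        return self.total_chars // self.post_count if self.post_count else 0


def tally(comments):
    """Build {username: UserStats} from (author, body) pairs."""
    stats = {}
    for author, body in comments:
        entry = stats.setdefault(author, UserStats())
        entry.total_chars += len(body)
        entry.post_count += 1
    return stats


def top_averages(stats, n=5):
    """Return the n users with the highest average post length."""
    ranked = sorted(stats.items(), key=lambda kv: kv[1].average, reverse=True)
    return ranked[:n]


# Hypothetical data: alice posts twice, bob once.
sample = [("alice", "a" * 10), ("alice", "a" * 5), ("bob", "b" * 20)]
for user, s in top_averages(tally(sample)):
    print(f"/u/{user} [{s.total_chars}/{s.post_count} = {s.average}]")
```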
Single Line Statistics
As the name says, these aren't large aggregations of data, rather singular calculations on said large aggregations of data. Say that five times fast. Here's some cool stats for the month:
- The longest post in Apr was written by [Some_Guy_87] on [Apr 22] and had a length of [9964] characters.
- We had an average of [14] comments per thread this month.
- The average length of posts for this month is [2276] characters.
- The Pretty People Coefficient: Percentage of users who have set custom character flairs: [29%]
Alright so the first one is relatively easy, it's a dictionary with a few key : value pairs for username, character count, and post date. It goes through the threads and immediately takes the first comment it sees and calls it the biggest, then compares every comment through the whole month, swapping out the data when a bigger comment comes along.
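A minimal sketch of that "keep the biggest one seen so far" scan, assuming each comment has already been flattened into a plain dictionary with `author`, `body`, and `date` keys (those key names, and `find_longest` itself, are illustrative and not taken from the script):

```python
def find_longest(comments):
    """Return a small record describing the longest comment, or None if empty."""
    longest = None
    for comment in comments:
        length = len(comment["body"])
        # The first comment seen becomes the record; later ones must beat it.
        if longest is None or length > longest["length"]:
            longest = {
                "user": comment["author"],
                "length": length,
                "date": comment["date"],
            }
    return longest


# Hypothetical comments from across the month.
month = [
    {"author": "alice", "body": "x" * 120, "date": "Apr 03"},
    {"author": "bob", "body": "x" * 900, "date": "Apr 22"},
    {"author": "carol", "body": "x" * 450, "date": "Apr 29"},
]
print(find_longest(month))
```

Using a strict `>` means that on a tie the earlier comment keeps the title.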
The next two come in a pair and are a very simple module, hereās the code:
def FindAvgPostInfo(self, submission):
    # Called once per thread: count the thread, then every comment in it.
    self.totalThreads += 1
    for comment in submission.comments:
        self.totalComments += 1
        self.totalCharacters += len(comment.body)
This function is called once per thread, increments the thread count, then for every comment in the thread it increments the count for comments and adds to the total character count. Again, a bit of math and print to console.
The last one is unique and lets me explain a bit how the Reddit API works. The very first call you make in the program is an authentication call to Reddit, which gives you an instance of their API, fancy words for our own little copy of Reddit we can work with. The Reddit object has subreddit objects you can grab from it, so we get the VNs subreddit. We send a search on that subreddit object and we get a list of submission objects. On a submission object you'll find a comments object, then an individual comment, then that comment's author, and then there is an author_flair_css_class object on that. It goes alllllll the way down the rabbit hole, but at least we set up a base camp around the search layer so it's easy to build off of. By checking if that member variable has either no data at all or a default blank value we can get the total percentage of custom flairs.
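The final percentage step could be sketched like this. It deliberately skips the API calls and assumes the flair values have already been collected into a dictionary of username to flair CSS class, where `None` or an empty string means "no custom flair"; the function name and sample data are hypothetical.

```python
def custom_flair_percentage(flair_by_user):
    """Percentage of users whose flair CSS class is neither None nor blank."""
    if not flair_by_user:
        return 0
    # None and "" are both falsy, so a plain truthiness check covers both cases.
    custom = sum(1 for css_class in flair_by_user.values() if css_class)
    return round(100 * custom / len(flair_by_user))


# Hypothetical flair values gathered while walking the threads.
flairs = {
    "alice": "char-flair-01",
    "bob": None,
    "carol": "",
    "dave": None,
}
print(f"Pretty People Coefficient: [{custom_flair_percentage(flairs)}%]")
```

Keying by username means a user who posts in several threads is only counted once.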
The Early Bird Club
The Early Bird Club is easy to figure out - post your review or comment within the first hour of the thread going live, and you get a point. Here's this month's notables:
The Early Bird Club:
#1: /u/UnknownNinja-----[5]
#2: /u/superange128-----[4]
#3: /u/PHNX_Arcanus-----[3]
#4: /u/alwayslonesome---[2]
#5: /u/Alexfang452------[2]
#6: /u/Some_Guy_87------[2]
#7: /u/KaveAhangar------[2]
Like I said, normally I only want to do top 5s for monthly leaderboards, but it doesn't feel fair to 6 and 7 and I haven't implemented a weighted ranking system yet. This one is also easy to explain. By now you know we like dictionaries with usernames as our key, same deal here. By checking your post date against the thread's post date, simply check if the hour value is the same and give a point if so. Sort it and print to console.
After writing this post I realized that because this script only checks the post hour, there is now a 1-hour window every 24 hours to qualify as an early bird; gonna have to update that shit before the end of the month.
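One possible way to close that loophole is to compare the full timestamps instead of just the hour value. This is only a sketch of the idea, not the fix that ended up in the script; it assumes both timestamps are Unix seconds in UTC, and `is_early_bird` and `EARLY_WINDOW` are made-up names. It also covers the case where a thread goes up late in one clock hour and a comment lands early in the next.

```python
from datetime import datetime, timedelta, timezone

EARLY_WINDOW = timedelta(hours=1)


def is_early_bird(thread_created_utc, comment_created_utc, window=EARLY_WINDOW):
    """True if the comment was posted within `window` after the thread went live."""
    thread_time = datetime.fromtimestamp(thread_created_utc, tz=timezone.utc)
    comment_time = datetime.fromtimestamp(comment_created_utc, tz=timezone.utc)
    elapsed = comment_time - thread_time
    return timedelta(0) <= elapsed <= window


# Hypothetical timestamps (Unix seconds).
thread_posted = 1_588_000_000
print(is_early_bird(thread_posted, thread_posted + 30 * 60))         # True
print(is_early_bird(thread_posted, thread_posted + 24 * 3600 + 60))  # False
```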
The Sweet Talker's Club
This was one of my favorite modules I came up with, and also a headache to implement. The Sweet Talker's Club tracks total comment replies for the entire month, including shit that goes back and forth. Funny story about February 12th, I'll tell you in a bit. Top 10 for this month is:
The Sweet Talker's Club:
#1: /u/Some_Guy_87------[17]
#2: /u/AutoModerator----[17]
#3: /u/PHNX_Arcanus-----[11]
#4: /u/UnknownNinja-----[8]
#5: /u/tintintinintin---[7]
#6: /u/tauros113--------[4]
#7: /u/GeneralGom-------[4]
#8: /u/Bruxae-----------[3]
#9: /u/Zagorz-----------[3]
#10: /u/Veshurik--------[3]
Honestly I can't fucking believe /u/Some_Guy_87 tied with Automod. I've got the year-in-total leaderboard and she's got first place by a long shot. So how do we go about getting this data? Let me tell you about something called recursion. People familiar with this concept are already groaning, for those out of the loop (no pun intended but that was godlike), recursion is where you call a function within itself, basically forcing the interpreter to do over the same code again and again and again. However, think of it like adding layers to a cave; everything is identical but you're still going deeper. When you find what you're looking for you need to go back up to the surface, you don't just warp back to where you started; you have to manage your ascent. (Sorry for all the cave analogies I watched Made in Abyss last week and this is how I'm coping) I'll give you the code on this one:
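def FindSweetTalkers(self, submission):
    # Expand every "continue this thread" placeholder so only real comments remain.
    submission.comments.replace_more(limit = None)
    for comment in submission.comments:
        self.RecurseSweetTalkers(comment)

def RecurseSweetTalkers(self, comment):
    # Skip stickied comments, then walk every reply below this comment.
    if not comment.stickied:
        for reply in comment.replies:
            # Go one level deeper first if this reply has replies of its own.
            if reply.replies._comments.__len__() > 0:
                self.RecurseSweetTalkers(reply)
            # Deleted accounts have no author; everyone else earns a point per reply.
            if reply.author and reply.author.name not in self.sweetTalkers:
                self.sweetTalkers[reply.author.name] = 1
            elif reply.author:
                self.sweetTalkers[reply.author.name] += 1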
That first function just sets up a loop to recurse through the thread. The RecurseSweetTalkers function will continue to call and execute itself until it sees that the Reddit API tells it there are no more comment replies. Then it goes all the way back up the comment chain and loops again. This crawls its way through every comment in every thread and gives points to replies only. So, February 12th fucking broke the shit out of this code because two crazy edge cases happened in a single thread. First, /u/tauros113 posted and stickied a comment in the thread. Stickied comments do all kinds of fucked up shit with the API, it's got absolutely no idea what to do about it, and crashes my code. Secondly, a handful of users went back and forth long enough for the standard "continue this thread" or "keep reading" prompt to show up. This also breaks the shit out of the API. That "continue reading" is an actual entity, the API thinks it's a comment reply, but it has absolutely no data associated with it because it's just a button to keep reading. Thankfully, the line submission.comments.replace_more(limit = None) is basically a global command on the thread to flush out those prompts and load every comment in the thread. Even funnier story, fixing that issue was the difference between a user being on the leaderboard or not. Wild stuff.
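If you want to play with the recursion without touching Reddit at all, here's a self-contained sketch of the same walk over a plain nested structure. The dictionary layout (`author`, `stickied`, `replies`) and the function names are assumptions made for illustration, and unlike the real module the stickied check here is only applied to parent comments.

```python
from collections import Counter


def count_replies(comment, tally):
    """Give one point to the author of every reply beneath `comment`."""
    for reply in comment["replies"]:
        # Deleted accounts show up with no author, so they earn nothing.
        if reply["author"] is not None:
            tally[reply["author"]] += 1
        # Descend into this reply's own replies; an empty list ends the recursion.
        count_replies(reply, tally)


def sweet_talkers(parent_comments):
    """Tally replies across a thread, skipping stickied parent comments."""
    tally = Counter()
    for comment in parent_comments:
        if not comment.get("stickied", False):
            count_replies(comment, tally)
    return tally


# Hypothetical thread: alice's write-up gets a back-and-forth, plus a stickied note.
thread = [
    {"author": "alice", "stickied": False, "replies": [
        {"author": "bob", "replies": [
            {"author": "alice", "replies": []},
        ]},
        {"author": None, "replies": []},
    ]},
    {"author": "mod", "stickied": True, "replies": [
        {"author": "carol", "replies": []},
    ]},
]
print(sweet_talkers(thread).most_common())
```

Parent comments themselves never score; only replies do, which is why alice's single point comes from answering bob rather than from her write-up.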
The Perfect Attendance Club
I saved my favorite for last, I really do hope this category gets bigger as time goes on. The Perfect Attendance Club naturally is for users who posted in every thread for the month:
The Perfect Attendance Club:
/u/UnknownNinja
/u/deathjohnson1
/u/PHNX_Arcanus
/u/Alexfang452
Congratulations guys, good stuff. The way I get this list is actually kind of fun; I start with an array of every user in the last thread of the month (Reddit can only sort by new, so all of my data processing actually happens backwards), then thread by thread strip away any user in the current list who is not in the next thread. By the end, only the perfect attendees remain. Shouts out to these guys, they're doing god's work.
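That stripping-away process is a set intersection in disguise. Here's a small sketch under the assumption that each thread has already been reduced to a list of the usernames that posted in it, ordered newest first; `perfect_attendance` and the sample names are hypothetical.

```python
def perfect_attendance(threads):
    """Return users who appear in every thread; `threads` is newest first."""
    if not threads:
        return []
    # Start with everyone from the most recent thread...
    remaining = set(threads[0])
    # ...then drop anyone who is missing from each older thread in turn.
    for authors in threads[1:]:
        remaining &= set(authors)
    return sorted(remaining)


# Hypothetical month with three WAYR threads.
april = [
    ["alice", "bob", "carol"],
    ["bob", "alice"],
    ["alice", "dave", "bob"],
]
print(perfect_attendance(april))  # ['alice', 'bob']
```

Since the list can only shrink, the order the threads are processed in doesn't change the result, so working backwards from the newest thread is fine.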
The WAYR Leaderboards
The WAYR Leaderboards are........a surprise! Now that WAYRStats has been announced, you all knowing of its existence changes the nature of the data that this thing aggregates, so for now I gave y'all a full suite of monthly analytics for April. At the end of May you'll get the full suite and that thread will debut the year-in-total leaderboards, and this project will be going open source for anyone who is curious about Python, about how this all works, or if they want to run it on their machine to fiddle with things. In addition, I'm thinking that throughout the month of May, I'll make some smaller threads to spotlight individual facets of my code, Python in general, and getting feedback from you guys; I'll probably debut a couple leaderboards in that time. The leaderboards module is fully functional and I actually do have a couple ideas for more, so stay tuned. But, maybe I can give you a little peek...
For real, shouts out to /u/deathjohnson1 for being the only user to have posted in every single thread for the year of 2020 thus far (not including untranslated threads).
In Conclusion
Yes I know I'm in just about every metric, I made this shit don't you think I wouldn't completely rig it my way? /s
I'm very excited to debut this to the subreddit, and a bit nervous as well. In its own right I learned a lot about Python doing this and this project will likely end up on my portfolio as a code sample. I think you guys will like it, I've spoken to a few people that miss that old leaderboard, and I hope that this spiritual successor will feel more noticed, more accessible, more engaging, and more fun.
Do you have ideas to make this project better? By all fucking means shoot me a line if it has to do with the code or the mods if you want to contribute to the prize pool or offer suggestions!
It's been a pleasure getting back into this community after a long break; having a place to just let those creative juices flow and pop off about something I care about is really important to me, and I wanted to show my appreciation with this. So at the end of it all, I would like to ask you a question:
What are you reading?