WAYRStats: Version 1.3, Tidbits, and Code Structure

Howdy y'all, time for another WAYRStats update! This one will be a lil different, as I won't be debuting any new features today; instead, I wanna talk about updates I have made to the program since I first announced things two weeks ago, how things are currently shaping up, why I'm pissed off beyond belief, and my personal favorite (and most boring), code structure!

Updates to WAYRStats

  • The Early Bird Club conditions have changed

Originally this was a top 5 list based on the number of times you posted within the first hour of the thread going live. As was pointed out, that bakes in a bit of time zone bias, as plenty of people are still asleep when the thread goes up. So, the mini leaderboard has been ditched, and this is now a flat list of users who posted within 24 hours of the post going live twice in one month. I think this change feels more balanced and does a better job of highlighting what this category is meant to highlight.
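For the curious, here's a rough sketch of how the new rule could be checked. This is not the actual WAYRStats code: the function name, the input shape, and the choice to count a user at most once per thread are all assumptions, and "twice" is read as "at least twice".

```python
from collections import Counter

EARLY_WINDOW_SECONDS = 24 * 60 * 60  # "within 24 hours of the post going live"
MIN_EARLY_THREADS = 2                # "twice in one month", read as at least twice


def early_birds(threads):
    """Return a sorted list of users who posted early in enough threads.

    `threads` is a list of (thread_created_utc, comments) pairs, where
    `comments` is a list of (author, comment_created_utc) pairs.
    All timestamps are Unix seconds.
    """
    early_counts = Counter()
    for thread_created, comments in threads:
        # Assumption: a user counts once per thread, however often they post.
        early_authors = {
            author
            for author, created in comments
            if 0 <= created - thread_created <= EARLY_WINDOW_SECONDS
        }
        early_counts.update(early_authors)
    return sorted(
        author for author, count in early_counts.items()
        if count >= MIN_EARLY_THREADS
    )
```

Feed it one month's worth of threads and you get the flat list back, no ranking involved.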

  • The largest post of the month now handles child replies of the same author

This was functionality I came up with at the very outset of this project, but I wasn't sure how to actually get it done. Thankfully I later developed a proof of concept in the form of the Sweet Talker's Club - the code from that module was largely copied and adapted to this module. There actually was a triple post by one user last month which blew the previous largest post out of the water, so I'm happy this works more accurately now. As a result I went up a couple spots on some leaderboards, I love rigging statistics /s

  • That same functionality has been added to User Average Data

Same as above: child replies by the same author get concatenated and count towards these totals and averages, so that's been added in.
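A minimal sketch of the "same author child replies" idea, for both this module and the largest post one. The `Comment` class here is a made-up stand-in for a Reddit comment, and the rule that only an unbroken chain of self-replies counts is an assumption, not something confirmed by the real code.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    """Hypothetical stand-in for a Reddit comment."""
    author: str
    body: str
    replies: list = field(default_factory=list)


def combined_length(comment):
    """Character count of a comment plus any chain of replies by its author."""
    total = len(comment.body)
    for reply in comment.replies:
        # Only follow replies written by the same user; anything posted by
        # someone else (and everything underneath it) is ignored.
        if reply.author == comment.author:
            total += combined_length(reply)
    return total
```

So a triple post (a top-level comment, a self-reply, and a self-reply to that) gets measured as one big post.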

  • Scaling has been added to User Character Count Levels

Originally this module took the total character count of all of your posts, divided that number by 1000 and spat out your "level" so to speak. This is cool on its own but it doesn't mimic the exact functionality of game leveling; level two takes more EXP than level one, three takes more than two, etc. So, I refactored this module and added a 1.044 scale factor to each level; level one takes 1000 characters, level two takes 1044 characters, and so on. The leaders of the pack are still at the top, but this allows users with lower exp to feel like they're climbing the ranks quickly compared to the top 5.
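Here's a small sketch of that scaling, assuming each level costs 1.044 times the one before it, starting from 1000. The function names are made up for illustration, and whether the real module rounds anything is unknown.

```python
BASE_COST = 1000      # characters needed for level one (from the post)
SCALE_FACTOR = 1.044  # each level costs 1.044x the previous one (from the post)


def level_cost(level):
    """Characters needed to climb from `level - 1` to `level`."""
    return BASE_COST * SCALE_FACTOR ** (level - 1)


def total_for_level(level):
    """Total characters needed to reach `level` starting from zero."""
    return sum(level_cost(n) for n in range(1, level + 1))


def level_for(total_chars):
    """Highest level that `total_chars` fully pays for."""
    level = 0
    remaining = total_chars
    while remaining >= level_cost(level + 1):
        level += 1
        remaining -= level_cost(level)
    return level
```

Under these assumptions `total_for_level(69)` lands a little over 420,000, which lines up with the number quoted in the next section.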


Why I'm pissed the fuck off

I went to the mods about these changes, just to keep them in the loop about everything, and BY THE GODS I SWEAR I'M GOING TO KILL HIM, /u/SSparks31 asks "HUrr duRR what about level 69?" FUCK YOU GREG. I decided to do the math because I don't have anything better to do AND IT TAKES 420,000 CHARACTERS TO GET TO LEVEL 69 WITH MY SCALE FACTOR. I'M GOING TO COMMIT A FELONY. Fuck this shit, I quit. Please don't leave me. This was unintentional, the numbers are all made up and the points don't matter, but god is a filthy memer and I suffer for it. By the way, if you wanna hit level 69 you're gonna have to hit the character limit for 42/52 posts in the year. You're all fucked.


Two weeks in, are things different?

Yeah the fuck they are. Our comment averages are up, more people are posting even if it's just small snippets, the first 24 hours of WAYR are a fucking disco party, we have more replies, longer posts, more detail, more discussion, more fucking everything. This is only the first month; long as I keep maintaining and improving this system, things should only go up from here. "But there's only more activity because we merged the threads"? Not even close. The two threads we have in this month so far hit the record numbers of last month. We're doing fuckin great. As always it's been a pleasure working on this, and seeing it make an impact is all the more satisfying and fulfilling.


The boring part: code structure

YAAAAAY. yay. Fuck you it's interesting. So here's the thing; the single most important line of code in this whole kit and caboodle is this bad boy:

for submission in visualNovels.search("\"What are you reading?\" author:AutoModerator", sort = "new", time_filter = "all"):

So for loops in Python are basically foreach loops in other languages: for each thread that gets returned by sending the search query "What are you reading?" with AutoModerator as the author, sorted by new, filtered by all time, the code within the for loop will execute. This line will come up again later.
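If you want to poke at that line yourself, here's a hedged sketch of the pieces around it. The post never shows how `visualNovels` gets created, so `connect` is a guess at a standard PRAW setup (the subreddit name and credential arguments are assumptions); `build_query` and `iter_threads` are made-up helper names.

```python
def build_query(title, author):
    """Quote the title so it matches as a phrase, then restrict by author."""
    return f'"{title}" author:{author}'


def iter_threads(subreddit, title="What are you reading?", author="AutoModerator"):
    """Yield matching threads, newest first.

    `subreddit` can be anything with a PRAW-style .search() method.
    """
    query = build_query(title, author)
    yield from subreddit.search(query, sort="new", time_filter="all")


def connect(client_id, client_secret, user_agent):
    """Assumed setup: a read-only PRAW client pointed at r/visualnovels."""
    import praw  # third-party; imported here so the helpers above run without it

    reddit = praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )
    return reddit.subreddit("visualnovels")
```

With credentials in hand, `for submission in iter_threads(connect(...)):` would walk the same threads as the line above.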

So originally this project started simply as a way of checking contest eligibility, and when I finished that it was a single script, run top to bottom, nothing fancy whatsoever. All the relevant code was executed right within that for loop, no bells or whistles. As I thought more about stats, metrics, leaderboards and stuff, I realized that things needed to be more organized.

Enter classes! Classes are custom-made data structures that you can design to hold any and all kinds of information that you want. In its current state, WAYRStats has two major and two minor classes, shown here. The two on top are mini classes I use within the two larger classes below. As you can see, WAYRStats is split into two main classes: one handles all the leaderboards and the other handles all the monthly analytics. I liked separating these out; even though there will be a lot of copied code between them, keeping data in concise packages like that is never a bad thing.

So now with class structures being implemented we can go back to that for loop, right? Like I said, originally all the code was in the for loop, but now that all the computational code is within the classes, we can set up a highly modular scheme where we put in the modules we want and toggle them on or off by simply commenting out single lines. Right here, this is where the magic happens:

# Excerpt: assumes `from datetime import datetime`, with visualNovels, YEAR, MONTH,
# WAYR and WAYRStats defined earlier, and doOnce set to False before the loop.
for submission in visualNovels.search("\"What are you reading?\" author:AutoModerator", sort = "new", time_filter = "all"):
    posted = datetime.utcfromtimestamp(submission.created_utc)

    # Yearly leaderboards. Results come back newest first, so the first thread
    # from another year means we're done.
    if posted.year == YEAR:
        print(submission.title)
        WAYRStats.FindAvgPostLengths(submission)
        WAYRStats.FindEarlyBirds(submission)
        WAYRStats.FindSweetTalkers(submission)
        WAYRStats.FindBlabberMouths(submission)
        if "untranslated" not in submission.title.lower():
            # First qualifying thread seeds the streak data, later ones extend it
            # (both calls are assumed to return a truthy value).
            doOnce = WAYRStats.InitStreakData(submission) if not doOnce else WAYRStats.FindLongestStreak(submission)
    else:
        break

    # Monthly analytics: only threads from the target month.
    if posted.month == MONTH:
        WAYR.FindWinCandidates(submission)
        WAYR.FindBigBoi(submission)
        WAYR.FindAvgPostInfo(submission)
        WAYR.FindUserPostInfo(submission)
        WAYR.FindEarlyBirds(submission)
        WAYR.FindSweetTalkers(submission)
        WAYR.FindPrettyPeople(submission)
        if "untranslated" not in submission.title.lower():
            WAYR.FindPerfectAttendees(submission)

This block of code is my baby, this is where every single data call is made. Whenever I create a new module, it gets added to the relevant list here. A couple things to note about structure with this one - we only care about 2020 results, and since the search is sorted by new, the first thread that comes back from outside 2020 means everything after it is older too, so we immediately break and go to the next part of the code. The lower block, below that else: break, is for the monthly analytics, so it makes sure that the post date matches the target month before doing any calculations. I've broken down a number of the modules here, and this is a secret sneak peek at the modules I haven't mentioned; see if you can figure 'em out. Finally, after all those calculations are finished and we've crawled through all the threads for 2020, we go to the next, final blocks:

import logging
import sys

# This block of code sets up the debug logger, for outputting info to the console as well as saving the output to a file.
# Gonna be honest I copied this bit straight off of Stack Overflow, it works that's all I care about
root = logging.getLogger()
root.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter("%(message)s")
handler.setFormatter(formatter)
root.addHandler(handler)

# These functions are for parsing the data that has been gathered in the above for loop.  There's a TON of data to go through so we can't do it on the fly as it's being gathered.
WAYR.PrintWinCandidates()
WAYR.PrintBigBoi()
try:
    WAYR.PrintAvgPostInfo()
    WAYR.PrintPrettyPeople()
except ZeroDivisionError:
    # Raised when the API returns nothing and an average divides by 0.
    logging.info("Average post length error: API sucks\n")
WAYR.PrintPerfectAttendees()
WAYR.PrintUserPostInfo()
WAYR.PrintEarlyBirds()
WAYR.PrintSweetTalkers()

WAYRStats.PrintStreakData()
WAYRStats.PrintAverages()
WAYRStats.PrintEarlyBirds()
WAYRStats.PrintSweetTalkers()
WAYRStats.PrintBlabberMouths()

That first block sets up logging stuff; basically it lets me write information both to the console and to a log file, just for ease of access. Below that are all our print statements, which as you can see are functions on the two main classes that simply print out all the data they've gathered. That try/except you see is a catch for those two print statements: if the API bugs out, those modules end up dividing by 0, and the except catches that and handles it instead of crashing the whole script.
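One note on the logging block: the snippet as posted only attaches a console handler, so the part that saves to a file isn't visible there. A minimal sketch of how both could be wired up with the standard library follows. The `setup_logging` name and the log path are made up, and this isn't necessarily how the real script does it.

```python
import logging
import sys


def setup_logging(log_path):
    """Send INFO-and-up messages to both stdout and the file at `log_path`."""
    root = logging.getLogger()
    root.setLevel(logging.INFO)
    formatter = logging.Formatter("%(message)s")

    # Console output, same as the handler in the post.
    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.INFO)
    console.setFormatter(formatter)
    root.addHandler(console)

    # File output: the piece not shown in the post. mode="w" starts a fresh
    # log on every run, which is a design choice rather than a requirement.
    to_file = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    to_file.setLevel(logging.INFO)
    to_file.setFormatter(formatter)
    root.addHandler(to_file)
    return root
```

After `setup_logging("wayr_output.log")`, every `logging.info(...)` call shows up in the console and in the file.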

So that's basically the update! Y'all know more about how the script is put together, changes and improvements I've made, and dumb shit that decides to meme on me. As a last little tidbit, here's a couple of features I plan on implementing, but I don't think they'll be ready for the full suite at the end of this month:

  • Implement a more advanced user level that's based on all the various leaderboards/metrics they've achieved
  • Change leaderboard functionality to better display ties: for a small example, imagine top 3 user levels: #1 is level 50, #2 and #3 are level 49. Instead of displaying #2 and #3, it will output two individuals as tied for second place, and then the next entry will start at #4, make sense? Both of these modules are kinda tricky to wrap my head around so I don't wanna promise I can get them done right now, but they are on the brain.
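For the tie handling, the scheme described above is usually called standard competition ranking (1, 2, 2, 4). Here's one way it could look. The function name and input shape are assumptions, and breaking ties alphabetically for display order is an arbitrary choice.

```python
def rank_with_ties(scores):
    """Return (rank, name, score) rows using "1, 2, 2, 4" style ranking.

    `scores` maps a user name to a score; higher is better.
    """
    # Highest score first; names only decide the display order within a tie.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    rows = []
    previous_score = None
    rank = 0
    for position, (name, score) in enumerate(ordered, start=1):
        if score != previous_score:
            rank = position  # a new score jumps straight to its list position
            previous_score = score
        rows.append((rank, name, score))
    return rows
```

With levels 50, 49, 49, the two 49s both come out as rank 2, and whoever is next lands at rank 4.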

See y'all soon, don't forget to upvote automod-chan, and as always: what are you reading?

Author
ChizuChizu | vndb.org/u86636

Posted
4 years ago