I want to get the hot comments and their replies in a subreddit, and I have split the program into two steps:
Step 1: Retrieve submissions from a dict (which includes all the submissions I want) using reddit.submission. Because I want to create a dataset for chat, I need to make sure each comment has at least one reply. So next I check the top 100 hot comments of each submission and their replies, and record each comment's id and the ids of its replies using submission.comments, top_level_comment.id and second_level_comment.id. This gives me a file containing all the comment ids and their corresponding reply ids.
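To make step 1 concrete, here is a minimal sketch of how the id collection might look. The function name collect_reply_pairs and the max_comments parameter are made up for illustration; submission is assumed to be a PRAW Submission, and the comment sort order is left at whatever the submission already uses.

```python
from itertools import islice


def collect_reply_pairs(submission, max_comments=100):
    """Map each top-level comment id to the ids of its direct replies.

    `submission` is assumed to be a praw.models.Submission (hypothetical
    helper, not part of PRAW). Comments without any reply are skipped.
    """
    # Remove the "load more comments" placeholders so that only real
    # Comment objects remain in the forest.
    submission.comments.replace_more(limit=0)

    pairs = {}
    # Look at no more than `max_comments` top-level comments.
    for top_level_comment in islice(submission.comments, max_comments):
        reply_ids = [
            second_level_comment.id
            for second_level_comment in top_level_comment.replies
        ]
        if reply_ids:  # keep only comments that have at least one reply
            pairs[top_level_comment.id] = reply_ids
    return pairs
```

Something like collect_reply_pairs(reddit.submission(id=...)) would then return a dict that can be written to the file used in step 2.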
Step 2: I extract the hot comments and their replies with reddit.comment(id=i), using the ids from the file I got in step 1.
As the title says, my program is slow, so I checked it and found that step 2 takes most of the time. Then I used reddit.auth.limits['remaining'] to check the number of requests and realized that every time I extract a comment with reddit.comment(id=i), it sends a request.
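The check described above could be wrapped in a small helper like the one below. This is only a sketch: requests_used is a hypothetical name, and it assumes reddit.auth.limits['remaining'] is refreshed after each response and is None before the first request. The count is approximate, because the value jumps back up when the rate-limit window resets.

```python
def requests_used(reddit, action):
    """Run `action()` and return roughly how many API requests it consumed.

    `reddit` is assumed to be a praw.Reddit instance (hypothetical helper).
    Returns None when the rate-limit information is not available yet.
    """
    before = reddit.auth.limits["remaining"]
    action()
    after = reddit.auth.limits["remaining"]
    if before is None or after is None:
        return None
    # Each request lowers "remaining", so the difference is the request count.
    return before - after
```

For example, requests_used(reddit, lambda: reddit.comment(id=i).body) measures a single lookup. As far as I understand, reddit.comment(id=i) on its own is lazy, and the request is only sent when an attribute such as body is first read.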
So, is it necessary to reduce the use of this kind of code: top_level_comment.id, second_level_comment.id, reddit.comment(id=i)?
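One possible way to cut the requests in step 2, sketched under some assumptions: instead of one reddit.comment(id=i) lookup per id, the ids can be passed to reddit.info(), which takes fullnames (the comment id prefixed with t1_) and, according to the PRAW documentation, resolves them in batches of up to 100 per request. The function name fetch_comment_bodies is made up for illustration.

```python
def fetch_comment_bodies(reddit, comment_ids):
    """Return {comment_id: body} for the given comment ids.

    `reddit` is assumed to be a praw.Reddit instance (hypothetical helper).
    Ids that Reddit does not return are simply missing from the result.
    """
    # Comment fullnames are the base-36 id with the "t1_" type prefix.
    fullnames = [f"t1_{comment_id}" for comment_id in comment_ids]

    bodies = {}
    # reddit.info() yields already-loaded objects, so reading .id and .body
    # here should not trigger one extra request per comment.
    for comment in reddit.info(fullnames=fullnames):
        bodies[comment.id] = comment.body
    return bodies
```

As far as I know, comments that come from submission.comments are already loaded as well, so reading top_level_comment.id or second_level_comment.id there should not cost extra requests. That would also make it possible to save the body during step 1 and skip the per-comment lookups entirely.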