
[Help] A phantom " " char haunts my reddit-twitter bot. Can you help me bust this ghost?
Author Summary
ZAKagan is in help
Post Body

(This is a cross post with /r/learnpython)

I've been making a Twitter bot that composes tweets based on top posts on a given subreddit. You can see my testing account here: https://twitter.com/subreddit_sim

(In this case I chose /r/subredditsimulator to collect posts from. I'm not doing any Markov chain stuff here, although I'd like to experiment with that in the future.)

I'm no expert in Python, but I'm kinda proud of this project. There is one issue that I can't seem to solve, though. Look at the links, particularly the ones shortened by Twitter's t.co URL shortener.

There's an extra space char before the ellipsis. Each time. And I can't figure out why.

Here's what I mean.

I've been trying to hunt this down for a few days. Here's what I've considered/looked for as a possible source of this bug:

  1. Something went wrong with how I am concatenating the strings. I store each piece of content collected from reddit (via PRAW) in an array. Then at the end I concatenate my array together like so: tweet_content = " ".join(content_list). This shouldn't produce an extra space following the link, since the link should be the last string in the array. I've checked the array before the join method is called, and there is no extra space.

  2. I'm messing up with unicode encoding/decoding. Some of the content gathered from the reddit post object is in unicode. I also use unicode for the ellipsis character (u'\u2026') and for an arrow emoji (⬆, for an upvote arrow). The link itself is just a plain string, and does not need to be encoded or decoded. But perhaps the phantom space is created when this string is concatenated with unicode objects?

  3. This is some issue with Tweepy, the Python library that is used to send content to Twitter. I have tried googling for such issues, but I've come up short.
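For anyone who wants to poke at ideas 1 and 2, here is a minimal sketch of how the join step could be checked. The list contents are made-up stand-ins (not the bot's real data), and this only inspects the string before it is handed to Tweepy, so it can rule the join in or out but says nothing about what Twitter does afterwards.

```python
# Hypothetical pieces standing in for what the bot collects; not the real data.
content_list = [
    u"Some post title",
    u"\u2b06 123",                            # upvote arrow plus a score
    "https://example.com/a/very/long/link",   # plain string, like the link
]

tweet_content = u" ".join(content_list)

# repr() spells out every character, so stray spaces and escapes are visible.
print(repr(tweet_content))

# join only puts the separator BETWEEN items, so nothing follows the last one...
ends_clean = tweet_content.endswith(content_list[-1])

# ...unless an empty item sneaks in after the link, which adds a trailing space.
with_empty_tail = u" ".join(content_list + [u""])
has_trailing_space = with_empty_tail.endswith(u" ")

print(ends_clean, has_trailing_space)
```

If the repr of the real tweet_content ends with the link and nothing else, the join itself is probably not where the space comes from.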

Still, after looking at my tweet-composing function and playing around with these ideas, I haven't solved this one. The location of the space is what really weirds me out. It's always right before t.co's ellipsis, as if it were part of the link itself.
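One way to check whether an odd whitespace character is already sitting next to the link in the composed text is to list every whitespace or invisible character with its position and Unicode name. This is just a sketch: describe_whitespace is a hypothetical helper, and the sample string is invented to show the idea (it deliberately hides a no-break space after the URL).

```python
import unicodedata


def describe_whitespace(text):
    """Return (index, code point, Unicode name) for each whitespace or
    invisible formatting character found in text."""
    found = []
    for i, ch in enumerate(text):
        # "Cf" is the Unicode category for invisible format characters,
        # such as the zero-width space.
        if ch.isspace() or unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "UNKNOWN")
            found.append((i, "U+%04X" % ord(ch), name))
    return found


# Invented sample: a no-break space hides between the URL and the ellipsis.
sample = u"title https://example.com/x\u00a0\u2026"
report = describe_whitespace(sample)

for entry in report:
    print(entry)
```

Running the same helper on the real tweet text right before it is sent would show whether anything other than the ordinary separators is in there.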

If you're not familiar, t.co is Twitter's built-in URL shortening service. Whenever you post a tweet with a link or a piece of media, t.co will reduce it to 24 characters (provided that it's longer than that to start with).

Please check out my repository on GitHub. There I have a description of the bot as well as all of the code required. I want to do a more thorough write-up once it's finished. The issue seems to be in my tweetComposer function, at least I think so.

I would really appreciate any help, guidance, suggestions, or feedback that you could provide!

Posted
8 years ago