
[P] Using a Tree-LSTM to identify the page headline on a landing page

Hi, I'm part of the R&D team at Unbounce, a SaaS company that provides web page hosting for marketers. We've been working on a system to identify semantic components of a webpage, such as the headline. The model itself is pretty straightforward, but we thought the process of collecting a labeled set and cleaning the data in an industry setting might be interesting to share.

As part of this, we also built an optimized Tree-LSTM implementation in PyTorch that we've open-sourced: https://github.com/unbounce/pytorch-tree-lstm
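
For anyone unfamiliar with the idea: a Tree-LSTM computes each node's state from the states of its children, so a page's DOM has to be turned into a tree and processed leaves-first. Below is a rough, standard-library-only sketch of that preprocessing step. It is not taken from the repository above; `DomTreeBuilder`, `evaluation_order`, and the sample HTML are hypothetical names and data made up for illustration, and the real project may prepare its inputs differently.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag (partial list, for illustration).
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class DomTreeBuilder(HTMLParser):
    """Collects element tags and a (parent, child) edge list from HTML."""

    def __init__(self):
        super().__init__()
        self.tags = []    # tags[i] is the tag name of node i
        self.edges = []   # (parent_index, child_index) pairs
        self._stack = []  # indices of the currently open elements

    def handle_starttag(self, tag, attrs):
        index = len(self.tags)
        self.tags.append(tag)
        if self._stack:
            self.edges.append((self._stack[-1], index))
        if tag not in VOID_TAGS:
            self._stack.append(index)

    def handle_endtag(self, tag):
        # Close the innermost open element with a matching tag, if any.
        for pos in range(len(self._stack) - 1, -1, -1):
            if self.tags[self._stack[pos]] == tag:
                del self._stack[pos:]
                break


def evaluation_order(num_nodes, edges):
    """Return order[i]: 0 for leaves, else 1 + the max order among node i's children."""
    children = [[] for _ in range(num_nodes)]
    for parent, child in edges:
        children[parent].append(child)
    order = [0] * num_nodes
    # Nodes are numbered in document order, so every child has a larger
    # index than its parent and a reverse pass visits children first.
    for node in reversed(range(num_nodes)):
        if children[node]:
            order[node] = 1 + max(order[c] for c in children[node])
    return order


# Hypothetical landing-page fragment.
html = "<div><h1>Grow your list</h1><p>Sign up <a>today</a></p></div>"
builder = DomTreeBuilder()
builder.feed(html)
order = evaluation_order(len(builder.tags), builder.edges)

for index, tag in enumerate(builder.tags):
    print(index, tag, "step", order[index])
```

All nodes that share the same step can be evaluated together, since their children were handled in earlier steps. That grouping is one common way to batch a tree-structured model.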

Our blog post on the project: https://medium.com/unbounce-engineering/using-machine-learning-to-analyze-landing-pages-b3a5c4c96500

Posted
5 years ago