I am trying to use hourly household water data to determine whether a home is occupied by a permanent resident or is used as a vacation rental. One identification strategy that seems logical to me is that water use in an Airbnb comes from different types of people and different group sizes from one stay to the next. This would mean that the scale and pattern of water use change frequently.
Initially, I tried to measure the variability of water use by performing seasonal decomposition and linear regression on each home's series, then calculating the R². Unfortunately, the R² did not separate the two groups well. That is, houses that I was sure were permanently occupied seemed just as likely to have a high (or low) R² as homes that I was confident were not permanently occupied. It turns out that water use data can be fairly noisy.
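To make the whole-series approach concrete, here is a minimal sketch in Python of one way such an R² could be computed. It is not the exact analysis described above: it assumes the "seasonal" model is simply the average use for each hour of the day, and the function name `hourly_profile_r2` and the assumption that the series starts at midnight are both made up for illustration.

```python
from collections import defaultdict


def hourly_profile_r2(usage):
    """R^2 of a model that predicts each reading by the mean for its hour of day.

    `usage` is a list of hourly readings, assumed (hypothetically) to start
    at midnight so that index % 24 is the hour of day.
    """
    # Group readings by hour of day and average them to get a 24-value profile.
    by_hour = defaultdict(list)
    for i, value in enumerate(usage):
        by_hour[i % 24].append(value)
    profile = {hour: sum(vals) / len(vals) for hour, vals in by_hour.items()}

    # Compare residual variation around the profile with total variation.
    overall_mean = sum(usage) / len(usage)
    ss_total = sum((v - overall_mean) ** 2 for v in usage)
    ss_resid = sum((v - profile[i % 24]) ** 2 for i, v in enumerate(usage))
    if ss_total == 0:
        return 0.0  # completely flat series: nothing for the model to explain
    return 1.0 - ss_resid / ss_total
```

A single number like this summarizes the entire series, so a home with a few very different stretches and a home that is uniformly noisy can end up with similar values, which may be part of why it did not discriminate well.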
Now, I'm trying to find a method where I can define a model (linear model preferred), and then split the time series of water use into periods based on how well they fit that linear model. Ideally, applying this method to a house with a permanent resident would result in longer periods of continuous good fit, while an Airbnb would have shorter periods of good fit.
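As a rough illustration of that idea, the sketch below greedily splits the series into runs of days that keep fitting one hourly profile, and reports the run lengths. This is a simple heuristic under stated assumptions, not an established changepoint method: the names `day_r2` and `good_fit_segments`, the hour-of-day profile as the "model", whole days as the unit, and the 0.5 threshold are all hypothetical choices.

```python
def day_r2(day, profile):
    """R^2 of one day's 24 readings against a fixed 24-value hourly profile."""
    mean = sum(day) / len(day)
    ss_total = sum((v - mean) ** 2 for v in day)
    ss_resid = sum((v - p) ** 2 for v, p in zip(day, profile))
    if ss_total == 0:
        return 1.0 if ss_resid == 0 else 0.0
    return 1.0 - ss_resid / ss_total


def good_fit_segments(usage, threshold=0.5):
    """Greedily split hourly data into runs of days that share one profile.

    Returns the length, in days, of each run. `threshold` is an arbitrary
    cutoff on the per-day R^2 and would need tuning on real data.
    """
    # Cut the series into whole days; a trailing partial day is dropped.
    days = [usage[i:i + 24] for i in range(0, len(usage) - 23, 24)]
    segments = []
    current = []
    for day in days:
        if current:
            # Profile of the current run: mean of each hour across its days.
            profile = [sum(col) / len(current) for col in zip(*current)]
            if day_r2(day, profile) < threshold:
                # The new day fits poorly, so close the run and start another.
                segments.append(len(current))
                current = []
        current.append(day)
    if current:
        segments.append(len(current))
    return segments
```

Because each day is scored against the run's profile without refitting, a change in scale alone (the same daily shape at three times the volume) also ends a run. The distribution of run lengths, for example the median, could then be compared between homes known to be permanently occupied and homes known to be rentals.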
I'm currently experimenting with the segmented package in R, but it seems like it can only change one variable at each breakpoint, so it's not as flexible as I was hoping.
I really appreciate any help I can get on this question. This is a bit outside of my area of expertise, but I'm still a student and hoping to learn more about time series analysis.
Does anybody have insight on what else I can try looking into?
Does this seem like a reasonable approach to this problem?
Posted 4 years ago