A "Traditional" algorithm vs. Machine Learning

My question is: what distinguishes a traditional algorithm from machine learning?

Apologies for the wall of text.

I manage a product with a massive amount of data (1M weekly users, 50 demographic datapoints on each user, each user's history, and their interactions with hundreds of customers). At the core of the product is an algorithm that takes a number of inputs (based on trailing historical data) to predict the revenue-optimizing decision.

Recently, our new leadership has begun to call this Data Science and tout it as "Machine Learning". I'm proud of what we've put together and the impact it's had on the business, but this feels like the wrong characterization of what is just a semi-complex algorithm with almost all of the calculations occurring in SQL.

This has become a fairly big issue, as they've asked me to speak about our "Machine Learning" implementation to customers, investors, and others. I dodged that characterization by calling it a "model" or "algorithm" instead, but they took notice and have asked me to embrace the term and update our materials (presentations, roadmap items, etc.). Compounding this, they've hired a data scientist who agrees with them that we're using a "predictive machine learning" model. I'm skeptical of his expertise and feel like he should be making an effort to create an actual ML model we can compare against our current model.

The whole thing feels dishonest and misleading. Machine learning feels far out of my depth: I couldn't hold a conversation about it, and I have no real clue what decision forests, neural networks, tensors, gradients, or any of the other machine learning terms I see across this sub or elsewhere mean. More details specific to my situation below:

------------------------------------------------------------

The core goal of our data effort is: Based on what we know about a user and what we know about a customer and their provided estimates, what's the optimal revenue-maximizing decision?

There are many calculations factored in to accomplish this, for example:

  • We calculate the median absolute deviation of a customer's proposed vs actual success rate on a rolling basis.
  • We segment our users by demographic (age, gender, etc.) and calculate each segment's success rate relative to the population average, on a rolling basis, to get a success coefficient.
  • We run a simple regression between user characteristics and historical success rates for each customer.
  • We factor in historical reconciliation rates from the customer (% of successes that are ultimately rejected by the customer at invoicing) to discount revenue estimations.
  • We determine whether the user's experience should be optimized using a revenue-per-minute or revenue-per-opportunity approach. If we expect them to make a limited number of attempts, we maximize the expected revenue of each interaction. If we expect them to make a larger number of attempts, we optimize for potential revenue per minute. (EPC vs EPM for those in the advertising space)
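
A rough sketch of two of these calculations in Python, purely for illustration. The production logic is in SQL, so every name here (`rolling_mad`, `discounted_revenue`), the window size, and the sample figures are made up. It also assumes the deviation in the first bullet means the median absolute deviation of the gap between actual and proposed success rates, and that the reconciliation discount is a simple multiplier.

```python
from statistics import median


def rolling_mad(proposed, actual, window=4):
    """Median absolute deviation of (actual - proposed) over the last `window` periods.

    `proposed` and `actual` are equal-length sequences of success rates,
    oldest first. The window size is a hypothetical choice.
    """
    errors = [a - p for p, a in zip(proposed, actual)][-window:]
    center = median(errors)
    return median([abs(e - center) for e in errors])


def discounted_revenue(potential_revenue, success_rate, reconciliation_rate):
    """Expected revenue after removing successes the customer later rejects.

    `reconciliation_rate` is the share of successes rejected at invoicing.
    """
    return potential_revenue * success_rate * (1 - reconciliation_rate)


# Made-up example: a customer who keeps proposing a 20% success rate
# while the observed rate drifts around it (rates given in percent).
spread = rolling_mad([20, 20, 20, 20], [10, 15, 25, 40])

# Made-up example: $10 potential revenue, 20% success, 10% rejected at invoicing.
revenue = discounted_revenue(10, 0.20, 0.10)
```

Nothing in this sketch is learned from data beyond plugging historical rates into fixed formulas, which is roughly the kind of calculation the bullets describe.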

It gets pretty gnarly, but what we end up with is a huge number of coefficients that inform our user-to-opportunity matching logic. An example of how this could result in different opportunity rankings for a pair of users:

User 1 - Average Attempts per Session 2.1 (to be ranked by Expected Revenue)

  1. Project A - Potential Revenue $10 | Expected Revenue $2 | Estimated Success Rate 20% | 30 Minutes | Expected Earnings Per Minute $0.06
  2. Project B - Potential Revenue $25 | Expected Revenue $1 | Estimated Success Rate 4% | 10 Minutes | Expected Earnings Per Minute $0.10
  3. Project C - Potential Revenue $1 | Expected Revenue $0.80 | Estimated Success Rate 80% | 5 Minutes | Expected Earnings Per Minute $0.16
  4. Project E - Potential Revenue $10 | Expected Revenue $0.6 | Estimated Success Rate 6% | 4 Minutes | Expected Earnings Per Minute $0.15

User 2 - Average Attempts per Session 6.3 (to be ranked by Expected Earnings Per Minute)

  1. Project C - Potential Revenue $1 | Expected Revenue $0.90 | Estimated Success Rate 90% | 5 Minutes | Expected Earnings Per Minute $0.18
  2. Project D - Potential Revenue $4 | Expected Revenue $0.75 | Estimated Success Rate 18% | 7 Minutes | Expected Earnings Per Minute $0.15
  3. Project E - Potential Revenue $10 | Expected Revenue $0.5 | Estimated Success Rate 5% | 4 Minutes | Expected Earnings Per Minute $0.125
  4. Project B - Potential Revenue $25 | Expected Revenue $0.75 | Estimated Success Rate 3% | 10 Minutes | Expected Earnings Per Minute $0.075
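
The ranking switch can be sketched in a few lines of Python. This is only an illustration: it reuses the estimates from the first list (2.1 attempts per session), and the `Opportunity` class, the `rank` function, and especially the 4.0 attempts cutoff are hypothetical, since the post only says that a low-attempt user is ranked by expected revenue and a high-attempt user by expected earnings per minute.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    name: str
    potential_revenue: float  # dollars paid on success
    success_rate: float       # estimated probability of success, 0..1
    minutes: float            # expected time to complete

    @property
    def expected_revenue(self) -> float:
        return self.potential_revenue * self.success_rate

    @property
    def revenue_per_minute(self) -> float:
        return self.expected_revenue / self.minutes


# Hypothetical cutoff between "few attempts" and "many attempts" per session.
ATTEMPTS_THRESHOLD = 4.0


def rank(opportunities, avg_attempts, threshold=ATTEMPTS_THRESHOLD):
    """Order opportunities by the metric that fits the user's expected attempts."""
    if avg_attempts >= threshold:
        metric = lambda o: o.revenue_per_minute  # many attempts: optimize per minute
    else:
        metric = lambda o: o.expected_revenue    # few attempts: optimize per opportunity
    return sorted(opportunities, key=metric, reverse=True)


# Estimates taken from the first list above.
OPPORTUNITIES = [
    Opportunity("A", 10, 0.20, 30),
    Opportunity("B", 25, 0.04, 10),
    Opportunity("C", 1, 0.80, 5),
    Opportunity("E", 10, 0.06, 4),
]

print([o.name for o in rank(OPPORTUNITIES, avg_attempts=2.1)])  # ['A', 'B', 'C', 'E']
print([o.name for o in rank(OPPORTUNITIES, avg_attempts=6.3)])  # ['C', 'E', 'B', 'A']
```

With the same estimates, the order flips completely once the per-minute metric is used. In the lists above the second user also has different estimated success rates, so the two lists differ in their numbers as well as their order.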

Posted
2 years ago