Scammer Analytics: Using Loan Descriptions to Determine Repayment
Warning: People SCAM and LIE
Sorry to break the bad news to you. If you're new to the industry, take this as a warning. If you're a veteran, never get complacent and always remain leery. If you have been around P2P lending for a while, I would venture to guess you learned this lesson the hard way.
In short, banking with Bitcoin is different from traditional lending. It’s much riskier for various reasons, but that risk also brings higher interest and profit potential. For more info on how banking with Bitcoin is different, check out this blog post along with this one on tips for successful P2P lending.
With that said, P2P lending can still be profitable, and tools such as collateral make it much easier. However, at the end of the day it always comes down to each investor's individual decision on whether to invest or not (until AI does this for us).
Tools to use for your decision to invest or not:
Using analytics to your advantage
Collateral and reputation are pretty straightforward and are the primary resources for deciding whether to invest. But I don’t think investors weigh analytics as heavily as they should. As more time passes and the pool of data grows, analytics becomes an increasingly valuable tool in your toolbox. And as we all know, every tool possible should be used in a higher-risk P2P lending environment.
Studies and Research
The New York Magazine article How to Predict If a Borrower Will Pay You Back http://nymag.com/scienceofus/2017/05/what-the-words-you-use-in-a-loan-application-reveal.html (excerpted from the new book Everybody Lies http://www.mymoneyblog.com/amazon.php?asin=0062390856) discusses an academic paper https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2865327 that compared 18,000 funded Prosper.com P2P loan requests against their default history.
Generally, if someone tells you they will pay you back, the opposite is more likely to happen. And the more assertive the promise, the more likely it will be broken. If a user writes “I promise to repay, on my honor, so help me God,” they are among the least likely to pay you back. Also, mentioning a family member (a husband, wife, son, daughter, mother, father, grandparent, nephew, niece, or cousin) is a sign someone will not be paying back anytime soon. Another word that indicates default is “explain,” i.e. if someone uses complex language and terms to explain specifically how and why they are going to be able to pay back a loan, they very likely won’t.
According to the article, phrases used by folks who are most likely NOT to pay back their loans include:
- Will Pay
- Thank You
If someone promises that they will pay you back, and thanks you for your consideration, they probably won’t ever pay you back. The more emotion and the more pleas to your sensibilities involved, the less likely they are to be able to pay you back. Remember, if there is no money available, then there is little chance of repayment, i.e. you can’t get blood from a stone.
“In sum, according to these researchers, a user giving a detailed plan of how he can make his payments and mentioning commitments he has kept in the past are evidence someone will pay back a loan. Making promises and appealing to your mercy is a clear sign someone will go into default.”- Blank-
Good borrower prediction as well
This tool can also be used in a positive sense. For example, the article “How to Predict If a Borrower Will Pay You Back” also lists phrases that signal repayment:
- Lower interest rate
- Minimum Payment
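The phrase lists above lend themselves to a quick automated check of a loan description. Here is a minimal sketch, assuming simple case-insensitive whole-word matching. The function names are hypothetical, and the phrase lists hold only the handful of terms mentioned in this post, so treat it as a starting point rather than a tested scoring model.

```python
import re

# Terms this post associates with default ("promise" and "explain" come from
# the paragraph on promises above). Illustrative only, not exhaustive.
DEFAULT_PHRASES = ["will pay", "thank you", "promise", "explain"]
# Terms this post associates with repayment.
REPAY_PHRASES = ["lower interest rate", "minimum payment"]


def find_phrases(description, phrases):
    """Return the phrases found in the description as whole words, ignoring case."""
    text = description.lower()
    return [p for p in phrases if re.search(r"\b" + re.escape(p) + r"\b", text)]


def scan_description(description):
    """Hypothetical helper: group matched phrases by the signal they are tied to."""
    return {
        "default_signals": find_phrases(description, DEFAULT_PHRASES),
        "repayment_signals": find_phrases(description, REPAY_PHRASES),
    }


# Made-up description, for illustration only.
sample = "I promise I will pay on time. Thank you for your help!"
print(scan_description(sample))
```

A match is only a prompt to look closer. The study found correlations across thousands of loans, not a rule that holds for any single borrower.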
Financial and Demographic information
Keep Demographics in mind
Traditional lending relies heavily on the financials of borrowers along with demographic information. If you don't believe in the demographic bias (there are usually laws against it), try to get a $1MM loan while poorly dressed at a bank in a bad neighborhood. Financial and demographic information has an obvious and useful correlation with whether a borrower will pay back. However, the researchers in the academic paper found that language analysis carried about as much weight as the financial and demographic information.
"We find that the predictive ability of the textual information alone is of similar magnitude to that of the financial and demographic information."
Even better, when demographic and financial information was supplemented with textual information, the predictions improved further.
The data supports the logical argument that multiple predictive information sources used together produce a better result. So in P2P Bitcoin lending it is to your benefit to use as much information as possible for every investment decision, including analyzing the text of borrowers' loan descriptions.
Demographic info at Btcpop
While no official data analysis has been done and my opinion is completely subjective, borrowers from poorer, less developed countries are more likely to default. But I will also say that big defaulters (who cost investors a much larger total amount of Btc) are more likely to be from richer countries. Btcpop has taken proactive action on the topic. To prevent spam loans from poor countries where there is no intention of repayment, Btcpop requires Address+ verification for the 7 countries that it has found to have the highest default rates. Address+ is also required for users with high IP risk, who might be using a VPN or other deceptive tools.
Homemade Statistics for Btcpop loans
Interested in the concept and findings, I completed a non-professional analysis of the 250 most recent Btcpop defaulters.
Please feel free to add/edit data and comment with any interesting or helpful correlations you find, and I will add them to this post.
Here is the original data view only-
My Subjective findings
Using a text analyzer tool on loan descriptions, I was able to find some of the words used most often by past Btcpop borrowers who defaulted.
- Time- 102 mentions (I noticed many scam posts emphasized how quickly the loan would be paid back, or that they just needed time to do something)
- Trading- 99 mentions (it's a generally accepted rule that 95% of traders lose money)
- Thanks- 150 mentions (lots of scammers thanked the people they were scamming. Be leery of overly thankful borrowers. It's a mutually beneficial transaction...not a gift.)
- Reputation- 68 mentions (Reputation loans as a general rule are very high risk)
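If you would rather count words yourself than rely on an online text analyzer, a few lines of Python will do it. This is a minimal sketch of a word-frequency count, not the tool used for the numbers above. The stopword list and the sample descriptions are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical short stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "i", "in", "for",
             "my", "it", "is", "will", "be", "you"}


def word_counts(descriptions):
    """Count how often each non-stopword appears across all loan descriptions."""
    counts = Counter()
    for text in descriptions:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Made-up descriptions for illustration only; not taken from the Btcpop data.
descriptions = [
    "Need a little time to finish trading, thanks!",
    "Trading capital needed. Thanks for your time.",
]
for word, n in word_counts(descriptions).most_common(3):
    print(word, n)
```

To make the counts meaningful, run the same count on repaid loans as well. A word that is common in both groups tells you nothing about default.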
As I don't possess the analytic skill to accurately make statements about all the data, I thought I would just share some general patterns I noticed when obtaining the data.
- ***Most of these defaults are from prior management and obvious scams. Check the dates, only 20% of defaulters registered after 6/2016. The Default rate is down to 5% since new management has taken over.***
- Cancelled Loans: Defaulters tend to have a higher ratio of cancelled loans/repaid loans
- High APR: Interest rates that are higher than average, or higher than the borrower could afford, seemed to be a very strong indicator of default. 65% of all the defaulters analyzed had an APR over 100%. I would recommend not investing in extremely high APR loans.
- Paid back previous loans: Besides the quick defaulters who paid back nothing (from prior management's poor verification), borrowers don't tend to default on their first loan. So just because a borrower paid back previous loans doesn't necessarily mean they are safe. Always do your due diligence.
- Typos: I found a lot of typos in defaulted loans.
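Some of the patterns above can be turned into a simple pre-investment checklist. Here is a rough sketch, assuming you have the loan's APR and the borrower's cancelled and repaid loan counts on hand. The function name is hypothetical, the 100% APR cutoff comes from the observation above, and the cancelled/repaid ratio threshold is an arbitrary placeholder because I did not measure a specific cutoff.

```python
def red_flags(apr_percent, cancelled_loans, repaid_loans, max_cancel_ratio=1.0):
    """Return a list of warning strings for a loan request.

    The 100% APR cutoff comes from the post; max_cancel_ratio is a
    hypothetical threshold chosen only for illustration.
    """
    flags = []
    if apr_percent > 100:
        flags.append("APR over 100%")
    # A borrower with cancellations but no repaid loans exceeds any ratio.
    if repaid_loans == 0:
        ratio_exceeded = cancelled_loans > 0
    else:
        ratio_exceeded = cancelled_loans / repaid_loans > max_cancel_ratio
    if ratio_exceeded:
        flags.append("high cancelled/repaid loan ratio")
    return flags


print(red_flags(apr_percent=150, cancelled_loans=3, repaid_loans=1))
# prints ['APR over 100%', 'high cancelled/repaid loan ratio']
```

An empty list does not mean a loan is safe. It only means none of these particular warning signs showed up.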
Please comment with any other interesting correlations you find, or add them to a separate tab in the Google Sheet, and they will be added to this post.
In summary, analytics is an increasingly valuable tool to use. Btcpop as a platform has future plans to make use of this information to help stop scammers and increase investor ROI. But even now, being aware of the correlations and doing some of your own research and data analysis on the topic is a worthwhile investment.
Article topic and text were submitted by Btcpop user -Blank-. If you have a blog post idea, please feel free to share it in the comments and someone will follow up with you.