Isaac Scientific Publishing

Frontiers in Signal Processing

Research on the Installment Risk of P2P Network Loan

PP. 1 - 9 Pub. Date: January 5, 2020

DOI: 10.22606/fsp.2020.41001


  • Bo LI
    Southwest Minzu University, Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission, Chengdu Sichuan, 610041 China
  • Du-yu LIU*
    Southwest Minzu University, Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission, Chengdu Sichuan, 610041 China


Borrower defaults that accompany the rapid development of P2P network lending cause economic
losses to online lending platforms and investors. Focusing on borrowers who repay loans in
installments, this paper uses automatic binning to select features, builds a risk monitoring
model, and predicts whether a borrower will make the next month's payment. The model can detect
signs of a borrower's default in advance, so that the platform can take preventive measures earlier
and avoid the cash-flow problems caused by insufficient repayment. It can also serve as a reference
for the platform when estimating the monthly payment amount. The study uses Lending Club borrower
data from 2016-2018. Risk monitoring models are built with the CART algorithm, the random forest
algorithm, and the XGBoost algorithm respectively. All three algorithms achieve a precision above
95%. The borrower's repayment amount in the previous month, the borrower's occupation, the total
amount borrowed, the borrower's monthly repayment amount, and whether the borrower is working with
a debt settlement company are very effective in analyzing the borrower's willingness to perform on
time next month. Therefore, they could be used as the main basis for analysis.
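
The abstract does not state which criterion drives the automatic binning. As a minimal sketch, assuming equal-frequency bins and ranking by information value (IV), a common choice in credit scoring, feature selection could look like the following. The feature names, the synthetic data, the default rates, and the 0.1 IV cutoff are all illustrative assumptions and are not taken from the paper or from Lending Club data.

```python
import math
import random


def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index using equal-frequency (quantile) cut points."""
    ordered = sorted(values)
    # Cut points sit at the 1/n, 2/n, ... quantiles of the observed values.
    cuts = [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]
    # The bin index is the number of cut points the value reaches or exceeds.
    return [sum(v >= c for c in cuts) for v in values]


def information_value(values, labels, n_bins=5, eps=0.5):
    """IV of one numeric feature against a binary label (1 = default)."""
    bins = equal_frequency_bins(values, n_bins)
    total_bad = sum(labels)
    total_good = len(labels) - total_bad
    iv = 0.0
    for b in set(bins):
        bad = sum(1 for i, y in zip(bins, labels) if i == b and y == 1)
        good = sum(1 for i, y in zip(bins, labels) if i == b and y == 0)
        # eps smooths empty cells so the logarithm stays defined.
        p_bad = (bad + eps) / (total_bad + eps * n_bins)
        p_good = (good + eps) / (total_good + eps * n_bins)
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv


# Synthetic, hypothetical data: one informative feature and one pure-noise feature.
random.seed(0)
n = 2000
last_payment = [random.uniform(0, 1000) for _ in range(n)]
noise = [random.uniform(0, 1000) for _ in range(n)]
# Assumed rule for illustration: a low payment last month raises the default rate.
labels = [
    1 if random.random() < (0.4 if p < 300 else 0.05) else 0
    for p in last_payment
]

ivs = {
    "last_payment": information_value(last_payment, labels),
    "noise": information_value(noise, labels),
}
# Keep features whose IV clears an assumed rule-of-thumb cutoff of 0.1.
selected = [name for name, iv in ivs.items() if iv >= 0.1]
print(ivs, selected)
```

The selected features would then be passed to the tree-based classifiers named in the abstract. The cutoff and the number of bins are tuning choices that would need to be validated on real data.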


Peer to peer lending, CART decision tree, random forest, XGBoost

