Alibaba Cloud Machine Learning Platform for AI: Using Regression Algorithm to Predict Agriculture Loan Issuing
By Garvin Li
Note: The data in this article is fictitious and is only used for experimental purposes.
Issuing agriculture loans is a typical data mining case. Lenders use an empirical model built from statistics of past years (including a borrower’s yearly income, types of crops planted, loan history, and other factors) to predict that borrower’s repayment ability.
This document is based on an agriculture loan scenario and shows you how to use a linear regression algorithm to support loan issuing decisions.
Linear regression is a widely used statistical analysis method for determining the quantitative relationship between two or more variables. This article analyzes the issuing history of agriculture loans to predict whether to issue the requested loan amounts to users in the prediction set. All data analysis is performed on the Alibaba Cloud Machine Learning platform.
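As a rough illustration of what linear regression does, the following sketch fits a straight line to a single feature using ordinary least squares. The function name `fit_simple_ols` and the sample numbers are assumptions made up for this example; they are not taken from the article's dataset or from the platform.

```python
def fit_simple_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample: recovered amount is exactly 2 * farm size + 5.
farm_sizes = [10.0, 20.0, 30.0, 40.0]
recovered = [25.0, 45.0, 65.0, 85.0]

slope, intercept = fit_simple_ols(farm_sizes, recovered)
print(slope, intercept)  # 2.0 5.0
```

With more than one feature the idea is the same, except that one weight is fitted per feature.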
Dataset Introduction
The specific fields are as follows:
The following is a screenshot of the data.
Data Exploration Procedure
The following diagram shows the experiment process.
1. Data Source Preparation
Input data is divided into two parts:
- Loan training set: Over 200 historical loan records are used to train the regression model. This training set includes features such as “farmsize” and “rainfall”. “claimvalue” is the recovered loan amount.
- Loan prediction set: This prediction set contains this year’s 71 loan applicants. Here “claimvalue” is the loan amount a farmer has requested.
The goal is to predict which of the 71 applicants will receive loans, based on the 200+ existing historical records.
2. Data Pre-Processing
Map string-type data to numbers according to its meaning. For example, for the “region” field, map “north”, “middle”, and “south” to 0, 1, and 2 respectively, then convert the field to the double type by using the type conversion component, as shown in the following diagram. You can perform model training after the data is pre-processed.
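The mapping step described above can be sketched in plain Python as follows. This is only an illustration of the idea, not how the platform's components work internally; the helper name `encode_region`, the `id` column, and the sample rows are assumptions.

```python
# Mapping from the article: north -> 0, middle -> 1, south -> 2.
REGION_CODES = {"north": 0, "middle": 1, "south": 2}


def encode_region(rows):
    """Return copies of the rows with "region" replaced by its numeric code.

    The code is stored as a float, mirroring the conversion to the double type.
    An unknown region name raises KeyError rather than being silently guessed.
    """
    encoded = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        new_row["region"] = float(REGION_CODES[row["region"]])
        encoded.append(new_row)
    return encoded


# Made-up sample rows for illustration only.
sample_rows = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
]
encoded_rows = encode_region(sample_rows)
print(encoded_rows)
```

Note that this encoding treats the regions as ordered values, which is what the mapping in the article implies.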
3. Model Training and Prediction
Use the linear regression component to train a regression model on the historical data, then use that model in the prediction component to score the records in the prediction set. Use the column merge component to merge the user ID, prediction score, and claim value, as shown in the following screenshot.
The prediction score indicates a user’s loan repayment ability (expected loan repayment amount).
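The train, predict, and merge steps can be sketched outside the platform as below. This is a minimal illustration that fits a multi-feature linear model through the normal equations; it is not the implementation used by the platform's components. The feature names “farmsize”, “rainfall”, and “claimvalue” come from the article, while the function names, the `id` and `prediction_score` keys, and all numbers are assumptions made up for the example.

```python
def solve_linear_system(a, b):
    """Solve a * x = b using Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(features, targets):
    """Return [intercept, w1, w2, ...] minimizing the squared error."""
    xs = [[1.0] + list(f) for f in features]  # leading 1.0 for the intercept
    k = len(xs[0])
    xtx = [[sum(r[i] * r[j] for r in xs) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(xs, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, feature_row):
    """Apply the fitted weights to one row of features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))


# Made-up training data: (farmsize, rainfall) -> recovered claimvalue.
train_features = [(10, 30), (20, 50), (15, 80), (30, 40), (25, 60)]
train_targets = [660.0, 1200.0, 1010.0, 1680.0, 1470.0]
weights = fit_linear_regression(train_features, train_targets)

# Made-up prediction set: claimvalue is the requested loan amount.
applicants = [
    {"id": 1, "farmsize": 12, "rainfall": 45, "claimvalue": 700.0},
    {"id": 2, "farmsize": 18, "rainfall": 70, "claimvalue": 1500.0},
]

# Equivalent of the column merge: keep ID, prediction score, and claim value.
merged = [
    {
        "id": a["id"],
        "prediction_score": predict(weights, (a["farmsize"], a["rainfall"])),
        "claimvalue": a["claimvalue"],
    }
    for a in applicants
]
for row in merged:
    print(row)
```

The sample targets were generated from an exact linear rule, so the fitted model reproduces them almost perfectly; real loan data would not behave this cleanly.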
4. Regression Model Evaluation
Use the regression model evaluation component to evaluate the model. The following table describes the evaluation results.
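For readers who want to reproduce an evaluation by hand, the sketch below computes three commonly used regression metrics: mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²). Which metrics the evaluation component actually reports is not restated here, so treat this selection, the function name `evaluate_regression`, and the sample numbers as assumptions.

```python
import math


def evaluate_regression(actual, predicted):
    """Compute MAE, RMSE, and R^2 for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up values for illustration only.
metrics = evaluate_regression([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
print(metrics)
```

Lower MAE and RMSE and an R² closer to 1 indicate that the predicted amounts track the actual recovered amounts more closely.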
5. Loan Issuance
Use the filtering and mapping components to determine which applicants can receive loans. The rule used in this experiment is that an applicant receives a loan if that applicant’s predicted repayment ability is greater than the requested loan amount. This rule is applied to each applicant in the prediction set.
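The decision rule can be expressed in a few lines. The sketch below is an illustration only: it assumes each merged record carries `id`, `prediction_score`, and `claimvalue` keys (names chosen for this example), and the numbers are made up.

```python
def decide_loans(merged_rows):
    """Approve a loan when predicted repayment exceeds the requested amount."""
    decisions = []
    for row in merged_rows:
        approved = row["prediction_score"] > row["claimvalue"]
        decisions.append({"id": row["id"], "approved": approved})
    return decisions


# Made-up merged records: prediction score next to the requested amount.
merged_rows = [
    {"id": 1, "prediction_score": 790.0, "claimvalue": 700.0},
    {"id": 2, "prediction_score": 1140.0, "claimvalue": 1500.0},
]

decisions = decide_loans(merged_rows)
print(decisions)
# [{'id': 1, 'approved': True}, {'id': 2, 'approved': False}]
```

The comparison is strict, so an applicant whose predicted repayment exactly equals the requested amount is not approved, matching the "greater than" wording of the rule.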
To learn more about Alibaba Cloud Machine Learning Platform for Artificial Intelligence (PAI), visit www.alibabacloud.com/product/machine-learning