In machine learning, we work with a set of data whose purpose is to train the model and help it learn patterns. This data is first divided into two parts: train data and test data. The train data is our main data; every row in it comes with a known output (the label), and the model learns from those input/output pairs. The test data is held back and used only to check what the model has learned. For example, imagine we want to decide, based on a person’s history, whether we can give them a loan or not.💸 All of our past records go into this data set. It has a set of columns (features): for example, the last time the person took a loan, their phone number, the loan amount, whether they were trustworthy or not, etc.
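To make the split concrete, here is a minimal sketch in plain Python. The loan records are made-up numbers, and `split_train_test` is a hypothetical helper written only for this illustration; in practice a library function such as scikit-learn's `train_test_split` is typically used for the same job.

```python
import random

# Hypothetical toy loan records, invented purely for illustration.
# Each row: [months_since_last_loan, loan_amount]; label: 1 = repaid, 0 = did not repay.
X = [[2, 5000], [24, 1200], [1, 9000], [36, 800], [12, 3000],
     [3, 7000], [30, 1500], [6, 6500], [18, 2000], [4, 8000]]
y = [0, 1, 0, 1, 1, 0, 1, 0, 1, 0]


def split_train_test(X, y, test_ratio=0.2, seed=42):
    """Shuffle the row indices, then hold out a fraction of rows as the test set."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable
    n_test = max(1, int(len(X) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    x_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


x_train, x_test, y_train, y_test = split_train_test(X, y)
print(len(x_train), "training rows,", len(x_test), "test rows")
```

Shuffling before splitting matters: if the records happen to be sorted (for example by date or by outcome), taking the last rows as the test set would give a test set that does not look like the training data.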
x_test is like the questions on the midterm exam.
The student, based on what they’ve learned and the teacher’s explanations, must answer the questions without cheating. The correct answers on the teacher’s answer key are y_test, and the answers the student writes on the paper are y_pred. Comparing the two gives the grade.
New, unseen data is like the final exam.
During the term, the teacher might have given hints in class and helped the student with the questions. But here, the student has no idea what the questions might be and must answer based only on what they’ve learned, to show whether they truly understood the lesson or not. Whatever they write is again y_pred.
The entrance exam (konkur) works the same way:
x_train: the questions the student studies
y_train: the answers to those questions, which the student uses to check themselves while reviewing the concepts
x_test: the practice-test questions, taken without cheating
y_test: the answer key for the practice test
y_pred: the answers the student writes, on the practice test and on the actual exam day
Meaning the y that they predict on their own based on the x’s they’ve learned.
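The five names can be seen working together in a short sketch. By the usual convention, y_test holds the true answers for the test rows and y_pred holds what the model answers for those same rows. The data below is made up, and the "student" is a deliberately simple 1-nearest-neighbour rule written by hand (`predict_one` is a hypothetical helper, not a library function); a real project would more likely use a library model.

```python
# Hypothetical, hand-made split of toy loan data: [months_since_last_loan, loan_amount].
x_train = [[2, 5000], [24, 1200], [1, 9000], [36, 800],
           [12, 3000], [3, 7000], [30, 1500], [6, 6500]]
y_train = [0, 1, 0, 1, 1, 0, 1, 0]   # known answers the student studies from
x_test = [[18, 2000], [4, 8000]]     # exam questions
y_test = [1, 0]                      # answer key, never shown to the model


def predict_one(row, x_train, y_train):
    """1-nearest-neighbour: copy the label of the closest training row."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = min(range(len(x_train)),
                  key=lambda i: squared_distance(row, x_train[i]))
    return y_train[nearest]


# The student answers the exam using only what was learned from the train data.
y_pred = [predict_one(row, x_train, y_train) for row in x_test]

# Grading: compare the student's answers (y_pred) with the answer key (y_test).
correct = sum(p == t for p, t in zip(y_pred, y_test))
accuracy = correct / len(y_test)
print("y_pred:", y_pred, "y_test:", y_test, "accuracy:", accuracy)
```

Note that y_test is used only in the grading step. If the model saw it while answering, that would be the "cheating" from the analogy, and the score would no longer say anything about what was really learned.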