A question about random_state in the kfold CV

cross-validation, python

I am trying to cross-validate a polynomial regressor. However, when I set random_state in my KFold, every polynomial degree gives me the same R^2 values. Any clue? I suspect something is wrong in my code. Thank you very much.

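import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

lin_regressor = LinearRegression()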

# pass the order of your polynomial here  
poly = PolynomialFeatures(1)
# convert to be used further to linear regression
X_transform = poly.fit_transform(x_train)

# fit this to Linear Regressor
linear_regg=lin_regressor.fit(X_transform,y_train)


from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.metrics import r2_score
    
crossvalidation_poly = KFold(n_splits=3, shuffle=True,random_state=0) 



for i in range(1,11):
    poly_cross_validation = PolynomialFeatures(degree=i)
    X_current = poly.fit_transform(X_normalized)
    model = lin_regressor.fit(X_current, y_for_normalized)
    scores = cross_val_score(model, X_current,y_for_normalized, scoring='r2', cv=crossvalidation_poly,
 n_jobs=1)
    

    print("\n\nDegree-"+str(i) +" polynomial: R^2 for every fold: " + str(np.abs(scores)))
          
          #+" training: " + str(np.abs(train_index))+" \ntesting: " + str(np.abs(test_index)))

    print('\033[1m'+"Degree-"+str(i)+ '\033[1m'+ " polynomial: Average R^2 for all the folds: " + str(np.mean(np.abs(scores))) + '\033[0m'+ ", STD: " + str(np.std(scores)))

Degree-1 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-1 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-2 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-2 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-3 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-3 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-4 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-4 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-5 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-5 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-6 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-6 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-7 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-7 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-8 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-8 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-9 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-9 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Degree-10 polynomial: R^2 for every fold: [0.81372196 0.77127462 0.37915208]
Degree-10 polynomial: Average R^2 for all the folds: 0.6547162186718273, STD: 0.19562232218476114

Best Answer

You create poly_cross_validation, but never use it:

poly_cross_validation = PolynomialFeatures(degree=i)
X_current = poly.fit_transform(X_normalized)

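You probably want to call poly_cross_validation.fit_transform instead. As written, every iteration reuses poly, the degree-1 transformer created earlier, so X_current is identical for all ten degrees. Fixing random_state only makes the folds identical across iterations as well, which is why the scores match exactly; it is not the cause of the problem.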
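The fix described above can be sketched as follows. This is a minimal illustration, not the asker's exact setup: X_normalized and y_for_normalized are not shown in the question, so the example uses synthetic stand-in data (X_demo, y_demo, both hypothetical names) with a quadratic trend, and loops over only three degrees to keep it short.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (assumption: the real X_normalized / y_for_normalized are unknown).
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1, 1, size=(90, 1))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 0] ** 2 + rng.normal(0, 0.1, size=90)

# Same splitter as in the question: fixed folds, so only the features change per degree.
cv = KFold(n_splits=3, shuffle=True, random_state=0)

mean_r2_by_degree = {}
for degree in range(1, 4):
    # Build the transformer for THIS degree and actually use it.
    poly_cross_validation = PolynomialFeatures(degree=degree)
    X_current = poly_cross_validation.fit_transform(X_demo)

    # cross_val_score clones and refits the estimator on each fold,
    # so there is no need to call fit() beforehand.
    scores = cross_val_score(LinearRegression(), X_current, y_demo, scoring="r2", cv=cv)
    mean_r2_by_degree[degree] = scores.mean()
    print(f"Degree-{degree} polynomial: R^2 per fold: {scores}, mean: {scores.mean():.4f}")
```

With the transformer rebuilt for each degree, the scores now vary across degrees even though random_state keeps the folds fixed.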
