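Overview of the pandas DataFrame and how to create one | hydrocul's notes
pandas has two main data structures, Series and DataFrame. A Series resembles a one-dimensional array, whereas a DataFrame resembles a two-dimensional array, or more precisely a spreadsheet such as an Excel sheet.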
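As a minimal sketch of the difference, the two structures can be built by hand. The values below are hypothetical sample data chosen for illustration, and the variable names `ages` and `people` are not part of the original notes.

```python
import pandas as pd

# Hypothetical sample values, used only for illustration.
ages = pd.Series([41, 49, 37], name="Age")

people = pd.DataFrame({
    "Age": [41, 49, 37],
    "Department": ["Sales", "Research & Development", "Research & Development"],
})

print(ages.ndim, ages.shape)      # a Series has one dimension
print(people.ndim, people.shape)  # a DataFrame has two dimensions

# Each column of a DataFrame is itself a Series.
print(type(people["Age"]).__name__)
```

Passing a dict of equal-length lists is only one of several ways to construct a DataFrame; the rest of these notes load one from files instead.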
import pandas as pd
csv_file_name = 'data/WA_Fn-UseC_-HR-Employee-Attrition.csv'
df = pd.read_csv(csv_file_name)
df.head()
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 41 | Yes | Travel_Rarely | 1102 | Sales | 1 | 2 | Life Sciences | 1 | 1 | ... | 1 | 80 | 0 | 8 | 0 | 1 | 6 | 4 | 0 | 5 |
| 1 | 49 | No | Travel_Frequently | 279 | Research & Development | 8 | 1 | Life Sciences | 1 | 2 | ... | 4 | 80 | 1 | 10 | 3 | 3 | 10 | 7 | 1 | 7 |
| 2 | 37 | Yes | Travel_Rarely | 1373 | Research & Development | 2 | 2 | Other | 1 | 4 | ... | 2 | 80 | 0 | 7 | 3 | 3 | 0 | 0 | 0 | 0 |
| 3 | 33 | No | Travel_Frequently | 1392 | Research & Development | 3 | 4 | Life Sciences | 1 | 5 | ... | 3 | 80 | 0 | 8 | 3 | 3 | 8 | 7 | 3 | 0 |
| 4 | 27 | No | Travel_Rarely | 591 | Research & Development | 2 | 1 | Medical | 1 | 7 | ... | 4 | 80 | 1 | 6 | 3 | 3 | 2 | 2 | 2 | 2 |
5 rows × 35 columns
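To try `pd.read_csv` without the dataset file on disk, the same call can be pointed at an in-memory buffer. This is a rough sketch: the three-row text is a hand-typed extract modeled on a few columns of the rows shown above, and `csv_text` and `sample` are names introduced here for illustration.

```python
import io

import pandas as pd

# Hand-typed three-row extract; the real file has 1470 rows and 35 columns.
csv_text = """Age,Attrition,Department
41,Yes,Sales
49,No,Research & Development
37,Yes,Research & Development
"""

# read_csv accepts a file-like object as well as a path.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(2))
print(sample.dtypes)
```

The first line of the text is taken as the header row, and the column types are inferred from the values.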
xlsx_file_name = 'data/WA_Fn-UseC_-HR-Employee-Attrition.xlsx'
xl = pd.ExcelFile(xlsx_file_name)
xl.sheet_names
['WA_Fn-UseC_-HR-Employee-Attriti', 'Data Definitions']
df = xl.parse('WA_Fn-UseC_-HR-Employee-Attriti')
df.head()
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 41 | Yes | Travel_Rarely | 1102 | Sales | 1 | 2 | Life Sciences | 1 | 1 | ... | 1 | 80 | 0 | 8 | 0 | 1 | 6 | 4 | 0 | 5 |
| 1 | 49 | No | Travel_Frequently | 279 | Research & Development | 8 | 1 | Life Sciences | 1 | 2 | ... | 4 | 80 | 1 | 10 | 3 | 3 | 10 | 7 | 1 | 7 |
| 2 | 37 | Yes | Travel_Rarely | 1373 | Research & Development | 2 | 2 | Other | 1 | 4 | ... | 2 | 80 | 0 | 7 | 3 | 3 | 0 | 0 | 0 | 0 |
| 3 | 33 | No | Travel_Frequently | 1392 | Research & Development | 3 | 4 | Life Sciences | 1 | 5 | ... | 3 | 80 | 0 | 8 | 3 | 3 | 8 | 7 | 3 | 0 |
| 4 | 27 | No | Travel_Rarely | 591 | Research & Development | 2 | 1 | Medical | 1 | 7 | ... | 4 | 80 | 1 | 6 | 3 | 3 | 2 | 2 | 2 | 2 |
5 rows × 35 columns
df = xl.parse(xl.sheet_names[0])
df.head()
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 41 | Yes | Travel_Rarely | 1102 | Sales | 1 | 2 | Life Sciences | 1 | 1 | ... | 1 | 80 | 0 | 8 | 0 | 1 | 6 | 4 | 0 | 5 |
| 1 | 49 | No | Travel_Frequently | 279 | Research & Development | 8 | 1 | Life Sciences | 1 | 2 | ... | 4 | 80 | 1 | 10 | 3 | 3 | 10 | 7 | 1 | 7 |
| 2 | 37 | Yes | Travel_Rarely | 1373 | Research & Development | 2 | 2 | Other | 1 | 4 | ... | 2 | 80 | 0 | 7 | 3 | 3 | 0 | 0 | 0 | 0 |
| 3 | 33 | No | Travel_Frequently | 1392 | Research & Development | 3 | 4 | Life Sciences | 1 | 5 | ... | 3 | 80 | 0 | 8 | 3 | 3 | 8 | 7 | 3 | 0 |
| 4 | 27 | No | Travel_Rarely | 591 | Research & Development | 2 | 1 | Medical | 1 | 7 | ... | 4 | 80 | 1 | 6 | 3 | 3 | 2 | 2 | 2 | 2 |
5 rows × 35 columns
df = pd.read_excel(xlsx_file_name, sheet_name='WA_Fn-UseC_-HR-Employee-Attriti')  # the old `sheetname` keyword was removed; use `sheet_name`
df.head()
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 41 | Yes | Travel_Rarely | 1102 | Sales | 1 | 2 | Life Sciences | 1 | 1 | ... | 1 | 80 | 0 | 8 | 0 | 1 | 6 | 4 | 0 | 5 |
| 1 | 49 | No | Travel_Frequently | 279 | Research & Development | 8 | 1 | Life Sciences | 1 | 2 | ... | 4 | 80 | 1 | 10 | 3 | 3 | 10 | 7 | 1 | 7 |
| 2 | 37 | Yes | Travel_Rarely | 1373 | Research & Development | 2 | 2 | Other | 1 | 4 | ... | 2 | 80 | 0 | 7 | 3 | 3 | 0 | 0 | 0 | 0 |
| 3 | 33 | No | Travel_Frequently | 1392 | Research & Development | 3 | 4 | Life Sciences | 1 | 5 | ... | 3 | 80 | 0 | 8 | 3 | 3 | 8 | 7 | 3 | 0 |
| 4 | 27 | No | Travel_Rarely | 591 | Research & Development | 2 | 1 | Medical | 1 | 7 | ... | 4 | 80 | 1 | 6 | 3 | 3 | 2 | 2 | 2 | 2 |
5 rows × 35 columns
df = pd.read_excel(xlsx_file_name)
df.head()
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 41 | Yes | Travel_Rarely | 1102 | Sales | 1 | 2 | Life Sciences | 1 | 1 | ... | 1 | 80 | 0 | 8 | 0 | 1 | 6 | 4 | 0 | 5 |
| 1 | 49 | No | Travel_Frequently | 279 | Research & Development | 8 | 1 | Life Sciences | 1 | 2 | ... | 4 | 80 | 1 | 10 | 3 | 3 | 10 | 7 | 1 | 7 |
| 2 | 37 | Yes | Travel_Rarely | 1373 | Research & Development | 2 | 2 | Other | 1 | 4 | ... | 2 | 80 | 0 | 7 | 3 | 3 | 0 | 0 | 0 | 0 |
| 3 | 33 | No | Travel_Frequently | 1392 | Research & Development | 3 | 4 | Life Sciences | 1 | 5 | ... | 3 | 80 | 0 | 8 | 3 | 3 | 8 | 7 | 3 | 0 |
| 4 | 27 | No | Travel_Rarely | 591 | Research & Development | 2 | 1 | Medical | 1 | 7 | ... | 4 | 80 | 1 | 6 | 3 | 3 | 2 | 2 | 2 | 2 |
5 rows × 35 columns
len(df)
1470
df.shape  # returns a tuple (number of rows, number of columns)
(1470, 35)
df.info()  # lists each column name with its non-null count and dtype
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1470 entries, 0 to 1469
Data columns (total 35 columns):
Age                         1470 non-null int64
Attrition                   1470 non-null object
BusinessTravel              1470 non-null object
DailyRate                   1470 non-null int64
Department                  1470 non-null object
DistanceFromHome            1470 non-null int64
Education                   1470 non-null int64
EducationField              1470 non-null object
EmployeeCount               1470 non-null int64
EmployeeNumber              1470 non-null int64
EnvironmentSatisfaction     1470 non-null int64
Gender                      1470 non-null object
HourlyRate                  1470 non-null int64
JobInvolvement              1470 non-null int64
JobLevel                    1470 non-null int64
JobRole                     1470 non-null object
JobSatisfaction             1470 non-null int64
MaritalStatus               1470 non-null object
MonthlyIncome               1470 non-null int64
MonthlyRate                 1470 non-null int64
NumCompaniesWorked          1470 non-null int64
Over18                      1470 non-null object
OverTime                    1470 non-null object
PercentSalaryHike           1470 non-null int64
PerformanceRating           1470 non-null int64
RelationshipSatisfaction    1470 non-null int64
StandardHours               1470 non-null int64
StockOptionLevel            1470 non-null int64
TotalWorkingYears           1470 non-null int64
TrainingTimesLastYear       1470 non-null int64
WorkLifeBalance             1470 non-null int64
YearsAtCompany              1470 non-null int64
YearsInCurrentRole          1470 non-null int64
YearsSinceLastPromotion     1470 non-null int64
YearsWithCurrManager        1470 non-null int64
dtypes: int64(26), object(9)
memory usage: 402.0+ KB
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(font='IPAexGothic')
df["DailyRate"].hist(linewidth = 1, alpha=.5)
plt.xlabel("DailyRate")
plt.ylabel("Freq")
plt.show()
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(font='IPAexGothic')
df["DailyRate"].hist(orientation='horizontal', alpha=.5)
plt.xlabel("Freq")  # with orientation='horizontal' the counts run along the x-axis
plt.ylabel("DailyRate")
plt.show()
plt.scatter(df['HourlyRate'], df['DailyRate'])
plt.show()
pd.plotting.scatter_matrix(df[['HourlyRate', 'DailyRate', 'DistanceFromHome']], alpha=0.2, figsize=(6, 6), diagonal='kde')
plt.show()
df[['HourlyRate', 'DailyRate', 'DistanceFromHome']].cov()
| | HourlyRate | DailyRate | DistanceFromHome |
|---|---|---|---|
| HourlyRate | 413.285626 | 191.800350 | 5.130567 |
| DailyRate | 191.800350 | 162819.593737 | -16.308004 |
| DistanceFromHome | 5.130567 | -16.308004 | 65.721251 |
df.select_dtypes(include="number").cov()  # numeric columns only; recent pandas raises an error on string columns instead of dropping them
| | Age | DailyRate | DistanceFromHome | Education | EmployeeCount | EmployeeNumber | EnvironmentSatisfaction | HourlyRate | JobInvolvement | JobLevel | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 83.455049 | 39.298434 | -0.124873 | 1.946390 | 0.0 | -55.797199 | 0.101319 | 4.510422 | 0.193841 | 5.153276 | ... | 0.528776 | 0.0 | 0.291977 | 48.361684 | -0.231093 | -0.138695 | 17.423359 | 7.046750 | 6.373743 | 6.587332 |
| DailyRate | 39.298434 | 162819.593737 | -16.308004 | -6.945424 | 0.0 | -12386.713294 | 8.095750 | 191.800350 | 13.246309 | 1.324944 | ... | 3.423048 | 0.0 | 14.489565 | 45.570709 | 1.275892 | -10.789322 | -84.187085 | 14.520296 | -43.206982 | -37.957055 |
| DistanceFromHome | -0.124873 | -16.308004 | 65.721251 | 0.174705 | 0.0 | 160.649502 | -0.142451 | 5.130567 | 0.050667 | 0.047586 | ... | 0.057478 | 0.0 | 0.309961 | 0.291951 | -0.386118 | -0.152094 | 0.472219 | 0.553521 | 0.261991 | 0.416715 |
| Education | 1.946390 | -6.945424 | 0.174705 | 1.048914 | 0.0 | 25.939251 | -0.030370 | 0.349263 | 0.030927 | 0.115170 | ... | -0.010097 | 0.0 | 0.016076 | 1.181612 | -0.033143 | 0.007105 | 0.433659 | 0.223515 | 0.179056 | 0.252390 |
| EmployeeCount | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| EmployeeNumber | -55.797199 | -12386.713294 | 160.649502 | 25.939251 | 0.0 | 362433.299749 | 11.595582 | 430.551701 | -2.950629 | -12.341279 | ... | -45.473775 | 0.0 | 31.920482 | -67.289749 | 18.320126 | 4.384426 | -41.458396 | -18.357800 | -17.496817 | -19.755358 |
| EnvironmentSatisfaction | 0.101319 | 8.095750 | -0.142451 | -0.030370 | 0.0 | 11.595582 | 1.194829 | -1.107908 | -0.006438 | 0.001466 | ... | 0.009059 | 0.0 | 0.003197 | -0.022905 | -0.027283 | 0.021335 | 0.009761 | 0.071317 | 0.057040 | -0.019496 |
| HourlyRate | 4.510422 | 191.800350 | 5.130567 | 0.349263 | 0.0 | 430.551701 | -1.107908 | 413.285626 | 0.620006 | -0.626800 | ... | 0.029244 | 0.0 | 0.870674 | -0.369139 | -0.224036 | -0.066170 | -2.438866 | -1.775575 | -1.750142 | -1.459700 |
| JobInvolvement | 0.193841 | 13.246309 | 0.050667 | 0.030927 | 0.0 | -2.950629 | -0.006438 | 0.620006 | 0.506319 | -0.009948 | ... | 0.026386 | 0.0 | 0.013049 | -0.030634 | -0.014071 | -0.007348 | -0.093097 | 0.022473 | -0.055454 | 0.065951 |
| JobLevel | 5.153276 | 1.324944 | 0.047586 | 0.115170 | 0.0 | -12.341279 | 0.001466 | -0.626800 | -0.009948 | 1.225316 | ... | 0.025901 | 0.0 | 0.013190 | 6.737044 | -0.025961 | 0.029574 | 3.626435 | 1.561913 | 1.262322 | 1.482250 |
| JobSatisfaction | -0.049285 | 13.604357 | -0.032802 | -0.012759 | 0.0 | -30.705067 | -0.008179 | -1.599339 | -0.016853 | -0.002373 | ... | -0.014850 | 0.0 | 0.010046 | -0.173208 | -0.008217 | -0.015161 | -0.025693 | -0.009209 | -0.064728 | -0.108830 |
| MonthlyIncome | 21412.198982 | 14641.125975 | -649.386355 | 457.874204 | 0.0 | -42028.530023 | -32.210416 | -1511.673923 | -51.159481 | 4952.416922 | ... | 131.703156 | 0.0 | 21.693112 | 28312.303770 | -131.935513 | 102.053699 | 14833.730990 | 6205.846259 | 5233.677307 | 5780.054075 |
| MonthlyRate | 1823.988823 | -92428.502266 | 1585.264627 | -190.148240 | 0.0 | 54198.679015 | 292.537298 | -2213.447553 | -82.667086 | 311.714963 | ... | -31.439933 | 0.0 | -208.164513 | 1464.435332 | 13.461200 | 40.043086 | -1031.535222 | -330.479133 | 35.937006 | -933.244190 |
| NumCompaniesWorked | 6.837739 | 38.457493 | -0.592359 | 0.323165 | 0.0 | -1.881380 | 0.034389 | 1.125195 | 0.026684 | 0.394036 | ... | 0.142425 | 0.0 | 0.064016 | 4.618854 | -0.212734 | -0.014764 | -1.812334 | -0.821380 | -0.296339 | -0.983301 |
| PercentSalaryHike | 0.121489 | 33.529204 | 1.193809 | -0.041648 | 0.0 | -28.520432 | -0.126824 | -0.674252 | -0.044805 | -0.140705 | ... | -0.160226 | 0.0 | 0.023476 | -0.586872 | -0.024636 | -0.008480 | -0.807021 | -0.020156 | -0.261286 | -0.156517 |
| PerformanceRating | 0.006276 | 0.068910 | 0.079300 | -0.009068 | 0.0 | -4.422436 | -0.011654 | -0.015930 | -0.007464 | -0.008476 | ... | -0.012231 | 0.0 | 0.001078 | 0.018933 | -0.007247 | 0.000656 | 0.007594 | 0.045738 | 0.020808 | 0.029389 |
| RelationshipSatisfaction | 0.528776 | 3.423048 | 0.057478 | -0.010097 | 0.0 | -45.473775 | 0.009059 | 0.029244 | 0.026386 | 0.025901 | ... | 1.169013 | 0.0 | -0.042335 | 0.202360 | 0.003480 | 0.014975 | 0.128287 | -0.059242 | 0.116692 | -0.003347 |
| StandardHours | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| StockOptionLevel | 0.291977 | 14.489565 | 0.309961 | 0.016076 | 0.0 | 31.920482 | 0.003197 | 0.870674 | 0.013049 | 0.013190 | ... | -0.042335 | 0.0 | 0.726035 | 0.067200 | 0.012385 | 0.002485 | 0.078607 | 0.156884 | 0.039408 | 0.075091 |
| TotalWorkingYears | 48.361684 | 45.570709 | 0.291951 | 1.181612 | 0.0 | -67.289749 | -0.022905 | -0.369139 | -0.030634 | 6.737044 | ... | 0.202360 | 0.0 | 0.067200 | 60.540563 | -0.357740 | 0.005539 | 29.942577 | 12.978065 | 10.151009 | 12.748396 |
| TrainingTimesLastYear | -0.231093 | 1.275892 | -0.386118 | -0.033143 | 0.0 | 18.320126 | -0.027283 | -0.224036 | -0.014071 | -0.025961 | ... | 0.003480 | 0.0 | 0.012385 | -0.357740 | 1.662219 | 0.025569 | 0.028188 | -0.026801 | -0.008586 | -0.018841 |
| WorkLifeBalance | -0.138695 | -10.789322 | -0.152094 | 0.007105 | 0.0 | 4.384426 | 0.021335 | -0.066170 | -0.007348 | 0.029574 | ... | 0.014975 | 0.0 | 0.002485 | 0.005539 | 0.025569 | 0.499108 | 0.052325 | 0.127616 | 0.020355 | 0.006956 |
| YearsAtCompany | 17.423359 | -84.187085 | 0.472219 | 0.433659 | 0.0 | -41.458396 | 0.009761 | -2.438866 | -0.093097 | 3.626435 | ... | 0.128287 | 0.0 | 0.078607 | 29.942577 | 0.028188 | 0.052325 | 37.534310 | 16.842239 | 12.208813 | 16.815196 |
| YearsInCurrentRole | 7.046750 | 14.520296 | 0.553521 | 0.223515 | 0.0 | -18.357800 | 0.071317 | -1.775575 | 0.022473 | 1.561913 | ... | -0.059242 | 0.0 | 0.156884 | 12.978065 | -0.026801 | 0.127616 | 16.842239 | 13.127122 | 6.398725 | 9.235198 |
| YearsSinceLastPromotion | 6.373743 | -43.206982 | 0.261991 | 0.179056 | 0.0 | -17.496817 | 0.057040 | -1.750142 | -0.055454 | 1.262322 | ... | 0.116692 | 0.0 | 0.039408 | 10.151009 | -0.008586 | 0.020355 | 12.208813 | 6.398725 | 10.384057 | 5.866587 |
| YearsWithCurrManager | 6.587332 | -37.957055 | 0.416715 | 0.252390 | 0.0 | -19.755358 | -0.019496 | -1.459700 | 0.065951 | 1.482250 | ... | -0.003347 | 0.0 | 0.075091 | 12.748396 | -0.018841 | 0.006956 | 16.815196 | 9.235198 | 5.866587 | 12.731595 |
26 rows × 26 columns
df.select_dtypes(include="number").corr()  # numeric columns only; recent pandas raises an error on string columns instead of dropping them
| | Age | DailyRate | DistanceFromHome | Education | EmployeeCount | EmployeeNumber | EnvironmentSatisfaction | HourlyRate | JobInvolvement | JobLevel | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000000 | 0.010661 | -0.001686 | 0.208034 | NaN | -0.010145 | 0.010146 | 0.024287 | 0.029820 | 0.509604 | ... | 0.053535 | NaN | 0.037510 | 0.680381 | -0.019621 | -0.021490 | 0.311309 | 0.212901 | 0.216513 | 0.202089 |
| DailyRate | 0.010661 | 1.000000 | -0.004985 | -0.016806 | NaN | -0.050990 | 0.018355 | 0.023381 | 0.046135 | 0.002966 | ... | 0.007846 | NaN | 0.042143 | 0.014515 | 0.002453 | -0.037848 | -0.034055 | 0.009932 | -0.033229 | -0.026363 |
| DistanceFromHome | -0.001686 | -0.004985 | 1.000000 | 0.021042 | NaN | 0.032916 | -0.016075 | 0.031131 | 0.008783 | 0.005303 | ... | 0.006557 | NaN | 0.044872 | 0.004628 | -0.036942 | -0.026556 | 0.009508 | 0.018845 | 0.010029 | 0.014406 |
| Education | 0.208034 | -0.016806 | 0.021042 | 1.000000 | NaN | 0.042070 | -0.027128 | 0.016775 | 0.042438 | 0.101589 | ... | -0.009118 | NaN | 0.018422 | 0.148280 | -0.025100 | 0.009819 | 0.069114 | 0.060236 | 0.054254 | 0.069065 |
| EmployeeCount | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| EmployeeNumber | -0.010145 | -0.050990 | 0.032916 | 0.042070 | NaN | 1.000000 | 0.017621 | 0.035179 | -0.006888 | -0.018519 | ... | -0.069861 | NaN | 0.062227 | -0.014365 | 0.023603 | 0.010309 | -0.011240 | -0.008416 | -0.009019 | -0.009197 |
| EnvironmentSatisfaction | 0.010146 | 0.018355 | -0.016075 | -0.027128 | NaN | 0.017621 | 1.000000 | -0.049857 | -0.008278 | 0.001212 | ... | 0.007665 | NaN | 0.003432 | -0.002693 | -0.019359 | 0.027627 | 0.001458 | 0.018007 | 0.016194 | -0.004999 |
| HourlyRate | 0.024287 | 0.023381 | 0.031131 | 0.016775 | NaN | 0.035179 | -0.049857 | 1.000000 | 0.042861 | -0.027853 | ... | 0.001330 | NaN | 0.050263 | -0.002334 | -0.008548 | -0.004607 | -0.019582 | -0.024106 | -0.026716 | -0.020123 |
| JobInvolvement | 0.029820 | 0.046135 | 0.008783 | 0.042438 | NaN | -0.006888 | -0.008278 | 0.042861 | 1.000000 | -0.012630 | ... | 0.034297 | NaN | 0.021523 | -0.005533 | -0.015338 | -0.014617 | -0.021355 | 0.008717 | -0.024184 | 0.025976 |
| JobLevel | 0.509604 | 0.002966 | 0.005303 | 0.101589 | NaN | -0.018519 | 0.001212 | -0.027853 | -0.012630 | 1.000000 | ... | 0.021642 | NaN | 0.013984 | 0.782208 | -0.018191 | 0.037818 | 0.534739 | 0.389447 | 0.353885 | 0.375281 |
| JobSatisfaction | -0.004892 | 0.030571 | -0.003669 | -0.011296 | NaN | -0.046247 | -0.006784 | -0.071335 | -0.021476 | -0.001944 | ... | -0.012454 | NaN | 0.010690 | -0.020185 | -0.005779 | -0.019459 | -0.003803 | -0.002305 | -0.018214 | -0.027656 |
| MonthlyIncome | 0.497855 | 0.007707 | -0.017014 | 0.094961 | NaN | -0.014829 | -0.006259 | -0.015794 | -0.015271 | 0.950300 | ... | 0.025873 | NaN | 0.005408 | 0.772893 | -0.021736 | 0.030683 | 0.514285 | 0.363818 | 0.344978 | 0.344079 |
| MonthlyRate | 0.028051 | -0.032182 | 0.027473 | -0.026084 | NaN | 0.012648 | 0.037600 | -0.015297 | -0.016322 | 0.039563 | ... | -0.004085 | NaN | -0.034323 | 0.026442 | 0.001467 | 0.007963 | -0.023655 | -0.012815 | 0.001567 | -0.036746 |
| NumCompaniesWorked | 0.299635 | 0.038153 | -0.029251 | 0.126317 | NaN | -0.001251 | 0.012594 | 0.022157 | 0.015012 | 0.142501 | ... | 0.052733 | NaN | 0.030075 | 0.237639 | -0.066054 | -0.008366 | -0.118421 | -0.090754 | -0.036814 | -0.110319 |
| PercentSalaryHike | 0.003634 | 0.022704 | 0.040235 | -0.011111 | NaN | -0.012944 | -0.031701 | -0.009062 | -0.017205 | -0.034730 | ... | -0.040490 | NaN | 0.007528 | -0.020608 | -0.005221 | -0.003280 | -0.035991 | -0.001520 | -0.022154 | -0.011985 |
| PerformanceRating | 0.001904 | 0.000473 | 0.027110 | -0.024539 | NaN | -0.020359 | -0.029548 | -0.002172 | -0.029071 | -0.021222 | ... | -0.031351 | NaN | 0.003506 | 0.006744 | -0.015579 | 0.002572 | 0.003435 | 0.034986 | 0.017896 | 0.022827 |
| RelationshipSatisfaction | 0.053535 | 0.007846 | 0.006557 | -0.009118 | NaN | -0.069861 | 0.007665 | 0.001330 | 0.034297 | 0.021642 | ... | 1.000000 | NaN | -0.045952 | 0.024054 | 0.002497 | 0.019604 | 0.019367 | -0.015123 | 0.033493 | -0.000867 |
| StandardHours | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| StockOptionLevel | 0.037510 | 0.042143 | 0.044872 | 0.018422 | NaN | 0.062227 | 0.003432 | 0.050263 | 0.021523 | 0.013984 | ... | -0.045952 | NaN | 1.000000 | 0.010136 | 0.011274 | 0.004129 | 0.015058 | 0.050818 | 0.014352 | 0.024698 |
| TotalWorkingYears | 0.680381 | 0.014515 | 0.004628 | 0.148280 | NaN | -0.014365 | -0.002693 | -0.002334 | -0.005533 | 0.782208 | ... | 0.024054 | NaN | 0.010136 | 1.000000 | -0.035662 | 0.001008 | 0.628133 | 0.460365 | 0.404858 | 0.459188 |
| TrainingTimesLastYear | -0.019621 | 0.002453 | -0.036942 | -0.025100 | NaN | 0.023603 | -0.019359 | -0.008548 | -0.015338 | -0.018191 | ... | 0.002497 | NaN | 0.011274 | -0.035662 | 1.000000 | 0.028072 | 0.003569 | -0.005738 | -0.002067 | -0.004096 |
| WorkLifeBalance | -0.021490 | -0.037848 | -0.026556 | 0.009819 | NaN | 0.010309 | 0.027627 | -0.004607 | -0.014617 | 0.037818 | ... | 0.019604 | NaN | 0.004129 | 0.001008 | 0.028072 | 1.000000 | 0.012089 | 0.049856 | 0.008941 | 0.002759 |
| YearsAtCompany | 0.311309 | -0.034055 | 0.009508 | 0.069114 | NaN | -0.011240 | 0.001458 | -0.019582 | -0.021355 | 0.534739 | ... | 0.019367 | NaN | 0.015058 | 0.628133 | 0.003569 | 0.012089 | 1.000000 | 0.758754 | 0.618409 | 0.769212 |
| YearsInCurrentRole | 0.212901 | 0.009932 | 0.018845 | 0.060236 | NaN | -0.008416 | 0.018007 | -0.024106 | 0.008717 | 0.389447 | ... | -0.015123 | NaN | 0.050818 | 0.460365 | -0.005738 | 0.049856 | 0.758754 | 1.000000 | 0.548056 | 0.714365 |
| YearsSinceLastPromotion | 0.216513 | -0.033229 | 0.010029 | 0.054254 | NaN | -0.009019 | 0.016194 | -0.026716 | -0.024184 | 0.353885 | ... | 0.033493 | NaN | 0.014352 | 0.404858 | -0.002067 | 0.008941 | 0.618409 | 0.548056 | 1.000000 | 0.510224 |
| YearsWithCurrManager | 0.202089 | -0.026363 | 0.014406 | 0.069065 | NaN | -0.009197 | -0.004999 | -0.020123 | 0.025976 | 0.375281 | ... | -0.000867 | NaN | 0.024698 | 0.459188 | -0.004096 | 0.002759 | 0.769212 | 0.714365 | 0.510224 | 1.000000 |
26 rows × 26 columns
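The NaN rows and columns for EmployeeCount and StandardHours in the correlation matrix follow from how the Pearson correlation relates to the covariance: each covariance is divided by the product of the two standard deviations, and a column whose value never changes has a standard deviation of 0. The sketch below illustrates this on hypothetical toy data; `toy`, `const`, and `manual_corr` are names invented for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "const" mimics a column such as StandardHours
# whose value is identical in every row.
toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 5.0, 9.0],
    "const": [80.0, 80.0, 80.0, 80.0],
})

cov = toy.cov()
std = toy.std()

# Pearson correlation = covariance / (std of column i * std of column j).
manual_corr = cov / np.outer(std, std)

print(manual_corr)
print(toy.corr())
```

For the non-constant columns the hand-computed matrix agrees with `toy.corr()`, while every entry involving `const` is NaN because the division is 0 by 0.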
df.index
RangeIndex(start=0, stop=1470, step=1)
df.columns
Index(['Age', 'Attrition', 'BusinessTravel', 'DailyRate', 'Department',
'DistanceFromHome', 'Education', 'EducationField', 'EmployeeCount',
'EmployeeNumber', 'EnvironmentSatisfaction', 'Gender', 'HourlyRate',
'JobInvolvement', 'JobLevel', 'JobRole', 'JobSatisfaction',
'MaritalStatus', 'MonthlyIncome', 'MonthlyRate', 'NumCompaniesWorked',
'Over18', 'OverTime', 'PercentSalaryHike', 'PerformanceRating',
'RelationshipSatisfaction', 'StandardHours', 'StockOptionLevel',
'TotalWorkingYears', 'TrainingTimesLastYear', 'WorkLifeBalance',
'YearsAtCompany', 'YearsInCurrentRole', 'YearsSinceLastPromotion',
'YearsWithCurrManager'],
dtype='object')
df.values
array([[41, 'Yes', 'Travel_Rarely', ..., 4, 0, 5],
[49, 'No', 'Travel_Frequently', ..., 7, 1, 7],
[37, 'Yes', 'Travel_Rarely', ..., 0, 0, 0],
...,
[27, 'No', 'Travel_Rarely', ..., 2, 0, 3],
[49, 'No', 'Travel_Frequently', ..., 6, 0, 8],
[34, 'No', 'Travel_Rarely', ..., 3, 1, 2]], dtype=object)
df.describe()
| | Age | DailyRate | DistanceFromHome | Education | EmployeeCount | EmployeeNumber | EnvironmentSatisfaction | HourlyRate | JobInvolvement | JobLevel | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.0 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 | ... | 1470.000000 | 1470.0 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 | 1470.000000 |
| mean | 36.923810 | 802.485714 | 9.192517 | 2.912925 | 1.0 | 1024.865306 | 2.721769 | 65.891156 | 2.729932 | 2.063946 | ... | 2.712245 | 80.0 | 0.793878 | 11.279592 | 2.799320 | 2.761224 | 7.008163 | 4.229252 | 2.187755 | 4.123129 |
| std | 9.135373 | 403.509100 | 8.106864 | 1.024165 | 0.0 | 602.024335 | 1.093082 | 20.329428 | 0.711561 | 1.106940 | ... | 1.081209 | 0.0 | 0.852077 | 7.780782 | 1.289271 | 0.706476 | 6.126525 | 3.623137 | 3.222430 | 3.568136 |
| min | 18.000000 | 102.000000 | 1.000000 | 1.000000 | 1.0 | 1.000000 | 1.000000 | 30.000000 | 1.000000 | 1.000000 | ... | 1.000000 | 80.0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 25% | 30.000000 | 465.000000 | 2.000000 | 2.000000 | 1.0 | 491.250000 | 2.000000 | 48.000000 | 2.000000 | 1.000000 | ... | 2.000000 | 80.0 | 0.000000 | 6.000000 | 2.000000 | 2.000000 | 3.000000 | 2.000000 | 0.000000 | 2.000000 |
| 50% | 36.000000 | 802.000000 | 7.000000 | 3.000000 | 1.0 | 1020.500000 | 3.000000 | 66.000000 | 3.000000 | 2.000000 | ... | 3.000000 | 80.0 | 1.000000 | 10.000000 | 3.000000 | 3.000000 | 5.000000 | 3.000000 | 1.000000 | 3.000000 |
| 75% | 43.000000 | 1157.000000 | 14.000000 | 4.000000 | 1.0 | 1555.750000 | 4.000000 | 83.750000 | 3.000000 | 3.000000 | ... | 4.000000 | 80.0 | 1.000000 | 15.000000 | 3.000000 | 3.000000 | 9.000000 | 7.000000 | 3.000000 | 7.000000 |
| max | 60.000000 | 1499.000000 | 29.000000 | 5.000000 | 1.0 | 2068.000000 | 4.000000 | 100.000000 | 4.000000 | 5.000000 | ... | 4.000000 | 80.0 | 3.000000 | 40.000000 | 6.000000 | 4.000000 | 40.000000 | 18.000000 | 15.000000 | 17.000000 |
8 rows × 26 columns
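The summary above has 26 columns rather than 35 because `describe()` covers only the numeric columns by default. As a hedged sketch on hypothetical data (the frame `mini` is invented for this example), the non-numeric columns can be summarized by excluding the numeric ones:

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame mixing a numeric column and a string column.
mini = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": [
        "Sales",
        "Research & Development",
        "Research & Development",
        "Research & Development",
    ],
})

# Default behaviour: numeric columns only (count, mean, std, min, quartiles, max).
numeric_summary = mini.describe()

# Excluding numeric columns summarizes the rest (count, unique, top, freq).
text_summary = mini.describe(exclude=[np.number])

print(numeric_summary)
print(text_summary)
```

Here `top` is the most frequent value and `freq` is how often it occurs.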
df.head(10)
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 41 | Yes | Travel_Rarely | 1102 | Sales | 1 | 2 | Life Sciences | 1 | 1 | ... | 1 | 80 | 0 | 8 | 0 | 1 | 6 | 4 | 0 | 5 |
| 1 | 49 | No | Travel_Frequently | 279 | Research & Development | 8 | 1 | Life Sciences | 1 | 2 | ... | 4 | 80 | 1 | 10 | 3 | 3 | 10 | 7 | 1 | 7 |
| 2 | 37 | Yes | Travel_Rarely | 1373 | Research & Development | 2 | 2 | Other | 1 | 4 | ... | 2 | 80 | 0 | 7 | 3 | 3 | 0 | 0 | 0 | 0 |
| 3 | 33 | No | Travel_Frequently | 1392 | Research & Development | 3 | 4 | Life Sciences | 1 | 5 | ... | 3 | 80 | 0 | 8 | 3 | 3 | 8 | 7 | 3 | 0 |
| 4 | 27 | No | Travel_Rarely | 591 | Research & Development | 2 | 1 | Medical | 1 | 7 | ... | 4 | 80 | 1 | 6 | 3 | 3 | 2 | 2 | 2 | 2 |
| 5 | 32 | No | Travel_Frequently | 1005 | Research & Development | 2 | 2 | Life Sciences | 1 | 8 | ... | 3 | 80 | 0 | 8 | 2 | 2 | 7 | 7 | 3 | 6 |
| 6 | 59 | No | Travel_Rarely | 1324 | Research & Development | 3 | 3 | Medical | 1 | 10 | ... | 1 | 80 | 3 | 12 | 3 | 2 | 1 | 0 | 0 | 0 |
| 7 | 30 | No | Travel_Rarely | 1358 | Research & Development | 24 | 1 | Life Sciences | 1 | 11 | ... | 2 | 80 | 1 | 1 | 2 | 3 | 1 | 0 | 0 | 0 |
| 8 | 38 | No | Travel_Frequently | 216 | Research & Development | 23 | 3 | Life Sciences | 1 | 12 | ... | 2 | 80 | 0 | 10 | 2 | 3 | 9 | 7 | 1 | 8 |
| 9 | 36 | No | Travel_Rarely | 1299 | Research & Development | 27 | 3 | Medical | 1 | 13 | ... | 2 | 80 | 2 | 17 | 3 | 2 | 7 | 7 | 7 | 7 |
10 rows × 35 columns
df.tail(10)
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1460 | 29 | No | Travel_Rarely | 468 | Research & Development | 28 | 4 | Medical | 1 | 2054 | ... | 2 | 80 | 0 | 5 | 3 | 1 | 5 | 4 | 0 | 4 |
| 1461 | 50 | Yes | Travel_Rarely | 410 | Sales | 28 | 3 | Marketing | 1 | 2055 | ... | 2 | 80 | 1 | 20 | 3 | 3 | 3 | 2 | 2 | 0 |
| 1462 | 39 | No | Travel_Rarely | 722 | Sales | 24 | 1 | Marketing | 1 | 2056 | ... | 1 | 80 | 1 | 21 | 2 | 2 | 20 | 9 | 9 | 6 |
| 1463 | 31 | No | Non-Travel | 325 | Research & Development | 5 | 3 | Medical | 1 | 2057 | ... | 2 | 80 | 0 | 10 | 2 | 3 | 9 | 4 | 1 | 7 |
| 1464 | 26 | No | Travel_Rarely | 1167 | Sales | 5 | 3 | Other | 1 | 2060 | ... | 4 | 80 | 0 | 5 | 2 | 3 | 4 | 2 | 0 | 0 |
| 1465 | 36 | No | Travel_Frequently | 884 | Research & Development | 23 | 2 | Medical | 1 | 2061 | ... | 3 | 80 | 1 | 17 | 3 | 3 | 5 | 2 | 0 | 3 |
| 1466 | 39 | No | Travel_Rarely | 613 | Research & Development | 6 | 1 | Medical | 1 | 2062 | ... | 1 | 80 | 1 | 9 | 5 | 3 | 7 | 7 | 1 | 7 |
| 1467 | 27 | No | Travel_Rarely | 155 | Research & Development | 4 | 3 | Life Sciences | 1 | 2064 | ... | 2 | 80 | 1 | 6 | 0 | 3 | 6 | 2 | 0 | 3 |
| 1468 | 49 | No | Travel_Frequently | 1023 | Sales | 2 | 3 | Medical | 1 | 2065 | ... | 4 | 80 | 0 | 17 | 3 | 2 | 9 | 6 | 0 | 8 |
| 1469 | 34 | No | Travel_Rarely | 628 | Research & Development | 8 | 3 | Medical | 1 | 2068 | ... | 1 | 80 | 0 | 6 | 3 | 4 | 4 | 3 | 1 | 2 |
10 rows × 35 columns
df["Department"]
0 Sales
1 Research & Development
2 Research & Development
3 Research & Development
4 Research & Development
5 Research & Development
6 Research & Development
7 Research & Development
8 Research & Development
9 Research & Development
10 Research & Development
11 Research & Development
12 Research & Development
13 Research & Development
14 Research & Development
15 Research & Development
16 Research & Development
17 Research & Development
18 Sales
19 Research & Development
20 Research & Development
21 Sales
22 Research & Development
23 Research & Development
24 Research & Development
25 Research & Development
26 Research & Development
27 Sales
28 Research & Development
29 Sales
...
1440 Research & Development
1441 Research & Development
1442 Research & Development
1443 Research & Development
1444 Research & Development
1445 Research & Development
1446 Sales
1447 Sales
1448 Sales
1449 Research & Development
1450 Human Resources
1451 Sales
1452 Sales
1453 Sales
1454 Sales
1455 Research & Development
1456 Research & Development
1457 Research & Development
1458 Research & Development
1459 Research & Development
1460 Research & Development
1461 Sales
1462 Sales
1463 Research & Development
1464 Sales
1465 Research & Development
1466 Research & Development
1467 Research & Development
1468 Sales
1469 Research & Development
Name: Department, Length: 1470, dtype: object
df[["Department","Education"]]
| | Department | Education |
|---|---|---|
| 0 | Sales | 2 |
| 1 | Research & Development | 1 |
| 2 | Research & Development | 2 |
| 3 | Research & Development | 4 |
| 4 | Research & Development | 1 |
| 5 | Research & Development | 2 |
| 6 | Research & Development | 3 |
| 7 | Research & Development | 1 |
| 8 | Research & Development | 3 |
| 9 | Research & Development | 3 |
| 10 | Research & Development | 3 |
| 11 | Research & Development | 2 |
| 12 | Research & Development | 1 |
| 13 | Research & Development | 2 |
| 14 | Research & Development | 3 |
| 15 | Research & Development | 4 |
| 16 | Research & Development | 2 |
| 17 | Research & Development | 2 |
| 18 | Sales | 4 |
| 19 | Research & Development | 3 |
| 20 | Research & Development | 2 |
| 21 | Sales | 4 |
| 22 | Research & Development | 4 |
| 23 | Research & Development | 2 |
| 24 | Research & Development | 1 |
| 25 | Research & Development | 3 |
| 26 | Research & Development | 1 |
| 27 | Sales | 4 |
| 28 | Research & Development | 4 |
| 29 | Sales | 4 |
| ... | ... | ... |
| 1440 | Research & Development | 2 |
| 1441 | Research & Development | 4 |
| 1442 | Research & Development | 4 |
| 1443 | Research & Development | 3 |
| 1444 | Research & Development | 2 |
| 1445 | Research & Development | 4 |
| 1446 | Sales | 3 |
| 1447 | Sales | 4 |
| 1448 | Sales | 3 |
| 1449 | Research & Development | 3 |
| 1450 | Human Resources | 4 |
| 1451 | Sales | 2 |
| 1452 | Sales | 4 |
| 1453 | Sales | 4 |
| 1454 | Sales | 3 |
| 1455 | Research & Development | 4 |
| 1456 | Research & Development | 4 |
| 1457 | Research & Development | 4 |
| 1458 | Research & Development | 4 |
| 1459 | Research & Development | 2 |
| 1460 | Research & Development | 4 |
| 1461 | Sales | 3 |
| 1462 | Sales | 1 |
| 1463 | Research & Development | 3 |
| 1464 | Sales | 3 |
| 1465 | Research & Development | 2 |
| 1466 | Research & Development | 1 |
| 1467 | Research & Development | 3 |
| 1468 | Sales | 3 |
| 1469 | Research & Development | 3 |
1470 rows × 2 columns
# Use list notation to select multiple columns.
df.loc[:, ["Department", "Education"]]
| | Department | Education |
|---|---|---|
| 0 | Sales | 2 |
| 1 | Research & Development | 1 |
| 2 | Research & Development | 2 |
| 3 | Research & Development | 4 |
| 4 | Research & Development | 1 |
| 5 | Research & Development | 2 |
| 6 | Research & Development | 3 |
| 7 | Research & Development | 1 |
| 8 | Research & Development | 3 |
| 9 | Research & Development | 3 |
| 10 | Research & Development | 3 |
| 11 | Research & Development | 2 |
| 12 | Research & Development | 1 |
| 13 | Research & Development | 2 |
| 14 | Research & Development | 3 |
| 15 | Research & Development | 4 |
| 16 | Research & Development | 2 |
| 17 | Research & Development | 2 |
| 18 | Sales | 4 |
| 19 | Research & Development | 3 |
| 20 | Research & Development | 2 |
| 21 | Sales | 4 |
| 22 | Research & Development | 4 |
| 23 | Research & Development | 2 |
| 24 | Research & Development | 1 |
| 25 | Research & Development | 3 |
| 26 | Research & Development | 1 |
| 27 | Sales | 4 |
| 28 | Research & Development | 4 |
| 29 | Sales | 4 |
| ... | ... | ... |
| 1440 | Research & Development | 2 |
| 1441 | Research & Development | 4 |
| 1442 | Research & Development | 4 |
| 1443 | Research & Development | 3 |
| 1444 | Research & Development | 2 |
| 1445 | Research & Development | 4 |
| 1446 | Sales | 3 |
| 1447 | Sales | 4 |
| 1448 | Sales | 3 |
| 1449 | Research & Development | 3 |
| 1450 | Human Resources | 4 |
| 1451 | Sales | 2 |
| 1452 | Sales | 4 |
| 1453 | Sales | 4 |
| 1454 | Sales | 3 |
| 1455 | Research & Development | 4 |
| 1456 | Research & Development | 4 |
| 1457 | Research & Development | 4 |
| 1458 | Research & Development | 4 |
| 1459 | Research & Development | 2 |
| 1460 | Research & Development | 4 |
| 1461 | Sales | 3 |
| 1462 | Sales | 1 |
| 1463 | Research & Development | 3 |
| 1464 | Sales | 3 |
| 1465 | Research & Development | 2 |
| 1466 | Research & Development | 1 |
| 1467 | Research & Development | 3 |
| 1468 | Sales | 3 |
| 1469 | Research & Development | 3 |
1470 rows × 2 columns
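The difference between passing a single label and passing a list of labels can be checked on a small frame. This is a minimal sketch, assuming a hypothetical four-row frame `toy` built from the first rows shown above (it is not the full dataset): a single label returns a Series, while a list returns a DataFrame even when it holds only one label.

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

one_col = toy["Department"]                    # single label -> Series
one_col_df = toy[["Department"]]               # one-element list -> DataFrame
two_cols = toy.loc[:, ["Department", "Education"]]

print(type(one_col).__name__, one_col.shape)
print(type(one_col_df).__name__, one_col_df.shape)
print(type(two_cols).__name__, two_cols.shape)
```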
# The ":" in the row position selects all rows.
df.loc[:,"Department"]
0 Sales
1 Research & Development
2 Research & Development
3 Research & Development
4 Research & Development
5 Research & Development
6 Research & Development
7 Research & Development
8 Research & Development
9 Research & Development
10 Research & Development
11 Research & Development
12 Research & Development
13 Research & Development
14 Research & Development
15 Research & Development
16 Research & Development
17 Research & Development
18 Sales
19 Research & Development
20 Research & Development
21 Sales
22 Research & Development
23 Research & Development
24 Research & Development
25 Research & Development
26 Research & Development
27 Sales
28 Research & Development
29 Sales
...
1440 Research & Development
1441 Research & Development
1442 Research & Development
1443 Research & Development
1444 Research & Development
1445 Research & Development
1446 Sales
1447 Sales
1448 Sales
1449 Research & Development
1450 Human Resources
1451 Sales
1452 Sales
1453 Sales
1454 Sales
1455 Research & Development
1456 Research & Development
1457 Research & Development
1458 Research & Development
1459 Research & Development
1460 Research & Development
1461 Sales
1462 Sales
1463 Research & Development
1464 Sales
1465 Research & Development
1466 Research & Development
1467 Research & Development
1468 Sales
1469 Research & Development
Name: Department, Length: 1470, dtype: object
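The row position of `loc` does not have to be `:`. As a rough illustration, assuming the same kind of hypothetical four-row frame `toy` (not the real dataset), it can also take a label slice or a boolean condition. Note that a label slice in `loc` includes both endpoints.

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

# Label slice: rows labelled 0 and 1 (the end label is included).
rows_0_to_1 = toy.loc[0:1, "Department"]

# Boolean condition: only the rows whose Department is "Sales".
sales_only = toy.loc[toy["Department"] == "Sales", ["Age", "Department"]]

print(rows_0_to_1)
print(sales_only)
```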
# Select by integer position.
df.iloc[:, 0]
0 41
1 49
2 37
3 33
4 27
5 32
6 59
7 30
8 38
9 36
10 35
11 29
12 31
13 34
14 28
15 29
16 32
17 22
18 53
19 38
20 24
21 36
22 34
23 21
24 34
25 53
26 32
27 42
28 44
29 46
..
1440 36
1441 56
1442 29
1443 42
1444 56
1445 41
1446 34
1447 36
1448 41
1449 32
1450 35
1451 38
1452 50
1453 36
1454 45
1455 40
1456 35
1457 40
1458 35
1459 29
1460 29
1461 50
1462 39
1463 31
1464 26
1465 36
1466 39
1467 27
1468 49
1469 34
Name: Age, Length: 1470, dtype: int64
# For several consecutive columns, use a slice. A list such as [0, 1] also works.
df.iloc[:, 0:2]
| | Age | Attrition |
|---|---|---|
| 0 | 41 | Yes |
| 1 | 49 | No |
| 2 | 37 | Yes |
| 3 | 33 | No |
| 4 | 27 | No |
| 5 | 32 | No |
| 6 | 59 | No |
| 7 | 30 | No |
| 8 | 38 | No |
| 9 | 36 | No |
| 10 | 35 | No |
| 11 | 29 | No |
| 12 | 31 | No |
| 13 | 34 | No |
| 14 | 28 | Yes |
| 15 | 29 | No |
| 16 | 32 | No |
| 17 | 22 | No |
| 18 | 53 | No |
| 19 | 38 | No |
| 20 | 24 | No |
| 21 | 36 | Yes |
| 22 | 34 | No |
| 23 | 21 | No |
| 24 | 34 | Yes |
| 25 | 53 | No |
| 26 | 32 | Yes |
| 27 | 42 | No |
| 28 | 44 | No |
| 29 | 46 | No |
| ... | ... | ... |
| 1440 | 36 | No |
| 1441 | 56 | No |
| 1442 | 29 | Yes |
| 1443 | 42 | No |
| 1444 | 56 | Yes |
| 1445 | 41 | No |
| 1446 | 34 | No |
| 1447 | 36 | No |
| 1448 | 41 | No |
| 1449 | 32 | No |
| 1450 | 35 | No |
| 1451 | 38 | No |
| 1452 | 50 | Yes |
| 1453 | 36 | No |
| 1454 | 45 | No |
| 1455 | 40 | No |
| 1456 | 35 | No |
| 1457 | 40 | No |
| 1458 | 35 | No |
| 1459 | 29 | No |
| 1460 | 29 | No |
| 1461 | 50 | Yes |
| 1462 | 39 | No |
| 1463 | 31 | No |
| 1464 | 26 | No |
| 1465 | 36 | No |
| 1466 | 39 | No |
| 1467 | 27 | No |
| 1468 | 49 | No |
| 1469 | 34 | No |
1470 rows × 2 columns
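The slice and list forms of `iloc` can be compared directly. The sketch below assumes a hypothetical four-row frame `toy` (not the real dataset). Unlike `loc`, a positional slice excludes its end, so `0:2` means positions 0 and 1. A list is the form to use when the columns are not adjacent.

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

by_slice = toy.iloc[:, 0:2]        # positions 0 and 1 (end is excluded)
by_list = toy.iloc[:, [0, 1]]      # same columns, written as a list
non_adjacent = toy.iloc[:, [0, 2]] # a list can skip columns

print(by_slice.equals(by_list))
print(list(non_adjacent.columns))
```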
df.T
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 1460 | 1461 | 1462 | 1463 | 1464 | 1465 | 1466 | 1467 | 1468 | 1469 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 41 | 49 | 37 | 33 | 27 | 32 | 59 | 30 | 38 | 36 | ... | 29 | 50 | 39 | 31 | 26 | 36 | 39 | 27 | 49 | 34 |
| Attrition | Yes | No | Yes | No | No | No | No | No | No | No | ... | No | Yes | No | No | No | No | No | No | No | No |
| BusinessTravel | Travel_Rarely | Travel_Frequently | Travel_Rarely | Travel_Frequently | Travel_Rarely | Travel_Frequently | Travel_Rarely | Travel_Rarely | Travel_Frequently | Travel_Rarely | ... | Travel_Rarely | Travel_Rarely | Travel_Rarely | Non-Travel | Travel_Rarely | Travel_Frequently | Travel_Rarely | Travel_Rarely | Travel_Frequently | Travel_Rarely |
| DailyRate | 1102 | 279 | 1373 | 1392 | 591 | 1005 | 1324 | 1358 | 216 | 1299 | ... | 468 | 410 | 722 | 325 | 1167 | 884 | 613 | 155 | 1023 | 628 |
| Department | Sales | Research & Development | Research & Development | Research & Development | Research & Development | Research & Development | Research & Development | Research & Development | Research & Development | Research & Development | ... | Research & Development | Sales | Sales | Research & Development | Sales | Research & Development | Research & Development | Research & Development | Sales | Research & Development |
| DistanceFromHome | 1 | 8 | 2 | 3 | 2 | 2 | 3 | 24 | 23 | 27 | ... | 28 | 28 | 24 | 5 | 5 | 23 | 6 | 4 | 2 | 8 |
| Education | 2 | 1 | 2 | 4 | 1 | 2 | 3 | 1 | 3 | 3 | ... | 4 | 3 | 1 | 3 | 3 | 2 | 1 | 3 | 3 | 3 |
| EducationField | Life Sciences | Life Sciences | Other | Life Sciences | Medical | Life Sciences | Medical | Life Sciences | Life Sciences | Medical | ... | Medical | Marketing | Marketing | Medical | Other | Medical | Medical | Life Sciences | Medical | Medical |
| EmployeeCount | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| EmployeeNumber | 1 | 2 | 4 | 5 | 7 | 8 | 10 | 11 | 12 | 13 | ... | 2054 | 2055 | 2056 | 2057 | 2060 | 2061 | 2062 | 2064 | 2065 | 2068 |
| EnvironmentSatisfaction | 2 | 3 | 4 | 4 | 1 | 4 | 3 | 4 | 4 | 3 | ... | 4 | 4 | 2 | 2 | 4 | 3 | 4 | 2 | 4 | 2 |
| Gender | Female | Male | Male | Female | Male | Male | Female | Male | Male | Male | ... | Female | Male | Female | Male | Female | Male | Male | Male | Male | Male |
| HourlyRate | 94 | 61 | 92 | 56 | 40 | 79 | 81 | 67 | 44 | 94 | ... | 73 | 39 | 60 | 74 | 30 | 41 | 42 | 87 | 63 | 82 |
| JobInvolvement | 3 | 2 | 2 | 3 | 3 | 3 | 4 | 3 | 2 | 3 | ... | 2 | 2 | 2 | 3 | 2 | 4 | 2 | 4 | 2 | 4 |
| JobLevel | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 2 | ... | 1 | 3 | 4 | 2 | 1 | 2 | 3 | 2 | 2 | 2 |
| JobRole | Sales Executive | Research Scientist | Laboratory Technician | Research Scientist | Laboratory Technician | Laboratory Technician | Laboratory Technician | Laboratory Technician | Manufacturing Director | Healthcare Representative | ... | Research Scientist | Sales Executive | Sales Executive | Manufacturing Director | Sales Representative | Laboratory Technician | Healthcare Representative | Manufacturing Director | Sales Executive | Laboratory Technician |
| JobSatisfaction | 4 | 2 | 3 | 3 | 2 | 4 | 1 | 3 | 3 | 3 | ... | 1 | 1 | 4 | 1 | 3 | 4 | 1 | 2 | 2 | 3 |
| MaritalStatus | Single | Married | Single | Married | Married | Single | Married | Divorced | Single | Married | ... | Single | Divorced | Married | Single | Single | Married | Married | Married | Married | Married |
| MonthlyIncome | 5993 | 5130 | 2090 | 2909 | 3468 | 3068 | 2670 | 2693 | 9526 | 5237 | ... | 3785 | 10854 | 12031 | 9936 | 2966 | 2571 | 9991 | 6142 | 5390 | 4404 |
| MonthlyRate | 19479 | 24907 | 2396 | 23159 | 16632 | 11864 | 9964 | 13335 | 8787 | 16577 | ... | 8489 | 16586 | 8828 | 3787 | 21378 | 12290 | 21457 | 5174 | 13243 | 10228 |
| NumCompaniesWorked | 8 | 1 | 6 | 1 | 9 | 0 | 4 | 1 | 0 | 6 | ... | 1 | 4 | 0 | 0 | 0 | 4 | 4 | 1 | 2 | 2 |
| Over18 | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | ... | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| OverTime | Yes | No | Yes | Yes | No | No | Yes | No | No | No | ... | No | Yes | No | No | No | No | No | Yes | No | No |
| PercentSalaryHike | 11 | 23 | 15 | 11 | 12 | 13 | 20 | 22 | 21 | 13 | ... | 14 | 13 | 11 | 19 | 18 | 17 | 15 | 20 | 14 | 12 |
| PerformanceRating | 3 | 4 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 3 | ... | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 3 | 3 |
| RelationshipSatisfaction | 1 | 4 | 2 | 3 | 4 | 3 | 1 | 2 | 2 | 2 | ... | 2 | 2 | 1 | 2 | 4 | 3 | 1 | 2 | 4 | 1 |
| StandardHours | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 80 | ... | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| StockOptionLevel | 0 | 1 | 0 | 0 | 1 | 0 | 3 | 1 | 0 | 2 | ... | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
| TotalWorkingYears | 8 | 10 | 7 | 8 | 6 | 8 | 12 | 1 | 10 | 17 | ... | 5 | 20 | 21 | 10 | 5 | 17 | 9 | 6 | 17 | 6 |
| TrainingTimesLastYear | 0 | 3 | 3 | 3 | 3 | 2 | 3 | 2 | 2 | 3 | ... | 3 | 3 | 2 | 2 | 2 | 3 | 5 | 0 | 3 | 3 |
| WorkLifeBalance | 1 | 3 | 3 | 3 | 3 | 2 | 2 | 3 | 3 | 2 | ... | 1 | 3 | 2 | 3 | 3 | 3 | 3 | 3 | 2 | 4 |
| YearsAtCompany | 6 | 10 | 0 | 8 | 2 | 7 | 1 | 1 | 9 | 7 | ... | 5 | 3 | 20 | 9 | 4 | 5 | 7 | 6 | 9 | 4 |
| YearsInCurrentRole | 4 | 7 | 0 | 7 | 2 | 7 | 0 | 0 | 7 | 7 | ... | 4 | 2 | 9 | 4 | 2 | 2 | 7 | 2 | 6 | 3 |
| YearsSinceLastPromotion | 0 | 1 | 0 | 3 | 2 | 3 | 0 | 0 | 1 | 7 | ... | 0 | 2 | 9 | 1 | 0 | 0 | 1 | 0 | 0 | 1 |
| YearsWithCurrManager | 5 | 7 | 0 | 0 | 2 | 6 | 0 | 0 | 8 | 7 | ... | 4 | 0 | 6 | 7 | 0 | 3 | 7 | 3 | 8 | 2 |
35 rows × 1470 columns
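One side effect of `df.T` on a frame that mixes numbers and strings is worth checking: each transposed column has to hold both kinds of value, so the dtypes fall back to `object`. A minimal sketch, assuming a hypothetical four-row frame `toy` (not the real dataset):

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

t = toy.T                      # column labels become the index: 3 rows x 4 columns
print(t.shape)
print(t.dtypes)                # every transposed column is object

# Transposing back does not restore the numeric dtypes by itself;
# infer_objects() asks pandas to re-infer them.
restored = t.T.infer_objects()
print(restored.dtypes)
```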
df.sort_index(axis=1, ascending=False)
| | YearsWithCurrManager | YearsSinceLastPromotion | YearsInCurrentRole | YearsAtCompany | WorkLifeBalance | TrainingTimesLastYear | TotalWorkingYears | StockOptionLevel | StandardHours | RelationshipSatisfaction | ... | EmployeeNumber | EmployeeCount | EducationField | Education | DistanceFromHome | Department | DailyRate | BusinessTravel | Attrition | Age |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 0 | 4 | 6 | 1 | 0 | 8 | 0 | 80 | 1 | ... | 1 | 1 | Life Sciences | 2 | 1 | Sales | 1102 | Travel_Rarely | Yes | 41 |
| 1 | 7 | 1 | 7 | 10 | 3 | 3 | 10 | 1 | 80 | 4 | ... | 2 | 1 | Life Sciences | 1 | 8 | Research & Development | 279 | Travel_Frequently | No | 49 |
| 2 | 0 | 0 | 0 | 0 | 3 | 3 | 7 | 0 | 80 | 2 | ... | 4 | 1 | Other | 2 | 2 | Research & Development | 1373 | Travel_Rarely | Yes | 37 |
| 3 | 0 | 3 | 7 | 8 | 3 | 3 | 8 | 0 | 80 | 3 | ... | 5 | 1 | Life Sciences | 4 | 3 | Research & Development | 1392 | Travel_Frequently | No | 33 |
| 4 | 2 | 2 | 2 | 2 | 3 | 3 | 6 | 1 | 80 | 4 | ... | 7 | 1 | Medical | 1 | 2 | Research & Development | 591 | Travel_Rarely | No | 27 |
| 5 | 6 | 3 | 7 | 7 | 2 | 2 | 8 | 0 | 80 | 3 | ... | 8 | 1 | Life Sciences | 2 | 2 | Research & Development | 1005 | Travel_Frequently | No | 32 |
| 6 | 0 | 0 | 0 | 1 | 2 | 3 | 12 | 3 | 80 | 1 | ... | 10 | 1 | Medical | 3 | 3 | Research & Development | 1324 | Travel_Rarely | No | 59 |
| 7 | 0 | 0 | 0 | 1 | 3 | 2 | 1 | 1 | 80 | 2 | ... | 11 | 1 | Life Sciences | 1 | 24 | Research & Development | 1358 | Travel_Rarely | No | 30 |
| 8 | 8 | 1 | 7 | 9 | 3 | 2 | 10 | 0 | 80 | 2 | ... | 12 | 1 | Life Sciences | 3 | 23 | Research & Development | 216 | Travel_Frequently | No | 38 |
| 9 | 7 | 7 | 7 | 7 | 2 | 3 | 17 | 2 | 80 | 2 | ... | 13 | 1 | Medical | 3 | 27 | Research & Development | 1299 | Travel_Rarely | No | 36 |
| 10 | 3 | 0 | 4 | 5 | 3 | 5 | 6 | 1 | 80 | 3 | ... | 14 | 1 | Medical | 3 | 16 | Research & Development | 809 | Travel_Rarely | No | 35 |
| 11 | 8 | 0 | 5 | 9 | 3 | 3 | 10 | 0 | 80 | 4 | ... | 15 | 1 | Life Sciences | 2 | 15 | Research & Development | 153 | Travel_Rarely | No | 29 |
| 12 | 3 | 4 | 2 | 5 | 2 | 1 | 5 | 1 | 80 | 4 | ... | 16 | 1 | Life Sciences | 1 | 26 | Research & Development | 670 | Travel_Rarely | No | 31 |
| 13 | 2 | 1 | 2 | 2 | 3 | 2 | 3 | 1 | 80 | 3 | ... | 18 | 1 | Medical | 2 | 19 | Research & Development | 1346 | Travel_Rarely | No | 34 |
| 14 | 3 | 0 | 2 | 4 | 3 | 4 | 6 | 0 | 80 | 2 | ... | 19 | 1 | Life Sciences | 3 | 24 | Research & Development | 103 | Travel_Rarely | Yes | 28 |
| 15 | 8 | 8 | 9 | 10 | 3 | 1 | 10 | 1 | 80 | 3 | ... | 20 | 1 | Life Sciences | 4 | 21 | Research & Development | 1389 | Travel_Rarely | No | 29 |
| 16 | 5 | 0 | 2 | 6 | 2 | 5 | 7 | 2 | 80 | 4 | ... | 21 | 1 | Life Sciences | 2 | 5 | Research & Development | 334 | Travel_Rarely | No | 32 |
| 17 | 0 | 0 | 0 | 1 | 2 | 2 | 1 | 2 | 80 | 2 | ... | 22 | 1 | Medical | 2 | 16 | Research & Development | 1123 | Non-Travel | No | 22 |
| 18 | 7 | 3 | 8 | 25 | 3 | 3 | 31 | 0 | 80 | 3 | ... | 23 | 1 | Life Sciences | 4 | 2 | Sales | 1219 | Travel_Rarely | No | 53 |
| 19 | 2 | 1 | 2 | 3 | 3 | 3 | 6 | 0 | 80 | 3 | ... | 24 | 1 | Life Sciences | 3 | 2 | Research & Development | 371 | Travel_Rarely | No | 38 |
| 20 | 3 | 1 | 2 | 4 | 2 | 5 | 5 | 1 | 80 | 4 | ... | 26 | 1 | Other | 2 | 11 | Research & Development | 673 | Non-Travel | No | 24 |
| 21 | 3 | 0 | 3 | 5 | 3 | 4 | 10 | 0 | 80 | 2 | ... | 27 | 1 | Life Sciences | 4 | 9 | Sales | 1218 | Travel_Rarely | Yes | 36 |
| 22 | 11 | 2 | 6 | 12 | 3 | 4 | 13 | 0 | 80 | 3 | ... | 28 | 1 | Life Sciences | 4 | 7 | Research & Development | 419 | Travel_Rarely | No | 34 |
| 23 | 0 | 0 | 0 | 0 | 3 | 6 | 0 | 0 | 80 | 4 | ... | 30 | 1 | Life Sciences | 2 | 15 | Research & Development | 391 | Travel_Rarely | No | 21 |
| 24 | 3 | 1 | 2 | 4 | 3 | 2 | 8 | 0 | 80 | 3 | ... | 31 | 1 | Medical | 1 | 6 | Research & Development | 699 | Travel_Rarely | Yes | 34 |
| 25 | 8 | 4 | 13 | 14 | 2 | 3 | 26 | 1 | 80 | 4 | ... | 32 | 1 | Other | 3 | 5 | Research & Development | 1282 | Travel_Rarely | No | 53 |
| 26 | 7 | 6 | 2 | 10 | 3 | 5 | 10 | 0 | 80 | 2 | ... | 33 | 1 | Life Sciences | 1 | 16 | Research & Development | 1125 | Travel_Frequently | Yes | 32 |
| 27 | 2 | 4 | 7 | 9 | 3 | 2 | 10 | 1 | 80 | 4 | ... | 35 | 1 | Marketing | 4 | 8 | Sales | 691 | Travel_Rarely | No | 42 |
| 28 | 17 | 5 | 6 | 22 | 3 | 4 | 24 | 1 | 80 | 4 | ... | 36 | 1 | Medical | 4 | 7 | Research & Development | 477 | Travel_Rarely | No | 44 |
| 29 | 1 | 2 | 2 | 2 | 2 | 2 | 22 | 0 | 80 | 4 | ... | 38 | 1 | Marketing | 4 | 2 | Sales | 705 | Travel_Rarely | No | 46 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1440 | 2 | 0 | 2 | 4 | 3 | 3 | 18 | 3 | 80 | 2 | ... | 2025 | 1 | Life Sciences | 2 | 4 | Research & Development | 688 | Travel_Frequently | No | 36 |
| 1441 | 9 | 1 | 12 | 13 | 2 | 2 | 13 | 1 | 80 | 1 | ... | 2026 | 1 | Life Sciences | 4 | 1 | Research & Development | 667 | Non-Travel | No | 56 |
| 1442 | 2 | 2 | 2 | 2 | 4 | 3 | 4 | 3 | 80 | 2 | ... | 2027 | 1 | Medical | 4 | 1 | Research & Development | 1092 | Travel_Rarely | Yes | 29 |
| 1443 | 14 | 4 | 6 | 22 | 2 | 2 | 24 | 0 | 80 | 1 | ... | 2031 | 1 | Life Sciences | 3 | 2 | Research & Development | 300 | Travel_Rarely | No | 42 |
| 1444 | 8 | 9 | 9 | 10 | 1 | 4 | 14 | 1 | 80 | 4 | ... | 2032 | 1 | Technical Degree | 2 | 7 | Research & Development | 310 | Travel_Rarely | Yes | 56 |
| 1445 | 10 | 0 | 7 | 20 | 3 | 3 | 21 | 1 | 80 | 3 | ... | 2034 | 1 | Life Sciences | 4 | 28 | Research & Development | 582 | Travel_Rarely | No | 41 |
| 1446 | 7 | 1 | 7 | 8 | 3 | 2 | 8 | 2 | 80 | 4 | ... | 2035 | 1 | Marketing | 3 | 28 | Sales | 704 | Travel_Rarely | No | 34 |
| 1447 | 11 | 11 | 12 | 15 | 2 | 4 | 15 | 1 | 80 | 1 | ... | 2036 | 1 | Marketing | 4 | 15 | Sales | 301 | Non-Travel | No | 36 |
| 1448 | 4 | 0 | 4 | 5 | 3 | 5 | 14 | 1 | 80 | 3 | ... | 2037 | 1 | Life Sciences | 3 | 3 | Sales | 930 | Travel_Rarely | No | 41 |
| 1449 | 2 | 1 | 2 | 4 | 3 | 4 | 4 | 0 | 80 | 4 | ... | 2038 | 1 | Technical Degree | 3 | 2 | Research & Development | 529 | Travel_Rarely | No | 32 |
| 1450 | 7 | 1 | 0 | 9 | 3 | 2 | 9 | 0 | 80 | 3 | ... | 2040 | 1 | Life Sciences | 4 | 26 | Human Resources | 1146 | Travel_Rarely | No | 35 |
| 1451 | 9 | 1 | 7 | 10 | 3 | 1 | 10 | 1 | 80 | 3 | ... | 2041 | 1 | Life Sciences | 2 | 10 | Sales | 345 | Travel_Rarely | No | 38 |
| 1452 | 1 | 0 | 3 | 6 | 3 | 3 | 12 | 2 | 80 | 4 | ... | 2044 | 1 | Life Sciences | 4 | 1 | Sales | 878 | Travel_Frequently | Yes | 50 |
| 1453 | 0 | 0 | 3 | 6 | 2 | 2 | 8 | 1 | 80 | 1 | ... | 2045 | 1 | Marketing | 4 | 11 | Sales | 1120 | Travel_Rarely | No | 36 |
| 1454 | 1 | 0 | 3 | 5 | 3 | 3 | 8 | 0 | 80 | 3 | ... | 2046 | 1 | Life Sciences | 3 | 20 | Sales | 374 | Travel_Rarely | No | 45 |
| 1455 | 2 | 2 | 2 | 2 | 3 | 2 | 8 | 0 | 80 | 4 | ... | 2048 | 1 | Life Sciences | 4 | 2 | Research & Development | 1322 | Travel_Rarely | No | 40 |
| 1456 | 2 | 0 | 2 | 10 | 4 | 2 | 10 | 2 | 80 | 4 | ... | 2049 | 1 | Life Sciences | 4 | 18 | Research & Development | 1199 | Travel_Frequently | No | 35 |
| 1457 | 2 | 0 | 3 | 5 | 3 | 2 | 20 | 3 | 80 | 2 | ... | 2051 | 1 | Medical | 4 | 2 | Research & Development | 1194 | Travel_Rarely | No | 40 |
| 1458 | 1 | 1 | 3 | 4 | 3 | 5 | 4 | 1 | 80 | 4 | ... | 2052 | 1 | Life Sciences | 4 | 1 | Research & Development | 287 | Travel_Rarely | No | 35 |
| 1459 | 3 | 0 | 3 | 4 | 3 | 2 | 10 | 1 | 80 | 1 | ... | 2053 | 1 | Other | 2 | 13 | Research & Development | 1378 | Travel_Rarely | No | 29 |
| 1460 | 4 | 0 | 4 | 5 | 1 | 3 | 5 | 0 | 80 | 2 | ... | 2054 | 1 | Medical | 4 | 28 | Research & Development | 468 | Travel_Rarely | No | 29 |
| 1461 | 0 | 2 | 2 | 3 | 3 | 3 | 20 | 1 | 80 | 2 | ... | 2055 | 1 | Marketing | 3 | 28 | Sales | 410 | Travel_Rarely | Yes | 50 |
| 1462 | 6 | 9 | 9 | 20 | 2 | 2 | 21 | 1 | 80 | 1 | ... | 2056 | 1 | Marketing | 1 | 24 | Sales | 722 | Travel_Rarely | No | 39 |
| 1463 | 7 | 1 | 4 | 9 | 3 | 2 | 10 | 0 | 80 | 2 | ... | 2057 | 1 | Medical | 3 | 5 | Research & Development | 325 | Non-Travel | No | 31 |
| 1464 | 0 | 0 | 2 | 4 | 3 | 2 | 5 | 0 | 80 | 4 | ... | 2060 | 1 | Other | 3 | 5 | Sales | 1167 | Travel_Rarely | No | 26 |
| 1465 | 3 | 0 | 2 | 5 | 3 | 3 | 17 | 1 | 80 | 3 | ... | 2061 | 1 | Medical | 2 | 23 | Research & Development | 884 | Travel_Frequently | No | 36 |
| 1466 | 7 | 1 | 7 | 7 | 3 | 5 | 9 | 1 | 80 | 1 | ... | 2062 | 1 | Medical | 1 | 6 | Research & Development | 613 | Travel_Rarely | No | 39 |
| 1467 | 3 | 0 | 2 | 6 | 3 | 0 | 6 | 1 | 80 | 2 | ... | 2064 | 1 | Life Sciences | 3 | 4 | Research & Development | 155 | Travel_Rarely | No | 27 |
| 1468 | 8 | 0 | 6 | 9 | 2 | 3 | 17 | 0 | 80 | 4 | ... | 2065 | 1 | Medical | 3 | 2 | Sales | 1023 | Travel_Frequently | No | 49 |
| 1469 | 2 | 1 | 3 | 4 | 4 | 3 | 6 | 0 | 80 | 1 | ... | 2068 | 1 | Medical | 3 | 8 | Research & Development | 628 | Travel_Rarely | No | 34 |
1470 rows × 35 columns
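With `axis=1`, `sort_index` sorts the column labels rather than the rows, which is why the output above starts with `YearsWithCurrManager` and ends with `Age`. A small sketch of the same call, assuming a hypothetical four-row frame `toy` (not the real dataset):

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

# Sort the column labels in descending (reverse alphabetical) order.
desc = toy.sort_index(axis=1, ascending=False)

print(list(desc.columns))   # ['Education', 'Department', 'Age']
print(list(desc.index))     # row order is unchanged: [0, 1, 2, 3]
```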
# Sort in ascending order by the values in the "Age" column.
df2 = df.sort_values(by=["Age"], ascending=True)
df2.head()
| | Age | Attrition | BusinessTravel | DailyRate | Department | DistanceFromHome | Education | EducationField | EmployeeCount | EmployeeNumber | ... | RelationshipSatisfaction | StandardHours | StockOptionLevel | TotalWorkingYears | TrainingTimesLastYear | WorkLifeBalance | YearsAtCompany | YearsInCurrentRole | YearsSinceLastPromotion | YearsWithCurrManager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1311 | 18 | No | Non-Travel | 1431 | Research & Development | 14 | 3 | Medical | 1 | 1839 | ... | 3 | 80 | 0 | 0 | 4 | 1 | 0 | 0 | 0 | 0 |
| 457 | 18 | Yes | Travel_Frequently | 1306 | Sales | 5 | 3 | Marketing | 1 | 614 | ... | 4 | 80 | 0 | 0 | 3 | 3 | 0 | 0 | 0 | 0 |
| 972 | 18 | No | Non-Travel | 1124 | Research & Development | 1 | 3 | Life Sciences | 1 | 1368 | ... | 3 | 80 | 0 | 0 | 5 | 4 | 0 | 0 | 0 | 0 |
| 301 | 18 | No | Travel_Rarely | 812 | Sales | 10 | 3 | Medical | 1 | 411 | ... | 1 | 80 | 0 | 0 | 2 | 3 | 0 | 0 | 0 | 0 |
| 296 | 18 | Yes | Travel_Rarely | 230 | Research & Development | 3 | 3 | Life Sciences | 1 | 405 | ... | 3 | 80 | 0 | 0 | 2 | 3 | 0 | 0 | 0 | 0 |
5 rows × 35 columns
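As the output above shows (1311, 457, 972, ...), `sort_values` keeps the original index labels on the sorted rows. If a fresh 0-based index is wanted, one option is `reset_index(drop=True)`. A minimal sketch, assuming a hypothetical four-row frame `toy` (not the real dataset):

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

by_age = toy.sort_values(by=["Age"], ascending=True)
print(list(by_age.index))            # original labels travel with the rows: [3, 2, 0, 1]

renumbered = by_age.reset_index(drop=True)
print(list(renumbered.index))        # [0, 1, 2, 3]
```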
# Department names
set(df.Department.tolist())
{'Human Resources', 'Research & Development', 'Sales'}
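The same distinct values can be obtained without going through a Python `set`. As a sketch, assuming a hypothetical four-row frame `toy` (not the real dataset), `unique()` returns the distinct values in order of first appearance and `value_counts()` also reports how often each one occurs.

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

names = toy["Department"].unique()          # distinct values, first-seen order
counts = toy["Department"].value_counts()   # distinct values with their frequencies

print(list(names))
print(counts)
```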
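# Drop a column by label. "A" is a placeholder: this dataset has no such column, so substitute an existing label (a missing label raises KeyError).
df.drop("A", axis=1, inplace=True)
# Drop the row whose index label is 5.
df.drop(5, inplace=True)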
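# Matrix product of the transposed frame with the frame, restricted to numeric columns because dot() cannot multiply the string columns.
df.select_dtypes(include="number").T.dot(df.select_dtypes(include="number"))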
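The `drop` and `dot` calls can be tried safely on a small frame. This is a minimal sketch, assuming a hypothetical four-row frame `toy` (not the real dataset). Without `inplace=True`, `drop` returns a new frame and leaves the original untouched. The matrix product is taken over the numeric columns only, since string columns cannot be multiplied.

```python
import pandas as pd

# Hypothetical toy frame: the first four rows of three columns shown above.
toy = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Department": ["Sales", "Research & Development",
                   "Research & Development", "Research & Development"],
    "Education": [2, 1, 2, 4],
})

no_dept = toy.drop("Department", axis=1)   # drop a column; toy itself is unchanged
no_row = toy.drop(1)                       # drop the row whose index label is 1

# Keep only numeric columns, then compute (columns x rows) . (rows x columns).
num = toy.select_dtypes(include="number")
gram = num.T.dot(num)                      # 2 x 2 matrix of column-by-column dot products

print(list(no_dept.columns))
print(list(no_row.index))
print(gram)
```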