How to replace dataframe columns with columns from another dataframe in pandas?

Active February 09, 2022


Examples of how to replace the columns of a dataframe with columns from another dataframe using pandas:

Dataframe 1

Let's create a first dataframe with pandas

import pandas as pd
import random
import numpy as np

# 10 rows x 4 columns holding the integers 0..39
data = np.arange(40)
data = data.reshape((10,4))

# 'F' is listed three times, so it is drawn more often than 'M'
categorical_data = ['M', 'M', 'F', 'F', 'F']

# one random gender and one random binary label per row
gender_list = [random.choice(categorical_data) for i in range(data.shape[0])]

label_list = [random.choice([0,1]) for i in range(data.shape[0])]

df = pd.DataFrame(data,columns=['A','B','C','D'])

df['Gender'] = gender_list
df['Label'] = label_list

gives for example (the Gender and Label columns are drawn at random)

    A   B   C   D Gender  Label
0   0   1   2   3      M      0
1   4   5   6   7      F      1
2   8   9  10  11      F      1
3  12  13  14  15      F      0
4  16  17  18  19      F      1
5  20  21  22  23      F      1
6  24  25  26  27      M      1
7  28  29  30  31      F      1
8  32  33  34  35      F      0
9  36  37  38  39      M      0

Dataframe 2

Let's now create another dataframe:

# random integers in [-100, 0), same shape as dataframe 1
data = np.random.randint(-100,0,size=(10,4))

df2 = pd.DataFrame(data,columns=['A','B','C','D'])

gives for example

    A   B   C   D
0 -41 -53 -38 -14
1 -28 -87 -70 -49
2 -49 -70 -18 -27
3 -26 -90 -74 -24
4 -48 -28 -14 -40
5  -9 -53 -84 -11
6 -62 -26 -35 -90
7  -5 -16 -45 -73
8 -97 -66 -61 -30
9 -14 -51 -79 -60

Replacing dataframe 1 columns by dataframe 2 columns

To replace columns of dataframe 1 with the columns of dataframe 2, a solution is to assign the selected columns directly:

columns=['A','B','C','D']

df[columns] = df2[columns]

which gives

    A   B   C   D Gender  Label
0 -41 -53 -38 -14      M      0
1 -28 -87 -70 -49      F      1
2 -49 -70 -18 -27      F      1
3 -26 -90 -74 -24      F      0
4 -48 -28 -14 -40      F      1
5  -9 -53 -84 -11      F      1
6 -62 -26 -35 -90      M      1
7  -5 -16 -45 -73      F      1
8 -97 -66 -61 -30      F      0
9 -14 -51 -79 -60      M      0
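
The assignment above can be reproduced as a small self-contained sketch. The names left and right, the three-row size and the fixed seed are assumptions made for illustration and are not part of the original example. The final checks only confirm that the selected columns were replaced and that the remaining column was left untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for dataframe 1 (left) and dataframe 2 (right)
rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
left = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])
left['Label'] = [0, 1, 0]
right = pd.DataFrame(rng.integers(-100, 0, size=(3, 4)),
                     columns=['A', 'B', 'C', 'D'])

columns = ['A', 'B', 'C', 'D']
left[columns] = right[columns]

# Columns A..D now hold the values of right; Label is unchanged
print(left[columns].equals(right[columns]))
print(left['Label'].tolist())
```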

Dataframe 2 with randomly shuffled indexes

Another example: let's shuffle the rows of dataframe 2, so that its index is no longer in order:

df2 = df2.sample(frac=1)

gives

    A   B   C   D
1 -28 -87 -70 -49
5  -9 -53 -84 -11
8 -97 -66 -61 -30
7  -5 -16 -45 -73
3 -26 -90 -74 -24
9 -14 -51 -79 -60
4 -48 -28 -14 -40
6 -62 -26 -35 -90
0 -41 -53 -38 -14
2 -49 -70 -18 -27

and

df[columns] = df2[columns]

still gives the same result, because pandas aligns the rows on their index labels and not on their position:

    A   B   C   D Gender  Label
0 -41 -53 -38 -14      M      0
1 -28 -87 -70 -49      F      1
2 -49 -70 -18 -27      F      1
3 -26 -90 -74 -24      F      0
4 -48 -28 -14 -40      F      1
5  -9 -53 -84 -11      F      1
6 -62 -26 -35 -90      M      1
7  -5 -16 -45 -73      F      1
8 -97 -66 -61 -30      F      0
9 -14 -51 -79 -60      M      0
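
To see what the shuffled index changes and what it does not, here is a minimal sketch that contrasts a label-based assignment with a position-based one. The small dataframes left and right are hypothetical. Converting to a NumPy array with to_numpy() is shown only as one way to drop the index, in case the row order (rather than the labels) is what should be used.

```python
import pandas as pd

left = pd.DataFrame({'A': [0, 1, 2], 'Label': ['x', 'y', 'z']})
right = pd.DataFrame({'A': [-10, -20, -30]}, index=[2, 0, 1])  # shuffled index

# Label-based: each value lands on the row that has the same index label
aligned = left.copy()
aligned[['A']] = right[['A']]
print(aligned['A'].tolist())      # [-20, -30, -10]

# Position-based: a NumPy array carries no index, so the row order is used
positional = left.copy()
positional[['A']] = right[['A']].to_numpy()
print(positional['A'].tolist())   # [-10, -20, -30]
```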

Dataframe 2 with fewer rows than dataframe 1

Now suppose dataframe 2 has fewer rows than dataframe 1:

df2 = df2.sample(frac=0.2)

gives for example

    A   B   C   D
3 -26 -90 -74 -24
0 -41 -53 -38 -14

and

print(df2.index)

gives

Int64Index([3, 0], dtype='int64')

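In that case, df[columns] = df2[columns] would fill the rows that are missing from dataframe 2 with NaN. A solution, starting again from the original dataframe 1, is to update only the rows whose index is present in dataframe 2: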

df.loc[df2.index,columns] = df2[columns]

gives

    A   B   C   D Gender  Label
0 -41 -53 -38 -14      M      0
1   4   5   6   7      F      1
2   8   9  10  11      F      1
3 -26 -90 -74 -24      F      0
4  16  17  18  19      F      1
5  20  21  22  23      F      1
6  24  25  26  27      M      1
7  28  29  30  31      F      1
8  32  33  34  35      F      0
9  36  37  38  39      M      0
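
The difference between the two assignments can be checked on a small hypothetical pair of dataframes (left and right are illustrative names, not taken from the example above). This is a sketch of the behaviour, assuming the index labels of dataframe 2 all exist in dataframe 1.

```python
import pandas as pd

left = pd.DataFrame({'A': [0, 1, 2, 3], 'B': [10, 11, 12, 13]})
right = pd.DataFrame({'A': [-1, -2], 'B': [-3, -4]}, index=[3, 0])
columns = ['A', 'B']

# Direct assignment reindexes right to the index of left:
# rows 1 and 2 have no match and become NaN
direct = left.copy()
direct[columns] = right[columns]
print(direct['A'].isna().tolist())   # [False, True, True, False]

# .loc restricted to right.index only touches rows 3 and 0;
# rows 1 and 2 keep their original values
partial = left.copy()
partial.loc[right.index, columns] = right[columns]
print(partial)
```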

Dataframe 2 with different column names

Another example, where dataframe 2 has different column names:

data = np.random.randint(-100,0,size=(10,4))

df2 = pd.DataFrame(data,columns=['E','F','G','H'])

# the columns are matched by position: A <- E, B <- F, C <- G, D <- H
df[columns] = df2[['E','F','G','H']]

gives for example

    A   B    C    D Gender  Label
0 -65 -24  -93  -51      M      0
1 -21 -20 -100  -81      F      1
2 -44 -70  -95  -48      F      1
3 -46 -91  -82  -32      F      0
4 -42 -66   -6  -54      F      1
5 -71 -18   -3 -100      F      1
6 -10 -17  -18  -52      M      1
7 -11 -46  -56  -98      F      1
8 -65 -40  -11  -56      F      0
9 -16 -55   -5  -48      M      0
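
When the column names differ, an alternative that does not depend on the order of the columns is to rename dataframe 2 first. The sketch below is only one possible approach; the dictionary mapping and the dataframes left and right are hypothetical names introduced for illustration.

```python
import pandas as pd

left = pd.DataFrame({'A': [0, 1], 'B': [2, 3], 'Label': [0, 1]})
right = pd.DataFrame({'F': [-3, -4], 'E': [-1, -2]})  # note the column order

# Hypothetical mapping from the names in dataframe 2 to those in dataframe 1
mapping = {'E': 'A', 'F': 'B'}
renamed = right.rename(columns=mapping)

columns = ['A', 'B']
left[columns] = renamed[columns]
print(left['A'].tolist())  # [-1, -2]
print(left['B'].tolist())  # [-3, -4]
```

With an explicit mapping, each source column is paired with its target by name, so reordering the columns of dataframe 2 does not change the result.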



Daidalos
