Pandas DataFrame to a list of dictionaries

165

I have the following DataFrame:

customer    item1    item2    item3
1           apple    milk     tomato
2           water    orange   potato
3           juice    mango    chips

which I want to convert to a list of dictionaries, one per row:

rows = [{'customer': 1, 'item1': 'apple', 'item2': 'milk', 'item3': 'tomato'},
    {'customer': 2, 'item1': 'water', 'item2': 'orange', 'item3': 'potato'},
    {'customer': 3, 'item1': 'juice', 'item2': 'mango', 'item3': 'chips'}]

— Mohammad Ibrahim

2

Welcome to Stack Overflow! I indented your code sample by 4 spaces so that it renders properly. Please see the editing help for more information on formatting.

— ByteHamster

189

Edit

As John Galt mentions in his answer, you should probably use df.to_dict('records') instead. It is faster than transposing manually.

In [20]: timeit df.T.to_dict().values()
1000 loops, best of 3: 395 µs per loop

In [21]: timeit df.to_dict('records')
10000 loops, best of 3: 53 µs per loop
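The IPython timings above can be approximated outside IPython with the standard timeit module. This is a minimal sketch, assuming a frame built from the question's data; the loop count n is an arbitrary choice, and the absolute numbers will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Sample frame matching the question's data.
df = pd.DataFrame({
    "customer": [1, 2, 3],
    "item1": ["apple", "water", "juice"],
    "item2": ["milk", "orange", "mango"],
    "item3": ["tomato", "potato", "chips"],
})

# Time both approaches over the same number of calls.
n = 200
t_transpose = timeit.timeit(lambda: list(df.T.to_dict().values()), number=n)
t_records = timeit.timeit(lambda: df.to_dict("records"), number=n)

print(f"df.T.to_dict().values(): {t_transpose / n * 1e6:.1f} µs per call")
print(f"df.to_dict('records'):   {t_records / n * 1e6:.1f} µs per call")
```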

Original answer

Use df.T.to_dict().values(), like below:

In [1]: df
Out[1]:
   customer  item1   item2   item3
0         1  apple    milk  tomato
1         2  water  orange  potato
2         3  juice   mango   chips

In [2]: df.T.to_dict().values()
Out[2]:
[{'customer': 1.0, 'item1': 'apple', 'item2': 'milk', 'item3': 'tomato'},
 {'customer': 2.0, 'item1': 'water', 'item2': 'orange', 'item3': 'potato'},
 {'customer': 3.0, 'item1': 'juice', 'item2': 'mango', 'item3': 'chips'}]
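On Python 3, dict.values() returns a view object rather than a list, so the list shown in the output above reflects Python 2. A minimal sketch, assuming Python 3 and a frame built from the question's data, wraps the call in list():

```python
import pandas as pd

# Sample frame matching the question's data.
df = pd.DataFrame({
    "customer": [1, 2, 3],
    "item1": ["apple", "water", "juice"],
    "item2": ["milk", "orange", "mango"],
    "item3": ["tomato", "potato", "chips"],
})

# Transpose so each original row becomes a column, then convert to
# {index_label: {column_name: value}}.
values = df.T.to_dict().values()  # a dict view on Python 3, not a list

# Materialize the view to get an actual list of per-row dictionaries.
rows = list(values)
print(rows)
```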

— ComputerFellow

2

What would be the solution when the dataframe has several rows for each customer?

— Aziz
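One possible reading of the question above is that the rows should be grouped per customer. The sketch below assumes that interpretation; the sample data (customer 1 appearing twice) and the name by_customer are hypothetical, not from the thread.

```python
import pandas as pd

# Hypothetical data: customer 1 appears on two rows.
df = pd.DataFrame({
    "customer": [1, 1, 2],
    "item1": ["apple", "bread", "water"],
    "item2": ["milk", "eggs", "orange"],
})

# Map each customer to the list of record dicts for that customer's rows.
# The grouping column is dropped because it is already the outer key.
by_customer = {
    customer: group.drop(columns="customer").to_dict("records")
    for customer, group in df.groupby("customer")
}
print(by_customer)
```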

2

When I use df.T.to_dict().values(), I also lose the sort order.

— Hussain

When reading a csv file into a list of dicts, I get twice the speed with unicodecsv.DictReader.

— radtek
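unicodecsv is a third-party package from the Python 2 era. On Python 3 the standard library's csv.DictReader handles Unicode text directly, so a comparable sketch needs no extra dependency. The in-memory CSV text below is an assumption standing in for a real file, and no speed claim is made here. Note that every value comes back as a string, including the customer number.

```python
import csv
import io

# Stand-in for an open CSV file; the contents mirror the question's data.
csv_text = (
    "customer,item1,item2,item3\n"
    "1,apple,milk,tomato\n"
    "2,water,orange,potato\n"
    "3,juice,mango,chips\n"
)

# DictReader uses the header row as keys and yields one mapping per data row.
with io.StringIO(csv_text) as handle:
    rows = list(csv.DictReader(handle))

print(rows)
```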

220

Use df.to_dict('records'), which gives the output without having to transpose externally.

In [2]: df.to_dict('records')
Out[2]:
[{'customer': 1L, 'item1': 'apple', 'item2': 'milk', 'item3': 'tomato'},
 {'customer': 2L, 'item1': 'water', 'item2': 'orange', 'item3': 'potato'},
 {'customer': 3L, 'item1': 'juice', 'item2': 'mango', 'item3': 'chips'}]
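The 1L, 2L, 3L values in the output above are Python 2 long literals; on Python 3 they appear as plain integers. A self-contained sketch, assuming a frame built from the question's data:

```python
import pandas as pd

# Sample frame matching the question's data.
df = pd.DataFrame({
    "customer": [1, 2, 3],
    "item1": ["apple", "water", "juice"],
    "item2": ["milk", "orange", "mango"],
    "item3": ["tomato", "potato", "chips"],
})

# orient="records" returns one dict per row, keyed by column name.
rows = df.to_dict("records")
print(rows)
```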

— Zero

2

How would I change this to include the index value in each entry of the resulting list?

— Gabriel L. Oliveira

5

@GabrielL.Oliveira you can do df.reset_index().to_dict('records')

— Wei Ma
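A short sketch of the reset_index() suggestion above. The string index labels "a", "b", "c" are an assumption chosen so the index is easy to spot; for an unnamed index, reset_index() places the labels in a column called "index".

```python
import pandas as pd

# Sample frame with an illustrative (hypothetical) string index.
df = pd.DataFrame(
    {
        "customer": [1, 2, 3],
        "item1": ["apple", "water", "juice"],
        "item2": ["milk", "orange", "mango"],
        "item3": ["tomato", "potato", "chips"],
    },
    index=["a", "b", "c"],
)

# Move the index into a regular column, then convert row by row.
rows = df.reset_index().to_dict("records")
print(rows[0])
```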

Is the order of the columns preserved in each case, i.e. is the nth entry in the resulting list always also the nth column?

— Cleb

@Cleb In "i.e. is the nth entry in the resulting list always also the nth column?", do you mean the nth column or the nth row?

— Nauman Naeem
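The ordering question can be checked empirically. The sketch below assumes Python 3.7+ (where dicts keep insertion order) and a recent pandas; under those assumptions, the nth entry of the list corresponds to the nth row, and the keys of each dict follow the column order. It is a spot check on sample data, not a guarantee for every version.

```python
import pandas as pd

# Sample frame matching the question's data.
df = pd.DataFrame({
    "customer": [1, 2, 3],
    "item1": ["apple", "water", "juice"],
    "item2": ["milk", "orange", "mango"],
    "item3": ["tomato", "potato", "chips"],
})

rows = df.to_dict("records")

# Row order: entry n of the list comes from row n of the frame.
row_order_kept = [row["customer"] for row in rows] == df["customer"].tolist()

# Column order: the keys of each dict follow df.columns.
column_order_kept = all(list(row) == list(df.columns) for row in rows)

print(row_order_kept, column_order_kept)
```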

14

As an extension to John Galt's answer:

For the following DataFrame,

   customer  item1   item2   item3
0         1  apple    milk  tomato
1         2  water  orange  potato
2         3  juice   mango   chips

If you want dictionaries that also carry the index values, you can do something like

df.to_dict('index')

which outputs a dictionary of dictionaries whose outer keys are the index values. In this particular case:

{0: {'customer': 1, 'item1': 'apple', 'item2': 'milk', 'item3': 'tomato'},
 1: {'customer': 2, 'item1': 'water', 'item2': 'orange', 'item3': 'potato'},
 2: {'customer': 3, 'item1': 'juice', 'item2': 'mango', 'item3': 'chips'}}
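If a flat list is needed rather than a nested dictionary, the result of to_dict('index') can be unpacked into one dict per row with the index folded in. This is a sketch only; the key name "index" is an arbitrary choice, not something pandas produces here.

```python
import pandas as pd

# Sample frame matching the answer's data.
df = pd.DataFrame({
    "customer": [1, 2, 3],
    "item1": ["apple", "water", "juice"],
    "item2": ["milk", "orange", "mango"],
    "item3": ["tomato", "potato", "chips"],
})

# {index_label: {column_name: value}}
by_index = df.to_dict("index")

# Flatten into a list, storing each index label under a chosen key.
rows = [{"index": label, **row} for label, row in by_index.items()]
print(rows)
```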

— Hossain Muctadir

1

If you are interested in selecting only one column, this will work:

df[["item1"]].to_dict("records")

The line below does not work; it raises "TypeError: unsupported type: ". I believe this is because it tries to convert a Series to a dict rather than a DataFrame to a dict.

df["item1"].to_dict("records")
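When only the Series is at hand, it can be turned back into a one-column frame with Series.to_frame(), or the dicts can be built directly. A minimal sketch, assuming a frame built from the question's data; all three variable names are illustrative.

```python
import pandas as pd

# Sample frame matching the question's data.
df = pd.DataFrame({
    "customer": [1, 2, 3],
    "item1": ["apple", "water", "juice"],
    "item2": ["milk", "orange", "mango"],
    "item3": ["tomato", "potato", "chips"],
})

series = df["item1"]  # single brackets give a Series

# Double brackets keep a one-column DataFrame, so orient="records" applies.
rows_from_frame = df[["item1"]].to_dict("records")

# A Series can be converted back into a one-column DataFrame first.
rows_from_series = series.to_frame().to_dict("records")

# Or build the dicts by hand, using the Series name as the key.
rows_manual = [{series.name: value} for value in series]

print(rows_from_frame)
```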

I needed to select only one column and convert it to a list of dicts with the column name as the key. I was stuck on this for a while, so I figured I would share.

— Joe Rivera