Programming: dataframe

7

Read a file of repeated "key=value" pairs into a DataFrame

I have a txt file with data in this format. The first 3 lines repeat over and over:
name=1
grade=A
class=B
name=2
grade=D
class=A
I want to output the data in table form, for example:
name | grade | class
1 | A | B
2 | D | A
I am trying to set the headers and loop over the data. What I have tried so far is:
def myfile(filename):
    with open(file1) as f:
        for line in f:
            yield line.strip().split('=',1)
def pprint_df(dframe):
    print(tabulate(dframe, headers="keys", tablefmt="psql", showindex=False,))
#f = pd.DataFrame(myfile('file1') …
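
One possible approach to the question above, as a minimal sketch: collect the key=value lines into one dict per record and build the DataFrame from the list of dicts. The helper name `read_records` and the in-memory sample are hypothetical, and the sketch assumes a new record starts whenever a key repeats.

```python
import io

import pandas as pd


def read_records(lines):
    """Group consecutive key=value lines into one dict per record."""
    records, current = [], {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split("=", 1)
        if key in current:
            # Assumption: a repeated key marks the start of the next record.
            records.append(current)
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return pd.DataFrame(records)


# In-memory stand-in for the txt file described in the question.
sample = io.StringIO("name=1\ngrade=A\nclass=B\nname=2\ngrade=D\nclass=A\n")
table = read_records(sample)
print(table.to_string(index=False))
```

An open file object can be passed in place of `sample`. All values stay strings, since the file format carries no type information.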

11 python pandas dataframe

6

Merge dataframes based on multiple columns and thresholds

I have two data.frames with several columns in common (here: date, city, ctry and (other_)number). I now want to merge them on the columns above, but tolerate some level of difference:
threshold.numbers <- 3
threshold.date <- 5 # in days
If the difference between the date entries is > threshold.date (in days) or > threshold.numbers, I don't want the lines to be merged. Similarly, if the entry in city is a substring of the other df's entry in the city column, I want the lines to be merged. [If anyone has a good idea for testing actual city names for similarity, I'd be happy to hear about it.] (And keep the first df's entries of date, city and country, but both (other_)number columns and all other columns in the df.) Consider the following example:
df1 <- data.frame(date = c("2003-08-29", "1999-06-12", "2000-08-29", "1999-02-24", "2001-04-17", "1999-06-30", "1999-03-16", "1999-07-16", "2001-08-29", …

11 r dataframe

4

Filtering a DataFrame on groups where the count of an element is different from 1

I'm working with a DataFrame that has the following structure:
import pandas as pd
df = pd.DataFrame({'group':[1,1,1,2,2,2,2,3,3,3], 'brand':['A','B','X','C','D','X','X','E','F','X']})
print(df)
   group brand
0      1     A
1      1     B
2      1     X
3      2     C
4      2     D
5      2     X
6      2     X
7      3     E
8      3     F
9      3     X
My goal is to view only the groups that have exactly one brand X associated with them. Since group number 2 has two observations equal to brand X, it should be filtered out of the resulting DataFrame. …
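
A minimal sketch of one way to do this, assuming "exactly one X per group" is the intended rule: count the X rows per group with a grouped `transform`, then keep the rows whose group count equals 1. The variable names `x_per_group` and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'group': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
                   'brand': ['A', 'B', 'X', 'C', 'D', 'X', 'X', 'E', 'F', 'X']})

# For every row, the number of 'X' brands found in that row's group.
x_per_group = df['brand'].eq('X').groupby(df['group']).transform('sum')

# Keep only rows belonging to groups with exactly one 'X'.
result = df[x_per_group == 1]
print(result)
```

With the sample data this keeps groups 1 and 3 and drops group 2, which contains X twice.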

10 python pandas dataframe

6

AttributeError: 'DataFrame' object has no attribute 'ix'

I get the above error when I try to use the .ix attribute of a pandas DataFrame to pull out a column, e.g. df.ix[:, 'col_header']. The script worked this morning, but this afternoon I ran it in a new Linux environment with a fresh installation of pandas. Has anybody else seen this error before? I've searched here and elsewhere but can't find it.
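
For context: the `.ix` indexer was deprecated in pandas 0.20.0 and removed in pandas 1.0, so a fresh installation no longer has it. A minimal sketch of the usual replacements follows; the small frame and the `other` column are made up for illustration.

```python
import pandas as pd

# Hypothetical frame; only 'col_header' comes from the question.
df = pd.DataFrame({"col_header": [1, 2, 3], "other": [4, 5, 6]})

# Label-based selection replaces df.ix[:, 'col_header'].
by_label = df.loc[:, "col_header"]

# Position-based selection replaces df.ix[:, 0].
by_position = df.iloc[:, 0]

print(by_label.tolist())
```

`.loc` always selects by label and `.iloc` always by integer position, which removes the ambiguity that `.ix` had.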

9 python pandas dataframe

3

Unmelt only part of a column from a pandas dataframe

I have the following example dataframe:
df = pd.DataFrame(data = {'RecordID' : [1,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
'DisplayLabel' : ['Source','Test','Value 1','Value 2','Value3','Source','Test','Value 1','Value 2','Source','Test','Value 1','Value 2','Source','Test','Value 1','Value 2','Source','Test','Value 1','Value 2'],
'Value' : ['Web','Logic','S','I','Complete','Person','Voice','>20','P','Mail','OCR','A','I','Dictation','Understandable','S','I','Web','Logic','R','S']})
which produces this DataFrame:
+-------+----------+---------------+----------------+
| Index | RecordID | Display Label | Value          |
+-------+----------+---------------+----------------+
| 0     | 1        | Source        | Web            |
…
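
The excerpt is cut off before the desired output, so the following is only a sketch under an assumption: that the 'Source' and 'Test' labels should become columns of their own while the remaining labels stay in long form. The list `id_labels` and the shortened two-record sample are illustrative choices, not part of the question.

```python
import pandas as pd

# Shortened sample: the first two records from the question.
df = pd.DataFrame({
    'RecordID': [1, 1, 1, 1, 1, 2, 2, 2, 2],
    'DisplayLabel': ['Source', 'Test', 'Value 1', 'Value 2', 'Value3',
                     'Source', 'Test', 'Value 1', 'Value 2'],
    'Value': ['Web', 'Logic', 'S', 'I', 'Complete',
              'Person', 'Voice', '>20', 'P'],
})

# Assumption: these labels are the ones to turn into columns.
id_labels = ['Source', 'Test']
is_id = df['DisplayLabel'].isin(id_labels)

# Pivot only the selected labels into one row per RecordID.
wide = (df[is_id]
        .pivot(index='RecordID', columns='DisplayLabel', values='Value')
        .reset_index())
wide.columns.name = None

# Attach the new columns to the rows that stay in long form.
result = wide.merge(df[~is_id], on='RecordID')
print(result)
```

If every label should become a column instead, a single `pivot` over the whole frame would be enough.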

9 python pandas dataframe pivot melt

2

How to find the top N minimum values from a DataFrame, Python-3

I have the DataFrame below with an 'Age' field and need to find the top 3 minimum ages from the DataFrame.
DF = pd.DataFrame.from_dict({'Name':['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'], 'Age':[18, 45, 35, 70, 23, 24, 50, 65, 18, 23]})
DF['Age'].min()
I want the top two ages, i.e. 18 and 23, in a list. How can I achieve this? Note: DataFrame DF contains duplicate ages, i.e. 18 and 23 each appear twice; I need unique values.
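
A minimal sketch of one way to get the N smallest distinct values: drop duplicates first, then take the n smallest. The helper name `smallest_unique` is hypothetical.

```python
import pandas as pd

DF = pd.DataFrame.from_dict({
    'Name': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
    'Age': [18, 45, 35, 70, 23, 24, 50, 65, 18, 23],
})


def smallest_unique(series, n):
    """Return the n smallest distinct values of a Series as a list."""
    return series.drop_duplicates().nsmallest(n).tolist()


print(smallest_unique(DF['Age'], 2))  # [18, 23]
print(smallest_unique(DF['Age'], 3))  # [18, 23, 24]
```

Removing duplicates before calling `nsmallest` matters here: without it, the two smallest values would be 18 and 18.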

9 python python-3.x pandas dataframe pandas-groupby

5

Conditionally creating a new column based on previous rows

I have a data frame set up as follows:
df <- data.frame("id" = c(111,111,111,222,222,222,222,333,333,333,333), "Location" = c("A","B","A","A","C","B","A","B","A","A","A"), "Encounter" = c(1,2,3,1,2,3,4,1,2,3,4))
   id Location Encounter
1 111        A         1
2 111        B         2
3 111        A         3
4 222        A         1
5 222        C         2
6 222        B         3
7 222        A         4
8 333        B         1
9 333        A …

9 r dataframe dplyr duplicates

1

Merge two dataframes and add a column level with names

Hi, I have been digging around the join and merge methods in pandas and can't seem to find what I want. Suppose I have two dataframes:
A = pd.DataFrame("A",index=[0,1,2,3,4],columns=['Col 1','Col 2','Col 3'])
B = pd.DataFrame("B",index=[0,1,2,3,4],columns=['Col 1','Col 2','Col 3'])
>>> A
  Col 1 Col 2 Col 3
0     A     A     A
1     A     A     A
2     A     A     A
3     A     A     A
4     A     A     A
>>> B
  Col 1 Col 2 Col …
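
The excerpt ends before the desired result is shown, so this sketch rests on an assumption taken from the title: that the two frames should sit side by side under an extra column level naming their source. `pd.concat` with `keys` does that; the key labels "A" and "B" are illustrative.

```python
import pandas as pd

A = pd.DataFrame("A", index=[0, 1, 2, 3, 4], columns=['Col 1', 'Col 2', 'Col 3'])
B = pd.DataFrame("B", index=[0, 1, 2, 3, 4], columns=['Col 1', 'Col 2', 'Col 3'])

# keys= adds an outer column level, giving labels such as ('A', 'Col 1').
combined = pd.concat([A, B], axis=1, keys=["A", "B"])
print(combined)
```

If the original column names should form the outer level instead, `combined.swaplevel(axis=1).sort_index(axis=1)` reorders the two levels.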

9 python pandas dataframe

3

Pandas - fill NaNs until the first non-NULL value

I have a DataFrame like this:
A B   C
1 nan nan
2 nan 5
3 3   nan
4 nan nan
How can I fill the NULLs (with 0) in each series up to the first non-NULL value, leading to:
A B   C
1 0   0
2 0   5
3 3   nan
4 nan nan
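
A minimal sketch of one way to do this: after a forward fill, the only positions still missing are those before the first non-null value in each column, so that result can serve as a mask. The sketch assumes A is an ordinary column; the names `leading` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [np.nan, np.nan, 3, np.nan],
    'C': [np.nan, 5, np.nan, np.nan],
})

# True only for the NaNs that come before a column's first non-null value.
leading = df.ffill().isna()

# Replace those leading NaNs with 0; later NaNs are left untouched.
result = df.mask(leading, 0)
print(result)
```

Note that a column consisting only of NaN would be filled with 0 entirely, because every position counts as leading.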

9 python pandas dataframe null

Questions tagged dataframe